All of lore.kernel.org
 help / color / mirror / Atom feed
* [Qemu-devel] [PATCH v3 0/8] intel_iommu: fix EIM
@ 2016-09-30 16:10 Radim Krčmář
  2016-09-30 16:10 ` [Qemu-devel] [PATCH v3 1/8] apic: add global apic_get_class() Radim Krčmář
                   ` (8 more replies)
  0 siblings, 9 replies; 26+ messages in thread
From: Radim Krčmář @ 2016-09-30 16:10 UTC (permalink / raw)
  To: qemu-devel
  Cc: Peter Xu, Igor Mammedov, Paolo Bonzini, Richard Henderson,
	Eduardo Habkost, Michael S. Tsirkin

v2: http://lists.nongnu.org/archive/html/qemu-devel/2016-09/msg07790.html

The x2APIC mode works on >=2.8 machine types with this series and <2.7
remain compatible (aka broken).  If the 2.7 compatibility layer, [7/8],
is deemed acceptable, then it should be squashed into [6/8] to avoid a
bisection breaker; see the discussion under [v2 7/7] for details
(http://lists.nongnu.org/archive/html/qemu-devel/2016-09/msg07795.html).


Radim Krčmář (8):
  apic: add global apic_get_class()
  apic: add send_msi() to APICCommonClass
  intel_iommu: pass whole remapped addresses to apic
  intel_iommu: redo configuraton check in realize
  intel_iommu: add OnOffAuto intr_eim as "eim" property
  intel_iommu: reject broken EIM
  intel_iommu: keep buggy EIM enabled in 2.7 machine type
  target-i386/kvm: cache the return value of kvm_enable_x2apic()

 hw/i386/intel_iommu.c           | 82 ++++++++++++++++++++++++++++++-----------
 hw/i386/kvm/apic.c              | 19 +++++++---
 hw/i386/pc_q35.c                |  2 +
 hw/i386/xen/xen_apic.c          |  6 +++
 hw/intc/apic.c                  |  8 +++-
 hw/intc/apic_common.c           |  1 +
 include/hw/i386/apic_internal.h |  6 +++
 include/hw/i386/intel_iommu.h   |  1 +
 include/hw/i386/pc.h            |  2 +
 target-i386/cpu.c               | 14 +++++--
 target-i386/kvm-stub.c          |  5 +++
 target-i386/kvm.c               | 30 +++++++++++++++
 target-i386/kvm_i386.h          |  1 +
 13 files changed, 145 insertions(+), 32 deletions(-)

-- 
2.10.0

^ permalink raw reply	[flat|nested] 26+ messages in thread

* [Qemu-devel] [PATCH v3 1/8] apic: add global apic_get_class()
  2016-09-30 16:10 [Qemu-devel] [PATCH v3 0/8] intel_iommu: fix EIM Radim Krčmář
@ 2016-09-30 16:10 ` Radim Krčmář
  2016-10-03 16:03   ` Eduardo Habkost
  2016-09-30 16:10 ` [Qemu-devel] [PATCH v3 2/8] apic: add send_msi() to APICCommonClass Radim Krčmář
                   ` (7 subsequent siblings)
  8 siblings, 1 reply; 26+ messages in thread
From: Radim Krčmář @ 2016-09-30 16:10 UTC (permalink / raw)
  To: qemu-devel
  Cc: Peter Xu, Igor Mammedov, Paolo Bonzini, Richard Henderson,
	Eduardo Habkost, Michael S. Tsirkin

Every configuration has only up to one APIC class and we'll be extending
the class with a function that can be called without an instanced
object, so a direct access to the class is convenient.

This patch will break compilation if some code uses apic_get_class()
with CONFIG_USER_ONLY.

Suggested-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
---
v2: assert() instead of error_report() and exit() [Peter]
v3: completely rewrite the mechanism [Eduardo]

It still looks horrible, so I'll be glad for any advice.
And what is CONFIG_USER_ONLY?
---
 hw/intc/apic_common.c           |  1 +
 include/hw/i386/apic_internal.h |  2 ++
 target-i386/cpu.c               | 14 +++++++++++---
 3 files changed, 14 insertions(+), 3 deletions(-)

diff --git a/hw/intc/apic_common.c b/hw/intc/apic_common.c
index 14ac43c18666..8d01c9c8750e 100644
--- a/hw/intc/apic_common.c
+++ b/hw/intc/apic_common.c
@@ -18,6 +18,7 @@
  * License along with this library; if not, see <http://www.gnu.org/licenses/>
  */
 #include "qemu/osdep.h"
+#include "qemu/error-report.h"
 #include "qapi/error.h"
 #include "qemu-common.h"
 #include "cpu.h"
diff --git a/include/hw/i386/apic_internal.h b/include/hw/i386/apic_internal.h
index 06c4e9f6f95b..286684857e9f 100644
--- a/include/hw/i386/apic_internal.h
+++ b/include/hw/i386/apic_internal.h
@@ -222,4 +222,6 @@ static inline int apic_get_bit(uint32_t *tab, int index)
     return !!(tab[i] & mask);
 }
 
+APICCommonClass *apic_get_class(void);
+
 #endif /* QEMU_APIC_INTERNAL_H */
diff --git a/target-i386/cpu.c b/target-i386/cpu.c
index 333309b9a70e..6acf9c3c2372 100644
--- a/target-i386/cpu.c
+++ b/target-i386/cpu.c
@@ -2845,9 +2845,8 @@ static void mce_init(X86CPU *cpu)
 }
 
 #ifndef CONFIG_USER_ONLY
-static void x86_cpu_apic_create(X86CPU *cpu, Error **errp)
+APICCommonClass *apic_get_class(void)
 {
-    APICCommonState *apic;
     const char *apic_type = "apic";
 
     if (kvm_apic_in_kernel()) {
@@ -2856,7 +2855,16 @@ static void x86_cpu_apic_create(X86CPU *cpu, Error **errp)
         apic_type = "xen-apic";
     }
 
-    cpu->apic_state = DEVICE(object_new(apic_type));
+    return APIC_COMMON_CLASS(object_class_by_name(apic_type));
+}
+
+static void x86_cpu_apic_create(X86CPU *cpu, Error **errp)
+{
+    APICCommonState *apic;
+    ObjectClass *apic_object_class = OBJECT_CLASS(apic_get_class());
+
+    assert(apic_object_class);
+    cpu->apic_state = DEVICE(object_new_with_type(apic_object_class->type));
 
     object_property_add_child(OBJECT(cpu), "lapic",
                               OBJECT(cpu->apic_state), &error_abort);
-- 
2.10.0

^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [Qemu-devel] [PATCH v3 2/8] apic: add send_msi() to APICCommonClass
  2016-09-30 16:10 [Qemu-devel] [PATCH v3 0/8] intel_iommu: fix EIM Radim Krčmář
  2016-09-30 16:10 ` [Qemu-devel] [PATCH v3 1/8] apic: add global apic_get_class() Radim Krčmář
@ 2016-09-30 16:10 ` Radim Krčmář
  2016-10-04 11:06   ` Igor Mammedov
  2016-09-30 16:10 ` [Qemu-devel] [PATCH v3 3/8] intel_iommu: pass whole remapped addresses to apic Radim Krčmář
                   ` (6 subsequent siblings)
  8 siblings, 1 reply; 26+ messages in thread
From: Radim Krčmář @ 2016-09-30 16:10 UTC (permalink / raw)
  To: qemu-devel
  Cc: Peter Xu, Igor Mammedov, Paolo Bonzini, Richard Henderson,
	Eduardo Habkost, Michael S. Tsirkin

The MMIO based interface to APIC doesn't work well with MSIs that have
upper address bits set (remapped x2APIC MSIs).  A specialized interface
is a quick and dirty way to avoid the shortcoming.

Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
---
v2: change apic_send_msi() to accept MSIMessage [Igor]
---
 hw/i386/kvm/apic.c              | 19 +++++++++++++------
 hw/i386/xen/xen_apic.c          |  6 ++++++
 hw/intc/apic.c                  |  8 ++++++--
 include/hw/i386/apic_internal.h |  4 ++++
 4 files changed, 29 insertions(+), 8 deletions(-)

diff --git a/hw/i386/kvm/apic.c b/hw/i386/kvm/apic.c
index c016e63fc2ba..be55102c00ca 100644
--- a/hw/i386/kvm/apic.c
+++ b/hw/i386/kvm/apic.c
@@ -169,6 +169,17 @@ static void kvm_apic_external_nmi(APICCommonState *s)
     run_on_cpu(CPU(s->cpu), do_inject_external_nmi, s);
 }
 
+static void kvm_send_msi(MSIMessage *msg)
+{
+    int ret;
+
+    ret = kvm_irqchip_send_msi(kvm_state, *msg);
+    if (ret < 0) {
+        fprintf(stderr, "KVM: injection failed, MSI lost (%s)\n",
+                strerror(-ret));
+    }
+}
+
 static uint64_t kvm_apic_mem_read(void *opaque, hwaddr addr,
                                   unsigned size)
 {
@@ -179,13 +190,8 @@ static void kvm_apic_mem_write(void *opaque, hwaddr addr,
                                uint64_t data, unsigned size)
 {
     MSIMessage msg = { .address = addr, .data = data };
-    int ret;
 
-    ret = kvm_irqchip_send_msi(kvm_state, msg);
-    if (ret < 0) {
-        fprintf(stderr, "KVM: injection failed, MSI lost (%s)\n",
-                strerror(-ret));
-    }
+    kvm_send_msi(&msg);
 }
 
 static const MemoryRegionOps kvm_apic_io_ops = {
@@ -232,6 +238,7 @@ static void kvm_apic_class_init(ObjectClass *klass, void *data)
     k->enable_tpr_reporting = kvm_apic_enable_tpr_reporting;
     k->vapic_base_update = kvm_apic_vapic_base_update;
     k->external_nmi = kvm_apic_external_nmi;
+    k->send_msi = kvm_send_msi;
 }
 
 static const TypeInfo kvm_apic_info = {
diff --git a/hw/i386/xen/xen_apic.c b/hw/i386/xen/xen_apic.c
index 21d68ee04b0a..55769eba7ede 100644
--- a/hw/i386/xen/xen_apic.c
+++ b/hw/i386/xen/xen_apic.c
@@ -68,6 +68,11 @@ static void xen_apic_external_nmi(APICCommonState *s)
 {
 }
 
+static void xen_send_msi(MSIMessage *msi)
+{
+    xen_hvm_inject_msi(msi->address, msi->data);
+}
+
 static void xen_apic_class_init(ObjectClass *klass, void *data)
 {
     APICCommonClass *k = APIC_COMMON_CLASS(klass);
@@ -78,6 +83,7 @@ static void xen_apic_class_init(ObjectClass *klass, void *data)
     k->get_tpr = xen_apic_get_tpr;
     k->vapic_base_update = xen_apic_vapic_base_update;
     k->external_nmi = xen_apic_external_nmi;
+    k->send_msi = xen_send_msi;
 }
 
 static const TypeInfo xen_apic_info = {
diff --git a/hw/intc/apic.c b/hw/intc/apic.c
index 7bd1d279c463..fe15fb602473 100644
--- a/hw/intc/apic.c
+++ b/hw/intc/apic.c
@@ -740,8 +740,10 @@ static uint32_t apic_mem_readl(void *opaque, hwaddr addr)
     return val;
 }
 
-static void apic_send_msi(hwaddr addr, uint32_t data)
+static void apic_send_msi(MSIMessage *msi)
 {
+    uint64_t addr = msi->address;
+    uint32_t data = msi->data;
     uint8_t dest = (addr & MSI_ADDR_DEST_ID_MASK) >> MSI_ADDR_DEST_ID_SHIFT;
     uint8_t vector = (data & MSI_DATA_VECTOR_MASK) >> MSI_DATA_VECTOR_SHIFT;
     uint8_t dest_mode = (addr >> MSI_ADDR_DEST_MODE_SHIFT) & 0x1;
@@ -762,7 +764,8 @@ static void apic_mem_writel(void *opaque, hwaddr addr, uint32_t val)
          * APIC is connected directly to the CPU.
          * Mapping them on the global bus happens to work because
          * MSI registers are reserved in APIC MMIO and vice versa. */
-        apic_send_msi(addr, val);
+        MSIMessage msi = { .address = addr, .data = val };
+        apic_send_msi(&msi);
         return;
     }
 
@@ -913,6 +916,7 @@ static void apic_class_init(ObjectClass *klass, void *data)
     k->external_nmi = apic_external_nmi;
     k->pre_save = apic_pre_save;
     k->post_load = apic_post_load;
+    k->send_msi = apic_send_msi;
 }
 
 static const TypeInfo apic_info = {
diff --git a/include/hw/i386/apic_internal.h b/include/hw/i386/apic_internal.h
index 286684857e9f..cdd11fb0938f 100644
--- a/include/hw/i386/apic_internal.h
+++ b/include/hw/i386/apic_internal.h
@@ -146,6 +146,10 @@ typedef struct APICCommonClass
     void (*pre_save)(APICCommonState *s);
     void (*post_load)(APICCommonState *s);
     void (*reset)(APICCommonState *s);
+    /* send_msi emulates an APIC bus and its proper place would be in a new
+     * device, but it's convenient to have it here for now.
+     */
+    void (*send_msi)(MSIMessage *msi);
 } APICCommonClass;
 
 struct APICCommonState {
-- 
2.10.0

^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [Qemu-devel] [PATCH v3 3/8] intel_iommu: pass whole remapped addresses to apic
  2016-09-30 16:10 [Qemu-devel] [PATCH v3 0/8] intel_iommu: fix EIM Radim Krčmář
  2016-09-30 16:10 ` [Qemu-devel] [PATCH v3 1/8] apic: add global apic_get_class() Radim Krčmář
  2016-09-30 16:10 ` [Qemu-devel] [PATCH v3 2/8] apic: add send_msi() to APICCommonClass Radim Krčmář
@ 2016-09-30 16:10 ` Radim Krčmář
  2016-10-04 11:17   ` Igor Mammedov
  2016-09-30 16:10 ` [Qemu-devel] [PATCH v3 4/8] intel_iommu: redo configuraton check in realize Radim Krčmář
                   ` (5 subsequent siblings)
  8 siblings, 1 reply; 26+ messages in thread
From: Radim Krčmář @ 2016-09-30 16:10 UTC (permalink / raw)
  To: qemu-devel
  Cc: Peter Xu, Igor Mammedov, Paolo Bonzini, Richard Henderson,
	Eduardo Habkost, Michael S. Tsirkin

The MMIO interface to APIC only allowed 8 bit addresses, which is not
enough for 32 bit addresses from EIM remapping.
Intel stored upper 24 bits in the high MSI address, so use the same
technique. The technique is also used in KVM MSI interface.
Other APICs are unlikely to handle those upper bits.

Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
---
v2: fix build with enabled DEBUG_INTEL_IOMMU [Peter]
---
 hw/i386/intel_iommu.c | 21 +++++++++------------
 1 file changed, 9 insertions(+), 12 deletions(-)

diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c
index 9f4e64af1ad5..c39b62b898d8 100644
--- a/hw/i386/intel_iommu.c
+++ b/hw/i386/intel_iommu.c
@@ -31,6 +31,7 @@
 #include "hw/i386/x86-iommu.h"
 #include "hw/pci-host/q35.h"
 #include "sysemu/kvm.h"
+#include "hw/i386/apic_internal.h"
 
 /*#define DEBUG_INTEL_IOMMU*/
 #ifdef DEBUG_INTEL_IOMMU
@@ -279,18 +280,17 @@ static void vtd_update_iotlb(IntelIOMMUState *s, uint16_t source_id,
 static void vtd_generate_interrupt(IntelIOMMUState *s, hwaddr mesg_addr_reg,
                                    hwaddr mesg_data_reg)
 {
-    hwaddr addr;
-    uint32_t data;
+    MSIMessage msi;
 
     assert(mesg_data_reg < DMAR_REG_SIZE);
     assert(mesg_addr_reg < DMAR_REG_SIZE);
 
-    addr = vtd_get_long_raw(s, mesg_addr_reg);
-    data = vtd_get_long_raw(s, mesg_data_reg);
+    msi.address = vtd_get_long_raw(s, mesg_addr_reg);
+    msi.data = vtd_get_long_raw(s, mesg_data_reg);
 
-    VTD_DPRINTF(FLOG, "msi: addr 0x%"PRIx64 " data 0x%"PRIx32, addr, data);
-    address_space_stl_le(&address_space_memory, addr, data,
-                         MEMTXATTRS_UNSPECIFIED, NULL);
+    VTD_DPRINTF(FLOG, "msi: addr 0x%"PRIx64 " data 0x%"PRIx32,
+                msi.address, msi.data);
+    apic_get_class()->send_msi(&msi);
 }
 
 /* Generate a fault event to software via MSI if conditions are met.
@@ -2133,6 +2133,7 @@ static void vtd_generate_msi_message(VTDIrq *irq, MSIMessage *msg_out)
     msg.dest_mode = irq->dest_mode;
     msg.redir_hint = irq->redir_hint;
     msg.dest = irq->dest;
+    msg.__addr_hi = irq->dest & 0xffffff00;
     msg.__addr_head = cpu_to_le32(0xfee);
     /* Keep this from original MSI address bits */
     msg.__not_used = irq->msi_addr_last_bits;
@@ -2281,11 +2282,7 @@ static MemTxResult vtd_mem_ir_write(void *opaque, hwaddr addr,
                 " for device sid 0x%04x",
                 to.address, to.data, sid);
 
-    if (dma_memory_write(&address_space_memory, to.address,
-                         &to.data, size)) {
-        VTD_DPRINTF(GENERAL, "error: fail to write 0x%"PRIx64
-                    " value 0x%"PRIx32, to.address, to.data);
-    }
+    apic_get_class()->send_msi(&to);
 
     return MEMTX_OK;
 }
-- 
2.10.0

^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [Qemu-devel] [PATCH v3 4/8] intel_iommu: redo configuraton check in realize
  2016-09-30 16:10 [Qemu-devel] [PATCH v3 0/8] intel_iommu: fix EIM Radim Krčmář
                   ` (2 preceding siblings ...)
  2016-09-30 16:10 ` [Qemu-devel] [PATCH v3 3/8] intel_iommu: pass whole remapped addresses to apic Radim Krčmář
@ 2016-09-30 16:10 ` Radim Krčmář
  2016-10-04 11:40   ` Igor Mammedov
  2016-09-30 16:10 ` [Qemu-devel] [PATCH v3 5/8] intel_iommu: add OnOffAuto intr_eim as "eim" property Radim Krčmář
                   ` (4 subsequent siblings)
  8 siblings, 1 reply; 26+ messages in thread
From: Radim Krčmář @ 2016-09-30 16:10 UTC (permalink / raw)
  To: qemu-devel
  Cc: Peter Xu, Igor Mammedov, Paolo Bonzini, Richard Henderson,
	Eduardo Habkost, Michael S. Tsirkin

* there no point in configuring the device if realization is going to
  fail, so move the check to the beginning,
* create a separate function for the check,
* use error_setg() instead error_report().

Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
---
v3:
* use error_setg [Paolo]
* create a new function [Peter]
---
 hw/i386/intel_iommu.c | 26 ++++++++++++++++++--------
 1 file changed, 18 insertions(+), 8 deletions(-)

diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c
index c39b62b898d8..5b06b4091f36 100644
--- a/hw/i386/intel_iommu.c
+++ b/hw/i386/intel_iommu.c
@@ -21,6 +21,7 @@
 
 #include "qemu/osdep.h"
 #include "qemu/error-report.h"
+#include "qapi/error.h"
 #include "hw/sysbus.h"
 #include "exec/address-spaces.h"
 #include "intel_iommu_internal.h"
@@ -2448,6 +2449,18 @@ static AddressSpace *vtd_host_dma_iommu(PCIBus *bus, void *opaque, int devfn)
     return &vtd_as->as;
 }
 
+static bool vtd_check_config(X86IOMMUState *x86_iommu, Error **errp)
+{
+    /* Currently Intel IOMMU IR only support "kernel-irqchip={off|split}" */
+    if (x86_iommu->intr_supported && kvm_irqchip_in_kernel() &&
+        !kvm_irqchip_is_split()) {
+        error_setg(errp, "Intel Interrupt Remapping cannot work with "
+                         "kernel-irqchip=on, please use 'split|off'.");
+        return false;
+    }
+    return true;
+}
+
 static void vtd_realize(DeviceState *dev, Error **errp)
 {
     PCMachineState *pcms = PC_MACHINE(qdev_get_machine());
@@ -2457,6 +2470,11 @@ static void vtd_realize(DeviceState *dev, Error **errp)
 
     VTD_DPRINTF(GENERAL, "");
     x86_iommu->type = TYPE_INTEL;
+
+    if (!vtd_check_config(x86_iommu, errp)) {
+        return;
+    }
+
     memset(s->vtd_as_by_bus_num, 0, sizeof(s->vtd_as_by_bus_num));
     memory_region_init_io(&s->csrmem, OBJECT(s), &vtd_mem_ops, s,
                           "intel_iommu", DMAR_REG_SIZE);
@@ -2471,14 +2489,6 @@ static void vtd_realize(DeviceState *dev, Error **errp)
     pci_setup_iommu(bus, vtd_host_dma_iommu, dev);
     /* Pseudo address space under root PCI bus. */
     pcms->ioapic_as = vtd_host_dma_iommu(bus, s, Q35_PSEUDO_DEVFN_IOAPIC);
-
-    /* Currently Intel IOMMU IR only support "kernel-irqchip={off|split}" */
-    if (x86_iommu->intr_supported && kvm_irqchip_in_kernel() &&
-        !kvm_irqchip_is_split()) {
-        error_report("Intel Interrupt Remapping cannot work with "
-                     "kernel-irqchip=on, please use 'split|off'.");
-        exit(1);
-    }
 }
 
 static void vtd_class_init(ObjectClass *klass, void *data)
-- 
2.10.0

^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [Qemu-devel] [PATCH v3 5/8] intel_iommu: add OnOffAuto intr_eim as "eim" property
  2016-09-30 16:10 [Qemu-devel] [PATCH v3 0/8] intel_iommu: fix EIM Radim Krčmář
                   ` (3 preceding siblings ...)
  2016-09-30 16:10 ` [Qemu-devel] [PATCH v3 4/8] intel_iommu: redo configuraton check in realize Radim Krčmář
@ 2016-09-30 16:10 ` Radim Krčmář
  2016-10-04 12:34   ` Igor Mammedov
  2016-09-30 16:10 ` [Qemu-devel] [PATCH v3 6/8] intel_iommu: reject broken EIM Radim Krčmář
                   ` (3 subsequent siblings)
  8 siblings, 1 reply; 26+ messages in thread
From: Radim Krčmář @ 2016-09-30 16:10 UTC (permalink / raw)
  To: qemu-devel
  Cc: Peter Xu, Igor Mammedov, Paolo Bonzini, Richard Henderson,
	Eduardo Habkost, Michael S. Tsirkin

The default (auto) emulates the current behavior.

Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
---
v3:
 * use error_setg [Paolo]
 * shorten the code [Peter]
---
 hw/i386/intel_iommu.c         | 24 +++++++++++++++++++++---
 include/hw/i386/intel_iommu.h |  1 +
 2 files changed, 22 insertions(+), 3 deletions(-)

diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c
index 5b06b4091f36..17892b8c336b 100644
--- a/hw/i386/intel_iommu.c
+++ b/hw/i386/intel_iommu.c
@@ -2012,6 +2012,8 @@ static const MemoryRegionOps vtd_mem_ops = {
 
 static Property vtd_properties[] = {
     DEFINE_PROP_UINT32("version", IntelIOMMUState, version, 0),
+    DEFINE_PROP_ON_OFF_AUTO("eim", IntelIOMMUState, intr_eim,
+                            ON_OFF_AUTO_AUTO),
     DEFINE_PROP_END_OF_LIST(),
 };
 
@@ -2368,7 +2370,11 @@ static void vtd_init(IntelIOMMUState *s)
     s->ecap = VTD_ECAP_QI | VTD_ECAP_IRO;
 
     if (x86_iommu->intr_supported) {
-        s->ecap |= VTD_ECAP_IR | VTD_ECAP_EIM | VTD_ECAP_MHMV;
+        s->ecap |= VTD_ECAP_IR | VTD_ECAP_MHMV;
+        if (s->intr_eim == ON_OFF_AUTO_ON) {
+            s->ecap |= VTD_ECAP_EIM;
+        }
+        assert(s->intr_eim != ON_OFF_AUTO_AUTO);
     }
 
     vtd_reset_context_cache(s);
@@ -2449,8 +2455,10 @@ static AddressSpace *vtd_host_dma_iommu(PCIBus *bus, void *opaque, int devfn)
     return &vtd_as->as;
 }
 
-static bool vtd_check_config(X86IOMMUState *x86_iommu, Error **errp)
+static bool vtd_decide_config(IntelIOMMUState *s, Error **errp)
 {
+    X86IOMMUState *x86_iommu = X86_IOMMU_DEVICE(s);
+
     /* Currently Intel IOMMU IR only support "kernel-irqchip={off|split}" */
     if (x86_iommu->intr_supported && kvm_irqchip_in_kernel() &&
         !kvm_irqchip_is_split()) {
@@ -2458,6 +2466,16 @@ static bool vtd_check_config(X86IOMMUState *x86_iommu, Error **errp)
                          "kernel-irqchip=on, please use 'split|off'.");
         return false;
     }
+    if (s->intr_eim == ON_OFF_AUTO_ON && !x86_iommu->intr_supported) {
+        error_setg(errp, "eim=on cannot be selected without intremap=on");
+        return false;
+    }
+
+    if (s->intr_eim == ON_OFF_AUTO_AUTO) {
+        s->intr_eim = x86_iommu->intr_supported ?
+                                              ON_OFF_AUTO_ON : ON_OFF_AUTO_OFF;
+    }
+
     return true;
 }
 
@@ -2471,7 +2489,7 @@ static void vtd_realize(DeviceState *dev, Error **errp)
     VTD_DPRINTF(GENERAL, "");
     x86_iommu->type = TYPE_INTEL;
 
-    if (!vtd_check_config(x86_iommu, errp)) {
+    if (!vtd_decide_config(s, errp)) {
         return;
     }
 
diff --git a/include/hw/i386/intel_iommu.h b/include/hw/i386/intel_iommu.h
index a42dbd745a70..b5ac60927b1f 100644
--- a/include/hw/i386/intel_iommu.h
+++ b/include/hw/i386/intel_iommu.h
@@ -289,6 +289,7 @@ struct IntelIOMMUState {
     dma_addr_t intr_root;           /* Interrupt remapping table pointer */
     uint32_t intr_size;             /* Number of IR table entries */
     bool intr_eime;                 /* Extended interrupt mode enabled */
+    OnOffAuto intr_eim;             /* Toggle for EIM cabability */
 };
 
 /* Find the VTD Address space associated with the given bus pointer,
-- 
2.10.0

^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [Qemu-devel] [PATCH v3 6/8] intel_iommu: reject broken EIM
  2016-09-30 16:10 [Qemu-devel] [PATCH v3 0/8] intel_iommu: fix EIM Radim Krčmář
                   ` (4 preceding siblings ...)
  2016-09-30 16:10 ` [Qemu-devel] [PATCH v3 5/8] intel_iommu: add OnOffAuto intr_eim as "eim" property Radim Krčmář
@ 2016-09-30 16:10 ` Radim Krčmář
  2016-10-04 13:43   ` Igor Mammedov
  2016-09-30 16:10 ` [Qemu-devel] [PATCH v3 7/8] intel_iommu: keep buggy EIM enabled in 2.7 machine type Radim Krčmář
                   ` (2 subsequent siblings)
  8 siblings, 1 reply; 26+ messages in thread
From: Radim Krčmář @ 2016-09-30 16:10 UTC (permalink / raw)
  To: qemu-devel
  Cc: Peter Xu, Igor Mammedov, Paolo Bonzini, Richard Henderson,
	Eduardo Habkost, Michael S. Tsirkin

Cluster x2APIC cannot work without KVM's x2apic API when the maximal
APIC ID is greater than 8 and only KVM's LAPIC can support x2APIC, so we
forbid other APICs and also the old KVM case with less than 9, to
simplify the code.

There is no point in enabling EIM in forbidden APICs, so we keep it
enabled only for the KVM APIC;  unconditionally, because making the
option depend on KVM version would be a maintanance burden.

Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
---
v3:
 * use error_setg [Paolo]
 * shorten the code [Peter]
v2:
 * adapt to new intr_eim parameter
 * provide first linux version that has x2apic api
 * disable QEMU's LAPIC
---
 hw/i386/intel_iommu.c  | 15 ++++++++++++++-
 target-i386/kvm-stub.c |  5 +++++
 target-i386/kvm.c      | 13 +++++++++++++
 target-i386/kvm_i386.h |  1 +
 4 files changed, 33 insertions(+), 1 deletion(-)

diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c
index 17892b8c336b..efb018b85544 100644
--- a/hw/i386/intel_iommu.c
+++ b/hw/i386/intel_iommu.c
@@ -33,6 +33,7 @@
 #include "hw/pci-host/q35.h"
 #include "sysemu/kvm.h"
 #include "hw/i386/apic_internal.h"
+#include "kvm_i386.h"
 
 /*#define DEBUG_INTEL_IOMMU*/
 #ifdef DEBUG_INTEL_IOMMU
@@ -2472,10 +2473,22 @@ static bool vtd_decide_config(IntelIOMMUState *s, Error **errp)
     }
 
     if (s->intr_eim == ON_OFF_AUTO_AUTO) {
-        s->intr_eim = x86_iommu->intr_supported ?
+        s->intr_eim = x86_iommu->intr_supported && kvm_irqchip_in_kernel() ?
                                               ON_OFF_AUTO_ON : ON_OFF_AUTO_OFF;
     }
 
+    if (s->intr_eim == ON_OFF_AUTO_ON) {
+        if (kvm_irqchip_in_kernel() && !kvm_enable_x2apic()) {
+            error_setg(errp, "eim=on requires support on the KVM side"
+                             "(X2APIC_API, first shipped in v4.7)");
+            return false;
+        }
+        if (!kvm_irqchip_in_kernel()) {
+            error_setg(errp, "eim=on requires accel=kvm,kernel-irqchip=split");
+            return false;
+        }
+    }
+
     return true;
 }
 
diff --git a/target-i386/kvm-stub.c b/target-i386/kvm-stub.c
index cdf15061091d..bda4dc2f0c57 100644
--- a/target-i386/kvm-stub.c
+++ b/target-i386/kvm-stub.c
@@ -25,6 +25,11 @@ bool kvm_has_smm(void)
     return 1;
 }
 
+bool kvm_enable_x2apic(void)
+{
+    return false;
+}
+
 /* This function is only called inside conditionals which we
  * rely on the compiler to optimize out when CONFIG_KVM is not
  * defined.
diff --git a/target-i386/kvm.c b/target-i386/kvm.c
index ee1f53e56953..0fd664648665 100644
--- a/target-i386/kvm.c
+++ b/target-i386/kvm.c
@@ -122,6 +122,19 @@ bool kvm_allows_irq0_override(void)
     return !kvm_irqchip_in_kernel() || kvm_has_gsi_routing();
 }
 
+static bool kvm_x2apic_api_set_flags(uint64_t flags)
+{
+    KVMState *s = KVM_STATE(current_machine->accelerator);
+
+    return !kvm_vm_enable_cap(s, KVM_CAP_X2APIC_API, 0, flags);
+}
+
+bool kvm_enable_x2apic(void)
+{
+    return kvm_x2apic_api_set_flags(KVM_X2APIC_API_USE_32BIT_IDS |
+                                    KVM_X2APIC_API_DISABLE_BROADCAST_QUIRK);
+}
+
 static int kvm_get_tsc(CPUState *cs)
 {
     X86CPU *cpu = X86_CPU(cs);
diff --git a/target-i386/kvm_i386.h b/target-i386/kvm_i386.h
index 36407e0a5dc7..5c369b1c5b40 100644
--- a/target-i386/kvm_i386.h
+++ b/target-i386/kvm_i386.h
@@ -43,4 +43,5 @@ int kvm_device_msix_deassign(KVMState *s, uint32_t dev_id);
 
 void kvm_put_apicbase(X86CPU *cpu, uint64_t value);
 
+bool kvm_enable_x2apic(void);
 #endif
-- 
2.10.0

^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [Qemu-devel] [PATCH v3 7/8] intel_iommu: keep buggy EIM enabled in 2.7 machine type
  2016-09-30 16:10 [Qemu-devel] [PATCH v3 0/8] intel_iommu: fix EIM Radim Krčmář
                   ` (5 preceding siblings ...)
  2016-09-30 16:10 ` [Qemu-devel] [PATCH v3 6/8] intel_iommu: reject broken EIM Radim Krčmář
@ 2016-09-30 16:10 ` Radim Krčmář
  2016-10-04 12:18   ` Igor Mammedov
  2016-09-30 16:10 ` [Qemu-devel] [PATCH v3 8/8] target-i386/kvm: cache the return value of kvm_enable_x2apic() Radim Krčmář
  2016-09-30 17:22 ` [Qemu-devel] [PATCH v3 0/8] intel_iommu: fix EIM no-reply
  8 siblings, 1 reply; 26+ messages in thread
From: Radim Krčmář @ 2016-09-30 16:10 UTC (permalink / raw)
  To: qemu-devel
  Cc: Peter Xu, Igor Mammedov, Paolo Bonzini, Richard Henderson,
	Eduardo Habkost, Michael S. Tsirkin

QEMU 2.7 allowed EIM even in configurations that were forbidden in the
last patch because they were not working, like old KVM or userspace
APIC.  In order to keep backward compatibility, we again allow guests to
misbehave in non-obvious ways, and make it the default.

Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
---
v3: shorten the code [Peter]
---
 hw/i386/intel_iommu.c | 12 +++++++-----
 hw/i386/pc_q35.c      |  2 ++
 include/hw/i386/pc.h  |  2 ++
 3 files changed, 11 insertions(+), 5 deletions(-)

diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c
index efb018b85544..13a2ac5a8fb9 100644
--- a/hw/i386/intel_iommu.c
+++ b/hw/i386/intel_iommu.c
@@ -2456,9 +2456,11 @@ static AddressSpace *vtd_host_dma_iommu(PCIBus *bus, void *opaque, int devfn)
     return &vtd_as->as;
 }
 
-static bool vtd_decide_config(IntelIOMMUState *s, Error **errp)
+static bool vtd_decide_config(IntelIOMMUState *s, PCMachineState *pcms,
+                              Error **errp)
 {
     X86IOMMUState *x86_iommu = X86_IOMMU_DEVICE(s);
+    PCMachineClass *pcmc = PC_MACHINE_CLASS(MACHINE_GET_CLASS(pcms));
 
     /* Currently Intel IOMMU IR only support "kernel-irqchip={off|split}" */
     if (x86_iommu->intr_supported && kvm_irqchip_in_kernel() &&
@@ -2473,11 +2475,11 @@ static bool vtd_decide_config(IntelIOMMUState *s, Error **errp)
     }
 
     if (s->intr_eim == ON_OFF_AUTO_AUTO) {
-        s->intr_eim = x86_iommu->intr_supported && kvm_irqchip_in_kernel() ?
+        s->intr_eim = (kvm_irqchip_in_kernel() || pcmc->buggy_intel_iommu_eim)
+                      && x86_iommu->intr_supported ?
                                               ON_OFF_AUTO_ON : ON_OFF_AUTO_OFF;
     }
-
-    if (s->intr_eim == ON_OFF_AUTO_ON) {
+    if (s->intr_eim == ON_OFF_AUTO_ON && !pcmc->buggy_intel_iommu_eim) {
         if (kvm_irqchip_in_kernel() && !kvm_enable_x2apic()) {
             error_setg(errp, "eim=on requires support on the KVM side"
                              "(X2APIC_API, first shipped in v4.7)");
@@ -2502,7 +2504,7 @@ static void vtd_realize(DeviceState *dev, Error **errp)
     VTD_DPRINTF(GENERAL, "");
     x86_iommu->type = TYPE_INTEL;
 
-    if (!vtd_decide_config(s, errp)) {
+    if (!vtd_decide_config(s, pcms, errp)) {
         return;
     }
 
diff --git a/hw/i386/pc_q35.c b/hw/i386/pc_q35.c
index 0b214f24c977..97f419fbf4dd 100644
--- a/hw/i386/pc_q35.c
+++ b/hw/i386/pc_q35.c
@@ -304,8 +304,10 @@ DEFINE_Q35_MACHINE(v2_8, "pc-q35-2.8", NULL,
 
 static void pc_q35_2_7_machine_options(MachineClass *m)
 {
+    PCMachineClass *pcmc = PC_MACHINE_CLASS(m);
     pc_q35_2_8_machine_options(m);
     m->alias = NULL;
+    pcmc->buggy_intel_iommu_eim = true;
     SET_MACHINE_COMPAT(m, PC_COMPAT_2_7);
 }
 
diff --git a/include/hw/i386/pc.h b/include/hw/i386/pc.h
index 47bdf10cfd9b..4bd435f8fa5c 100644
--- a/include/hw/i386/pc.h
+++ b/include/hw/i386/pc.h
@@ -143,6 +143,8 @@ struct PCMachineClass {
     bool save_tsc_khz;
     /* generate legacy CPU hotplug AML */
     bool legacy_cpu_hotplug;
+    /* enable buggy Intel-IOMMU EIM by default */
+    bool buggy_intel_iommu_eim;
 };
 
 #define TYPE_PC_MACHINE "generic-pc-machine"
-- 
2.10.0

^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [Qemu-devel] [PATCH v3 8/8] target-i386/kvm: cache the return value of kvm_enable_x2apic()
  2016-09-30 16:10 [Qemu-devel] [PATCH v3 0/8] intel_iommu: fix EIM Radim Krčmář
                   ` (6 preceding siblings ...)
  2016-09-30 16:10 ` [Qemu-devel] [PATCH v3 7/8] intel_iommu: keep buggy EIM enabled in 2.7 machine type Radim Krčmář
@ 2016-09-30 16:10 ` Radim Krčmář
  2016-10-04 11:33   ` Igor Mammedov
  2016-09-30 17:22 ` [Qemu-devel] [PATCH v3 0/8] intel_iommu: fix EIM no-reply
  8 siblings, 1 reply; 26+ messages in thread
From: Radim Krčmář @ 2016-09-30 16:10 UTC (permalink / raw)
  To: qemu-devel
  Cc: Peter Xu, Igor Mammedov, Paolo Bonzini, Richard Henderson,
	Eduardo Habkost, Michael S. Tsirkin

Assume that KVM would have returned the same on subsequent runs.
Abstract the memoizaiton pattern into macros.

Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
---
 target-i386/kvm.c | 21 +++++++++++++++++++--
 1 file changed, 19 insertions(+), 2 deletions(-)

diff --git a/target-i386/kvm.c b/target-i386/kvm.c
index 0fd664648665..113c5bf058ba 100644
--- a/target-i386/kvm.c
+++ b/target-i386/kvm.c
@@ -129,10 +129,27 @@ static bool kvm_x2apic_api_set_flags(uint64_t flags)
     return !kvm_vm_enable_cap(s, KVM_CAP_X2APIC_API, 0, flags);
 }
 
+#define MEMOIZE_RESULT(_result, _fn) \
+    ({ \
+        static bool _memoized; \
+        if (_memoized) { \
+            return _result; \
+        } \
+        _memoized = true; \
+        _result = _fn; \
+    })
+
+#define MEMOIZE(_fn) \
+    ({ \
+        static typeof(_fn) _result; \
+        MEMOIZE_RESULT(_result, _fn); \
+    })
+
 bool kvm_enable_x2apic(void)
 {
-    return kvm_x2apic_api_set_flags(KVM_X2APIC_API_USE_32BIT_IDS |
-                                    KVM_X2APIC_API_DISABLE_BROADCAST_QUIRK);
+    return MEMOIZE(
+             kvm_x2apic_api_set_flags(KVM_X2APIC_API_USE_32BIT_IDS |
+                                      KVM_X2APIC_API_DISABLE_BROADCAST_QUIRK));
 }
 
 static int kvm_get_tsc(CPUState *cs)
-- 
2.10.0

^ permalink raw reply related	[flat|nested] 26+ messages in thread

* Re: [Qemu-devel] [PATCH v3 0/8] intel_iommu: fix EIM
  2016-09-30 16:10 [Qemu-devel] [PATCH v3 0/8] intel_iommu: fix EIM Radim Krčmář
                   ` (7 preceding siblings ...)
  2016-09-30 16:10 ` [Qemu-devel] [PATCH v3 8/8] target-i386/kvm: cache the return value of kvm_enable_x2apic() Radim Krčmář
@ 2016-09-30 17:22 ` no-reply
  8 siblings, 0 replies; 26+ messages in thread
From: no-reply @ 2016-09-30 17:22 UTC (permalink / raw)
  To: rkrcmar; +Cc: famz, qemu-devel, ehabkost, mst, peterx, pbonzini, imammedo, rth

Hi,

Your series failed automatic build test. Please find the testing commands and
their output below. If you have docker installed, you can probably reproduce it
locally.

Type: series
Message-id: 20160930161013.9832-1-rkrcmar@redhat.com
Subject: [Qemu-devel] [PATCH v3 0/8] intel_iommu: fix EIM

=== TEST SCRIPT BEGIN ===
#!/bin/bash
set -e
git submodule update --init dtc
# Let docker tests dump environment info
export SHOW_ENV=1
make J=8 docker-test-quick@centos6
make J=8 docker-test-mingw@fedora
=== TEST SCRIPT END ===

Updating 3c8cf5a9c21ff8782164d1def7f44bd888713384
Switched to a new branch 'test'
9fddd31 target-i386/kvm: cache the return value of kvm_enable_x2apic()
bd7919f intel_iommu: keep buggy EIM enabled in 2.7 machine type
c3d9773 intel_iommu: reject broken EIM
f69115d intel_iommu: add OnOffAuto intr_eim as "eim" property
6889130 intel_iommu: redo configuraton check in realize
565d076 intel_iommu: pass whole remapped addresses to apic
a5babb3 apic: add send_msi() to APICCommonClass
a6d7368 apic: add global apic_get_class()

=== OUTPUT BEGIN ===
Submodule 'dtc' (git://git.qemu-project.org/dtc.git) registered for path 'dtc'
Cloning into 'dtc'...
Submodule path 'dtc': checked out '65cc4d2748a2c2e6f27f1cf39e07a5dbabd80ebf'
  BUILD centos6
  ARCHIVE qemu.tgz
  ARCHIVE dtc.tgz
  COPY RUNNER
  RUN test-quick in centos6
=== OUTPUT END ===

Abort: command timeout (>3600 seconds)


---
Email generated automatically by Patchew [http://patchew.org/].
Please send your feedback to patchew-devel@freelists.org

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [Qemu-devel] [PATCH v3 1/8] apic: add global apic_get_class()
  2016-09-30 16:10 ` [Qemu-devel] [PATCH v3 1/8] apic: add global apic_get_class() Radim Krčmář
@ 2016-10-03 16:03   ` Eduardo Habkost
  2016-10-04 10:59     ` Igor Mammedov
  2016-10-04 13:38     ` Radim Krčmář
  0 siblings, 2 replies; 26+ messages in thread
From: Eduardo Habkost @ 2016-10-03 16:03 UTC (permalink / raw)
  To: Radim Krčmář
  Cc: qemu-devel, Peter Xu, Igor Mammedov, Paolo Bonzini,
	Richard Henderson, Michael S. Tsirkin

On Fri, Sep 30, 2016 at 06:10:06PM +0200, Radim Krčmář wrote:
> Every configuration has only up to one APIC class and we'll be extending
> the class with a function that can be called without an instanced
> object, so a direct access to the class is convenient.
> 
> This patch will break compilation if some code uses apic_get_class()
> with CONFIG_USER_ONLY.
> 
> Suggested-by: Eduardo Habkost <ehabkost@redhat.com>
> Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
> ---
> v2: assert() instead of error_report() and exit() [Peter]
> v3: completely rewrite the mechanism [Eduardo]
> 
> It still looks horrible, so I'll be glad for any advice.
> And what is CONFIG_USER_ONLY?
> ---
>  hw/intc/apic_common.c           |  1 +
>  include/hw/i386/apic_internal.h |  2 ++
>  target-i386/cpu.c               | 14 +++++++++++---
>  3 files changed, 14 insertions(+), 3 deletions(-)
> 
> diff --git a/hw/intc/apic_common.c b/hw/intc/apic_common.c
> index 14ac43c18666..8d01c9c8750e 100644
> --- a/hw/intc/apic_common.c
> +++ b/hw/intc/apic_common.c
> @@ -18,6 +18,7 @@
>   * License along with this library; if not, see <http://www.gnu.org/licenses/>
>   */
>  #include "qemu/osdep.h"
> +#include "qemu/error-report.h"
>  #include "qapi/error.h"
>  #include "qemu-common.h"
>  #include "cpu.h"
> diff --git a/include/hw/i386/apic_internal.h b/include/hw/i386/apic_internal.h
> index 06c4e9f6f95b..286684857e9f 100644
> --- a/include/hw/i386/apic_internal.h
> +++ b/include/hw/i386/apic_internal.h
> @@ -222,4 +222,6 @@ static inline int apic_get_bit(uint32_t *tab, int index)
>      return !!(tab[i] & mask);
>  }
>  
> +APICCommonClass *apic_get_class(void);
> +
>  #endif /* QEMU_APIC_INTERNAL_H */
> diff --git a/target-i386/cpu.c b/target-i386/cpu.c
> index 333309b9a70e..6acf9c3c2372 100644
> --- a/target-i386/cpu.c
> +++ b/target-i386/cpu.c
> @@ -2845,9 +2845,8 @@ static void mce_init(X86CPU *cpu)
>  }
>  
>  #ifndef CONFIG_USER_ONLY
> -static void x86_cpu_apic_create(X86CPU *cpu, Error **errp)
> +APICCommonClass *apic_get_class(void)
>  {
> -    APICCommonState *apic;
>      const char *apic_type = "apic";
>  
>      if (kvm_apic_in_kernel()) {
> @@ -2856,7 +2855,16 @@ static void x86_cpu_apic_create(X86CPU *cpu, Error **errp)
>          apic_type = "xen-apic";
>      }
>  
> -    cpu->apic_state = DEVICE(object_new(apic_type));
> +    return APIC_COMMON_CLASS(object_class_by_name(apic_type));
> +}
> +
> +static void x86_cpu_apic_create(X86CPU *cpu, Error **errp)
> +{
> +    APICCommonState *apic;
> +    ObjectClass *apic_object_class = OBJECT_CLASS(apic_get_class());
> +
> +    assert(apic_object_class);
> +    cpu->apic_state = DEVICE(object_new_with_type(apic_object_class->type));

ObjectClass::type is private. I believe the common idiom is
object_new(object_class_get_name(c)).

Except for that, I believe the interface is OK and matches the
existing logic. We can always make it better later, if
appropriate.

(e.g. I wonder if we could have a container object for all APICs
(icc-bus?), and move the send_msi() method and the
apic_get_class() logic to it).

>  
>      object_property_add_child(OBJECT(cpu), "lapic",
>                                OBJECT(cpu->apic_state), &error_abort);
> -- 
> 2.10.0
> 

-- 
Eduardo

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [Qemu-devel] [PATCH v3 1/8] apic: add global apic_get_class()
  2016-10-03 16:03   ` Eduardo Habkost
@ 2016-10-04 10:59     ` Igor Mammedov
  2016-10-04 13:38     ` Radim Krčmář
  1 sibling, 0 replies; 26+ messages in thread
From: Igor Mammedov @ 2016-10-04 10:59 UTC (permalink / raw)
  To: Eduardo Habkost
  Cc: Radim Krčmář,
	Michael S. Tsirkin, qemu-devel, Peter Xu, Paolo Bonzini,
	Richard Henderson

On Mon, 3 Oct 2016 13:03:33 -0300
Eduardo Habkost <ehabkost@redhat.com> wrote:

> On Fri, Sep 30, 2016 at 06:10:06PM +0200, Radim Krčmář wrote:
> > Every configuration has only up to one APIC class and we'll be extending
> > the class with a function that can be called without an instanced
> > object, so a direct access to the class is convenient.
> > 
> > This patch will break compilation if some code uses apic_get_class()
> > with CONFIG_USER_ONLY.
> > 
> > Suggested-by: Eduardo Habkost <ehabkost@redhat.com>
> > Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
> > ---
> > v2: assert() instead of error_report() and exit() [Peter]
> > v3: completely rewrite the mechanism [Eduardo]
> > 
> > It still looks horrible, so I'll be glad for any advice.
> > And what is CONFIG_USER_ONLY?
> > ---
> >  hw/intc/apic_common.c           |  1 +
> >  include/hw/i386/apic_internal.h |  2 ++
> >  target-i386/cpu.c               | 14 +++++++++++---
> >  3 files changed, 14 insertions(+), 3 deletions(-)
> > 
> > diff --git a/hw/intc/apic_common.c b/hw/intc/apic_common.c
> > index 14ac43c18666..8d01c9c8750e 100644
> > --- a/hw/intc/apic_common.c
> > +++ b/hw/intc/apic_common.c
> > @@ -18,6 +18,7 @@
> >   * License along with this library; if not, see <http://www.gnu.org/licenses/>
> >   */
> >  #include "qemu/osdep.h"
> > +#include "qemu/error-report.h"
> >  #include "qapi/error.h"
> >  #include "qemu-common.h"
> >  #include "cpu.h"
> > diff --git a/include/hw/i386/apic_internal.h b/include/hw/i386/apic_internal.h
> > index 06c4e9f6f95b..286684857e9f 100644
> > --- a/include/hw/i386/apic_internal.h
> > +++ b/include/hw/i386/apic_internal.h
> > @@ -222,4 +222,6 @@ static inline int apic_get_bit(uint32_t *tab, int index)
> >      return !!(tab[i] & mask);
> >  }
> >  
> > +APICCommonClass *apic_get_class(void);
> > +
> >  #endif /* QEMU_APIC_INTERNAL_H */
> > diff --git a/target-i386/cpu.c b/target-i386/cpu.c
> > index 333309b9a70e..6acf9c3c2372 100644
> > --- a/target-i386/cpu.c
> > +++ b/target-i386/cpu.c
> > @@ -2845,9 +2845,8 @@ static void mce_init(X86CPU *cpu)
> >  }
> >  
> >  #ifndef CONFIG_USER_ONLY
> > -static void x86_cpu_apic_create(X86CPU *cpu, Error **errp)
> > +APICCommonClass *apic_get_class(void)
> >  {
> > -    APICCommonState *apic;
> >      const char *apic_type = "apic";
> >  
> >      if (kvm_apic_in_kernel()) {
> > @@ -2856,7 +2855,16 @@ static void x86_cpu_apic_create(X86CPU *cpu, Error **errp)
> >          apic_type = "xen-apic";
> >      }
> >  
> > -    cpu->apic_state = DEVICE(object_new(apic_type));
> > +    return APIC_COMMON_CLASS(object_class_by_name(apic_type));
> > +}
> > +
> > +static void x86_cpu_apic_create(X86CPU *cpu, Error **errp)
> > +{
> > +    APICCommonState *apic;
> > +    ObjectClass *apic_object_class = OBJECT_CLASS(apic_get_class());
> > +
> > +    assert(apic_object_class);
> > +    cpu->apic_state = DEVICE(object_new_with_type(apic_object_class->type));  
[...]

> (e.g. I wonder if we could have a container object for all APICs
> (icc-bus?), and move the send_msi() method and the
> apic_get_class() logic to it).
That's probably not worth all the boilerplate code it would bring.

> 
> >  
> >      object_property_add_child(OBJECT(cpu), "lapic",
> >                                OBJECT(cpu->apic_state), &error_abort);
> > -- 
> > 2.10.0
> >   
> 

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [Qemu-devel] [PATCH v3 2/8] apic: add send_msi() to APICCommonClass
  2016-09-30 16:10 ` [Qemu-devel] [PATCH v3 2/8] apic: add send_msi() to APICCommonClass Radim Krčmář
@ 2016-10-04 11:06   ` Igor Mammedov
  0 siblings, 0 replies; 26+ messages in thread
From: Igor Mammedov @ 2016-10-04 11:06 UTC (permalink / raw)
  To: Radim Krčmář
  Cc: qemu-devel, Peter Xu, Paolo Bonzini, Richard Henderson,
	Eduardo Habkost, Michael S. Tsirkin

On Fri, 30 Sep 2016 18:10:07 +0200
Radim Krčmář <rkrcmar@redhat.com> wrote:

> The MMIO based interface to APIC doesn't work well with MSIs that have
> upper address bits set (remapped x2APIC MSIs).  A specialized interface
> is a quick and dirty way to avoid the shortcoming.
> 
> Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>

> ---
> v2: change apic_send_msi() to accept MSIMessage [Igor]
> ---
>  hw/i386/kvm/apic.c              | 19 +++++++++++++------
>  hw/i386/xen/xen_apic.c          |  6 ++++++
>  hw/intc/apic.c                  |  8 ++++++--
>  include/hw/i386/apic_internal.h |  4 ++++
>  4 files changed, 29 insertions(+), 8 deletions(-)
> 
> diff --git a/hw/i386/kvm/apic.c b/hw/i386/kvm/apic.c
> index c016e63fc2ba..be55102c00ca 100644
> --- a/hw/i386/kvm/apic.c
> +++ b/hw/i386/kvm/apic.c
> @@ -169,6 +169,17 @@ static void kvm_apic_external_nmi(APICCommonState *s)
>      run_on_cpu(CPU(s->cpu), do_inject_external_nmi, s);
>  }
>  
> +static void kvm_send_msi(MSIMessage *msg)
> +{
> +    int ret;
> +
> +    ret = kvm_irqchip_send_msi(kvm_state, *msg);
> +    if (ret < 0) {
> +        fprintf(stderr, "KVM: injection failed, MSI lost (%s)\n",
> +                strerror(-ret));
> +    }
> +}
> +
>  static uint64_t kvm_apic_mem_read(void *opaque, hwaddr addr,
>                                    unsigned size)
>  {
> @@ -179,13 +190,8 @@ static void kvm_apic_mem_write(void *opaque, hwaddr addr,
>                                 uint64_t data, unsigned size)
>  {
>      MSIMessage msg = { .address = addr, .data = data };
> -    int ret;
>  
> -    ret = kvm_irqchip_send_msi(kvm_state, msg);
> -    if (ret < 0) {
> -        fprintf(stderr, "KVM: injection failed, MSI lost (%s)\n",
> -                strerror(-ret));
> -    }
> +    kvm_send_msi(&msg);
>  }
>  
>  static const MemoryRegionOps kvm_apic_io_ops = {
> @@ -232,6 +238,7 @@ static void kvm_apic_class_init(ObjectClass *klass, void *data)
>      k->enable_tpr_reporting = kvm_apic_enable_tpr_reporting;
>      k->vapic_base_update = kvm_apic_vapic_base_update;
>      k->external_nmi = kvm_apic_external_nmi;
> +    k->send_msi = kvm_send_msi;
>  }
>  
>  static const TypeInfo kvm_apic_info = {
> diff --git a/hw/i386/xen/xen_apic.c b/hw/i386/xen/xen_apic.c
> index 21d68ee04b0a..55769eba7ede 100644
> --- a/hw/i386/xen/xen_apic.c
> +++ b/hw/i386/xen/xen_apic.c
> @@ -68,6 +68,11 @@ static void xen_apic_external_nmi(APICCommonState *s)
>  {
>  }
>  
> +static void xen_send_msi(MSIMessage *msi)
> +{
> +    xen_hvm_inject_msi(msi->address, msi->data);
> +}
> +
>  static void xen_apic_class_init(ObjectClass *klass, void *data)
>  {
>      APICCommonClass *k = APIC_COMMON_CLASS(klass);
> @@ -78,6 +83,7 @@ static void xen_apic_class_init(ObjectClass *klass, void *data)
>      k->get_tpr = xen_apic_get_tpr;
>      k->vapic_base_update = xen_apic_vapic_base_update;
>      k->external_nmi = xen_apic_external_nmi;
> +    k->send_msi = xen_send_msi;
>  }
>  
>  static const TypeInfo xen_apic_info = {
> diff --git a/hw/intc/apic.c b/hw/intc/apic.c
> index 7bd1d279c463..fe15fb602473 100644
> --- a/hw/intc/apic.c
> +++ b/hw/intc/apic.c
> @@ -740,8 +740,10 @@ static uint32_t apic_mem_readl(void *opaque, hwaddr addr)
>      return val;
>  }
>  
> -static void apic_send_msi(hwaddr addr, uint32_t data)
> +static void apic_send_msi(MSIMessage *msi)
>  {
> +    uint64_t addr = msi->address;
> +    uint32_t data = msi->data;
>      uint8_t dest = (addr & MSI_ADDR_DEST_ID_MASK) >> MSI_ADDR_DEST_ID_SHIFT;
>      uint8_t vector = (data & MSI_DATA_VECTOR_MASK) >> MSI_DATA_VECTOR_SHIFT;
>      uint8_t dest_mode = (addr >> MSI_ADDR_DEST_MODE_SHIFT) & 0x1;
> @@ -762,7 +764,8 @@ static void apic_mem_writel(void *opaque, hwaddr addr, uint32_t val)
>           * APIC is connected directly to the CPU.
>           * Mapping them on the global bus happens to work because
>           * MSI registers are reserved in APIC MMIO and vice versa. */
> -        apic_send_msi(addr, val);
> +        MSIMessage msi = { .address = addr, .data = val };
> +        apic_send_msi(&msi);
>          return;
>      }
>  
> @@ -913,6 +916,7 @@ static void apic_class_init(ObjectClass *klass, void *data)
>      k->external_nmi = apic_external_nmi;
>      k->pre_save = apic_pre_save;
>      k->post_load = apic_post_load;
> +    k->send_msi = apic_send_msi;
>  }
>  
>  static const TypeInfo apic_info = {
> diff --git a/include/hw/i386/apic_internal.h b/include/hw/i386/apic_internal.h
> index 286684857e9f..cdd11fb0938f 100644
> --- a/include/hw/i386/apic_internal.h
> +++ b/include/hw/i386/apic_internal.h
> @@ -146,6 +146,10 @@ typedef struct APICCommonClass
>      void (*pre_save)(APICCommonState *s);
>      void (*post_load)(APICCommonState *s);
>      void (*reset)(APICCommonState *s);
> +    /* send_msi emulates an APIC bus and its proper place would be in a new
> +     * device, but it's convenient to have it here for now.
> +     */
> +    void (*send_msi)(MSIMessage *msi);
>  } APICCommonClass;
>  
>  struct APICCommonState {

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [Qemu-devel] [PATCH v3 3/8] intel_iommu: pass whole remapped addresses to apic
  2016-09-30 16:10 ` [Qemu-devel] [PATCH v3 3/8] intel_iommu: pass whole remapped addresses to apic Radim Krčmář
@ 2016-10-04 11:17   ` Igor Mammedov
  2016-10-08  5:24     ` Peter Xu
  0 siblings, 1 reply; 26+ messages in thread
From: Igor Mammedov @ 2016-10-04 11:17 UTC (permalink / raw)
  To: Radim Krčmář
  Cc: qemu-devel, Peter Xu, Paolo Bonzini, Richard Henderson,
	Eduardo Habkost, Michael S. Tsirkin

On Fri, 30 Sep 2016 18:10:08 +0200
Radim Krčmář <rkrcmar@redhat.com> wrote:

> The MMIO interface to APIC only allowed 8 bit addresses, which is not
> enough for 32 bit addresses from EIM remapping.
> Intel stored upper 24 bits in the high MSI address, so use the same
> technique. The technique is also used in KVM MSI interface.
> Other APICs are unlikely to handle those upper bits.
> 
> Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
> ---
> v2: fix build with enabled DEBUG_INTEL_IOMMU [Peter]
> ---
>  hw/i386/intel_iommu.c | 21 +++++++++------------
>  1 file changed, 9 insertions(+), 12 deletions(-)
> 
> diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c
> index 9f4e64af1ad5..c39b62b898d8 100644
> --- a/hw/i386/intel_iommu.c
> +++ b/hw/i386/intel_iommu.c
> @@ -31,6 +31,7 @@
>  #include "hw/i386/x86-iommu.h"
>  #include "hw/pci-host/q35.h"
>  #include "sysemu/kvm.h"
> +#include "hw/i386/apic_internal.h"
>  
>  /*#define DEBUG_INTEL_IOMMU*/
>  #ifdef DEBUG_INTEL_IOMMU
> @@ -279,18 +280,17 @@ static void vtd_update_iotlb(IntelIOMMUState *s, uint16_t source_id,
>  static void vtd_generate_interrupt(IntelIOMMUState *s, hwaddr mesg_addr_reg,
>                                     hwaddr mesg_data_reg)
>  {
> -    hwaddr addr;
> -    uint32_t data;
> +    MSIMessage msi;
>  
>      assert(mesg_data_reg < DMAR_REG_SIZE);
>      assert(mesg_addr_reg < DMAR_REG_SIZE);
>  
> -    addr = vtd_get_long_raw(s, mesg_addr_reg);
> -    data = vtd_get_long_raw(s, mesg_data_reg);
> +    msi.address = vtd_get_long_raw(s, mesg_addr_reg);
> +    msi.data = vtd_get_long_raw(s, mesg_data_reg);
>  
> -    VTD_DPRINTF(FLOG, "msi: addr 0x%"PRIx64 " data 0x%"PRIx32, addr, data);
> -    address_space_stl_le(&address_space_memory, addr, data,
> -                         MEMTXATTRS_UNSPECIFIED, NULL);
> +    VTD_DPRINTF(FLOG, "msi: addr 0x%"PRIx64 " data 0x%"PRIx32,
> +                msi.address, msi.data);
> +    apic_get_class()->send_msi(&msi);
>  }
>  
>  /* Generate a fault event to software via MSI if conditions are met.
> @@ -2133,6 +2133,7 @@ static void vtd_generate_msi_message(VTDIrq *irq, MSIMessage *msg_out)
>      msg.dest_mode = irq->dest_mode;
>      msg.redir_hint = irq->redir_hint;
>      msg.dest = irq->dest;
> +    msg.__addr_hi = irq->dest & 0xffffff00;
what about BE host? should it be:

 msg.__addr_hi = cpu_to_le32(irq->dest & 0xffffff00)

Also question to Peter, why __addr_hi is not HOST_WORDS_BIGENDIAN conditioned?
now we have:
struct VTD_MSIMessage {                                                          
    union {                                                                      
        struct {                                                                 
#ifdef HOST_WORDS_BIGENDIAN                                                      
            uint32_t __addr_head:12; /* 0xfee */                                 
[...]                                              
#else                                                                            
[...]
            uint32_t __addr_head:12; /* 0xfee */                                 
#endif                                                                           
            uint32_t __addr_hi:32; 


>      msg.__addr_head = cpu_to_le32(0xfee);
>      /* Keep this from original MSI address bits */
>      msg.__not_used = irq->msi_addr_last_bits;
> @@ -2281,11 +2282,7 @@ static MemTxResult vtd_mem_ir_write(void *opaque, hwaddr addr,
>                  " for device sid 0x%04x",
>                  to.address, to.data, sid);
>  
> -    if (dma_memory_write(&address_space_memory, to.address,
> -                         &to.data, size)) {
> -        VTD_DPRINTF(GENERAL, "error: fail to write 0x%"PRIx64
> -                    " value 0x%"PRIx32, to.address, to.data);
> -    }
> +    apic_get_class()->send_msi(&to);
>  
>      return MEMTX_OK;
>  }

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [Qemu-devel] [PATCH v3 8/8] target-i386/kvm: cache the return value of kvm_enable_x2apic()
  2016-09-30 16:10 ` [Qemu-devel] [PATCH v3 8/8] target-i386/kvm: cache the return value of kvm_enable_x2apic() Radim Krčmář
@ 2016-10-04 11:33   ` Igor Mammedov
  2016-10-04 13:45     ` Radim Krčmář
  0 siblings, 1 reply; 26+ messages in thread
From: Igor Mammedov @ 2016-10-04 11:33 UTC (permalink / raw)
  To: Radim Krčmář
  Cc: qemu-devel, Peter Xu, Paolo Bonzini, Richard Henderson,
	Eduardo Habkost, Michael S. Tsirkin

On Fri, 30 Sep 2016 18:10:13 +0200
Radim Krčmář <rkrcmar@redhat.com> wrote:

> Assume that KVM would have returned the same on subsequent runs.
> Abstract the memoizaiton pattern into macros.
s/memoi/memori/i
Throughout whole patch

> 
> Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
> ---
>  target-i386/kvm.c | 21 +++++++++++++++++++--
>  1 file changed, 19 insertions(+), 2 deletions(-)
> 
> diff --git a/target-i386/kvm.c b/target-i386/kvm.c
> index 0fd664648665..113c5bf058ba 100644
> --- a/target-i386/kvm.c
> +++ b/target-i386/kvm.c
> @@ -129,10 +129,27 @@ static bool kvm_x2apic_api_set_flags(uint64_t flags)
>      return !kvm_vm_enable_cap(s, KVM_CAP_X2APIC_API, 0, flags);
>  }
>  
> +#define MEMOIZE_RESULT(_result, _fn) \
> +    ({ \
> +        static bool _memoized; \
> +        if (_memoized) { \
> +            return _result; \
> +        } \
> +        _memoized = true; \
> +        _result = _fn; \
> +    })
> +
> +#define MEMOIZE(_fn) \
> +    ({ \
> +        static typeof(_fn) _result; \
> +        MEMOIZE_RESULT(_result, _fn); \
> +    })
> +
>  bool kvm_enable_x2apic(void)
>  {
> -    return kvm_x2apic_api_set_flags(KVM_X2APIC_API_USE_32BIT_IDS |
> -                                    KVM_X2APIC_API_DISABLE_BROADCAST_QUIRK);
> +    return MEMOIZE(
> +             kvm_x2apic_api_set_flags(KVM_X2APIC_API_USE_32BIT_IDS |
> +                                      KVM_X2APIC_API_DISABLE_BROADCAST_QUIRK));
>  }
>  
>  static int kvm_get_tsc(CPUState *cs)

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [Qemu-devel] [PATCH v3 4/8] intel_iommu: redo configuraton check in realize
  2016-09-30 16:10 ` [Qemu-devel] [PATCH v3 4/8] intel_iommu: redo configuraton check in realize Radim Krčmář
@ 2016-10-04 11:40   ` Igor Mammedov
  0 siblings, 0 replies; 26+ messages in thread
From: Igor Mammedov @ 2016-10-04 11:40 UTC (permalink / raw)
  To: Radim Krčmář
  Cc: qemu-devel, Eduardo Habkost, Michael S. Tsirkin, Peter Xu,
	Paolo Bonzini, Richard Henderson

On Fri, 30 Sep 2016 18:10:09 +0200
Radim Krčmář <rkrcmar@redhat.com> wrote:

> * there no point in configuring the device if realization is going to
>   fail, so move the check to the beginning,
> * create a separate function for the check,
> * use error_setg() instead error_report().
Reviewed-by: Igor Mammedov <imammedo@redhat.com>


> Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
> ---
> v3:
> * use error_setg [Paolo]
> * create a new function [Peter]
> ---
>  hw/i386/intel_iommu.c | 26 ++++++++++++++++++--------
>  1 file changed, 18 insertions(+), 8 deletions(-)
> 
> diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c
> index c39b62b898d8..5b06b4091f36 100644
> --- a/hw/i386/intel_iommu.c
> +++ b/hw/i386/intel_iommu.c
> @@ -21,6 +21,7 @@
>  
>  #include "qemu/osdep.h"
>  #include "qemu/error-report.h"
> +#include "qapi/error.h"
>  #include "hw/sysbus.h"
>  #include "exec/address-spaces.h"
>  #include "intel_iommu_internal.h"
> @@ -2448,6 +2449,18 @@ static AddressSpace *vtd_host_dma_iommu(PCIBus *bus, void *opaque, int devfn)
>      return &vtd_as->as;
>  }
>  
> +static bool vtd_check_config(X86IOMMUState *x86_iommu, Error **errp)
> +{
> +    /* Currently Intel IOMMU IR only support "kernel-irqchip={off|split}" */
> +    if (x86_iommu->intr_supported && kvm_irqchip_in_kernel() &&
> +        !kvm_irqchip_is_split()) {
> +        error_setg(errp, "Intel Interrupt Remapping cannot work with "
> +                         "kernel-irqchip=on, please use 'split|off'.");
> +        return false;
> +    }
> +    return true;
> +}
> +
>  static void vtd_realize(DeviceState *dev, Error **errp)
>  {
>      PCMachineState *pcms = PC_MACHINE(qdev_get_machine());
> @@ -2457,6 +2470,11 @@ static void vtd_realize(DeviceState *dev, Error **errp)
>  
>      VTD_DPRINTF(GENERAL, "");
>      x86_iommu->type = TYPE_INTEL;
> +
> +    if (!vtd_check_config(x86_iommu, errp)) {
> +        return;
> +    }
> +
>      memset(s->vtd_as_by_bus_num, 0, sizeof(s->vtd_as_by_bus_num));
>      memory_region_init_io(&s->csrmem, OBJECT(s), &vtd_mem_ops, s,
>                            "intel_iommu", DMAR_REG_SIZE);
> @@ -2471,14 +2489,6 @@ static void vtd_realize(DeviceState *dev, Error **errp)
>      pci_setup_iommu(bus, vtd_host_dma_iommu, dev);
>      /* Pseudo address space under root PCI bus. */
>      pcms->ioapic_as = vtd_host_dma_iommu(bus, s, Q35_PSEUDO_DEVFN_IOAPIC);
> -
> -    /* Currently Intel IOMMU IR only support "kernel-irqchip={off|split}" */
> -    if (x86_iommu->intr_supported && kvm_irqchip_in_kernel() &&
> -        !kvm_irqchip_is_split()) {
> -        error_report("Intel Interrupt Remapping cannot work with "
> -                     "kernel-irqchip=on, please use 'split|off'.");
> -        exit(1);
> -    }
>  }
>  
>  static void vtd_class_init(ObjectClass *klass, void *data)

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [Qemu-devel] [PATCH v3 7/8] intel_iommu: keep buggy EIM enabled in 2.7 machine type
  2016-09-30 16:10 ` [Qemu-devel] [PATCH v3 7/8] intel_iommu: keep buggy EIM enabled in 2.7 machine type Radim Krčmář
@ 2016-10-04 12:18   ` Igor Mammedov
  2016-10-04 13:48     ` Radim Krčmář
  0 siblings, 1 reply; 26+ messages in thread
From: Igor Mammedov @ 2016-10-04 12:18 UTC (permalink / raw)
  To: Radim Krčmář
  Cc: qemu-devel, Peter Xu, Paolo Bonzini, Richard Henderson,
	Eduardo Habkost, Michael S. Tsirkin

On Fri, 30 Sep 2016 18:10:12 +0200
Radim Krčmář <rkrcmar@redhat.com> wrote:

> QEMU 2.7 allowed EIM even in configurations that were forbidden in the
> last patch because they were not working, like old KVM or userspace
> APIC.  In order to keep backward compatibility, we again allow guests to
> misbehave in non-obvious ways, and make it the default.
> 
> Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
> ---
> v3: shorten the code [Peter]
> ---
>  hw/i386/intel_iommu.c | 12 +++++++-----
>  hw/i386/pc_q35.c      |  2 ++
>  include/hw/i386/pc.h  |  2 ++
>  3 files changed, 11 insertions(+), 5 deletions(-)
> 
> diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c
> index efb018b85544..13a2ac5a8fb9 100644
> --- a/hw/i386/intel_iommu.c
> +++ b/hw/i386/intel_iommu.c
> @@ -2456,9 +2456,11 @@ static AddressSpace *vtd_host_dma_iommu(PCIBus *bus, void *opaque, int devfn)
>      return &vtd_as->as;
>  }
>  
> -static bool vtd_decide_config(IntelIOMMUState *s, Error **errp)
> +static bool vtd_decide_config(IntelIOMMUState *s, PCMachineState *pcms,
> +                              Error **errp)
>  {
>      X86IOMMUState *x86_iommu = X86_IOMMU_DEVICE(s);
> +    PCMachineClass *pcmc = PC_MACHINE_CLASS(MACHINE_GET_CLASS(pcms));
It's a bit backwards, for devices we usually add a property to affected device
and use compat macros to do the job. As example look at
commit 048a2e8869cb7e26013e40d860c9ebdf8e28c2ac
    x86: ioapic: boost default version to 0x20


>  
>      /* Currently Intel IOMMU IR only support "kernel-irqchip={off|split}" */
>      if (x86_iommu->intr_supported && kvm_irqchip_in_kernel() &&
> @@ -2473,11 +2475,11 @@ static bool vtd_decide_config(IntelIOMMUState *s, Error **errp)
>      }
>  
>      if (s->intr_eim == ON_OFF_AUTO_AUTO) {
> -        s->intr_eim = x86_iommu->intr_supported && kvm_irqchip_in_kernel() ?
> +        s->intr_eim = (kvm_irqchip_in_kernel() || pcmc->buggy_intel_iommu_eim)
> +                      && x86_iommu->intr_supported ?
>                                                ON_OFF_AUTO_ON : ON_OFF_AUTO_OFF;
>      }
> -
> -    if (s->intr_eim == ON_OFF_AUTO_ON) {
> +    if (s->intr_eim == ON_OFF_AUTO_ON && !pcmc->buggy_intel_iommu_eim) {
>          if (kvm_irqchip_in_kernel() && !kvm_enable_x2apic()) {
>              error_setg(errp, "eim=on requires support on the KVM side"
>                               "(X2APIC_API, first shipped in v4.7)");
> @@ -2502,7 +2504,7 @@ static void vtd_realize(DeviceState *dev, Error **errp)
>      VTD_DPRINTF(GENERAL, "");
>      x86_iommu->type = TYPE_INTEL;
>  
> -    if (!vtd_decide_config(s, errp)) {
> +    if (!vtd_decide_config(s, pcms, errp)) {
>          return;
>      }
>  
> diff --git a/hw/i386/pc_q35.c b/hw/i386/pc_q35.c
> index 0b214f24c977..97f419fbf4dd 100644
> --- a/hw/i386/pc_q35.c
> +++ b/hw/i386/pc_q35.c
> @@ -304,8 +304,10 @@ DEFINE_Q35_MACHINE(v2_8, "pc-q35-2.8", NULL,
>  
>  static void pc_q35_2_7_machine_options(MachineClass *m)
>  {
> +    PCMachineClass *pcmc = PC_MACHINE_CLASS(m);
>      pc_q35_2_8_machine_options(m);
>      m->alias = NULL;
> +    pcmc->buggy_intel_iommu_eim = true;
>      SET_MACHINE_COMPAT(m, PC_COMPAT_2_7);
>  }
>  
> diff --git a/include/hw/i386/pc.h b/include/hw/i386/pc.h
> index 47bdf10cfd9b..4bd435f8fa5c 100644
> --- a/include/hw/i386/pc.h
> +++ b/include/hw/i386/pc.h
> @@ -143,6 +143,8 @@ struct PCMachineClass {
>      bool save_tsc_khz;
>      /* generate legacy CPU hotplug AML */
>      bool legacy_cpu_hotplug;
> +    /* enable buggy Intel-IOMMU EIM by default */
> +    bool buggy_intel_iommu_eim;
>  };
>  
>  #define TYPE_PC_MACHINE "generic-pc-machine"

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [Qemu-devel] [PATCH v3 5/8] intel_iommu: add OnOffAuto intr_eim as "eim" property
  2016-09-30 16:10 ` [Qemu-devel] [PATCH v3 5/8] intel_iommu: add OnOffAuto intr_eim as "eim" property Radim Krčmář
@ 2016-10-04 12:34   ` Igor Mammedov
  0 siblings, 0 replies; 26+ messages in thread
From: Igor Mammedov @ 2016-10-04 12:34 UTC (permalink / raw)
  To: Radim Krčmář
  Cc: qemu-devel, Peter Xu, Paolo Bonzini, Richard Henderson,
	Eduardo Habkost, Michael S. Tsirkin

On Fri, 30 Sep 2016 18:10:10 +0200
Radim Krčmář <rkrcmar@redhat.com> wrote:

> The default (auto) emulates the current behavior.
> 
> Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>

> ---
> v3:
>  * use error_setg [Paolo]
>  * shorten the code [Peter]
> ---
>  hw/i386/intel_iommu.c         | 24 +++++++++++++++++++++---
>  include/hw/i386/intel_iommu.h |  1 +
>  2 files changed, 22 insertions(+), 3 deletions(-)
> 
> diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c
> index 5b06b4091f36..17892b8c336b 100644
> --- a/hw/i386/intel_iommu.c
> +++ b/hw/i386/intel_iommu.c
> @@ -2012,6 +2012,8 @@ static const MemoryRegionOps vtd_mem_ops = {
>  
>  static Property vtd_properties[] = {
>      DEFINE_PROP_UINT32("version", IntelIOMMUState, version, 0),
> +    DEFINE_PROP_ON_OFF_AUTO("eim", IntelIOMMUState, intr_eim,
> +                            ON_OFF_AUTO_AUTO),
>      DEFINE_PROP_END_OF_LIST(),
>  };
>  
> @@ -2368,7 +2370,11 @@ static void vtd_init(IntelIOMMUState *s)
>      s->ecap = VTD_ECAP_QI | VTD_ECAP_IRO;
>  
>      if (x86_iommu->intr_supported) {
> -        s->ecap |= VTD_ECAP_IR | VTD_ECAP_EIM | VTD_ECAP_MHMV;
> +        s->ecap |= VTD_ECAP_IR | VTD_ECAP_MHMV;
> +        if (s->intr_eim == ON_OFF_AUTO_ON) {
> +            s->ecap |= VTD_ECAP_EIM;
> +        }
> +        assert(s->intr_eim != ON_OFF_AUTO_AUTO);
>      }
>  
>      vtd_reset_context_cache(s);
> @@ -2449,8 +2455,10 @@ static AddressSpace *vtd_host_dma_iommu(PCIBus *bus, void *opaque, int devfn)
>      return &vtd_as->as;
>  }
>  
> -static bool vtd_check_config(X86IOMMUState *x86_iommu, Error **errp)
> +static bool vtd_decide_config(IntelIOMMUState *s, Error **errp)
>  {
> +    X86IOMMUState *x86_iommu = X86_IOMMU_DEVICE(s);
> +
>      /* Currently Intel IOMMU IR only support "kernel-irqchip={off|split}" */
>      if (x86_iommu->intr_supported && kvm_irqchip_in_kernel() &&
>          !kvm_irqchip_is_split()) {
> @@ -2458,6 +2466,16 @@ static bool vtd_check_config(X86IOMMUState *x86_iommu, Error **errp)
>                           "kernel-irqchip=on, please use 'split|off'.");
>          return false;
>      }
> +    if (s->intr_eim == ON_OFF_AUTO_ON && !x86_iommu->intr_supported) {
> +        error_setg(errp, "eim=on cannot be selected without intremap=on");
> +        return false;
> +    }
> +
> +    if (s->intr_eim == ON_OFF_AUTO_AUTO) {
> +        s->intr_eim = x86_iommu->intr_supported ?
> +                                              ON_OFF_AUTO_ON : ON_OFF_AUTO_OFF;
> +    }
> +
>      return true;
>  }
>  
> @@ -2471,7 +2489,7 @@ static void vtd_realize(DeviceState *dev, Error **errp)
>      VTD_DPRINTF(GENERAL, "");
>      x86_iommu->type = TYPE_INTEL;
>  
> -    if (!vtd_check_config(x86_iommu, errp)) {
> +    if (!vtd_decide_config(s, errp)) {
>          return;
>      }
>  
> diff --git a/include/hw/i386/intel_iommu.h b/include/hw/i386/intel_iommu.h
> index a42dbd745a70..b5ac60927b1f 100644
> --- a/include/hw/i386/intel_iommu.h
> +++ b/include/hw/i386/intel_iommu.h
> @@ -289,6 +289,7 @@ struct IntelIOMMUState {
>      dma_addr_t intr_root;           /* Interrupt remapping table pointer */
>      uint32_t intr_size;             /* Number of IR table entries */
>      bool intr_eime;                 /* Extended interrupt mode enabled */
> +    OnOffAuto intr_eim;             /* Toggle for EIM cabability */
>  };
>  
>  /* Find the VTD Address space associated with the given bus pointer,

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [Qemu-devel] [PATCH v3 1/8] apic: add global apic_get_class()
  2016-10-03 16:03   ` Eduardo Habkost
  2016-10-04 10:59     ` Igor Mammedov
@ 2016-10-04 13:38     ` Radim Krčmář
  2016-10-04 16:14       ` Eduardo Habkost
  1 sibling, 1 reply; 26+ messages in thread
From: Radim Krčmář @ 2016-10-04 13:38 UTC (permalink / raw)
  To: Eduardo Habkost
  Cc: Michael S. Tsirkin, qemu-devel, Peter Xu, Paolo Bonzini,
	Igor Mammedov, Richard Henderson

2016-10-03 13:03-0300, Eduardo Habkost:
> On Fri, Sep 30, 2016 at 06:10:06PM +0200, Radim Krčmář wrote:
>> Every configuration has only up to one APIC class and we'll be extending
>> the class with a function that can be called without an instanced
>> object, so a direct access to the class is convenient.
>> 
>> This patch will break compilation if some code uses apic_get_class()
>> with CONFIG_USER_ONLY.
>> 
>> Suggested-by: Eduardo Habkost <ehabkost@redhat.com>
>> Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
>> ---
>> v2: assert() instead of error_report() and exit() [Peter]
>> v3: completely rewrite the mechanism [Eduardo]
>> 
>> It still looks horrible, so I'll be glad for any advice.
>> And what is CONFIG_USER_ONLY?
>> ---
>>  hw/intc/apic_common.c           |  1 +
>>  include/hw/i386/apic_internal.h |  2 ++
>>  target-i386/cpu.c               | 14 +++++++++++---
>>  3 files changed, 14 insertions(+), 3 deletions(-)
>> 
>> diff --git a/hw/intc/apic_common.c b/hw/intc/apic_common.c
>> @@ -2856,7 +2855,16 @@ static void x86_cpu_apic_create(X86CPU *cpu, Error **errp)
>>          apic_type = "xen-apic";
>>      }
>>  
>> -    cpu->apic_state = DEVICE(object_new(apic_type));
>> +    return APIC_COMMON_CLASS(object_class_by_name(apic_type));
>> +}
>> +
>> +static void x86_cpu_apic_create(X86CPU *cpu, Error **errp)
>> +{
>> +    APICCommonState *apic;
>> +    ObjectClass *apic_object_class = OBJECT_CLASS(apic_get_class());
>> +
>> +    assert(apic_object_class);
>> +    cpu->apic_state = DEVICE(object_new_with_type(apic_object_class->type));
> 
> ObjectClass::type is private. I believe the common idiom is
> object_new(object_class_get_name(c)).

Will fix in v4.

object_class_get_name() loses information about the type, so
object_new() has to look it up from the name, which is unnecessary.

Should I follow with a patch/series that adds

  Object *object_new_with_class(ObjectClass *class)
  {
  	return object_new_with_type(class->type);
  }

into qom/object.[ch] and replaces occurrences of
object_new_with_type(object_class_get_name()) with it?

> Except for that, I believe the interface is OK and matches the
> existing logic. We can always make it better later, if
> appropriate.

Thanks.

> (e.g. I wonder if we could have a container object for all APICs
> (icc-bus?), and move the send_msi() method and the
> apic_get_class() logic to it).

Yes, and a QEMU-specific bus for interrupts seems nicer that going the
hardware-emulation route and implementing FSB for q35 and APIC bus for
fx440.

I think that the MMIO area we have should only work from PCI devices
(not when the CPU writes to the area), so it would be a step towards
hardware-like behavior as well.

I don't think it would achieve negative diffstat, though ...

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [Qemu-devel] [PATCH v3 6/8] intel_iommu: reject broken EIM
  2016-09-30 16:10 ` [Qemu-devel] [PATCH v3 6/8] intel_iommu: reject broken EIM Radim Krčmář
@ 2016-10-04 13:43   ` Igor Mammedov
  0 siblings, 0 replies; 26+ messages in thread
From: Igor Mammedov @ 2016-10-04 13:43 UTC (permalink / raw)
  To: Radim Krčmář
  Cc: qemu-devel, Peter Xu, Paolo Bonzini, Richard Henderson,
	Eduardo Habkost, Michael S. Tsirkin

On Fri, 30 Sep 2016 18:10:11 +0200
Radim Krčmář <rkrcmar@redhat.com> wrote:

> Cluster x2APIC cannot work without KVM's x2apic API when the maximal
> APIC ID is greater than 8 and only KVM's LAPIC can support x2APIC, so we
> forbid other APICs and also the old KVM case with less than 9, to
> simplify the code.
> 
> There is no point in enabling EIM in forbidden APICs, so we keep it
> enabled only for the KVM APIC;  unconditionally, because making the
> option depend on KVM version would be a maintanance burden.
Probably it would be good to mention here that x2APIC API is enabled
if KVM's apic is used and kernel supports x2APIC API + DISABLE_BROADCAST_QUIRK,
and doing that enabling fixes error:

   KVM: injection failed, MSI lost (Operation not permitted)

when guest is started with: kernel-irqchip=split + intremap=on,[eim=on]

> 
> Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
> ---
> v3:
>  * use error_setg [Paolo]
>  * shorten the code [Peter]
> v2:
>  * adapt to new intr_eim parameter
>  * provide first linux version that has x2apic api
>  * disable QEMU's LAPIC
> ---
>  hw/i386/intel_iommu.c  | 15 ++++++++++++++-
>  target-i386/kvm-stub.c |  5 +++++
>  target-i386/kvm.c      | 13 +++++++++++++
>  target-i386/kvm_i386.h |  1 +
>  4 files changed, 33 insertions(+), 1 deletion(-)
> 
> diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c
> index 17892b8c336b..efb018b85544 100644
> --- a/hw/i386/intel_iommu.c
> +++ b/hw/i386/intel_iommu.c
> @@ -33,6 +33,7 @@
>  #include "hw/pci-host/q35.h"
>  #include "sysemu/kvm.h"
>  #include "hw/i386/apic_internal.h"
> +#include "kvm_i386.h"
>  
>  /*#define DEBUG_INTEL_IOMMU*/
>  #ifdef DEBUG_INTEL_IOMMU
> @@ -2472,10 +2473,22 @@ static bool vtd_decide_config(IntelIOMMUState *s, Error **errp)
>      }
>  
>      if (s->intr_eim == ON_OFF_AUTO_AUTO) {
> -        s->intr_eim = x86_iommu->intr_supported ?
> +        s->intr_eim = x86_iommu->intr_supported && kvm_irqchip_in_kernel() ?
>                                                ON_OFF_AUTO_ON : ON_OFF_AUTO_OFF;
>      }
>  
> +    if (s->intr_eim == ON_OFF_AUTO_ON) {
> +        if (kvm_irqchip_in_kernel() && !kvm_enable_x2apic()) {
> +            error_setg(errp, "eim=on requires support on the KVM side"
> +                             "(X2APIC_API, first shipped in v4.7)");
> +            return false;
> +        }
> +        if (!kvm_irqchip_in_kernel()) {
> +            error_setg(errp, "eim=on requires accel=kvm,kernel-irqchip=split");
> +            return false;
> +        }
> +    }
> +
>      return true;
>  }
>  
> diff --git a/target-i386/kvm-stub.c b/target-i386/kvm-stub.c
> index cdf15061091d..bda4dc2f0c57 100644
> --- a/target-i386/kvm-stub.c
> +++ b/target-i386/kvm-stub.c
> @@ -25,6 +25,11 @@ bool kvm_has_smm(void)
>      return 1;
>  }
>  
> +bool kvm_enable_x2apic(void)
> +{
> +    return false;
> +}
> +
>  /* This function is only called inside conditionals which we
>   * rely on the compiler to optimize out when CONFIG_KVM is not
>   * defined.
> diff --git a/target-i386/kvm.c b/target-i386/kvm.c
> index ee1f53e56953..0fd664648665 100644
> --- a/target-i386/kvm.c
> +++ b/target-i386/kvm.c
> @@ -122,6 +122,19 @@ bool kvm_allows_irq0_override(void)
>      return !kvm_irqchip_in_kernel() || kvm_has_gsi_routing();
>  }
>  
> +static bool kvm_x2apic_api_set_flags(uint64_t flags)
> +{
> +    KVMState *s = KVM_STATE(current_machine->accelerator);
> +
> +    return !kvm_vm_enable_cap(s, KVM_CAP_X2APIC_API, 0, flags);
> +}
> +
> +bool kvm_enable_x2apic(void)
> +{
> +    return kvm_x2apic_api_set_flags(KVM_X2APIC_API_USE_32BIT_IDS |
> +                                    KVM_X2APIC_API_DISABLE_BROADCAST_QUIRK);
> +}
> +
>  static int kvm_get_tsc(CPUState *cs)
>  {
>      X86CPU *cpu = X86_CPU(cs);
> diff --git a/target-i386/kvm_i386.h b/target-i386/kvm_i386.h
> index 36407e0a5dc7..5c369b1c5b40 100644
> --- a/target-i386/kvm_i386.h
> +++ b/target-i386/kvm_i386.h
> @@ -43,4 +43,5 @@ int kvm_device_msix_deassign(KVMState *s, uint32_t dev_id);
>  
>  void kvm_put_apicbase(X86CPU *cpu, uint64_t value);
>  
> +bool kvm_enable_x2apic(void);
>  #endif

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [Qemu-devel] [PATCH v3 8/8] target-i386/kvm: cache the return value of kvm_enable_x2apic()
  2016-10-04 11:33   ` Igor Mammedov
@ 2016-10-04 13:45     ` Radim Krčmář
  0 siblings, 0 replies; 26+ messages in thread
From: Radim Krčmář @ 2016-10-04 13:45 UTC (permalink / raw)
  To: Igor Mammedov
  Cc: qemu-devel, Peter Xu, Paolo Bonzini, Richard Henderson,
	Eduardo Habkost, Michael S. Tsirkin

2016-10-04 13:33+0200, Igor Mammedov:
> On Fri, 30 Sep 2016 18:10:13 +0200
> Radim Krčmář <rkrcmar@redhat.com> wrote:
>> Assume that KVM would have returned the same on subsequent runs.
>> Abstract the memoizaiton pattern into macros.
> s/memoi/memori/i
> Throughout whole patch

The pattern I wanted to describe is known as "memoization" without r,
due to reasons, https://en.wikipedia.org/wiki/Memoization, but I was
expecting the whole idea to be nacked, so the naming didn't get much
thought. :)  I'll change it memorize, to create less confusion.
(Memoizing a non-pure "function" is hacky anyway.)

Thanks.

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [Qemu-devel] [PATCH v3 7/8] intel_iommu: keep buggy EIM enabled in 2.7 machine type
  2016-10-04 12:18   ` Igor Mammedov
@ 2016-10-04 13:48     ` Radim Krčmář
  0 siblings, 0 replies; 26+ messages in thread
From: Radim Krčmář @ 2016-10-04 13:48 UTC (permalink / raw)
  To: Igor Mammedov
  Cc: Eduardo Habkost, Michael S. Tsirkin, qemu-devel, Peter Xu,
	Paolo Bonzini, Richard Henderson

2016-10-04 14:18+0200, Igor Mammedov:
> On Fri, 30 Sep 2016 18:10:12 +0200
> Radim Krčmář <rkrcmar@redhat.com> wrote:
>> QEMU 2.7 allowed EIM even in configurations that were forbidden in the
>> last patch because they were not working, like old KVM or userspace
>> APIC.  In order to keep backward compatibility, we again allow guests to
>> misbehave in non-obvious ways, and make it the default.
>> 
>> Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
>> ---
>> v3: shorten the code [Peter]
>> ---
>> diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c
>> @@ -2456,9 +2456,11 @@ static AddressSpace *vtd_host_dma_iommu(PCIBus *bus, void *opaque, int devfn)
>> -static bool vtd_decide_config(IntelIOMMUState *s, Error **errp)
>> +static bool vtd_decide_config(IntelIOMMUState *s, PCMachineState *pcms,
>> +                              Error **errp)
>>  {
>>      X86IOMMUState *x86_iommu = X86_IOMMU_DEVICE(s);
>> +    PCMachineClass *pcmc = PC_MACHINE_CLASS(MACHINE_GET_CLASS(pcms));
> It's a bit backwards, for devices we usually add a property to affected device
> and use compat macros to do the job. As example look at
> commit 048a2e8869cb7e26013e40d860c9ebdf8e28c2ac
>     x86: ioapic: boost default version to 0x20

I see, v4 will have a property.  Thanks.

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [Qemu-devel] [PATCH v3 1/8] apic: add global apic_get_class()
  2016-10-04 13:38     ` Radim Krčmář
@ 2016-10-04 16:14       ` Eduardo Habkost
  0 siblings, 0 replies; 26+ messages in thread
From: Eduardo Habkost @ 2016-10-04 16:14 UTC (permalink / raw)
  To: Radim Krčmář
  Cc: Michael S. Tsirkin, qemu-devel, Peter Xu, Paolo Bonzini,
	Igor Mammedov, Richard Henderson

On Tue, Oct 04, 2016 at 03:38:01PM +0200, Radim Krčmář wrote:
> 2016-10-03 13:03-0300, Eduardo Habkost:
> > On Fri, Sep 30, 2016 at 06:10:06PM +0200, Radim Krčmář wrote:
[...]
> >> +static void x86_cpu_apic_create(X86CPU *cpu, Error **errp)
> >> +{
> >> +    APICCommonState *apic;
> >> +    ObjectClass *apic_object_class = OBJECT_CLASS(apic_get_class());
> >> +
> >> +    assert(apic_object_class);
> >> +    cpu->apic_state = DEVICE(object_new_with_type(apic_object_class->type));
> > 
> > ObjectClass::type is private. I believe the common idiom is
> > object_new(object_class_get_name(c)).
> 
> Will fix in v4.
> 
> object_class_get_name() loses information about the type, so
> object_new() has to look it up from the name, which is unnecessary.

True.

> 
> Should I follow with a patch/series that adds
> 
>   Object *object_new_with_class(ObjectClass *class)
>   {
>   	return object_new_with_type(class->type);
>   }
> 
> into qom/object.[ch] and replaces occurrences of
> object_new_with_type(object_class_get_name()) with it?

I think that would be welcome. I don't know why it doesn't exist
yet.

-- 
Eduardo

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [Qemu-devel] [PATCH v3 3/8] intel_iommu: pass whole remapped addresses to apic
  2016-10-04 11:17   ` Igor Mammedov
@ 2016-10-08  5:24     ` Peter Xu
  2016-10-09 20:47       ` Michael S. Tsirkin
  0 siblings, 1 reply; 26+ messages in thread
From: Peter Xu @ 2016-10-08  5:24 UTC (permalink / raw)
  To: Igor Mammedov
  Cc: Radim Krčmář,
	qemu-devel, Paolo Bonzini, Richard Henderson, Eduardo Habkost,
	Michael S. Tsirkin

On Tue, Oct 04, 2016 at 01:17:28PM +0200, Igor Mammedov wrote:
> On Fri, 30 Sep 2016 18:10:08 +0200
> Radim Krčmář <rkrcmar@redhat.com> wrote:
> 
> > The MMIO interface to APIC only allowed 8 bit addresses, which is not
> > enough for 32 bit addresses from EIM remapping.
> > Intel stored upper 24 bits in the high MSI address, so use the same
> > technique. The technique is also used in KVM MSI interface.
> > Other APICs are unlikely to handle those upper bits.
> > 
> > Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
> > ---
> > v2: fix build with enabled DEBUG_INTEL_IOMMU [Peter]
> > ---
> >  hw/i386/intel_iommu.c | 21 +++++++++------------
> >  1 file changed, 9 insertions(+), 12 deletions(-)
> > 
> > diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c
> > index 9f4e64af1ad5..c39b62b898d8 100644
> > --- a/hw/i386/intel_iommu.c
> > +++ b/hw/i386/intel_iommu.c
> > @@ -31,6 +31,7 @@
> >  #include "hw/i386/x86-iommu.h"
> >  #include "hw/pci-host/q35.h"
> >  #include "sysemu/kvm.h"
> > +#include "hw/i386/apic_internal.h"
> >  
> >  /*#define DEBUG_INTEL_IOMMU*/
> >  #ifdef DEBUG_INTEL_IOMMU
> > @@ -279,18 +280,17 @@ static void vtd_update_iotlb(IntelIOMMUState *s, uint16_t source_id,
> >  static void vtd_generate_interrupt(IntelIOMMUState *s, hwaddr mesg_addr_reg,
> >                                     hwaddr mesg_data_reg)
> >  {
> > -    hwaddr addr;
> > -    uint32_t data;
> > +    MSIMessage msi;
> >  
> >      assert(mesg_data_reg < DMAR_REG_SIZE);
> >      assert(mesg_addr_reg < DMAR_REG_SIZE);
> >  
> > -    addr = vtd_get_long_raw(s, mesg_addr_reg);
> > -    data = vtd_get_long_raw(s, mesg_data_reg);
> > +    msi.address = vtd_get_long_raw(s, mesg_addr_reg);
> > +    msi.data = vtd_get_long_raw(s, mesg_data_reg);
> >  
> > -    VTD_DPRINTF(FLOG, "msi: addr 0x%"PRIx64 " data 0x%"PRIx32, addr, data);
> > -    address_space_stl_le(&address_space_memory, addr, data,
> > -                         MEMTXATTRS_UNSPECIFIED, NULL);
> > +    VTD_DPRINTF(FLOG, "msi: addr 0x%"PRIx64 " data 0x%"PRIx32,
> > +                msi.address, msi.data);
> > +    apic_get_class()->send_msi(&msi);
> >  }
> >  
> >  /* Generate a fault event to software via MSI if conditions are met.
> > @@ -2133,6 +2133,7 @@ static void vtd_generate_msi_message(VTDIrq *irq, MSIMessage *msg_out)
> >      msg.dest_mode = irq->dest_mode;
> >      msg.redir_hint = irq->redir_hint;
> >      msg.dest = irq->dest;
> > +    msg.__addr_hi = irq->dest & 0xffffff00;
> what about BE host? should it be:
> 
>  msg.__addr_hi = cpu_to_le32(irq->dest & 0xffffff00)
> 
> Also question to Peter, why __addr_hi is not HOST_WORDS_BIGENDIAN conditioned?
> now we have:
> struct VTD_MSIMessage {                                                          
>     union {                                                                      
>         struct {                                                                 
> #ifdef HOST_WORDS_BIGENDIAN                                                      
>             uint32_t __addr_head:12; /* 0xfee */                                 
> [...]                                              
> #else                                                                            
> [...]
>             uint32_t __addr_head:12; /* 0xfee */                                 
> #endif                                                                           
>             uint32_t __addr_hi:32; 

I think __addr_hi is not a bit field at all. It'll be the same if I
put it into the block. E.g., it'll look like:

#ifdef HOST_WORDS_BIGENDIAN
            uint32_t __addr_head:12; /* 0xfee */
            uint32_t dest:8;
            uint32_t __reserved:8;
            uint32_t redir_hint:1;
            uint32_t dest_mode:1;
            uint32_t __not_used:2;
            uint32_t __addr_hi:32;
#else
            uint32_t __not_used:2;
            uint32_t dest_mode:1;
            uint32_t redir_hint:1;
            uint32_t __reserved:8;
            uint32_t dest:8;
            uint32_t __addr_head:12; /* 0xfee */
            uint32_t __addr_hi:32;
#endif

Only real bit fields (like dest_mode, redir_hint, etc.) order is
handled differently on BE/LE machines. Since __addr_hi is not a bit
field (it's typed as u32, and it's 32 bits long), it should always be
using a higher address comparing to above real bit fields, so no
ordering issue AFAIK.

I have patch in my queue that fixes this (change "__addr_hi:32" to
"__addr_hi"). Though it should not be a big problem.

-- peterx

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [Qemu-devel] [PATCH v3 3/8] intel_iommu: pass whole remapped addresses to apic
  2016-10-08  5:24     ` Peter Xu
@ 2016-10-09 20:47       ` Michael S. Tsirkin
  2016-10-09 22:46         ` Peter Xu
  0 siblings, 1 reply; 26+ messages in thread
From: Michael S. Tsirkin @ 2016-10-09 20:47 UTC (permalink / raw)
  To: Peter Xu
  Cc: Igor Mammedov, Radim Krčmář,
	qemu-devel, Paolo Bonzini, Richard Henderson, Eduardo Habkost

On Sat, Oct 08, 2016 at 01:24:55PM +0800, Peter Xu wrote:
> On Tue, Oct 04, 2016 at 01:17:28PM +0200, Igor Mammedov wrote:
> > On Fri, 30 Sep 2016 18:10:08 +0200
> > Radim Krčmář <rkrcmar@redhat.com> wrote:
> > 
> > > The MMIO interface to APIC only allowed 8 bit addresses, which is not
> > > enough for 32 bit addresses from EIM remapping.
> > > Intel stored upper 24 bits in the high MSI address, so use the same
> > > technique. The technique is also used in KVM MSI interface.
> > > Other APICs are unlikely to handle those upper bits.
> > > 
> > > Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
> > > ---
> > > v2: fix build with enabled DEBUG_INTEL_IOMMU [Peter]
> > > ---
> > >  hw/i386/intel_iommu.c | 21 +++++++++------------
> > >  1 file changed, 9 insertions(+), 12 deletions(-)
> > > 
> > > diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c
> > > index 9f4e64af1ad5..c39b62b898d8 100644
> > > --- a/hw/i386/intel_iommu.c
> > > +++ b/hw/i386/intel_iommu.c
> > > @@ -31,6 +31,7 @@
> > >  #include "hw/i386/x86-iommu.h"
> > >  #include "hw/pci-host/q35.h"
> > >  #include "sysemu/kvm.h"
> > > +#include "hw/i386/apic_internal.h"
> > >  
> > >  /*#define DEBUG_INTEL_IOMMU*/
> > >  #ifdef DEBUG_INTEL_IOMMU
> > > @@ -279,18 +280,17 @@ static void vtd_update_iotlb(IntelIOMMUState *s, uint16_t source_id,
> > >  static void vtd_generate_interrupt(IntelIOMMUState *s, hwaddr mesg_addr_reg,
> > >                                     hwaddr mesg_data_reg)
> > >  {
> > > -    hwaddr addr;
> > > -    uint32_t data;
> > > +    MSIMessage msi;
> > >  
> > >      assert(mesg_data_reg < DMAR_REG_SIZE);
> > >      assert(mesg_addr_reg < DMAR_REG_SIZE);
> > >  
> > > -    addr = vtd_get_long_raw(s, mesg_addr_reg);
> > > -    data = vtd_get_long_raw(s, mesg_data_reg);
> > > +    msi.address = vtd_get_long_raw(s, mesg_addr_reg);
> > > +    msi.data = vtd_get_long_raw(s, mesg_data_reg);
> > >  
> > > -    VTD_DPRINTF(FLOG, "msi: addr 0x%"PRIx64 " data 0x%"PRIx32, addr, data);
> > > -    address_space_stl_le(&address_space_memory, addr, data,
> > > -                         MEMTXATTRS_UNSPECIFIED, NULL);
> > > +    VTD_DPRINTF(FLOG, "msi: addr 0x%"PRIx64 " data 0x%"PRIx32,
> > > +                msi.address, msi.data);
> > > +    apic_get_class()->send_msi(&msi);
> > >  }
> > >  
> > >  /* Generate a fault event to software via MSI if conditions are met.
> > > @@ -2133,6 +2133,7 @@ static void vtd_generate_msi_message(VTDIrq *irq, MSIMessage *msg_out)
> > >      msg.dest_mode = irq->dest_mode;
> > >      msg.redir_hint = irq->redir_hint;
> > >      msg.dest = irq->dest;
> > > +    msg.__addr_hi = irq->dest & 0xffffff00;
> > what about BE host? should it be:
> > 
> >  msg.__addr_hi = cpu_to_le32(irq->dest & 0xffffff00)
> > 
> > Also question to Peter, why __addr_hi is not HOST_WORDS_BIGENDIAN conditioned?
> > now we have:
> > struct VTD_MSIMessage {                                                          
> >     union {                                                                      
> >         struct {                                                                 
> > #ifdef HOST_WORDS_BIGENDIAN                                                      
> >             uint32_t __addr_head:12; /* 0xfee */                                 
> > [...]                                              
> > #else                                                                            
> > [...]
> >             uint32_t __addr_head:12; /* 0xfee */                                 
> > #endif                                                                           
> >             uint32_t __addr_hi:32; 
> 
> I think __addr_hi is not a bit field at all. It'll be the same if I
> put it into the block. E.g., it'll look like:
> 
> #ifdef HOST_WORDS_BIGENDIAN
>             uint32_t __addr_head:12; /* 0xfee */
>             uint32_t dest:8;
>             uint32_t __reserved:8;
>             uint32_t redir_hint:1;
>             uint32_t dest_mode:1;
>             uint32_t __not_used:2;
>             uint32_t __addr_hi:32;
> #else
>             uint32_t __not_used:2;
>             uint32_t dest_mode:1;
>             uint32_t redir_hint:1;
>             uint32_t __reserved:8;
>             uint32_t dest:8;
>             uint32_t __addr_head:12; /* 0xfee */
>             uint32_t __addr_hi:32;
> #endif
> 
> Only real bit fields (like dest_mode, redir_hint, etc.) order is
> handled differently on BE/LE machines. Since __addr_hi is not a bit
> field (it's typed as u32, and it's 32 bits long), it should always be
> using a higher address comparing to above real bit fields, so no
> ordering issue AFAIK.
> 
> I have patch in my queue that fixes this (change "__addr_hi:32" to
> "__addr_hi"). Though it should not be a big problem.
> 
> -- peterx

IMHO it's best to avoid bitfields completely. Use two uint32_t fields
and add functions to pack/unpack sub-fields.

-- 
MST

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [Qemu-devel] [PATCH v3 3/8] intel_iommu: pass whole remapped addresses to apic
  2016-10-09 20:47       ` Michael S. Tsirkin
@ 2016-10-09 22:46         ` Peter Xu
  0 siblings, 0 replies; 26+ messages in thread
From: Peter Xu @ 2016-10-09 22:46 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Igor Mammedov, Radim Krčmář,
	qemu-devel, Paolo Bonzini, Richard Henderson, Eduardo Habkost

On Sun, Oct 09, 2016 at 11:47:57PM +0300, Michael S. Tsirkin wrote:
> On Sat, Oct 08, 2016 at 01:24:55PM +0800, Peter Xu wrote:
> > On Tue, Oct 04, 2016 at 01:17:28PM +0200, Igor Mammedov wrote:
> > > On Fri, 30 Sep 2016 18:10:08 +0200
> > > Radim Krčmář <rkrcmar@redhat.com> wrote:
> > > 
> > > > The MMIO interface to APIC only allowed 8 bit addresses, which is not
> > > > enough for 32 bit addresses from EIM remapping.
> > > > Intel stored upper 24 bits in the high MSI address, so use the same
> > > > technique. The technique is also used in KVM MSI interface.
> > > > Other APICs are unlikely to handle those upper bits.
> > > > 
> > > > Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
> > > > ---
> > > > v2: fix build with enabled DEBUG_INTEL_IOMMU [Peter]
> > > > ---
> > > >  hw/i386/intel_iommu.c | 21 +++++++++------------
> > > >  1 file changed, 9 insertions(+), 12 deletions(-)
> > > > 
> > > > diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c
> > > > index 9f4e64af1ad5..c39b62b898d8 100644
> > > > --- a/hw/i386/intel_iommu.c
> > > > +++ b/hw/i386/intel_iommu.c
> > > > @@ -31,6 +31,7 @@
> > > >  #include "hw/i386/x86-iommu.h"
> > > >  #include "hw/pci-host/q35.h"
> > > >  #include "sysemu/kvm.h"
> > > > +#include "hw/i386/apic_internal.h"
> > > >  
> > > >  /*#define DEBUG_INTEL_IOMMU*/
> > > >  #ifdef DEBUG_INTEL_IOMMU
> > > > @@ -279,18 +280,17 @@ static void vtd_update_iotlb(IntelIOMMUState *s, uint16_t source_id,
> > > >  static void vtd_generate_interrupt(IntelIOMMUState *s, hwaddr mesg_addr_reg,
> > > >                                     hwaddr mesg_data_reg)
> > > >  {
> > > > -    hwaddr addr;
> > > > -    uint32_t data;
> > > > +    MSIMessage msi;
> > > >  
> > > >      assert(mesg_data_reg < DMAR_REG_SIZE);
> > > >      assert(mesg_addr_reg < DMAR_REG_SIZE);
> > > >  
> > > > -    addr = vtd_get_long_raw(s, mesg_addr_reg);
> > > > -    data = vtd_get_long_raw(s, mesg_data_reg);
> > > > +    msi.address = vtd_get_long_raw(s, mesg_addr_reg);
> > > > +    msi.data = vtd_get_long_raw(s, mesg_data_reg);
> > > >  
> > > > -    VTD_DPRINTF(FLOG, "msi: addr 0x%"PRIx64 " data 0x%"PRIx32, addr, data);
> > > > -    address_space_stl_le(&address_space_memory, addr, data,
> > > > -                         MEMTXATTRS_UNSPECIFIED, NULL);
> > > > +    VTD_DPRINTF(FLOG, "msi: addr 0x%"PRIx64 " data 0x%"PRIx32,
> > > > +                msi.address, msi.data);
> > > > +    apic_get_class()->send_msi(&msi);
> > > >  }
> > > >  
> > > >  /* Generate a fault event to software via MSI if conditions are met.
> > > > @@ -2133,6 +2133,7 @@ static void vtd_generate_msi_message(VTDIrq *irq, MSIMessage *msg_out)
> > > >      msg.dest_mode = irq->dest_mode;
> > > >      msg.redir_hint = irq->redir_hint;
> > > >      msg.dest = irq->dest;
> > > > +    msg.__addr_hi = irq->dest & 0xffffff00;
> > > what about BE host? should it be:
> > > 
> > >  msg.__addr_hi = cpu_to_le32(irq->dest & 0xffffff00)
> > > 
> > > Also question to Peter, why __addr_hi is not HOST_WORDS_BIGENDIAN conditioned?
> > > now we have:
> > > struct VTD_MSIMessage {                                                          
> > >     union {                                                                      
> > >         struct {                                                                 
> > > #ifdef HOST_WORDS_BIGENDIAN                                                      
> > >             uint32_t __addr_head:12; /* 0xfee */                                 
> > > [...]                                              
> > > #else                                                                            
> > > [...]
> > >             uint32_t __addr_head:12; /* 0xfee */                                 
> > > #endif                                                                           
> > >             uint32_t __addr_hi:32; 
> > 
> > I think __addr_hi is not a bit field at all. It'll be the same if I
> > put it into the block. E.g., it'll look like:
> > 
> > #ifdef HOST_WORDS_BIGENDIAN
> >             uint32_t __addr_head:12; /* 0xfee */
> >             uint32_t dest:8;
> >             uint32_t __reserved:8;
> >             uint32_t redir_hint:1;
> >             uint32_t dest_mode:1;
> >             uint32_t __not_used:2;
> >             uint32_t __addr_hi:32;
> > #else
> >             uint32_t __not_used:2;
> >             uint32_t dest_mode:1;
> >             uint32_t redir_hint:1;
> >             uint32_t __reserved:8;
> >             uint32_t dest:8;
> >             uint32_t __addr_head:12; /* 0xfee */
> >             uint32_t __addr_hi:32;
> > #endif
> > 
> > Only real bit fields (like dest_mode, redir_hint, etc.) order is
> > handled differently on BE/LE machines. Since __addr_hi is not a bit
> > field (it's typed as u32, and it's 32 bits long), it should always be
> > using a higher address comparing to above real bit fields, so no
> > ordering issue AFAIK.
> > 
> > I have patch in my queue that fixes this (change "__addr_hi:32" to
> > "__addr_hi"). Though it should not be a big problem.
> > 
> > -- peterx
> 
> IMHO it's best to avoid bitfields completely. Use two uint32_t fields
> and add functions to pack/unpack sub-fields.

Yeah I now found that bitfield is error-prone. Will take your advice
in the coming patches when capable. Thanks,

-- peterx

^ permalink raw reply	[flat|nested] 26+ messages in thread

end of thread, other threads:[~2016-10-09 22:46 UTC | newest]

Thread overview: 26+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-09-30 16:10 [Qemu-devel] [PATCH v3 0/8] intel_iommu: fix EIM Radim Krčmář
2016-09-30 16:10 ` [Qemu-devel] [PATCH v3 1/8] apic: add global apic_get_class() Radim Krčmář
2016-10-03 16:03   ` Eduardo Habkost
2016-10-04 10:59     ` Igor Mammedov
2016-10-04 13:38     ` Radim Krčmář
2016-10-04 16:14       ` Eduardo Habkost
2016-09-30 16:10 ` [Qemu-devel] [PATCH v3 2/8] apic: add send_msi() to APICCommonClass Radim Krčmář
2016-10-04 11:06   ` Igor Mammedov
2016-09-30 16:10 ` [Qemu-devel] [PATCH v3 3/8] intel_iommu: pass whole remapped addresses to apic Radim Krčmář
2016-10-04 11:17   ` Igor Mammedov
2016-10-08  5:24     ` Peter Xu
2016-10-09 20:47       ` Michael S. Tsirkin
2016-10-09 22:46         ` Peter Xu
2016-09-30 16:10 ` [Qemu-devel] [PATCH v3 4/8] intel_iommu: redo configuraton check in realize Radim Krčmář
2016-10-04 11:40   ` Igor Mammedov
2016-09-30 16:10 ` [Qemu-devel] [PATCH v3 5/8] intel_iommu: add OnOffAuto intr_eim as "eim" property Radim Krčmář
2016-10-04 12:34   ` Igor Mammedov
2016-09-30 16:10 ` [Qemu-devel] [PATCH v3 6/8] intel_iommu: reject broken EIM Radim Krčmář
2016-10-04 13:43   ` Igor Mammedov
2016-09-30 16:10 ` [Qemu-devel] [PATCH v3 7/8] intel_iommu: keep buggy EIM enabled in 2.7 machine type Radim Krčmář
2016-10-04 12:18   ` Igor Mammedov
2016-10-04 13:48     ` Radim Krčmář
2016-09-30 16:10 ` [Qemu-devel] [PATCH v3 8/8] target-i386/kvm: cache the return value of kvm_enable_x2apic() Radim Krčmář
2016-10-04 11:33   ` Igor Mammedov
2016-10-04 13:45     ` Radim Krčmář
2016-09-30 17:22 ` [Qemu-devel] [PATCH v3 0/8] intel_iommu: fix EIM no-reply

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.