All of lore.kernel.org
 help / color / mirror / Atom feed
* [Qemu-devel] [PATCH 0/4] four zpci patches
@ 2017-08-28  8:04 Yi Min Zhao
  2017-08-28  8:04 ` [Qemu-devel] [PATCH 1/4] s390x/pci: fixup trap_msix() Yi Min Zhao
                   ` (3 more replies)
  0 siblings, 4 replies; 28+ messages in thread
From: Yi Min Zhao @ 2017-08-28  8:04 UTC (permalink / raw)
  To: qemu-devel
  Cc: borntraeger, pasic, pmorel, cohuck, agraf, richard.henderson, zyimin

This patch set contains four small zpci patches to fixup different issues.
1) fixup calculation of msix boundary
2) remove zpci idx from msix message, instead we could use PCIDevice's id to
   find zpci device in kvm_arch_fixup_msi_route()
3) fixup ind_offset calculation for adapter interrupt routing entry
4) introduce our own iommu_replay callback

Yi Min Zhao (4):
  s390x/pci: fixup trap_msix()
  s390x/pci: remove idx from msix msg data
  s390x/pci: fixup ind_offset of msix routing entry
  s390x/pci: add iommu replay callback

 hw/s390x/s390-pci-bus.c  | 24 +++++++++++++-----------
 hw/s390x/s390-pci-bus.h  |  2 ++
 hw/s390x/s390-pci-inst.c | 28 ++--------------------------
 target/s390x/kvm.c       | 11 ++++++-----
 4 files changed, 23 insertions(+), 42 deletions(-)

-- 
2.11.0 (Apple Git-81)

^ permalink raw reply	[flat|nested] 28+ messages in thread

* [Qemu-devel] [PATCH 1/4] s390x/pci: fixup trap_msix()
  2017-08-28  8:04 [Qemu-devel] [PATCH 0/4] four zpci patches Yi Min Zhao
@ 2017-08-28  8:04 ` Yi Min Zhao
  2017-08-28 14:51   ` Cornelia Huck
  2017-08-30  9:47   ` Cornelia Huck
  2017-08-28  8:04 ` [Qemu-devel] [PATCH 2/4] s390x/pci: remove idx from msix msg data Yi Min Zhao
                   ` (2 subsequent siblings)
  3 siblings, 2 replies; 28+ messages in thread
From: Yi Min Zhao @ 2017-08-28  8:04 UTC (permalink / raw)
  To: qemu-devel
  Cc: borntraeger, pasic, pmorel, cohuck, agraf, richard.henderson, zyimin

The function trap_msix() is to check if pcistg instruction would access
msix table entries. The correct boundary condition should be
[table_offset, table_offset+entries*entry_size). But the current
condition calculated misses the last entry. So let's fixup it.

Acked-by: Dong Jia Shi <bjsdjshi@linux.vnet.ibm.com>
Reviewed-by: Pierre Morel <pmorel@linux.vnet.ibm.com>
Signed-off-by: Yi Min Zhao <zyimin@linux.vnet.ibm.com>
---
 hw/s390x/s390-pci-inst.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/hw/s390x/s390-pci-inst.c b/hw/s390x/s390-pci-inst.c
index b7beb8c36a..eba9ffb5f2 100644
--- a/hw/s390x/s390-pci-inst.c
+++ b/hw/s390x/s390-pci-inst.c
@@ -440,8 +440,8 @@ static int trap_msix(S390PCIBusDevice *pbdev, uint64_t offset, uint8_t pcias)
 {
     if (pbdev->msix.available && pbdev->msix.table_bar == pcias &&
         offset >= pbdev->msix.table_offset &&
-        offset <= pbdev->msix.table_offset +
-                  (pbdev->msix.entries - 1) * PCI_MSIX_ENTRY_SIZE) {
+        offset < (pbdev->msix.table_offset +
+                  pbdev->msix.entries * PCI_MSIX_ENTRY_SIZE)) {
         return 1;
     } else {
         return 0;
-- 
2.11.0 (Apple Git-81)

^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [Qemu-devel] [PATCH 2/4] s390x/pci: remove idx from msix msg data
  2017-08-28  8:04 [Qemu-devel] [PATCH 0/4] four zpci patches Yi Min Zhao
  2017-08-28  8:04 ` [Qemu-devel] [PATCH 1/4] s390x/pci: fixup trap_msix() Yi Min Zhao
@ 2017-08-28  8:04 ` Yi Min Zhao
  2017-08-28 15:04   ` Cornelia Huck
  2017-08-28  8:04 ` [Qemu-devel] [PATCH 3/4] s390x/pci: fixup ind_offset of msix routing entry Yi Min Zhao
  2017-08-28  8:04 ` [Qemu-devel] [PATCH 4/4] s390x/pci: add iommu replay callback Yi Min Zhao
  3 siblings, 1 reply; 28+ messages in thread
From: Yi Min Zhao @ 2017-08-28  8:04 UTC (permalink / raw)
  To: qemu-devel
  Cc: borntraeger, pasic, pmorel, cohuck, agraf, richard.henderson, zyimin

PCIDevcie pointer has been a parameter of kvm_arch_fixup_msi_route().
So we don't need to store zpci idx in msix message data to find out the
specific zpci device. Instead, we could use pci device id to find its
corresponding zpci device.

Signed-off-by: Yi Min Zhao <zyimin@linux.vnet.ibm.com>
---
 hw/s390x/s390-pci-bus.c  | 16 +++++-----------
 hw/s390x/s390-pci-bus.h  |  2 ++
 hw/s390x/s390-pci-inst.c | 24 ------------------------
 target/s390x/kvm.c       |  7 +++++--
 4 files changed, 12 insertions(+), 37 deletions(-)

diff --git a/hw/s390x/s390-pci-bus.c b/hw/s390x/s390-pci-bus.c
index 61cfd2138f..9e1f7ff5c5 100644
--- a/hw/s390x/s390-pci-bus.c
+++ b/hw/s390x/s390-pci-bus.c
@@ -209,8 +209,8 @@ static S390PCIBusDevice *s390_pci_find_dev_by_uid(S390pciState *s, uint16_t uid)
     return NULL;
 }
 
-static S390PCIBusDevice *s390_pci_find_dev_by_target(S390pciState *s,
-                                                     const char *target)
+S390PCIBusDevice *s390_pci_find_dev_by_target(S390pciState *s,
+                                              const char *target)
 {
     S390PCIBusDevice *pbdev;
 
@@ -475,19 +475,13 @@ static void s390_msi_ctrl_write(void *opaque, hwaddr addr, uint64_t data,
                                 unsigned int size)
 {
     S390PCIBusDevice *pbdev = opaque;
-    uint32_t idx = data >> ZPCI_MSI_VEC_BITS;
     uint32_t vec = data & ZPCI_MSI_VEC_MASK;
     uint64_t ind_bit;
     uint32_t sum_bit;
-    uint32_t e = 0;
 
-    DPRINTF("write_msix data 0x%" PRIx64 " idx %d vec 0x%x\n", data, idx, vec);
-
-    if (!pbdev) {
-        e |= (vec << ERR_EVENT_MVN_OFFSET);
-        s390_pci_generate_error_event(ERR_EVENT_NOMSI, idx, 0, addr, e);
-        return;
-    }
+    assert(pbdev);
+    DPRINTF("write_msix data 0x%" PRIx64 " idx %d vec 0x%x\n", data,
+            pbdev->idx, vec);
 
     if (pbdev->state != ZPCI_FS_ENABLED) {
         return;
diff --git a/hw/s390x/s390-pci-bus.h b/hw/s390x/s390-pci-bus.h
index 67af2c12ff..820c7fa52b 100644
--- a/hw/s390x/s390-pci-bus.h
+++ b/hw/s390x/s390-pci-bus.h
@@ -330,6 +330,8 @@ void s390_pci_generate_error_event(uint16_t pec, uint32_t fh, uint32_t fid,
 S390PCIBusDevice *s390_pci_find_dev_by_idx(S390pciState *s, uint32_t idx);
 S390PCIBusDevice *s390_pci_find_dev_by_fh(S390pciState *s, uint32_t fh);
 S390PCIBusDevice *s390_pci_find_dev_by_fid(S390pciState *s, uint32_t fid);
+S390PCIBusDevice *s390_pci_find_dev_by_target(S390pciState *s,
+                                              const char *target);
 S390PCIBusDevice *s390_pci_find_next_avail_dev(S390pciState *s,
                                                S390PCIBusDevice *pbdev);
 
diff --git a/hw/s390x/s390-pci-inst.c b/hw/s390x/s390-pci-inst.c
index eba9ffb5f2..8e088f3dc9 100644
--- a/hw/s390x/s390-pci-inst.c
+++ b/hw/s390x/s390-pci-inst.c
@@ -413,29 +413,6 @@ int pcilg_service_call(S390CPU *cpu, uint8_t r1, uint8_t r2)
     return 0;
 }
 
-static void update_msix_table_msg_data(S390PCIBusDevice *pbdev, uint64_t offset,
-                                       uint64_t *data, uint8_t len)
-{
-    uint32_t val;
-    uint8_t *msg_data;
-
-    if (offset % PCI_MSIX_ENTRY_SIZE != 8) {
-        return;
-    }
-
-    if (len != 4) {
-        DPRINTF("access msix table msg data but len is %d\n", len);
-        return;
-    }
-
-    msg_data = (uint8_t *)data - offset % PCI_MSIX_ENTRY_SIZE +
-               PCI_MSIX_ENTRY_VECTOR_CTRL;
-    val = pci_get_long(msg_data) |
-        ((pbdev->fh & FH_MASK_INDEX) << ZPCI_MSI_VEC_BITS);
-    pci_set_long(msg_data, val);
-    DPRINTF("update msix msg_data to 0x%" PRIx64 "\n", *data);
-}
-
 static int trap_msix(S390PCIBusDevice *pbdev, uint64_t offset, uint8_t pcias)
 {
     if (pbdev->msix.available && pbdev->msix.table_bar == pcias &&
@@ -508,7 +485,6 @@ int pcistg_service_call(S390CPU *cpu, uint8_t r1, uint8_t r2)
         if (trap_msix(pbdev, offset, pcias)) {
             offset = offset - pbdev->msix.table_offset;
             mr = &pbdev->pdev->msix_table_mmio;
-            update_msix_table_msg_data(pbdev, offset, &data, len);
         } else {
             mr = pbdev->pdev->io_regions[pcias].memory;
         }
diff --git a/target/s390x/kvm.c b/target/s390x/kvm.c
index 1c68c36663..e348bfb7cc 100644
--- a/target/s390x/kvm.c
+++ b/target/s390x/kvm.c
@@ -2503,10 +2503,13 @@ int kvm_arch_fixup_msi_route(struct kvm_irq_routing_entry *route,
                              uint64_t address, uint32_t data, PCIDevice *dev)
 {
     S390PCIBusDevice *pbdev;
-    uint32_t idx = data >> ZPCI_MSI_VEC_BITS;
     uint32_t vec = data & ZPCI_MSI_VEC_MASK;
 
-    pbdev = s390_pci_find_dev_by_idx(s390_get_phb(), idx);
+    if (!dev) {
+        return -ENODEV;
+    }
+
+    pbdev = s390_pci_find_dev_by_target(s390_get_phb(), DEVICE(dev)->id);
     if (!pbdev) {
         DPRINTF("add_msi_route no dev\n");
         return -ENODEV;
-- 
2.11.0 (Apple Git-81)

^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [Qemu-devel] [PATCH 3/4] s390x/pci: fixup ind_offset of msix routing entry
  2017-08-28  8:04 [Qemu-devel] [PATCH 0/4] four zpci patches Yi Min Zhao
  2017-08-28  8:04 ` [Qemu-devel] [PATCH 1/4] s390x/pci: fixup trap_msix() Yi Min Zhao
  2017-08-28  8:04 ` [Qemu-devel] [PATCH 2/4] s390x/pci: remove idx from msix msg data Yi Min Zhao
@ 2017-08-28  8:04 ` Yi Min Zhao
  2017-08-28 15:33   ` Cornelia Huck
  2017-08-28  8:04 ` [Qemu-devel] [PATCH 4/4] s390x/pci: add iommu replay callback Yi Min Zhao
  3 siblings, 1 reply; 28+ messages in thread
From: Yi Min Zhao @ 2017-08-28  8:04 UTC (permalink / raw)
  To: qemu-devel
  Cc: borntraeger, pasic, pmorel, cohuck, agraf, richard.henderson, zyimin

The aibvo of zpci device should be constant after issued mpcifc
registering irqs instruction. Each msix vector should offset from the
aibvo. But for flic adapter interrupt, we should use the absolute
offset within the aibv. So let's use the aibvo+vector to fixup msix
routing entry.

Signed-off-by: Yi Min Zhao <zyimin@linux.vnet.ibm.com>
---
 target/s390x/kvm.c | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/target/s390x/kvm.c b/target/s390x/kvm.c
index e348bfb7cc..c08b7757e7 100644
--- a/target/s390x/kvm.c
+++ b/target/s390x/kvm.c
@@ -2515,14 +2515,12 @@ int kvm_arch_fixup_msi_route(struct kvm_irq_routing_entry *route,
         return -ENODEV;
     }
 
-    pbdev->routes.adapter.ind_offset = vec;
-
     route->type = KVM_IRQ_ROUTING_S390_ADAPTER;
     route->flags = 0;
     route->u.adapter.summary_addr = pbdev->routes.adapter.summary_addr;
     route->u.adapter.ind_addr = pbdev->routes.adapter.ind_addr;
     route->u.adapter.summary_offset = pbdev->routes.adapter.summary_offset;
-    route->u.adapter.ind_offset = pbdev->routes.adapter.ind_offset;
+    route->u.adapter.ind_offset = pbdev->routes.adapter.ind_offset + vec;
     route->u.adapter.adapter_id = pbdev->routes.adapter.adapter_id;
     return 0;
 }
-- 
2.11.0 (Apple Git-81)

^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [Qemu-devel] [PATCH 4/4] s390x/pci: add iommu replay callback
  2017-08-28  8:04 [Qemu-devel] [PATCH 0/4] four zpci patches Yi Min Zhao
                   ` (2 preceding siblings ...)
  2017-08-28  8:04 ` [Qemu-devel] [PATCH 3/4] s390x/pci: fixup ind_offset of msix routing entry Yi Min Zhao
@ 2017-08-28  8:04 ` Yi Min Zhao
  2017-08-28 15:57   ` Cornelia Huck
  3 siblings, 1 reply; 28+ messages in thread
From: Yi Min Zhao @ 2017-08-28  8:04 UTC (permalink / raw)
  To: qemu-devel
  Cc: borntraeger, pasic, pmorel, cohuck, agraf, richard.henderson, zyimin

Let's introduce iommu replay callback for s390 pci iommu memory region.
Currently we don't need any dma mapping replay. So let it return
directly. This implementation will avoid meaningless loops calling
translation callback.

Reviewed-by: Pierre Morel <pmorel@linux.vnet.ibm.com>
Reviewed-by: Halil Pasic <pasic@linux.vnet.ibm.com>
Signed-off-by: Yi Min Zhao <zyimin@linux.vnet.ibm.com>
---
 hw/s390x/s390-pci-bus.c | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/hw/s390x/s390-pci-bus.c b/hw/s390x/s390-pci-bus.c
index 9e1f7ff5c5..359509ccea 100644
--- a/hw/s390x/s390-pci-bus.c
+++ b/hw/s390x/s390-pci-bus.c
@@ -407,6 +407,13 @@ static IOMMUTLBEntry s390_translate_iommu(IOMMUMemoryRegion *mr, hwaddr addr,
     return ret;
 }
 
+static void s390_pci_iommu_replay(IOMMUMemoryRegion *iommu,
+                                  IOMMUNotifier *notifier)
+{
+    /* we don't need iommu replay currently */
+    return;
+}
+
 static S390PCIIOMMU *s390_pci_get_iommu(S390pciState *s, PCIBus *bus,
                                         int devfn)
 {
@@ -1055,6 +1062,7 @@ static void s390_iommu_memory_region_class_init(ObjectClass *klass, void *data)
     IOMMUMemoryRegionClass *imrc = IOMMU_MEMORY_REGION_CLASS(klass);
 
     imrc->translate = s390_translate_iommu;
+    imrc->replay = s390_pci_iommu_replay;
 }
 
 static const TypeInfo s390_iommu_memory_region_info = {
-- 
2.11.0 (Apple Git-81)

^ permalink raw reply related	[flat|nested] 28+ messages in thread

* Re: [Qemu-devel] [PATCH 1/4] s390x/pci: fixup trap_msix()
  2017-08-28  8:04 ` [Qemu-devel] [PATCH 1/4] s390x/pci: fixup trap_msix() Yi Min Zhao
@ 2017-08-28 14:51   ` Cornelia Huck
  2017-08-29  4:32     ` Yi Min Zhao
  2017-08-30  9:47   ` Cornelia Huck
  1 sibling, 1 reply; 28+ messages in thread
From: Cornelia Huck @ 2017-08-28 14:51 UTC (permalink / raw)
  To: Yi Min Zhao
  Cc: qemu-devel, borntraeger, pasic, pmorel, agraf, richard.henderson

On Mon, 28 Aug 2017 10:04:44 +0200
Yi Min Zhao <zyimin@linux.vnet.ibm.com> wrote:

> The function trap_msix() is to check if pcistg instruction would access
> msix table entries. The correct boundary condition should be
> [table_offset, table_offset+entries*entry_size). But the current
> condition calculated misses the last entry. So let's fixup it.
> 
> Acked-by: Dong Jia Shi <bjsdjshi@linux.vnet.ibm.com>
> Reviewed-by: Pierre Morel <pmorel@linux.vnet.ibm.com>
> Signed-off-by: Yi Min Zhao <zyimin@linux.vnet.ibm.com>
> ---
>  hw/s390x/s390-pci-inst.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/hw/s390x/s390-pci-inst.c b/hw/s390x/s390-pci-inst.c
> index b7beb8c36a..eba9ffb5f2 100644
> --- a/hw/s390x/s390-pci-inst.c
> +++ b/hw/s390x/s390-pci-inst.c
> @@ -440,8 +440,8 @@ static int trap_msix(S390PCIBusDevice *pbdev, uint64_t offset, uint8_t pcias)
>  {
>      if (pbdev->msix.available && pbdev->msix.table_bar == pcias &&
>          offset >= pbdev->msix.table_offset &&
> -        offset <= pbdev->msix.table_offset +
> -                  (pbdev->msix.entries - 1) * PCI_MSIX_ENTRY_SIZE) {
> +        offset < (pbdev->msix.table_offset +
> +                  pbdev->msix.entries * PCI_MSIX_ENTRY_SIZE)) {
>          return 1;
>      } else {
>          return 0;

What happened before due to the miscalculation? Write to wrong memory
region?

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [Qemu-devel] [PATCH 2/4] s390x/pci: remove idx from msix msg data
  2017-08-28  8:04 ` [Qemu-devel] [PATCH 2/4] s390x/pci: remove idx from msix msg data Yi Min Zhao
@ 2017-08-28 15:04   ` Cornelia Huck
  2017-08-29  4:33     ` Yi Min Zhao
  0 siblings, 1 reply; 28+ messages in thread
From: Cornelia Huck @ 2017-08-28 15:04 UTC (permalink / raw)
  To: Yi Min Zhao
  Cc: qemu-devel, borntraeger, pasic, pmorel, agraf, richard.henderson

On Mon, 28 Aug 2017 10:04:45 +0200
Yi Min Zhao <zyimin@linux.vnet.ibm.com> wrote:

> PCIDevcie pointer has been a parameter of kvm_arch_fixup_msi_route().

s/PCIDevcie/PCIDevice/

> So we don't need to store zpci idx in msix message data to find out the
> specific zpci device. Instead, we could use pci device id to find its
> corresponding zpci device.
> 
> Signed-off-by: Yi Min Zhao <zyimin@linux.vnet.ibm.com>
> ---
>  hw/s390x/s390-pci-bus.c  | 16 +++++-----------
>  hw/s390x/s390-pci-bus.h  |  2 ++
>  hw/s390x/s390-pci-inst.c | 24 ------------------------
>  target/s390x/kvm.c       |  7 +++++--
>  4 files changed, 12 insertions(+), 37 deletions(-)

> diff --git a/target/s390x/kvm.c b/target/s390x/kvm.c
> index 1c68c36663..e348bfb7cc 100644
> --- a/target/s390x/kvm.c
> +++ b/target/s390x/kvm.c
> @@ -2503,10 +2503,13 @@ int kvm_arch_fixup_msi_route(struct kvm_irq_routing_entry *route,
>                               uint64_t address, uint32_t data, PCIDevice *dev)
>  {
>      S390PCIBusDevice *pbdev;
> -    uint32_t idx = data >> ZPCI_MSI_VEC_BITS;
>      uint32_t vec = data & ZPCI_MSI_VEC_MASK;
>  
> -    pbdev = s390_pci_find_dev_by_idx(s390_get_phb(), idx);
> +    if (!dev) {
> +        return -ENODEV;
> +    }
> +
> +    pbdev = s390_pci_find_dev_by_target(s390_get_phb(), DEVICE(dev)->id);

You need to replace the stub for s390_pci_find_dev_by_idx() with a stub
for s390_pci_find_dev_by_target() in s390-pci-stubs.c (on my s390-next
branch), so that it compiles without CONFIG_PCI.

>      if (!pbdev) {
>          DPRINTF("add_msi_route no dev\n");
>          return -ENODEV;

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [Qemu-devel] [PATCH 3/4] s390x/pci: fixup ind_offset of msix routing entry
  2017-08-28  8:04 ` [Qemu-devel] [PATCH 3/4] s390x/pci: fixup ind_offset of msix routing entry Yi Min Zhao
@ 2017-08-28 15:33   ` Cornelia Huck
  2017-08-29  4:39     ` Yi Min Zhao
  0 siblings, 1 reply; 28+ messages in thread
From: Cornelia Huck @ 2017-08-28 15:33 UTC (permalink / raw)
  To: Yi Min Zhao
  Cc: qemu-devel, borntraeger, pasic, pmorel, agraf, richard.henderson

On Mon, 28 Aug 2017 10:04:46 +0200
Yi Min Zhao <zyimin@linux.vnet.ibm.com> wrote:

> The aibvo of zpci device should be constant after issued mpcifc
> registering irqs instruction. Each msix vector should offset from the
> aibvo. But for flic adapter interrupt, we should use the absolute
> offset within the aibv. So let's use the aibvo+vector to fixup msix
> routing entry.

This makes sense, but I would tweak the description a bit.

"The guest uses the mpcifc instruction to register the aibvo of a zpci
device, which is the starting offset of indicators in the indicator
area and thus remains constant. Each msix vector is an offset from the
aibvo. When we map a msix route to an adapter route, we should not
modify the starting offset, but instead add the vector to the starting
offset to get the absolute offset in the specific route."

I'm wondering how this was ever supposed to work?

> 
> Signed-off-by: Yi Min Zhao <zyimin@linux.vnet.ibm.com>
> ---
>  target/s390x/kvm.c | 4 +---
>  1 file changed, 1 insertion(+), 3 deletions(-)
> 
> diff --git a/target/s390x/kvm.c b/target/s390x/kvm.c
> index e348bfb7cc..c08b7757e7 100644
> --- a/target/s390x/kvm.c
> +++ b/target/s390x/kvm.c
> @@ -2515,14 +2515,12 @@ int kvm_arch_fixup_msi_route(struct kvm_irq_routing_entry *route,
>          return -ENODEV;
>      }
>  
> -    pbdev->routes.adapter.ind_offset = vec;
> -
>      route->type = KVM_IRQ_ROUTING_S390_ADAPTER;
>      route->flags = 0;
>      route->u.adapter.summary_addr = pbdev->routes.adapter.summary_addr;
>      route->u.adapter.ind_addr = pbdev->routes.adapter.ind_addr;
>      route->u.adapter.summary_offset = pbdev->routes.adapter.summary_offset;
> -    route->u.adapter.ind_offset = pbdev->routes.adapter.ind_offset;
> +    route->u.adapter.ind_offset = pbdev->routes.adapter.ind_offset + vec;
>      route->u.adapter.adapter_id = pbdev->routes.adapter.adapter_id;
>      return 0;
>  }

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [Qemu-devel] [PATCH 4/4] s390x/pci: add iommu replay callback
  2017-08-28  8:04 ` [Qemu-devel] [PATCH 4/4] s390x/pci: add iommu replay callback Yi Min Zhao
@ 2017-08-28 15:57   ` Cornelia Huck
  2017-08-29  4:46     ` Yi Min Zhao
  0 siblings, 1 reply; 28+ messages in thread
From: Cornelia Huck @ 2017-08-28 15:57 UTC (permalink / raw)
  To: Yi Min Zhao
  Cc: qemu-devel, borntraeger, pasic, pmorel, agraf, richard.henderson

On Mon, 28 Aug 2017 10:04:47 +0200
Yi Min Zhao <zyimin@linux.vnet.ibm.com> wrote:

> Let's introduce iommu replay callback for s390 pci iommu memory region.
> Currently we don't need any dma mapping replay. So let it return
> directly. This implementation will avoid meaningless loops calling
> translation callback.
> 
> Reviewed-by: Pierre Morel <pmorel@linux.vnet.ibm.com>
> Reviewed-by: Halil Pasic <pasic@linux.vnet.ibm.com>
> Signed-off-by: Yi Min Zhao <zyimin@linux.vnet.ibm.com>
> ---
>  hw/s390x/s390-pci-bus.c | 8 ++++++++
>  1 file changed, 8 insertions(+)
> 
> diff --git a/hw/s390x/s390-pci-bus.c b/hw/s390x/s390-pci-bus.c
> index 9e1f7ff5c5..359509ccea 100644
> --- a/hw/s390x/s390-pci-bus.c
> +++ b/hw/s390x/s390-pci-bus.c
> @@ -407,6 +407,13 @@ static IOMMUTLBEntry s390_translate_iommu(IOMMUMemoryRegion *mr, hwaddr addr,
>      return ret;
>  }
>  
> +static void s390_pci_iommu_replay(IOMMUMemoryRegion *iommu,
> +                                  IOMMUNotifier *notifier)
> +{
> +    /* we don't need iommu replay currently */

This really needs to be heavier on the _why_. My guess is that anything
which would require replay goes through the rpcit instruction?

> +    return;
> +}
> +
>  static S390PCIIOMMU *s390_pci_get_iommu(S390pciState *s, PCIBus *bus,
>                                          int devfn)
>  {
> @@ -1055,6 +1062,7 @@ static void s390_iommu_memory_region_class_init(ObjectClass *klass, void *data)
>      IOMMUMemoryRegionClass *imrc = IOMMU_MEMORY_REGION_CLASS(klass);
>  
>      imrc->translate = s390_translate_iommu;
> +    imrc->replay = s390_pci_iommu_replay;
>  }
>  
>  static const TypeInfo s390_iommu_memory_region_info = {

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [Qemu-devel] [PATCH 1/4] s390x/pci: fixup trap_msix()
  2017-08-28 14:51   ` Cornelia Huck
@ 2017-08-29  4:32     ` Yi Min Zhao
  2017-08-29  8:00       ` Cornelia Huck
  0 siblings, 1 reply; 28+ messages in thread
From: Yi Min Zhao @ 2017-08-29  4:32 UTC (permalink / raw)
  To: Cornelia Huck
  Cc: qemu-devel, borntraeger, pasic, pmorel, agraf, richard.henderson



在 2017/8/28 下午10:51, Cornelia Huck 写道:
> On Mon, 28 Aug 2017 10:04:44 +0200
> Yi Min Zhao <zyimin@linux.vnet.ibm.com> wrote:
>
>> The function trap_msix() is to check if pcistg instruction would access
>> msix table entries. The correct boundary condition should be
>> [table_offset, table_offset+entries*entry_size). But the current
>> condition calculated misses the last entry. So let's fixup it.
>>
>> Acked-by: Dong Jia Shi <bjsdjshi@linux.vnet.ibm.com>
>> Reviewed-by: Pierre Morel <pmorel@linux.vnet.ibm.com>
>> Signed-off-by: Yi Min Zhao <zyimin@linux.vnet.ibm.com>
>> ---
>>   hw/s390x/s390-pci-inst.c | 4 ++--
>>   1 file changed, 2 insertions(+), 2 deletions(-)
>>
>> diff --git a/hw/s390x/s390-pci-inst.c b/hw/s390x/s390-pci-inst.c
>> index b7beb8c36a..eba9ffb5f2 100644
>> --- a/hw/s390x/s390-pci-inst.c
>> +++ b/hw/s390x/s390-pci-inst.c
>> @@ -440,8 +440,8 @@ static int trap_msix(S390PCIBusDevice *pbdev, uint64_t offset, uint8_t pcias)
>>   {
>>       if (pbdev->msix.available && pbdev->msix.table_bar == pcias &&
>>           offset >= pbdev->msix.table_offset &&
>> -        offset <= pbdev->msix.table_offset +
>> -                  (pbdev->msix.entries - 1) * PCI_MSIX_ENTRY_SIZE) {
>> +        offset < (pbdev->msix.table_offset +
>> +                  pbdev->msix.entries * PCI_MSIX_ENTRY_SIZE)) {
>>           return 1;
>>       } else {
>>           return 0;
> What happened before due to the miscalculation? Write to wrong memory
> region?
>
>
We tried to plug virtio-net pci device but failed. After inspected, we
found that the device uses two msix entries but the last one was
missed. Then we cannot register interrupt successfully because we
should call trap_msixi() in order to save some useful and arch
information into msix message. But what about wrong memory region
didn't happen.

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [Qemu-devel] [PATCH 2/4] s390x/pci: remove idx from msix msg data
  2017-08-28 15:04   ` Cornelia Huck
@ 2017-08-29  4:33     ` Yi Min Zhao
  0 siblings, 0 replies; 28+ messages in thread
From: Yi Min Zhao @ 2017-08-29  4:33 UTC (permalink / raw)
  To: qemu-devel



在 2017/8/28 下午11:04, Cornelia Huck 写道:
> On Mon, 28 Aug 2017 10:04:45 +0200
> Yi Min Zhao <zyimin@linux.vnet.ibm.com> wrote:
>
>> PCIDevcie pointer has been a parameter of kvm_arch_fixup_msi_route().
> s/PCIDevcie/PCIDevice
Thanks!
>
>> So we don't need to store zpci idx in msix message data to find out the
>> specific zpci device. Instead, we could use pci device id to find its
>> corresponding zpci device.
>>
>> Signed-off-by: Yi Min Zhao <zyimin@linux.vnet.ibm.com>
>> ---
>>   hw/s390x/s390-pci-bus.c  | 16 +++++-----------
>>   hw/s390x/s390-pci-bus.h  |  2 ++
>>   hw/s390x/s390-pci-inst.c | 24 ------------------------
>>   target/s390x/kvm.c       |  7 +++++--
>>   4 files changed, 12 insertions(+), 37 deletions(-)
>> diff --git a/target/s390x/kvm.c b/target/s390x/kvm.c
>> index 1c68c36663..e348bfb7cc 100644
>> --- a/target/s390x/kvm.c
>> +++ b/target/s390x/kvm.c
>> @@ -2503,10 +2503,13 @@ int kvm_arch_fixup_msi_route(struct kvm_irq_routing_entry *route,
>>                                uint64_t address, uint32_t data, PCIDevice *dev)
>>   {
>>       S390PCIBusDevice *pbdev;
>> -    uint32_t idx = data >> ZPCI_MSI_VEC_BITS;
>>       uint32_t vec = data & ZPCI_MSI_VEC_MASK;
>>   
>> -    pbdev = s390_pci_find_dev_by_idx(s390_get_phb(), idx);
>> +    if (!dev) {
>> +        return -ENODEV;
>> +    }
>> +
>> +    pbdev = s390_pci_find_dev_by_target(s390_get_phb(), DEVICE(dev)->id);
> You need to replace the stub for s390_pci_find_dev_by_idx() with a stub
> for s390_pci_find_dev_by_target() in s390-pci-stubs.c (on my s390-next
> branch), so that it compiles without CONFIG_PCI.
OK. Got it.
>
>>       if (!pbdev) {
>>           DPRINTF("add_msi_route no dev\n");
>>           return -ENODEV;
>
>

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [Qemu-devel] [PATCH 3/4] s390x/pci: fixup ind_offset of msix routing entry
  2017-08-28 15:33   ` Cornelia Huck
@ 2017-08-29  4:39     ` Yi Min Zhao
  0 siblings, 0 replies; 28+ messages in thread
From: Yi Min Zhao @ 2017-08-29  4:39 UTC (permalink / raw)
  To: Cornelia Huck
  Cc: qemu-devel, borntraeger, pasic, pmorel, agraf, richard.henderson



在 2017/8/28 下午11:33, Cornelia Huck 写道:
> On Mon, 28 Aug 2017 10:04:46 +0200
> Yi Min Zhao <zyimin@linux.vnet.ibm.com> wrote:
>
>> The aibvo of zpci device should be constant after issued mpcifc
>> registering irqs instruction. Each msix vector should offset from the
>> aibvo. But for flic adapter interrupt, we should use the absolute
>> offset within the aibv. So let's use the aibvo+vector to fixup msix
>> routing entry.
> This makes sense, but I would tweak the description a bit.
>
> "The guest uses the mpcifc instruction to register the aibvo of a zpci
> device, which is the starting offset of indicators in the indicator
> area and thus remains constant. Each msix vector is an offset from the
> aibvo. When we map a msix route to an adapter route, we should not
> modify the starting offset, but instead add the vector to the starting
> offset to get the absolute offset in the specific route."
Much better. Thanks!
>
> I'm wondering how this was ever supposed to work?
I investigated this. Linux kernel always uses 0 as starting offset for
aibvo. And each msix entry is only registered one time. So we didn't
encounter any problem. But the logic here is not right obviously. It
is just a coincidence.

>
>> Signed-off-by: Yi Min Zhao <zyimin@linux.vnet.ibm.com>
>> ---
>>   target/s390x/kvm.c | 4 +---
>>   1 file changed, 1 insertion(+), 3 deletions(-)
>>
>> diff --git a/target/s390x/kvm.c b/target/s390x/kvm.c
>> index e348bfb7cc..c08b7757e7 100644
>> --- a/target/s390x/kvm.c
>> +++ b/target/s390x/kvm.c
>> @@ -2515,14 +2515,12 @@ int kvm_arch_fixup_msi_route(struct kvm_irq_routing_entry *route,
>>           return -ENODEV;
>>       }
>>   
>> -    pbdev->routes.adapter.ind_offset = vec;
>> -
>>       route->type = KVM_IRQ_ROUTING_S390_ADAPTER;
>>       route->flags = 0;
>>       route->u.adapter.summary_addr = pbdev->routes.adapter.summary_addr;
>>       route->u.adapter.ind_addr = pbdev->routes.adapter.ind_addr;
>>       route->u.adapter.summary_offset = pbdev->routes.adapter.summary_offset;
>> -    route->u.adapter.ind_offset = pbdev->routes.adapter.ind_offset;
>> +    route->u.adapter.ind_offset = pbdev->routes.adapter.ind_offset + vec;
>>       route->u.adapter.adapter_id = pbdev->routes.adapter.adapter_id;
>>       return 0;
>>   }
>

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [Qemu-devel] [PATCH 4/4] s390x/pci: add iommu replay callback
  2017-08-28 15:57   ` Cornelia Huck
@ 2017-08-29  4:46     ` Yi Min Zhao
  2017-08-29  8:07       ` Cornelia Huck
  0 siblings, 1 reply; 28+ messages in thread
From: Yi Min Zhao @ 2017-08-29  4:46 UTC (permalink / raw)
  To: qemu-devel



在 2017/8/28 下午11:57, Cornelia Huck 写道:
> On Mon, 28 Aug 2017 10:04:47 +0200
> Yi Min Zhao <zyimin@linux.vnet.ibm.com> wrote:
>
>> Let's introduce iommu replay callback for s390 pci iommu memory region.
>> Currently we don't need any dma mapping replay. So let it return
>> directly. This implementation will avoid meaningless loops calling
>> translation callback.
>>
>> Reviewed-by: Pierre Morel <pmorel@linux.vnet.ibm.com>
>> Reviewed-by: Halil Pasic <pasic@linux.vnet.ibm.com>
>> Signed-off-by: Yi Min Zhao <zyimin@linux.vnet.ibm.com>
>> ---
>>   hw/s390x/s390-pci-bus.c | 8 ++++++++
>>   1 file changed, 8 insertions(+)
>>
>> diff --git a/hw/s390x/s390-pci-bus.c b/hw/s390x/s390-pci-bus.c
>> index 9e1f7ff5c5..359509ccea 100644
>> --- a/hw/s390x/s390-pci-bus.c
>> +++ b/hw/s390x/s390-pci-bus.c
>> @@ -407,6 +407,13 @@ static IOMMUTLBEntry s390_translate_iommu(IOMMUMemoryRegion *mr, hwaddr addr,
>>       return ret;
>>   }
>>   
>> +static void s390_pci_iommu_replay(IOMMUMemoryRegion *iommu,
>> +                                  IOMMUNotifier *notifier)
>> +{
>> +    /* we don't need iommu replay currently */
> This really needs to be heavier on the _why_. My guess is that anything
> which would require replay goes through the rpcit instruction?
My understanding is:
Our arch is different from others. Each pci device has its own iommu, not
like other archs' implementation. So currently there must be no existing
mappings belonging to any newly plugged pci device whose iommu doesn't
have any mapping at the time.
In addition, it's also due to vfio blocking migration. If vfio-pci supports
migration in future, we could implement iommu replay by that time.
>
>> +    return;
>> +}
>> +
>>   static S390PCIIOMMU *s390_pci_get_iommu(S390pciState *s, PCIBus *bus,
>>                                           int devfn)
>>   {
>> @@ -1055,6 +1062,7 @@ static void s390_iommu_memory_region_class_init(ObjectClass *klass, void *data)
>>       IOMMUMemoryRegionClass *imrc = IOMMU_MEMORY_REGION_CLASS(klass);
>>   
>>       imrc->translate = s390_translate_iommu;
>> +    imrc->replay = s390_pci_iommu_replay;
>>   }
>>   
>>   static const TypeInfo s390_iommu_memory_region_info = {
>
>

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [Qemu-devel] [PATCH 1/4] s390x/pci: fixup trap_msix()
  2017-08-29  4:32     ` Yi Min Zhao
@ 2017-08-29  8:00       ` Cornelia Huck
  2017-08-29  8:05         ` Yi Min Zhao
  2017-08-29  8:12         ` Yi Min Zhao
  0 siblings, 2 replies; 28+ messages in thread
From: Cornelia Huck @ 2017-08-29  8:00 UTC (permalink / raw)
  To: Yi Min Zhao
  Cc: qemu-devel, borntraeger, pasic, pmorel, agraf, richard.henderson

On Tue, 29 Aug 2017 12:32:17 +0800
Yi Min Zhao <zyimin@linux.vnet.ibm.com> wrote:

> 在 2017/8/28 下午10:51, Cornelia Huck 写道:
> > On Mon, 28 Aug 2017 10:04:44 +0200
> > Yi Min Zhao <zyimin@linux.vnet.ibm.com> wrote:
> >  
> >> The function trap_msix() is to check if pcistg instruction would access
> >> msix table entries. The correct boundary condition should be
> >> [table_offset, table_offset+entries*entry_size). But the current
> >> condition calculated misses the last entry. So let's fixup it.
> >>
> >> Acked-by: Dong Jia Shi <bjsdjshi@linux.vnet.ibm.com>
> >> Reviewed-by: Pierre Morel <pmorel@linux.vnet.ibm.com>
> >> Signed-off-by: Yi Min Zhao <zyimin@linux.vnet.ibm.com>
> >> ---
> >>   hw/s390x/s390-pci-inst.c | 4 ++--
> >>   1 file changed, 2 insertions(+), 2 deletions(-)
> >>
> >> diff --git a/hw/s390x/s390-pci-inst.c b/hw/s390x/s390-pci-inst.c
> >> index b7beb8c36a..eba9ffb5f2 100644
> >> --- a/hw/s390x/s390-pci-inst.c
> >> +++ b/hw/s390x/s390-pci-inst.c
> >> @@ -440,8 +440,8 @@ static int trap_msix(S390PCIBusDevice *pbdev, uint64_t offset, uint8_t pcias)
> >>   {
> >>       if (pbdev->msix.available && pbdev->msix.table_bar == pcias &&
> >>           offset >= pbdev->msix.table_offset &&
> >> -        offset <= pbdev->msix.table_offset +
> >> -                  (pbdev->msix.entries - 1) * PCI_MSIX_ENTRY_SIZE) {
> >> +        offset < (pbdev->msix.table_offset +
> >> +                  pbdev->msix.entries * PCI_MSIX_ENTRY_SIZE)) {
> >>           return 1;
> >>       } else {
> >>           return 0;  
> > What happened before due to the miscalculation? Write to wrong memory
> > region?
> >
> >  
> We tried to plug virtio-net pci device but failed. After inspected, we
> found that the device uses two msix entries but the last one was
> missed. Then we cannot register interrupt successfully because we
> should call trap_msixi() in order to save some useful and arch
> information into msix message. But what about wrong memory region
> didn't happen.

So, the guest just was not able to use the second msix entry, but did
not get any exception?

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [Qemu-devel] [PATCH 1/4] s390x/pci: fixup trap_msix()
  2017-08-29  8:00       ` Cornelia Huck
@ 2017-08-29  8:05         ` Yi Min Zhao
  2017-08-29  8:12         ` Yi Min Zhao
  1 sibling, 0 replies; 28+ messages in thread
From: Yi Min Zhao @ 2017-08-29  8:05 UTC (permalink / raw)
  To: Cornelia Huck
  Cc: qemu-devel, borntraeger, pasic, pmorel, agraf, richard.henderson



在 2017/8/29 下午4:00, Cornelia Huck 写道:
> On Tue, 29 Aug 2017 12:32:17 +0800
> Yi Min Zhao <zyimin@linux.vnet.ibm.com> wrote:
>
>> 在 2017/8/28 下午10:51, Cornelia Huck 写道:
>>> On Mon, 28 Aug 2017 10:04:44 +0200
>>> Yi Min Zhao <zyimin@linux.vnet.ibm.com> wrote:
>>>   
>>>> The function trap_msix() is to check if pcistg instruction would access
>>>> msix table entries. The correct boundary condition should be
>>>> [table_offset, table_offset+entries*entry_size). But the current
>>>> condition calculated misses the last entry. So let's fixup it.
>>>>
>>>> Acked-by: Dong Jia Shi <bjsdjshi@linux.vnet.ibm.com>
>>>> Reviewed-by: Pierre Morel <pmorel@linux.vnet.ibm.com>
>>>> Signed-off-by: Yi Min Zhao <zyimin@linux.vnet.ibm.com>
>>>> ---
>>>>    hw/s390x/s390-pci-inst.c | 4 ++--
>>>>    1 file changed, 2 insertions(+), 2 deletions(-)
>>>>
>>>> diff --git a/hw/s390x/s390-pci-inst.c b/hw/s390x/s390-pci-inst.c
>>>> index b7beb8c36a..eba9ffb5f2 100644
>>>> --- a/hw/s390x/s390-pci-inst.c
>>>> +++ b/hw/s390x/s390-pci-inst.c
>>>> @@ -440,8 +440,8 @@ static int trap_msix(S390PCIBusDevice *pbdev, uint64_t offset, uint8_t pcias)
>>>>    {
>>>>        if (pbdev->msix.available && pbdev->msix.table_bar == pcias &&
>>>>            offset >= pbdev->msix.table_offset &&
>>>> -        offset <= pbdev->msix.table_offset +
>>>> -                  (pbdev->msix.entries - 1) * PCI_MSIX_ENTRY_SIZE) {
>>>> +        offset < (pbdev->msix.table_offset +
>>>> +                  pbdev->msix.entries * PCI_MSIX_ENTRY_SIZE)) {
>>>>            return 1;
>>>>        } else {
>>>>            return 0;
>>> What happened before due to the miscalculation? Write to wrong memory
>>> region?
>>>
>>>   
>> We tried to plug virtio-net pci device but failed. After inspected, we
>> found that the device uses two msix entries but the last one was
>> missed. Then we cannot register interrupt successfully because we
>> should call trap_msixi() in order to save some useful and arch
>> information into msix message. But what about wrong memory region
>> didn't happen.
> So, the guest just was not able to use the second msix entry, but did
> not get any exception?
>
>
Yes, didn't get any exception. The guest just kept waiting for something
(I guess that might be the response for interrupt register) and then the
system had no response. What I can do is only destroy the guest.

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [Qemu-devel] [PATCH 4/4] s390x/pci: add iommu replay callback
  2017-08-29  4:46     ` Yi Min Zhao
@ 2017-08-29  8:07       ` Cornelia Huck
  2017-08-29  8:26         ` Yi Min Zhao
  0 siblings, 1 reply; 28+ messages in thread
From: Cornelia Huck @ 2017-08-29  8:07 UTC (permalink / raw)
  To: Yi Min Zhao
  Cc: qemu-devel, borntraeger, pasic, pmorel, agraf, richard.henderson

[Restored cc:s. Please remember to do reply-all.]

On Tue, 29 Aug 2017 12:46:51 +0800
Yi Min Zhao <zyimin@linux.vnet.ibm.com> wrote:

> 在 2017/8/28 下午11:57, Cornelia Huck 写道:
> > On Mon, 28 Aug 2017 10:04:47 +0200
> > Yi Min Zhao <zyimin@linux.vnet.ibm.com> wrote:
> >
> >> Let's introduce iommu replay callback for s390 pci iommu memory region.
> >> Currently we don't need any dma mapping replay. So let it return
> >> directly. This implementation will avoid meaningless loops calling
> >> translation callback.
> >>
> >> Reviewed-by: Pierre Morel <pmorel@linux.vnet.ibm.com>
> >> Reviewed-by: Halil Pasic <pasic@linux.vnet.ibm.com>
> >> Signed-off-by: Yi Min Zhao <zyimin@linux.vnet.ibm.com>
> >> ---
> >>   hw/s390x/s390-pci-bus.c | 8 ++++++++
> >>   1 file changed, 8 insertions(+)
> >>
> >> diff --git a/hw/s390x/s390-pci-bus.c b/hw/s390x/s390-pci-bus.c
> >> index 9e1f7ff5c5..359509ccea 100644
> >> --- a/hw/s390x/s390-pci-bus.c
> >> +++ b/hw/s390x/s390-pci-bus.c
> >> @@ -407,6 +407,13 @@ static IOMMUTLBEntry s390_translate_iommu(IOMMUMemoryRegion *mr, hwaddr addr,
> >>       return ret;
> >>   }
> >>   
> >> +static void s390_pci_iommu_replay(IOMMUMemoryRegion *iommu,
> >> +                                  IOMMUNotifier *notifier)
> >> +{
> >> +    /* we don't need iommu replay currently */
> > This really needs to be heavier on the _why_. My guess is that anything
> > which would require replay goes through the rpcit instruction?
> My understanding is:
> Our arch is different from others. Each pci device has its own iommu, not
> like other archs' implementation. So currently there must be no existing
> mappings belonging to any newly plugged pci device whose iommu doesn't
> have any mapping at the time.

So please put that explanation into the function. (Also, "currently"?
Are we expecting it to change?)

> In addition, it's also due to vfio blocking migration. If vfio-pci supports
> migration in future, we could implement iommu replay by that time.

That's not an argument: This is the base zpci support, it should not
depend on the supported devices and what they do. (What's the status of
virtio-pci, btw? Does it work with your patches applied, or is there
still more to be done?)

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [Qemu-devel] [PATCH 1/4] s390x/pci: fixup trap_msix()
  2017-08-29  8:00       ` Cornelia Huck
  2017-08-29  8:05         ` Yi Min Zhao
@ 2017-08-29  8:12         ` Yi Min Zhao
  2017-08-29  8:22           ` Cornelia Huck
  1 sibling, 1 reply; 28+ messages in thread
From: Yi Min Zhao @ 2017-08-29  8:12 UTC (permalink / raw)
  To: Cornelia Huck
  Cc: qemu-devel, borntraeger, pasic, pmorel, agraf, richard.henderson



在 2017/8/29 下午4:00, Cornelia Huck 写道:
> On Tue, 29 Aug 2017 12:32:17 +0800
> Yi Min Zhao <zyimin@linux.vnet.ibm.com> wrote:
>
>> 在 2017/8/28 下午10:51, Cornelia Huck 写道:
>>> On Mon, 28 Aug 2017 10:04:44 +0200
>>> Yi Min Zhao <zyimin@linux.vnet.ibm.com> wrote:
>>>   
>>>> The function trap_msix() is to check if pcistg instruction would access
>>>> msix table entries. The correct boundary condition should be
>>>> [table_offset, table_offset+entries*entry_size). But the current
>>>> condition calculated misses the last entry. So let's fixup it.
>>>>
>>>> Acked-by: Dong Jia Shi <bjsdjshi@linux.vnet.ibm.com>
>>>> Reviewed-by: Pierre Morel <pmorel@linux.vnet.ibm.com>
>>>> Signed-off-by: Yi Min Zhao <zyimin@linux.vnet.ibm.com>
>>>> ---
>>>>    hw/s390x/s390-pci-inst.c | 4 ++--
>>>>    1 file changed, 2 insertions(+), 2 deletions(-)
>>>>
>>>> diff --git a/hw/s390x/s390-pci-inst.c b/hw/s390x/s390-pci-inst.c
>>>> index b7beb8c36a..eba9ffb5f2 100644
>>>> --- a/hw/s390x/s390-pci-inst.c
>>>> +++ b/hw/s390x/s390-pci-inst.c
>>>> @@ -440,8 +440,8 @@ static int trap_msix(S390PCIBusDevice *pbdev, uint64_t offset, uint8_t pcias)
>>>>    {
>>>>        if (pbdev->msix.available && pbdev->msix.table_bar == pcias &&
>>>>            offset >= pbdev->msix.table_offset &&
>>>> -        offset <= pbdev->msix.table_offset +
>>>> -                  (pbdev->msix.entries - 1) * PCI_MSIX_ENTRY_SIZE) {
>>>> +        offset < (pbdev->msix.table_offset +
>>>> +                  pbdev->msix.entries * PCI_MSIX_ENTRY_SIZE)) {
>>>>            return 1;
>>>>        } else {
>>>>            return 0;
>>> What happened before due to the miscalculation? Write to wrong memory
>>> region?
>>>
>>>   
>> We tried to plug virtio-net pci device but failed. After inspected, we
>> found that the device uses two msix entries but the last one was
>> missed. Then we cannot register interrupt successfully because we
>> should call trap_msixi() in order to save some useful and arch
>> information into msix message. But what about wrong memory region
>> didn't happen.
> So, the guest just was not able to use the second msix entry, but did
> not get any exception?
>
>
Forget one thing. The zpci idx is saved in msix message. The second msix 
entry has not been
trapped so that no idx has been saved, on the other hand idx 0 is saved. So
kvm_arch_fixup_msi_route() will update irq route according to the zpci 
device whose idx is 0.
So the wrong logic in trap_msix() will result that flic mixes different 
pci devices' adapter interrupts.

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [Qemu-devel] [PATCH 1/4] s390x/pci: fixup trap_msix()
  2017-08-29  8:12         ` Yi Min Zhao
@ 2017-08-29  8:22           ` Cornelia Huck
  2017-08-29  8:33             ` Yi Min Zhao
  0 siblings, 1 reply; 28+ messages in thread
From: Cornelia Huck @ 2017-08-29  8:22 UTC (permalink / raw)
  To: Yi Min Zhao
  Cc: qemu-devel, borntraeger, pasic, pmorel, agraf, richard.henderson

On Tue, 29 Aug 2017 16:12:26 +0800
Yi Min Zhao <zyimin@linux.vnet.ibm.com> wrote:

> 在 2017/8/29 下午4:00, Cornelia Huck 写道:
> > On Tue, 29 Aug 2017 12:32:17 +0800
> > Yi Min Zhao <zyimin@linux.vnet.ibm.com> wrote:
> >  
> >> 在 2017/8/28 下午10:51, Cornelia Huck 写道:  
> >>> On Mon, 28 Aug 2017 10:04:44 +0200
> >>> Yi Min Zhao <zyimin@linux.vnet.ibm.com> wrote:
> >>>     
> >>>> The function trap_msix() is to check if pcistg instruction would access
> >>>> msix table entries. The correct boundary condition should be
> >>>> [table_offset, table_offset+entries*entry_size). But the current
> >>>> condition calculated misses the last entry. So let's fixup it.
> >>>>
> >>>> Acked-by: Dong Jia Shi <bjsdjshi@linux.vnet.ibm.com>
> >>>> Reviewed-by: Pierre Morel <pmorel@linux.vnet.ibm.com>
> >>>> Signed-off-by: Yi Min Zhao <zyimin@linux.vnet.ibm.com>
> >>>> ---
> >>>>    hw/s390x/s390-pci-inst.c | 4 ++--
> >>>>    1 file changed, 2 insertions(+), 2 deletions(-)
> >>>>
> >>>> diff --git a/hw/s390x/s390-pci-inst.c b/hw/s390x/s390-pci-inst.c
> >>>> index b7beb8c36a..eba9ffb5f2 100644
> >>>> --- a/hw/s390x/s390-pci-inst.c
> >>>> +++ b/hw/s390x/s390-pci-inst.c
> >>>> @@ -440,8 +440,8 @@ static int trap_msix(S390PCIBusDevice *pbdev, uint64_t offset, uint8_t pcias)
> >>>>    {
> >>>>        if (pbdev->msix.available && pbdev->msix.table_bar == pcias &&
> >>>>            offset >= pbdev->msix.table_offset &&
> >>>> -        offset <= pbdev->msix.table_offset +
> >>>> -                  (pbdev->msix.entries - 1) * PCI_MSIX_ENTRY_SIZE) {
> >>>> +        offset < (pbdev->msix.table_offset +
> >>>> +                  pbdev->msix.entries * PCI_MSIX_ENTRY_SIZE)) {
> >>>>            return 1;
> >>>>        } else {
> >>>>            return 0;  
> >>> What happened before due to the miscalculation? Write to wrong memory
> >>> region?
> >>>
> >>>     
> >> We tried to plug virtio-net pci device but failed. After inspected, we
> >> found that the device uses two msix entries but the last one was
> >> missed. Then we cannot register interrupt successfully because we
> >> should call trap_msixi() in order to save some useful and arch
> >> information into msix message. But what about wrong memory region
> >> didn't happen.  
> > So, the guest just was not able to use the second msix entry, but did
> > not get any exception?
> >
> >  
> Forget one thing. The zpci idx is saved in msix message. The second msix 
> entry has not been
> trapped so that no idx has been saved, on the other hand idx 0 is saved. So
> kvm_arch_fixup_msi_route() will update irq route according to the zpci 
> device whose idx is 0.
> So the wrong logic in trap_msix() will result that flic mixes different 
> pci devices' adapter interrupts.

Ouch. So this only ever worked for the small subset of pci devices we
can passthrough (assuming none of them use more than one msix entry)?

I'm tempted to have this cc:ed to stable so we can fixup 2.10 (which is
the first version with usable zpci support).

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [Qemu-devel] [PATCH 4/4] s390x/pci: add iommu replay callback
  2017-08-29  8:07       ` Cornelia Huck
@ 2017-08-29  8:26         ` Yi Min Zhao
  2017-08-29  9:33           ` Cornelia Huck
  0 siblings, 1 reply; 28+ messages in thread
From: Yi Min Zhao @ 2017-08-29  8:26 UTC (permalink / raw)
  To: Cornelia Huck
  Cc: qemu-devel, borntraeger, pasic, pmorel, agraf, richard.henderson



在 2017/8/29 下午4:07, Cornelia Huck 写道:
> [Restored cc:s. Please remember to do reply-all.]
>
> On Tue, 29 Aug 2017 12:46:51 +0800
> Yi Min Zhao <zyimin@linux.vnet.ibm.com> wrote:
>
>> 在 2017/8/28 下午11:57, Cornelia Huck 写道:
>>> On Mon, 28 Aug 2017 10:04:47 +0200
>>> Yi Min Zhao <zyimin@linux.vnet.ibm.com> wrote:
>>>
>>>> Let's introduce iommu replay callback for s390 pci iommu memory region.
>>>> Currently we don't need any dma mapping replay. So let it return
>>>> directly. This implementation will avoid meaningless loops calling
>>>> translation callback.
>>>>
>>>> Reviewed-by: Pierre Morel <pmorel@linux.vnet.ibm.com>
>>>> Reviewed-by: Halil Pasic <pasic@linux.vnet.ibm.com>
>>>> Signed-off-by: Yi Min Zhao <zyimin@linux.vnet.ibm.com>
>>>> ---
>>>>    hw/s390x/s390-pci-bus.c | 8 ++++++++
>>>>    1 file changed, 8 insertions(+)
>>>>
>>>> diff --git a/hw/s390x/s390-pci-bus.c b/hw/s390x/s390-pci-bus.c
>>>> index 9e1f7ff5c5..359509ccea 100644
>>>> --- a/hw/s390x/s390-pci-bus.c
>>>> +++ b/hw/s390x/s390-pci-bus.c
>>>> @@ -407,6 +407,13 @@ static IOMMUTLBEntry s390_translate_iommu(IOMMUMemoryRegion *mr, hwaddr addr,
>>>>        return ret;
>>>>    }
>>>>    
>>>> +static void s390_pci_iommu_replay(IOMMUMemoryRegion *iommu,
>>>> +                                  IOMMUNotifier *notifier)
>>>> +{
>>>> +    /* we don't need iommu replay currently */
>>> This really needs to be heavier on the _why_. My guess is that anything
>>> which would require replay goes through the rpcit instruction?
>> My understanding is:
>> Our arch is different from others. Each pci device has its own iommu, not
>> like other archs' implementation. So currently there must be no existing
>> mappings belonging to any newly plugged pci device whose iommu doesn't
>> have any mapping at the time.
> So please put that explanation into the function. (Also, "currently"?
> Are we expecting it to change?)
The iommu replay function is originally introduced for vfio. I think 
they want to re-build
the existing mappings because vfio has a copy of mappings in kernel. For 
our case,
the mappings would be cleanup when a pci device unplugged, and new mappings
would be created when a pci device plugged. I think replay only can 
happen during
vfio-pci device migration.
>
>> In addition, it's also due to vfio blocking migration. If vfio-pci supports
>> migration in future, we could implement iommu replay by that time.
> That's not an argument: This is the base zpci support, it should not
> depend on the supported devices and what they do. (What's the status of
> virtio-pci, btw? Does it work with your patches applied, or is there
> still more to be done?)
>
>
My understanding is virtio-pci doesn't need replay. All mappings of any 
pci device are existing in
guest memory. Once the mappings is built, all of them could be migrated 
along with the guest
system. But I might misunderstand it. Please correct me.

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [Qemu-devel] [PATCH 1/4] s390x/pci: fixup trap_msix()
  2017-08-29  8:22           ` Cornelia Huck
@ 2017-08-29  8:33             ` Yi Min Zhao
  2017-08-29  8:58               ` Cornelia Huck
  0 siblings, 1 reply; 28+ messages in thread
From: Yi Min Zhao @ 2017-08-29  8:33 UTC (permalink / raw)
  To: Cornelia Huck
  Cc: qemu-devel, borntraeger, pasic, pmorel, agraf, richard.henderson



在 2017/8/29 下午4:22, Cornelia Huck 写道:
> On Tue, 29 Aug 2017 16:12:26 +0800
> Yi Min Zhao <zyimin@linux.vnet.ibm.com> wrote:
>
>> 在 2017/8/29 下午4:00, Cornelia Huck 写道:
>>> On Tue, 29 Aug 2017 12:32:17 +0800
>>> Yi Min Zhao <zyimin@linux.vnet.ibm.com> wrote:
>>>   
>>>> 在 2017/8/28 下午10:51, Cornelia Huck 写道:
>>>>> On Mon, 28 Aug 2017 10:04:44 +0200
>>>>> Yi Min Zhao <zyimin@linux.vnet.ibm.com> wrote:
>>>>>      
>>>>>> The function trap_msix() is to check if pcistg instruction would access
>>>>>> msix table entries. The correct boundary condition should be
>>>>>> [table_offset, table_offset+entries*entry_size). But the current
>>>>>> condition calculated misses the last entry. So let's fixup it.
>>>>>>
>>>>>> Acked-by: Dong Jia Shi <bjsdjshi@linux.vnet.ibm.com>
>>>>>> Reviewed-by: Pierre Morel <pmorel@linux.vnet.ibm.com>
>>>>>> Signed-off-by: Yi Min Zhao <zyimin@linux.vnet.ibm.com>
>>>>>> ---
>>>>>>     hw/s390x/s390-pci-inst.c | 4 ++--
>>>>>>     1 file changed, 2 insertions(+), 2 deletions(-)
>>>>>>
>>>>>> diff --git a/hw/s390x/s390-pci-inst.c b/hw/s390x/s390-pci-inst.c
>>>>>> index b7beb8c36a..eba9ffb5f2 100644
>>>>>> --- a/hw/s390x/s390-pci-inst.c
>>>>>> +++ b/hw/s390x/s390-pci-inst.c
>>>>>> @@ -440,8 +440,8 @@ static int trap_msix(S390PCIBusDevice *pbdev, uint64_t offset, uint8_t pcias)
>>>>>>     {
>>>>>>         if (pbdev->msix.available && pbdev->msix.table_bar == pcias &&
>>>>>>             offset >= pbdev->msix.table_offset &&
>>>>>> -        offset <= pbdev->msix.table_offset +
>>>>>> -                  (pbdev->msix.entries - 1) * PCI_MSIX_ENTRY_SIZE) {
>>>>>> +        offset < (pbdev->msix.table_offset +
>>>>>> +                  pbdev->msix.entries * PCI_MSIX_ENTRY_SIZE)) {
>>>>>>             return 1;
>>>>>>         } else {
>>>>>>             return 0;
>>>>> What happened before due to the miscalculation? Write to wrong memory
>>>>> region?
>>>>>
>>>>>      
>>>> We tried to plug virtio-net pci device but failed. After inspected, we
>>>> found that the device uses two msix entries but the last one was
>>>> missed. Then we cannot register interrupt successfully because we
>>>> should call trap_msixi() in order to save some useful and arch
>>>> information into msix message. But what about wrong memory region
>>>> didn't happen.
>>> So, the guest just was not able to use the second msix entry, but did
>>> not get any exception?
>>>
>>>   
>> Forget one thing. The zpci idx is saved in msix message. The second msix
>> entry has not been
>> trapped so that no idx has been saved, on the other hand idx 0 is saved. So
>> kvm_arch_fixup_msi_route() will update irq route according to the zpci
>> device whose idx is 0.
>> So the wrong logic in trap_msix() will result that flic mixes different
>> pci devices' adapter interrupts.
> Ouch. So this only ever worked for the small subset of pci devices we
> can passthrough (assuming none of them use more than one msix entry)?
Because any passthroughed pci devices which I tested has more than 2 
msix entries. And not all
entries will be used. I find that the last entry is never touched. But 
virtio pci is much fancy and only
uses two entries. So I encountered the issue when I tested virtio-pci 
device.
>
> I'm tempted to have this cc:ed to stable so we can fixup 2.10 (which is
> the first version with usable zpci support).
>
>
Thanks!

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [Qemu-devel] [PATCH 1/4] s390x/pci: fixup trap_msix()
  2017-08-29  8:33             ` Yi Min Zhao
@ 2017-08-29  8:58               ` Cornelia Huck
  0 siblings, 0 replies; 28+ messages in thread
From: Cornelia Huck @ 2017-08-29  8:58 UTC (permalink / raw)
  To: Yi Min Zhao
  Cc: qemu-devel, borntraeger, pasic, pmorel, agraf, richard.henderson

On Tue, 29 Aug 2017 16:33:52 +0800
Yi Min Zhao <zyimin@linux.vnet.ibm.com> wrote:

> 在 2017/8/29 下午4:22, Cornelia Huck 写道:
> > On Tue, 29 Aug 2017 16:12:26 +0800
> > Yi Min Zhao <zyimin@linux.vnet.ibm.com> wrote:
> >  
> >> 在 2017/8/29 下午4:00, Cornelia Huck 写道:  
> >>> On Tue, 29 Aug 2017 12:32:17 +0800
> >>> Yi Min Zhao <zyimin@linux.vnet.ibm.com> wrote:
> >>>     
> >>>> 在 2017/8/28 下午10:51, Cornelia Huck 写道:  
> >>>>> On Mon, 28 Aug 2017 10:04:44 +0200
> >>>>> Yi Min Zhao <zyimin@linux.vnet.ibm.com> wrote:
> >>>>>        
> >>>>>> The function trap_msix() is to check if pcistg instruction would access
> >>>>>> msix table entries. The correct boundary condition should be
> >>>>>> [table_offset, table_offset+entries*entry_size). But the current
> >>>>>> condition calculated misses the last entry. So let's fixup it.
> >>>>>>
> >>>>>> Acked-by: Dong Jia Shi <bjsdjshi@linux.vnet.ibm.com>
> >>>>>> Reviewed-by: Pierre Morel <pmorel@linux.vnet.ibm.com>
> >>>>>> Signed-off-by: Yi Min Zhao <zyimin@linux.vnet.ibm.com>
> >>>>>> ---
> >>>>>>     hw/s390x/s390-pci-inst.c | 4 ++--
> >>>>>>     1 file changed, 2 insertions(+), 2 deletions(-)
> >>>>>>
> >>>>>> diff --git a/hw/s390x/s390-pci-inst.c b/hw/s390x/s390-pci-inst.c
> >>>>>> index b7beb8c36a..eba9ffb5f2 100644
> >>>>>> --- a/hw/s390x/s390-pci-inst.c
> >>>>>> +++ b/hw/s390x/s390-pci-inst.c
> >>>>>> @@ -440,8 +440,8 @@ static int trap_msix(S390PCIBusDevice *pbdev, uint64_t offset, uint8_t pcias)
> >>>>>>     {
> >>>>>>         if (pbdev->msix.available && pbdev->msix.table_bar == pcias &&
> >>>>>>             offset >= pbdev->msix.table_offset &&
> >>>>>> -        offset <= pbdev->msix.table_offset +
> >>>>>> -                  (pbdev->msix.entries - 1) * PCI_MSIX_ENTRY_SIZE) {
> >>>>>> +        offset < (pbdev->msix.table_offset +
> >>>>>> +                  pbdev->msix.entries * PCI_MSIX_ENTRY_SIZE)) {
> >>>>>>             return 1;
> >>>>>>         } else {
> >>>>>>             return 0;  
> >>>>> What happened before due to the miscalculation? Write to wrong memory
> >>>>> region?
> >>>>>
> >>>>>        
> >>>> We tried to plug virtio-net pci device but failed. After inspected, we
> >>>> found that the device uses two msix entries but the last one was
> >>>> missed. Then we cannot register interrupt successfully because we
> >>>> should call trap_msixi() in order to save some useful and arch
> >>>> information into msix message. But what about wrong memory region
> >>>> didn't happen.  
> >>> So, the guest just was not able to use the second msix entry, but did
> >>> not get any exception?
> >>>
> >>>     
> >> Forget one thing. The zpci idx is saved in msix message. The second msix
> >> entry has not been
> >> trapped so that no idx has been saved, on the other hand idx 0 is saved. So
> >> kvm_arch_fixup_msi_route() will update irq route according to the zpci
> >> device whose idx is 0.
> >> So the wrong logic in trap_msix() will result that flic mixes different
> >> pci devices' adapter interrupts.  
> > Ouch. So this only ever worked for the small subset of pci devices we
> > can passthrough (assuming none of them use more than one msix entry)?  
> Because any passthroughed pci devices which I tested has more than 2 
> msix entries. And not all
> entries will be used. I find that the last entry is never touched. But 
> virtio pci is much fancy and only
> uses two entries. So I encountered the issue when I tested virtio-pci 
> device.

So that really sounds to me like "we've been lucky"...

> >
> > I'm tempted to have this cc:ed to stable so we can fixup 2.10 (which is
> > the first version with usable zpci support).

...and I'll add cc:stable, as we don't really have any control from
qemu which kind of devices are handled by vfio.

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [Qemu-devel] [PATCH 4/4] s390x/pci: add iommu replay callback
  2017-08-29  8:26         ` Yi Min Zhao
@ 2017-08-29  9:33           ` Cornelia Huck
  2017-08-29  9:49             ` Cornelia Huck
  2017-08-29  9:51             ` Yi Min Zhao
  0 siblings, 2 replies; 28+ messages in thread
From: Cornelia Huck @ 2017-08-29  9:33 UTC (permalink / raw)
  To: Yi Min Zhao
  Cc: qemu-devel, borntraeger, pasic, pmorel, agraf, richard.henderson

On Tue, 29 Aug 2017 16:26:10 +0800
Yi Min Zhao <zyimin@linux.vnet.ibm.com> wrote:

> 在 2017/8/29 下午4:07, Cornelia Huck 写道:
> > [Restored cc:s. Please remember to do reply-all.]
> >
> > On Tue, 29 Aug 2017 12:46:51 +0800
> > Yi Min Zhao <zyimin@linux.vnet.ibm.com> wrote:
> >  
> >> 在 2017/8/28 下午11:57, Cornelia Huck 写道:  
> >>> On Mon, 28 Aug 2017 10:04:47 +0200
> >>> Yi Min Zhao <zyimin@linux.vnet.ibm.com> wrote:
> >>>  
> >>>> Let's introduce iommu replay callback for s390 pci iommu memory region.
> >>>> Currently we don't need any dma mapping replay. So let it return
> >>>> directly. This implementation will avoid meaningless loops calling
> >>>> translation callback.
> >>>>
> >>>> Reviewed-by: Pierre Morel <pmorel@linux.vnet.ibm.com>
> >>>> Reviewed-by: Halil Pasic <pasic@linux.vnet.ibm.com>
> >>>> Signed-off-by: Yi Min Zhao <zyimin@linux.vnet.ibm.com>
> >>>> ---
> >>>>    hw/s390x/s390-pci-bus.c | 8 ++++++++
> >>>>    1 file changed, 8 insertions(+)
> >>>>
> >>>> diff --git a/hw/s390x/s390-pci-bus.c b/hw/s390x/s390-pci-bus.c
> >>>> index 9e1f7ff5c5..359509ccea 100644
> >>>> --- a/hw/s390x/s390-pci-bus.c
> >>>> +++ b/hw/s390x/s390-pci-bus.c
> >>>> @@ -407,6 +407,13 @@ static IOMMUTLBEntry s390_translate_iommu(IOMMUMemoryRegion *mr, hwaddr addr,
> >>>>        return ret;
> >>>>    }
> >>>>    
> >>>> +static void s390_pci_iommu_replay(IOMMUMemoryRegion *iommu,
> >>>> +                                  IOMMUNotifier *notifier)
> >>>> +{
> >>>> +    /* we don't need iommu replay currently */  
> >>> This really needs to be heavier on the _why_. My guess is that anything
> >>> which would require replay goes through the rpcit instruction?  
> >> My understanding is:
> >> Our arch is different from others. Each pci device has its own iommu, not
> >> like other archs' implementation. So currently there must be no existing
> >> mappings belonging to any newly plugged pci device whose iommu doesn't
> >> have any mapping at the time.  
> > So please put that explanation into the function. (Also, "currently"?
> > Are we expecting it to change?)  
> The iommu replay function is originally introduced for vfio. I think 
> they want to re-build
> the existing mappings because vfio has a copy of mappings in kernel. For 
> our case,
> the mappings would be cleanup when a pci device unplugged, and new mappings
> would be created when a pci device plugged. I think replay only can 
> happen during
> vfio-pci device migration.

So, the base reason is that it is impossible to plug a pci device on
s390x that already has iommu mappings which need to be replayed, which
is due to the "one iommu per zpci device" construct (and independent of
which devices need replay on other architectures)?

Then this should go into the explanation above. (And I'd still like to
know what "currently" refers to. I don't like to rely on some kind of
implicit assumptions - are we expecting this to break?)

> >  
> >> In addition, it's also due to vfio blocking migration. If vfio-pci supports
> >> migration in future, we could implement iommu replay by that time.  
> > That's not an argument: This is the base zpci support, it should not
> > depend on the supported devices and what they do. (What's the status of
> > virtio-pci, btw? Does it work with your patches applied, or is there
> > still more to be done?)
> >
> >  
> My understanding is virtio-pci doesn't need replay. All mappings of any 
> pci device are existing in
> guest memory. Once the mappings is built, all of them could be migrated 
> along with the guest
> system. But I might misunderstand it. Please correct me.

My question was whether virtio-pci works with your patches on top at
all - last time I checked on master, virtio-pci devices failed to
realize with a "msi-x is mandatory" message.

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [Qemu-devel] [PATCH 4/4] s390x/pci: add iommu replay callback
  2017-08-29  9:33           ` Cornelia Huck
@ 2017-08-29  9:49             ` Cornelia Huck
  2017-08-29  9:53               ` Yi Min Zhao
  2017-08-29  9:51             ` Yi Min Zhao
  1 sibling, 1 reply; 28+ messages in thread
From: Cornelia Huck @ 2017-08-29  9:49 UTC (permalink / raw)
  To: Yi Min Zhao
  Cc: qemu-devel, borntraeger, pasic, pmorel, agraf, richard.henderson

On Tue, 29 Aug 2017 11:33:53 +0200
Cornelia Huck <cohuck@redhat.com> wrote:

> My question was whether virtio-pci works with your patches on top at
> all - last time I checked on master, virtio-pci devices failed to
> realize with a "msi-x is mandatory" message.

Just checked again, I still get

qemu-system-s390x: -device virtio-rng-pci: MSI-X support is mandatory in the S390 architecture

Any clue to what is missing?

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [Qemu-devel] [PATCH 4/4] s390x/pci: add iommu replay callback
  2017-08-29  9:33           ` Cornelia Huck
  2017-08-29  9:49             ` Cornelia Huck
@ 2017-08-29  9:51             ` Yi Min Zhao
  2017-08-29  9:57               ` Cornelia Huck
  1 sibling, 1 reply; 28+ messages in thread
From: Yi Min Zhao @ 2017-08-29  9:51 UTC (permalink / raw)
  To: Cornelia Huck
  Cc: qemu-devel, borntraeger, pasic, pmorel, agraf, richard.henderson



在 2017/8/29 下午5:33, Cornelia Huck 写道:
> On Tue, 29 Aug 2017 16:26:10 +0800
> Yi Min Zhao <zyimin@linux.vnet.ibm.com> wrote:
>
>> 在 2017/8/29 下午4:07, Cornelia Huck 写道:
>>> [Restored cc:s. Please remember to do reply-all.]
>>>
>>> On Tue, 29 Aug 2017 12:46:51 +0800
>>> Yi Min Zhao <zyimin@linux.vnet.ibm.com> wrote:
>>>   
>>>> 在 2017/8/28 下午11:57, Cornelia Huck 写道:
>>>>> On Mon, 28 Aug 2017 10:04:47 +0200
>>>>> Yi Min Zhao <zyimin@linux.vnet.ibm.com> wrote:
>>>>>   
>>>>>> Let's introduce iommu replay callback for s390 pci iommu memory region.
>>>>>> Currently we don't need any dma mapping replay. So let it return
>>>>>> directly. This implementation will avoid meaningless loops calling
>>>>>> translation callback.
>>>>>>
>>>>>> Reviewed-by: Pierre Morel <pmorel@linux.vnet.ibm.com>
>>>>>> Reviewed-by: Halil Pasic <pasic@linux.vnet.ibm.com>
>>>>>> Signed-off-by: Yi Min Zhao <zyimin@linux.vnet.ibm.com>
>>>>>> ---
>>>>>>     hw/s390x/s390-pci-bus.c | 8 ++++++++
>>>>>>     1 file changed, 8 insertions(+)
>>>>>>
>>>>>> diff --git a/hw/s390x/s390-pci-bus.c b/hw/s390x/s390-pci-bus.c
>>>>>> index 9e1f7ff5c5..359509ccea 100644
>>>>>> --- a/hw/s390x/s390-pci-bus.c
>>>>>> +++ b/hw/s390x/s390-pci-bus.c
>>>>>> @@ -407,6 +407,13 @@ static IOMMUTLBEntry s390_translate_iommu(IOMMUMemoryRegion *mr, hwaddr addr,
>>>>>>         return ret;
>>>>>>     }
>>>>>>     
>>>>>> +static void s390_pci_iommu_replay(IOMMUMemoryRegion *iommu,
>>>>>> +                                  IOMMUNotifier *notifier)
>>>>>> +{
>>>>>> +    /* we don't need iommu replay currently */
>>>>> This really needs to be heavier on the _why_. My guess is that anything
>>>>> which would require replay goes through the rpcit instruction?
>>>> My understanding is:
>>>> Our arch is different from others. Each pci device has its own iommu, not
>>>> like other archs' implementation. So currently there must be no existing
>>>> mappings belonging to any newly plugged pci device whose iommu doesn't
>>>> have any mapping at the time.
>>> So please put that explanation into the function. (Also, "currently"?
>>> Are we expecting it to change?)
>> The iommu replay function is originally introduced for vfio. I think
>> they want to re-build
>> the existing mappings because vfio has a copy of mappings in kernel. For
>> our case,
>> the mappings would be cleanup when a pci device unplugged, and new mappings
>> would be created when a pci device plugged. I think replay only can
>> happen during
>> vfio-pci device migration.
> So, the base reason is that it is impossible to plug a pci device on
> s390x that already has iommu mappings which need to be replayed, which
> is due to the "one iommu per zpci device" construct (and independent of
> which devices need replay on other architectures)?
Yes.
>
> Then this should go into the explanation above. (And I'd still like to
> know what "currently" refers to. I don't like to rely on some kind of
> implicit assumptions - are we expecting this to break?)
As our discussion during internal review, we don't need to replay 
currently because vfio-pci device
doesn't support migration. 'currently' refers to the time of vfio-pci 
device migration block.
Only when vfio-pci supports migration in future, we should fill the 
replaying code.
>
>>>   
>>>> In addition, it's also due to vfio blocking migration. If vfio-pci supports
>>>> migration in future, we could implement iommu replay by that time.
>>> That's not an argument: This is the base zpci support, it should not
>>> depend on the supported devices and what they do. (What's the status of
>>> virtio-pci, btw? Does it work with your patches applied, or is there
>>> still more to be done?)
>>>
>>>   
>> My understanding is virtio-pci doesn't need replay. All mappings of any
>> pci device are existing in
>> guest memory. Once the mappings is built, all of them could be migrated
>> along with the guest
>> system. But I might misunderstand it. Please correct me.
> My question was whether virtio-pci works with your patches on top at
> all - last time I checked on master, virtio-pci devices failed to
> realize with a "msi-x is mandatory" message.
>
>
I tested. virtio-pci works fine. The message "msix is mandatory" means 
we only support
msix interrupt. If virtio-pci device (like virtio-rng) doesn't support 
msix, we don't allow it
to plug. I thinik this is not related to iommu replay.

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [Qemu-devel] [PATCH 4/4] s390x/pci: add iommu replay callback
  2017-08-29  9:49             ` Cornelia Huck
@ 2017-08-29  9:53               ` Yi Min Zhao
  0 siblings, 0 replies; 28+ messages in thread
From: Yi Min Zhao @ 2017-08-29  9:53 UTC (permalink / raw)
  To: Cornelia Huck
  Cc: qemu-devel, borntraeger, pasic, pmorel, agraf, richard.henderson



在 2017/8/29 下午5:49, Cornelia Huck 写道:
> On Tue, 29 Aug 2017 11:33:53 +0200
> Cornelia Huck <cohuck@redhat.com> wrote:
>
>> My question was whether virtio-pci works with your patches on top at
>> all - last time I checked on master, virtio-pci devices failed to
>> realize with a "msi-x is mandatory" message.
> Just checked again, I still get
>
> qemu-system-s390x: -device virtio-rng-pci: MSI-X support is mandatory in the S390 architecture
>
> Any clue to what is missing?
>
>
virtio-rng-pci doesn't support msix. But msix is required on s390 arch.
So we avoid plugging any pci device without msix support.

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [Qemu-devel] [PATCH 4/4] s390x/pci: add iommu replay callback
  2017-08-29  9:51             ` Yi Min Zhao
@ 2017-08-29  9:57               ` Cornelia Huck
  2017-08-29 10:00                 ` Yi Min Zhao
  0 siblings, 1 reply; 28+ messages in thread
From: Cornelia Huck @ 2017-08-29  9:57 UTC (permalink / raw)
  To: Yi Min Zhao
  Cc: qemu-devel, borntraeger, pasic, pmorel, agraf, richard.henderson

On Tue, 29 Aug 2017 17:51:43 +0800
Yi Min Zhao <zyimin@linux.vnet.ibm.com> wrote:

> 在 2017/8/29 下午5:33, Cornelia Huck 写道:

> > My question was whether virtio-pci works with your patches on top at
> > all - last time I checked on master, virtio-pci devices failed to
> > realize with a "msi-x is mandatory" message.
> >
> >  
> I tested. virtio-pci works fine. The message "msix is mandatory" means 
> we only support
> msix interrupt. If virtio-pci device (like virtio-rng) doesn't support 
> msix, we don't allow it
> to plug.

Ah, that's it (it's a bit strange that not all virtio-pci devices
support msi-x). I can add a virtio-net-pci device just fine.

(Maybe we can enhance the message so that it is clear that it refers to
that particular device?)

> I thinik this is not related to iommu replay.

This question was unrelated to this particular patch, more to the whole
series :)

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [Qemu-devel] [PATCH 4/4] s390x/pci: add iommu replay callback
  2017-08-29  9:57               ` Cornelia Huck
@ 2017-08-29 10:00                 ` Yi Min Zhao
  0 siblings, 0 replies; 28+ messages in thread
From: Yi Min Zhao @ 2017-08-29 10:00 UTC (permalink / raw)
  To: Cornelia Huck
  Cc: qemu-devel, borntraeger, pasic, pmorel, agraf, richard.henderson



在 2017/8/29 下午5:57, Cornelia Huck 写道:
> On Tue, 29 Aug 2017 17:51:43 +0800
> Yi Min Zhao <zyimin@linux.vnet.ibm.com> wrote:
>
>> 在 2017/8/29 下午5:33, Cornelia Huck 写道:
>>> My question was whether virtio-pci works with your patches on top at
>>> all - last time I checked on master, virtio-pci devices failed to
>>> realize with a "msi-x is mandatory" message.
>>>
>>>   
>> I tested. virtio-pci works fine. The message "msix is mandatory" means
>> we only support
>> msix interrupt. If virtio-pci device (like virtio-rng) doesn't support
>> msix, we don't allow it
>> to plug.
> Ah, that's it (it's a bit strange that not all virtio-pci devices
> support msi-x). I can add a virtio-net-pci device just fine.
>
> (Maybe we can enhance the message so that it is clear that it refers to
> that particular device?)
Hum, I think so. But it's not urgent. I could post another patch for 
message enhancement
after this series review. Do you agree?
>
>> I thinik this is not related to iommu replay.
> This question was unrelated to this particular patch, more to the whole
> series :)
>
>
Yup.

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [Qemu-devel] [PATCH 1/4] s390x/pci: fixup trap_msix()
  2017-08-28  8:04 ` [Qemu-devel] [PATCH 1/4] s390x/pci: fixup trap_msix() Yi Min Zhao
  2017-08-28 14:51   ` Cornelia Huck
@ 2017-08-30  9:47   ` Cornelia Huck
  1 sibling, 0 replies; 28+ messages in thread
From: Cornelia Huck @ 2017-08-30  9:47 UTC (permalink / raw)
  To: Yi Min Zhao
  Cc: qemu-devel, borntraeger, pasic, pmorel, agraf, richard.henderson

On Mon, 28 Aug 2017 10:04:44 +0200
Yi Min Zhao <zyimin@linux.vnet.ibm.com> wrote:

> The function trap_msix() is to check if pcistg instruction would access
> msix table entries. The correct boundary condition should be
> [table_offset, table_offset+entries*entry_size). But the current
> condition calculated misses the last entry. So let's fixup it.
> 
> Acked-by: Dong Jia Shi <bjsdjshi@linux.vnet.ibm.com>
> Reviewed-by: Pierre Morel <pmorel@linux.vnet.ibm.com>
> Signed-off-by: Yi Min Zhao <zyimin@linux.vnet.ibm.com>
> ---
>  hw/s390x/s390-pci-inst.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/hw/s390x/s390-pci-inst.c b/hw/s390x/s390-pci-inst.c
> index b7beb8c36a..eba9ffb5f2 100644
> --- a/hw/s390x/s390-pci-inst.c
> +++ b/hw/s390x/s390-pci-inst.c
> @@ -440,8 +440,8 @@ static int trap_msix(S390PCIBusDevice *pbdev, uint64_t offset, uint8_t pcias)
>  {
>      if (pbdev->msix.available && pbdev->msix.table_bar == pcias &&
>          offset >= pbdev->msix.table_offset &&
> -        offset <= pbdev->msix.table_offset +
> -                  (pbdev->msix.entries - 1) * PCI_MSIX_ENTRY_SIZE) {
> +        offset < (pbdev->msix.table_offset +
> +                  pbdev->msix.entries * PCI_MSIX_ENTRY_SIZE)) {
>          return 1;
>      } else {
>          return 0;

Added cc:stable and applied to s390-next.

^ permalink raw reply	[flat|nested] 28+ messages in thread

end of thread, other threads:[~2017-08-30  9:48 UTC | newest]

Thread overview: 28+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-08-28  8:04 [Qemu-devel] [PATCH 0/4] four zpci patches Yi Min Zhao
2017-08-28  8:04 ` [Qemu-devel] [PATCH 1/4] s390x/pci: fixup trap_msix() Yi Min Zhao
2017-08-28 14:51   ` Cornelia Huck
2017-08-29  4:32     ` Yi Min Zhao
2017-08-29  8:00       ` Cornelia Huck
2017-08-29  8:05         ` Yi Min Zhao
2017-08-29  8:12         ` Yi Min Zhao
2017-08-29  8:22           ` Cornelia Huck
2017-08-29  8:33             ` Yi Min Zhao
2017-08-29  8:58               ` Cornelia Huck
2017-08-30  9:47   ` Cornelia Huck
2017-08-28  8:04 ` [Qemu-devel] [PATCH 2/4] s390x/pci: remove idx from msix msg data Yi Min Zhao
2017-08-28 15:04   ` Cornelia Huck
2017-08-29  4:33     ` Yi Min Zhao
2017-08-28  8:04 ` [Qemu-devel] [PATCH 3/4] s390x/pci: fixup ind_offset of msix routing entry Yi Min Zhao
2017-08-28 15:33   ` Cornelia Huck
2017-08-29  4:39     ` Yi Min Zhao
2017-08-28  8:04 ` [Qemu-devel] [PATCH 4/4] s390x/pci: add iommu replay callback Yi Min Zhao
2017-08-28 15:57   ` Cornelia Huck
2017-08-29  4:46     ` Yi Min Zhao
2017-08-29  8:07       ` Cornelia Huck
2017-08-29  8:26         ` Yi Min Zhao
2017-08-29  9:33           ` Cornelia Huck
2017-08-29  9:49             ` Cornelia Huck
2017-08-29  9:53               ` Yi Min Zhao
2017-08-29  9:51             ` Yi Min Zhao
2017-08-29  9:57               ` Cornelia Huck
2017-08-29 10:00                 ` Yi Min Zhao

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.