* [PATCH v2 0/3] s390x/pci: Accommodate vfio DMA limiting
@ 2020-09-14 22:29 Matthew Rosato
  2020-09-14 22:29 ` [PATCH v2 1/3] vfio: Find DMA available capability Matthew Rosato
                   ` (2 more replies)
  0 siblings, 3 replies; 17+ messages in thread
From: Matthew Rosato @ 2020-09-14 22:29 UTC (permalink / raw)
  To: alex.williamson, cohuck
  Cc: thuth, pmorel, david, schnelle, qemu-devel, pasic, borntraeger,
	qemu-s390x, rth

Kernel commit 492855939bdb added a limit to the number of outstanding DMA
requests for a type1 vfio container.  However, lazy unmapping on s390 can
cause quite a large number of outstanding DMA requests to build up before
they are purged, potentially covering the entire guest DMA space.  This
results in unexpected 'VFIO_MAP_DMA failed: No space left on device'
conditions seen in QEMU.

This patchset adds support to QEMU for retrieving the number of allowable
DMA requests via the VFIO_IOMMU_GET_INFO ioctl.  The patches are separated
into vfio bits, which add support for reading VFIO_IOMMU_GET_INFO capability
chains and getting the per-container dma_avail value, and s390 bits, which
track DMA usage on a per-container basis.

Associated kernel patch:
https://marc.info/?l=kvm&m=160012235303068&w=2
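For reference, the info capability chain that the vfio bits walk is laid
out as a sequence of headers at byte offsets from the start of the
GET_INFO buffer, linked through each header's next field, with offset 0
terminating the chain.  A minimal self-contained model of that walk
(simplified stand-in structs, not the actual QEMU/uapi definitions):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-in for struct vfio_info_cap_header. */
struct cap_header {
    uint16_t id;
    uint16_t version;
    uint32_t next;   /* byte offset of next capability; 0 ends the chain */
};

/*
 * Walk the chain starting at cap_offset bytes into the info buffer.
 * When a header's next offset is 0, the computed pointer wraps back to
 * the buffer start, which terminates the loop.
 */
static struct cap_header *find_cap(void *buf, uint32_t cap_offset, uint16_t id)
{
    uint8_t *ptr = buf;
    struct cap_header *hdr;

    for (hdr = (void *)(ptr + cap_offset); (void *)hdr != buf;
         hdr = (void *)(ptr + hdr->next)) {
        if (hdr->id == id) {
            return hdr;
        }
    }
    return NULL;
}
```

The real code additionally checks the VFIO_IOMMU_INFO_CAPS flag before
trusting cap_offset.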

Changes from v1:
- Renamed references of dma_limit to dma_avail based on kernel change as
  the ioctl now reports the amount currently available vs the supposed
  limit.
- Because the ioctl now provides a 'living value' vs the limit, it doesn't
  seem appropriate to put it in the VFIOContainer structure without
  doing the dma_avail tracking there.  So patch 1 shrinks to the vfio helper
  routines that find the capability, and leaves vfio_connect_container alone.
- Subsequently, in patch 2 s390-pci issues its own VFIO_IOMMU_GET_INFO to
  read the dma_avail value and use it.

Matthew Rosato (3):
  vfio: Find DMA available capability
  s390x/pci: Honor DMA limits set by vfio
  vfio: Create shared routine for scanning info capabilities

 hw/s390x/s390-pci-bus.c       | 99 ++++++++++++++++++++++++++++++++++++++++---
 hw/s390x/s390-pci-bus.h       |  9 ++++
 hw/s390x/s390-pci-inst.c      | 29 ++++++++++---
 hw/s390x/s390-pci-inst.h      |  3 ++
 hw/vfio/common.c              | 50 ++++++++++++++++++----
 include/hw/vfio/vfio-common.h |  2 +
 6 files changed, 173 insertions(+), 19 deletions(-)

-- 
1.8.3.1



^ permalink raw reply	[flat|nested] 17+ messages in thread

* [PATCH v2 1/3] vfio: Find DMA available capability
  2020-09-14 22:29 [PATCH v2 0/3] s390x/pci: Accommodate vfio DMA limiting Matthew Rosato
@ 2020-09-14 22:29 ` Matthew Rosato
  2020-09-15  6:14   ` Philippe Mathieu-Daudé
  2020-09-15 10:33   ` Cornelia Huck
  2020-09-14 22:29 ` [PATCH v2 2/3] s390x/pci: Honor DMA limits set by vfio Matthew Rosato
  2020-09-14 22:29 ` [PATCH v2 3/3] vfio: Create shared routine for scanning info capabilities Matthew Rosato
  2 siblings, 2 replies; 17+ messages in thread
From: Matthew Rosato @ 2020-09-14 22:29 UTC (permalink / raw)
  To: alex.williamson, cohuck
  Cc: thuth, pmorel, david, schnelle, qemu-devel, pasic, borntraeger,
	qemu-s390x, rth

The underlying host may be limiting the number of outstanding DMA
requests for type 1 IOMMU.  Add helper functions to check for the
DMA available capability and retrieve the current number of DMA
mappings allowed.

Signed-off-by: Matthew Rosato <mjrosato@linux.ibm.com>
---
 hw/vfio/common.c              | 37 +++++++++++++++++++++++++++++++++++++
 include/hw/vfio/vfio-common.h |  2 ++
 2 files changed, 39 insertions(+)

diff --git a/hw/vfio/common.c b/hw/vfio/common.c
index 3335714..7f4a338 100644
--- a/hw/vfio/common.c
+++ b/hw/vfio/common.c
@@ -844,6 +844,43 @@ vfio_get_region_info_cap(struct vfio_region_info *info, uint16_t id)
     return NULL;
 }
 
+static struct vfio_info_cap_header *
+vfio_get_iommu_type1_info_cap(struct vfio_iommu_type1_info *info, uint16_t id)
+{
+    struct vfio_info_cap_header *hdr;
+    void *ptr = info;
+
+    if (!(info->flags & VFIO_IOMMU_INFO_CAPS)) {
+        return NULL;
+    }
+
+    for (hdr = ptr + info->cap_offset; hdr != ptr; hdr = ptr + hdr->next) {
+        if (hdr->id == id) {
+            return hdr;
+        }
+    }
+
+    return NULL;
+}
+
+bool vfio_get_info_dma_avail(struct vfio_iommu_type1_info *info,
+                             unsigned int *avail)
+{
+    struct vfio_info_cap_header *hdr;
+    struct vfio_iommu_type1_info_dma_avail *cap;
+
+    /* If the capability cannot be found, assume no DMA limiting */
+    hdr = vfio_get_iommu_type1_info_cap(info,
+                                        VFIO_IOMMU_TYPE1_INFO_DMA_AVAIL);
+    if (hdr == NULL || avail == NULL) {
+        return false;
+    }
+
+    cap = (void *) hdr;
+    *avail = cap->avail;
+    return true;
+}
+
 static int vfio_setup_region_sparse_mmaps(VFIORegion *region,
                                           struct vfio_region_info *info)
 {
diff --git a/include/hw/vfio/vfio-common.h b/include/hw/vfio/vfio-common.h
index c78f3ff..661a380 100644
--- a/include/hw/vfio/vfio-common.h
+++ b/include/hw/vfio/vfio-common.h
@@ -178,6 +178,8 @@ VFIOGroup *vfio_get_group(int groupid, AddressSpace *as, Error **errp);
 void vfio_put_group(VFIOGroup *group);
 int vfio_get_device(VFIOGroup *group, const char *name,
                     VFIODevice *vbasedev, Error **errp);
+bool vfio_get_info_dma_avail(struct vfio_iommu_type1_info *info,
+                             unsigned int *avail);
 
 extern const MemoryRegionOps vfio_region_ops;
 typedef QLIST_HEAD(VFIOGroupList, VFIOGroup) VFIOGroupList;
-- 
1.8.3.1




* [PATCH v2 2/3] s390x/pci: Honor DMA limits set by vfio
  2020-09-14 22:29 [PATCH v2 0/3] s390x/pci: Accommodate vfio DMA limiting Matthew Rosato
  2020-09-14 22:29 ` [PATCH v2 1/3] vfio: Find DMA available capability Matthew Rosato
@ 2020-09-14 22:29 ` Matthew Rosato
  2020-09-15 11:28   ` Cornelia Huck
  2020-09-15 12:54   ` Thomas Huth
  2020-09-14 22:29 ` [PATCH v2 3/3] vfio: Create shared routine for scanning info capabilities Matthew Rosato
  2 siblings, 2 replies; 17+ messages in thread
From: Matthew Rosato @ 2020-09-14 22:29 UTC (permalink / raw)
  To: alex.williamson, cohuck
  Cc: thuth, pmorel, david, schnelle, qemu-devel, pasic, borntraeger,
	qemu-s390x, rth

When an s390 guest is using lazy unmapping, it can result in a very
large number of outstanding DMA requests, far beyond the default
limit configured for vfio.  Let's track DMA usage similarly to vfio
in the host, and trigger the guest to flush its DMA mappings
before vfio runs out.
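
The counting scheme can be modeled in miniature as follows (an
illustrative sketch with invented names, not the code in this patch):

```c
#include <assert.h>
#include <stdbool.h>

/* Shadow counter for a container's available DMA mappings. */
struct dma_count {
    unsigned int avail;
};

/*
 * Record one guest mapping; returns the mappings still available so the
 * caller can stop and ask the guest to flush once the pool is exhausted.
 */
static unsigned int record_map(struct dma_count *cnt)
{
    if (cnt->avail > 0) {
        cnt->avail--;
    }
    return cnt->avail;
}

/* Record one guest unmapping, returning a mapping to the pool. */
static void record_unmap(struct dma_count *cnt)
{
    cnt->avail++;
}

/* True if the guest should be told to refresh its translations first. */
static bool mappings_exhausted(const struct dma_count *cnt)
{
    return cnt->avail == 0;
}
```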

Signed-off-by: Matthew Rosato <mjrosato@linux.ibm.com>
---
 hw/s390x/s390-pci-bus.c  | 99 +++++++++++++++++++++++++++++++++++++++++++++---
 hw/s390x/s390-pci-bus.h  |  9 +++++
 hw/s390x/s390-pci-inst.c | 29 +++++++++++---
 hw/s390x/s390-pci-inst.h |  3 ++
 4 files changed, 129 insertions(+), 11 deletions(-)

diff --git a/hw/s390x/s390-pci-bus.c b/hw/s390x/s390-pci-bus.c
index 92146a2..23474cd 100644
--- a/hw/s390x/s390-pci-bus.c
+++ b/hw/s390x/s390-pci-bus.c
@@ -11,6 +11,8 @@
  * directory.
  */
 
+#include <sys/ioctl.h>
+
 #include "qemu/osdep.h"
 #include "qapi/error.h"
 #include "qapi/visitor.h"
@@ -24,6 +26,9 @@
 #include "qemu/error-report.h"
 #include "qemu/module.h"
 
+#include "hw/vfio/pci.h"
+#include "hw/vfio/vfio-common.h"
+
 #ifndef DEBUG_S390PCI_BUS
 #define DEBUG_S390PCI_BUS  0
 #endif
@@ -737,6 +742,82 @@ static void s390_pci_iommu_free(S390pciState *s, PCIBus *bus, int32_t devfn)
     object_unref(OBJECT(iommu));
 }
 
+static bool s390_sync_dma_avail(int fd, unsigned int *avail)
+{
+    struct vfio_iommu_type1_info *info;
+    uint32_t argsz;
+    bool rval = false;
+    int ret;
+
+    if (avail == NULL) {
+        return false;
+    }
+
+    argsz = sizeof(struct vfio_iommu_type1_info);
+    info = g_malloc0(argsz);
+    info->argsz = argsz;
+    /*
+     * If the specified argsz is not large enough to contain all
+     * capabilities it will be updated upon return.  In this case
+     * use the updated value to get the entire capability chain.
+     */
+    ret = ioctl(fd, VFIO_IOMMU_GET_INFO, info);
+    if (argsz != info->argsz) {
+        argsz = info->argsz;
+        info = g_realloc(info, argsz);
+        info->argsz = argsz;
+        ret = ioctl(fd, VFIO_IOMMU_GET_INFO, info);
+    }
+
+    if (ret) {
+        goto out;
+    }
+
+    /* If the capability exists, update with the current value */
+    rval = vfio_get_info_dma_avail(info, avail);
+
+out:
+    g_free(info);
+    return rval;
+}
+
+static S390PCIDMACount *s390_start_dma_count(S390pciState *s, VFIODevice *vdev)
+{
+    int id = vdev->group->container->fd;
+    S390PCIDMACount *cnt;
+    uint32_t avail;
+
+    if (!s390_sync_dma_avail(id, &avail)) {
+        return NULL;
+    }
+
+    QTAILQ_FOREACH(cnt, &s->zpci_dma_limit, link) {
+        if (cnt->id  == id) {
+            cnt->users++;
+            return cnt;
+        }
+    }
+
+    cnt = g_new0(S390PCIDMACount, 1);
+    cnt->id = id;
+    cnt->users = 1;
+    cnt->avail = avail;
+    QTAILQ_INSERT_TAIL(&s->zpci_dma_limit, cnt, link);
+    return cnt;
+}
+
+static void s390_end_dma_count(S390pciState *s, S390PCIDMACount *cnt)
+{
+    if (cnt == NULL) {
+        return;
+    }
+
+    cnt->users--;
+    if (cnt->users == 0) {
+        QTAILQ_REMOVE(&s->zpci_dma_limit, cnt, link);
+    }
+}
+
 static void s390_pcihost_realize(DeviceState *dev, Error **errp)
 {
     PCIBus *b;
@@ -764,6 +845,7 @@ static void s390_pcihost_realize(DeviceState *dev, Error **errp)
     s->bus_no = 0;
     QTAILQ_INIT(&s->pending_sei);
     QTAILQ_INIT(&s->zpci_devs);
+    QTAILQ_INIT(&s->zpci_dma_limit);
 
     css_register_io_adapters(CSS_IO_ADAPTER_PCI, true, false,
                              S390_ADAPTER_SUPPRESSIBLE, errp);
@@ -902,6 +984,7 @@ static void s390_pcihost_plug(HotplugHandler *hotplug_dev, DeviceState *dev,
 {
     S390pciState *s = S390_PCI_HOST_BRIDGE(hotplug_dev);
     PCIDevice *pdev = NULL;
+    VFIOPCIDevice *vpdev = NULL;
     S390PCIBusDevice *pbdev = NULL;
 
     if (object_dynamic_cast(OBJECT(dev), TYPE_PCI_BRIDGE)) {
@@ -941,17 +1024,20 @@ static void s390_pcihost_plug(HotplugHandler *hotplug_dev, DeviceState *dev,
             }
         }
 
+        pbdev->pdev = pdev;
+        pbdev->iommu = s390_pci_get_iommu(s, pci_get_bus(pdev), pdev->devfn);
+        pbdev->iommu->pbdev = pbdev;
+        pbdev->state = ZPCI_FS_DISABLED;
+
         if (object_dynamic_cast(OBJECT(dev), "vfio-pci")) {
             pbdev->fh |= FH_SHM_VFIO;
+            vpdev = container_of(pbdev->pdev, VFIOPCIDevice, pdev);
+            pbdev->iommu->dma_limit = s390_start_dma_count(s,
+                                                           &vpdev->vbasedev);
         } else {
             pbdev->fh |= FH_SHM_EMUL;
         }
 
-        pbdev->pdev = pdev;
-        pbdev->iommu = s390_pci_get_iommu(s, pci_get_bus(pdev), pdev->devfn);
-        pbdev->iommu->pbdev = pbdev;
-        pbdev->state = ZPCI_FS_DISABLED;
-
         if (s390_pci_msix_init(pbdev)) {
             error_setg(errp, "MSI-X support is mandatory "
                        "in the S390 architecture");
@@ -1004,6 +1090,9 @@ static void s390_pcihost_unplug(HotplugHandler *hotplug_dev, DeviceState *dev,
         pbdev->fid = 0;
         QTAILQ_REMOVE(&s->zpci_devs, pbdev, link);
         g_hash_table_remove(s->zpci_table, &pbdev->idx);
+        if (pbdev->iommu->dma_limit) {
+            s390_end_dma_count(s, pbdev->iommu->dma_limit);
+        }
         qdev_unrealize(dev);
     }
 }
diff --git a/hw/s390x/s390-pci-bus.h b/hw/s390x/s390-pci-bus.h
index 0458059..f166fd9 100644
--- a/hw/s390x/s390-pci-bus.h
+++ b/hw/s390x/s390-pci-bus.h
@@ -270,6 +270,13 @@ typedef struct S390IOTLBEntry {
     uint64_t perm;
 } S390IOTLBEntry;
 
+typedef struct S390PCIDMACount {
+    int id;
+    int users;
+    uint32_t avail;
+    QTAILQ_ENTRY(S390PCIDMACount) link;
+} S390PCIDMACount;
+
 struct S390PCIIOMMU {
     Object parent_obj;
     S390PCIBusDevice *pbdev;
@@ -281,6 +288,7 @@ struct S390PCIIOMMU {
     uint64_t pba;
     uint64_t pal;
     GHashTable *iotlb;
+    S390PCIDMACount *dma_limit;
 };
 
 typedef struct S390PCIIOMMUTable {
@@ -356,6 +364,7 @@ struct S390pciState {
     GHashTable *zpci_table;
     QTAILQ_HEAD(, SeiContainer) pending_sei;
     QTAILQ_HEAD(, S390PCIBusDevice) zpci_devs;
+    QTAILQ_HEAD(, S390PCIDMACount) zpci_dma_limit;
 };
 
 S390pciState *s390_get_phb(void);
diff --git a/hw/s390x/s390-pci-inst.c b/hw/s390x/s390-pci-inst.c
index 2f7a7d7..6af9af4 100644
--- a/hw/s390x/s390-pci-inst.c
+++ b/hw/s390x/s390-pci-inst.c
@@ -32,6 +32,9 @@
         }                                                          \
     } while (0)
 
+#define INC_DMA_AVAIL(iommu) if (iommu->dma_limit) iommu->dma_limit->avail++;
+#define DEC_DMA_AVAIL(iommu) if (iommu->dma_limit) iommu->dma_limit->avail--;
+
 static void s390_set_status_code(CPUS390XState *env,
                                  uint8_t r, uint64_t status_code)
 {
@@ -572,7 +575,8 @@ int pcistg_service_call(S390CPU *cpu, uint8_t r1, uint8_t r2, uintptr_t ra)
     return 0;
 }
 
-static void s390_pci_update_iotlb(S390PCIIOMMU *iommu, S390IOTLBEntry *entry)
+static uint32_t s390_pci_update_iotlb(S390PCIIOMMU *iommu,
+                                      S390IOTLBEntry *entry)
 {
     S390IOTLBEntry *cache = g_hash_table_lookup(iommu->iotlb, &entry->iova);
     IOMMUTLBEntry notify = {
@@ -585,14 +589,15 @@ static void s390_pci_update_iotlb(S390PCIIOMMU *iommu, S390IOTLBEntry *entry)
 
     if (entry->perm == IOMMU_NONE) {
         if (!cache) {
-            return;
+            goto out;
         }
         g_hash_table_remove(iommu->iotlb, &entry->iova);
+        INC_DMA_AVAIL(iommu);
     } else {
         if (cache) {
             if (cache->perm == entry->perm &&
                 cache->translated_addr == entry->translated_addr) {
-                return;
+                goto out;
             }
 
             notify.perm = IOMMU_NONE;
@@ -606,9 +611,13 @@ static void s390_pci_update_iotlb(S390PCIIOMMU *iommu, S390IOTLBEntry *entry)
         cache->len = PAGE_SIZE;
         cache->perm = entry->perm;
         g_hash_table_replace(iommu->iotlb, &cache->iova, cache);
+        DEC_DMA_AVAIL(iommu);
     }
 
     memory_region_notify_iommu(&iommu->iommu_mr, 0, notify);
+
+out:
+    return iommu->dma_limit ? iommu->dma_limit->avail : 1;
 }
 
 int rpcit_service_call(S390CPU *cpu, uint8_t r1, uint8_t r2, uintptr_t ra)
@@ -620,6 +629,7 @@ int rpcit_service_call(S390CPU *cpu, uint8_t r1, uint8_t r2, uintptr_t ra)
     S390PCIIOMMU *iommu;
     S390IOTLBEntry entry;
     hwaddr start, end;
+    uint32_t dma_avail;
 
     if (env->psw.mask & PSW_MASK_PSTATE) {
         s390_program_interrupt(env, PGM_PRIVILEGED, ra);
@@ -675,8 +685,9 @@ int rpcit_service_call(S390CPU *cpu, uint8_t r1, uint8_t r2, uintptr_t ra)
         }
 
         start += entry.len;
-        while (entry.iova < start && entry.iova < end) {
-            s390_pci_update_iotlb(iommu, &entry);
+        dma_avail = 1; /* Assume non-zero dma_avail to start */
+        while (entry.iova < start && entry.iova < end && dma_avail > 0) {
+            dma_avail = s390_pci_update_iotlb(iommu, &entry);
             entry.iova += PAGE_SIZE;
             entry.translated_addr += PAGE_SIZE;
         }
@@ -689,7 +700,13 @@ err:
         s390_pci_generate_error_event(error, pbdev->fh, pbdev->fid, start, 0);
     } else {
         pbdev->fmb.counter[ZPCI_FMB_CNT_RPCIT]++;
-        setcc(cpu, ZPCI_PCI_LS_OK);
+        if (dma_avail > 0) {
+            setcc(cpu, ZPCI_PCI_LS_OK);
+        } else {
+            /* vfio DMA mappings are exhausted, trigger a RPCIT */
+            setcc(cpu, ZPCI_PCI_LS_ERR);
+            s390_set_status_code(env, r1, ZPCI_RPCIT_ST_INSUFF_RES);
+        }
     }
     return 0;
 }
diff --git a/hw/s390x/s390-pci-inst.h b/hw/s390x/s390-pci-inst.h
index fa3bf8b..8ee3a3c 100644
--- a/hw/s390x/s390-pci-inst.h
+++ b/hw/s390x/s390-pci-inst.h
@@ -254,6 +254,9 @@ typedef struct ClpReqRspQueryPciGrp {
 #define ZPCI_STPCIFC_ST_INVAL_DMAAS   28
 #define ZPCI_STPCIFC_ST_ERROR_RECOVER 40
 
+/* Refresh PCI Translations status codes */
+#define ZPCI_RPCIT_ST_INSUFF_RES      16
+
 /* FIB function controls */
 #define ZPCI_FIB_FC_ENABLED     0x80
 #define ZPCI_FIB_FC_ERROR       0x40
-- 
1.8.3.1




* [PATCH v2 3/3] vfio: Create shared routine for scanning info capabilities
  2020-09-14 22:29 [PATCH v2 0/3] s390x/pci: Accommodate vfio DMA limiting Matthew Rosato
  2020-09-14 22:29 ` [PATCH v2 1/3] vfio: Find DMA available capability Matthew Rosato
  2020-09-14 22:29 ` [PATCH v2 2/3] s390x/pci: Honor DMA limits set by vfio Matthew Rosato
@ 2020-09-14 22:29 ` Matthew Rosato
  2020-09-15  6:16   ` Philippe Mathieu-Daudé
  2 siblings, 1 reply; 17+ messages in thread
From: Matthew Rosato @ 2020-09-14 22:29 UTC (permalink / raw)
  To: alex.williamson, cohuck
  Cc: thuth, pmorel, david, schnelle, qemu-devel, pasic, borntraeger,
	qemu-s390x, rth

Rather than duplicating the same loop in multiple locations,
create a static function to do the work.

Signed-off-by: Matthew Rosato <mjrosato@linux.ibm.com>
---
 hw/vfio/common.c | 33 +++++++++++++++------------------
 1 file changed, 15 insertions(+), 18 deletions(-)

diff --git a/hw/vfio/common.c b/hw/vfio/common.c
index 7f4a338..bfbbfe4 100644
--- a/hw/vfio/common.c
+++ b/hw/vfio/common.c
@@ -825,17 +825,12 @@ static void vfio_listener_release(VFIOContainer *container)
     }
 }
 
-struct vfio_info_cap_header *
-vfio_get_region_info_cap(struct vfio_region_info *info, uint16_t id)
+static struct vfio_info_cap_header *
+vfio_get_cap(void *ptr, uint32_t cap_offset, uint16_t id)
 {
     struct vfio_info_cap_header *hdr;
-    void *ptr = info;
-
-    if (!(info->flags & VFIO_REGION_INFO_FLAG_CAPS)) {
-        return NULL;
-    }
 
-    for (hdr = ptr + info->cap_offset; hdr != ptr; hdr = ptr + hdr->next) {
+    for (hdr = ptr + cap_offset; hdr != ptr; hdr = ptr + hdr->next) {
         if (hdr->id == id) {
             return hdr;
         }
@@ -844,23 +839,25 @@ vfio_get_region_info_cap(struct vfio_region_info *info, uint16_t id)
     return NULL;
 }
 
+
+struct vfio_info_cap_header *
+vfio_get_region_info_cap(struct vfio_region_info *info, uint16_t id)
+{
+    if (!(info->flags & VFIO_REGION_INFO_FLAG_CAPS)) {
+        return NULL;
+    }
+
+    return vfio_get_cap((void *)info, info->cap_offset, id);
+}
+
 static struct vfio_info_cap_header *
 vfio_get_iommu_type1_info_cap(struct vfio_iommu_type1_info *info, uint16_t id)
 {
-    struct vfio_info_cap_header *hdr;
-    void *ptr = info;
-
     if (!(info->flags & VFIO_IOMMU_INFO_CAPS)) {
         return NULL;
     }
 
-    for (hdr = ptr + info->cap_offset; hdr != ptr; hdr = ptr + hdr->next) {
-        if (hdr->id == id) {
-            return hdr;
-        }
-    }
-
-    return NULL;
+    return vfio_get_cap((void *)info, info->cap_offset, id);
 }
 
 bool vfio_get_info_dma_avail(struct vfio_iommu_type1_info *info,
-- 
1.8.3.1




* Re: [PATCH v2 1/3] vfio: Find DMA available capability
  2020-09-14 22:29 ` [PATCH v2 1/3] vfio: Find DMA available capability Matthew Rosato
@ 2020-09-15  6:14   ` Philippe Mathieu-Daudé
  2020-09-15 10:10     ` Cornelia Huck
  2020-09-15 13:39     ` Matthew Rosato
  2020-09-15 10:33   ` Cornelia Huck
  1 sibling, 2 replies; 17+ messages in thread
From: Philippe Mathieu-Daudé @ 2020-09-15  6:14 UTC (permalink / raw)
  To: Matthew Rosato, alex.williamson, cohuck
  Cc: thuth, pmorel, schnelle, david, qemu-devel, pasic, borntraeger,
	qemu-s390x, rth

Hi Matthew,

On 9/15/20 12:29 AM, Matthew Rosato wrote:
> The underlying host may be limiting the number of outstanding DMA
> requests for type 1 IOMMU.  Add helper functions to check for the
> DMA available capability and retrieve the current number of DMA
> mappings allowed.
> 
> Signed-off-by: Matthew Rosato <mjrosato@linux.ibm.com>
> ---
>  hw/vfio/common.c              | 37 +++++++++++++++++++++++++++++++++++++
>  include/hw/vfio/vfio-common.h |  2 ++
>  2 files changed, 39 insertions(+)
> 
> diff --git a/hw/vfio/common.c b/hw/vfio/common.c
> index 3335714..7f4a338 100644
> --- a/hw/vfio/common.c
> +++ b/hw/vfio/common.c
> @@ -844,6 +844,43 @@ vfio_get_region_info_cap(struct vfio_region_info *info, uint16_t id)
>      return NULL;
>  }
>  
> +static struct vfio_info_cap_header *
> +vfio_get_iommu_type1_info_cap(struct vfio_iommu_type1_info *info, uint16_t id)
> +{
> +    struct vfio_info_cap_header *hdr;
> +    void *ptr = info;
> +
> +    if (!(info->flags & VFIO_IOMMU_INFO_CAPS)) {
> +        return NULL;
> +    }
> +
> +    for (hdr = ptr + info->cap_offset; hdr != ptr; hdr = ptr + hdr->next) {
> +        if (hdr->id == id) {
> +            return hdr;
> +        }
> +    }
> +
> +    return NULL;
> +}
> +
> +bool vfio_get_info_dma_avail(struct vfio_iommu_type1_info *info,
> +                             unsigned int *avail)
> +{
> +    struct vfio_info_cap_header *hdr;
> +    struct vfio_iommu_type1_info_dma_avail *cap;
> +
> +    /* If the capability cannot be found, assume no DMA limiting */
> +    hdr = vfio_get_iommu_type1_info_cap(info,
> +                                        VFIO_IOMMU_TYPE1_INFO_DMA_AVAIL);
> +    if (hdr == NULL || avail == NULL) {

If you expect the caller to use avail=NULL, then why
return false when there is available information?

> +        return false;
> +    }
> +
> +    cap = (void *) hdr;
> +    *avail = cap->avail;
> +    return true;
> +}
> +
>  static int vfio_setup_region_sparse_mmaps(VFIORegion *region,
>                                            struct vfio_region_info *info)
>  {
> diff --git a/include/hw/vfio/vfio-common.h b/include/hw/vfio/vfio-common.h
> index c78f3ff..661a380 100644
> --- a/include/hw/vfio/vfio-common.h
> +++ b/include/hw/vfio/vfio-common.h
> @@ -178,6 +178,8 @@ VFIOGroup *vfio_get_group(int groupid, AddressSpace *as, Error **errp);
>  void vfio_put_group(VFIOGroup *group);
>  int vfio_get_device(VFIOGroup *group, const char *name,
>                      VFIODevice *vbasedev, Error **errp);
> +bool vfio_get_info_dma_avail(struct vfio_iommu_type1_info *info,
> +                             unsigned int *avail);
>  
>  extern const MemoryRegionOps vfio_region_ops;
>  typedef QLIST_HEAD(VFIOGroupList, VFIOGroup) VFIOGroupList;
> 




* Re: [PATCH v2 3/3] vfio: Create shared routine for scanning info capabilities
  2020-09-14 22:29 ` [PATCH v2 3/3] vfio: Create shared routine for scanning info capabilities Matthew Rosato
@ 2020-09-15  6:16   ` Philippe Mathieu-Daudé
  2020-09-15 13:43     ` Matthew Rosato
  0 siblings, 1 reply; 17+ messages in thread
From: Philippe Mathieu-Daudé @ 2020-09-15  6:16 UTC (permalink / raw)
  To: Matthew Rosato, alex.williamson, cohuck
  Cc: thuth, pmorel, schnelle, david, qemu-devel, pasic, borntraeger,
	qemu-s390x, rth

On 9/15/20 12:29 AM, Matthew Rosato wrote:
> Rather than duplicating the same loop in multiple locations,
> create a static function to do the work.

Why not do that first in your series?

> 
> Signed-off-by: Matthew Rosato <mjrosato@linux.ibm.com>
> ---
>  hw/vfio/common.c | 33 +++++++++++++++------------------
>  1 file changed, 15 insertions(+), 18 deletions(-)
> 
> diff --git a/hw/vfio/common.c b/hw/vfio/common.c
> index 7f4a338..bfbbfe4 100644
> --- a/hw/vfio/common.c
> +++ b/hw/vfio/common.c
> @@ -825,17 +825,12 @@ static void vfio_listener_release(VFIOContainer *container)
>      }
>  }
>  
> -struct vfio_info_cap_header *
> -vfio_get_region_info_cap(struct vfio_region_info *info, uint16_t id)
> +static struct vfio_info_cap_header *
> +vfio_get_cap(void *ptr, uint32_t cap_offset, uint16_t id)
>  {
>      struct vfio_info_cap_header *hdr;
> -    void *ptr = info;
> -
> -    if (!(info->flags & VFIO_REGION_INFO_FLAG_CAPS)) {
> -        return NULL;
> -    }
>  
> -    for (hdr = ptr + info->cap_offset; hdr != ptr; hdr = ptr + hdr->next) {
> +    for (hdr = ptr + cap_offset; hdr != ptr; hdr = ptr + hdr->next) {
>          if (hdr->id == id) {
>              return hdr;
>          }
> @@ -844,23 +839,25 @@ vfio_get_region_info_cap(struct vfio_region_info *info, uint16_t id)
>      return NULL;
>  }
>  
> +
> +struct vfio_info_cap_header *
> +vfio_get_region_info_cap(struct vfio_region_info *info, uint16_t id)
> +{
> +    if (!(info->flags & VFIO_REGION_INFO_FLAG_CAPS)) {
> +        return NULL;
> +    }
> +
> +    return vfio_get_cap((void *)info, info->cap_offset, id);
> +}
> +
>  static struct vfio_info_cap_header *
>  vfio_get_iommu_type1_info_cap(struct vfio_iommu_type1_info *info, uint16_t id)
>  {
> -    struct vfio_info_cap_header *hdr;
> -    void *ptr = info;
> -
>      if (!(info->flags & VFIO_IOMMU_INFO_CAPS)) {
>          return NULL;
>      }
>  
> -    for (hdr = ptr + info->cap_offset; hdr != ptr; hdr = ptr + hdr->next) {
> -        if (hdr->id == id) {
> -            return hdr;
> -        }
> -    }
> -
> -    return NULL;
> +    return vfio_get_cap((void *)info, info->cap_offset, id);
>  }
>  
>  bool vfio_get_info_dma_avail(struct vfio_iommu_type1_info *info,
> 




* Re: [PATCH v2 1/3] vfio: Find DMA available capability
  2020-09-15  6:14   ` Philippe Mathieu-Daudé
@ 2020-09-15 10:10     ` Cornelia Huck
  2020-09-15 13:39     ` Matthew Rosato
  1 sibling, 0 replies; 17+ messages in thread
From: Cornelia Huck @ 2020-09-15 10:10 UTC (permalink / raw)
  To: Philippe Mathieu-Daudé
  Cc: thuth, Matthew Rosato, pmorel, schnelle, qemu-s390x, david,
	qemu-devel, pasic, borntraeger, alex.williamson, rth

On Tue, 15 Sep 2020 08:14:24 +0200
Philippe Mathieu-Daudé <philmd@redhat.com> wrote:

> Hi Matthew,
> 
> On 9/15/20 12:29 AM, Matthew Rosato wrote:
> > The underlying host may be limiting the number of outstanding DMA
> > requests for type 1 IOMMU.  Add helper functions to check for the
> > DMA available capability and retrieve the current number of DMA
> > mappings allowed.
> > 
> > Signed-off-by: Matthew Rosato <mjrosato@linux.ibm.com>
> > ---
> >  hw/vfio/common.c              | 37 +++++++++++++++++++++++++++++++++++++
> >  include/hw/vfio/vfio-common.h |  2 ++
> >  2 files changed, 39 insertions(+)

(...)

> > +bool vfio_get_info_dma_avail(struct vfio_iommu_type1_info *info,
> > +                             unsigned int *avail)
> > +{
> > +    struct vfio_info_cap_header *hdr;
> > +    struct vfio_iommu_type1_info_dma_avail *cap;
> > +
> > +    /* If the capability cannot be found, assume no DMA limiting */
> > +    hdr = vfio_get_iommu_type1_info_cap(info,
> > +                                        VFIO_IOMMU_TYPE1_INFO_DMA_AVAIL);
> > +    if (hdr == NULL || avail == NULL) {  
> 
> If you expect the caller to use avail=NULL, then why
> return false when there is available information?

I agree; if the purpose of this function is to check if limiting is in
place and only optionally return the actual limit, we should return
true for hdr != NULL and avail == NULL.

> 
> > +        return false;
> > +    }
> > +
> > +    cap = (void *) hdr;
> > +    *avail = cap->avail;
> > +    return true;
> > +}
> > +
> >  static int vfio_setup_region_sparse_mmaps(VFIORegion *region,
> >                                            struct vfio_region_info *info)
> >  {

(...)




* Re: [PATCH v2 1/3] vfio: Find DMA available capability
  2020-09-14 22:29 ` [PATCH v2 1/3] vfio: Find DMA available capability Matthew Rosato
  2020-09-15  6:14   ` Philippe Mathieu-Daudé
@ 2020-09-15 10:33   ` Cornelia Huck
  2020-09-15 13:57     ` Matthew Rosato
  1 sibling, 1 reply; 17+ messages in thread
From: Cornelia Huck @ 2020-09-15 10:33 UTC (permalink / raw)
  To: Matthew Rosato
  Cc: thuth, pmorel, david, schnelle, qemu-s390x, qemu-devel, pasic,
	borntraeger, alex.williamson, rth

On Mon, 14 Sep 2020 18:29:28 -0400
Matthew Rosato <mjrosato@linux.ibm.com> wrote:

> The underlying host may be limiting the number of outstanding DMA
> requests for type 1 IOMMU.  Add helper functions to check for the
> DMA available capability and retrieve the current number of DMA
> mappings allowed.
> 
> Signed-off-by: Matthew Rosato <mjrosato@linux.ibm.com>
> ---
>  hw/vfio/common.c              | 37 +++++++++++++++++++++++++++++++++++++
>  include/hw/vfio/vfio-common.h |  2 ++
>  2 files changed, 39 insertions(+)
> 

(...)

> +bool vfio_get_info_dma_avail(struct vfio_iommu_type1_info *info,
> +                             unsigned int *avail)
> +{
> +    struct vfio_info_cap_header *hdr;
> +    struct vfio_iommu_type1_info_dma_avail *cap;
> +
> +    /* If the capability cannot be found, assume no DMA limiting */
> +    hdr = vfio_get_iommu_type1_info_cap(info,
> +                                        VFIO_IOMMU_TYPE1_INFO_DMA_AVAIL);

...don't you need a headers sync first to get the new definitions?

(...)




* Re: [PATCH v2 2/3] s390x/pci: Honor DMA limits set by vfio
  2020-09-14 22:29 ` [PATCH v2 2/3] s390x/pci: Honor DMA limits set by vfio Matthew Rosato
@ 2020-09-15 11:28   ` Cornelia Huck
  2020-09-15 14:16     ` Matthew Rosato
  2020-09-15 12:54   ` Thomas Huth
  1 sibling, 1 reply; 17+ messages in thread
From: Cornelia Huck @ 2020-09-15 11:28 UTC (permalink / raw)
  To: Matthew Rosato
  Cc: thuth, pmorel, david, schnelle, qemu-s390x, qemu-devel, pasic,
	borntraeger, alex.williamson, rth

On Mon, 14 Sep 2020 18:29:29 -0400
Matthew Rosato <mjrosato@linux.ibm.com> wrote:

> When an s390 guest is using lazy unmapping, it can result in a very
> large number of outstanding DMA requests, far beyond the default
> limit configured for vfio.  Let's track DMA usage similarly to vfio
> in the host, and trigger the guest to flush its DMA mappings
> before vfio runs out.
> 
> Signed-off-by: Matthew Rosato <mjrosato@linux.ibm.com>
> ---
>  hw/s390x/s390-pci-bus.c  | 99 +++++++++++++++++++++++++++++++++++++++++++++---
>  hw/s390x/s390-pci-bus.h  |  9 +++++
>  hw/s390x/s390-pci-inst.c | 29 +++++++++++---
>  hw/s390x/s390-pci-inst.h |  3 ++
>  4 files changed, 129 insertions(+), 11 deletions(-)
> 

(...)

> @@ -737,6 +742,82 @@ static void s390_pci_iommu_free(S390pciState *s, PCIBus *bus, int32_t devfn)
>      object_unref(OBJECT(iommu));
>  }
>  
> +static bool s390_sync_dma_avail(int fd, unsigned int *avail)

Not sure I like the name. It sounds like the function checks whether
"sync dma" is available. Maybe s390_update_dma_avail()?

> +{
> +    struct vfio_iommu_type1_info *info;
> +    uint32_t argsz;
> +    bool rval = false;
> +    int ret;
> +
> +    if (avail == NULL) {
> +        return false;
> +    }
> +
> +    argsz = sizeof(struct vfio_iommu_type1_info);
> +    info = g_malloc0(argsz);
> +    info->argsz = argsz;
> +    /*
> +     * If the specified argsz is not large enough to contain all
> +     * capabilities it will be updated upon return.  In this case
> +     * use the updated value to get the entire capability chain.
> +     */
> +    ret = ioctl(fd, VFIO_IOMMU_GET_INFO, info);
> +    if (argsz != info->argsz) {
> +        argsz = info->argsz;
> +        info = g_realloc(info, argsz);
> +        info->argsz = argsz;
> +        ret = ioctl(fd, VFIO_IOMMU_GET_INFO, info);
> +    }
> +
> +    if (ret) {
> +        goto out;
> +    }
> +
> +    /* If the capability exists, update with the current value */
> +    rval = vfio_get_info_dma_avail(info, avail);

Adding vfio specific things into the generic s390 pci emulation code
looks a bit ugly... I'd prefer to factor that out into a separate file,
especially if you plan to add more vfio-specific things in the future.

> +
> +out:
> +    g_free(info);
> +    return rval;
> +}
> +

(...)

> diff --git a/hw/s390x/s390-pci-inst.c b/hw/s390x/s390-pci-inst.c
> index 2f7a7d7..6af9af4 100644
> --- a/hw/s390x/s390-pci-inst.c
> +++ b/hw/s390x/s390-pci-inst.c
> @@ -32,6 +32,9 @@
>          }                                                          \
>      } while (0)
>  
> +#define INC_DMA_AVAIL(iommu) if (iommu->dma_limit) iommu->dma_limit->avail++;
> +#define DEC_DMA_AVAIL(iommu) if (iommu->dma_limit) iommu->dma_limit->avail--;

Hm... maybe lowercase inline functions might be better here?

> +
>  static void s390_set_status_code(CPUS390XState *env,
>                                   uint8_t r, uint64_t status_code)
>  {

(...)

> @@ -620,6 +629,7 @@ int rpcit_service_call(S390CPU *cpu, uint8_t r1, uint8_t r2, uintptr_t ra)
>      S390PCIIOMMU *iommu;
>      S390IOTLBEntry entry;
>      hwaddr start, end;
> +    uint32_t dma_avail;
>  
>      if (env->psw.mask & PSW_MASK_PSTATE) {
>          s390_program_interrupt(env, PGM_PRIVILEGED, ra);
> @@ -675,8 +685,9 @@ int rpcit_service_call(S390CPU *cpu, uint8_t r1, uint8_t r2, uintptr_t ra)
>          }
>  
>          start += entry.len;
> -        while (entry.iova < start && entry.iova < end) {
> -            s390_pci_update_iotlb(iommu, &entry);
> +        dma_avail = 1; /* Assume non-zero dma_avail to start */
> +        while (entry.iova < start && entry.iova < end && dma_avail > 0) {
> +            dma_avail = s390_pci_update_iotlb(iommu, &entry);
>              entry.iova += PAGE_SIZE;
>              entry.translated_addr += PAGE_SIZE;
>          }
> @@ -689,7 +700,13 @@ err:
>          s390_pci_generate_error_event(error, pbdev->fh, pbdev->fid, start, 0);
>      } else {
>          pbdev->fmb.counter[ZPCI_FMB_CNT_RPCIT]++;
> -        setcc(cpu, ZPCI_PCI_LS_OK);
> +        if (dma_avail > 0) {

When I compile this (with a headers update), the compiler moans here
about an uninitialized variable.

> +            setcc(cpu, ZPCI_PCI_LS_OK);
> +        } else {
> +            /* vfio DMA mappings are exhausted, trigger an RPCIT */
> +            setcc(cpu, ZPCI_PCI_LS_ERR);
> +            s390_set_status_code(env, r1, ZPCI_RPCIT_ST_INSUFF_RES);
> +        }
>      }
>      return 0;
>  }

(...)



^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v2 2/3] s390x/pci: Honor DMA limits set by vfio
  2020-09-14 22:29 ` [PATCH v2 2/3] s390x/pci: Honor DMA limits set by vfio Matthew Rosato
  2020-09-15 11:28   ` Cornelia Huck
@ 2020-09-15 12:54   ` Thomas Huth
  2020-09-15 14:18     ` Matthew Rosato
  1 sibling, 1 reply; 17+ messages in thread
From: Thomas Huth @ 2020-09-15 12:54 UTC (permalink / raw)
  To: Matthew Rosato, alex.williamson, cohuck
  Cc: pmorel, schnelle, david, qemu-devel, pasic, borntraeger, qemu-s390x, rth

On 15/09/2020 00.29, Matthew Rosato wrote:
> When an s390 guest is using lazy unmapping, it can result in a very
> large number of outstanding DMA requests, far beyond the default
> limit configured for vfio.  Let's track DMA usage similar to vfio
> in the host, and trigger the guest to flush their DMA mappings
> before vfio runs out.
> 
> Signed-off-by: Matthew Rosato <mjrosato@linux.ibm.com>
> ---
>  hw/s390x/s390-pci-bus.c  | 99 +++++++++++++++++++++++++++++++++++++++++++++---
>  hw/s390x/s390-pci-bus.h  |  9 +++++
>  hw/s390x/s390-pci-inst.c | 29 +++++++++++---
>  hw/s390x/s390-pci-inst.h |  3 ++
>  4 files changed, 129 insertions(+), 11 deletions(-)
> 
> diff --git a/hw/s390x/s390-pci-bus.c b/hw/s390x/s390-pci-bus.c
> index 92146a2..23474cd 100644
> --- a/hw/s390x/s390-pci-bus.c
> +++ b/hw/s390x/s390-pci-bus.c
> @@ -11,6 +11,8 @@
>   * directory.
>   */
>  
> +#include <sys/ioctl.h>
> +
>  #include "qemu/osdep.h"
>  #include "qapi/error.h"
>  #include "qapi/visitor.h"
> @@ -24,6 +26,9 @@
>  #include "qemu/error-report.h"
>  #include "qemu/module.h"
>  
> +#include "hw/vfio/pci.h"
> +#include "hw/vfio/vfio-common.h"
> +
>  #ifndef DEBUG_S390PCI_BUS
>  #define DEBUG_S390PCI_BUS  0
>  #endif
> @@ -737,6 +742,82 @@ static void s390_pci_iommu_free(S390pciState *s, PCIBus *bus, int32_t devfn)
>      object_unref(OBJECT(iommu));
>  }
>  
> +static bool s390_sync_dma_avail(int fd, unsigned int *avail)
> +{
> +    struct vfio_iommu_type1_info *info;

You could use g_autofree to get rid of the g_free() at the end.

> +    uint32_t argsz;
> +    bool rval = false;
> +    int ret;
> +
> +    if (avail == NULL) {
> +        return false;
> +    }

Since this is a "static" local function, and calling it with avail ==
NULL does not make too much sense, I think I'd rather turn this into an
assert() instead.

> +    argsz = sizeof(struct vfio_iommu_type1_info);
> +    info = g_malloc0(argsz);
> +    info->argsz = argsz;
> +    /*
> +     * If the specified argsz is not large enough to contain all
> +     * capabilities it will be updated upon return.  In this case
> +     * use the updated value to get the entire capability chain.
> +     */
> +    ret = ioctl(fd, VFIO_IOMMU_GET_INFO, info);
> +    if (argsz != info->argsz) {
> +        argsz = info->argsz;
> +        info = g_realloc(info, argsz);
> +        info->argsz = argsz;
> +        ret = ioctl(fd, VFIO_IOMMU_GET_INFO, info);
> +    }
> +
> +    if (ret) {
> +        goto out;
> +    }
> +
> +    /* If the capability exists, update with the current value */
> +    rval = vfio_get_info_dma_avail(info, avail);
> +
> +out:
> +    g_free(info);
> +    return rval;
> +}
> +
> +static S390PCIDMACount *s390_start_dma_count(S390pciState *s, VFIODevice *vdev)
> +{
> +    int id = vdev->group->container->fd;
> +    S390PCIDMACount *cnt;
> +    uint32_t avail;
> +
> +    if (!s390_sync_dma_avail(id, &avail)) {
> +        return NULL;
> +    }
> +
> +    QTAILQ_FOREACH(cnt, &s->zpci_dma_limit, link) {
> +        if (cnt->id  == id) {
> +            cnt->users++;
> +            return cnt;
> +        }
> +    }
> +
> +    cnt = g_new0(S390PCIDMACount, 1);
> +    cnt->id = id;
> +    cnt->users = 1;
> +    cnt->avail = avail;
> +    QTAILQ_INSERT_TAIL(&s->zpci_dma_limit, cnt, link);
> +    return cnt;
> +}
> +
> +static void s390_end_dma_count(S390pciState *s, S390PCIDMACount *cnt)
> +{
> +    if (cnt == NULL) {
> +        return;
> +    }

Either use assert() or drop this completely (since you're checking it at
the caller site already).

> +    cnt->users--;
> +    if (cnt->users == 0) {
> +        QTAILQ_REMOVE(&s->zpci_dma_limit, cnt, link);
> +    }
> +}
> +
>  static void s390_pcihost_realize(DeviceState *dev, Error **errp)
>  {
>      PCIBus *b;
> @@ -764,6 +845,7 @@ static void s390_pcihost_realize(DeviceState *dev, Error **errp)
>      s->bus_no = 0;
>      QTAILQ_INIT(&s->pending_sei);
>      QTAILQ_INIT(&s->zpci_devs);
> +    QTAILQ_INIT(&s->zpci_dma_limit);
>  
>      css_register_io_adapters(CSS_IO_ADAPTER_PCI, true, false,
>                               S390_ADAPTER_SUPPRESSIBLE, errp);
> @@ -902,6 +984,7 @@ static void s390_pcihost_plug(HotplugHandler *hotplug_dev, DeviceState *dev,
>  {
>      S390pciState *s = S390_PCI_HOST_BRIDGE(hotplug_dev);
>      PCIDevice *pdev = NULL;
> +    VFIOPCIDevice *vpdev = NULL;
>      S390PCIBusDevice *pbdev = NULL;
>  
>      if (object_dynamic_cast(OBJECT(dev), TYPE_PCI_BRIDGE)) {
> @@ -941,17 +1024,20 @@ static void s390_pcihost_plug(HotplugHandler *hotplug_dev, DeviceState *dev,
>              }
>          }
>  
> +        pbdev->pdev = pdev;
> +        pbdev->iommu = s390_pci_get_iommu(s, pci_get_bus(pdev), pdev->devfn);
> +        pbdev->iommu->pbdev = pbdev;
> +        pbdev->state = ZPCI_FS_DISABLED;
> +
>          if (object_dynamic_cast(OBJECT(dev), "vfio-pci")) {
>              pbdev->fh |= FH_SHM_VFIO;
> +            vpdev = container_of(pbdev->pdev, VFIOPCIDevice, pdev);
> +            pbdev->iommu->dma_limit = s390_start_dma_count(s,
> +                                                           &vpdev->vbasedev);
>          } else {
>              pbdev->fh |= FH_SHM_EMUL;
>          }
>  
> -        pbdev->pdev = pdev;
> -        pbdev->iommu = s390_pci_get_iommu(s, pci_get_bus(pdev), pdev->devfn);
> -        pbdev->iommu->pbdev = pbdev;
> -        pbdev->state = ZPCI_FS_DISABLED;
> -
>          if (s390_pci_msix_init(pbdev)) {
>              error_setg(errp, "MSI-X support is mandatory "
>                         "in the S390 architecture");
> @@ -1004,6 +1090,9 @@ static void s390_pcihost_unplug(HotplugHandler *hotplug_dev, DeviceState *dev,
>          pbdev->fid = 0;
>          QTAILQ_REMOVE(&s->zpci_devs, pbdev, link);
>          g_hash_table_remove(s->zpci_table, &pbdev->idx);
> +        if (pbdev->iommu->dma_limit) {
> +            s390_end_dma_count(s, pbdev->iommu->dma_limit);
> +        }
>          qdev_unrealize(dev);
>      }
>  }

 Thomas




* Re: [PATCH v2 1/3] vfio: Find DMA available capability
  2020-09-15  6:14   ` Philippe Mathieu-Daudé
  2020-09-15 10:10     ` Cornelia Huck
@ 2020-09-15 13:39     ` Matthew Rosato
  1 sibling, 0 replies; 17+ messages in thread
From: Matthew Rosato @ 2020-09-15 13:39 UTC (permalink / raw)
  To: Philippe Mathieu-Daudé, alex.williamson, cohuck
  Cc: thuth, pmorel, schnelle, david, qemu-devel, pasic, borntraeger,
	qemu-s390x, rth

On 9/15/20 2:14 AM, Philippe Mathieu-Daudé wrote:
> Hi Matthew,
> 
> On 9/15/20 12:29 AM, Matthew Rosato wrote:
>> The underlying host may be limiting the number of outstanding DMA
>> requests for type 1 IOMMU.  Add helper functions to check for the
>> DMA available capability and retrieve the current number of DMA
>> mappings allowed.
>>
>> Signed-off-by: Matthew Rosato <mjrosato@linux.ibm.com>
>> ---
>>   hw/vfio/common.c              | 37 +++++++++++++++++++++++++++++++++++++
>>   include/hw/vfio/vfio-common.h |  2 ++
>>   2 files changed, 39 insertions(+)
>>
>> diff --git a/hw/vfio/common.c b/hw/vfio/common.c
>> index 3335714..7f4a338 100644
>> --- a/hw/vfio/common.c
>> +++ b/hw/vfio/common.c
>> @@ -844,6 +844,43 @@ vfio_get_region_info_cap(struct vfio_region_info *info, uint16_t id)
>>       return NULL;
>>   }
>>   
>> +static struct vfio_info_cap_header *
>> +vfio_get_iommu_type1_info_cap(struct vfio_iommu_type1_info *info, uint16_t id)
>> +{
>> +    struct vfio_info_cap_header *hdr;
>> +    void *ptr = info;
>> +
>> +    if (!(info->flags & VFIO_IOMMU_INFO_CAPS)) {
>> +        return NULL;
>> +    }
>> +
>> +    for (hdr = ptr + info->cap_offset; hdr != ptr; hdr = ptr + hdr->next) {
>> +        if (hdr->id == id) {
>> +            return hdr;
>> +        }
>> +    }
>> +
>> +    return NULL;
>> +}
>> +
>> +bool vfio_get_info_dma_avail(struct vfio_iommu_type1_info *info,
>> +                             unsigned int *avail)
>> +{
>> +    struct vfio_info_cap_header *hdr;
>> +    struct vfio_iommu_type1_info_dma_avail *cap;
>> +
>> +    /* If the capability cannot be found, assume no DMA limiting */
>> +    hdr = vfio_get_iommu_type1_info_cap(info,
>> +                                        VFIO_IOMMU_TYPE1_INFO_DMA_AVAIL);
>> +    if (hdr == NULL || avail == NULL) {
> 
> If you expect the caller to use avail=NULL, then why
> return false when there is available information?

I was not expecting the caller to use avail=NULL as there would be 
nowhere to stash the dma_avail count.  But you're right, we can at least 
still know that the capability is enabled/disabled when avail=NULL.

I can change this by returning true/false solely based upon the 
existence of the capability (whether or not hdr==NULL) while only 
updating the caller's *avail value when avail!=NULL.

If that's no good, then the alternative would be an assert()

> 
>> +        return false;
>> +    }
>> +
>> +    cap = (void *) hdr;
>> +    *avail = cap->avail;
>> +    return true;
>> +}
>> +
>>   static int vfio_setup_region_sparse_mmaps(VFIORegion *region,
>>                                             struct vfio_region_info *info)
>>   {
>> diff --git a/include/hw/vfio/vfio-common.h b/include/hw/vfio/vfio-common.h
>> index c78f3ff..661a380 100644
>> --- a/include/hw/vfio/vfio-common.h
>> +++ b/include/hw/vfio/vfio-common.h
>> @@ -178,6 +178,8 @@ VFIOGroup *vfio_get_group(int groupid, AddressSpace *as, Error **errp);
>>   void vfio_put_group(VFIOGroup *group);
>>   int vfio_get_device(VFIOGroup *group, const char *name,
>>                       VFIODevice *vbasedev, Error **errp);
>> +bool vfio_get_info_dma_avail(struct vfio_iommu_type1_info *info,
>> +                             unsigned int *avail);
>>   
>>   extern const MemoryRegionOps vfio_region_ops;
>>   typedef QLIST_HEAD(VFIOGroupList, VFIOGroup) VFIOGroupList;
>>
> 




* Re: [PATCH v2 3/3] vfio: Create shared routine for scanning info capabilities
  2020-09-15  6:16   ` Philippe Mathieu-Daudé
@ 2020-09-15 13:43     ` Matthew Rosato
  0 siblings, 0 replies; 17+ messages in thread
From: Matthew Rosato @ 2020-09-15 13:43 UTC (permalink / raw)
  To: Philippe Mathieu-Daudé, alex.williamson, cohuck
  Cc: thuth, pmorel, schnelle, david, qemu-devel, pasic, borntraeger,
	qemu-s390x, rth

On 9/15/20 2:16 AM, Philippe Mathieu-Daudé wrote:
> On 9/15/20 12:29 AM, Matthew Rosato wrote:
>> Rather than duplicating the same loop in multiple locations,
>> create a static function to do the work.
> 
> Why not do that first in your series?
> 

Fair question.  I did originally do this collapsing first, but I wasn't
sure whether Alex would want it, so I split it out and tacked it on the
end so that it could be dropped if desired.

I'd be just fine re-arranging and putting this patch first.

>>
>> Signed-off-by: Matthew Rosato <mjrosato@linux.ibm.com>
>> ---
>>   hw/vfio/common.c | 33 +++++++++++++++------------------
>>   1 file changed, 15 insertions(+), 18 deletions(-)
>>
>> diff --git a/hw/vfio/common.c b/hw/vfio/common.c
>> index 7f4a338..bfbbfe4 100644
>> --- a/hw/vfio/common.c
>> +++ b/hw/vfio/common.c
>> @@ -825,17 +825,12 @@ static void vfio_listener_release(VFIOContainer *container)
>>       }
>>   }
>>   
>> -struct vfio_info_cap_header *
>> -vfio_get_region_info_cap(struct vfio_region_info *info, uint16_t id)
>> +static struct vfio_info_cap_header *
>> +vfio_get_cap(void *ptr, uint32_t cap_offset, uint16_t id)
>>   {
>>       struct vfio_info_cap_header *hdr;
>> -    void *ptr = info;
>> -
>> -    if (!(info->flags & VFIO_REGION_INFO_FLAG_CAPS)) {
>> -        return NULL;
>> -    }
>>   
>> -    for (hdr = ptr + info->cap_offset; hdr != ptr; hdr = ptr + hdr->next) {
>> +    for (hdr = ptr + cap_offset; hdr != ptr; hdr = ptr + hdr->next) {
>>           if (hdr->id == id) {
>>               return hdr;
>>           }
>> @@ -844,23 +839,25 @@ vfio_get_region_info_cap(struct vfio_region_info *info, uint16_t id)
>>       return NULL;
>>   }
>>   
>> +
>> +struct vfio_info_cap_header *
>> +vfio_get_region_info_cap(struct vfio_region_info *info, uint16_t id)
>> +{
>> +    if (!(info->flags & VFIO_REGION_INFO_FLAG_CAPS)) {
>> +        return NULL;
>> +    }
>> +
>> +    return vfio_get_cap((void *)info, info->cap_offset, id);
>> +}
>> +
>>   static struct vfio_info_cap_header *
>>   vfio_get_iommu_type1_info_cap(struct vfio_iommu_type1_info *info, uint16_t id)
>>   {
>> -    struct vfio_info_cap_header *hdr;
>> -    void *ptr = info;
>> -
>>       if (!(info->flags & VFIO_IOMMU_INFO_CAPS)) {
>>           return NULL;
>>       }
>>   
>> -    for (hdr = ptr + info->cap_offset; hdr != ptr; hdr = ptr + hdr->next) {
>> -        if (hdr->id == id) {
>> -            return hdr;
>> -        }
>> -    }
>> -
>> -    return NULL;
>> +    return vfio_get_cap((void *)info, info->cap_offset, id);
>>   }
>>   
>>   bool vfio_get_info_dma_avail(struct vfio_iommu_type1_info *info,
>>
> 




* Re: [PATCH v2 1/3] vfio: Find DMA available capability
  2020-09-15 10:33   ` Cornelia Huck
@ 2020-09-15 13:57     ` Matthew Rosato
  2020-09-15 14:37       ` Cornelia Huck
  0 siblings, 1 reply; 17+ messages in thread
From: Matthew Rosato @ 2020-09-15 13:57 UTC (permalink / raw)
  To: Cornelia Huck
  Cc: thuth, pmorel, david, schnelle, qemu-s390x, qemu-devel, pasic,
	borntraeger, alex.williamson, rth

On 9/15/20 6:33 AM, Cornelia Huck wrote:
> On Mon, 14 Sep 2020 18:29:28 -0400
> Matthew Rosato <mjrosato@linux.ibm.com> wrote:
> 
>> The underlying host may be limiting the number of outstanding DMA
>> requests for type 1 IOMMU.  Add helper functions to check for the
>> DMA available capability and retrieve the current number of DMA
>> mappings allowed.
>>
>> Signed-off-by: Matthew Rosato <mjrosato@linux.ibm.com>
>> ---
>>   hw/vfio/common.c              | 37 +++++++++++++++++++++++++++++++++++++
>>   include/hw/vfio/vfio-common.h |  2 ++
>>   2 files changed, 39 insertions(+)
>>
> 
> (...)
> 
>> +bool vfio_get_info_dma_avail(struct vfio_iommu_type1_info *info,
>> +                             unsigned int *avail)
>> +{
>> +    struct vfio_info_cap_header *hdr;
>> +    struct vfio_iommu_type1_info_dma_avail *cap;
>> +
>> +    /* If the capability cannot be found, assume no DMA limiting */
>> +    hdr = vfio_get_iommu_type1_info_cap(info,
>> +                                        VFIO_IOMMU_TYPE1_INFO_DMA_AVAIL);
> 
> ...don't you need a headers sync first to get the new definitions?
> 

You are right, of course, though the associated header change is not yet
merged in the kernel, so it's a bit flaky.  But bottom line: yes, we
need a header sync first; I'll include one in v3.

> (...)
> 




* Re: [PATCH v2 2/3] s390x/pci: Honor DMA limits set by vfio
  2020-09-15 11:28   ` Cornelia Huck
@ 2020-09-15 14:16     ` Matthew Rosato
  2020-09-15 14:50       ` Cornelia Huck
  0 siblings, 1 reply; 17+ messages in thread
From: Matthew Rosato @ 2020-09-15 14:16 UTC (permalink / raw)
  To: Cornelia Huck
  Cc: thuth, pmorel, david, schnelle, qemu-s390x, qemu-devel, pasic,
	borntraeger, alex.williamson, rth

On 9/15/20 7:28 AM, Cornelia Huck wrote:
> On Mon, 14 Sep 2020 18:29:29 -0400
> Matthew Rosato <mjrosato@linux.ibm.com> wrote:
> 
>> When an s390 guest is using lazy unmapping, it can result in a very
>> large number of outstanding DMA requests, far beyond the default
>> limit configured for vfio.  Let's track DMA usage similar to vfio
>> in the host, and trigger the guest to flush their DMA mappings
>> before vfio runs out.
>>
>> Signed-off-by: Matthew Rosato <mjrosato@linux.ibm.com>
>> ---
>>   hw/s390x/s390-pci-bus.c  | 99 +++++++++++++++++++++++++++++++++++++++++++++---
>>   hw/s390x/s390-pci-bus.h  |  9 +++++
>>   hw/s390x/s390-pci-inst.c | 29 +++++++++++---
>>   hw/s390x/s390-pci-inst.h |  3 ++
>>   4 files changed, 129 insertions(+), 11 deletions(-)
>>
> 
> (...)
> 
>> @@ -737,6 +742,82 @@ static void s390_pci_iommu_free(S390pciState *s, PCIBus *bus, int32_t devfn)
>>       object_unref(OBJECT(iommu));
>>   }
>>   
>> +static bool s390_sync_dma_avail(int fd, unsigned int *avail)
> 
> Not sure I like the name. It sounds like the function checks whether
> "sync dma" is available. Maybe s390_update_dma_avail()?
> 

Sounds fine to me.

>> +{
>> +    struct vfio_iommu_type1_info *info;
>> +    uint32_t argsz;
>> +    bool rval = false;
>> +    int ret;
>> +
>> +    if (avail == NULL) {
>> +        return false;
>> +    }
>> +
>> +    argsz = sizeof(struct vfio_iommu_type1_info);
>> +    info = g_malloc0(argsz);
>> +    info->argsz = argsz;
>> +    /*
>> +     * If the specified argsz is not large enough to contain all
>> +     * capabilities it will be updated upon return.  In this case
>> +     * use the updated value to get the entire capability chain.
>> +     */
>> +    ret = ioctl(fd, VFIO_IOMMU_GET_INFO, info);
>> +    if (argsz != info->argsz) {
>> +        argsz = info->argsz;
>> +        info = g_realloc(info, argsz);
>> +        info->argsz = argsz;
>> +        ret = ioctl(fd, VFIO_IOMMU_GET_INFO, info);
>> +    }
>> +
>> +    if (ret) {
>> +        goto out;
>> +    }
>> +
>> +    /* If the capability exists, update with the current value */
>> +    rval = vfio_get_info_dma_avail(info, avail);
> 
> Adding vfio specific things into the generic s390 pci emulation code
> looks a bit ugly... I'd prefer to factor that out into a separate file,
> especially if you plan to add more vfio-specific things in the future.
> 

Fair.   hw/s390x/s390-pci-vfio.* ?

>> +
>> +out:
>> +    g_free(info);
>> +    return rval;
>> +}
>> +
> 
> (...)
> 
>> diff --git a/hw/s390x/s390-pci-inst.c b/hw/s390x/s390-pci-inst.c
>> index 2f7a7d7..6af9af4 100644
>> --- a/hw/s390x/s390-pci-inst.c
>> +++ b/hw/s390x/s390-pci-inst.c
>> @@ -32,6 +32,9 @@
>>           }                                                          \
>>       } while (0)
>>   
>> +#define INC_DMA_AVAIL(iommu) if (iommu->dma_limit) iommu->dma_limit->avail++;
>> +#define DEC_DMA_AVAIL(iommu) if (iommu->dma_limit) iommu->dma_limit->avail--;
> 
> Hm... maybe lowercase inline functions might be better here?
> 

OK

>> +
>>   static void s390_set_status_code(CPUS390XState *env,
>>                                    uint8_t r, uint64_t status_code)
>>   {
> 
> (...)
> 
>> @@ -620,6 +629,7 @@ int rpcit_service_call(S390CPU *cpu, uint8_t r1, uint8_t r2, uintptr_t ra)
>>       S390PCIIOMMU *iommu;
>>       S390IOTLBEntry entry;
>>       hwaddr start, end;
>> +    uint32_t dma_avail;
>>   
>>       if (env->psw.mask & PSW_MASK_PSTATE) {
>>           s390_program_interrupt(env, PGM_PRIVILEGED, ra);
>> @@ -675,8 +685,9 @@ int rpcit_service_call(S390CPU *cpu, uint8_t r1, uint8_t r2, uintptr_t ra)
>>           }
>>   
>>           start += entry.len;
>> -        while (entry.iova < start && entry.iova < end) {
>> -            s390_pci_update_iotlb(iommu, &entry);
>> +        dma_avail = 1; /* Assume non-zero dma_avail to start */
>> +        while (entry.iova < start && entry.iova < end && dma_avail > 0) {
>> +            dma_avail = s390_pci_update_iotlb(iommu, &entry);
>>               entry.iova += PAGE_SIZE;
>>               entry.translated_addr += PAGE_SIZE;
>>           }
>> @@ -689,7 +700,13 @@ err:
>>           s390_pci_generate_error_event(error, pbdev->fh, pbdev->fid, start, 0);
>>       } else {
>>           pbdev->fmb.counter[ZPCI_FMB_CNT_RPCIT]++;
>> -        setcc(cpu, ZPCI_PCI_LS_OK);
>> +        if (dma_avail > 0) {
> 
> When I compile this (with a headers update), the compiler moans here
> about an uninitialized variable.
> 

D'oh.  Obviously dma_avail needs to be initialized outside of the 
if/else -- I'll double-check the logic here and fix.

>> +            setcc(cpu, ZPCI_PCI_LS_OK);
>> +        } else {
>> +            /* vfio DMA mappings are exhausted, trigger an RPCIT */
>> +            setcc(cpu, ZPCI_PCI_LS_ERR);
>> +            s390_set_status_code(env, r1, ZPCI_RPCIT_ST_INSUFF_RES);
>> +        }
>>       }
>>       return 0;
>>   }
> 
> (...)
> 




* Re: [PATCH v2 2/3] s390x/pci: Honor DMA limits set by vfio
  2020-09-15 12:54   ` Thomas Huth
@ 2020-09-15 14:18     ` Matthew Rosato
  0 siblings, 0 replies; 17+ messages in thread
From: Matthew Rosato @ 2020-09-15 14:18 UTC (permalink / raw)
  To: Thomas Huth, alex.williamson, cohuck
  Cc: pmorel, schnelle, david, qemu-devel, pasic, borntraeger, qemu-s390x, rth

On 9/15/20 8:54 AM, Thomas Huth wrote:
> On 15/09/2020 00.29, Matthew Rosato wrote:
>> When an s390 guest is using lazy unmapping, it can result in a very
>> large number of outstanding DMA requests, far beyond the default
>> limit configured for vfio.  Let's track DMA usage similar to vfio
>> in the host, and trigger the guest to flush their DMA mappings
>> before vfio runs out.
>>
>> Signed-off-by: Matthew Rosato <mjrosato@linux.ibm.com>
>> ---
>>   hw/s390x/s390-pci-bus.c  | 99 +++++++++++++++++++++++++++++++++++++++++++++---
>>   hw/s390x/s390-pci-bus.h  |  9 +++++
>>   hw/s390x/s390-pci-inst.c | 29 +++++++++++---
>>   hw/s390x/s390-pci-inst.h |  3 ++
>>   4 files changed, 129 insertions(+), 11 deletions(-)
>>
>> diff --git a/hw/s390x/s390-pci-bus.c b/hw/s390x/s390-pci-bus.c
>> index 92146a2..23474cd 100644
>> --- a/hw/s390x/s390-pci-bus.c
>> +++ b/hw/s390x/s390-pci-bus.c
>> @@ -11,6 +11,8 @@
>>    * directory.
>>    */
>>   
>> +#include <sys/ioctl.h>
>> +
>>   #include "qemu/osdep.h"
>>   #include "qapi/error.h"
>>   #include "qapi/visitor.h"
>> @@ -24,6 +26,9 @@
>>   #include "qemu/error-report.h"
>>   #include "qemu/module.h"
>>   
>> +#include "hw/vfio/pci.h"
>> +#include "hw/vfio/vfio-common.h"
>> +
>>   #ifndef DEBUG_S390PCI_BUS
>>   #define DEBUG_S390PCI_BUS  0
>>   #endif
>> @@ -737,6 +742,82 @@ static void s390_pci_iommu_free(S390pciState *s, PCIBus *bus, int32_t devfn)
>>       object_unref(OBJECT(iommu));
>>   }
>>   
>> +static bool s390_sync_dma_avail(int fd, unsigned int *avail)
>> +{
>> +    struct vfio_iommu_type1_info *info;
> 
> You could use g_autofree to get rid of the g_free() at the end.
> 

OK

>> +    uint32_t argsz;
>> +    bool rval = false;
>> +    int ret;
>> +
>> +    if (avail == NULL) {
>> +        return false;
>> +    }
> 
> Since this is a "static" local function, and calling it with avail ==
> NULL does not make too much sense, I think I'd rather turn this into an
> assert() instead. >

Sure, sounds good.


>> +    argsz = sizeof(struct vfio_iommu_type1_info);
>> +    info = g_malloc0(argsz);
>> +    info->argsz = argsz;
>> +    /*
>> +     * If the specified argsz is not large enough to contain all
>> +     * capabilities it will be updated upon return.  In this case
>> +     * use the updated value to get the entire capability chain.
>> +     */
>> +    ret = ioctl(fd, VFIO_IOMMU_GET_INFO, info);
>> +    if (argsz != info->argsz) {
>> +        argsz = info->argsz;
>> +        info = g_realloc(info, argsz);
>> +        info->argsz = argsz;
>> +        ret = ioctl(fd, VFIO_IOMMU_GET_INFO, info);
>> +    }
>> +
>> +    if (ret) {
>> +        goto out;
>> +    }
>> +
>> +    /* If the capability exists, update with the current value */
>> +    rval = vfio_get_info_dma_avail(info, avail);
>> +
>> +out:
>> +    g_free(info);
>> +    return rval;
>> +}
>> +
>> +static S390PCIDMACount *s390_start_dma_count(S390pciState *s, VFIODevice *vdev)
>> +{
>> +    int id = vdev->group->container->fd;
>> +    S390PCIDMACount *cnt;
>> +    uint32_t avail;
>> +
>> +    if (!s390_sync_dma_avail(id, &avail)) {
>> +        return NULL;
>> +    }
>> +
>> +    QTAILQ_FOREACH(cnt, &s->zpci_dma_limit, link) {
>> +        if (cnt->id  == id) {
>> +            cnt->users++;
>> +            return cnt;
>> +        }
>> +    }
>> +
>> +    cnt = g_new0(S390PCIDMACount, 1);
>> +    cnt->id = id;
>> +    cnt->users = 1;
>> +    cnt->avail = avail;
>> +    QTAILQ_INSERT_TAIL(&s->zpci_dma_limit, cnt, link);
>> +    return cnt;
>> +}
>> +
>> +static void s390_end_dma_count(S390pciState *s, S390PCIDMACount *cnt)
>> +{
>> +    if (cnt == NULL) {
>> +        return;
>> +    }
> 
> Either use assert() or drop this completely (since you're checking it at
> the caller site already).
> 

Fair - I'll assert() here.  Thanks!



* Re: [PATCH v2 1/3] vfio: Find DMA available capability
  2020-09-15 13:57     ` Matthew Rosato
@ 2020-09-15 14:37       ` Cornelia Huck
  0 siblings, 0 replies; 17+ messages in thread
From: Cornelia Huck @ 2020-09-15 14:37 UTC (permalink / raw)
  To: Matthew Rosato
  Cc: thuth, pmorel, david, schnelle, qemu-s390x, qemu-devel, pasic,
	borntraeger, alex.williamson, rth

On Tue, 15 Sep 2020 09:57:03 -0400
Matthew Rosato <mjrosato@linux.ibm.com> wrote:

> On 9/15/20 6:33 AM, Cornelia Huck wrote:
> > On Mon, 14 Sep 2020 18:29:28 -0400
> > Matthew Rosato <mjrosato@linux.ibm.com> wrote:
> >   
> >> The underlying host may be limiting the number of outstanding DMA
> >> requests for type 1 IOMMU.  Add helper functions to check for the
> >> DMA available capability and retrieve the current number of DMA
> >> mappings allowed.
> >>
> >> Signed-off-by: Matthew Rosato <mjrosato@linux.ibm.com>
> >> ---
> >>   hw/vfio/common.c              | 37 +++++++++++++++++++++++++++++++++++++
> >>   include/hw/vfio/vfio-common.h |  2 ++
> >>   2 files changed, 39 insertions(+)
> >>  
> > 
> > (...)
> >   
> >> +bool vfio_get_info_dma_avail(struct vfio_iommu_type1_info *info,
> >> +                             unsigned int *avail)
> >> +{
> >> +    struct vfio_info_cap_header *hdr;
> >> +    struct vfio_iommu_type1_info_dma_avail *cap;
> >> +
> >> +    /* If the capability cannot be found, assume no DMA limiting */
> >> +    hdr = vfio_get_iommu_type1_info_cap(info,
> >> +                                        VFIO_IOMMU_TYPE1_INFO_DMA_AVAIL);  
> > 
> > ...don't you need a headers sync first to get the new definitions?
> >   
> 
> You are right, of course, though the associated header change is not yet 
> merged in the kernel, so it's a bit flaky.  But bottom line: yes, we 
> need a header sync first; I'll include one in v3.

Just include a placeholder patch :) It's easy to replace it with a
real update prior to merging.




* Re: [PATCH v2 2/3] s390x/pci: Honor DMA limits set by vfio
  2020-09-15 14:16     ` Matthew Rosato
@ 2020-09-15 14:50       ` Cornelia Huck
  0 siblings, 0 replies; 17+ messages in thread
From: Cornelia Huck @ 2020-09-15 14:50 UTC (permalink / raw)
  To: Matthew Rosato
  Cc: thuth, pmorel, david, schnelle, qemu-s390x, qemu-devel, pasic,
	borntraeger, alex.williamson, rth

On Tue, 15 Sep 2020 10:16:23 -0400
Matthew Rosato <mjrosato@linux.ibm.com> wrote:

> On 9/15/20 7:28 AM, Cornelia Huck wrote:
> > On Mon, 14 Sep 2020 18:29:29 -0400
> > Matthew Rosato <mjrosato@linux.ibm.com> wrote:
> >   
> >> When an s390 guest is using lazy unmapping, it can result in a very
> >> large number of outstanding DMA requests, far beyond the default
> >> limit configured for vfio.  Let's track DMA usage similar to vfio
> >> in the host, and trigger the guest to flush their DMA mappings
> >> before vfio runs out.
> >>
> >> Signed-off-by: Matthew Rosato <mjrosato@linux.ibm.com>
> >> ---
> >>   hw/s390x/s390-pci-bus.c  | 99 +++++++++++++++++++++++++++++++++++++++++++++---
> >>   hw/s390x/s390-pci-bus.h  |  9 +++++
> >>   hw/s390x/s390-pci-inst.c | 29 +++++++++++---
> >>   hw/s390x/s390-pci-inst.h |  3 ++
> >>   4 files changed, 129 insertions(+), 11 deletions(-)
> >>  
> > 
> > (...)
> >   
> >> @@ -737,6 +742,82 @@ static void s390_pci_iommu_free(S390pciState *s, PCIBus *bus, int32_t devfn)
> >>       object_unref(OBJECT(iommu));
> >>   }
> >>   
> >> +static bool s390_sync_dma_avail(int fd, unsigned int *avail)  
> > 
> > Not sure I like the name. It sounds like the function checks whether
> > "sync dma" is available. Maybe s390_update_dma_avail()?
> >   
> 
> Sounds fine to me.
> 
> >> +{
> >> +    struct vfio_iommu_type1_info *info;
> >> +    uint32_t argsz;
> >> +    bool rval = false;
> >> +    int ret;
> >> +
> >> +    if (avail == NULL) {
> >> +        return false;
> >> +    }
> >> +
> >> +    argsz = sizeof(struct vfio_iommu_type1_info);
> >> +    info = g_malloc0(argsz);
> >> +    info->argsz = argsz;
> >> +    /*
> >> +     * If the specified argsz is not large enough to contain all
> >> +     * capabilities it will be updated upon return.  In this case
> >> +     * use the updated value to get the entire capability chain.
> >> +     */
> >> +    ret = ioctl(fd, VFIO_IOMMU_GET_INFO, info);
> >> +    if (argsz != info->argsz) {
> >> +        argsz = info->argsz;
> >> +        info = g_realloc(info, argsz);
> >> +        info->argsz = argsz;
> >> +        ret = ioctl(fd, VFIO_IOMMU_GET_INFO, info);
> >> +    }
> >> +
> >> +    if (ret) {
> >> +        goto out;
> >> +    }
> >> +
> >> +    /* If the capability exists, update with the current value */
> >> +    rval = vfio_get_info_dma_avail(info, avail);  
> > 
> > Adding vfio specific things into the generic s390 pci emulation code
> > looks a bit ugly... I'd prefer to factor that out into a separate file,
> > especially if you plan to add more vfio-specific things in the future.
> >   
> 
> Fair.   hw/s390x/s390-pci-vfio.* ?

Sounds good to me.




end of thread, other threads:[~2020-09-15 14:56 UTC | newest]

Thread overview: 17+ messages
2020-09-14 22:29 [PATCH v2 0/3] s390x/pci: Accomodate vfio DMA limiting Matthew Rosato
2020-09-14 22:29 ` [PATCH v2 1/3] vfio: Find DMA available capability Matthew Rosato
2020-09-15  6:14   ` Philippe Mathieu-Daudé
2020-09-15 10:10     ` Cornelia Huck
2020-09-15 13:39     ` Matthew Rosato
2020-09-15 10:33   ` Cornelia Huck
2020-09-15 13:57     ` Matthew Rosato
2020-09-15 14:37       ` Cornelia Huck
2020-09-14 22:29 ` [PATCH v2 2/3] s390x/pci: Honor DMA limits set by vfio Matthew Rosato
2020-09-15 11:28   ` Cornelia Huck
2020-09-15 14:16     ` Matthew Rosato
2020-09-15 14:50       ` Cornelia Huck
2020-09-15 12:54   ` Thomas Huth
2020-09-15 14:18     ` Matthew Rosato
2020-09-14 22:29 ` [PATCH v2 3/3] vfio: Create shared routine for scanning info capabilities Matthew Rosato
2020-09-15  6:16   ` Philippe Mathieu-Daudé
2020-09-15 13:43     ` Matthew Rosato
