* [PATCH 0/3] s390x/pci: rpcit fixes and enhancements
@ 2022-10-28 19:47 Matthew Rosato
  2022-10-28 19:47 ` [PATCH 1/3] s390x/pci: RPCIT second pass when mappings exhausted Matthew Rosato
                   ` (3 more replies)
  0 siblings, 4 replies; 8+ messages in thread
From: Matthew Rosato @ 2022-10-28 19:47 UTC
  To: qemu-s390x
  Cc: farman, pmorel, schnelle, cohuck, thuth, pasic, borntraeger,
	richard.henderson, david, qemu-devel

The following series fixes an issue with guest RPCIT processing discovered
during development of [1] and proposes a few additional optimizations
to the current RPCIT code path.

[1] https://lore.kernel.org/linux-s390/20221019144435.369902-1-schnelle@linux.ibm.com/

Matthew Rosato (3):
  s390x/pci: RPCIT second pass when mappings exhausted
  s390x/pci: coalesce unmap operations
  s390x/pci: shrink DMA aperture to be bound by vfio DMA limit

 hw/s390x/s390-pci-inst.c        | 80 ++++++++++++++++++++++++++++++---
 hw/s390x/s390-pci-vfio.c        | 11 +++++
 include/hw/s390x/s390-pci-bus.h |  1 +
 3 files changed, 85 insertions(+), 7 deletions(-)

-- 
2.37.3




* [PATCH 1/3] s390x/pci: RPCIT second pass when mappings exhausted
  2022-10-28 19:47 [PATCH 0/3] s390x/pci: rpcit fixes and enhancements Matthew Rosato
@ 2022-10-28 19:47 ` Matthew Rosato
  2022-11-04 15:50   ` Eric Farman
  2022-10-28 19:47 ` [PATCH 2/3] s390x/pci: coalesce unmap operations Matthew Rosato
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 8+ messages in thread
From: Matthew Rosato @ 2022-10-28 19:47 UTC
  To: qemu-s390x
  Cc: farman, pmorel, schnelle, cohuck, thuth, pasic, borntraeger,
	richard.henderson, david, qemu-devel

If we encounter a new mapping while the number of available DMA entries
in vfio is 0, we currently skip that mapping.  This is a problem if we
manage to free up DMA space later within the same RPCIT: we return to
the guest with CC0 without having mapped everything within the
specified range.  This issue was uncovered while testing changes to the
s390 Linux kernel iommu/dma code, where a different usage pattern was
employed (new mappings start at the end of the aperture and work back
towards the front, making us far more likely to encounter new mappings
before invalidated mappings during a global refresh).

Fix this by tracking whether any mappings were skipped due to vfio
DMA limit hitting 0; when this occurs, we still continue the range
and unmap/map anything we can - then we must re-run the range
to pick up anything that was missed.  This must occur in a loop until
all requests are satisfied (success) or we detect that we are still
unable to complete all mappings (return ZPCI_RPCIT_ST_INSUFF_RES).
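
For illustration only, the multi-pass idea can be sketched as a toy
model that is independent of QEMU.  The names toy_req, toy_page and
toy_refresh are hypothetical and do not exist in the tree; the model
assumes one DMA entry per page and treats an already-mapped page as a
no-op, which simplifies what s390_pci_update_iotlb() really does.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-page request, standing in for a guest table entry. */
typedef enum { REQ_UNMAP, REQ_MAP } toy_req;

typedef struct {
    toy_req want;   /* what the guest table asks for */
    bool mapped;    /* whether the host currently holds a mapping */
} toy_page;

/*
 * Walk the whole range; repeat while a mapping was skipped and budget
 * has been freed in the meantime.  Returns true when every REQ_MAP page
 * ended up mapped, false when the budget is exhausted (the case where
 * the real code would report ZPCI_RPCIT_ST_INSUFF_RES).
 */
static bool toy_refresh(toy_page *pages, size_t n, uint32_t *avail)
{
    bool again;

    assert(pages != NULL && avail != NULL);
    do {
        again = false;
        for (size_t i = 0; i < n; i++) {
            toy_page *p = &pages[i];

            if (p->want == REQ_UNMAP) {
                if (p->mapped) {
                    /* Unmaps always succeed and give an entry back. */
                    p->mapped = false;
                    (*avail)++;
                }
            } else if (!p->mapped) {
                if (*avail > 0) {
                    p->mapped = true;
                    (*avail)--;
                } else {
                    /* Skipped for now; a later unmap may free space. */
                    again = true;
                }
            }
        }
    } while (again && *avail > 0);

    return !again;
}
```

With a budget of 0, a range consisting of one new mapping followed by
one unmap is fully satisfied on the second pass; a single pass would
have left the new mapping out.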

Link: https://lore.kernel.org/linux-s390/20221019144435.369902-1-schnelle@linux.ibm.com/
Fixes: 37fa32de70 ("s390x/pci: Honor DMA limits set by vfio")
Reported-by: Niklas Schnelle <schnelle@linux.ibm.com>
Signed-off-by: Matthew Rosato <mjrosato@linux.ibm.com>
---
 hw/s390x/s390-pci-inst.c | 29 ++++++++++++++++++++++-------
 1 file changed, 22 insertions(+), 7 deletions(-)

diff --git a/hw/s390x/s390-pci-inst.c b/hw/s390x/s390-pci-inst.c
index 20a9bcc7af..7cc4bcf850 100644
--- a/hw/s390x/s390-pci-inst.c
+++ b/hw/s390x/s390-pci-inst.c
@@ -677,8 +677,9 @@ int rpcit_service_call(S390CPU *cpu, uint8_t r1, uint8_t r2, uintptr_t ra)
     S390PCIBusDevice *pbdev;
     S390PCIIOMMU *iommu;
     S390IOTLBEntry entry;
-    hwaddr start, end;
+    hwaddr start, end, sstart;
     uint32_t dma_avail;
+    bool again;
 
     if (env->psw.mask & PSW_MASK_PSTATE) {
         s390_program_interrupt(env, PGM_PRIVILEGED, ra);
@@ -691,7 +692,7 @@ int rpcit_service_call(S390CPU *cpu, uint8_t r1, uint8_t r2, uintptr_t ra)
     }
 
     fh = env->regs[r1] >> 32;
-    start = env->regs[r2];
+    sstart = start = env->regs[r2];
     end = start + env->regs[r2 + 1];
 
     pbdev = s390_pci_find_dev_by_fh(s390_get_phb(), fh);
@@ -732,6 +733,9 @@ int rpcit_service_call(S390CPU *cpu, uint8_t r1, uint8_t r2, uintptr_t ra)
         goto err;
     }
 
+ retry:
+    start = sstart;
+    again = false;
     while (start < end) {
         error = s390_guest_io_table_walk(iommu->g_iota, start, &entry);
         if (error) {
@@ -739,13 +743,24 @@ int rpcit_service_call(S390CPU *cpu, uint8_t r1, uint8_t r2, uintptr_t ra)
         }
 
         start += entry.len;
-        while (entry.iova < start && entry.iova < end &&
-               (dma_avail > 0 || entry.perm == IOMMU_NONE)) {
-            dma_avail = s390_pci_update_iotlb(iommu, &entry);
-            entry.iova += TARGET_PAGE_SIZE;
-            entry.translated_addr += TARGET_PAGE_SIZE;
+        while (entry.iova < start && entry.iova < end) {
+            if (dma_avail > 0 || entry.perm == IOMMU_NONE) {
+                dma_avail = s390_pci_update_iotlb(iommu, &entry);
+                entry.iova += TARGET_PAGE_SIZE;
+                entry.translated_addr += TARGET_PAGE_SIZE;
+            } else {
+                /*
+                 * We are unable to make a new mapping at this time, continue
+                 * on and hopefully free up more space.  Then attempt another
+                 * pass.
+                 */
+                again = true;
+                break;
+            }
         }
     }
+    if (again && dma_avail > 0)
+        goto retry;
 err:
     if (error) {
         pbdev->state = ZPCI_FS_ERROR;
-- 
2.37.3




* [PATCH 2/3] s390x/pci: coalesce unmap operations
  2022-10-28 19:47 [PATCH 0/3] s390x/pci: rpcit fixes and enhancements Matthew Rosato
  2022-10-28 19:47 ` [PATCH 1/3] s390x/pci: RPCIT second pass when mappings exhausted Matthew Rosato
@ 2022-10-28 19:47 ` Matthew Rosato
  2022-11-04 16:01   ` Eric Farman
  2022-10-28 19:47 ` [PATCH 3/3] s390x/pci: shrink DMA aperture to be bound by vfio DMA limit Matthew Rosato
  2022-12-12  8:37 ` [PATCH 0/3] s390x/pci: rpcit fixes and enhancements Thomas Huth
  3 siblings, 1 reply; 8+ messages in thread
From: Matthew Rosato @ 2022-10-28 19:47 UTC
  To: qemu-s390x
  Cc: farman, pmorel, schnelle, cohuck, thuth, pasic, borntraeger,
	richard.henderson, david, qemu-devel

Currently, each unmapped page is handled as an individual iommu
region notification.  Attempt to group contiguous unmap operations
into fewer notifications to reduce overhead.
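
As a rough sketch of the grouping, a contiguous run of unmapped pages
can be split into naturally aligned power-of-two chunks, one
notification per chunk.  The helper names below (toy_chunk_size,
toy_batch_unmap, toy_unmap_event) are hypothetical, and toy_chunk_size
is only a simplified stand-in for dma_aligned_pow2_mask(), assuming
page-aligned input and 4K pages.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_PAGE_SIZE 4096ULL   /* assumption: 4K pages */

/* Hypothetical stand-in for the iova/addr_mask pair of an unmap event. */
typedef struct {
    uint64_t iova;
    uint64_t addr_mask;
} toy_unmap_event;

/* Largest power-of-two size that start is aligned to and fits in remain. */
static uint64_t toy_chunk_size(uint64_t start, uint64_t remain)
{
    /* Lowest set bit gives the natural alignment; 0 is aligned to all. */
    uint64_t size = start ? (start & (~start + 1)) : (1ULL << 63);

    while (size > remain) {
        size >>= 1;
    }
    return size;
}

/* Fill out[] with one event per chunk; returns the number of events. */
static size_t toy_batch_unmap(uint64_t iova, uint64_t len,
                              toy_unmap_event *out, size_t max_out)
{
    size_t n = 0;

    assert(iova % TOY_PAGE_SIZE == 0 && len % TOY_PAGE_SIZE == 0);
    while (len >= TOY_PAGE_SIZE && n < max_out) {
        uint64_t size = toy_chunk_size(iova, len);

        out[n].iova = iova;
        out[n].addr_mask = size - 1;
        n++;
        iova += size;
        len -= size;
    }
    return n;
}
```

In this model, five contiguous pages starting at 0x3000 become two
events (one page at 0x3000, then four pages at 0x4000) instead of five
single-page notifications.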

Signed-off-by: Matthew Rosato <mjrosato@linux.ibm.com>
---
 hw/s390x/s390-pci-inst.c | 51 ++++++++++++++++++++++++++++++++++++++++
 1 file changed, 51 insertions(+)

diff --git a/hw/s390x/s390-pci-inst.c b/hw/s390x/s390-pci-inst.c
index 7cc4bcf850..66e764f901 100644
--- a/hw/s390x/s390-pci-inst.c
+++ b/hw/s390x/s390-pci-inst.c
@@ -640,6 +640,8 @@ static uint32_t s390_pci_update_iotlb(S390PCIIOMMU *iommu,
         }
         g_hash_table_remove(iommu->iotlb, &entry->iova);
         inc_dma_avail(iommu);
+        /* Don't notify the iommu yet, maybe we can bundle contiguous unmaps */
+        goto out;
     } else {
         if (cache) {
             if (cache->perm == entry->perm &&
@@ -663,15 +665,44 @@ static uint32_t s390_pci_update_iotlb(S390PCIIOMMU *iommu,
         dec_dma_avail(iommu);
     }
 
+    /*
+     * All associated iotlb entries have already been cleared, trigger the
+     * unmaps.
+     */
     memory_region_notify_iommu(&iommu->iommu_mr, 0, event);
 
 out:
     return iommu->dma_limit ? iommu->dma_limit->avail : 1;
 }
 
+static void s390_pci_batch_unmap(S390PCIIOMMU *iommu, uint64_t iova,
+                                 uint64_t len)
+{
+    uint64_t remain = len, start = iova, end = start + len - 1, mask, size;
+    IOMMUTLBEvent event = {
+        .type = IOMMU_NOTIFIER_UNMAP,
+        .entry = {
+            .target_as = &address_space_memory,
+            .translated_addr = 0,
+            .perm = IOMMU_NONE,
+        },
+    };
+
+    while (remain >= TARGET_PAGE_SIZE) {
+        mask = dma_aligned_pow2_mask(start, end, 64);
+        size = mask + 1;
+        event.entry.iova = start;
+        event.entry.addr_mask = mask;
+        memory_region_notify_iommu(&iommu->iommu_mr, 0, event);
+        start += size;
+        remain -= size;
+    }
+}
+
 int rpcit_service_call(S390CPU *cpu, uint8_t r1, uint8_t r2, uintptr_t ra)
 {
     CPUS390XState *env = &cpu->env;
+    uint64_t iova, coalesce = 0;
     uint32_t fh;
     uint16_t error = 0;
     S390PCIBusDevice *pbdev;
@@ -742,6 +773,21 @@ int rpcit_service_call(S390CPU *cpu, uint8_t r1, uint8_t r2, uintptr_t ra)
             break;
         }
 
+        /*
+         * If this is an unmap of a PTE, let's try to coalesce multiple unmaps
+         * into as few notifier events as possible.
+         */
+        if (entry.perm == IOMMU_NONE && entry.len == TARGET_PAGE_SIZE) {
+            if (coalesce == 0) {
+                iova = entry.iova;
+            }
+            coalesce += entry.len;
+        } else if (coalesce > 0) {
+            /* Unleash the coalesced unmap before processing a new map */
+            s390_pci_batch_unmap(iommu, iova, coalesce);
+            coalesce = 0;
+        }
+
         start += entry.len;
         while (entry.iova < start && entry.iova < end) {
             if (dma_avail > 0 || entry.perm == IOMMU_NONE) {
@@ -759,6 +805,11 @@ int rpcit_service_call(S390CPU *cpu, uint8_t r1, uint8_t r2, uintptr_t ra)
             }
         }
     }
+    if (coalesce) {
+        /* Unleash the coalesced unmap before finishing rpcit */
+        s390_pci_batch_unmap(iommu, iova, coalesce);
+        coalesce = 0;
+    }
     if (again && dma_avail > 0)
         goto retry;
 err:
-- 
2.37.3




* [PATCH 3/3] s390x/pci: shrink DMA aperture to be bound by vfio DMA limit
  2022-10-28 19:47 [PATCH 0/3] s390x/pci: rpcit fixes and enhancements Matthew Rosato
  2022-10-28 19:47 ` [PATCH 1/3] s390x/pci: RPCIT second pass when mappings exhausted Matthew Rosato
  2022-10-28 19:47 ` [PATCH 2/3] s390x/pci: coalesce unmap operations Matthew Rosato
@ 2022-10-28 19:47 ` Matthew Rosato
  2022-11-04 16:08   ` Eric Farman
  2022-12-12  8:37 ` [PATCH 0/3] s390x/pci: rpcit fixes and enhancements Thomas Huth
  3 siblings, 1 reply; 8+ messages in thread
From: Matthew Rosato @ 2022-10-28 19:47 UTC
  To: qemu-s390x
  Cc: farman, pmorel, schnelle, cohuck, thuth, pasic, borntraeger,
	richard.henderson, david, qemu-devel

Currently, s390x-pci performs accounting against the vfio DMA
limit and triggers the guest to clean up mappings when the limit
is reached. Let's go a step further and also limit the size of
the supported DMA aperture reported to the guest based upon the
initial vfio DMA limit reported for the container (if less than
the size reported by the firmware/host zPCI layer).  This
avoids processing sections of the guest DMA table during global
refresh that, for common use cases, will never be used anyway, and
makes exhausting the vfio DMA limit due to mismatch between guest
aperture size and host limit far less likely and more indicative
of an error.
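
The arithmetic can be sketched in isolation as follows.  The name
toy_clamp_edma is hypothetical, the sketch assumes 4K pages (a shift
of 12), and it additionally assumes a non-zero limit that does not
overflow when shifted; the example values are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_PAGE_BITS 12    /* assumption: 4K pages */

/*
 * Return the end of the DMA aperture to report to the guest: the host
 * value, or a smaller one when max_dma_limit mappings of one page each
 * cannot cover the whole host aperture.
 */
static uint64_t toy_clamp_edma(uint64_t start_dma, uint64_t end_dma,
                               uint64_t max_dma_limit)
{
    uint64_t vfio_size;

    /* Sketch-only preconditions: non-zero limit, no overflow on shift. */
    assert(max_dma_limit > 0);
    assert(max_dma_limit <= (UINT64_MAX >> TOY_PAGE_BITS));
    assert(end_dma >= start_dma);

    vfio_size = max_dma_limit << TOY_PAGE_BITS;
    if (vfio_size < end_dma - start_dma + 1) {
        return start_dma + vfio_size - 1;
    }
    return end_dma;
}
```

For example, with a 4 GiB host aperture from 0x100000000 to 0x1ffffffff
and a limit of 65535 mappings, the reported end becomes 0x10fffefff;
with a limit large enough to cover the aperture, the host value is
kept unchanged.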

Signed-off-by: Matthew Rosato <mjrosato@linux.ibm.com>
---
 hw/s390x/s390-pci-vfio.c        | 11 +++++++++++
 include/hw/s390x/s390-pci-bus.h |  1 +
 2 files changed, 12 insertions(+)

diff --git a/hw/s390x/s390-pci-vfio.c b/hw/s390x/s390-pci-vfio.c
index 2aefa508a0..99806e2a84 100644
--- a/hw/s390x/s390-pci-vfio.c
+++ b/hw/s390x/s390-pci-vfio.c
@@ -84,6 +84,7 @@ S390PCIDMACount *s390_pci_start_dma_count(S390pciState *s,
     cnt->users = 1;
     cnt->avail = avail;
     QTAILQ_INSERT_TAIL(&s->zpci_dma_limit, cnt, link);
+    pbdev->iommu->max_dma_limit = avail;
     return cnt;
 }
 
@@ -103,6 +104,7 @@ static void s390_pci_read_base(S390PCIBusDevice *pbdev,
     struct vfio_info_cap_header *hdr;
     struct vfio_device_info_cap_zpci_base *cap;
     VFIOPCIDevice *vpci =  container_of(pbdev->pdev, VFIOPCIDevice, pdev);
+    uint64_t vfio_size;
 
     hdr = vfio_get_device_info_cap(info, VFIO_DEVICE_INFO_CAP_ZPCI_BASE);
 
@@ -122,6 +124,15 @@ static void s390_pci_read_base(S390PCIBusDevice *pbdev,
     /* The following values remain 0 until we support other FMB formats */
     pbdev->zpci_fn.fmbl = 0;
     pbdev->zpci_fn.pft = 0;
+
+    /*
+     * If appropriate, reduce the size of the supported DMA aperture reported
+     * to the guest based upon the vfio DMA limit.
+     */
+    vfio_size = pbdev->iommu->max_dma_limit << TARGET_PAGE_BITS;
+    if (vfio_size < (cap->end_dma - cap->start_dma + 1)) {
+        pbdev->zpci_fn.edma = cap->start_dma + vfio_size - 1;
+    }
 }
 
 static bool get_host_fh(S390PCIBusDevice *pbdev, struct vfio_device_info *info,
diff --git a/include/hw/s390x/s390-pci-bus.h b/include/hw/s390x/s390-pci-bus.h
index 0605fcea24..1c46e3a269 100644
--- a/include/hw/s390x/s390-pci-bus.h
+++ b/include/hw/s390x/s390-pci-bus.h
@@ -278,6 +278,7 @@ struct S390PCIIOMMU {
     uint64_t g_iota;
     uint64_t pba;
     uint64_t pal;
+    uint64_t max_dma_limit;
     GHashTable *iotlb;
     S390PCIDMACount *dma_limit;
 };
-- 
2.37.3




* Re: [PATCH 1/3] s390x/pci: RPCIT second pass when mappings exhausted
  2022-10-28 19:47 ` [PATCH 1/3] s390x/pci: RPCIT second pass when mappings exhausted Matthew Rosato
@ 2022-11-04 15:50   ` Eric Farman
  0 siblings, 0 replies; 8+ messages in thread
From: Eric Farman @ 2022-11-04 15:50 UTC
  To: Matthew Rosato, qemu-s390x
  Cc: pmorel, schnelle, cohuck, thuth, pasic, borntraeger,
	richard.henderson, david, qemu-devel

On Fri, 2022-10-28 at 15:47 -0400, Matthew Rosato wrote:
> If we encounter a new mapping while the number of available DMA
> entries
> in vfio is 0, we are currently skipping that mapping which is a
> problem
> if we manage to free up DMA space after that within the same RPCIT --
> we will return to the guest with CC0 and have not mapped everything
> within the specified range.  This issue was uncovered while testing
> changes to the s390 linux kernel iommu/dma code, where a different
> usage pattern was employed (new mappings start at the end of the
> aperture and work back towards the front, making us far more likely
> to encounter new mappings before invalidated mappings during a
> global refresh).
> 
> Fix this by tracking whether any mappings were skipped due to vfio
> DMA limit hitting 0; when this occurs, we still continue the range
> and unmap/map anything we can - then we must re-run the range
> to pick up anything that was missed.  This must occur in a loop until
> all requests are satisfied (success) or we detect that we are still
> unable to complete all mappings (return ZPCI_RPCIT_ST_INSUFF_RES).
> 
> Link:
> https://lore.kernel.org/linux-s390/20221019144435.369902-1-schnelle@linux.ibm.com/
> Fixes: 37fa32de70 ("s390x/pci: Honor DMA limits set by vfio")
> Reported-by: Niklas Schnelle <schnelle@linux.ibm.com>
> Signed-off-by: Matthew Rosato <mjrosato@linux.ibm.com>

Reviewed-by: Eric Farman <farman@linux.ibm.com>

> ---
>  hw/s390x/s390-pci-inst.c | 29 ++++++++++++++++++++++-------
>  1 file changed, 22 insertions(+), 7 deletions(-)
> 
> diff --git a/hw/s390x/s390-pci-inst.c b/hw/s390x/s390-pci-inst.c
> index 20a9bcc7af..7cc4bcf850 100644
> --- a/hw/s390x/s390-pci-inst.c
> +++ b/hw/s390x/s390-pci-inst.c
> @@ -677,8 +677,9 @@ int rpcit_service_call(S390CPU *cpu, uint8_t r1,
> uint8_t r2, uintptr_t ra)
>      S390PCIBusDevice *pbdev;
>      S390PCIIOMMU *iommu;
>      S390IOTLBEntry entry;
> -    hwaddr start, end;
> +    hwaddr start, end, sstart;
>      uint32_t dma_avail;
> +    bool again;
>  
>      if (env->psw.mask & PSW_MASK_PSTATE) {
>          s390_program_interrupt(env, PGM_PRIVILEGED, ra);
> @@ -691,7 +692,7 @@ int rpcit_service_call(S390CPU *cpu, uint8_t r1,
> uint8_t r2, uintptr_t ra)
>      }
>  
>      fh = env->regs[r1] >> 32;
> -    start = env->regs[r2];
> +    sstart = start = env->regs[r2];
>      end = start + env->regs[r2 + 1];
>  
>      pbdev = s390_pci_find_dev_by_fh(s390_get_phb(), fh);
> @@ -732,6 +733,9 @@ int rpcit_service_call(S390CPU *cpu, uint8_t r1,
> uint8_t r2, uintptr_t ra)
>          goto err;
>      }
>  
> + retry:
> +    start = sstart;
> +    again = false;
>      while (start < end) {
>          error = s390_guest_io_table_walk(iommu->g_iota, start,
> &entry);
>          if (error) {
> @@ -739,13 +743,24 @@ int rpcit_service_call(S390CPU *cpu, uint8_t
> r1, uint8_t r2, uintptr_t ra)
>          }
>  
>          start += entry.len;
> -        while (entry.iova < start && entry.iova < end &&
> -               (dma_avail > 0 || entry.perm == IOMMU_NONE)) {
> -            dma_avail = s390_pci_update_iotlb(iommu, &entry);
> -            entry.iova += TARGET_PAGE_SIZE;
> -            entry.translated_addr += TARGET_PAGE_SIZE;
> +        while (entry.iova < start && entry.iova < end) {
> +            if (dma_avail > 0 || entry.perm == IOMMU_NONE) {
> +                dma_avail = s390_pci_update_iotlb(iommu, &entry);
> +                entry.iova += TARGET_PAGE_SIZE;
> +                entry.translated_addr += TARGET_PAGE_SIZE;
> +            } else {
> +                /*
> +                 * We are unable to make a new mapping at this time,
> continue
> +                 * on and hopefully free up more space.  Then
> attempt another
> +                 * pass.
> +                 */
> +                again = true;
> +                break;
> +            }
>          }
>      }
> +    if (again && dma_avail > 0)
> +        goto retry;
>  err:
>      if (error) {
>          pbdev->state = ZPCI_FS_ERROR;



* Re: [PATCH 2/3] s390x/pci: coalesce unmap operations
  2022-10-28 19:47 ` [PATCH 2/3] s390x/pci: coalesce unmap operations Matthew Rosato
@ 2022-11-04 16:01   ` Eric Farman
  0 siblings, 0 replies; 8+ messages in thread
From: Eric Farman @ 2022-11-04 16:01 UTC
  To: Matthew Rosato, qemu-s390x
  Cc: pmorel, schnelle, cohuck, thuth, pasic, borntraeger,
	richard.henderson, david, qemu-devel

On Fri, 2022-10-28 at 15:47 -0400, Matthew Rosato wrote:
> Currently, each unmapped page is handled as an individual iommu
> region notification.  Attempt to group contiguous unmap operations
> into fewer notifications to reduce overhead.
> 
> Signed-off-by: Matthew Rosato <mjrosato@linux.ibm.com>
> ---
>  hw/s390x/s390-pci-inst.c | 51
> ++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 51 insertions(+)
> 
> diff --git a/hw/s390x/s390-pci-inst.c b/hw/s390x/s390-pci-inst.c
> index 7cc4bcf850..66e764f901 100644
> --- a/hw/s390x/s390-pci-inst.c
> +++ b/hw/s390x/s390-pci-inst.c
> @@ -640,6 +640,8 @@ static uint32_t
> s390_pci_update_iotlb(S390PCIIOMMU *iommu,
>          }
>          g_hash_table_remove(iommu->iotlb, &entry->iova);
>          inc_dma_avail(iommu);
> +        /* Don't notify the iommu yet, maybe we can bundle
> contiguous unmaps */
> +        goto out;
>      } else {
>          if (cache) {
>              if (cache->perm == entry->perm &&
> @@ -663,15 +665,44 @@ static uint32_t
> s390_pci_update_iotlb(S390PCIIOMMU *iommu,
>          dec_dma_avail(iommu);
>      }
>  
> +    /*
> +     * All associated iotlb entries have already been cleared,
> trigger the
> +     * unmaps.
> +     */
>      memory_region_notify_iommu(&iommu->iommu_mr, 0, event);
>  
>  out:
>      return iommu->dma_limit ? iommu->dma_limit->avail : 1;
>  }
>  
> +static void s390_pci_batch_unmap(S390PCIIOMMU *iommu, uint64_t iova,
> +                                 uint64_t len)
> +{
> +    uint64_t remain = len, start = iova, end = start + len - 1,
> mask, size;
> +    IOMMUTLBEvent event = {
> +        .type = IOMMU_NOTIFIER_UNMAP,
> +        .entry = {
> +            .target_as = &address_space_memory,
> +            .translated_addr = 0,
> +            .perm = IOMMU_NONE,
> +        },
> +    };
> +
> +    while (remain >= TARGET_PAGE_SIZE) {
> +        mask = dma_aligned_pow2_mask(start, end, 64);
> +        size = mask + 1;
> +        event.entry.iova = start;
> +        event.entry.addr_mask = mask;
> +        memory_region_notify_iommu(&iommu->iommu_mr, 0, event);
> +        start += size;
> +        remain -= size;
> +    }
> +}
> +
>  int rpcit_service_call(S390CPU *cpu, uint8_t r1, uint8_t r2,
> uintptr_t ra)
>  {
>      CPUS390XState *env = &cpu->env;
> +    uint64_t iova, coalesce = 0;
>      uint32_t fh;
>      uint16_t error = 0;
>      S390PCIBusDevice *pbdev;
> @@ -742,6 +773,21 @@ int rpcit_service_call(S390CPU *cpu, uint8_t r1,
> uint8_t r2, uintptr_t ra)
>              break;
>          }
>  
> +        /*
> +         * If this is an unmap of a PTE, let's try to coalesce
> multiple unmaps
> +         * into as few notifier events as possible.
> +         */
> +        if (entry.perm == IOMMU_NONE && entry.len ==
> TARGET_PAGE_SIZE) {
> +            if (coalesce == 0) {
> +                iova = entry.iova;
> +            }
> +            coalesce += entry.len;
> +        } else if (coalesce > 0) {
> +            /* Unleash the coalesced unmap before processing a new
> map */
> +            s390_pci_batch_unmap(iommu, iova, coalesce);
> +            coalesce = 0;
> +        }
> +
>          start += entry.len;
>          while (entry.iova < start && entry.iova < end) {
>              if (dma_avail > 0 || entry.perm == IOMMU_NONE) {
> @@ -759,6 +805,11 @@ int rpcit_service_call(S390CPU *cpu, uint8_t r1,
> uint8_t r2, uintptr_t ra)
>              }
>          }
>      }
> +    if (coalesce) {

I'd guess this should be "coalesce > 0" as above. Regardless,

Reviewed-by: Eric Farman <farman@linux.ibm.com>

> +            /* Unleash the coalesced unmap before finishing rpcit */
> +            s390_pci_batch_unmap(iommu, iova, coalesce);
> +            coalesce = 0;
> +    }
>      if (again && dma_avail > 0)
>          goto retry;
>  err:



* Re: [PATCH 3/3] s390x/pci: shrink DMA aperture to be bound by vfio DMA limit
  2022-10-28 19:47 ` [PATCH 3/3] s390x/pci: shrink DMA aperture to be bound by vfio DMA limit Matthew Rosato
@ 2022-11-04 16:08   ` Eric Farman
  0 siblings, 0 replies; 8+ messages in thread
From: Eric Farman @ 2022-11-04 16:08 UTC
  To: Matthew Rosato, qemu-s390x
  Cc: pmorel, schnelle, cohuck, thuth, pasic, borntraeger,
	richard.henderson, david, qemu-devel

On Fri, 2022-10-28 at 15:47 -0400, Matthew Rosato wrote:
> Currently, s390x-pci performs accounting against the vfio DMA
> limit and triggers the guest to clean up mappings when the limit
> is reached. Let's go a step further and also limit the size of
> the supported DMA aperture reported to the guest based upon the
> initial vfio DMA limit reported for the container (if less than
> the size reported by the firmware/host zPCI layer).  This
> avoids processing sections of the guest DMA table during global
> refresh that, for common use cases, will never be used anyway, and
> makes exhausting the vfio DMA limit due to mismatch between guest
> aperture size and host limit far less likely and more indicative
> of an error.
> 
> Signed-off-by: Matthew Rosato <mjrosato@linux.ibm.com>

Reviewed-by: Eric Farman <farman@linux.ibm.com>

> ---
>  hw/s390x/s390-pci-vfio.c        | 11 +++++++++++
>  include/hw/s390x/s390-pci-bus.h |  1 +
>  2 files changed, 12 insertions(+)
> 
> diff --git a/hw/s390x/s390-pci-vfio.c b/hw/s390x/s390-pci-vfio.c
> index 2aefa508a0..99806e2a84 100644
> --- a/hw/s390x/s390-pci-vfio.c
> +++ b/hw/s390x/s390-pci-vfio.c
> @@ -84,6 +84,7 @@ S390PCIDMACount
> *s390_pci_start_dma_count(S390pciState *s,
>      cnt->users = 1;
>      cnt->avail = avail;
>      QTAILQ_INSERT_TAIL(&s->zpci_dma_limit, cnt, link);
> +    pbdev->iommu->max_dma_limit = avail;
>      return cnt;
>  }
>  
> @@ -103,6 +104,7 @@ static void s390_pci_read_base(S390PCIBusDevice
> *pbdev,
>      struct vfio_info_cap_header *hdr;
>      struct vfio_device_info_cap_zpci_base *cap;
>      VFIOPCIDevice *vpci =  container_of(pbdev->pdev, VFIOPCIDevice,
> pdev);
> +    uint64_t vfio_size;
>  
>      hdr = vfio_get_device_info_cap(info,
> VFIO_DEVICE_INFO_CAP_ZPCI_BASE);
>  
> @@ -122,6 +124,15 @@ static void s390_pci_read_base(S390PCIBusDevice
> *pbdev,
>      /* The following values remain 0 until we support other FMB
> formats */
>      pbdev->zpci_fn.fmbl = 0;
>      pbdev->zpci_fn.pft = 0;
> +
> +    /*
> +     * If appropriate, reduce the size of the supported DMA aperture
> reported
> +     * to the guest based upon the vfio DMA limit.
> +     */
> +    vfio_size = pbdev->iommu->max_dma_limit << TARGET_PAGE_BITS;
> +    if (vfio_size < (cap->end_dma - cap->start_dma + 1)) {
> +        pbdev->zpci_fn.edma = cap->start_dma + vfio_size - 1;
> +    }
>  }
>  
>  static bool get_host_fh(S390PCIBusDevice *pbdev, struct
> vfio_device_info *info,
> diff --git a/include/hw/s390x/s390-pci-bus.h b/include/hw/s390x/s390-
> pci-bus.h
> index 0605fcea24..1c46e3a269 100644
> --- a/include/hw/s390x/s390-pci-bus.h
> +++ b/include/hw/s390x/s390-pci-bus.h
> @@ -278,6 +278,7 @@ struct S390PCIIOMMU {
>      uint64_t g_iota;
>      uint64_t pba;
>      uint64_t pal;
> +    uint64_t max_dma_limit;
>      GHashTable *iotlb;
>      S390PCIDMACount *dma_limit;
>  };




* Re: [PATCH 0/3] s390x/pci: rpcit fixes and enhancements
  2022-10-28 19:47 [PATCH 0/3] s390x/pci: rpcit fixes and enhancements Matthew Rosato
                   ` (2 preceding siblings ...)
  2022-10-28 19:47 ` [PATCH 3/3] s390x/pci: shrink DMA aperture to be bound by vfio DMA limit Matthew Rosato
@ 2022-12-12  8:37 ` Thomas Huth
  3 siblings, 0 replies; 8+ messages in thread
From: Thomas Huth @ 2022-12-12  8:37 UTC
  To: Matthew Rosato, qemu-s390x
  Cc: farman, pmorel, schnelle, cohuck, pasic, borntraeger,
	richard.henderson, david, qemu-devel

On 28/10/2022 21.47, Matthew Rosato wrote:
> The following series fixes an issue with guest RPCIT processing discovered
> during development of [1] as well as proposes a few additional optimizations
> to the current RPCIT codepath.
> 
> [1] https://lore.kernel.org/linux-s390/20221019144435.369902-1-schnelle@linux.ibm.com/
> 
> Matthew Rosato (3):
>    s390x/pci: RPCIT second pass when mappings exhausted
>    s390x/pci: coalesce unmap operations
>    s390x/pci: shrink DMA aperture to be bound by vfio DMA limit

Thanks, I've now queued patches 2 and 3 to my s390x-next branch, too:

  https://gitlab.com/thuth/qemu/-/commits/s390x-next/

  Thomas



