All of lore.kernel.org
 help / color / mirror / Atom feed
From: Peter Xu <peterx@redhat.com>
To: qemu-devel@nongnu.org
Cc: "Michael S . Tsirkin" <mst@redhat.com>,
	Paolo Bonzini <pbonzini@redhat.com>,
	Marcel Apfelbaum <marcel.apfelbaum@gmail.com>,
	Alex Williamson <alex.williamson@redhat.com>,
	Eric Auger <eric.auger@redhat.com>,
	peterx@redhat.com, Jason Wang <jasowang@redhat.com>
Subject: [Qemu-devel] [PATCH 5/5] vfio: retry one more time conditionally for type1 unmap
Date: Tue,  8 Jan 2019 19:47:20 +0800	[thread overview]
Message-ID: <20190108114720.21760-6-peterx@redhat.com> (raw)
In-Reply-To: <20190108114720.21760-1-peterx@redhat.com>

In Linux version v4.15+ there's a bug (introduced in 71a7d3d78e3c,
"vfio/type1: silence integer overflow warning", 2017-10-20) that could
potentially reject a valid unmap region that covers exactly the whole
u64 address space (like iova=0xfef00000, size=2^64-0xfef00000).
Besides a fix on the kernel side, QEMU also needs to live well even
with the old kernels.  When booting a guest with both vfio-pci and
intel-iommu device, we can see error dumped:

  qemu-kvm: VFIO_UNMAP_DMA: -22
  qemu-kvm: vfio_dma_unmap(0x561f059948f0, 0xfef00000,
            0xffffffff01100000) = -22 (Invalid argument)

This patch gives another shot of the UNMAP ioctl if the specific error
was detected, while in the second UNMAP ioctl we skip the last page
assuming that it's never used.  In our case, currently only Intel VT-d
is using this code and it should never use the iova address
2^64-4096 (so far largest supported GAW is 57 bits) so ignoring that
page should be fine.

After this patch is applied, the errors go away.

Reported-by: Pei Zhang <pezhang@redhat.com>
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=1662291
Suggested-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
---
 hw/vfio/common.c     | 16 ++++++++++++++++
 hw/vfio/trace-events |  1 +
 2 files changed, 17 insertions(+)

diff --git a/hw/vfio/common.c b/hw/vfio/common.c
index 7c185e5a2e..7f8de5b7a5 100644
--- a/hw/vfio/common.c
+++ b/hw/vfio/common.c
@@ -221,6 +221,22 @@ static int vfio_dma_unmap(VFIOContainer *container,
     };
 
     if (ioctl(container->fd, VFIO_IOMMU_UNMAP_DMA, &unmap)) {
+        /*
+         * Give it another shot due to a bug in kernel (v4.15-v4.20)
+         * that could potentially reject a region that exactly covers
+         * the whole u64 address space (71a7d3d78e3c, "vfio/type1:
+         * silence integer overflow warning", 2017-10-20).  If that
+         * happens, we retry for one more time assuming that the last
+         * page of the address space (2^64-getpagesize()) has already
+         * been dropped.
+         */
+        if (errno == EINVAL && unmap.size && unmap.iova + unmap.size == 0) {
+            trace_vfio_dma_unmap_workaround_overflow();
+            unmap.size -= getpagesize();
+            if (ioctl(container->fd, VFIO_IOMMU_UNMAP_DMA, &unmap) == 0) {
+                return 0;
+            }
+        }
         error_report("VFIO_UNMAP_DMA: %d", -errno);
         return -errno;
     }
diff --git a/hw/vfio/trace-events b/hw/vfio/trace-events
index a85e8662ea..2c9db4fb9a 100644
--- a/hw/vfio/trace-events
+++ b/hw/vfio/trace-events
@@ -110,6 +110,7 @@ vfio_region_mmaps_set_enabled(const char *name, bool enabled) "Region %s mmaps e
 vfio_region_sparse_mmap_header(const char *name, int index, int nr_areas) "Device %s region %d: %d sparse mmap entries"
 vfio_region_sparse_mmap_entry(int i, unsigned long start, unsigned long end) "sparse entry %d [0x%lx - 0x%lx]"
 vfio_get_dev_region(const char *name, int index, uint32_t type, uint32_t subtype) "%s index %d, %08x/%0x8"
+vfio_dma_unmap_workaround_overflow(void) ""
 
 # hw/vfio/platform.c
 vfio_platform_base_device_init(char *name, int groupid) "%s belongs to group #%d"
-- 
2.17.1

  parent reply	other threads:[~2019-01-08 11:48 UTC|newest]

Thread overview: 10+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-01-08 11:47 [Qemu-devel] [PATCH 0/5] intel_iommu: misc fixes for error exposed after error_report_once() Peter Xu
2019-01-08 11:47 ` [Qemu-devel] [PATCH 1/5] intel_iommu: fix operator in vtd_switch_address_space Peter Xu
2019-01-11  4:03   ` Jason Wang
2019-01-08 11:47 ` [Qemu-devel] [PATCH 2/5] intel_iommu: reset intr_enabled when system reset Peter Xu
2019-01-11  4:04   ` Jason Wang
2019-01-08 11:47 ` [Qemu-devel] [PATCH 3/5] pci/msi: export msi_is_masked() Peter Xu
2019-01-08 11:47 ` [Qemu-devel] [PATCH 4/5] i386/kvm: ignore masked irqs when update msi routes Peter Xu
2019-01-08 11:47 ` Peter Xu [this message]
2019-01-08 15:23   ` [Qemu-devel] [PATCH 5/5] vfio: retry one more time conditionally for type1 unmap Alex Williamson
2019-01-09  2:53     ` Peter Xu

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20190108114720.21760-6-peterx@redhat.com \
    --to=peterx@redhat.com \
    --cc=alex.williamson@redhat.com \
    --cc=eric.auger@redhat.com \
    --cc=jasowang@redhat.com \
    --cc=marcel.apfelbaum@gmail.com \
    --cc=mst@redhat.com \
    --cc=pbonzini@redhat.com \
    --cc=qemu-devel@nongnu.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.