All of lore.kernel.org
 help / color / mirror / Atom feed
From: Peter Xu <peterx@redhat.com>
To: qemu-devel@nongnu.org
Cc: Tian Kevin <kevin.tian@intel.com>,
	"Michael S . Tsirkin" <mst@redhat.com>,
	Alex Williamson <alex.williamson@redhat.com>,
	Jintack Lim <jintack@cs.columbia.edu>,
	peterx@redhat.com, Jason Wang <jasowang@redhat.com>
Subject: [Qemu-devel] [PATCH v2 09/10] intel-iommu: don't unmap all for shadow page table
Date: Fri,  4 May 2018 11:08:10 +0800	[thread overview]
Message-ID: <20180504030811.28111-10-peterx@redhat.com> (raw)
In-Reply-To: <20180504030811.28111-1-peterx@redhat.com>

IOMMU replay was carried out before in many use cases, e.g., context
cache invalidations, domain flushes.  We used this mechanism to sync the
shadow page table by firstly (1) unmap the whole address space, then
(2) walk the page table to remap what's in the table.

This is very dangerous.

The problem is that we'll have a very small window (in my measurement,
it can be about 3ms) during above step (1) and (2) that the device will
see no (or incomplete) device page table.  Howerver the device never
knows that.  This can cause DMA error of devices, who assumes the page
table is always there.

So the point is that, for MAP typed notifiers (vfio-pci, for example)
they'll need the mapped page entries always be there.  We can never
unmap any existing page entries like what we did in (1) above.

The only solution is to remove step (1).  We can't do that before since
we didn't know what device page was mapped and what was not, so we unmap
them all.  Now with the new IOVA tree QEMU knows what has mapped and
what has not.  We don't need this step (1) any more.  Remove it.

Note that after removing that global unmap flushing, we'll need to
notify unmap now during page walkings.

This should fix the DMA error problem that Jintack Lim reported with
nested device assignment.  This problem won't not happen always, e.g., I
cannot reproduce the error.  However after collecting logs it shows that
this is the possible cause to Jintack's problem.

Reported-by: Jintack Lim <jintack@cs.columbia.edu>
Signed-off-by: Peter Xu <peterx@redhat.com>
---
 hw/i386/intel_iommu.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c
index 16b3514afb..9439103cac 100644
--- a/hw/i386/intel_iommu.c
+++ b/hw/i386/intel_iommu.c
@@ -3003,10 +3003,8 @@ static void vtd_iommu_replay(IOMMUMemoryRegion *iommu_mr, IOMMUNotifier *n)
 
     /*
      * The replay can be triggered by either a invalidation or a newly
-     * created entry. No matter what, we release existing mappings
-     * (it means flushing caches for UNMAP-only registers).
+     * created entry.
      */
-    vtd_address_space_unmap(vtd_as, n);
 
     if (vtd_dev_to_context_entry(s, bus_n, vtd_as->devfn, &ce) == 0) {
         trace_vtd_replay_ce_valid(bus_n, PCI_SLOT(vtd_as->devfn),
@@ -3015,8 +3013,10 @@ static void vtd_iommu_replay(IOMMUMemoryRegion *iommu_mr, IOMMUNotifier *n)
                                   ce.hi, ce.lo);
         if (vtd_as_notify_mappings(vtd_as)) {
             /* This is required only for MAP typed notifiers */
-            vtd_page_walk(&ce, 0, ~0ULL, vtd_replay_hook, (void *)n, false,
+            vtd_page_walk(&ce, 0, ~0ULL, vtd_replay_hook, (void *)n, true,
                           vtd_as);
+        } else {
+            vtd_address_space_unmap(vtd_as, n);
         }
     } else {
         trace_vtd_replay_ce_invalid(bus_n, PCI_SLOT(vtd_as->devfn),
-- 
2.17.0

  parent reply	other threads:[~2018-05-04  3:09 UTC|newest]

Thread overview: 40+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-05-04  3:08 [Qemu-devel] [PATCH v2 00/10] intel-iommu: nested vIOMMU, cleanups, bug fixes Peter Xu
2018-05-04  3:08 ` [Qemu-devel] [PATCH v2 01/10] intel-iommu: send PSI always even if across PDEs Peter Xu
2018-05-17 14:42   ` Auger Eric
2018-05-18  3:41     ` Peter Xu
2018-05-18  7:39       ` Auger Eric
2018-05-04  3:08 ` [Qemu-devel] [PATCH v2 02/10] intel-iommu: remove IntelIOMMUNotifierNode Peter Xu
2018-05-17  9:46   ` Auger Eric
2018-05-17 10:02     ` Peter Xu
2018-05-17 10:10       ` Auger Eric
2018-05-17 10:14         ` Peter Xu
2018-05-04  3:08 ` [Qemu-devel] [PATCH v2 03/10] intel-iommu: add iommu lock Peter Xu
2018-05-17 14:32   ` Auger Eric
2018-05-18  5:32     ` Peter Xu
2018-05-04  3:08 ` [Qemu-devel] [PATCH v2 04/10] intel-iommu: only do page walk for MAP notifiers Peter Xu
2018-05-17 13:39   ` Auger Eric
2018-05-18  5:53     ` Peter Xu
2018-05-18  7:38       ` Auger Eric
2018-05-18 10:02         ` Peter Xu
2018-05-04  3:08 ` [Qemu-devel] [PATCH v2 05/10] intel-iommu: introduce vtd_page_walk_info Peter Xu
2018-05-17 14:32   ` Auger Eric
2018-05-18  5:59     ` Peter Xu
2018-05-18  7:24       ` Auger Eric
2018-05-04  3:08 ` [Qemu-devel] [PATCH v2 06/10] intel-iommu: pass in address space when page walk Peter Xu
2018-05-17 14:32   ` Auger Eric
2018-05-18  6:02     ` Peter Xu
2018-05-04  3:08 ` [Qemu-devel] [PATCH v2 07/10] util: implement simple interval tree logic Peter Xu
2018-05-04  3:08 ` [Qemu-devel] [PATCH v2 08/10] intel-iommu: maintain per-device iova ranges Peter Xu
2018-05-04  3:08 ` Peter Xu [this message]
2018-05-17 17:23   ` [Qemu-devel] [PATCH v2 09/10] intel-iommu: don't unmap all for shadow page table Auger Eric
2018-05-18  6:06     ` Peter Xu
2018-05-18  7:31       ` Auger Eric
2018-05-04  3:08 ` [Qemu-devel] [PATCH v2 10/10] intel-iommu: remove notify_unmap for page walk Peter Xu
2018-05-04  3:20 ` [Qemu-devel] [PATCH v2 00/10] intel-iommu: nested vIOMMU, cleanups, bug fixes no-reply
2018-05-04  3:40   ` Peter Xu
2018-05-08  7:29 ` [Qemu-devel] [PATCH v2 11/10] tests: add interval tree unit test Peter Xu
2018-05-16  6:30 ` [Qemu-devel] [PATCH v2 00/10] intel-iommu: nested vIOMMU, cleanups, bug fixes Peter Xu
2018-05-16 13:57   ` Jason Wang
2018-05-17  2:45     ` Peter Xu
2018-05-17  3:39       ` Alex Williamson
2018-05-17  4:16         ` Peter Xu

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20180504030811.28111-10-peterx@redhat.com \
    --to=peterx@redhat.com \
    --cc=alex.williamson@redhat.com \
    --cc=jasowang@redhat.com \
    --cc=jintack@cs.columbia.edu \
    --cc=kevin.tian@intel.com \
    --cc=mst@redhat.com \
    --cc=qemu-devel@nongnu.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.