All of lore.kernel.org
 help / color / mirror / Atom feed
From: Peter Xu <peterx@redhat.com>
To: Eric Auger <eric.auger@redhat.com>
Cc: qemu-devel@nongnu.org, Jason Wang <jasowang@redhat.com>,
	"Michael S . Tsirkin" <mst@redhat.com>,
	Yi Liu <yi.l.liu@intel.com>,
	Alex Williamson <alex.williamson@redhat.com>
Subject: Re: [PATCH] intel-iommu: Document iova_tree
Date: Thu, 1 Dec 2022 14:44:24 -0500	[thread overview]
Message-ID: <Y4kEGP3kHeo8HttC@x1n> (raw)
In-Reply-To: <Y4j+8y8EnWkZor2+@x1n>

[-- Attachment #1: Type: text/plain, Size: 3772 bytes --]

On Thu, Dec 01, 2022 at 02:22:27PM -0500, Peter Xu wrote:
> On Thu, Dec 01, 2022 at 07:17:41PM +0100, Eric Auger wrote:
> > Hi Peter
> 
> Hi, Eric,
> 
> > 
> > On 12/1/22 17:25, Peter Xu wrote:
> > > It seems not super clear on when iova_tree is used, and why.  Add a rich
> > > comment above iova_tree to track why we needed the iova_tree, and when we
> > > need it.
> > >
> > > Suggested-by: Jason Wang <jasowang@redhat.com>
> > > Signed-off-by: Peter Xu <peterx@redhat.com>
> > > ---
> > >  include/hw/i386/intel_iommu.h | 30 +++++++++++++++++++++++++++++-
> > >  1 file changed, 29 insertions(+), 1 deletion(-)
> > >
> > > diff --git a/include/hw/i386/intel_iommu.h b/include/hw/i386/intel_iommu.h
> > > index 46d973e629..8d130ab2e3 100644
> > > --- a/include/hw/i386/intel_iommu.h
> > > +++ b/include/hw/i386/intel_iommu.h
> > > @@ -109,7 +109,35 @@ struct VTDAddressSpace {
> > >      QLIST_ENTRY(VTDAddressSpace) next;
> > >      /* Superset of notifier flags that this address space has */
> > >      IOMMUNotifierFlag notifier_flags;
> > > -    IOVATree *iova_tree;          /* Traces mapped IOVA ranges */
> > > +    /*
> > > +     * @iova_tree traces mapped IOVA ranges.
> > > +     *
> > > +     * The tree is not needed if no MAP notifiers is registered with
> > 
> > s/no MAP notifiers/no MAP notifier
> 
> Will fix.
> 
> > > +     * current VTD address space, because all UNMAP (including iotlb or
> > > +     * dev-iotlb) events can be transparently delivered to !MAP iommu
> > > +     * notifiers.
> > because all UNMAP notifications (iotlb or dev-iotlb) can be triggered
> > directly, as opposed to MAP notifications. (?)
> 
> What I wanted to say is any PSI or DSI messages we got from the guest can
> be transparently delivered to QEMU's iommu notifiers.  I'm not sure
> "triggered directly" best describe the case here.
> 
> PSI: Page Selective Invalidations
> DSI: Domain Selective Invalidations
> 
> Sorry to mention these terms again, but that's really what the "transparent
> delivery" means here - we get the PSI/DSI messages, then we notify with the
> same ranges in IOMMU notifiers.  They're not the same concept but we do
> that transparently without changing the core of the messages.
> 
> Maybe I should spell out "!MAP" as "UNMAP-only"?  Would that help?
> 
> > > +     *
> > > +     * The tree OTOH is required for MAP typed iommu notifiers for a few
> > > +     * reasons.
> > > +     *
> > > +     * Firstly, there's no way to identify whether an PSI event is MAP or
> > maybe give the decryption of the 'PSI' and 'DSI" acronyms once ;-)
> 
> Please see above. :)
> 
> These are VT-d terms used in multiple places in the .[ch] files, I assume
> I'll just keep using them because otherwise I'll need to comment them
> everytime we use any PSI/DSI terms.  It might become an overkill I'm afraid.
> 
> > > +     * UNMAP within the PSI message itself.  Without having prior knowledge
> > > +     * of existing state vIOMMU doesn't know whether it should notify MAP
> > > +     * or UNMAP for a PSI message it received.
> > > +     *
> > > +     * Secondly, PSI received from guest driver (or even a large PSI can
> > > +     * grow into a DSI at least with Linux intel-iommu driver) can be
> > > +     * larger in range than the newly mapped ranges for either MAP or UNMAP
> > > +     * events. If it directly pass-throughs any such event it may confuse
> > 
> > If it directly notifies the registered device with the unmodified range, it may confuse the drivers ../..
> 
> Will fix.
> 
> > 
> > So the range of the MAP notification can be adapted based on the existing IOVA mappings.  
> 
> Yes, e.g. the iova tree makes sure we don't map something again if it's mapped.

I figured maybe simpler I just attach a v2..

Thanks,

-- 
Peter Xu

[-- Attachment #2: 0001-intel-iommu-Document-iova_tree.patch --]
[-- Type: text/plain, Size: 2862 bytes --]

From 67594a4bfad481a2810b69e2b17e6399f39a18a2 Mon Sep 17 00:00:00 2001
From: Peter Xu <peterx@redhat.com>
Date: Thu, 1 Dec 2022 11:11:41 -0500
Subject: [PATCH] intel-iommu: Document iova_tree
Content-type: text/plain

It seems not super clear on when iova_tree is used, and why.  Add a rich
comment above iova_tree to track why we needed the iova_tree, and when we
need it.

Suggested-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
---
 include/hw/i386/intel_iommu.h | 37 ++++++++++++++++++++++++++++++++++-
 1 file changed, 36 insertions(+), 1 deletion(-)

diff --git a/include/hw/i386/intel_iommu.h b/include/hw/i386/intel_iommu.h
index 46d973e629..d96da8cc75 100644
--- a/include/hw/i386/intel_iommu.h
+++ b/include/hw/i386/intel_iommu.h
@@ -109,7 +109,42 @@ struct VTDAddressSpace {
     QLIST_ENTRY(VTDAddressSpace) next;
     /* Superset of notifier flags that this address space has */
     IOMMUNotifierFlag notifier_flags;
-    IOVATree *iova_tree;          /* Traces mapped IOVA ranges */
+    /*
+     * @iova_tree traces mapped IOVA ranges.
+     *
+     * The tree is not needed if no MAP notifier is registered with current
+     * VTD address space, because all UNMAP events the vIOMMU receives (for
+     * either iotlb or dev-iotlb) can be transparently delivered to iommu
+     * notifiers.
+     *
+     * The tree OTOH is required for MAP typed iommu notifiers for a few
+     * reasons.
+     *
+     * Firstly, there's no way to identify whether an PSI/DSI event is an
+     * MAP or UNMAP event within the message itself.  Without having prior
+     * knowledge of existing state vIOMMU doesn't know whether it should
+     * notify MAP or UNMAP for a PSI message it received when caching mode
+     * is enabled (for MAP notifiers).
+     *
+     * Secondly, PSI messages received from guest driver can be enlarged in
+     * range, covers but not limited to what the guest driver wanted to
+     * invalidate.  When the range to invalidates gets bigger than the
+     * limit of a PSI message, it can even become a DSI which will
+     * invalidate the whole domain.  If the vIOMMU directly notifies the
+     * registered device with the unmodified range, it may confuse the
+     * registered drivers (e.g. vfio-pci) on either:
+     *
+     *   (1) Trying to map the same region more than once (for
+     *       VFIO_IOMMU_MAP_DMA, -EEXIST will trigger), or,
+     *
+     *   (2) Trying to UNMAP a range that is still partially mapped.
+     *
+     * That accuracy is not required for UNMAP-only notifiers, but it is a
+     * must-to-have for notifiers registered with MAP events, because the
+     * vIOMMU needs to make sure the shadow page table is always in sync
+     * with the guest IOMMU pgtables for a device.
+     */
+    IOVATree *iova_tree;
 };
 
 struct VTDIOTLBEntry {
-- 
2.37.3


  reply	other threads:[~2022-12-01 19:44 UTC|newest]

Thread overview: 13+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-12-01 16:25 [PATCH] intel-iommu: Document iova_tree Peter Xu
2022-12-01 18:17 ` Eric Auger
2022-12-01 19:22   ` Peter Xu
2022-12-01 19:44     ` Peter Xu [this message]
2022-12-06 13:06     ` Eric Auger
2022-12-06 16:02       ` Peter Xu
2022-12-05  4:23 ` Jason Wang
2022-12-05 23:28   ` Peter Xu
2022-12-06  7:04     ` Jason Wang
2022-12-06 13:16     ` Eric Auger
2022-12-06 16:05       ` Peter Xu
2022-12-06 16:28         ` Eric Auger
2022-12-06 22:09           ` Peter Xu

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=Y4kEGP3kHeo8HttC@x1n \
    --to=peterx@redhat.com \
    --cc=alex.williamson@redhat.com \
    --cc=eric.auger@redhat.com \
    --cc=jasowang@redhat.com \
    --cc=mst@redhat.com \
    --cc=qemu-devel@nongnu.org \
    --cc=yi.l.liu@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.