From: David Gibson <david@gibson.dropbear.id.au>
To: "Liu, Yi L" <yi.l.liu@intel.com>
Cc: "qemu-devel@nongnu.org" <qemu-devel@nongnu.org>,
	"pbonzini@redhat.com" <pbonzini@redhat.com>,
	"alex.williamson@redhat.com" <alex.williamson@redhat.com>,
	"peterx@redhat.com" <peterx@redhat.com>,
	"mst@redhat.com" <mst@redhat.com>,
	"eric.auger@redhat.com" <eric.auger@redhat.com>,
	"Tian, Kevin" <kevin.tian@intel.com>,
	"Tian, Jun J" <jun.j.tian@intel.com>,
	"Sun, Yi Y" <yi.y.sun@intel.com>,
	"kvm@vger.kernel.org" <kvm@vger.kernel.org>,
	"Wu, Hao" <hao.wu@intel.com>,
	Jacob Pan <jacob.jun.pan@linux.intel.com>,
	Yi Sun <yi.y.sun@linux.intel.com>
Subject: Re: [RFC v3 02/25] hw/iommu: introduce DualStageIOMMUObject
Date: Wed, 12 Feb 2020 17:32:53 +1100
Message-ID: <20200212063253.GA22584@umbus.fritz.box>
In-Reply-To: <A2975661238FB949B60364EF0F2C25743A1992F1@SHSMSX104.ccr.corp.intel.com>

On Fri, Jan 31, 2020 at 11:42:06AM +0000, Liu, Yi L wrote:
> Hi David,
> 
> > From: David Gibson [mailto:david@gibson.dropbear.id.au]
> > Sent: Friday, January 31, 2020 11:59 AM
> > To: Liu, Yi L <yi.l.liu@intel.com>
> > Subject: Re: [RFC v3 02/25] hw/iommu: introduce DualStageIOMMUObject
> > 
> > On Wed, Jan 29, 2020 at 04:16:33AM -0800, Liu, Yi L wrote:
> > > From: Liu Yi L <yi.l.liu@intel.com>
> > >
> > > Currently, many platform vendors provide the capability of dual stage
> > > DMA address translation in hardware. For example, nested translation
> > > on Intel VT-d scalable mode, nested stage translation on ARM SMMUv3,
> > > > etc. In dual stage DMA address translation, there are two stages of
> > > > address translation, stage-1 (a.k.a. first-level) and stage-2 (a.k.a.
> > > > second-level) translation structures. Stage-1 translation results are
> > > > also subject to stage-2 translation. Taking vSVA (Virtual
> > > > Shared Virtual Addressing) as an example, the guest IOMMU driver owns
> > > > the stage-1 translation structures (covering GVA->GPA translation), and
> > > > the host IOMMU driver owns the stage-2 translation structures (covering
> > > > GPA->HPA translation). The VMM is responsible for binding the stage-1
> > > > translation structures to the host, so that hardware can perform
> > > > GVA->GPA and then GPA->HPA translation. For more background on SVA,
> > > > refer to the links below.
> > >  - https://www.youtube.com/watch?v=Kq_nfGK5MwQ
> > >  - https://events19.lfasiallc.com/wp-content/uploads/2017/11/\
> > > Shared-Virtual-Memory-in-KVM_Yi-Liu.pdf
> > >
> > > > As described above, dual stage DMA translation offers two stages of
> > > > address mapping, which can provide better DMA address translation
> > > > support for passthru devices. This is also what vIOMMU developers have
> > > > been working on. Efforts include vSVA enabling from Yi Liu and SMMUv3
> > > > Nested Stage Setup from Eric Auger.
> > > https://www.spinics.net/lists/kvm/msg198556.html
> > > https://lists.gnu.org/archive/html/qemu-devel/2019-07/msg02842.html
> > >
> > > > Both efforts aim to expose a vIOMMU backed by dual stage hardware.
> > > > As such, QEMU needs an explicit object to stand for the dual stage
> > > > capability of the hardware. Such an object offers an abstraction
> > > > for the dual stage DMA translation related operations, like:
> > >
> > >  1) PASID allocation (allow host to intercept in PASID allocation)
> > >  2) bind stage-1 translation structures to host
> > >  3) propagate stage-1 cache invalidation to host
> > >  4) DMA address translation fault (I/O page fault) servicing etc.
> > >
> > > This patch introduces DualStageIOMMUObject to stand for the hardware
> > > > dual stage DMA translation capability. PASID allocation/free are the
> > > > first operations included in it; in future, there will be more operations
> > > > such as bind_stage1_pgtbl and invalidate_stage1_cache.
> > >
> > > Cc: Kevin Tian <kevin.tian@intel.com>
> > > Cc: Jacob Pan <jacob.jun.pan@linux.intel.com>
> > > Cc: Peter Xu <peterx@redhat.com>
> > > Cc: Eric Auger <eric.auger@redhat.com>
> > > Cc: Yi Sun <yi.y.sun@linux.intel.com>
> > > Cc: David Gibson <david@gibson.dropbear.id.au>
> > > Signed-off-by: Liu Yi L <yi.l.liu@intel.com>
> > 
> > Several overall queries about this:
> > 
> > 1) Since it's explicitly handling PASIDs, this seems a lot more
> >    specific to SVM than the name suggests.  I'd suggest a rename.
> 
> It will not be specific to SVM in future. We have efforts to move guest
> IOVA support onto the host IOMMU's dual-stage DMA translation
> capability.

It's assuming the existence of pasids though, which is a rather more
specific model than simply having two translation stages.

> Then, guest IOVA support will also re-use the methods
> provided by this abstraction layer, e.g. bind_guest_pgtbl() and
> flush_iommu_iotlb().
> 
> For the naming, how about HostIOMMUContext? This layer provides
> explicit methods for setting up dual-stage DMA translation in the host.

Uh.. maybe?  I'm still having trouble figuring out what this object
really represents.

> > 2) Why are you hand rolling structures of pointers, rather than making
> >    this a QOM class or interface and putting those things into methods?
> 
> Maybe the name is not appropriate. Although I named it DualStageIOMMUObject,
> it is actually the kind of abstraction layer we discussed in a previous email.
> I think this is similar to VFIO_MAP/UNMAP. The difference is that VFIO_MAP/
> UNMAP programs mappings into the host IOMMU domain, while the newly added
> explicit method links the guest page table to the host IOMMU domain.
> VFIO_MAP/UNMAP is exposed to vIOMMU emulators via the MemoryRegion layer,
> right? Maybe adding a similar abstraction layer is enough. Is adding QOM
> really necessary for this case?

Um... sorry, I'm having a lot of trouble making any sense of that.
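
For readers following question 2, the "structures of pointers" pattern under discussion can be sketched roughly as below. This is a minimal illustration only, not code from the patch series: the type names (HostIOMMUSketch, HostIOMMUSketchOps), the wrapper functions, and the toy bump-allocator backend are all hypothetical, and only the PASID alloc/free operations named in the commit message are modelled.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct HostIOMMUSketch HostIOMMUSketch;

/* Hypothetical hand-rolled ops table: one function pointer per operation. */
typedef struct {
    int (*pasid_alloc)(HostIOMMUSketch *ctx, uint32_t min, uint32_t max,
                       uint32_t *pasid);
    int (*pasid_free)(HostIOMMUSketch *ctx, uint32_t pasid);
} HostIOMMUSketchOps;

struct HostIOMMUSketch {
    const HostIOMMUSketchOps *ops;
    uint32_t next_pasid;            /* state used only by the toy backend */
};

/* Wrappers called by a vIOMMU emulator; return -1 if the op is missing. */
static int sketch_pasid_alloc(HostIOMMUSketch *ctx, uint32_t min,
                              uint32_t max, uint32_t *pasid)
{
    assert(ctx != NULL && ctx->ops != NULL && pasid != NULL);
    if (!ctx->ops->pasid_alloc) {
        return -1;
    }
    return ctx->ops->pasid_alloc(ctx, min, max, pasid);
}

static int sketch_pasid_free(HostIOMMUSketch *ctx, uint32_t pasid)
{
    assert(ctx != NULL && ctx->ops != NULL);
    if (!ctx->ops->pasid_free) {
        return -1;
    }
    return ctx->ops->pasid_free(ctx, pasid);
}

/* Toy backend: hands out increasing PASIDs within [min, max]. */
static int toy_pasid_alloc(HostIOMMUSketch *ctx, uint32_t min, uint32_t max,
                           uint32_t *pasid)
{
    uint32_t candidate = ctx->next_pasid < min ? min : ctx->next_pasid;

    if (candidate > max) {
        return -1;                  /* range exhausted */
    }
    *pasid = candidate;
    ctx->next_pasid = candidate + 1;
    return 0;
}

/* The toy backend deliberately leaves pasid_free unimplemented (NULL). */
static const HostIOMMUSketchOps toy_ops = {
    .pasid_alloc = toy_pasid_alloc,
};
```

A QOM class or interface would carry the same function pointers in its class structure, which is the alternative the question raises; the sketch only shows the hand-rolled variant.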

> > 3) It's not really clear to me if this is for the case where both
> >    stages of translation are visible to the guest, or only one of
> >    them.
> 
> For this case, the vIOMMU will only expose a single stage of translation to
> the VM. E.g. on Intel VT-d, the vIOMMU exposes first-level translation to the
> guest. Hardware IOMMUs with the dual-stage translation capability let the
> guest own the stage-1 translation structures while the host owns the stage-2
> translation structures. The VMM is responsible for binding the guest's
> translation structures to the host and enabling dual-stage translation, e.g.
> on Intel VT-d, by configuring the translation type to be NESTED.

Ok, understood.

> Take guest SVM as an example: the guest IOMMU driver owns the gVA->gPA
> mappings, which are treated as stage-1 translation from the host's point of
> view. The host itself owns the gPA->hPA translation, which is called stage-2
> translation when dual-stage translation is configured.
> 
> Guest IOVA is similar to guest SVM. The guest IOMMU driver owns the
> gIOVA->gPA mappings, which are treated as stage-1 translation. The host owns
> the gPA->hPA translation.

Ok, that makes sense.  It's still not really clear to me which part of
this setup this object represents.
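
The two-stage composition described in the quoted text (guest-owned gVA->gPA or gIOVA->gPA, then host-owned gPA->hPA) can be sketched as a toy model like the one below. This is an illustration under simplifying assumptions, not how any real IOMMU or the patch series implements it: the flat array "page table", the 4 KiB page size, and all names (PageMap, Stage, stage_translate, nested_translate) are hypothetical, and real hardware also translates the stage-1 table walk itself through stage-2.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PAGE_SHIFT 12
#define PAGE_MASK  ((UINT64_C(1) << PAGE_SHIFT) - 1)

/* One page-granular mapping: input page frame -> output page frame. */
typedef struct {
    uint64_t in_pfn;
    uint64_t out_pfn;
} PageMap;

/* Hypothetical flat "page table": a linear array of mappings. */
typedef struct {
    const PageMap *maps;
    size_t count;
} Stage;

/* Translate through one stage; returns 0 on success, -1 on a fault. */
static int stage_translate(const Stage *s, uint64_t in, uint64_t *out)
{
    assert(s != NULL && out != NULL);
    for (size_t i = 0; i < s->count; i++) {
        if (s->maps[i].in_pfn == (in >> PAGE_SHIFT)) {
            /* Swap the page frame, keep the offset within the page. */
            *out = (s->maps[i].out_pfn << PAGE_SHIFT) | (in & PAGE_MASK);
            return 0;
        }
    }
    return -1;
}

/*
 * Nested translation: stage-1 (guest-owned, gVA or gIOVA -> gPA) followed
 * by stage-2 (host-owned, gPA -> hPA). A miss in either stage is a fault.
 */
static int nested_translate(const Stage *stage1, const Stage *stage2,
                            uint64_t gva, uint64_t *hpa)
{
    uint64_t gpa;

    if (stage_translate(stage1, gva, &gpa) < 0) {
        return -1;
    }
    return stage_translate(stage2, gpa, hpa);
}
```

In this model the guest only ever edits the stage-1 array and the host only ever edits the stage-2 array, which mirrors the ownership split described in the thread.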

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson
