From: Alex Williamson <alex.williamson@redhat.com>
To: Jason Gunthorpe <jgg@nvidia.com>
Cc: "Raj, Ashok" <ashok.raj@intel.com>,
	Jean-Philippe Brucker <jean-philippe@linaro.org>,
	Jason Wang <jasowang@redhat.com>, Liu Yi L <yi.l.liu@intel.com>,
	<eric.auger@redhat.com>, <baolu.lu@linux.intel.com>,
	<joro@8bytes.org>, <kevin.tian@intel.com>,
	<jacob.jun.pan@linux.intel.com>, <jun.j.tian@intel.com>,
	<yi.y.sun@intel.com>, <peterx@redhat.com>, <hao.wu@intel.com>,
	<stefanha@gmail.com>, <iommu@lists.linux-foundation.org>,
	<kvm@vger.kernel.org>, "Michael S. Tsirkin" <mst@redhat.com>
Subject: Re: [PATCH v7 00/16] vfio: expose virtual Shared Virtual Addressing to VMs
Date: Mon, 14 Sep 2020 16:33:10 -0600
Message-ID: <20200914163310.450c8d6e@x1.home>
In-Reply-To: <20200914190057.GM904879@nvidia.com>

On Mon, 14 Sep 2020 16:00:57 -0300
Jason Gunthorpe <jgg@nvidia.com> wrote:

> On Mon, Sep 14, 2020 at 12:23:28PM -0600, Alex Williamson wrote:
> > On Mon, 14 Sep 2020 14:41:21 -0300
> > Jason Gunthorpe <jgg@nvidia.com> wrote:
> >   
> > > On Mon, Sep 14, 2020 at 10:58:57AM -0600, Alex Williamson wrote:
> > >    
> > > > "its own special way" is arguable, VFIO is just making use of what's
> > > > being proposed as the uapi via its existing IOMMU interface.    
> > > 
> > > I mean, if we have a /dev/sva then it makes no sense to extend the
> > > VFIO interfaces with the same stuff. VFIO should simply accept a PASID
> > > created from /dev/sva and use it just like any other user-DMA driver
> > > would.  
> > 
> > I don't think that's absolutely true.  By the same logic, we could say
> > that pci-sysfs provides access to PCI BAR and config space
> > resources,  
> 
> No, it is the reverse, VFIO is a better version of pci-sysfs, so
> pci-sysfs is the one that is obsoleted by VFIO. Similarly a /dev/sva
> would be the superset interface for PASID, so whatever VFIO has would
> be obsoleted.
> 
> It would be very unusual for the kernel to have two 'preferred'
> interfaces for the same thing, IMHO. The review process for uAPI
> should really prevent that by allowing all interests to be served
> while the uAPI is designed.
> 
> > the VFIO device interface duplicates part of that interface therefore it
> > should be abandoned.  But in reality, VFIO providing access to those
> > resources puts those accesses within the scope and control of the VFIO
> > interface.  
> 
> Not clear to me why VFIO needs that. PASID seems quite orthogonal from
> VFIO to me.


Can you explain that further, or spit-ball what you think this /dev/sva
interface looks like and how a user might interact between vfio and
this new interface?  The interface proposed here definitely does not
seem orthogonal to the vfio IOMMU interface, i.e. selecting a specific
IOMMU domain mode during vfio setup, allocating pasids and associating
them with page tables for that two-stage IOMMU setup, performing cache
invalidations based on page table updates, etc.  How does it make more
sense for a vIOMMU to set up some aspects of the IOMMU through vfio and
others through a TBD interface?


> > > This has already happened, the SVA patches generally allow unpriv user
> > > space to allocate a PASID for their process.
> > > 
> > > If a device implements a mdev shared with a kernel driver (like IDXD)
> > > then it will be sharing that PASID pool across both drivers. In this
> > > case it makes no sense that VFIO has PASID quota logic because it has
> > > an incomplete view. It could only make sense if VFIO is the exclusive
> > > owner of the bus/device/function.
> > > 
> > > The tracking logic needs to be global. Most probably in some kind of
> > > PASID cgroup controller?  
> > 
> > AIUI, that doesn't exist yet, so it makes sense that VFIO, as the
> > mechanism through which a user would allocate a PASID,   
> 
> VFIO is not the exclusive user interface for PASID. Other SVA drivers
> will allocate PASIDs. Any quota has to be implemented by the IOMMU
> layer, and shared across all drivers.


The IOMMU needs to allocate PASIDs, so in that sense it enforces a
quota via the architectural limits, but is the IOMMU layer going to
distinguish in-kernel versus user limits?  A cgroup limit seems like a
good idea, but that's not really at the IOMMU layer either and I don't
see that a /dev/sva and vfio interface couldn't both support a cgroup
type quota.

 
> > space.  Also, "unprivileged user" is a bit of a misnomer in this
> > context as the VFIO user must be privileged with ownership of a device
> > before they can even participate in PASID allocation.  Is truly
> > unprivileged access reasonable for a limited resource?  
> 
> I'm not talking about VFIO, I'm talking about the other SVA drivers. I
> expect some of them will be unpriv safe, like IDXD, for
> instance.
> 
> Some way to manage the limited PASID resource will be necessary beyond
> just VFIO.

And it's not clear that they'll have compatible requirements.  A
userspace idxd driver might have limited needs versus a vIOMMU backend.
Does a single quota model adequately support both or are we back to the
differences between access to a device and ownership of a device?
Maybe a single pasid per user makes sense in the former.  If we could
bring this discussion to some sort of more concrete proposal it might
be easier to weigh the choices.
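
As a rough illustration of the question only, here is a toy model of a
single shared PASID pool with a per-owner quota.  Nothing in it is an
existing kernel or vfio interface: PasidPool, set_quota() and the owner
names "idxd-user" and "viommu" are hypothetical, and the pool size is
kept tiny for the example.

```python
class PasidPool:
    """Toy model: one global PASID space with a per-owner quota.

    Hypothetical illustration only, not a kernel or vfio API.
    """

    def __init__(self, capacity, default_quota):
        self.capacity = capacity            # stands in for the architectural limit
        self.default_quota = default_quota  # applied to owners with no explicit quota
        self.quota = {}                     # owner -> maximum PASIDs it may hold
        self.owner_of = {}                  # pasid -> owner currently holding it

    def set_quota(self, owner, limit):
        self.quota[owner] = limit

    def in_use(self, owner):
        return sum(1 for o in self.owner_of.values() if o == owner)

    def alloc(self, owner):
        limit = self.quota.get(owner, self.default_quota)
        if self.in_use(owner) >= limit:
            raise PermissionError(f"{owner}: quota of {limit} reached")
        # PASID 0 is skipped here purely as an assumption of this model.
        for pasid in range(1, self.capacity):
            if pasid not in self.owner_of:
                self.owner_of[pasid] = owner
                return pasid
        raise RuntimeError("PASID space exhausted")

    def free(self, owner, pasid):
        if self.owner_of.get(pasid) != owner:
            raise PermissionError(f"{owner} does not hold PASID {pasid}")
        del self.owner_of[pasid]


# One pool shared by both kinds of user, with different limits.
pool = PasidPool(capacity=8, default_quota=1)
pool.set_quota("viommu", 4)
```

Even in a model this small, the policy question remains open: who gets
to call set_quota(), and what entitles an owner to a larger limit than
the default.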
 
> > QEMU typically runs in a sandbox with limited access, when a device or
> > mdev is assigned to a VM, file permissions are configured to allow that
> > access.  QEMU doesn't get to poke at any random dev file it likes,
> > that's part of how userspace reduces the potential attack surface.  
> 
> Plumbing the exact same APIs through VFIO's uAPI vs /dev/sva doesn't
> reduce the attack surface. qemu can simply include /dev/sva in the
> sandbox when using VFIO with no increase in attack surface from this
> proposed series.

APIs confined to the ownership model that vfio already enforces might
absolutely present a more limited attack surface than some new
interface intended to provide universal sva resource access.  We don't
know until we see it.  The real argument would be whether we have a
more hardened interface due to more review from more users.

 
> > This series is a blueprint within the context of the ownership and
> > permission model that VFIO already provides.  It doesn't seem like we
> > can pluck that out on its own, nor is it necessarily the case that VFIO
> > wouldn't want to provide PASID services within its own API even if we
> > did have this undefined /dev/sva interface.  
> 
> I don't see what you do - VFIO does not own PASID, and in this
> vfio-mdev mode it does not own the PCI device/IOMMU either. So why
> would this need to be part of the VFIO ownership and permission model?

Doesn't the PASID model essentially just augment the requester ID IOMMU
model so as to manage the IOVAs for a subdevice of a RID?  The vfio
model builds on a user's access to a vfio group to entitle them to
allocate IOMMU resources, or in this case PASIDs.  What elevates a user
to be able to allocate such resources in this new proposal?  Do they
need a device at all?  It's not clear to me why RID based IOMMU
management fits within vfio's scope, but PASID based does not.  Seems
like that would chip away at aux domains in general.  Thanks,

Alex



