intel-gfx.lists.freedesktop.org archive mirror
 help / color / mirror / Atom feed
From: Alex Williamson <alex.williamson@redhat.com>
To: Jason Gunthorpe <jgg@nvidia.com>
Cc: Yanting@freedesktop.org,
	"mjrosato@linux.ibm.com" <mjrosato@linux.ibm.com>,
	"jasowang@redhat.com" <jasowang@redhat.com>,
	"peterx@redhat.com" <peterx@redhat.com>, "  <lulu@redhat.com>,
	"@freedesktop.org, suravee.suthikulpanit@amd.com,
	"chao.p.peng@linux.intel.com" <chao.p.peng@linux.intel.com>,
	"Liu,  Yi L" <yi.l.liu@intel.com>,
	"kvm@vger.kernel.org" <kvm@vger.kernel.org>,
	"joro@8bytes.org" <joro@8bytes.org>,
	Yan@freedesktop.org, "nicolinc@nvidia.com" <nicolinc@nvidia.com>,
	"     <intel-gvt-dev@lists.freedesktop.org>,
	 "@freedesktop.org, intel-gfx@lists.freedesktop.org,
	<linux-s390@vger.kernel.org>, ,
	Xudong@freedesktop.org, Zhenzhong@freedesktop.org,
	"  <suravee.suthikulpanit@amd.com>,
	"@freedesktop.org, intel-gvt-dev@lists.freedesktop.org,,
	" <intel-gfx@lists.freedesktop.org>,
	  "@freedesktop.org, linux-s390@vger.kernel.org,
	Terrence@freedesktop.org,
	"yi.y.sun@linux.intel.com" <yi.y.sun@linux.intel.com>,
	"eric.auger@redhat.com" <eric.auger@redhat.com>,
	"cohuck@redhat.com" <cohuck@redhat.com>,
	"robin.murphy@arm.com" <robin.murphy@arm.com>,
	"shameerali.kolothum.thodi@huawei.com\"         
	<shameerali.kolothum.thodi@huawei.com>,
	"@freedesktop.org, lulu@redhat.com
Subject: Re: [Intel-gfx] [PATCH v3 12/12] vfio/pci: Report dev_id in VFIO_DEVICE_GET_PCI_HOT_RESET_INFO
Date: Tue, 11 Apr 2023 11:11:17 -0600	[thread overview]
Message-ID: <20230411111117.0766ad52.alex.williamson@redhat.com> (raw)
In-Reply-To: <20230411095417.240bac39.alex.williamson@redhat.com>

[Appears the list got dropped, replying to my previous message to re-add]

On Tue, 11 Apr 2023 13:32:16 -0300
Jason Gunthorpe <jgg@nvidia.com> wrote:

> On Tue, Apr 11, 2023 at 09:54:17AM -0600, Alex Williamson wrote:
> > On Tue, 11 Apr 2023 10:24:58 -0300
> > Jason Gunthorpe <jgg@nvidia.com> wrote:
> >   
> > > On Thu, Apr 06, 2023 at 11:53:47AM -0600, Alex Williamson wrote:
> > >   
> > > > Where whether a device is opened is subject to change outside of the
> > > > user's control.  This essentially allows the user to perform hot-resets
> > > > of devices outside of their ownership so long as the device is not
> > > > used elsewhere, versus the current requirement that the user own all the
> > > > affected groups, which implies device ownership.  It's not been
> > > > justified why this feature needs to exist, imo.    
> > > 
> > > The cdev API doesn't have the notion that owning a group means you
> > > "own" some collection of devices. It still happens as a side effect,
> > > but it isn't obviously part of the API. I'm really loath to
> > > re-introduce that group-based concept just for this. We are trying
> > > reduce the group API surface.
> > > 
> > > How about a different direction.
> > > 
> > > We add a new uAPI for cdev mode that is "take ownership of the reset
> > > group". Maybe it can be a flag in during bind.
> > > 
> > > When requested vfio will ensure that every device in the reset group
> > > is only bound to this iommufd_ctx or left closed. Now and in the
> > > future. Since no-iommu has no iommufd_ctx this means we can open only
> > > one device in the reset group.
> > > 
> > > With this flag RESET is guaranteed to always work by definition.
> > > 
> > > We continue with the zero-length FD, but we can just replace the
> > > security checks with a check if we are in reset group ownership mode.
> > > 
> > > _INFO is unchanged.
> > > 
> > > We decide if we add a new IOCTL to return the BDF so the existing
> > > _INFO can get back to the dev_id or a new IOCTL that returns the
> > > dev_id list of the reset group.
> > > 
> > > Userspace is required to figure out the extent of the reset, but we
> > > don't require that userspace prove to the kernel it did this when
> > > requesting the reset.  
> > 
> > Take for example a multi-function PCIe device with ACS isolation between
> > functions, are you going to allow a user who has only been granted
> > ownership of a subset of functions control of the entire dev_set?  
> 
> Our cdev model says that opening a cdev locks out other cdevs from
> independent use, eg because of the group sharing. Extending this to
> include the reset group as well seems consistent.

The DMA ownership model based on the IOMMU group is consistent with
legacy vfio, but now you're proposing a new ownership model that
optionally allows a user to extend their ownership, opportunistically
lock out other users, and wreaking havoc for management utilities that
also have no insight into dev_sets or userspace driver behavior.

> There is some security concern here, but that goes both ways, a 3rd
> party should not be able to break an application that needs to use
> this RESET and had sufficient privileges to assert an ownership.

There are clearly scenarios we have now that could break.  For example,
today if QEMU doesn't own all the IOMMU groups for a mult-function
device, it can't do a reset, the remaining functions are available for
other users.  As I understand the proposal, QEMU now gets to attempt to
claim ownership of the dev_set, so it opportunistically extends its
ownership and may block other users from the affected devices.
Ordering makes this effectively unpredictable, if a userspace like DPDK
that doesn't assert dev_set ownership is started first, QEMU can start
and be denied hot-reset support.  In the reverse ordering, the DPDK
application can be locked out by QEMU.

> I'd say anyone should be able to assert RESET ownership if, like
> today, the iommufd_ctx has all the groups of the dev_set inside
> it. Once asserted it becomes safe against all forms of hotplug, and
> continues to be safe even if some of the devices are closed. eg hot
> unplugging from the VM doesn't change the availability of RESET.
> 
> This comes from your ask that qemu know clearly if RESET works, and it
> doesn't change while qemu is running. This seems stronger and clearer
> than the current implicit scheme. It also doesn't require usespace to
> do any calculations with groups or BDFs to figure out of RESET is
> available, kernel confirms it directly.

As above, clarity and predictability seem lacking in this proposal.
With the current scheme, the ownership of the affected devices is
implied if they exist within an owned group, but the strength of that
ownership is clear.  Affected devices outside the set of owned groups
says that hot-reset is unavailable without any of this "but QEMU might
be able to request it" or "unless the affected device is currently
unopened" variables.

> > seems this proposal essentially extends the ownership model to the
> > greater of the dev_set or iommu group, apparently neither of which
> > are explicitly exposed to the user in the cdev API.  
> 
> IIRC the group id can be learned from sysfs before opening the cdev
> file. Something like /sys/class/vfio/XX/../../iommu_group

And in the passed cdev fd model... ?

> We should also have an iommufd ioctl to report the "same ioas"
> groupings of dev_ids to make it easy on userspace. I haven't checked
> to see what the current qemu patches are doing with this..

Seems we're ignoring that no-iommu doesn't have a valid iommufd.

> > How does a user determine when devices cannot be used independently
> > in the cdev API?   
> 
> We have this problem right now. The only way to learn the reset group
> is to call the _INFO ioctl. We could add a sysfs "pci_reset_group"
> under /sys/class/vfio/XX/ if something needs it earlier.

For all the complaints about complexity, now we're asking management
tools to not only take into account IOMMU groups, but also reset
groups, and some inferred knowledge about the application and devices
to speculate whether reset group ownership is taken by a given
userspace??  Thanks,

Alex


  reply	other threads:[~2023-04-21 14:10 UTC|newest]

Thread overview: 145+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-04-01 14:44 [Intel-gfx] [PATCH v3 00/12] Introduce new methods for verifying ownership in vfio PCI hot reset Yi Liu
2023-04-01 14:44 ` [Intel-gfx] [PATCH v3 01/12] vfio/pci: Update comment around group_fd get in vfio_pci_ioctl_pci_hot_reset() Yi Liu
2023-04-04 13:59   ` Eric Auger
2023-04-04 14:37     ` Liu, Yi L
2023-04-01 14:44 ` [Intel-gfx] [PATCH v3 02/12] vfio/pci: Only check ownership of opened devices in hot reset Yi Liu
2023-04-04 13:59   ` Eric Auger
2023-04-04 14:37     ` Liu, Yi L
2023-04-04 15:18       ` Eric Auger
2023-04-04 15:29         ` Liu, Yi L
2023-04-04 15:59           ` Eric Auger
2023-04-05 11:41             ` Jason Gunthorpe
2023-04-05 15:14               ` Eric Auger
2023-04-01 14:44 ` [Intel-gfx] [PATCH v3 03/12] vfio/pci: Move the existing hot reset logic to be a helper Yi Liu
2023-04-04 13:59   ` Eric Auger
2023-04-04 14:24     ` Liu, Yi L
2023-04-01 14:44 ` [Intel-gfx] [PATCH v3 04/12] vfio-iommufd: Add helper to retrieve iommufd_ctx and devid for vfio_device Yi Liu
2023-04-04 15:28   ` Eric Auger
2023-04-04 21:48     ` Alex Williamson
2023-04-21  7:11       ` Liu, Yi L
2023-04-01 14:44 ` [Intel-gfx] [PATCH v3 05/12] vfio/pci: Allow passing zero-length fd array in VFIO_DEVICE_PCI_HOT_RESET Yi Liu
2023-04-04 16:54   ` Eric Auger
2023-04-04 20:18   ` Alex Williamson
2023-04-05  7:55     ` Liu, Yi L
2023-04-05  8:01       ` Liu, Yi L
2023-04-05 15:36         ` Alex Williamson
2023-04-05 16:46           ` Jason Gunthorpe
2023-04-05  8:02     ` Eric Auger
2023-04-05  8:09       ` Liu, Yi L
2023-04-01 14:44 ` [Intel-gfx] [PATCH v3 06/12] vfio: Refine vfio file kAPIs for vfio PCI hot reset Yi Liu
2023-04-05  8:27   ` Eric Auger
2023-04-05  9:23     ` Liu, Yi L
2023-04-01 14:44 ` [Intel-gfx] [PATCH v3 07/12] vfio: Accpet device file from vfio PCI hot reset path Yi Liu
2023-04-04 20:31   ` Alex Williamson
2023-04-05  8:07   ` Eric Auger
2023-04-05  8:10     ` Liu, Yi L
2023-04-01 14:44 ` [Intel-gfx] [PATCH v3 08/12] vfio/pci: Renaming for accepting device fd in " Yi Liu
2023-04-04 21:23   ` Alex Williamson
2023-04-05  9:32   ` Eric Auger
2023-04-01 14:44 ` [Intel-gfx] [PATCH v3 09/12] vfio/pci: Accept device fd in VFIO_DEVICE_PCI_HOT_RESET ioctl Yi Liu
2023-04-05  9:36   ` Eric Auger
2023-04-01 14:44 ` [Intel-gfx] [PATCH v3 10/12] vfio: Mark cdev usage in vfio_device Yi Liu
2023-04-05 11:48   ` Eric Auger
2023-04-21  7:06     ` Liu, Yi L
2023-04-01 14:44 ` [Intel-gfx] [PATCH v3 11/12] iommufd: Define IOMMUFD_INVALID_ID in uapi Yi Liu
2023-04-04 21:00   ` Alex Williamson
2023-04-05  9:31     ` Liu, Yi L
2023-04-05 15:13       ` Alex Williamson
2023-04-05 15:17         ` Liu, Yi L
2023-04-05 11:46   ` Eric Auger
2023-04-01 14:44 ` [Intel-gfx] [PATCH v3 12/12] vfio/pci: Report dev_id in VFIO_DEVICE_GET_PCI_HOT_RESET_INFO Yi Liu
2023-04-03  9:25   ` Liu, Yi L
2023-04-03 15:01     ` Alex Williamson
2023-04-03 15:22       ` Liu, Yi L
2023-04-03 15:32         ` Alex Williamson
2023-04-03 16:12           ` Jason Gunthorpe
2023-04-07 10:09       ` Liu, Yi L
2023-04-07 12:03         ` Alex Williamson
2023-04-07 13:24           ` Liu, Yi L
2023-04-07 13:51             ` Alex Williamson
2023-04-07 14:04               ` Liu, Yi L
2023-04-07 15:14                 ` Alex Williamson
2023-04-07 15:47                   ` Liu, Yi L
2023-04-07 21:07                     ` Alex Williamson
2023-04-08  5:07                       ` Liu, Yi L
2023-04-08 14:20                         ` Alex Williamson
2023-04-09 11:58                           ` Yi Liu
2023-04-09 13:29                             ` Alex Williamson
2023-04-10  8:48                               ` Liu, Yi L
2023-04-10 14:41                                 ` Alex Williamson
2023-04-10 15:18                                   ` Liu, Yi L
2023-04-10 15:23                                     ` Alex Williamson
2023-04-11 13:34                               ` Jason Gunthorpe
2023-04-11 13:33                       ` Jason Gunthorpe
2023-04-11  6:16           ` Liu, Yi L
2023-04-04 22:20   ` Alex Williamson
2023-04-05 12:19   ` Eric Auger
2023-04-05 14:04     ` Liu, Yi L
2023-04-05 16:25       ` Alex Williamson
2023-04-05 16:37         ` Jason Gunthorpe
2023-04-05 16:52           ` Alex Williamson
2023-04-05 17:23             ` Jason Gunthorpe
2023-04-05 18:56               ` Alex Williamson
2023-04-05 19:18                 ` Alex Williamson
2023-04-05 19:21                 ` Jason Gunthorpe
2023-04-05 19:49                   ` Alex Williamson
2023-04-05 23:22                     ` Jason Gunthorpe
2023-04-06 10:02                       ` Liu, Yi L
2023-04-06 17:53                         ` Alex Williamson
2023-04-07 10:09                           ` Liu, Yi L
2023-04-11 13:24                           ` Jason Gunthorpe
2023-04-11 15:54                             ` Alex Williamson
2023-04-11 17:11                               ` Alex Williamson [this message]
2023-04-11 18:40                                 ` Jason Gunthorpe
2023-04-11 21:58                                   ` Alex Williamson
2023-04-12  0:01                                     ` Jason Gunthorpe
2023-04-12  7:27                                       ` Tian, Kevin
2023-04-12 15:05                                         ` Jason Gunthorpe
2023-04-12 17:01                                           ` Alex Williamson
2023-04-13  2:57                                           ` Tian, Kevin
2023-04-12 10:09                                       ` Liu, Yi L
2023-04-12 16:54                                         ` Alex Williamson
2023-04-12 16:50                                       ` Alex Williamson
2023-04-12 20:06                                         ` Jason Gunthorpe
2023-04-13  8:25                                           ` Tian, Kevin
2023-04-13 11:50                                             ` Jason Gunthorpe
2023-04-13 14:35                                               ` Liu, Yi L
2023-04-13 14:41                                                 ` Jason Gunthorpe
2023-04-13 18:07                                               ` Alex Williamson
2023-04-14  9:11                                                 ` Tian, Kevin
2023-04-14 11:38                                                   ` Liu, Yi L
2023-04-14 17:10                                                     ` Alex Williamson
2023-04-17  4:20                                                       ` Liu, Yi L
2023-04-17 19:01                                                         ` Alex Williamson
2023-04-17 19:31                                                           ` Jason Gunthorpe
2023-04-17 20:06                                                             ` Alex Williamson
2023-04-18  3:24                                                               ` Tian, Kevin
2023-04-18  4:10                                                                 ` Alex Williamson
2023-04-18  5:02                                                                   ` Tian, Kevin
2023-04-18 12:59                                                                     ` Jason Gunthorpe
2023-04-18 16:44                                                                     ` Alex Williamson
2023-04-18 10:34                                                                   ` Liu, Yi L
2023-04-18 16:49                                                                     ` Alex Williamson
2023-04-18 12:57                                                               ` Jason Gunthorpe
2023-04-18 18:39                                                                 ` Alex Williamson
2023-04-20 12:10                                                                   ` Liu, Yi L
2023-04-20 14:08                                                                     ` Alex Williamson
2023-04-21 22:35                                                                       ` Jason Gunthorpe
2023-04-23 14:46                                                                         ` Liu, Yi L
2023-04-26  7:22                                                                       ` Liu, Yi L
2023-04-26 13:20                                                                         ` Alex Williamson
2023-04-26 15:08                                                                           ` Liu, Yi L
2023-04-14 16:34                                                   ` Alex Williamson
2023-04-17 13:39                                                   ` Jason Gunthorpe
2023-04-18  1:28                                                     ` Tian, Kevin
2023-04-18 10:23                                                     ` Liu, Yi L
2023-04-18 13:02                                                       ` Jason Gunthorpe
2023-04-23 10:28                                                         ` Liu, Yi L
2023-04-24 17:38                                                           ` Jason Gunthorpe
2023-04-17 14:05                                                 ` Jason Gunthorpe
2023-04-12  7:14                                     ` Tian, Kevin
2023-04-06  6:34                     ` Liu, Yi L
2023-04-06 17:07                       ` Alex Williamson
2023-04-05 17:58         ` Eric Auger
2023-04-06  5:31           ` Liu, Yi L
2023-04-01 14:47 ` [Intel-gfx] ✗ Fi.CI.BUILD: failure for Introduce new methods for verifying ownership in vfio PCI hot reset (rev4) Patchwork

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20230411111117.0766ad52.alex.williamson@redhat.com \
    --to=alex.williamson@redhat.com \
    --cc="     <intel-gvt-dev@lists.freedesktop.org>,  "@freedesktop.org \
    --cc="  <lulu@redhat.com>, "@freedesktop.org \
    --cc="  <suravee.suthikulpanit@amd.com>, "@freedesktop.org \
    --cc=" <intel-gfx@lists.freedesktop.org>,   "@freedesktop.org \
    --cc="shameerali.kolothum.thodi@huawei.com\"          <shameerali.kolothum.thodi@huawei.com>, "@freedesktop.org \
    --cc=Terrence@freedesktop.org \
    --cc=Xudong@freedesktop.org \
    --cc=Yan@freedesktop.org \
    --cc=Yanting@freedesktop.org \
    --cc=Zhenzhong@freedesktop.org \
    --cc=chao.p.peng@linux.intel.com \
    --cc=cohuck@redhat.com \
    --cc=eric.auger@redhat.com \
    --cc=intel-gfx@lists.freedesktop.org \
    --cc=intel-gvt-dev@lists.freedesktop.org \
    --cc=jasowang@redhat.com \
    --cc=jgg@nvidia.com \
    --cc=joro@8bytes.org \
    --cc=kvm@vger.kernel.org \
    --cc=linux-s390@vger.kernel.org \
    --cc=lulu@redhat.com \
    --cc=mjrosato@linux.ibm.com \
    --cc=nicolinc@nvidia.com \
    --cc=peterx@redhat.com \
    --cc=robin.murphy@arm.com \
    --cc=suravee.suthikulpanit@amd.com \
    --cc=yi.l.liu@intel.com \
    --cc=yi.y.sun@linux.intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).