From: Jason Gunthorpe <jgg@nvidia.com>
To: Alex Williamson <alex.williamson@redhat.com>
Cc: Cornelia Huck <cohuck@redhat.com>,
	Yishai Hadas <yishaih@nvidia.com>,
	bhelgaas@google.com, saeedm@nvidia.com,
	linux-pci@vger.kernel.org, kvm@vger.kernel.org,
	netdev@vger.kernel.org, kuba@kernel.org, leonro@nvidia.com,
	kwankhede@nvidia.com, mgurtovoy@nvidia.com, maorg@nvidia.com,
	"Dr. David Alan Gilbert" <dgilbert@redhat.com>
Subject: Re: [PATCH V2 mlx5-next 12/14] vfio/mlx5: Implement vfio_pci driver for mlx5 devices
Date: Wed, 27 Oct 2021 16:23:45 -0300	[thread overview]
Message-ID: <20211027192345.GJ2744544@nvidia.com> (raw)
In-Reply-To: <20211027130520.33652a49.alex.williamson@redhat.com>

On Wed, Oct 27, 2021 at 01:05:20PM -0600, Alex Williamson wrote:

> > As far as the actual issue, if you hadn't just discovered it now
> > nobody would have known we have this gap - much like how the very
> > similar reset issue was present in VFIO for so many years until you
> > plugged it.
> 
> But the fact that we did discover it is hugely important.  We've
> identified that the potential use case is significantly limited and
> that userspace doesn't have a good mechanism to determine when to
> expose that limitation to the user.  

Huh?

We've identified that, depending on device behavior, the kernel may
need to revoke MMIO access to protect itself from hostile userspace
triggering TLP Errors or something.

Well behaved userspace must already stop touching the MMIO on the
device when !RUNNING - I see no compelling argument against that
position.

We've been investigating how the mlx5 HW will behave in corner cases,
and currently it looks like mlx5 vfio will not generate error TLPs or
corrupt the device itself due to MMIO operations when !RUNNING. So the
driver itself, as written, probably does not currently have a bug
here, or need changes.

> We're tossing around solutions that involve extensions, if not
> changes to the uAPI.  It's Wednesday of rc7.

The P2P issue is separate, and as I keep saying, unless you want to
block support for any HW that does not have freeze&quiesce, userspace
must be aware of this ability, and it is logical to design it as an
extension from where we are now.

> I feel like we've already been burned by making one of these
> "reasonable quanta of progress" to accept and mark experimental
> decisions with where we stand between defining the uAPI in the kernel
> and accepting an experimental implementation in QEMU.  

I won't argue there..

> Now we have multiple closed driver implementations (none of which
> are contributing to this discussion), but thankfully we're not
> committed to supporting them because we have no open
> implementations.  I think we could get away with ripping up the uAPI
> if we really needed to.

Do we need to?

> > > Deciding at some point in the future to forcefully block device MMIO
> > > access from userspace when the device stops running is clearly a user
> > > visible change and therefore subject to the don't-break-userspace
> > > clause.    
> > 
> > I don't think so, this was done for reset retroactively after
> > all. Well behaved qemu should have silenced all MMIO touches as part
> > of the ABI contract anyhow.
> 
> That's not obvious to me and I think it conflates access to the device
> and execution of the device.  If it's QEMU's responsibility to quiesce
> access to the device anyway, why does the kernel need to impose this
> restriction.  I'd think we'd generally only impose such a restriction
> if the alternative allows the user to invoke bad behavior outside the
> scope of their use of the device or consistency of the migration data.
> It appears that any such behavior would be implementation specific here.

I think if an implementation has a problem, like error TLPs, then yes,
it must fence. The conservative specification of the uAPI is that
userspace should not allow MMIO when !RUNNING.

If we ever get an implementation that needs to fence, then we should
do it for all implementations, just for consistency.

> > The "don't-break-userspace" is not an absolute prohibition, Linus has
> > been very clear this limitation is about direct, ideally demonstrable,
> > breakage to actually deployed software.
> 
> And if we introduce an open driver that unblocks QEMU support to become
> non-experimental, I think that's where we stand.

Yes, if qemu becomes deployed, but our testing shows qemu support
needs a lot of work before it is deployable, so that doesn't seem to
be an immediate risk.

> > > That might also indicate that "freeze" is only an implementation
> > > specific requirement.  Thanks,  
> > 
> > It doesn't matter if a theoretical device can exist that doesn't need
> > "freeze" - this device does, and so it is the lowest common
> > denominator for the uAPI contract and userspace must abide by the
> > restriction.
> 
> Sorry, "to the victor go the spoils" is not really how I strictly want
> to define a uAPI contract with userspace.  

This is not "to the victor go the spoils"; this is meeting the least
common denominator of the HW we have today.

If some fictional HW can be more advanced and can snapshot not freeze,
that is great, but it doesn't change one bit that mlx5 cannot and will
not work that way. Since mlx5 must be supported, there is no choice
but to define the uAPI around its limitations.

Snapshot devices are strictly a superset of freeze devices: they can
emulate freeze by taking the snapshot at the freeze operation.

In all cases userspace should not touch the device when !RUNNING to
preserve generality to all implementations.

> If we're claiming that userspace is responsible for quiescing
> devices and we're providing a means for that to occur, and userspace
> is already responsible for managing MMIO access, then the only
> reason the kernel would forcefully impose such a restriction itself
> would be to protect the host and the implementation of that would
> depend on whether this is expected to be a universal or device
> specific limitation.  

I think the best way forward is to allow for revoke to happen if we
ever need it (by specification), and not implement it right now.

So, I am not left with a clear idea what is still open that you see as
blocking. Can you summarize?

Jason

