linux-pci.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Alex Williamson <alex.williamson@redhat.com>
To: Jason Gunthorpe <jgg@nvidia.com>
Cc: Shameerali Kolothum Thodi <shameerali.kolothum.thodi@huawei.com>,
	Cornelia Huck <cohuck@redhat.com>,
	Yishai Hadas <yishaih@nvidia.com>,
	bhelgaas@google.com, saeedm@nvidia.com,
	linux-pci@vger.kernel.org, kvm@vger.kernel.org,
	netdev@vger.kernel.org, kuba@kernel.org, leonro@nvidia.com,
	kwankhede@nvidia.com, mgurtovoy@nvidia.com, maorg@nvidia.com,
	"Dr. David Alan Gilbert" <dgilbert@redhat.com>
Subject: Re: [PATCH V2 mlx5-next 12/14] vfio/mlx5: Implement vfio_pci driver for mlx5 devices
Date: Wed, 3 Nov 2021 09:44:09 -0600	[thread overview]
Message-ID: <20211103094409.3ea180ab.alex.williamson@redhat.com> (raw)
In-Reply-To: <20211103120955.GK2744544@nvidia.com>

On Wed, 3 Nov 2021 09:09:55 -0300
Jason Gunthorpe <jgg@nvidia.com> wrote:

> On Tue, Nov 02, 2021 at 02:15:47PM -0600, Alex Williamson wrote:
> > On Tue, 2 Nov 2021 13:36:10 -0300
> > Jason Gunthorpe <jgg@nvidia.com> wrote:
> >   
> > > On Tue, Nov 02, 2021 at 10:22:36AM -0600, Alex Williamson wrote:
> > >   
> > > > > > There's no point at which we can do SET_IRQS other than in the
> > > > > > _RESUMING state.  Generally SET_IRQS ioctls are coordinated with the
> > > > > > guest driver based on actions to the device, we can't be mucking
> > > > > > with IRQs while the device is presumed running and already
> > > > > > generating interrupt conditions.      
> > > > > 
> > > > > We need to do it in state 000
> > > > > 
> > > > > ie resume should go 
> > > > > 
> > > > >   000 -> 100 -> 000 -> 001
> > > > > 
> > > > > With SET_IRQS and any other fixing done during the 2nd 000, after the
> > > > > migration data has been loaded into the device.    
> > > > 
> > > > Again, this is not how QEMU works today.    
> > > 
> > > I know, I think it is a poor choice to carve out certain changes to
> > > the device that must be preserved across loading the migration state.
> > >   
> > > > > The uAPI comment does not define when to do the SET_IRQS, it seems
> > > > > this has been missed.
> > > > > 
> > > > > We really should fix it, unless you feel strongly that the
> > > > > experimental API in qemu shouldn't be changed.    
> > > > 
> > > > I think the QEMU implementation fills in some details of how the uAPI
> > > > is expected to work.    
> > > 
> > > Well, we already know QEMU has problems, like the P2P thing. Is this a
> > > bug, or a preferred limitation as designed?
> > >   
> > > > MSI/X is expected to be restored while _RESUMING based on the
> > > > config space of the device, there is no intermediate step between
> > > > _RESUMING and _RUNNING.  Introducing such a requirement precludes
> > > > the option of a post-copy implementation of (_RESUMING | _RUNNING).    
> > > 
> > > Not precluded, a new state bit would be required to implement some
> > > future post-copy.
> > > 
> > > 0000 -> 1100 -> 1000 -> 1001 -> 0001
> > > 
> > > Instead of overloading the meaning of RUNNING.
> > > 
> > > I think this is cleaner anyhow.
> > > 
> > > (though I don't know how we'd structure the save side to get two
> > > bitstreams)  
> > 
> > The way this is supposed to work is that the device migration stream
> > contains the device internal state.  QEMU is then responsible for
> > restoring the external state of the device, including the DMA mappings,
> > interrupts, and config space.  It's not possible for the migration
> > driver to reestablish these things.  So there is a necessary division
> > of device state between QEMU and the migration driver.
> > 
> > If we don't think the uAPI includes the necessary states, doesn't
> > sufficiently define the states, and we're not following the existing
> > QEMU implementation as the guide for the intentions of the uAPI spec,
> > then what exactly is the proposed mlx5 migration driver implementing
> > and why would we even considering including it at this point?  Thanks,  
> 
> The driver posting follows the undocumented behaviors of QEMU

In one email I read that QEMU clearly should not be performing SET_IRQS
while the device is _RESUMING (which it does) and we need to require an
interim state before the device becomes _RUNNING to poke at the device
(which QEMU doesn't do and the uAPI doesn't require), and the next I
read that we should proceed with some useful quanta of work despite
that we clearly don't intend to retain much of the protocol of the
current uAPI long term...

> You asked that these all be documented, evaluated and formalized as a
> precondition to merging it.
> 
> So, what do you want? A critical review of the uAPI design or
> documenting whatever behvaior is coded in qemu?

Too much is in flux and we're only getting breadcrumbs of the changes
to come.  It's becoming more evident that we're likely to sufficiently
modify the uAPI to the point where I'd probably suggest a new "v2"
subtype for the region.

> A critical review suggest SET_IRQ should not happen during RESUMING,
> but mlx5 today doesn't care either way.

But if it can't happening during _RESUMING and once the device is
_RUNNING it's too late, then we're demanding an interim state that is
not required by the existing protocol.  We're redefining that existing
operations on the device while in _RESUMING cannot occur in that device
state.  That's more than uAPI clarification.  Thanks,

Alex


  reply	other threads:[~2021-11-03 15:44 UTC|newest]

Thread overview: 100+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-10-19 10:58 [PATCH V2 mlx5-next 00/14] Add mlx5 live migration driver Yishai Hadas
2021-10-19 10:58 ` [PATCH V2 mlx5-next 01/14] PCI/IOV: Add pci_iov_vf_id() to get VF index Yishai Hadas
2021-10-19 10:58 ` [PATCH V2 mlx5-next 02/14] net/mlx5: Reuse exported virtfn index function call Yishai Hadas
2021-10-19 10:58 ` [PATCH V2 mlx5-next 03/14] net/mlx5: Disable SRIOV before PF removal Yishai Hadas
2021-10-19 10:58 ` [PATCH V2 mlx5-next 04/14] PCI/IOV: Add pci_iov_get_pf_drvdata() to allow VF reaching the drvdata of a PF Yishai Hadas
2021-10-19 10:58 ` [PATCH V2 mlx5-next 05/14] net/mlx5: Expose APIs to get/put the mlx5 core device Yishai Hadas
2021-10-19 10:58 ` [PATCH V2 mlx5-next 06/14] vdpa/mlx5: Use mlx5_vf_get_core_dev() to get PF device Yishai Hadas
2021-10-19 11:16   ` Max Gurtovoy
2021-10-20  8:58     ` Yishai Hadas
2021-10-19 10:58 ` [PATCH V2 mlx5-next 07/14] vfio: Fix VFIO_DEVICE_STATE_SET_ERROR macro Yishai Hadas
2021-10-19 10:58 ` [PATCH V2 mlx5-next 08/14] vfio: Add a macro for VFIO_DEVICE_STATE_ERROR Yishai Hadas
2021-10-19 15:48   ` Alex Williamson
2021-10-19 15:50     ` Alex Williamson
2021-10-20  7:35       ` Yishai Hadas
2021-10-19 10:58 ` [PATCH V2 mlx5-next 09/14] vfio/pci_core: Make the region->release() function optional Yishai Hadas
2021-10-19 10:58 ` [PATCH V2 mlx5-next 10/14] net/mlx5: Introduce migration bits and structures Yishai Hadas
2021-10-19 10:58 ` [PATCH V2 mlx5-next 11/14] vfio/mlx5: Expose migration commands over mlx5 device Yishai Hadas
2021-10-19 10:58 ` [PATCH V2 mlx5-next 12/14] vfio/mlx5: Implement vfio_pci driver for mlx5 devices Yishai Hadas
2021-10-19 18:43   ` Alex Williamson
2021-10-19 19:23     ` Jason Gunthorpe
2021-10-19 20:58       ` Alex Williamson
2021-10-19 23:04         ` Jason Gunthorpe
2021-10-20  8:28           ` Yishai Hadas
2021-10-20 16:52             ` Alex Williamson
2021-10-20 18:59               ` Jason Gunthorpe
2021-10-20 21:07                 ` Alex Williamson
2021-10-21  9:34                   ` Cornelia Huck
2021-10-21 21:47                     ` Alex Williamson
2021-10-25 12:29                       ` Jason Gunthorpe
2021-10-25 14:28                         ` Alex Williamson
2021-10-25 14:56                           ` Jason Gunthorpe
2021-10-26 14:42                             ` Alex Williamson
2021-10-26 15:18                               ` Jason Gunthorpe
2021-10-26 19:50                                 ` Alex Williamson
2021-10-26 23:43                                   ` Jason Gunthorpe
2021-10-27 19:05                                     ` Alex Williamson
2021-10-27 19:23                                       ` Jason Gunthorpe
2021-10-28 15:08                                         ` Cornelia Huck
2021-10-29  0:26                                           ` Jason Gunthorpe
2021-10-29  7:35                                             ` Yishai Hadas
2021-10-28 15:30                                         ` Alex Williamson
2021-10-28 23:47                                           ` Jason Gunthorpe
2021-10-29  6:57                                             ` Cornelia Huck
2021-10-29  7:48                                               ` Yishai Hadas
2021-10-29 10:32                                             ` Shameerali Kolothum Thodi
2021-10-29 12:15                                               ` Jason Gunthorpe
2021-10-29 22:06                                             ` Alex Williamson
2021-11-01 17:25                                               ` Jason Gunthorpe
2021-11-02 11:19                                                 ` Shameerali Kolothum Thodi
2021-11-02 14:56                                                 ` Alex Williamson
2021-11-02 15:54                                                   ` Jason Gunthorpe
2021-11-02 16:22                                                     ` Alex Williamson
2021-11-02 16:36                                                       ` Jason Gunthorpe
2021-11-02 20:15                                                         ` Alex Williamson
2021-11-03 12:09                                                           ` Jason Gunthorpe
2021-11-03 15:44                                                             ` Alex Williamson [this message]
2021-11-03 16:10                                                               ` Jason Gunthorpe
2021-11-03 18:04                                                                 ` Alex Williamson
2021-11-04 11:19                                                                   ` Cornelia Huck
2021-11-05 16:53                                                                     ` Cornelia Huck
2021-11-16 16:59                                                                       ` Cornelia Huck
2021-11-05 13:24                                                                   ` Jason Gunthorpe
2021-11-05 15:31                                                                     ` Alex Williamson
2021-11-15 23:29                                                                       ` Jason Gunthorpe
2021-11-16 17:57                                                                         ` Alex Williamson
2021-11-16 19:25                                                                           ` Jason Gunthorpe
2021-11-16 21:10                                                                             ` Alex Williamson
2021-11-17  1:48                                                                               ` Jason Gunthorpe
2021-11-18 18:15                                                                                 ` Alex Williamson
2021-11-22 19:18                                                                                   ` Jason Gunthorpe
2021-11-08  8:53                                 ` Tian, Kevin
2021-11-08 12:35                                   ` Jason Gunthorpe
2021-11-09  0:58                                     ` Tian, Kevin
2021-11-09 12:45                                       ` Jason Gunthorpe
2021-10-25 16:34               ` Dr. David Alan Gilbert
2021-10-25 17:55                 ` Alex Williamson
2021-10-25 18:47                   ` Dr. David Alan Gilbert
2021-10-25 19:15                     ` Jason Gunthorpe
2021-10-26  8:40                       ` Dr. David Alan Gilbert
2021-10-26 12:13                         ` Jason Gunthorpe
2021-10-26 14:52                           ` Alex Williamson
2021-10-26 15:56                             ` Jason Gunthorpe
2021-10-26 14:29                     ` Alex Williamson
2021-10-26 14:51                       ` Dr. David Alan Gilbert
2021-10-26 15:25                         ` Jason Gunthorpe
2021-10-20  8:01     ` Yishai Hadas
2021-10-20 16:25       ` Jason Gunthorpe
2021-10-21 10:46         ` Yishai Hadas
2021-10-19 10:58 ` [PATCH V2 mlx5-next 13/14] vfio/pci: Expose vfio_pci_aer_err_detected() Yishai Hadas
2021-10-19 10:58 ` [PATCH V2 mlx5-next 14/14] vfio/mlx5: Use its own PCI reset_done error handler Yishai Hadas
2021-10-19 18:55   ` Alex Williamson
2021-10-19 19:10     ` Jason Gunthorpe
2021-10-20  8:46       ` Yishai Hadas
2021-10-20 16:46         ` Jason Gunthorpe
2021-10-20 17:45           ` Alex Williamson
2021-10-20 18:57             ` Jason Gunthorpe
2021-10-20 21:38               ` Alex Williamson
2021-10-21 10:39             ` Yishai Hadas
2021-11-17 16:42 ` vfio migration discussions (was: [PATCH V2 mlx5-next 00/14] Add mlx5 live migration driver) Cornelia Huck
2021-11-17 17:47   ` Jason Gunthorpe

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20211103094409.3ea180ab.alex.williamson@redhat.com \
    --to=alex.williamson@redhat.com \
    --cc=bhelgaas@google.com \
    --cc=cohuck@redhat.com \
    --cc=dgilbert@redhat.com \
    --cc=jgg@nvidia.com \
    --cc=kuba@kernel.org \
    --cc=kvm@vger.kernel.org \
    --cc=kwankhede@nvidia.com \
    --cc=leonro@nvidia.com \
    --cc=linux-pci@vger.kernel.org \
    --cc=maorg@nvidia.com \
    --cc=mgurtovoy@nvidia.com \
    --cc=netdev@vger.kernel.org \
    --cc=saeedm@nvidia.com \
    --cc=shameerali.kolothum.thodi@huawei.com \
    --cc=yishaih@nvidia.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).