From: Shameerali Kolothum Thodi <shameerali.kolothum.thodi@huawei.com>
To: Jason Gunthorpe <jgg@nvidia.com>,
	Alex Williamson <alex.williamson@redhat.com>
Cc: "kvm@vger.kernel.org" <kvm@vger.kernel.org>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"linux-crypto@vger.kernel.org" <linux-crypto@vger.kernel.org>,
	"cohuck@redhat.com" <cohuck@redhat.com>,
	"mgurtovoy@nvidia.com" <mgurtovoy@nvidia.com>,
	"yishaih@nvidia.com" <yishaih@nvidia.com>,
	Linuxarm <linuxarm@huawei.com>,
	liulongfang <liulongfang@huawei.com>,
	"Zengtao (B)" <prime.zeng@hisilicon.com>,
	Jonathan Cameron <jonathan.cameron@huawei.com>,
	"Wangzhou (B)" <wangzhou1@hisilicon.com>
Subject: RE: [PATCH v6 09/10] hisi_acc_vfio_pci: Add support for VFIO live migration
Date: Wed, 2 Mar 2022 09:07:38 +0000
Message-ID: <635f11c40e814d749ccf533f1414ba4e@huawei.com>
In-Reply-To: <20220302000329.GZ219866@nvidia.com>



> -----Original Message-----
> From: Jason Gunthorpe [mailto:jgg@nvidia.com]
> Sent: 02 March 2022 00:03
> To: Alex Williamson <alex.williamson@redhat.com>
> Cc: Shameerali Kolothum Thodi <shameerali.kolothum.thodi@huawei.com>;
> kvm@vger.kernel.org; linux-kernel@vger.kernel.org;
> linux-crypto@vger.kernel.org; cohuck@redhat.com; mgurtovoy@nvidia.com;
> yishaih@nvidia.com; Linuxarm <linuxarm@huawei.com>; liulongfang
> <liulongfang@huawei.com>; Zengtao (B) <prime.zeng@hisilicon.com>;
> Jonathan Cameron <jonathan.cameron@huawei.com>; Wangzhou (B)
> <wangzhou1@hisilicon.com>
> Subject: Re: [PATCH v6 09/10] hisi_acc_vfio_pci: Add support for VFIO live
> migration
> 
> On Tue, Mar 01, 2022 at 03:44:31PM -0700, Alex Williamson wrote:
> > On Tue, 1 Mar 2022 16:39:38 -0400
> > Jason Gunthorpe <jgg@nvidia.com> wrote:
> >
> > > On Tue, Mar 01, 2022 at 12:30:47PM -0700, Alex Williamson wrote:
> > > > Wouldn't it make more sense if initial-bytes started at QM_MATCH_SIZE
> > > > > and dirty-bytes was always sizeof(vf_data) - QM_MATCH_SIZE?  ie. QEMU
> > > > would know that it has sizeof(vf_data) - QM_MATCH_SIZE remaining even
> > > > while it's getting ENOMSG after reading QM_MATCH_SIZE bytes of data.
> > >
> > > The purpose of this ioctl is to help userspace guess when moving on to
> > > STOP_COPY is a good idea ie when the device has done almost all the
> > > work it is going to be able to do in PRE_COPY. ENOMSG is a similar
> > > indicator.
> > >
> > > I expect all devices to have some additional STOP_COPY trailer_data in
> > > addition to their PRE_COPY initial_data and dirty_data
> > >
> > > There is a choice to make if we report the trailer_data during
> > > PRE_COPY or not. As this is all estimates, it doesn't matter unless
> > > the trailer_data is very big.
> > >
> > > > Having all devices trend toward a 0 dirty_bytes to say they are
> > > > done with all the pre-copy they can do makes sense from an API
> > > perspective. If one device trends toward 10MB due to a big
> > > trailer_data and one trends toward 0 bytes, how will qemu consistently
> > > decide when best to trigger STOP_COPY? It makes the API less useful.
> > >
> > > So, I would not include trailer_data in the dirty_bytes.
> >
> > That assumes that it's possible to keep up with the device dirty
> > rate.
> 
> It keeps options open so we have this choice someday.
> 
> We already see that implementations are using vCPU throttling as part
> of their migration strategy, and we are seriously looking at DMA
> throttling. It is not a big leap to imagine that
> internal-state-dirtying throttling will happen someday.
> 
> With throttling iterations would ratchet up the throttle until they
> reach an absolute small amount of dirty then cut over to STOP_COPY
> 
> > It seems like a better approach for userspace would be to look at how
> > dirty_bytes is trending.
> 
> It may be biw, but this approach doesn't care if the trailing_bytes
> are included or not, so let's leave them out and preserve the other
> operating model.
> 
> > If we exclude STOP_COPY trailing data from the VFIO_DEVICE_MIG_PRECOPY
> > ioctl, it seems even more of a disconnect that when we enter the
> > STOP_COPY state, suddenly we start getting new data out of a PRECOPY
> > ioctl.
> 
> Why? The amounts can go up at any time; how does it matter if they go
> up after STOP_COPY or instantly before?
> 
> > BTW, "VFIO_DEVICE" should be reserved for ioctls and data structures
> > relative to the device FD, appending it with _MIG is too subtle for me.
> > This is also a GET operation for INFO, so I'd think for consistency
> > with the existing vfio uAPI we'd name this something like
> > VFIO_MIG_GET_PRECOPY_INFO where the structure might be named
> > vfio_precopy_info.
> 
> Sure
> 
> > So if we don't think this is the right approach for STOP_COPY, then why
> > are we pushing that it has any purpose outside of PRECOPY or might be
> > implemented by a non-PRECOPY driver for use in STOP_COPY?
> 
> It is just simpler and more consistent to implement the math under
> this ioctl in all cases than to try and artificially restrict it.
> 
> But I don't have a use case for it, so let's block it if you prefer.
> 
> Shameerali will you make these adjustments to the PRE_COPY patch?

Sure. I think we can summarize the discussion as follows:

 - Rename the MIG_PRECOPY ioctl to VFIO_MIG_GET_PRECOPY_INFO and
  structure to vfio_precopy_info.
 - This ioctl is only valid in the PRE_COPY state and should return -EINVAL
  in all other states (the documentation will be updated accordingly).
 - No changes to the initial_bytes & dirty_bytes descriptions.
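
To make the second point concrete, here is a rough userspace-compilable
sketch of the intended behaviour. It is not the driver code: the enum,
the two structures and the function name are hypothetical stand-ins, and
only the initial_bytes/dirty_bytes fields and the -EINVAL rule come from
this discussion.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical stand-ins for the migration states discussed above. */
enum mig_state {
	MIG_STATE_RUNNING,
	MIG_STATE_PRE_COPY,
	MIG_STATE_STOP_COPY,
};

/* Assumed layout; only the two byte counts come from the thread. */
struct precopy_info {
	uint64_t initial_bytes;
	uint64_t dirty_bytes;
};

/* Hypothetical per-device bookkeeping for the sketch. */
struct mig_dev {
	enum mig_state state;
	uint64_t initial_left;
	uint64_t dirty_left;
};

/*
 * Models the handler behind the renamed VFIO_MIG_GET_PRECOPY_INFO ioctl:
 * the query is valid only in PRE_COPY and fails with -EINVAL otherwise.
 */
static int get_precopy_info(const struct mig_dev *dev,
			    struct precopy_info *info)
{
	if (dev->state != MIG_STATE_PRE_COPY)
		return -EINVAL;

	info->initial_bytes = dev->initial_left;
	info->dirty_bytes = dev->dirty_left;
	return 0;
}
```

With this shape, a caller that queries in STOP_COPY gets -EINVAL and the
info structure is left untouched.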

Please let me know if I missed anything.

I will address the other comments on this series as well and send out a
revised one soon.

Thanks,
Shameer    

Thread overview: 29+ messages
2022-02-28  9:01 [PATCH v6 00/10] vfio/hisilicon: add ACC live migration driver Shameer Kolothum
2022-02-28  9:01 ` [PATCH v6 01/10] crypto: hisilicon/qm: Move the QM header to include/linux Shameer Kolothum
2022-02-28  9:01 ` [PATCH v6 02/10] crypto: hisilicon/qm: Move few definitions to common header Shameer Kolothum
2022-02-28  9:01 ` [PATCH v6 03/10] hisi_acc_qm: Move PCI device IDs " Shameer Kolothum
2022-02-28 17:33   ` Alex Williamson
2022-02-28 20:12     ` Bjorn Helgaas
2022-02-28 20:23       ` Alex Williamson
2022-02-28 20:55         ` Bjorn Helgaas
2022-02-28  9:01 ` [PATCH v6 04/10] hisi_acc_vfio_pci: add new vfio_pci driver for HiSilicon ACC devices Shameer Kolothum
2022-02-28  9:01 ` [PATCH v6 05/10] hisi_acc_vfio_pci: Restrict access to VF dev BAR2 migration region Shameer Kolothum
2022-02-28  9:01 ` [PATCH v6 06/10] hisi_acc_vfio_pci: Add helper to retrieve the struct pci_driver Shameer Kolothum
2022-02-28  9:01 ` [PATCH v6 07/10] vfio: Extend the device migration protocol with PRE_COPY Shameer Kolothum
2022-02-28  9:01 ` [PATCH v6 08/10] crypto: hisilicon/qm: Set the VF QM state register Shameer Kolothum
2022-02-28  9:01 ` [PATCH v6 09/10] hisi_acc_vfio_pci: Add support for VFIO live migration Shameer Kolothum
2022-02-28 14:57   ` Jason Gunthorpe
2022-02-28 18:01     ` Shameerali Kolothum Thodi
2022-02-28 18:05       ` Jason Gunthorpe
2022-02-28 20:16         ` Alex Williamson
2022-02-28 20:29           ` Jason Gunthorpe
2022-02-28 21:20             ` Alex Williamson
2022-02-28 23:47               ` Jason Gunthorpe
2022-03-01  4:41                 ` Alex Williamson
2022-03-01 13:15                   ` Jason Gunthorpe
2022-03-01 19:30                     ` Alex Williamson
2022-03-01 20:39                       ` Jason Gunthorpe
2022-03-01 22:44                         ` Alex Williamson
2022-03-02  0:03                           ` Jason Gunthorpe
2022-03-02  9:07                             ` Shameerali Kolothum Thodi [this message]
2022-02-28  9:01 ` [PATCH v6 10/10] hisi_acc_vfio_pci: Use its own PCI reset_done error handler Shameer Kolothum
