From: Joao Martins <joao.m.martins@oracle.com>
To: Jason Gunthorpe <jgg@nvidia.com>
Cc: Shameerali Kolothum Thodi <shameerali.kolothum.thodi@huawei.com>,
	"kvm@vger.kernel.org" <kvm@vger.kernel.org>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"linux-crypto@vger.kernel.org" <linux-crypto@vger.kernel.org>,
	"alex.williamson@redhat.com" <alex.williamson@redhat.com>,
	"mgurtovoy@nvidia.com" <mgurtovoy@nvidia.com>,
	Linuxarm <linuxarm@huawei.com>,
	liulongfang <liulongfang@huawei.com>,
	"Zengtao (B)" <prime.zeng@hisilicon.com>,
	yuzenghui <yuzenghui@huawei.com>,
	Jonathan Cameron <jonathan.cameron@huawei.com>,
	"Wangzhou (B)" <wangzhou1@hisilicon.com>
Subject: Re: [RFC v2 0/4] vfio/hisilicon: add acc live migration driver
Date: Tue, 22 Feb 2022 11:55:55 +0000
Message-ID: <7db79281-e72a-29f8-7192-07b739a63897@oracle.com>
In-Reply-To: <20220215162133.GV4160@nvidia.com>

On 2/15/22 16:21, Jason Gunthorpe wrote:
> On Tue, Feb 15, 2022 at 04:00:35PM +0000, Joao Martins wrote:
>> On 2/14/22 14:06, Jason Gunthorpe wrote:
>>> On Mon, Feb 14, 2022 at 01:34:15PM +0000, Joao Martins wrote:
>>>
>>>> [*] apparently we need to write an invalid entry first, invalidate the {IO}TLB
>>>> and then write the new valid entry. Not sure I understood correctly that this
>>>> is the 'break-before-make' thingie.
>>>
>>> Doesn't that explode if the invalid entry is DMA'd to?
>>>
>> Yes, IIUC. Also, the manual has this note:
> 
> Heh, sounds like "this doesn't work" to me :)
> 
Yeah, but I remember reading in the manual that HTTUD (ARM's name for hardware
dirty tracking; DBM is another term for the same thing) requires FEAT_BBM,
which spares us from playing the above games. So, supposedly, we can "just"
use atomics for the IOPTE changes plus an IOTLB flush. Not sure whether we need
that flush before or after on smmuv3.
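
To make the "atomics plus IOTLB flush" idea concrete, here is a rough
userspace sketch in C11. Everything in it is an assumption for illustration:
the bit layout (IOPTE_VALID, IOPTE_DIRTY) is made up and is not the real SMMUv3
descriptor format, iotlb_flush_stub() is a stand-in for a real invalidation,
and the flush is placed after the clear only because that is one of the two
orderings in question, not because the manual confirms it.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical layout, NOT the real SMMUv3 descriptor format. */
#define IOPTE_VALID (UINT64_C(1) << 0)
#define IOPTE_DIRTY (UINT64_C(1) << 1)

/* Stand-in for a hardware IOTLB invalidation; it only counts calls. */
static unsigned int iotlb_flushes;

static void iotlb_flush_stub(void)
{
	iotlb_flushes++;
}

/*
 * Atomically read and clear the dirty bit. The entry stays valid the
 * whole time, so there is no window in which DMA hits an invalid entry
 * (unlike the write-invalid, flush, write-valid sequence).
 * Returns true if the entry was dirty.
 */
static bool iopte_test_and_clear_dirty(_Atomic uint64_t *pte)
{
	uint64_t old = atomic_fetch_and(pte, ~IOPTE_DIRTY);

	if (!(old & IOPTE_DIRTY))
		return false;

	/* Assumed ordering: flush after the clear, so a cached dirty
	 * entry does not hide the next write. */
	iotlb_flush_stub();
	return true;
}
```

The point is only that a single atomic read-modify-write can harvest the dirty
bit while the mapping stays live; where the flush really belongs on smmuv3 is
still the open question.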

>>> Like I said, I'd prefer we not build more on the VFIO type 1 code
>>> until we have a conclusion for iommufd..
>>>
>>
>> I didn't quite understand what you mean by conclusion.
> 
> If people are dead-set against doing iommufd, then lets abandon the
> idea and go back to hacking up vfio.
>  
Heh, I was under the impression everybody was investing so much *because*
the direction was already set towards iommufd.

>> If by conclusion you mean the whole thing to be merged, how can the work be
>> broken up to pieces if we busy-waiting on the new subsystem? Or maybe you meant
>> in terms of direction...
> 
> I think go ahead and build it on top of iommufd, start working out the
> API details, etc. I think once the direction is concluded the new APIs
> will go forward.
>
/me nods, will do. Looking at your repository it is looking good.

>>> While returning the dirty data looks straight forward, it is hard to
>>> see an obvious path to enabling and controlling the system iommu the
>>> way vfio is now.
>>
>> It seems strange to have a whole UAPI for userspace [*] meant to
>> return dirty data to userspace, when dirty right now means the whole
>> pinned page set and so copying the whole guest ... 
> 
> Yes, the whole thing is only partially implemented, and doesn't have
> any in-kernel user. It is another place holder for an implementation
> to come someday.
> 
Yeap, seems like it.

>> Hence my thinking was that the patches /if small/ would let us see how dirty
>> tracking might work for iommu kAPI (and iommufd) too.
> 
> It could be tried, but I think if you go into there you will find it
> quickly turns quite complicated to address all the edge cases. Eg what
> do you do if you have a mdev present after you turn on system
> tracking? What if the mdev is using a PASID?
> What about hotplug of new VFIO devices?
> 
> Remember, dirty tracking for vfio is totally useless without also
> having vfio device migration. 

Oh yes -- I am definitely aware. IOMMU/device dirty tracking is useless
if we can't do the device part first. But if quiescing DMA and saving
state are the two hard requirements for a live-migratable VF, I suspect
dirty tracking in the devices themselves might be rarer.
So perhaps people will look at IOMMUs as a commodity workaround to avoid
a whole bunch of hardware logic for dirty tracking, even accepting what it
entails for DMA performance (hisilicon might be an example).

> Do you already have a migration capable
> device to use with this?
> 
Not yet, but soon I hope.

>> Would it be better to do more iterative steps (when possible) as opposed to
>> scratch and rebuild VFIO type1 IOMMU handling?
> 
> Possibly, but every thing that gets added has to be carried over to
> the new code too, and energy has to be expended trying to figure out
> how the half implemented stuff should work while finishing it.
> 
/me nods, I understand.

> At the very least we must decide what to do with device-provided dirty
> tracking before the VFIO type1 stuff can be altered to use the system
> IOMMU.
> 
I, too, have been wondering what that is going to look like -- and how we
convey the setup of dirty tracking versus the steering of it.

> This is very much like the migration FSM, the only appeal is the
> existing qemu implementation of the protocol.

Yeah.
