All of lore.kernel.org
 help / color / mirror / Atom feed
From: Jason Gunthorpe <jgg@nvidia.com>
To: Joao Martins <joao.m.martins@oracle.com>
Cc: Shameerali Kolothum Thodi <shameerali.kolothum.thodi@huawei.com>,
	"kvm@vger.kernel.org" <kvm@vger.kernel.org>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"linux-crypto@vger.kernel.org" <linux-crypto@vger.kernel.org>,
	"alex.williamson@redhat.com" <alex.williamson@redhat.com>,
	"mgurtovoy@nvidia.com" <mgurtovoy@nvidia.com>,
	Linuxarm <linuxarm@huawei.com>,
	liulongfang <liulongfang@huawei.com>,
	"Zengtao (B)" <prime.zeng@hisilicon.com>,
	yuzenghui <yuzenghui@huawei.com>,
	Jonathan Cameron <jonathan.cameron@huawei.com>,
	"Wangzhou (B)" <wangzhou1@hisilicon.com>
Subject: Re: iommufd(+vfio-compat) dirty tracking (Was: Re: [RFC v2 0/4] vfio/hisilicon: add acc live migration driver)
Date: Tue, 15 Mar 2022 16:29:52 -0300	[thread overview]
Message-ID: <20220315192952.GN11336@nvidia.com> (raw)
In-Reply-To: <8448d7fb-3808-c4e8-66cf-4a3184c24ec0@oracle.com>

On Fri, Mar 11, 2022 at 01:51:32PM +0000, Joao Martins wrote:
> On 2/28/22 13:01, Joao Martins wrote:
> > On 2/25/22 20:44, Jason Gunthorpe wrote:
> >> On Fri, Feb 25, 2022 at 07:18:37PM +0000, Joao Martins wrote:
> >>> On 2/23/22 01:03, Jason Gunthorpe wrote:
> >>>> On Tue, Feb 22, 2022 at 11:55:55AM +0000, Joao Martins wrote:
> >>> I'll be simplifying the interface in the other type1 series I had and making it
> >>> a simple iommu_set_feature(domain, flag, value) behind an ioctl for iommufd that can
> >>> enable/disable over a domain. Perhaps same trick could be expanded to other
> >>> features to have a bit more control on what userspace is allowed to use. I think
> >>> this just needs to set/clear a feature bit in the domain, for VFIO or userspace
> >>> to have full control during the different stages of migration of dirty tracking.
> >>> In all of the IOMMU implementations/manuals I read it means setting a protection
> >>> domain descriptor flag: AMD is a 2-bit field in the DTE, on Intel likewise but on
> >>> the PASID table only for scalable-mode PTEs, on SMMUv3.2 there's an equivalent
> >>> (albeit past work had also it always-on).
> >>>
> >>> Provided the iommufd does /separately/ more finer granularity on what we can
> >>> do with page tables. Thus the VMM can demote/promote the ioptes to a lower page size
> >>> at will as separate operations, before and after migration respectivally. That logic
> >>> would probably be better to be in separate iommufd ioctls(), as that it's going to be
> >>> expensive.
> >>
> >> This all sounds right to me
> >>
> >> Questions I have:
> >>  - Do we need ranges for some reason? You mentioned ARM SMMU wants
> >>    ranges? how/what/why?
> >>
> > Ignore that. I got mislead by the implementation and when I read the SDM
> > I realized that the implementation was doing the same thing I was doing
> > i.e. enabling dirty-bit in the protection domain at start rather than
> > dynamic toggling. So ARM is similar to Intel/AMD in which you set CD.HD
> > bit in the context descriptor to enable dirty bits or the STE.S2HD in the
> > stream table entry for the stage2 equivalent. Nothing here is per-range
> > basis. And the ranges was only used by the implementation for the eager
> > splitting/merging of IO page table levels.
> > 
> >>  - What about the unmap and read dirty without races operation that
> >>    vfio has?
> >>
> > I am afraid that might need a new unmap iommu op that reads the dirty bit
> > after clearing the page table entry. It's marshalling the bits from
> > iopte into a bitmap as opposed to some logic added on top. As far as I
> > looked for AMD this isn't difficult to add, (same for Intel) it can
> > *I think* reuse all of the unmap code.
> > 
> 
> OK, made some progress.

Nice!

 
> It's a WIP (here be dragons!) and still missing things e.g. iommufd selftests,
> revising locking, bugs, and more -- works with my emulated qemu patches which
> is a good sign. But hopefully starts some sort of skeleton of what we were
> talking about in the above thread.

I'm a bit bogged with the coming merge window right now, but will have
more to say in a bit
 
> The bigger TODO, though is the equivalent UAPI for IOMMUFD; I started with
> the vfio-compat one as it was easier provided there's existing userspace to work
> with (Qemu). To be fair the API is not that "far" from what would be IOMMUFD onto
> steering the dirty tracking, read-clear the dirty bits, unmap and
> get dirty. 

I think this is fine to start experimenting with

> But as we discussed earlier, the one gap of VFIO dirty API is that
> it lacks controls for upgrading/downgrading area/IOPTE sizes which
> is where IOMMUFD would most likely shine. When that latter part is
> done we can probably adopt an eager-split approach inside
> vfio-compat.

I think the native API should be new ioctls that operate on the
hw_pagetable object to:
  - enable/disable dirty tracking 
  - read&clear a bitmap from a range
  - read&unmap a bitmap from a range
  - Manipulate IOPTE sizes

As you say it should not be much distance from the VFIO compat stuff

Most probably I would say to leave dirty tracking out of the type1 api
and compat for it. Maybe we can make some limited cases work back
compat, like the whole ioas supports iommu dirty tracking or
something..

Need to understand if it is wortwhile - remember to use any of this
you need a qemu that is updated to the v2 migration interface,
so there is little practical value in back compat to old qemu if we
expect qemu will use the native interface anyhow.

> Additionally I also sort of want to skeleton ARM and Intel to see how it looks.
> Some of the commits made notes of some of research I made, so *I think* the APIs
> introduced capture h/w semantics for all the three IOMMUs supporting dirty
> tracking.

I think the primitives are pretty basic, I'd be surprised if there is
something different :)

Though things to be thinking about are how does this work with nested
translation and other advanced features.

Thanks,
Jason

  reply	other threads:[~2022-03-15 19:29 UTC|newest]

Thread overview: 55+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-07-02  9:58 [RFC v2 0/4] vfio/hisilicon: add acc live migration driver Shameer Kolothum
2021-07-02  9:58 ` [RFC v2 1/4] hisi-acc-vfio-pci: add new vfio_pci driver for HiSilicon ACC devices Shameer Kolothum
2021-07-02 20:29   ` Alex Williamson
2021-07-05  7:20     ` Shameerali Kolothum Thodi
2021-07-04  7:03   ` Leon Romanovsky
2021-07-05  8:47     ` Shameerali Kolothum Thodi
2021-07-05  9:41       ` Max Gurtovoy
2021-07-05 10:18         ` Shameerali Kolothum Thodi
2021-07-05 18:27           ` Leon Romanovsky
2021-07-05 18:32             ` Jason Gunthorpe
2021-07-06  3:59               ` Leon Romanovsky
2021-07-06  4:39               ` Christoph Hellwig
2021-07-06 11:51                 ` Jason Gunthorpe
2021-07-02  9:58 ` [RFC v2 2/4] hisi_acc_vfio_pci: Override ioctl method to limit BAR2 region size Shameer Kolothum
2021-07-02 20:29   ` Alex Williamson
2021-07-05  7:22     ` Shameerali Kolothum Thodi
2021-07-02  9:58 ` [RFC v2 3/4] crypto: hisilicon/qm - Export mailbox functions for common use Shameer Kolothum
2021-07-04  9:34   ` Max Gurtovoy
2021-07-05 10:23     ` Shameerali Kolothum Thodi
2021-07-02  9:58 ` [RFC v2 4/4] hisi_acc_vfio_pci: Add support for vfio live migration Shameer Kolothum
2022-02-02 13:14 ` [RFC v2 0/4] vfio/hisilicon: add acc live migration driver Jason Gunthorpe
2022-02-02 14:34   ` Shameerali Kolothum Thodi
2022-02-02 15:39     ` Jason Gunthorpe
2022-02-02 16:10       ` Shameerali Kolothum Thodi
2022-02-02 17:03         ` Jason Gunthorpe
2022-02-02 19:05           ` Joao Martins
2022-02-03 15:18             ` Jason Gunthorpe
2022-02-04 19:53               ` Joao Martins
2022-02-04 23:07                 ` Jason Gunthorpe
2022-02-11 17:28                   ` Joao Martins
2022-02-11 17:49                     ` Jason Gunthorpe
2022-02-11 21:43                       ` Joao Martins
2022-02-12  0:01                         ` Jason Gunthorpe
2022-02-14 13:34                           ` Joao Martins
2022-02-14 14:06                             ` Jason Gunthorpe
2022-02-15 16:00                               ` Joao Martins
2022-02-15 16:21                                 ` Jason Gunthorpe
2022-02-22 11:55                                   ` Joao Martins
2022-02-23  1:03                                     ` Jason Gunthorpe
2022-02-25 19:18                                       ` Joao Martins
2022-02-25 20:44                                         ` Jason Gunthorpe
2022-02-28 13:01                                           ` Joao Martins
2022-02-28 21:01                                             ` Jason Gunthorpe
2022-03-01 13:06                                               ` Joao Martins
2022-03-01 13:54                                                 ` Jason Gunthorpe
2022-03-01 14:27                                                   ` Joao Martins
2022-03-11 13:51                                             ` iommufd(+vfio-compat) dirty tracking (Was: Re: [RFC v2 0/4] vfio/hisilicon: add acc live migration driver) Joao Martins
2022-03-15 19:29                                               ` Jason Gunthorpe [this message]
2022-03-16 16:36                                                 ` iommufd(+vfio-compat) dirty tracking Joao Martins
2022-03-16 20:37                                                   ` Joao Martins
2022-03-18 17:12                                                     ` Joao Martins
2022-03-18 17:34                                                       ` Jason Gunthorpe
2022-02-02 17:30     ` [RFC v2 0/4] vfio/hisilicon: add acc live migration driver Alex Williamson
2022-02-02 18:04       ` Jason Gunthorpe
2022-02-18 16:37 ` Jason Gunthorpe

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20220315192952.GN11336@nvidia.com \
    --to=jgg@nvidia.com \
    --cc=alex.williamson@redhat.com \
    --cc=joao.m.martins@oracle.com \
    --cc=jonathan.cameron@huawei.com \
    --cc=kvm@vger.kernel.org \
    --cc=linux-crypto@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linuxarm@huawei.com \
    --cc=liulongfang@huawei.com \
    --cc=mgurtovoy@nvidia.com \
    --cc=prime.zeng@hisilicon.com \
    --cc=shameerali.kolothum.thodi@huawei.com \
    --cc=wangzhou1@hisilicon.com \
    --cc=yuzenghui@huawei.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.