From: Robin Murphy <robin.murphy@arm.com>
To: Jason Gunthorpe <jgg@ziepe.ca>
Cc: "Tian, Kevin" <kevin.tian@intel.com>,
	Joerg Roedel <joro@8bytes.org>,
	"kvm@vger.kernel.org" <kvm@vger.kernel.org>,
	Will Deacon <will@kernel.org>,
	Kirti Wankhede <kwankhede@nvidia.com>,
	Alex Williamson <alex.williamson@redhat.com>,
	David Woodhouse <dwmw2@infradead.org>,
	Christoph Hellwig <hch@lst.de>,
Subject: Re: [PATCH 3/6] vfio: remove the unused mdev iommu hook
Date: Mon, 24 May 2021 19:18:33 +0100
Message-ID: <9d34b473-3a37-5de2-95f8-b508d85e558c@arm.com>
In-Reply-To: <20210520143420.GW1096940@ziepe.ca>

On 2021-05-20 15:34, Jason Gunthorpe wrote:
> On Thu, May 20, 2021 at 03:13:55PM +0100, Robin Murphy wrote:
>> By "mdev-like" I mean it's very similar in shape to the general SIOV-style
>> mediated device concept - i.e. a physical device with an awareness of
>> operating on multiple contexts at once, using a Substream ID/PASID for each
>> one - but instead of exposing control of the contexts to anyone else, they
>> remain hidden behind the kernel driver which already has its own abstracted
>> uAPI, so overall it ends up as more just internal housekeeping than any
>> actual mediation. We were looking at the mdev code for inspiration, but
>> directly using it was never the plan.
> Well:
>   - Who maps memory into the IOASID (ie the specific sub stream id)?

Sorry to nitpick, but I think it's important to get terminology right 
here to avoid unnecessary misunderstanding. You can't map memory into an 
address space ID; it's just a number. Ultimately that identifier ends up 
pointing at some actual address space, and most of the current work is 
focused on the case of that address space being provided by an mm where 
things are mapped implicitly by a userspace process; I care about the 
case of it being provided by an iommu_domain where things are mapped 
explicitly by a kernel driver. I would be extremely wary of creating 
some new third *address space* abstraction.
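
To illustrate the distinction, here is a minimal userspace sketch, not
kernel code. Every name in it (as_provider, model_mm, model_domain,
id_table, bind_id_to_mm, bind_id_to_domain, map_via_id) is hypothetical
and exists only for this example. It assumes an ID is just an index that
resolves to one of the two existing kinds of address space, and that
explicit mapping only makes sense when the ID resolves to a domain.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical userspace model; none of these are real kernel types. */
enum as_provider {
	AS_NONE = 0,      /* ID is not bound to any address space */
	AS_MM,            /* mappings are implicit, owned by a userspace process */
	AS_IOMMU_DOMAIN,  /* mappings are explicit, made by a kernel driver */
};

struct model_mm { int owner_pid; };
struct model_domain { unsigned int nr_mappings; };

struct address_space {
	enum as_provider provider;
	union {
		struct model_mm *mm;
		struct model_domain *domain;
	} u;
};

#define MAX_IDS 16
static struct address_space id_table[MAX_IDS];

/* The ID is only an index; binding points it at a real address space. */
static int bind_id_to_mm(uint32_t id, struct model_mm *mm)
{
	if (id >= MAX_IDS || !mm || id_table[id].provider != AS_NONE)
		return -1;
	id_table[id].provider = AS_MM;
	id_table[id].u.mm = mm;
	return 0;
}

static int bind_id_to_domain(uint32_t id, struct model_domain *domain)
{
	if (id >= MAX_IDS || !domain || id_table[id].provider != AS_NONE)
		return -1;
	id_table[id].provider = AS_IOMMU_DOMAIN;
	id_table[id].u.domain = domain;
	return 0;
}

/*
 * Mapping acts on the address space the ID resolves to, never on the ID
 * itself. In this model only a domain accepts explicit mappings.
 */
static int map_via_id(uint32_t id)
{
	if (id >= MAX_IDS || id_table[id].provider != AS_IOMMU_DOMAIN)
		return -1;
	id_table[id].u.domain->nr_mappings++;
	return 0;
}
```

An unbound ID, or one that resolves to an mm, rejects the explicit map in
this model, which is the point: there are two providers, not a third
abstraction hiding behind the number.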

>   - What memory must be mapped?
>   - Who triggers DMA to this memory?

It's a pretty typical DMA flow, as far as I understand. Userspace 
allocates some buffers (in this case, via the kernel driver, but in 
general I'm not sure it makes much difference), puts data in the 
buffers, issues an ioctl to say "process this data", and polls for 
completion; the kernel driver makes sure the buffers are mapped in the 
device address space (at allocation time in this case, but in general I 
assume it could equally be done at request time for user pages), and 
deals with scheduling requests onto the hardware. I understand this 
interface is already deployed in a driver stack which supports a single 
client process at once; extending the internals to allow requests from 
multiple processes to run in parallel using Substream IDs for isolation 
is the future goal. The interface itself shouldn't change, only some 
internal arbitration details.
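
As a rough sketch of that flow, the following userspace model may help.
It is not the driver in question and uses no kernel APIs; the names
(client, buf, job, client_open, client_alloc_buf, client_submit), the
table sizes, and the IOVA layout are all assumptions made up for
illustration. It assumes each client gets its own Substream ID and its
own private set of device mappings, that buffers are mapped at allocation
time, and that submission only tags the job with the client's Substream
ID.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_CLIENTS 4
#define MAX_BUFS 8

/* One buffer as seen in the device address space (hypothetical model). */
struct buf {
	uint64_t iova;
	size_t len;
	bool mapped;
};

/* Per-process state: bufs[] stands in for the client's private domain. */
struct client {
	bool in_use;
	uint32_t ssid;
	struct buf bufs[MAX_BUFS];
};

/* What the scheduler would hand to the hardware. */
struct job {
	uint32_t ssid;
	uint64_t iova;
	size_t len;
};

static struct client clients[MAX_CLIENTS];

/* Each new client gets a distinct, non-zero Substream ID. */
static struct client *client_open(void)
{
	for (uint32_t i = 0; i < MAX_CLIENTS; i++) {
		if (!clients[i].in_use) {
			clients[i] = (struct client){ .in_use = true, .ssid = i + 1 };
			return &clients[i];
		}
	}
	return NULL;
}

/*
 * Allocation also maps the buffer for the device. The driver, not
 * userspace, picks the device address; userspace only gets a handle.
 */
static int client_alloc_buf(struct client *c, size_t len)
{
	if (!c || len == 0)
		return -1;
	for (int i = 0; i < MAX_BUFS; i++) {
		if (!c->bufs[i].mapped) {
			c->bufs[i] = (struct buf){
				.iova = 0x100000ull * (uint64_t)(i + 1),
				.len = len,
				.mapped = true,
			};
			return i;
		}
	}
	return -1;
}

/* "Process this data": validate the handle, then tag the job with the SSID. */
static int client_submit(const struct client *c, int handle, struct job *out)
{
	if (!c || !out || handle < 0 || handle >= MAX_BUFS)
		return -1;
	if (!c->bufs[handle].mapped)
		return -1;
	out->ssid = c->ssid;
	out->iova = c->bufs[handle].iova;
	out->len = c->bufs[handle].len;
	return 0;
}
```

Note that the userspace-visible shape (allocate, submit, poll) is the same
whether there is one client or several; only the internal bookkeeping
behind client_open() and client_submit() grows.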

>> The driver simply needs to keep track of the domains and PASIDs -
>> when a process submits some work, it can look up the relevant
>> domain, iommu_map() the user pages to the right addresses, dma_map()
>> them for coherency, then poke in the PASID as part of scheduling the
>> work on the physical device.
> If you are doing stuff like this then the /dev/ioasid is what you
> actually want. The userprocess can create its own IOASID, program the
> io page tables for that IOASID to point to pages as it wants and then
> just hand over a fully instantiated io page table to the device
> driver.

No. In our case, the device does not need to operate on userspace 
addresses, in fact quite the opposite. There may need to be additional 
things mapped into the device address space which are not, and should 
not be, visible to userspace. There are also some quite weird criteria 
for optimal address space layout which frankly are best left hidden 
inside the kernel driver. Said driver is already explicitly managing its 
own iommu_domain in the same manner as various DRM drivers and others, 
so growing that to multiple parallel domains really isn't a big leap. 
Moving any of this responsibility into userspace would be unwanted and 
unnecessary upheaval.

> What you are describing is the literal use case of /dev/ioasid - a
> clean separation of managing the IOMMU related parts through
> /dev/ioasid and the device driver itself is only concerned with
> generating device DMA that has the proper PASID/substream tag.
> The entire point is to not duplicate all the iommu code you are
> describing having written into every driver that just wants an IOASID.
> In particular, you are talking about having a substream capable device
> and driver but your driver's uAPI is so limited it can't address the
> full range of substream configurations:
>   - A substream pointing at a SVA
>   - A substream pointing at an IO page table nested under another
>   - A substream pointing at an IOMMU page table shared by many users
> And more. Which is bad.

None of which makes much, if any, sense for the way this device and the 
rest of its software stack are designed to work, though. Anyway, the 
actual uAPI in question is essentially just chucking buffer fds about in 
a very abstract manner, so I don't see that it has any relevance here. 
We're talking about a kernel driver *internally* managing how it chooses 
to expose the buffers backing those fds to the hardware. SVA has no 
meaning in that context (there's nothing to share), and I don't even 
understand your second case, but attaching multiple SSIDs to a single 
domain is absolutely something which _could_ be done, there's just zero 
point in a single driver doing that privately when it could simply run 
the relevant jobs under the same SSID instead.

>>> We already talked about this on the "how to use PASID from the kernel"
>>> thread.
>> Do you have a pointer to the right thread so I can catch up? It's not the
>> easiest thing to search for on lore amongst all the other PASID-related
>> business :(
> Somewhere in here:
> http://lore.kernel.org/r/20210517143758.GP1002214@nvidia.com

Thanks; along with our discussion here, that kind of confirms my concern. 
Assuming IOASID can wrap up a whole encapsulated thing which is either 
SVA or IOMMU_DOMAIN_DMA is an overabstraction. There 
definitely *are* uses for IOMMU_DOMAIN_DMA - say you want to put some 
SIOV ADIs to work for the host kernel using their regular 
non-IOMMU-aware driver - but there will also be cases for 
IOMMU_DOMAIN_UNMANAGED, although I do mostly expect those to be SoC 
devices whose drivers are already IOMMU-aware and just want to be so at 
a finer-grained level, not PCI devices. Even IOMMU_DOMAIN_IDENTITY 
(i.e. passthrough) for IOASIDs _could_ be doable if a sufficiently 
compelling reason came along. I agree that SVA on init_mm is pretty 
bonkers, but don't get too 
hung up on the DMA API angle which is really orthogonal - passthrough 
domains with dma-direct ops have been working fine for years.

>> FWIW my non-SVA view is that a PASID is merely an index into a set of
>> iommu_domains, and in that context it doesn't even really matter *who*
>> allocates them, only that the device driver and IOMMU driver are in sync :)
> Right, this is where /dev/ioasid is going.
> However it gets worked out at the kAPI level in the iommu layer the
> things you asked for are intended to be solved, and lots more.

Great! It feels like one of the major things will be that, at least 
without major surgery to the DMA API, most of the use-cases will likely 
still need a struct device wrapped around the IOASID. I think the 
particular one I want to solve is actually the odd one out in that it 
doesn't really care, and could be made to work either way.

