linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Jason Gunthorpe <jgg@nvidia.com>
To: David Gibson <david@gibson.dropbear.id.au>
Cc: Liu Yi L <yi.l.liu@intel.com>,
	alex.williamson@redhat.com, hch@lst.de, jasowang@redhat.com,
	joro@8bytes.org, jean-philippe@linaro.org, kevin.tian@intel.com,
	parav@mellanox.com, lkml@metux.net, pbonzini@redhat.com,
	lushenming@huawei.com, eric.auger@redhat.com, corbet@lwn.net,
	ashok.raj@intel.com, yi.l.liu@linux.intel.com,
	jun.j.tian@intel.com, hao.wu@intel.com, dave.jiang@intel.com,
	jacob.jun.pan@linux.intel.com, kwankhede@nvidia.com,
	robin.murphy@arm.com, kvm@vger.kernel.org,
	iommu@lists.linux-foundation.org, dwmw2@infradead.org,
	linux-kernel@vger.kernel.org, baolu.lu@linux.intel.com,
	nicolinc@nvidia.com
Subject: Re: [RFC 11/20] iommu/iommufd: Add IOMMU_IOASID_ALLOC/FREE
Date: Mon, 11 Oct 2021 15:49:14 -0300	[thread overview]
Message-ID: <20211011184914.GQ2744544@nvidia.com> (raw)
In-Reply-To: <YWPTWdHhoI4k0Ksc@yekko>

On Mon, Oct 11, 2021 at 05:02:01PM +1100, David Gibson wrote:

> > This means we cannot define an input that has a magic HW specific
> > value.
> 
> I'm not entirely sure what you mean by that.

I mean if you make a general property 'foo' that userspace must
specify correctly then your API isn't general anymore. Userspace must
know if it is A or B HW to set foo=A or foo=B.

Supported IOVA ranges are easially like that as every IOMMU is
different. So DPDK shouldn't provide such specific or binding
information.

> No, I don't think that needs to be a condition.  I think it's
> perfectly reasonable for a constraint to be given, and for the host
> IOMMU to just say "no, I can't do that".  But that does mean that each
> of these values has to have an explicit way of userspace specifying "I
> don't care", so that the kernel will select a suitable value for those
> instead - that's what DPDK or other userspace would use nearly all the
> time.

My feeling is that qemu should be dealing with the host != target
case, not the kernel.

The kernel's job should be to expose the IOMMU HW it has, with all
features accessible, to userspace.

Qemu's job should be to have a userspace driver for each kernel IOMMU
and the internal infrastructure to make accelerated emulations for all
supported target IOMMUs.

In other words, it is not the kernel's job to provide target IOMMU
emulation.

The kernel should provide truely generic "works everywhere" interface
that qemu/etc can rely on to implement the least accelerated emulation
path.

So when I see proposals to have "generic" interfaces that actually
require very HW specific setup, and cannot be used by a generic qemu
userpace driver, I think it breaks this model. If qemu needs to know
it is on PPC (as it does today with VFIO's PPC specific API) then it
may as well speak PPC specific language and forget about pretending to
be generic.

This approach is grounded in 15 years of trying to build these
user/kernel split HW subsystems (particularly RDMA) where it has
become painfully obvious that the kernel is the worst place to try and
wrangle really divergent HW into a "common" uAPI.

This is because the kernel/user boundary is fixed. Introducing
anything generic here requires a lot of time, thought, arguing and
risk. Usually it ends up being done wrong (like the PPC specific
ioctls, for instance) and when this happens we can't learn and adapt,
we are stuck with stable uABI forever.

Exposing a device's native programming interface is much simpler. Each
device is fixed, defined and someone can sit down and figure out how
to expose it. Then that is it, it doesn't need revisiting, it doesn't
need harmonizing with a future slightly different device, it just
stays as is.

The cost, is that there must be a userspace driver component for each
HW piece - which we are already paying here!

> Ideally the host /dev/iommu will say "ok!", since both those ranges
> are within the 0..2^60 translated range of the host IOMMU, and don't
> touch the IO hole.  When the guest calls the IO mapping hypercalls,
> qemu translates those into DMA_MAP operations, and since they're all
> within the previously verified windows, they should work fine.

For instance, we are going to see HW with nested page tables, user
space owned page tables and even kernel-bypass fast IOTLB
invalidation.

In that world does it even make sense for qmeu to use slow DMA_MAP
ioctls for emulation?

A userspace framework in qemu can make these optimizations and is
also necessarily HW specific as the host page table is HW specific..

Jason

  parent reply	other threads:[~2021-10-11 18:49 UTC|newest]

Thread overview: 274+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-09-19  6:38 [RFC 00/20] Introduce /dev/iommu for userspace I/O address space management Liu Yi L
2021-09-19  6:38 ` [RFC 01/20] iommu/iommufd: Add /dev/iommu core Liu Yi L
2021-09-21 15:41   ` Jason Gunthorpe
2021-09-22  1:51     ` Tian, Kevin
2021-09-22 12:40       ` Jason Gunthorpe
2021-09-22 13:59         ` Tian, Kevin
2021-09-22 14:10           ` Jason Gunthorpe
2021-10-15  9:18     ` Liu, Yi L
2021-10-15 11:18       ` Jason Gunthorpe
2021-10-15 11:29         ` Liu, Yi L
2021-10-19 16:57         ` Jacob Pan
2021-10-19 16:57           ` Jason Gunthorpe
2021-10-19 17:11             ` Jacob Pan
2021-10-19 17:12               ` Jason Gunthorpe
2021-09-19  6:38 ` [RFC 02/20] vfio: Add device class for /dev/vfio/devices Liu Yi L
2021-09-21 15:57   ` Jason Gunthorpe
2021-09-21 23:56     ` Tian, Kevin
2021-09-22  0:55       ` Jason Gunthorpe
2021-09-22  1:07         ` Tian, Kevin
2021-09-22 12:31           ` Jason Gunthorpe
2021-09-22  3:22         ` Tian, Kevin
2021-09-22 12:50           ` Jason Gunthorpe
2021-09-22 14:09             ` Tian, Kevin
2021-09-21 19:56   ` Alex Williamson
2021-09-22  0:56     ` Tian, Kevin
2021-09-29  2:08   ` David Gibson
2021-09-29 19:05     ` Alex Williamson
2021-09-30  2:43       ` David Gibson
2021-10-20 12:39     ` Liu, Yi L
2021-09-19  6:38 ` [RFC 03/20] vfio: Add vfio_[un]register_device() Liu Yi L
2021-09-21 16:01   ` Jason Gunthorpe
2021-09-21 23:10     ` Tian, Kevin
2021-09-22  0:53       ` Jason Gunthorpe
2021-09-22  0:59         ` Tian, Kevin
2021-09-22  9:23         ` Tian, Kevin
2021-09-22 12:22           ` Jason Gunthorpe
2021-09-22 13:44             ` Tian, Kevin
2021-09-22 20:10             ` Alex Williamson
2021-09-22 22:34               ` Tian, Kevin
2021-09-22 22:45                 ` Alex Williamson
2021-09-22 23:45                   ` Tian, Kevin
2021-09-22 23:52                     ` Jason Gunthorpe
2021-09-23  0:38                       ` Tian, Kevin
2021-09-22 23:56               ` Jason Gunthorpe
2021-09-22  0:54     ` Tian, Kevin
2021-09-22  1:00       ` Jason Gunthorpe
2021-09-22  1:02         ` Tian, Kevin
2021-09-23  7:25         ` Eric Auger
2021-09-23 11:44           ` Jason Gunthorpe
2021-09-29  2:46         ` david
2021-09-29 12:22           ` Jason Gunthorpe
2021-09-30  2:48             ` david
2021-09-29  2:43   ` David Gibson
2021-09-29  3:40     ` Tian, Kevin
2021-09-29  5:30     ` Tian, Kevin
2021-09-29  7:08       ` Cornelia Huck
2021-09-29 12:15         ` Jason Gunthorpe
2021-09-19  6:38 ` [RFC 04/20] iommu: Add iommu_device_get_info interface Liu Yi L
2021-09-21 16:19   ` Jason Gunthorpe
2021-09-22  2:31     ` Lu Baolu
2021-09-22  5:07       ` Christoph Hellwig
2021-09-29  2:52   ` David Gibson
2021-09-29  9:25     ` Lu Baolu
2021-09-29  9:29       ` Lu Baolu
2021-09-19  6:38 ` [RFC 05/20] vfio/pci: Register device to /dev/vfio/devices Liu Yi L
2021-09-21 16:40   ` Jason Gunthorpe
2021-09-21 21:09     ` Alex Williamson
2021-09-21 21:58       ` Jason Gunthorpe
2021-09-22  1:24         ` Tian, Kevin
2021-09-22  1:19       ` Tian, Kevin
2021-09-22 21:17         ` Alex Williamson
2021-09-22 23:49           ` Tian, Kevin
2021-09-19  6:38 ` [RFC 06/20] iommu: Add iommu_device_init[exit]_user_dma interfaces Liu Yi L
2021-09-21 17:09   ` Jason Gunthorpe
2021-09-22  1:47     ` Tian, Kevin
2021-09-22 12:39       ` Jason Gunthorpe
2021-09-22 13:56         ` Tian, Kevin
2021-09-27  9:42         ` Tian, Kevin
2021-09-27 11:34           ` Lu Baolu
2021-09-27 13:08             ` Tian, Kevin
2021-09-27 11:53           ` Jason Gunthorpe
2021-09-27 13:00             ` Tian, Kevin
2021-09-27 13:09               ` Jason Gunthorpe
2021-09-27 13:32                 ` Tian, Kevin
2021-09-27 14:39                   ` Jason Gunthorpe
2021-09-28  7:13                     ` Tian, Kevin
2021-09-28 11:54                       ` Jason Gunthorpe
2021-09-28 23:59                         ` Tian, Kevin
2021-09-27 19:19                   ` Alex Williamson
2021-09-28  7:43                     ` Tian, Kevin
2021-09-28 16:26                       ` Alex Williamson
2021-09-27 15:09           ` Jason Gunthorpe
2021-09-28  7:30             ` Tian, Kevin
2021-09-28 11:57               ` Jason Gunthorpe
2021-09-28 13:35                 ` Lu Baolu
2021-09-28 14:07                   ` Jason Gunthorpe
2021-09-29  0:38                     ` Tian, Kevin
2021-09-29 12:59                       ` Jason Gunthorpe
2021-10-15  1:29                         ` Tian, Kevin
2021-10-15 11:09                           ` Jason Gunthorpe
2021-10-18  1:52                             ` Tian, Kevin
2021-09-29  2:22                     ` Lu Baolu
2021-09-29  2:29                       ` Tian, Kevin
2021-09-29  2:38                         ` Lu Baolu
2021-09-29  4:55   ` David Gibson
2021-09-29  5:38     ` Tian, Kevin
2021-09-29  6:35       ` David Gibson
2021-09-29  7:31         ` Tian, Kevin
2021-09-30  3:05           ` David Gibson
2021-09-29 12:57         ` Jason Gunthorpe
2021-09-30  3:09           ` David Gibson
2021-09-30 22:28             ` Jason Gunthorpe
2021-10-01  3:54               ` David Gibson
2021-09-19  6:38 ` [RFC 07/20] iommu/iommufd: Add iommufd_[un]bind_device() Liu Yi L
2021-09-21 17:14   ` Jason Gunthorpe
2021-10-15  9:21     ` Liu, Yi L
2021-09-29  5:25   ` David Gibson
2021-09-29 12:24     ` Jason Gunthorpe
2021-09-30  3:10       ` David Gibson
2021-10-01 12:43         ` Jason Gunthorpe
2021-10-07  1:23           ` David Gibson
2021-10-07 11:35             ` Jason Gunthorpe
2021-10-11  3:24               ` David Gibson
2021-09-19  6:38 ` [RFC 08/20] vfio/pci: Add VFIO_DEVICE_BIND_IOMMUFD Liu Yi L
2021-09-21 17:29   ` Jason Gunthorpe
2021-09-22 21:01     ` Alex Williamson
2021-09-22 23:01       ` Jason Gunthorpe
2021-09-29  6:00   ` David Gibson
2021-09-29  6:41     ` Tian, Kevin
2021-09-29 12:28       ` Jason Gunthorpe
2021-09-29 22:34         ` Tian, Kevin
2021-09-30  3:12       ` David Gibson
2021-09-19  6:38 ` [RFC 09/20] iommu: Add page size and address width attributes Liu Yi L
2021-09-22 13:42   ` Eric Auger
2021-09-22 14:19     ` Tian, Kevin
2021-09-19  6:38 ` [RFC 10/20] iommu/iommufd: Add IOMMU_DEVICE_GET_INFO Liu Yi L
2021-09-21 17:40   ` Jason Gunthorpe
2021-09-22  3:30     ` Tian, Kevin
2021-09-22 12:41       ` Jason Gunthorpe
2021-09-29  6:18         ` david
2021-09-22 21:24   ` Alex Williamson
2021-09-22 23:49     ` Jason Gunthorpe
2021-09-23  3:10       ` Tian, Kevin
2021-09-23 10:15         ` Jean-Philippe Brucker
2021-09-23 11:27           ` Jason Gunthorpe
2021-09-23 12:05             ` Tian, Kevin
2021-09-23 12:22               ` Jason Gunthorpe
2021-09-29  8:48                 ` Tian, Kevin
2021-09-29 12:36                   ` Jason Gunthorpe
2021-09-30  8:30                     ` Tian, Kevin
2021-09-30 10:33                       ` Jean-Philippe Brucker
2021-09-30 22:04                         ` Jason Gunthorpe
2021-10-01  3:28                           ` hch
2021-10-14  8:13                             ` Tian, Kevin
2021-10-14  8:22                               ` hch
2021-10-14  8:29                                 ` Tian, Kevin
2021-10-14  8:01                         ` Tian, Kevin
2021-10-14  9:16                           ` Jean-Philippe Brucker
2021-09-30  8:49                 ` Tian, Kevin
2021-09-30 13:43                   ` Lu Baolu
2021-10-01  3:24                     ` hch
2021-09-30 22:08                   ` Jason Gunthorpe
2021-09-23 11:36         ` Jason Gunthorpe
     [not found]       ` <BN9PR11MB5433409DF766AAEF1BB2CF258CA39@BN9PR11MB5433.namprd11.prod.outlook.com>
2021-09-23  3:38         ` Tian, Kevin
2021-09-23 11:42           ` Jason Gunthorpe
2021-09-30  9:35             ` Tian, Kevin
2021-09-30 22:23               ` Jason Gunthorpe
2021-10-01  3:30                 ` hch
2021-10-14  9:11                 ` Tian, Kevin
2021-10-14 15:42                   ` Jason Gunthorpe
2021-10-15  1:01                     ` Tian, Kevin
     [not found]                     ` <BN9PR11MB543327BB6D58AEF91AD2C9D18CB99@BN9PR11MB5433.namprd11.prod.outlook.com>
2021-10-21  2:26                       ` Tian, Kevin
2021-10-21 14:58                         ` Jean-Philippe Brucker
2021-10-21 23:22                           ` Jason Gunthorpe
2021-10-22  7:49                             ` Jean-Philippe Brucker
2021-10-25 16:51                               ` Jason Gunthorpe
2021-10-21 23:30                         ` Jason Gunthorpe
2021-10-22  3:08                           ` Tian, Kevin
2021-10-25 23:34                             ` Jason Gunthorpe
2021-10-27  1:42                               ` Tian, Kevin
2021-10-28  2:07                               ` Tian, Kevin
2021-10-29 13:55                                 ` Jason Gunthorpe
2021-09-29  6:23   ` David Gibson
2021-09-19  6:38 ` [RFC 11/20] iommu/iommufd: Add IOMMU_IOASID_ALLOC/FREE Liu Yi L
2021-09-21 17:44   ` Jason Gunthorpe
2021-09-22  3:40     ` Tian, Kevin
2021-09-22 14:09       ` Jason Gunthorpe
2021-09-23  9:14         ` Tian, Kevin
2021-09-23 12:06           ` Jason Gunthorpe
2021-09-23 12:22             ` Tian, Kevin
2021-09-23 12:31               ` Jason Gunthorpe
2021-09-23 12:45                 ` Tian, Kevin
2021-09-23 13:01                   ` Jason Gunthorpe
2021-09-23 13:20                     ` Tian, Kevin
2021-09-23 13:30                       ` Jason Gunthorpe
2021-09-23 13:41                         ` Tian, Kevin
2021-10-01  6:30               ` david
2021-10-01  6:26           ` david
2021-10-01  6:19         ` david
2021-10-01 12:25           ` Jason Gunthorpe
2021-10-02  4:21             ` david
2021-10-02 12:25               ` Jason Gunthorpe
2021-10-11  5:37                 ` david
2021-10-11 17:17                   ` Jason Gunthorpe
2021-10-14  4:33                     ` david
2021-10-14 15:06                       ` Jason Gunthorpe
2021-10-18  3:40                         ` david
2021-10-01  6:15       ` david
2021-09-22 12:51     ` Liu, Yi L
2021-09-22 13:32       ` Jason Gunthorpe
2021-09-23  6:26         ` Liu, Yi L
2021-10-01  6:13     ` David Gibson
2021-10-01 12:22       ` Jason Gunthorpe
2021-10-11  6:02         ` David Gibson
2021-10-11  8:49           ` Jean-Philippe Brucker
2021-10-11 23:38             ` Jason Gunthorpe
2021-10-12  8:33               ` Jean-Philippe Brucker
2021-10-13  7:14                 ` Tian, Kevin
2021-10-13  7:07             ` Tian, Kevin
2021-10-14  4:38             ` David Gibson
2021-10-11 18:49           ` Jason Gunthorpe [this message]
2021-10-14  4:53             ` David Gibson
2021-10-14 14:52               ` Jason Gunthorpe
2021-10-18  3:50                 ` David Gibson
2021-10-18 17:42                   ` Jason Gunthorpe
2021-09-22 13:45   ` Jean-Philippe Brucker
2021-09-29 10:47     ` Liu, Yi L
2021-10-01  6:11   ` David Gibson
2021-10-13  7:00     ` Tian, Kevin
2021-10-14  5:00       ` David Gibson
2021-10-14  6:53         ` Tian, Kevin
2021-10-25  5:05           ` David Gibson
2021-10-27  2:32             ` Tian, Kevin
2021-09-19  6:38 ` [RFC 12/20] iommu/iommufd: Add IOMMU_CHECK_EXTENSION Liu Yi L
2021-09-21 17:47   ` Jason Gunthorpe
2021-09-22  3:41     ` Tian, Kevin
2021-09-22 12:55       ` Jason Gunthorpe
2021-09-22 14:13         ` Tian, Kevin
2021-09-19  6:38 ` [RFC 13/20] iommu: Extend iommu_at[de]tach_device() for multiple devices group Liu Yi L
2021-10-14  5:24   ` David Gibson
2021-10-14  7:06     ` Tian, Kevin
2021-10-18  3:57       ` David Gibson
2021-10-18 16:32         ` Jason Gunthorpe
2021-10-25  5:14           ` David Gibson
2021-10-25 12:14             ` Jason Gunthorpe
2021-10-25 13:16               ` David Gibson
2021-10-25 23:36                 ` Jason Gunthorpe
2021-10-26  9:23                   ` David Gibson
2021-09-19  6:38 ` [RFC 14/20] iommu/iommufd: Add iommufd_device_[de]attach_ioasid() Liu Yi L
2021-09-21 18:02   ` Jason Gunthorpe
2021-09-22  3:53     ` Tian, Kevin
2021-09-22 12:57       ` Jason Gunthorpe
2021-09-22 14:16         ` Tian, Kevin
2021-09-19  6:38 ` [RFC 15/20] vfio/pci: Add VFIO_DEVICE_[DE]ATTACH_IOASID Liu Yi L
2021-09-21 18:04   ` Jason Gunthorpe
2021-09-22  3:56     ` Tian, Kevin
2021-09-22 12:58       ` Jason Gunthorpe
2021-09-22 14:17         ` Tian, Kevin
2021-09-19  6:38 ` [RFC 16/20] vfio/type1: Export symbols for dma [un]map code sharing Liu Yi L
2021-09-21 18:14   ` Jason Gunthorpe
2021-09-22  3:57     ` Tian, Kevin
2021-09-19  6:38 ` [RFC 17/20] iommu/iommufd: Report iova range to userspace Liu Yi L
2021-09-22 14:49   ` Jean-Philippe Brucker
2021-09-29 10:44     ` Liu, Yi L
2021-09-29 12:07       ` Jean-Philippe Brucker
2021-09-29 12:31         ` Jason Gunthorpe
2021-09-19  6:38 ` [RFC 18/20] iommu/iommufd: Add IOMMU_[UN]MAP_DMA on IOASID Liu Yi L
2021-09-19  6:38 ` [RFC 19/20] iommu/vt-d: Implement device_info iommu_ops callback Liu Yi L
2021-09-19  6:38 ` [RFC 20/20] Doc: Add documentation for /dev/iommu Liu Yi L
2021-10-29  0:15   ` David Gibson
2021-10-29 12:44     ` Jason Gunthorpe
2021-09-19  6:45 ` [RFC 00/20] Introduce /dev/iommu for userspace I/O address space management Liu, Yi L
2021-09-21 13:45 ` Jason Gunthorpe
2021-09-22  3:25   ` Liu, Yi L

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20211011184914.GQ2744544@nvidia.com \
    --to=jgg@nvidia.com \
    --cc=alex.williamson@redhat.com \
    --cc=ashok.raj@intel.com \
    --cc=baolu.lu@linux.intel.com \
    --cc=corbet@lwn.net \
    --cc=dave.jiang@intel.com \
    --cc=david@gibson.dropbear.id.au \
    --cc=dwmw2@infradead.org \
    --cc=eric.auger@redhat.com \
    --cc=hao.wu@intel.com \
    --cc=hch@lst.de \
    --cc=iommu@lists.linux-foundation.org \
    --cc=jacob.jun.pan@linux.intel.com \
    --cc=jasowang@redhat.com \
    --cc=jean-philippe@linaro.org \
    --cc=joro@8bytes.org \
    --cc=jun.j.tian@intel.com \
    --cc=kevin.tian@intel.com \
    --cc=kvm@vger.kernel.org \
    --cc=kwankhede@nvidia.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=lkml@metux.net \
    --cc=lushenming@huawei.com \
    --cc=nicolinc@nvidia.com \
    --cc=parav@mellanox.com \
    --cc=pbonzini@redhat.com \
    --cc=robin.murphy@arm.com \
    --cc=yi.l.liu@intel.com \
    --cc=yi.l.liu@linux.intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).