From: "Tian, Kevin" <kevin.tian@intel.com>
To: Alex Williamson <alex.williamson@redhat.com>
Cc: Jean-Philippe Brucker <jean-philippe@linaro.org>,
Li Zefan <lizefan@huawei.com>,
"Jiang, Dave" <dave.jiang@intel.com>,
"Raj, Ashok" <ashok.raj@intel.com>,
Jonathan Corbet <corbet@lwn.net>,
Jean-Philippe Brucker <jean-philippe@linaro.com>,
LKML <linux-kernel@vger.kernel.org>,
"iommu@lists.linux-foundation.org"
<iommu@lists.linux-foundation.org>,
Jason Gunthorpe <jgg@nvidia.com>,
Johannes Weiner <hannes@cmpxchg.org>, Tejun Heo <tj@kernel.org>,
"cgroups@vger.kernel.org" <cgroups@vger.kernel.org>,
"Wu, Hao" <hao.wu@intel.com>,
David Woodhouse <dwmw2@infradead.org>
Subject: RE: [PATCH V4 05/18] iommu/ioasid: Redefine IOASID set and allocation APIs
Date: Fri, 7 May 2021 07:36:49 +0000
Message-ID: <MWHPR11MB1886E0A7897758AA7BE509058C579@MWHPR11MB1886.namprd11.prod.outlook.com>
In-Reply-To: <20210428090625.5a05dae8@redhat.com>
> From: Alex Williamson <alex.williamson@redhat.com>
> Sent: Wednesday, April 28, 2021 11:06 PM
>
> On Wed, 28 Apr 2021 06:34:11 +0000
> "Tian, Kevin" <kevin.tian@intel.com> wrote:
>
> > > From: Jason Gunthorpe <jgg@nvidia.com>
> > > Sent: Monday, April 26, 2021 8:38 PM
> > >
> > [...]
> > > > Want to hear your opinion for one open here. There is no doubt that
> > > > an ioasid represents a HW page table when the table is constructed by
> > > > userspace and then linked to the IOMMU through the bind/unbind
> > > > API. But I'm not very sure about whether an ioasid should represent
> > > > the exact pgtable or the mapping metadata when the underlying
> > > > pgtable is indirectly constructed through map/unmap API. VFIO does
> > > > the latter way, which is why it allows multiple incompatible domains
> > > > in a single container which all share the same mapping metadata.
> > >
> > > I think VFIO's map/unmap is way too complex and we know it has bad
> > > performance problems.
> >
> > Can you or Alex elaborate on where the complexity and performance
> > problems lie in VFIO map/unmap? We'd like to understand them in more
> > detail and see how to avoid them in the new interface.
>
>
> The map/unmap interface is really only good for long lived mappings,
> the overhead is too high for things like vIOMMU use cases or any case
> where the mapping is intended to be dynamic. Userspace drivers must
> make use of a long lived buffer mapping in order to achieve performance.
This is not a limitation specific to VFIO map/unmap. It is a limitation of
any map/unmap semantics, since whether a mapping is long-lived or
short-lived is decided by userspace. Nested translation is the only viable
optimization, allowing the 2nd level to be a long-lived mapping even w/
vIOMMU. From this angle I'm not sure how a new map/unmap implementation
alone could address this perf limitation.
>
> The mapping and unmapping granularity has been a problem as well,
> type1v1 allowed arbitrary unmaps to bisect the original mapping, with
> the massive caveat that the caller relies on the return value of the
> unmap to determine what was actually unmapped because the IOMMU use of
> superpages is transparent to the caller. This led to type1v2 that
> simply restricts the user to avoid ever bisecting mappings. That still
> leaves us with problems for things like virtio-mem support where we
> need to create initial mappings with a granularity that allows us to
> later remove entries, which can prevent effective use of IOMMU
> superpages.
We could start with semantics similar to type1v2.

btw why does virtio-mem require a smaller granularity? Can we split
superpages on the fly when removal actually happens (similar to page
splitting in VM live migration for efficient dirty page tracking)?
And isn't this another problem imposed by userspace? How could a new
map/unmap implementation mitigate it if userspace insists on a smaller
granularity for initial mappings?
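
For illustration only, the no-bisect rule that type1v2 imposes can be
sketched as a toy model. This is not the VFIO type1 code: the class name,
method names and the use of ValueError are all hypothetical, and real
pinning, IOMMU programming and superpage handling are left out. It only
shows that an unmap range may cover whole mappings but may never cut one.

```python
class Type1v2Model:
    """Toy model of no-bisect unmap semantics (hypothetical, not VFIO code)."""

    def __init__(self):
        # iova -> size of each mapping exactly as userspace created it
        self.mappings = {}

    def map(self, iova, size):
        # Reject a new mapping that overlaps an existing one.
        for start, length in self.mappings.items():
            if iova < start + length and start < iova + size:
                raise ValueError("overlaps an existing mapping")
        self.mappings[iova] = size

    def unmap(self, iova, size):
        end = iova + size
        hit = []
        for start, length in self.mappings.items():
            if start < end and iova < start + length:
                # Every mapping touched by the range must lie fully inside it.
                if start < iova or start + length > end:
                    raise ValueError("unmap would bisect a mapping")
                hit.append(start)
        # Remove the covered mappings and report how many bytes were unmapped.
        return sum(self.mappings.pop(start) for start in hit)
```

In this model a caller that wants to remove a 4K piece later has to create
it as its own 4K mapping up front, which is the granularity problem
described above for virtio-mem.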
>
> Locked page accounting has been another constant issue. We perform
> locked page accounting at the container level, where each container
> accounts independently. A user may require multiple containers, the
> containers may pin the same physical memory, but be accounted against
> the user once per container.
For /dev/ioasid there is still an open question whether a process is
allowed to open /dev/ioasid once or multiple times. If there is only one
ioasid_fd per process, the accounting can be made accurate. Otherwise the
same problem still exists, as each ioasid_fd is akin to a container, and
we need to find a better solution.
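
For illustration only, the double counting can be sketched with a toy
model. Nothing here is kernel code: Container, per_container_charge and
per_process_charge are hypothetical names, and pages are represented by
plain integers standing in for page frame numbers.

```python
class Container:
    """Toy stand-in for a VFIO container or an ioasid_fd (hypothetical)."""

    def __init__(self):
        self.pinned = set()  # page frame numbers pinned through this object

    def pin(self, pfns):
        # Pinning the same page twice in one container is counted once.
        self.pinned |= set(pfns)


def per_container_charge(containers):
    # Each container accounts independently, so shared pages add up.
    return sum(len(c.pinned) for c in containers)


def per_process_charge(containers):
    # A single accounting scope sees each physical page only once.
    return len(set().union(*(c.pinned for c in containers)))
```

If two containers both pin the same 256 pages, the first function charges
the user for 512 pages while the second charges for 256, which is the gap
a single ioasid_fd per process would close.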
>
> Those are the main ones I can think of. It is nice to have a simple
> map/unmap interface, I'd hope that a new /dev/ioasid interface wouldn't
> raise the barrier to entry too high, but the user needs to have the
> ability to have more control of their mappings and locked page
> accounting should probably be offloaded somewhere. Thanks,
>
Based on your feedback I feel it's probably reasonable to start with
type1v2 semantics for the new interface. Locked accounting could also
start with the same VFIO restriction and then be improved incrementally,
if a cleaner approach turns out to be intrusive (provided it does not
affect the uAPI).

But I didn't get the suggestion about "more control of their mappings".
Can you elaborate?

Thanks
Kevin