kvm.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Niklas Schnelle <schnelle@linux.ibm.com>
To: Jason Gunthorpe <jgg@nvidia.com>
Cc: Alex Williamson <alex.williamson@redhat.com>,
	Lu Baolu <baolu.lu@linux.intel.com>,
	Chaitanya Kulkarni <chaitanyak@nvidia.com>,
	Cornelia Huck <cohuck@redhat.com>,
	Daniel Jordan <daniel.m.jordan@oracle.com>,
	David Gibson <david@gibson.dropbear.id.au>,
	Eric Auger <eric.auger@redhat.com>,
	iommu@lists.linux-foundation.org,
	Jason Wang <jasowang@redhat.com>,
	Jean-Philippe Brucker <jean-philippe@linaro.org>,
	Joao Martins <joao.m.martins@oracle.com>,
	Kevin Tian <kevin.tian@intel.com>,
	kvm@vger.kernel.org, Matthew Rosato <mjrosato@linux.ibm.com>,
	"Michael S. Tsirkin" <mst@redhat.com>,
	Nicolin Chen <nicolinc@nvidia.com>,
	Shameerali Kolothum Thodi  <shameerali.kolothum.thodi@huawei.com>,
	Yi Liu <yi.l.liu@intel.com>, Keqian Zhu <zhukeqian1@huawei.com>
Subject: Re: [PATCH RFC 04/12] kernel/user: Allow user::locked_vm to be usable for iommufd
Date: Tue, 22 Mar 2022 17:31:26 +0100	[thread overview]
Message-ID: <974181b0c01513293909d56844aa58c09c5fa166.camel@linux.ibm.com> (raw)
In-Reply-To: <20220322145741.GH11336@nvidia.com>

On Tue, 2022-03-22 at 11:57 -0300, Jason Gunthorpe wrote:
> On Tue, Mar 22, 2022 at 03:28:22PM +0100, Niklas Schnelle wrote:
> > On Fri, 2022-03-18 at 14:27 -0300, Jason Gunthorpe wrote:
> > > Following the pattern of io_uring, perf, skb, and bpf iommfd will use
> >                                                 iommufd ----^
> > > user->locked_vm for accounting pinned pages. Ensure the value is included
> > > in the struct and export free_uid() as iommufd is modular.
> > > 
> > > user->locked_vm is the correct accounting to use for ulimit because it is
> > > per-user, and the ulimit is not supposed to be per-process. Other
> > > places (vfio, vdpa and infiniband) have used mm->pinned_vm and/or
> > > mm->locked_vm for accounting pinned pages, but this is only per-process
> > > and inconsistent with the majority of the kernel.
> > 
> > Since this will replace parts of vfio this difference seems
> > significant. Can you explain this a bit more?
> 
> I'm not sure what to say more, this is the correct way to account for
> this. It is natural to see it is right because the ulimit is supposted
> to be global to the user, not effectively reset every time the user
> creates a new process.
> 
> So checking the ulimit against a per-process variable in the mm_struct
> doesn't make too much sense.

Yes I agree that logically this makes more sense. I was kind of aiming
in the same direction as Alex i.e. it's a conscious decision to do it
right and we need to know where this may lead to differences and how to
handle them.

> 
> > I'm also a bit confused how io_uring handles this. When I stumbled over
> > the problem fixed by 6b7898eb180d ("io_uring: fix imbalanced sqo_mm
> > accounting") and from that commit description I seem to rember that
> > io_uring also accounts in mm->locked_vm too? 
> 
> locked/pinned_pages in the mm is kind of a debugging counter, it
> indicates how many pins the user obtained through this mm. AFAICT its
> only correct use is to report through proc. Things are supposed to
> update it, but there is no reason to limit it as the user limit
> supersedes it.
> 
> The commit you pointed at is fixing that io_uring corrupted the value.
> 
> Since VFIO checks locked/pinned_pages against the ulimit would blow up
> when the value was wrong.
> 
> > In fact I stumbled over that because the wrong accounting in
> > io_uring exhausted the applied to vfio (I was using a QEMU utilizing
> > io_uring itself).
> 
> I'm pretty interested in this as well, do you have anything you can
> share?

This was the issue reported in the following BZ.

https://bugzilla.kernel.org/show_bug.cgi?id=209025

I stumbled over the same problem on my x86 box and also on s390. I
don't remember exactly what limit this ran into but I suspect it had
something to do with the libvirt resource limits Alex mentioned.
Meaning io_uring had an accounting bug and then vfio / QEMU couldn't
pin memory. I think that libvirt limit is set to allow pinning all of
guest memory plus a bit so the io_uring misaccounting easily tipped it
over.

> 
> Thanks,
> Jason



  parent reply	other threads:[~2022-03-22 16:31 UTC|newest]

Thread overview: 122+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-03-18 17:27 [PATCH RFC 00/12] IOMMUFD Generic interface Jason Gunthorpe
2022-03-18 17:27 ` [PATCH RFC 01/12] interval-tree: Add a utility to iterate over spans in an interval tree Jason Gunthorpe
2022-03-18 17:27 ` [PATCH RFC 02/12] iommufd: Overview documentation Jason Gunthorpe
2022-03-18 17:27 ` [PATCH RFC 03/12] iommufd: File descriptor, context, kconfig and makefiles Jason Gunthorpe
2022-03-22 14:18   ` Niklas Schnelle
2022-03-22 14:50     ` Jason Gunthorpe
2022-03-18 17:27 ` [PATCH RFC 04/12] kernel/user: Allow user::locked_vm to be usable for iommufd Jason Gunthorpe
2022-03-22 14:28   ` Niklas Schnelle
2022-03-22 14:57     ` Jason Gunthorpe
2022-03-22 15:29       ` Alex Williamson
2022-03-22 16:15         ` Jason Gunthorpe
2022-03-24  2:11           ` Tian, Kevin
2022-03-24  2:27             ` Jason Wang
2022-03-24  2:42               ` Tian, Kevin
2022-03-24  2:57                 ` Jason Wang
2022-03-24  3:15                   ` Tian, Kevin
2022-03-24  3:50                     ` Jason Wang
2022-03-24  4:29                       ` Tian, Kevin
2022-03-24 11:46                       ` Jason Gunthorpe
2022-03-28  1:53                         ` Jason Wang
2022-03-28 12:22                           ` Jason Gunthorpe
2022-03-29  4:59                             ` Jason Wang
2022-03-29 11:46                               ` Jason Gunthorpe
2022-03-28 13:14                           ` Sean Mooney
2022-03-28 14:27                             ` Jason Gunthorpe
2022-03-24 20:40           ` Alex Williamson
2022-03-24 22:27             ` Jason Gunthorpe
2022-03-24 22:41               ` Alex Williamson
2022-03-22 16:31       ` Niklas Schnelle [this message]
2022-03-22 16:41         ` Jason Gunthorpe
2022-03-18 17:27 ` [PATCH RFC 05/12] iommufd: PFN handling for iopt_pages Jason Gunthorpe
2022-03-23 15:37   ` Niklas Schnelle
2022-03-23 16:09     ` Jason Gunthorpe
2022-03-18 17:27 ` [PATCH RFC 06/12] iommufd: Algorithms for PFN storage Jason Gunthorpe
2022-03-18 17:27 ` [PATCH RFC 07/12] iommufd: Data structure to provide IOVA to PFN mapping Jason Gunthorpe
2022-03-22 22:15   ` Alex Williamson
2022-03-23 18:15     ` Jason Gunthorpe
2022-03-24  3:09       ` Tian, Kevin
2022-03-24 12:46         ` Jason Gunthorpe
2022-03-25 13:34   ` zhangfei.gao
2022-03-25 17:19     ` Jason Gunthorpe
2022-04-13 14:02   ` Yi Liu
2022-04-13 14:36     ` Jason Gunthorpe
2022-04-13 14:49       ` Yi Liu
2022-04-17 14:56         ` Yi Liu
2022-04-18 10:47           ` Yi Liu
2022-03-18 17:27 ` [PATCH RFC 08/12] iommufd: IOCTLs for the io_pagetable Jason Gunthorpe
2022-03-23 19:10   ` Alex Williamson
2022-03-23 19:34     ` Jason Gunthorpe
2022-03-23 20:04       ` Alex Williamson
2022-03-23 20:34         ` Jason Gunthorpe
2022-03-23 22:54           ` Jason Gunthorpe
2022-03-24  7:25             ` Tian, Kevin
2022-03-24 13:46               ` Jason Gunthorpe
2022-03-25  2:15                 ` Tian, Kevin
2022-03-27  2:32                 ` Tian, Kevin
2022-03-27 14:28                   ` Jason Gunthorpe
2022-03-28 17:17                 ` Alex Williamson
2022-03-28 18:57                   ` Jason Gunthorpe
2022-03-28 19:47                     ` Jason Gunthorpe
2022-03-28 21:26                       ` Alex Williamson
2022-03-24  6:46           ` Tian, Kevin
2022-03-30 13:35   ` Yi Liu
2022-03-31 12:59     ` Jason Gunthorpe
2022-04-01 13:30       ` Yi Liu
2022-03-31  4:36   ` David Gibson
2022-03-31  5:41     ` Tian, Kevin
2022-03-31 12:58     ` Jason Gunthorpe
2022-04-28  5:58       ` David Gibson
2022-04-28 14:22         ` Jason Gunthorpe
2022-04-29  6:00           ` David Gibson
2022-04-29 12:54             ` Jason Gunthorpe
2022-04-30 14:44               ` David Gibson
2022-03-18 17:27 ` [PATCH RFC 09/12] iommufd: Add a HW pagetable object Jason Gunthorpe
2022-03-18 17:27 ` [PATCH RFC 10/12] iommufd: Add kAPI toward external drivers Jason Gunthorpe
2022-03-23 18:10   ` Alex Williamson
2022-03-23 18:15     ` Jason Gunthorpe
2022-05-11 12:54   ` Yi Liu
2022-05-19  9:45   ` Yi Liu
2022-05-19 12:35     ` Jason Gunthorpe
2022-03-18 17:27 ` [PATCH RFC 11/12] iommufd: vfio container FD ioctl compatibility Jason Gunthorpe
2022-03-23 22:51   ` Alex Williamson
2022-03-24  0:33     ` Jason Gunthorpe
2022-03-24  8:13       ` Eric Auger
2022-03-24 22:04       ` Alex Williamson
2022-03-24 23:11         ` Jason Gunthorpe
2022-03-25  3:10           ` Tian, Kevin
2022-03-25 11:24           ` Joao Martins
2022-04-28 14:53         ` David Gibson
2022-04-28 15:10           ` Jason Gunthorpe
2022-04-29  1:21             ` Tian, Kevin
2022-04-29  6:22               ` David Gibson
2022-04-29 12:50                 ` Jason Gunthorpe
2022-05-02  4:10                   ` David Gibson
2022-04-29  6:20             ` David Gibson
2022-04-29 12:48               ` Jason Gunthorpe
2022-05-02  7:30                 ` David Gibson
2022-05-05 19:07                   ` Jason Gunthorpe
2022-05-06  5:25                     ` David Gibson
2022-05-06 10:42                       ` Tian, Kevin
2022-05-09  3:36                         ` David Gibson
2022-05-06 12:48                       ` Jason Gunthorpe
2022-05-09  6:01                         ` David Gibson
2022-05-09 14:00                           ` Jason Gunthorpe
2022-05-10  7:12                             ` David Gibson
2022-05-10 19:00                               ` Jason Gunthorpe
2022-05-11  3:15                                 ` Tian, Kevin
2022-05-11 16:32                                   ` Jason Gunthorpe
2022-05-11 23:23                                     ` Tian, Kevin
2022-05-13  4:35                                   ` David Gibson
2022-05-11  4:40                                 ` David Gibson
2022-05-11  2:46                             ` Tian, Kevin
2022-05-23  6:02           ` Alexey Kardashevskiy
2022-05-24 13:25             ` Jason Gunthorpe
2022-05-25  1:39               ` David Gibson
2022-05-25  2:09               ` Alexey Kardashevskiy
2022-03-29  9:17     ` Yi Liu
2022-03-18 17:27 ` [PATCH RFC 12/12] iommufd: Add a selftest Jason Gunthorpe
2022-04-12 20:13 ` [PATCH RFC 00/12] IOMMUFD Generic interface Eric Auger
2022-04-12 20:22   ` Jason Gunthorpe
2022-04-12 20:50     ` Eric Auger
2022-04-14 10:56 ` Yi Liu

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=974181b0c01513293909d56844aa58c09c5fa166.camel@linux.ibm.com \
    --to=schnelle@linux.ibm.com \
    --cc=alex.williamson@redhat.com \
    --cc=baolu.lu@linux.intel.com \
    --cc=chaitanyak@nvidia.com \
    --cc=cohuck@redhat.com \
    --cc=daniel.m.jordan@oracle.com \
    --cc=david@gibson.dropbear.id.au \
    --cc=eric.auger@redhat.com \
    --cc=iommu@lists.linux-foundation.org \
    --cc=jasowang@redhat.com \
    --cc=jean-philippe@linaro.org \
    --cc=jgg@nvidia.com \
    --cc=joao.m.martins@oracle.com \
    --cc=kevin.tian@intel.com \
    --cc=kvm@vger.kernel.org \
    --cc=mjrosato@linux.ibm.com \
    --cc=mst@redhat.com \
    --cc=nicolinc@nvidia.com \
    --cc=shameerali.kolothum.thodi@huawei.com \
    --cc=yi.l.liu@intel.com \
    --cc=zhukeqian1@huawei.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).