From: Jason Gunthorpe <jgg@nvidia.com>
To: David Hildenbrand <david@redhat.com>
Cc: Alistair Popple <apopple@nvidia.com>, linux-mm@kvack.org,
	cgroups@vger.kernel.org, linux-kernel@vger.kernel.org,
	jhubbard@nvidia.com, tjmercier@google.com, hannes@cmpxchg.org,
	surenb@google.com, mkoutny@suse.com, daniel@ffwll.ch
Subject: Re: [RFC PATCH 00/19] mm: Introduce a cgroup to limit the amount of locked and pinned memory
Date: Tue, 31 Jan 2023 10:10:43 -0400
Message-ID: <Y9khYwunmC/xdXT9@nvidia.com>
In-Reply-To: <2e78d261-9ae9-d203-446e-eaa3c652ca6e@redhat.com>

On Tue, Jan 31, 2023 at 03:06:10PM +0100, David Hildenbrand wrote:
> On 31.01.23 15:03, Jason Gunthorpe wrote:
> > On Tue, Jan 31, 2023 at 02:57:20PM +0100, David Hildenbrand wrote:
> > > > I'm excited by this series, thanks for making it.
> > > >
> > > > The pin accounting has been a long-standing problem and cgroups will
> > > > really help!
> > >
> > > Indeed. I'm curious how GUP-fast, pinning the same page multiple times,
> > > and pinning subpages of larger folios are handled :)
> >
> > The same as today. The pinning is done based on the result from GUP,
> > and we charge every returned struct page.
> >
> > So duplicates are counted multiple times, and folios are ignored.
> >
> > Removing duplicate charges would be costly; it would require storage
> > to keep track of how many times individual pages have been charged to
> > each cgroup (eg an xarray indexed by PFN of integers in each cgroup).
> >
> > It doesn't seem worth the cost, IMHO.
> >
> > We've made a lot of investment now with iommufd to remove the most
> > annoying sources of duplicated pins, so it is much less of a problem
> > in the qemu context at least.
>
> Wasn't there the discussion regarding using vfio+io_uring+rdma+$whatever
> on a VM and requiring multiple times the VM size as memlock limit?

Yes, but iommufd gives us some more options to mitigate this.

eg it makes some logical sense to point RDMA at the iommufd page table
that is already pinned when trying to DMA from guest memory; in this
case it could ride on the existing pin.

> Would it be the same now, just that we need multiple times the pin
> limit?

Yes

Jason
Thread overview: 108+ messages

2023-01-24  5:42 [RFC PATCH 00/19] mm: Introduce a cgroup to limit the amount of locked and pinned memory Alistair Popple
2023-01-24  5:42 ` [RFC PATCH 01/19] mm: Introduce vm_account Alistair Popple
2023-01-24  6:29   ` Christoph Hellwig
2023-01-24 14:32   ` Jason Gunthorpe
2023-01-30 11:36     ` Alistair Popple
2023-01-31 14:00   ` David Hildenbrand
2023-01-24  5:42 ` [RFC PATCH 02/19] drivers/vhost: Convert to use vm_account Alistair Popple
2023-01-24  5:55   ` Michael S. Tsirkin
2023-01-30 10:43     ` Alistair Popple
2023-01-24 14:34   ` Jason Gunthorpe
2023-01-24  5:42 ` [RFC PATCH 03/19] drivers/vdpa: Convert vdpa to use the new vm_structure Alistair Popple
2023-01-24 14:35   ` Jason Gunthorpe
2023-01-24  5:42 ` [RFC PATCH 04/19] infiniband/umem: Convert to use vm_account Alistair Popple
2023-01-24  5:42 ` [RFC PATCH 05/19] RMDA/siw: " Alistair Popple
2023-01-24 14:37   ` Jason Gunthorpe
2023-01-24 15:22   ` Bernard Metzler
2023-01-24 15:56     ` Bernard Metzler
2023-01-30 11:34       ` Alistair Popple
2023-01-30 13:27         ` Bernard Metzler
2023-01-24  5:42 ` [RFC PATCH 06/19] RDMA/usnic: convert to use vm_account Alistair Popple
2023-01-24 14:41   ` Jason Gunthorpe
2023-01-30 11:10     ` Alistair Popple
2023-01-24  5:42 ` [RFC PATCH 07/19] vfio/type1: Charge pinned pages to pinned_vm instead of locked_vm Alistair Popple
2023-01-24  5:42 ` [RFC PATCH 08/19] vfio/spapr_tce: Convert accounting to pinned_vm Alistair Popple
2023-01-24  5:42 ` [RFC PATCH 09/19] io_uring: convert to use vm_account Alistair Popple
2023-01-24 14:44   ` Jason Gunthorpe
2023-01-30 11:12     ` Alistair Popple
2023-01-30 13:21       ` Jason Gunthorpe
2023-01-24  5:42 ` [RFC PATCH 10/19] net: skb: Switch to using vm_account Alistair Popple
2023-01-24 14:51   ` Jason Gunthorpe
2023-01-30 11:17     ` Alistair Popple
2023-02-06  4:36       ` Alistair Popple
2023-02-06 13:14         ` Jason Gunthorpe
2023-01-24  5:42 ` [RFC PATCH 11/19] xdp: convert to use vm_account Alistair Popple
2023-01-24  5:42 ` [RFC PATCH 12/19] kvm/book3s_64_vio: Convert account_locked_vm() to vm_account_pinned() Alistair Popple
2023-01-24  5:42 ` [RFC PATCH 13/19] fpga: dfl: afu: convert to use vm_account Alistair Popple
2023-01-24  5:42 ` [RFC PATCH 14/19] mm: Introduce a cgroup for pinned memory Alistair Popple
2023-01-24  8:20   ` kernel test robot
2023-01-24 15:00   ` kernel test robot
2023-01-24 15:41   ` kernel test robot
2023-01-27 21:44   ` Tejun Heo
2023-01-30 13:20     ` Jason Gunthorpe
2023-01-24  5:42 ` [RFC PATCH 15/19] mm/util: Extend vm_account to charge pages against the pin cgroup Alistair Popple
2023-01-24  5:42 ` [RFC PATCH 16/19] mm/util: Refactor account_locked_vm Alistair Popple
2023-01-24  9:52   ` kernel test robot
2023-01-24  5:42 ` [RFC PATCH 17/19] mm: Convert mmap and mlock to use account_locked_vm Alistair Popple
2023-01-24  5:42 ` [RFC PATCH 18/19] mm/mmap: Charge locked memory to pins cgroup Alistair Popple
2023-01-24  5:42 ` [RFC PATCH 19/19] selftests/vm: Add pins-cgroup selftest for mlock/mmap Alistair Popple
2023-01-24 18:26 ` [RFC PATCH 00/19] mm: Introduce a cgroup to limit the amount of locked and pinned memory Yosry Ahmed
2023-01-31  0:54   ` Alistair Popple
2023-01-31  5:14     ` Yosry Ahmed
2023-01-31 11:22       ` Alistair Popple
2023-01-31 19:49         ` Yosry Ahmed
2023-01-24 20:12 ` Jason Gunthorpe
2023-01-31 13:57 ` David Hildenbrand
2023-01-31 14:03   ` Jason Gunthorpe
2023-01-31 14:06     ` David Hildenbrand
2023-01-31 14:10       ` Jason Gunthorpe [this message]
2023-01-31 14:15         ` David Hildenbrand
2023-01-31 14:21           ` Jason Gunthorpe
Reply instructions:

You may reply publicly to this message via plain-text email using any
one of the following methods:

* Save the following mbox file, import it into your mail client, and
  reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to switches of
  git-send-email(1):

    git send-email \
      --in-reply-to=Y9khYwunmC/xdXT9@nvidia.com \
      --to=jgg@nvidia.com \
      --cc=apopple@nvidia.com \
      --cc=cgroups@vger.kernel.org \
      --cc=daniel@ffwll.ch \
      --cc=david@redhat.com \
      --cc=hannes@cmpxchg.org \
      --cc=jhubbard@nvidia.com \
      --cc=linux-kernel@vger.kernel.org \
      --cc=linux-mm@kvack.org \
      --cc=mkoutny@suse.com \
      --cc=surenb@google.com \
      --cc=tjmercier@google.com \
      /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header via
  mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line
before the message body.