From: David Hildenbrand <david@redhat.com>
To: David Howells <dhowells@redhat.com>
Cc: "Teterevkov, Ivan" <Ivan.Teterevkov@amd.com>,
"linux-mm@kvack.org" <linux-mm@kvack.org>,
"jhubbard@nvidia.com" <jhubbard@nvidia.com>,
"jack@suse.cz" <jack@suse.cz>,
"rppt@linux.ibm.com" <rppt@linux.ibm.com>,
"jglisse@redhat.com" <jglisse@redhat.com>,
"ira.weiny@intel.com" <ira.weiny@intel.com>,
"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
Christoph Hellwig <hch@lst.de>,
Matthew Wilcox <willy@infradead.org>
Subject: Re: find_get_page() VS pin_user_pages()
Date: Thu, 13 Apr 2023 14:41:43 +0200
Message-ID: <c1902449-ab9d-4e26-c532-5df0a73dc1f9@redhat.com>
In-Reply-To: <37946.1681288867@warthog.procyon.org.uk>
On 12.04.23 10:41, David Howells wrote:
> David Hildenbrand <david@redhat.com> wrote:
>
>> I suspect that find_get_page() is not the kind of interface you want to use
>> for the purpose you describe. find_get_page() is a wrapper around
>> pagecache_get_page() and seems more like a helper for implementing an fs
>> (looking at the users and the fact that it only considers pages that are in
>> the pagecache).
>
> Btw, at some point we're going to need public functions to get extra pins on
> pages. vmsplice() should be pinning the pages it pushes into a pipe - so all
> pages in a pipe should probably be pinned - and anyone who splices a page out
> of a pipe and retains it (skbuffs spring strongly to mind) should also get a
> pin on the page.

As discussed, vmsplice() is a bit special, because it has
long-term pinning semantics: we'd want to migrate the page out of
ZONE_MOVABLE/MIGRATE_CMA/... because the page might remain pinned in the
pipe possibly forever, controlled by user space.

pin_user_pages(FOLL_LONGTERM) would do the right thing, but we might
have to be careful with extra pins.

I guess it depends on what we want to achieve. Let's discuss what would
happen when we want to pin some page (without going via pin_user_pages())
that's definitely not an anon page -- so let's assume a pagecache page:

(a) Short-term pinning when already pinned (extra pins): easy.

(b) Short-term pinning when not pinned yet: should be fairly easy
    (pin_user_pages() doesn't do anything special for pagecache pages
    either).

(c) Long-term pinning when already long-term pinned (extra long-term
    pins): easy.

(d) Long-term pinning when already short-term pinned: problematic,
    because we might have to migrate the page first, but it's already
    pinned ... and if we obtained the page via pin_user_pages() from a
    MAP_PRIVATE VMA, we'd have to do another
    pin_user_pages(FOLL_LONGTERM) that would properly break COW and give
    us an anon page ...

(e) Long-term pinning when not pinned yet: fairly easy, but we might
    have to migrate the page first (like FOLL_LONGTERM would).

Regarding anon pages, we should pin only via pin_user_pages(), so the
"not pinned" case does not apply. Replicating pins -- (a) and (c) -- is
usually easy, but (d) is similarly problematic.

Focusing again on !anon pages: if it's just "get another short-term pin
on an already pinned page", it's easy (and I recall John H. had
patches). If it's "get a long-term pin on an already pinned page", it
can be problematic.

Any pages that will never have to be migrated when long-term pinning
(just some allocated kernel page without MOVABLE semantics) are super
easy to pin, and to add extra pins to.
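
To make the case split above concrete, here is a rough userspace model of
cases (a)-(e) for !anon pages. It is only a sketch: the enum names, the
classify_pin() helper and the "movable" flag are made up for illustration and
are not kernel interfaces, and the MAP_PRIVATE/COW complication of (d) is
deliberately not modelled.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of cases (a)-(e); none of these names exist in the kernel. */
enum pin_state { PIN_NONE, PIN_SHORT, PIN_LONG };
enum pin_verdict { PIN_EASY, PIN_MAY_MIGRATE, PIN_PROBLEMATIC };

/*
 * cur:           how the (non-anon) page is pinned right now
 * want_longterm: whether the new pin has long-term semantics
 * movable:       assumption: the page may sit in ZONE_MOVABLE/MIGRATE_CMA
 */
enum pin_verdict classify_pin(enum pin_state cur, bool want_longterm,
                              bool movable)
{
	assert(cur == PIN_NONE || cur == PIN_SHORT || cur == PIN_LONG);

	if (!want_longterm)
		return PIN_EASY;		/* (a) and (b) */
	if (!movable)
		return PIN_EASY;		/* never has to be migrated */
	if (cur == PIN_LONG)
		return PIN_EASY;		/* (c): replicate a long-term pin */
	if (cur == PIN_SHORT)
		return PIN_PROBLEMATIC;		/* (d): must migrate, but pinned */
	return PIN_MAY_MIGRATE;			/* (e): migrate first, then pin */
}
```

The only combination that comes out as problematic is a long-term request on
a possibly-movable page that already carries a short-term pin, which is
case (d).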
>
> So should all pages held by an skbuff be pinned rather than ref'd? I have a
> patch to use the bottom two bits of an skb frag's page pointer to keep track
> of whether the page it points to is ref'd, pinned or neither, but if we can
> make it pin/not-pin them, I only need one bit for that.

It might possibly be the right thing. But ref'd vs. pinned really only
makes a difference to (a) pages mapped into user space or (b) pages in
the pagecache. Of course, in any case, long-term semantics have to be
respected if the page to pin might have been allocated with MOVABLE
semantics.
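
For readers unfamiliar with the bit-tagging idea in the quoted text, a
userspace sketch of keeping a ref'd/pinned/neither marker in the bottom two
bits of a page pointer could look like the following. This is not the actual
patch: frag_encode(), frag_page(), frag_hold_of() and the enum values are
hypothetical names, and "struct page" is just an opaque stand-in. It assumes
the pointer is at least 4-byte aligned so the low two bits are free.

```c
#include <assert.h>
#include <stdint.h>

struct page;	/* opaque stand-in; only pointers to it are used here */

/* Hypothetical marker values stored in the low two bits. */
enum frag_hold { FRAG_NONE = 0, FRAG_REFD = 1, FRAG_PINNED = 2 };

#define FRAG_TAG_MASK ((uintptr_t)3)

/* Pack a page pointer and its hold type into one word. */
uintptr_t frag_encode(struct page *page, enum frag_hold hold)
{
	/* The low two bits must be free, i.e. 4-byte alignment or better. */
	assert(((uintptr_t)page & FRAG_TAG_MASK) == 0);
	return (uintptr_t)page | (uintptr_t)hold;
}

/* Recover the page pointer by masking off the tag bits. */
struct page *frag_page(uintptr_t val)
{
	return (struct page *)(val & ~FRAG_TAG_MASK);
}

/* Recover the hold type from the tag bits. */
enum frag_hold frag_hold_of(uintptr_t val)
{
	return (enum frag_hold)(val & FRAG_TAG_MASK);
}
```

If only pinned vs. not-pinned has to be tracked, a single bit (mask 1) would
be enough, as noted in the quoted text.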
--
Thanks,
David / dhildenb