From: David Hildenbrand <david@redhat.com>
To: David Howells <dhowells@redhat.com>,
Christoph Hellwig <hch@infradead.org>
Cc: Jens Axboe <axboe@kernel.dk>, Al Viro <viro@zeniv.linux.org.uk>,
Matthew Wilcox <willy@infradead.org>, Jan Kara <jack@suse.cz>,
Jeff Layton <jlayton@kernel.org>,
Jason Gunthorpe <jgg@nvidia.com>,
Logan Gunthorpe <logang@deltatee.com>,
Hillf Danton <hdanton@sina.com>,
Christian Brauner <brauner@kernel.org>,
Linus Torvalds <torvalds@linux-foundation.org>,
linux-fsdevel@vger.kernel.org, linux-block@vger.kernel.org,
linux-kernel@vger.kernel.org, linux-mm@kvack.org
Subject: Re: Extending page pinning into fs/direct-io.c
Date: Wed, 24 May 2023 09:06:36 +0200
Message-ID: <5c4160cc-6aec-f6a6-8bab-b0bf201a037c@redhat.com>
In-Reply-To: <3068545.1684872971@warthog.procyon.org.uk>

On 23.05.23 22:16, David Howells wrote:
> Christoph Hellwig <hch@infradead.org> wrote:
>
>> But can you please also take care of the legacy direct I/O code? I'd really
>> hate to leave yet another unfinished transition around.
>
> I've been poking at it this afternoon, but it doesn't look like it's going to
> be straightforward, unfortunately. The mm folks have been withdrawing access
> to the pinning API behind the ramparts of the mm/ dir. Further, the dio code
> will (I think), under some circumstances, arbitrarily insert the zero_page
> into a list of things that are maybe pinned or maybe unpinned, but I can (I
> think) also be given a pinned zero_page from the GUP code if the page tables
> point to one and a DIO-write is requested - so just doing if page == zero_page
> isn't sufficient.
>
> What I'd like to do is to make the GUP code not take a ref on the zero_page
> if, say, FOLL_DONT_PIN_ZEROPAGE is passed in, and then make the bio cleanup
> code always ignore the zero_page.

We discussed doing that unconditionally in the context of vfio (below), but vfio
decided to add a workaround suitable for stable.

In case of FOLL_PIN it's simple: if we detect the zeropage, don't mess with the
refcount when pinning and don't mess with the refcount when unpinning (esp.
unpin_user_pages). FOLL_GET is a different story but we don't have to mess with
that.

So there shouldn't be a need for FOLL_DONT_PIN_ZEROPAGE; we could just do it
unconditionally.

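
To make the FOLL_PIN idea concrete, here is a minimal userspace sketch, not kernel code. Everything named model_* is hypothetical and only stands in for struct page, the shared zeropage and the pin/unpin paths; the one point it illustrates is that both sides skip the zeropage, so the counts stay balanced without any new flag.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct page: only a pin count. */
struct model_page {
	long pincount;
};

/* Hypothetical stand-in for the shared zeropage. */
static struct model_page model_zero_page;

static bool model_is_zero_page(const struct model_page *page)
{
	return page == &model_zero_page;
}

/* Pin side: never touch the count of the zeropage. */
static void model_pin_page(struct model_page *page)
{
	if (model_is_zero_page(page))
		return;
	page->pincount++;
}

/* Unpin side: mirror of model_pin_page(), so pin and unpin stay balanced. */
static void model_unpin_pages(struct model_page **pages, size_t npages)
{
	for (size_t i = 0; i < npages; i++) {
		if (model_is_zero_page(pages[i]))
			continue;
		pages[i]->pincount--;
	}
}
```

Because the check sits in both paths, a caller does not need to know whether any entry in its array happens to be the zeropage.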
>
> Alternatively, I can drop the pin immediately if I get given one on the
> zero_page - it's not going anywhere, after all.
That's what vfio did in

commit 873aefb376bbc0ed1dd2381ea1d6ec88106fdbd4
Author: Alex Williamson <alex.williamson@redhat.com>
Date:   Mon Aug 29 21:05:40 2022 -0600

    vfio/type1: Unpin zero pages

    There's currently a reference count leak on the zero page.  We increment
    the reference via pin_user_pages_remote(), but the page is later handled
    as an invalid/reserved page, therefore it's not accounted against the
    user and not unpinned by our put_pfn().

    Introducing special zero page handling in put_pfn() would resolve the
    leak, but without accounting of the zero page, a single user could
    still create enough mappings to generate a reference count overflow.

    The zero page is always resident, so for our purposes there's no reason
    to keep it pinned.  Therefore, add a loop to walk pages returned from
    pin_user_pages_remote() and unpin any zero pages.

For vfio, that handling is no longer required, because FOLL_LONGTERM will never
pin the shared zeropage.
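
For reference, the shape of that workaround (pin everything first, then walk the result and drop the pin on any zero page right away) can be sketched in userspace as follows. This is a rough model, not the vfio code: the sketch_* names are hypothetical, and sketch_pin_all() merely stands in for a pin call that takes a reference on every page it returns.

```c
#include <stddef.h>

/* Hypothetical stand-in for struct page: only a pin count. */
struct sketch_page {
	long pincount;
};

/* Hypothetical stand-in for the shared zero page. */
static struct sketch_page sketch_zero_page;

/* Stand-in for the pin call: pins every page, zero page included. */
static void sketch_pin_all(struct sketch_page **pages, size_t npages)
{
	for (size_t i = 0; i < npages; i++)
		pages[i]->pincount++;
}

/*
 * Walk the pinned pages and immediately unpin any zero page, since it
 * is always resident. Returns how many pins were dropped.
 */
static size_t sketch_unpin_zero_pages(struct sketch_page **pages, size_t npages)
{
	size_t dropped = 0;

	for (size_t i = 0; i < npages; i++) {
		if (pages[i] == &sketch_zero_page) {
			pages[i]->pincount--;
			dropped++;
		}
	}
	return dropped;
}
```

The difference from doing it unconditionally in GUP is that here the caller has to remember the extra walk, and the later release path must also skip the zero page.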
--
Thanks,
David / dhildenb