From: David Hildenbrand <david@redhat.com>
To: David Howells <dhowells@redhat.com>,
	Christoph Hellwig <hch@infradead.org>
Cc: Jens Axboe <axboe@kernel.dk>, Al Viro <viro@zeniv.linux.org.uk>,
	Matthew Wilcox <willy@infradead.org>, Jan Kara <jack@suse.cz>,
	Jeff Layton <jlayton@kernel.org>,
	Jason Gunthorpe <jgg@nvidia.com>,
	Logan Gunthorpe <logang@deltatee.com>,
	Hillf Danton <hdanton@sina.com>,
	Christian Brauner <brauner@kernel.org>,
	Linus Torvalds <torvalds@linux-foundation.org>,
	linux-fsdevel@vger.kernel.org, linux-block@vger.kernel.org,
	linux-kernel@vger.kernel.org, linux-mm@kvack.org
Subject: Re: Extending page pinning into fs/direct-io.c
Date: Wed, 24 May 2023 09:06:36 +0200
Message-ID: <5c4160cc-6aec-f6a6-8bab-b0bf201a037c@redhat.com>
In-Reply-To: <3068545.1684872971@warthog.procyon.org.uk>

On 23.05.23 22:16, David Howells wrote:
> Christoph Hellwig <hch@infradead.org> wrote:
> 
>> But can you please also take care of the legacy direct I/O code?  I'd really
>> hate to leave yet another unfinished transition around.
> 
> I've been poking at it this afternoon, but it doesn't look like it's going to
> be straightforward, unfortunately.  The mm folks have been withdrawing access
> to the pinning API behind the ramparts of the mm/ dir.  Further, the dio code
> will (I think), under some circumstances, arbitrarily insert the zero_page
> into a list of things that are maybe pinned or maybe unpinned, but I can (I
> think) also be given a pinned zero_page from the GUP code if the page tables
> point to one and a DIO-write is requested - so just doing if page == zero_page
> isn't sufficient.
> 
> What I'd like to do is to make the GUP code not take a ref on the zero_page
> if, say, FOLL_DONT_PIN_ZEROPAGE is passed in, and then make the bio cleanup
> code always ignore the zero_page.

We discussed doing that unconditionally in the context of vfio (below), but vfio
decided to add a workaround suitable for stable.

In case of FOLL_PIN it's simple: if we detect the zeropage, don't mess with the
refcount when pinning and don't mess with the refcount when unpinning (esp.
unpin_user_pages). FOLL_GET is a different story but we don't have to mess with
that.

So there shouldn't be a need for FOLL_DONT_PIN_ZEROPAGE; we could just do it
unconditionally.
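
To illustrate the FOLL_PIN idea above, here is a minimal user-space sketch. It
is only a toy model under stated assumptions: struct page_model,
zero_page_model, pin_pages_model() and unpin_pages_model() are hypothetical
names made up for this example and are not kernel APIs. The point is just that
both the pin side and the unpin side skip the zeropage, so its count can
neither leak nor overflow.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model of a page with a pin counter; not the kernel's struct page. */
struct page_model {
	long pincount;
};

/* Hypothetical stand-in for the shared zeropage. */
static struct page_model zero_page_model;

static bool is_zero_page_model(const struct page_model *p)
{
	return p == &zero_page_model;
}

/* Pin side: leave the zeropage's count untouched. */
static void pin_pages_model(struct page_model **pages, size_t n)
{
	for (size_t i = 0; i < n; i++)
		if (!is_zero_page_model(pages[i]))
			pages[i]->pincount++;
}

/* Unpin side: symmetric, so no flag has to be carried by the caller. */
static void unpin_pages_model(struct page_model **pages, size_t n)
{
	for (size_t i = 0; i < n; i++)
		if (!is_zero_page_model(pages[i]))
			pages[i]->pincount--;
}

/* Self-check: pin and unpin a batch that contains the zeropage. */
static void pin_unpin_selfcheck(void)
{
	struct page_model a = { 0 };
	struct page_model *batch[2] = { &a, &zero_page_model };

	pin_pages_model(batch, 2);
	assert(a.pincount == 1);
	assert(zero_page_model.pincount == 0);

	unpin_pages_model(batch, 2);
	assert(a.pincount == 0);
	assert(zero_page_model.pincount == 0);
}
```

Because the check lives in both helpers, a caller that mixes zeropage and
ordinary entries in one array does not need to remember which was which.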

> 
> Alternatively, I can drop the pin immediately if I get given one on the
> zero_page - it's not going anywhere, after all.

That's what vfio did in

commit 873aefb376bbc0ed1dd2381ea1d6ec88106fdbd4
Author: Alex Williamson <alex.williamson@redhat.com>
Date:   Mon Aug 29 21:05:40 2022 -0600

     vfio/type1: Unpin zero pages
     
     There's currently a reference count leak on the zero page.  We increment
     the reference via pin_user_pages_remote(), but the page is later handled
     as an invalid/reserved page, therefore it's not accounted against the
     user and not unpinned by our put_pfn().
     
     Introducing special zero page handling in put_pfn() would resolve the
     leak, but without accounting of the zero page, a single user could
     still create enough mappings to generate a reference count overflow.
     
     The zero page is always resident, so for our purposes there's no reason
     to keep it pinned.  Therefore, add a loop to walk pages returned from
     pin_user_pages_remote() and unpin any zero pages.


For vfio, that handling is no longer required, because FOLL_LONGTERM will never
pin the shared zeropage.
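
For comparison, the "drop the pin immediately" approach described in that
commit message can be modelled roughly as follows. This is a user-space sketch
only, not the vfio code: struct batch_page, batch_zero_page, pin_batch() and
drop_zero_page_pins() are hypothetical names for this example. The pinning
step pins everything, zeropage included, and a second pass walks the returned
array and unpins the zeropage entries again.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model of a page with a pin counter; not the kernel's struct page. */
struct batch_page {
	long pincount;
};

/* Hypothetical stand-in for the shared zeropage. */
static struct batch_page batch_zero_page;

/* Stand-in for a pinning call that pins every page it returns. */
static void pin_batch(struct batch_page **pages, size_t n)
{
	for (size_t i = 0; i < n; i++)
		pages[i]->pincount++;
}

/*
 * Post-pass: the zeropage is always resident, so drop its pins right
 * away. Returns the number of entries that were unpinned.
 */
static size_t drop_zero_page_pins(struct batch_page **pages, size_t n)
{
	size_t dropped = 0;

	for (size_t i = 0; i < n; i++) {
		if (pages[i] == &batch_zero_page) {
			pages[i]->pincount--;
			dropped++;
		}
	}
	return dropped;
}

/* Self-check: after the post-pass only the ordinary page stays pinned. */
static void drop_zero_selfcheck(void)
{
	struct batch_page a = { 0 };
	struct batch_page *batch[3] = { &batch_zero_page, &a, &batch_zero_page };

	pin_batch(batch, 3);
	assert(drop_zero_page_pins(batch, 3) == 2);
	assert(batch_zero_page.pincount == 0);
	assert(a.pincount == 1);
}
```

Unlike the unconditional variant, this one still touches the zeropage's count
briefly, and the later cleanup path has to know not to unpin those entries a
second time.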

-- 
Thanks,

David / dhildenb

