All of lore.kernel.org
 help / color / mirror / Atom feed
* [GIT PULL] Folio fixes for 5.19
@ 2022-06-10 19:22 Matthew Wilcox
  2022-06-10 19:56 ` Linus Torvalds
  2022-06-10 19:58 ` pr-tracker-bot
  0 siblings, 2 replies; 5+ messages in thread
From: Matthew Wilcox @ 2022-06-10 19:22 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: linux-fsdevel, linux-mm, linux-kernel

The following changes since commit 3d9f55c57bc3659f986acc421eac431ff6edcc83:

  Merge tag 'fs_for_v5.19-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs (2022-06-09 12:26:05 -0700)

are available in the Git repository at:

  git://git.infradead.org/users/willy/pagecache.git tags/folio-5.19a

for you to fetch changes up to 334f6f53abcf57782bd2fe81da1cbd893e4ef05c:

  mm: Add kernel-doc for folio->mlock_count (2022-06-09 16:24:25 -0400)

----------------------------------------------------------------
Four folio-related fixes for 5.19:

 - Don't release a folio while it's still locked

 - Fix a use-after-free after dropping the mmap_lock

 - Fix a memory leak when splitting a page

 - Fix a kernel-doc warning for struct folio

----------------------------------------------------------------
Matthew Wilcox (Oracle) (4):
      filemap: Don't release a locked folio
      filemap: Cache the value of vm_flags
      mm/huge_memory: Fix xarray node memory leak
      mm: Add kernel-doc for folio->mlock_count

 include/linux/mm_types.h | 5 +++++
 include/linux/xarray.h   | 1 +
 lib/xarray.c             | 5 +++--
 mm/filemap.c             | 9 +++++----
 mm/huge_memory.c         | 3 +--
 mm/readahead.c           | 2 ++
 6 files changed, 17 insertions(+), 8 deletions(-)


^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [GIT PULL] Folio fixes for 5.19
  2022-06-10 19:22 [GIT PULL] Folio fixes for 5.19 Matthew Wilcox
@ 2022-06-10 19:56 ` Linus Torvalds
  2022-06-10 21:39   ` Matthew Wilcox
  2022-06-10 19:58 ` pr-tracker-bot
  1 sibling, 1 reply; 5+ messages in thread
From: Linus Torvalds @ 2022-06-10 19:56 UTC (permalink / raw)
  To: Matthew Wilcox; +Cc: linux-fsdevel, Linux-MM, Linux Kernel Mailing List

On Fri, Jun 10, 2022 at 12:22 PM Matthew Wilcox <willy@infradead.org> wrote:
>
>  - Don't release a folio while it's still locked

Ugh.

I *hate* this patch. It's just incredibly broken.

Yes, I've pulled this, but I have looked at that readahead_folio()
function before, and I have despised it before, but this patch really
drove home how incredibly broken that function is.

Honestly, readahead_folio() returns a folio *AFTER* it has dropped the
ref to that folio.

That's broken to start with. It's not only really wrong to say "here's
a pointer, and by the way, you don't actually hold a ref to it any
more".

It's doubly broken because it's *very* different from the similarly
named readahead_page() that doesn't have that fundamental interface
bug.

But it's now *extra* broken now that you realize that the dropping of
the ref was very very wrong, so you re-get the ref. WTF?

As far as I can tell, ALL THE OTHER users of this broken function fall
into two categories:

 (a) they are broken (see the exact same broken pattern in
fs/erofs/fscache.c, for example)

 (b) they only use the thing as a boolean

Anyway, I've pulled this, but I really seriously object to that
completely mis-designed function.

Please send me a follow-up patch that fixes it by just making the
*callers* drop the refcount, instead of doing it incorrectly inside
readahead_folio().

Alternatively, make "readahead_folio()" just return a boolean - so
that the (b) case situation can continue the use this function - and
create a new function that just exposes __readahead_folio() without
this broken "drop refcount" behavior).

Or explain why that "drop and retake ref" isn't

 (a) completely broken and racy

 (b) inefficient

 (c) returning a non-ref'd folio pointer is horribly broken interface
to begin with

because right now it's just disgusting in so many ways and this bug is
just the last drop for me.

                 Linus

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [GIT PULL] Folio fixes for 5.19
  2022-06-10 19:22 [GIT PULL] Folio fixes for 5.19 Matthew Wilcox
  2022-06-10 19:56 ` Linus Torvalds
@ 2022-06-10 19:58 ` pr-tracker-bot
  1 sibling, 0 replies; 5+ messages in thread
From: pr-tracker-bot @ 2022-06-10 19:58 UTC (permalink / raw)
  To: Matthew Wilcox; +Cc: Linus Torvalds, linux-fsdevel, linux-mm, linux-kernel

The pull request you sent on Fri, 10 Jun 2022 20:22:06 +0100:

> git://git.infradead.org/users/willy/pagecache.git tags/folio-5.19a

has been merged into torvalds/linux.git:
https://git.kernel.org/torvalds/c/a32e7ea362356af8e89e67600432bad83d2325da

Thank you!

-- 
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/prtracker.html

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [GIT PULL] Folio fixes for 5.19
  2022-06-10 19:56 ` Linus Torvalds
@ 2022-06-10 21:39   ` Matthew Wilcox
  2022-06-10 23:27     ` Linus Torvalds
  0 siblings, 1 reply; 5+ messages in thread
From: Matthew Wilcox @ 2022-06-10 21:39 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: linux-fsdevel, Linux-MM, Linux Kernel Mailing List

On Fri, Jun 10, 2022 at 12:56:48PM -0700, Linus Torvalds wrote:
> On Fri, Jun 10, 2022 at 12:22 PM Matthew Wilcox <willy@infradead.org> wrote:
> >
> >  - Don't release a folio while it's still locked
> 
> Ugh.
> 
> I *hate* this patch. It's just incredibly broken.
> 
> Yes, I've pulled this, but I have looked at that readahead_folio()
> function before, and I have despised it before, but this patch really
> drove home how incredibly broken that function is.
> 
> Honestly, readahead_folio() returns a folio *AFTER* it has dropped the
> ref to that folio.

OK, you caught me.

I realised (a little too late) that the rules around refcounts in
->readpage and ->readahead are different, and that creates pain for
people writing filesystems.  For ->readahead, I stuck with the refcount
model that was in ->readpages (there is an extra refcount on the folio
and the filesystem must put it before it returns).

But I don't want to change the refcounting rules on a method without
changing something else about the method, because trying to find a
missing refcount change is misery.  Anyway, my cunning thought was
that if I bundle the change to the refcount rule with the change
from readahead_page() to readahead_folio(), once all filesystems
are converted to readahead_folio(), I can pull the refcount game out
of readahead_folio() and do it in the caller where it belongs, all
transparent to the filesystems.

I think it's worth doing, because it's two fewer atomic ops per folio
that we read from a file.  But I didn't think through the transition
process clearly enough, and right now it's a mess.  How would you like
me to proceed?

(I don't think the erofs code has a bug because it doesn't remove
the folio from the pagecache while holding the lock -- the folio lock
prevents anyone _else_ from removing the folio from the pagecache,
so there must be a reference on the folio up until erofs calls
folio_unlock()).

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [GIT PULL] Folio fixes for 5.19
  2022-06-10 21:39   ` Matthew Wilcox
@ 2022-06-10 23:27     ` Linus Torvalds
  0 siblings, 0 replies; 5+ messages in thread
From: Linus Torvalds @ 2022-06-10 23:27 UTC (permalink / raw)
  To: Matthew Wilcox; +Cc: linux-fsdevel, Linux-MM, Linux Kernel Mailing List

On Fri, Jun 10, 2022 at 2:40 PM Matthew Wilcox <willy@infradead.org> wrote:
>
> But I don't want to change the refcounting rules on a method without
> changing something else about the method, because trying to find a
> missing refcount change is misery.  Anyway, my cunning thought was
> that if I bundle the change to the refcount rule with the change
> from readahead_page() to readahead_folio(), once all filesystems
> are converted to readahead_folio(), I can pull the refcount game out
> of readahead_folio() and do it in the caller where it belongs, all
> transparent to the filesystems.

Hmm. Any reason why that can't be done right now? Aren't we basically
converted already?

Yeah, yeah, there's a couple of users of readahead_page() left, but if
cleaning up the folio case requires some fixup to those, then that
sounds better than the current "folio interface is very messy".

> (I don't think the erofs code has a bug because it doesn't remove
> the folio from the pagecache while holding the lock -- the folio lock
> prevents anyone _else_ from removing the folio from the pagecache,
> so there must be a reference on the folio up until erofs calls
> folio_unlock()).

Ahh. Ugh. And I guess the whole "clearing the lock bit is the last
time we touch the page flags" and "folio_wake_bit() is very careful to
only touch the external waitqueue" so that there can be no nasty races
with somebody coming in *exactly* as the folio is unlocked.

This has been subtle before, but I think we did allow it exactly for
this kind of reason. I've swapped out the details.

            Linus

^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2022-06-10 23:27 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-06-10 19:22 [GIT PULL] Folio fixes for 5.19 Matthew Wilcox
2022-06-10 19:56 ` Linus Torvalds
2022-06-10 21:39   ` Matthew Wilcox
2022-06-10 23:27     ` Linus Torvalds
2022-06-10 19:58 ` pr-tracker-bot

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.