* [GIT PULL] Folio fixes for 5.19
@ 2022-06-10 19:22 Matthew Wilcox
2022-06-10 19:56 ` Linus Torvalds
2022-06-10 19:58 ` pr-tracker-bot
0 siblings, 2 replies; 5+ messages in thread
From: Matthew Wilcox @ 2022-06-10 19:22 UTC (permalink / raw)
To: Linus Torvalds; +Cc: linux-fsdevel, linux-mm, linux-kernel
The following changes since commit 3d9f55c57bc3659f986acc421eac431ff6edcc83:
Merge tag 'fs_for_v5.19-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs (2022-06-09 12:26:05 -0700)
are available in the Git repository at:
git://git.infradead.org/users/willy/pagecache.git tags/folio-5.19a
for you to fetch changes up to 334f6f53abcf57782bd2fe81da1cbd893e4ef05c:
mm: Add kernel-doc for folio->mlock_count (2022-06-09 16:24:25 -0400)
----------------------------------------------------------------
Four folio-related fixes for 5.19:
- Don't release a folio while it's still locked
- Fix a use-after-free after dropping the mmap_lock
- Fix a memory leak when splitting a page
- Fix a kernel-doc warning for struct folio
----------------------------------------------------------------
Matthew Wilcox (Oracle) (4):
filemap: Don't release a locked folio
filemap: Cache the value of vm_flags
mm/huge_memory: Fix xarray node memory leak
mm: Add kernel-doc for folio->mlock_count
include/linux/mm_types.h | 5 +++++
include/linux/xarray.h | 1 +
lib/xarray.c | 5 +++--
mm/filemap.c | 9 +++++----
mm/huge_memory.c | 3 +--
mm/readahead.c | 2 ++
6 files changed, 17 insertions(+), 8 deletions(-)
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [GIT PULL] Folio fixes for 5.19
2022-06-10 19:22 [GIT PULL] Folio fixes for 5.19 Matthew Wilcox
@ 2022-06-10 19:56 ` Linus Torvalds
2022-06-10 21:39 ` Matthew Wilcox
2022-06-10 19:58 ` pr-tracker-bot
1 sibling, 1 reply; 5+ messages in thread
From: Linus Torvalds @ 2022-06-10 19:56 UTC (permalink / raw)
To: Matthew Wilcox; +Cc: linux-fsdevel, Linux-MM, Linux Kernel Mailing List
On Fri, Jun 10, 2022 at 12:22 PM Matthew Wilcox <willy@infradead.org> wrote:
>
> - Don't release a folio while it's still locked
Ugh.
I *hate* this patch. It's just incredibly broken.
Yes, I've pulled this, but I have looked at that readahead_folio()
function before, and I have despised it before, but this patch really
drove home how incredibly broken that function is.
Honestly, readahead_folio() returns a folio *AFTER* it has dropped the
ref to that folio.
That's broken to start with. It's not only really wrong to say "here's
a pointer, and by the way, you don't actually hold a ref to it any
more".
It's doubly broken because it's *very* different from the similarly
named readahead_page() that doesn't have that fundamental interface
bug.
But it's now *extra* broken now that you realize that the dropping of
the ref was very very wrong, so you re-get the ref. WTF?
As far as I can tell, ALL THE OTHER users of this broken function fall
into two categories:
(a) they are broken (see the exact same broken pattern in
fs/erofs/fscache.c, for example)
(b) they only use the thing as a boolean
Anyway, I've pulled this, but I really seriously object to that
completely mis-designed function.
Please send me a follow-up patch that fixes it by just making the
*callers* drop the refcount, instead of doing it incorrectly inside
readahead_folio().
Alternatively, make "readahead_folio()" just return a boolean - so
that the (b) case situation can continue the use this function - and
create a new function that just exposes __readahead_folio() without
this broken "drop refcount" behavior).
Or explain why that "drop and retake ref" isn't
(a) completely broken and racy
(b) inefficient
(c) returning a non-ref'd folio pointer is horribly broken interface
to begin with
because right now it's just disgusting in so many ways and this bug is
just the last drop for me.
Linus
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [GIT PULL] Folio fixes for 5.19
2022-06-10 19:22 [GIT PULL] Folio fixes for 5.19 Matthew Wilcox
2022-06-10 19:56 ` Linus Torvalds
@ 2022-06-10 19:58 ` pr-tracker-bot
1 sibling, 0 replies; 5+ messages in thread
From: pr-tracker-bot @ 2022-06-10 19:58 UTC (permalink / raw)
To: Matthew Wilcox; +Cc: Linus Torvalds, linux-fsdevel, linux-mm, linux-kernel
The pull request you sent on Fri, 10 Jun 2022 20:22:06 +0100:
> git://git.infradead.org/users/willy/pagecache.git tags/folio-5.19a
has been merged into torvalds/linux.git:
https://git.kernel.org/torvalds/c/a32e7ea362356af8e89e67600432bad83d2325da
Thank you!
--
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/prtracker.html
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [GIT PULL] Folio fixes for 5.19
2022-06-10 19:56 ` Linus Torvalds
@ 2022-06-10 21:39 ` Matthew Wilcox
2022-06-10 23:27 ` Linus Torvalds
0 siblings, 1 reply; 5+ messages in thread
From: Matthew Wilcox @ 2022-06-10 21:39 UTC (permalink / raw)
To: Linus Torvalds; +Cc: linux-fsdevel, Linux-MM, Linux Kernel Mailing List
On Fri, Jun 10, 2022 at 12:56:48PM -0700, Linus Torvalds wrote:
> On Fri, Jun 10, 2022 at 12:22 PM Matthew Wilcox <willy@infradead.org> wrote:
> >
> > - Don't release a folio while it's still locked
>
> Ugh.
>
> I *hate* this patch. It's just incredibly broken.
>
> Yes, I've pulled this, but I have looked at that readahead_folio()
> function before, and I have despised it before, but this patch really
> drove home how incredibly broken that function is.
>
> Honestly, readahead_folio() returns a folio *AFTER* it has dropped the
> ref to that folio.
OK, you caught me.
I realised (a little too late) that the rules around refcounts in
->readpage and ->readahead are different, and that creates pain for
people writing filesystems. For ->readahead, I stuck with the refcount
model that was in ->readpages (there is an extra refcount on the folio
and the filesystem must put it before it returns).
But I don't want to change the refcounting rules on a method without
changing something else about the method, because trying to find a
missing refcount change is misery. Anyway, my cunning thought was
that if I bundle the change to the refcount rule with the change
from readahead_page() to readahead_folio(), once all filesystems
are converted to readahead_folio(), I can pull the refcount game out
of readahead_folio() and do it in the caller where it belongs, all
transparent to the filesystems.
I think it's worth doing, because it's two fewer atomic ops per folio
that we read from a file. But I didn't think through the transition
process clearly enough, and right now it's a mess. How would you like
me to proceed?
(I don't think the erofs code has a bug because it doesn't remove
the folio from the pagecache while holding the lock -- the folio lock
prevents anyone _else_ from removing the folio from the pagecache,
so there must be a reference on the folio up until erofs calls
folio_unlock()).
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [GIT PULL] Folio fixes for 5.19
2022-06-10 21:39 ` Matthew Wilcox
@ 2022-06-10 23:27 ` Linus Torvalds
0 siblings, 0 replies; 5+ messages in thread
From: Linus Torvalds @ 2022-06-10 23:27 UTC (permalink / raw)
To: Matthew Wilcox; +Cc: linux-fsdevel, Linux-MM, Linux Kernel Mailing List
On Fri, Jun 10, 2022 at 2:40 PM Matthew Wilcox <willy@infradead.org> wrote:
>
> But I don't want to change the refcounting rules on a method without
> changing something else about the method, because trying to find a
> missing refcount change is misery. Anyway, my cunning thought was
> that if I bundle the change to the refcount rule with the change
> from readahead_page() to readahead_folio(), once all filesystems
> are converted to readahead_folio(), I can pull the refcount game out
> of readahead_folio() and do it in the caller where it belongs, all
> transparent to the filesystems.
Hmm. Any reason why that can't be done right now? Aren't we basically
converted already?
Yeah, yeah, there's a couple of users of readahead_page() left, but if
cleaning up the folio case requires some fixup to those, then that
sounds better than the current "folio interface is very messy".
> (I don't think the erofs code has a bug because it doesn't remove
> the folio from the pagecache while holding the lock -- the folio lock
> prevents anyone _else_ from removing the folio from the pagecache,
> so there must be a reference on the folio up until erofs calls
> folio_unlock()).
Ahh. Ugh. And I guess the whole "clearing the lock bit is the last
time we touch the page flags" and "folio_wake_bit() is very careful to
only touch the external waitqueue" so that there can be no nasty races
with somebody coming in *exactly* as the folio is unlocked.
This has been subtle before, but I think we did allow it exactly for
this kind of reason. I've swapped out the details.
Linus
^ permalink raw reply [flat|nested] 5+ messages in thread
end of thread, other threads:[~2022-06-10 23:27 UTC | newest]
Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-06-10 19:22 [GIT PULL] Folio fixes for 5.19 Matthew Wilcox
2022-06-10 19:56 ` Linus Torvalds
2022-06-10 21:39 ` Matthew Wilcox
2022-06-10 23:27 ` Linus Torvalds
2022-06-10 19:58 ` pr-tracker-bot
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.