linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
* When is page->index stable?
@ 2020-08-27 19:14 Matthew Wilcox
  2020-08-27 20:52 ` Hugh Dickins
  0 siblings, 1 reply; 3+ messages in thread
From: Matthew Wilcox @ 2020-08-27 19:14 UTC (permalink / raw)
  To: linux-mm

We have a number of places where we look up a page in the page cache,
lock it, then have some kind of assertion that we got back the page we
asked for, eg filemap_fault():

        page = find_get_page(mapping, offset);
...
        if (!lock_page_maybe_drop_mmap(vmf, page, &fpin))
                goto out_retry;
...
        VM_BUG_ON_PAGE(page_to_pgoff(page) != offset, page);

but today I noticed this in shmem_undo_range():

                pvec.nr = find_get_entries(mapping, index,
                        min(end - index, (pgoff_t)PAGEVEC_SIZE),
                        pvec.pages, indices);
...
                        VM_BUG_ON_PAGE(page_to_pgoff(page) != index, page);
...
                        if (!trylock_page(page))
                                continue;

So is page->index stable if we have a refcount on the page, or is a lock
on the page required?  A refcount on the page prevents it from being
split or freed.  And there's plenty of comments along the lines of:

mm/filemap.c:		/* Leave page->index set: truncation lookup relies on it */

which indicates that once a page is removed from the page cache, its
index remains reliable (until it's freed).

It might be nice to remove all these assertions from the callers and
bury them down in find_get_(entry,page,entries,...), but we can't do
that if we need the lock to check the index.  If we don't need the lock,
then it should be safe to check as soon as we've checked that
page == xas_reload().


^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: When is page->index stable?
  2020-08-27 19:14 When is page->index stable? Matthew Wilcox
@ 2020-08-27 20:52 ` Hugh Dickins
  2020-09-10 14:27   ` Matthew Wilcox
  0 siblings, 1 reply; 3+ messages in thread
From: Hugh Dickins @ 2020-08-27 20:52 UTC (permalink / raw)
  To: Matthew Wilcox; +Cc: linux-mm

On Thu, 27 Aug 2020, Matthew Wilcox wrote:

> We have a number of places where we look up a page in the page cache,
> lock it, then have some kind of assertion that we got back the page we
> asked for, eg filemap_fault():
> 
>         page = find_get_page(mapping, offset);
> ...
>         if (!lock_page_maybe_drop_mmap(vmf, page, &fpin))
>                 goto out_retry;
> ...
>         VM_BUG_ON_PAGE(page_to_pgoff(page) != offset, page);
> 
> but today I noticed this in shmem_undo_range():
> 
>                 pvec.nr = find_get_entries(mapping, index,
>                         min(end - index, (pgoff_t)PAGEVEC_SIZE),
>                         pvec.pages, indices);
> ...
>                         VM_BUG_ON_PAGE(page_to_pgoff(page) != index, page);
> ...
>                         if (!trylock_page(page))
>                                 continue;
> 
> So is page->index stable if we have a refcount on the page,

Yes (once it has been found in the page cache -
obviously not stable before it has been put into the page cache).

> or is a lock on the page required?

No.  A lock on the page is required for page cache page->mapping
to be stable, but not required for its page->index to remain stable.

> A refcount on the page prevents it from being
> split or freed.  And there's plenty of comments along the lines of:
> 
> mm/filemap.c:		/* Leave page->index set: truncation lookup relies on it */
> 
> which indicates that once a page is removed from the page cache, its
> index remains reliable (until it's freed).
> 
> It might be nice to remove all these assertions from the callers and
> bury them down in find_get_(entry,page,entries,...), but we can't do
> that if we need the lock to check the index.  If we don't need the lock,
> then it should be safe to check as soon as we've checked that
> page == xas_reload().

Yes.

But you might then discover something violating the principle.
I have an indistinct memory of spotting an instance once, maybe
just in a prospective patchset that didn't reach the kernel; perhaps
someone resetting page->index to 0 "for tidiness" before freeing;
maybe page migration did that once upon a time, then got fixed.

And of course beware of hugetlbfs, defining page->index differently
(unless you have fixed that already).

Hugh


^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: When is page->index stable?
  2020-08-27 20:52 ` Hugh Dickins
@ 2020-09-10 14:27   ` Matthew Wilcox
  0 siblings, 0 replies; 3+ messages in thread
From: Matthew Wilcox @ 2020-09-10 14:27 UTC (permalink / raw)
  To: Hugh Dickins; +Cc: linux-mm

On Thu, Aug 27, 2020 at 01:52:47PM -0700, Hugh Dickins wrote:
> On Thu, 27 Aug 2020, Matthew Wilcox wrote:
> > We have a number of places where we look up a page in the page cache,
> > lock it, then have some kind of assertion that we got back the page we
> > asked for, eg filemap_fault():
> > 
> >         page = find_get_page(mapping, offset);
> > ...
> >         if (!lock_page_maybe_drop_mmap(vmf, page, &fpin))
> >                 goto out_retry;
> > ...
> >         VM_BUG_ON_PAGE(page_to_pgoff(page) != offset, page);
> > 
> > but today I noticed this in shmem_undo_range():
> > 
> >                 pvec.nr = find_get_entries(mapping, index,
> >                         min(end - index, (pgoff_t)PAGEVEC_SIZE),
> >                         pvec.pages, indices);
> > ...
> >                         VM_BUG_ON_PAGE(page_to_pgoff(page) != index, page);
> > ...
> >                         if (!trylock_page(page))
> >                                 continue;
> > 
> > So is page->index stable if we have a refcount on the page,
> 
> Yes (once it has been found in the page cache -
> obviously not stable before it has been put into the page cache).
> 
> > or is a lock on the page required?
> 
> No.  A lock on the page is required for page cache page->mapping
> to be stable, but not required for its page->index to remain stable.
> 
> > A refcount on the page prevents it from being
> > split or freed.  And there's plenty of comments along the lines of:
> > 
> > mm/filemap.c:		/* Leave page->index set: truncation lookup relies on it */
> > 
> > which indicates that once a page is removed from the page cache, its
> > index remains reliable (until it's freed).
> > 
> > It might be nice to remove all these assertions from the callers and
> > bury them down in find_get_(entry,page,entries,...), but we can't do
> > that if we need the lock to check the index.  If we don't need the lock,
> > then it should be safe to check as soon as we've checked that
> > page == xas_reload().
> 
> Yes.
> 
> But you might then discover something violating the principle.
> I have an indistinct memory of spotting an instance once, maybe
> just in a prospective patchset that didn't reach the kernel; perhaps
> someone resetting page->index to 0 "for tidiness" before freeing;
> maybe page migration did that once upon a time, then got fixed.
> 
> And of course beware of hugetlbfs, defining page->index differently
> (unless you have fixed that already).

The other thing to beware of is swapcache, which also defines page->index
differently.  We went through this last year ...
https://lore.kernel.org/linux-mm/20190323033852.GC10344@bombadil.infradead.org/T/#u

As can be seen from that thread, even calling page_index() instead of
directly looking at page->index is insufficient because having a reference
on a page is insufficient to keep ClearPageSwapCache from being cleared.

What you're doing is safe, because you know mapping isn't a swapcache
mapping; it's a shmem mapping.

So I'm going to back out a good chunk of the work I did yesterday :-(


^ permalink raw reply	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2020-09-10 14:27 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-08-27 19:14 When is page->index stable? Matthew Wilcox
2020-08-27 20:52 ` Hugh Dickins
2020-09-10 14:27   ` Matthew Wilcox

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).