From: Jan Kara <jack@suse.cz>
To: "Darrick J. Wong" <djwong@kernel.org>
Cc: Jan Kara <jack@suse.cz>,
linux-fsdevel@vger.kernel.org, linux-ext4@vger.kernel.org,
Christoph Hellwig <hch@infradead.org>, Ted Tso <tytso@mit.edu>,
Dave Chinner <david@fromorbit.com>,
Matthew Wilcox <willy@infradead.org>,
linux-mm@kvack.org, linux-xfs@vger.kernel.org,
linux-f2fs-devel@lists.sourceforge.net,
linux-cifs@vger.kernel.org, ceph-devel@vger.kernel.org,
Christoph Hellwig <hch@lst.de>
Subject: Re: [PATCH 03/14] mm: Protect operations adding pages to page cache with invalidate_lock
Date: Tue, 13 Jul 2021 13:11:39 +0200
Message-ID: <20210713111139.GG12142@quack2.suse.cz>
In-Reply-To: <20210713012514.GB22402@magnolia>
On Mon 12-07-21 18:25:14, Darrick J. Wong wrote:
> On Mon, Jul 12, 2021 at 06:55:54PM +0200, Jan Kara wrote:
> > @@ -2967,6 +2992,7 @@ vm_fault_t filemap_fault(struct vm_fault *vmf)
> > pgoff_t max_off;
> > struct page *page;
> > vm_fault_t ret = 0;
> > + bool mapping_locked = false;
> >
> > max_off = DIV_ROUND_UP(i_size_read(inode), PAGE_SIZE);
> > if (unlikely(offset >= max_off))
> > @@ -2988,15 +3014,30 @@ vm_fault_t filemap_fault(struct vm_fault *vmf)
> > count_memcg_event_mm(vmf->vma->vm_mm, PGMAJFAULT);
> > ret = VM_FAULT_MAJOR;
> > fpin = do_sync_mmap_readahead(vmf);
> > + }
> > +
> > + if (!page) {
>
> Is it still necessary to re-evaluate !page here?
No, you are right, it is not necessary. I'll remove it.
> > retry_find:
> > + /*
> > + * See comment in filemap_create_page() why we need
> > + * invalidate_lock
> > + */
> > + if (!mapping_locked) {
> > + filemap_invalidate_lock_shared(mapping);
> > + mapping_locked = true;
> > + }
> > page = pagecache_get_page(mapping, offset,
> > FGP_CREAT|FGP_FOR_MMAP,
> > vmf->gfp_mask);
> > if (!page) {
> > if (fpin)
> > goto out_retry;
> > + filemap_invalidate_unlock_shared(mapping);
> > return VM_FAULT_OOM;
> > }
> > + } else if (unlikely(!PageUptodate(page))) {
> > + filemap_invalidate_lock_shared(mapping);
> > + mapping_locked = true;
> > }
> >
> > if (!lock_page_maybe_drop_mmap(vmf, page, &fpin))
> > @@ -3014,8 +3055,20 @@ vm_fault_t filemap_fault(struct vm_fault *vmf)
> > * We have a locked page in the page cache, now we need to check
> > * that it's up-to-date. If not, it is going to be due to an error.
> > */
> > - if (unlikely(!PageUptodate(page)))
> > + if (unlikely(!PageUptodate(page))) {
> > + /*
> > + * The page was in cache and uptodate and now it is not.
> > + * Strange but possible since we didn't hold the page lock all
> > + * the time. Let's drop everything get the invalidate lock and
> > + * try again.
> > + */
> > + if (!mapping_locked) {
> > + unlock_page(page);
> > + put_page(page);
> > + goto retry_find;
> > + }
> > goto page_not_uptodate;
> > + }
> >
> > /*
> > * We've made it this far and we had to drop our mmap_lock, now is the
> > @@ -3026,6 +3079,8 @@ vm_fault_t filemap_fault(struct vm_fault *vmf)
> > unlock_page(page);
> > goto out_retry;
> > }
> > + if (mapping_locked)
> > + filemap_invalidate_unlock_shared(mapping);
> >
> > /*
> > * Found the page and have a reference on it.
> > @@ -3056,6 +3111,7 @@ vm_fault_t filemap_fault(struct vm_fault *vmf)
> >
> > if (!error || error == AOP_TRUNCATED_PAGE)
> > goto retry_find;
> > + filemap_invalidate_unlock_shared(mapping);
>
> Hm. I /think/ it's the case that mapping_locked==true always holds here
> because the new "The page was in cache and uptodate and now it is not."
> block above will take the invalidate_lock and retry pagecache_get_page,
> right?
Yes. The page_not_uptodate block can only be entered with mapping_locked ==
true, because the only place that jumps to it is:
	if (unlikely(!PageUptodate(page))) {
		/*
		 * The page was in cache and uptodate and now it is not.
		 * Strange but possible since we didn't hold the page lock all
		 * the time. Let's drop everything get the invalidate lock and
		 * try again.
		 */
		if (!mapping_locked) {
			unlock_page(page);
			put_page(page);
			goto retry_find;
		}
		goto page_not_uptodate;
	}
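
To make the invariant easier to check, here is a toy user-space trace of the
control flow in the hunks quoted above. It is only a sketch: toy_filemap_fault(),
enum toy_result and the three boolean parameters are hypothetical names made up
for illustration, the lock is reduced to the mapping_locked flag, and nothing
here is kernel code. The assert() sits at the point that corresponds to the
page_not_uptodate label.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical outcome codes for this toy model only. */
enum toy_result {
	TOY_PAGE_MAPPED,	/* page was uptodate, the fault completes */
	TOY_READ_LOCKED,	/* reached page_not_uptodate holding the lock */
};

/*
 * Toy trace of the patched filemap_fault() control flow.
 * in_cache:          the first lookup found a page
 * uptodate_unlocked: PageUptodate() as seen before taking the page lock
 * uptodate_locked:   PageUptodate() as seen under the page lock
 */
static enum toy_result toy_filemap_fault(bool in_cache, bool uptodate_unlocked,
					 bool uptodate_locked)
{
	bool mapping_locked = false;

	if (!in_cache) {
retry_find:
		if (!mapping_locked)
			mapping_locked = true;	/* filemap_invalidate_lock_shared() */
		/* pagecache_get_page(FGP_CREAT) would find or create the page */
	} else if (!uptodate_unlocked) {
		mapping_locked = true;		/* filemap_invalidate_lock_shared() */
	}

	/* The page lock is held from here on; recheck the uptodate state. */
	if (!uptodate_locked) {
		if (!mapping_locked)
			goto retry_find;	/* drop the page, take the lock, retry */
		/* This is where page_not_uptodate would be entered. */
		assert(mapping_locked);
		return TOY_READ_LOCKED;
	}
	return TOY_PAGE_MAPPED;
}
```

Calling toy_filemap_fault() with all eight combinations of the three flags
never trips the assert: the only path that arrives with mapping_locked == false
loops back through retry_find first.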
> >
> > return VM_FAULT_SIGBUS;
> >
> > @@ -3067,6 +3123,8 @@ vm_fault_t filemap_fault(struct vm_fault *vmf)
> > */
> > if (page)
> > put_page(page);
> > + if (mapping_locked)
> > + filemap_invalidate_unlock_shared(mapping);
>
> Hm. I think this looks ok, even though this patch now contains the
> subtlety that we've both hoisted the xfs mmaplock to page cache /and/
> reduced the scope of the invalidate_lock.
>
> As for fancy things like remap_range, I think they're still safe with
> this latest iteration because those functions grab the invalidate_lock
> in exclusive mode and invalidate the mappings before proceeding, which
> means that other programs will never find the lockless path (i.e. page
> locked, uptodate, and attached to the mapping) and will instead block on
> the invalidate lock until the remap operation completes. Is that
> right?
Correct. For operations such as hole punch, or for the destination of
remap_range, we take invalidate_lock exclusively and invalidate the page
cache in the affected range. No new pages can be created in that range until
invalidate_lock is dropped: the places that create pages without holding
i_rwsem are read, readahead, and fault, and all of them take invalidate_lock
when they need to create a page.
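
As a rough illustration of that protocol, below is a small user-space model.
It is a sketch under loud assumptions: every toy_* name is hypothetical, the
rwsem is replaced by a try-only reader/writer lock built on C11 atomics, and
where the kernel would sleep on invalidate_lock the model simply returns -1.
It is not the kernel implementation.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_NPAGES 8

/*
 * Toy model, NOT kernel code. All toy_* names are hypothetical.
 * lock_state: -1 = held exclusively, 0 = free, N > 0 = N shared holders.
 */
struct toy_mapping {
	atomic_int lock_state;			/* stands in for invalidate_lock */
	atomic_bool present[TOY_NPAGES];	/* stands in for the page cache */
};

static void toy_init(struct toy_mapping *m)
{
	atomic_init(&m->lock_state, 0);
	for (size_t i = 0; i < TOY_NPAGES; i++)
		atomic_init(&m->present[i], false);
}

static bool toy_trylock_shared(struct toy_mapping *m)
{
	int old = atomic_load(&m->lock_state);

	while (old >= 0) {
		if (atomic_compare_exchange_weak(&m->lock_state, &old, old + 1))
			return true;
	}
	return false;	/* an exclusive holder is active */
}

static void toy_unlock_shared(struct toy_mapping *m)
{
	atomic_fetch_sub(&m->lock_state, 1);
}

/*
 * Page-creating path (read, readahead, fault).
 * Returns 1 if a page was created, 0 if it was already cached, and
 * -1 if creation had to back off because the lock is held exclusively.
 */
static int toy_fault(struct toy_mapping *m, size_t idx)
{
	bool expected = false;
	int ret;

	assert(idx < TOY_NPAGES);
	if (atomic_load(&m->present[idx]))
		return 0;	/* lockless path: page already in the cache */
	if (!toy_trylock_shared(m))
		return -1;	/* the kernel would sleep here instead */
	ret = atomic_compare_exchange_strong(&m->present[idx], &expected, true) ? 1 : 0;
	toy_unlock_shared(m);
	return ret;
}

/* Hole punch start: take the lock exclusively, then drop cached pages. */
static bool toy_punch_begin(struct toy_mapping *m, size_t start, size_t end)
{
	int expected = 0;

	if (!atomic_compare_exchange_strong(&m->lock_state, &expected, -1))
		return false;
	for (size_t i = start; i < end && i < TOY_NPAGES; i++)
		atomic_store(&m->present[i], false);
	return true;
}

/* Hole punch end: release the exclusive lock so pages can be created again. */
static void toy_punch_end(struct toy_mapping *m)
{
	atomic_store(&m->lock_state, 0);
}
```

Between toy_punch_begin() and toy_punch_end(), toy_fault() on a punched index
cannot create a page: the lockless check fails because the page was
invalidated, and the shared lock attempt fails because the lock is held
exclusively.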
There is also the case, which someone pointed out, that the *source* of
remap_range needs to be protected (but only from modifications through mmap).
This is achieved by taking invalidate_lock in the .page_mkwrite handlers, so
it is not affected by these changes to filemap_fault().
Honza
--
Jan Kara <jack@suse.com>
SUSE Labs, CR