All of lore.kernel.org
 help / color / mirror / Atom feed
From: Jan Kara <jack@suse.cz>
To: "Darrick J. Wong" <djwong@kernel.org>
Cc: Jan Kara <jack@suse.cz>,
	linux-fsdevel@vger.kernel.org, linux-ext4@vger.kernel.org,
	Christoph Hellwig <hch@infradead.org>, Ted Tso <tytso@mit.edu>,
	Dave Chinner <david@fromorbit.com>,
	Matthew Wilcox <willy@infradead.org>,
	linux-mm@kvack.org, linux-xfs@vger.kernel.org,
	linux-f2fs-devel@lists.sourceforge.net,
	linux-cifs@vger.kernel.org, ceph-devel@vger.kernel.org,
	Christoph Hellwig <hch@lst.de>
Subject: Re: [PATCH 03/14] mm: Protect operations adding pages to page cache with invalidate_lock
Date: Tue, 13 Jul 2021 13:11:39 +0200	[thread overview]
Message-ID: <20210713111139.GG12142@quack2.suse.cz> (raw)
In-Reply-To: <20210713012514.GB22402@magnolia>

On Mon 12-07-21 18:25:14, Darrick J. Wong wrote:
> On Mon, Jul 12, 2021 at 06:55:54PM +0200, Jan Kara wrote:
> > @@ -2967,6 +2992,7 @@ vm_fault_t filemap_fault(struct vm_fault *vmf)
> >  	pgoff_t max_off;
> >  	struct page *page;
> >  	vm_fault_t ret = 0;
> > +	bool mapping_locked = false;
> >  
> >  	max_off = DIV_ROUND_UP(i_size_read(inode), PAGE_SIZE);
> >  	if (unlikely(offset >= max_off))
> > @@ -2988,15 +3014,30 @@ vm_fault_t filemap_fault(struct vm_fault *vmf)
> >  		count_memcg_event_mm(vmf->vma->vm_mm, PGMAJFAULT);
> >  		ret = VM_FAULT_MAJOR;
> >  		fpin = do_sync_mmap_readahead(vmf);
> > +	}
> > +
> > +	if (!page) {
> 
> Is it still necessary to re-evaluate !page here?

No, you are right it is not necessary. I'll remove it.

> >  retry_find:
> > +		/*
> > +		 * See comment in filemap_create_page() why we need
> > +		 * invalidate_lock
> > +		 */
> > +		if (!mapping_locked) {
> > +			filemap_invalidate_lock_shared(mapping);
> > +			mapping_locked = true;
> > +		}
> >  		page = pagecache_get_page(mapping, offset,
> >  					  FGP_CREAT|FGP_FOR_MMAP,
> >  					  vmf->gfp_mask);
> >  		if (!page) {
> >  			if (fpin)
> >  				goto out_retry;
> > +			filemap_invalidate_unlock_shared(mapping);
> >  			return VM_FAULT_OOM;
> >  		}
> > +	} else if (unlikely(!PageUptodate(page))) {
> > +		filemap_invalidate_lock_shared(mapping);
> > +		mapping_locked = true;
> >  	}
> >  
> >  	if (!lock_page_maybe_drop_mmap(vmf, page, &fpin))
> > @@ -3014,8 +3055,20 @@ vm_fault_t filemap_fault(struct vm_fault *vmf)
> >  	 * We have a locked page in the page cache, now we need to check
> >  	 * that it's up-to-date. If not, it is going to be due to an error.
> >  	 */
> > -	if (unlikely(!PageUptodate(page)))
> > +	if (unlikely(!PageUptodate(page))) {
> > +		/*
> > +		 * The page was in cache and uptodate and now it is not.
> > +		 * Strange but possible since we didn't hold the page lock all
> > +		 * the time. Let's drop everything get the invalidate lock and
> > +		 * try again.
> > +		 */
> > +		if (!mapping_locked) {
> > +			unlock_page(page);
> > +			put_page(page);
> > +			goto retry_find;
> > +		}
> >  		goto page_not_uptodate;
> > +	}
> >  
> >  	/*
> >  	 * We've made it this far and we had to drop our mmap_lock, now is the
> > @@ -3026,6 +3079,8 @@ vm_fault_t filemap_fault(struct vm_fault *vmf)
> >  		unlock_page(page);
> >  		goto out_retry;
> >  	}
> > +	if (mapping_locked)
> > +		filemap_invalidate_unlock_shared(mapping);
> >  
> >  	/*
> >  	 * Found the page and have a reference on it.
> > @@ -3056,6 +3111,7 @@ vm_fault_t filemap_fault(struct vm_fault *vmf)
> >  
> >  	if (!error || error == AOP_TRUNCATED_PAGE)
> >  		goto retry_find;
> > +	filemap_invalidate_unlock_shared(mapping);
> 
> Hm.  I /think/ it's the case that mapping_locked==true always holds here
> because the new "The page was in cache and uptodate and now it is not."
> block above will take the invalidate_lock and retry pagecache_get_page,
> right?

Yes. page_not_uptodate block can only be entered with mapping_locked ==
true - the only place that can enter this block is:

        if (unlikely(!PageUptodate(page))) {
                /*
                 * The page was in cache and uptodate and now it is not.
                 * Strange but possible since we didn't hold the page lock all
                 * the time. Let's drop everything get the invalidate lock and
                 * try again.
                 */
                if (!mapping_locked) {
                        unlock_page(page);
                        put_page(page);
                        goto retry_find;
                }
                goto page_not_uptodate;
        }

> >  
> >  	return VM_FAULT_SIGBUS;
> >  
> > @@ -3067,6 +3123,8 @@ vm_fault_t filemap_fault(struct vm_fault *vmf)
> >  	 */
> >  	if (page)
> >  		put_page(page);
> > +	if (mapping_locked)
> > +		filemap_invalidate_unlock_shared(mapping);
> 
> Hm.  I think this looks ok, even though this patch now contains the
> subtlety that we've both hoisted the xfs mmaplock to page cache /and/
> reduced the scope of the invalidate_lock.
> 
> As for fancy things like remap_range, I think they're still safe with
> this latest iteration because those functions grab the invalidate_lock
> in exclusive mode and invalidate the mappings before proceeding, which
> means that other programs will never find the lockless path (i.e. page
> locked, uptodate, and attached to the mapping) and will instead block on
> the invalidate lock until the remap operation completes.   Is that
> right?

Correct. For operations such as hole punch or destination of remap_range,
we lock invalidate_lock exclusively and invalidate pagecache in the
involved range. No new pages can be created in that range until you drop
invalidate_lock (places creating pages without holding i_rwsem are read,
readahead, fault and all those take invalidate_lock when they should create
the page).

There's also the case someone pointed out that *source* of remap_range
needs to be protected (but only from modifications through mmap). This is
achieved by having invalidate_lock taken in .page_mkwrite handlers and
thus not impacted by these changes to filemap_fault().

								Honza
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

WARNING: multiple messages have this Message-ID
From: Jan Kara <jack@suse.cz>
To: "Darrick J. Wong" <djwong@kernel.org>
Cc: linux-xfs@vger.kernel.org, Jan Kara <jack@suse.cz>,
	linux-cifs@vger.kernel.org, Dave Chinner <david@fromorbit.com>,
	Matthew Wilcox <willy@infradead.org>,
	linux-f2fs-devel@lists.sourceforge.net,
	Christoph Hellwig <hch@infradead.org>,
	linux-mm@kvack.org, linux-fsdevel@vger.kernel.org,
	Ted Tso <tytso@mit.edu>,
	ceph-devel@vger.kernel.org, linux-ext4@vger.kernel.org,
	Christoph Hellwig <hch@lst.de>
Subject: Re: [f2fs-dev] [PATCH 03/14] mm: Protect operations adding pages to page cache with invalidate_lock
Date: Tue, 13 Jul 2021 13:11:39 +0200	[thread overview]
Message-ID: <20210713111139.GG12142@quack2.suse.cz> (raw)
In-Reply-To: <20210713012514.GB22402@magnolia>

On Mon 12-07-21 18:25:14, Darrick J. Wong wrote:
> On Mon, Jul 12, 2021 at 06:55:54PM +0200, Jan Kara wrote:
> > @@ -2967,6 +2992,7 @@ vm_fault_t filemap_fault(struct vm_fault *vmf)
> >  	pgoff_t max_off;
> >  	struct page *page;
> >  	vm_fault_t ret = 0;
> > +	bool mapping_locked = false;
> >  
> >  	max_off = DIV_ROUND_UP(i_size_read(inode), PAGE_SIZE);
> >  	if (unlikely(offset >= max_off))
> > @@ -2988,15 +3014,30 @@ vm_fault_t filemap_fault(struct vm_fault *vmf)
> >  		count_memcg_event_mm(vmf->vma->vm_mm, PGMAJFAULT);
> >  		ret = VM_FAULT_MAJOR;
> >  		fpin = do_sync_mmap_readahead(vmf);
> > +	}
> > +
> > +	if (!page) {
> 
> Is it still necessary to re-evaluate !page here?

No, you are right it is not necessary. I'll remove it.

> >  retry_find:
> > +		/*
> > +		 * See comment in filemap_create_page() why we need
> > +		 * invalidate_lock
> > +		 */
> > +		if (!mapping_locked) {
> > +			filemap_invalidate_lock_shared(mapping);
> > +			mapping_locked = true;
> > +		}
> >  		page = pagecache_get_page(mapping, offset,
> >  					  FGP_CREAT|FGP_FOR_MMAP,
> >  					  vmf->gfp_mask);
> >  		if (!page) {
> >  			if (fpin)
> >  				goto out_retry;
> > +			filemap_invalidate_unlock_shared(mapping);
> >  			return VM_FAULT_OOM;
> >  		}
> > +	} else if (unlikely(!PageUptodate(page))) {
> > +		filemap_invalidate_lock_shared(mapping);
> > +		mapping_locked = true;
> >  	}
> >  
> >  	if (!lock_page_maybe_drop_mmap(vmf, page, &fpin))
> > @@ -3014,8 +3055,20 @@ vm_fault_t filemap_fault(struct vm_fault *vmf)
> >  	 * We have a locked page in the page cache, now we need to check
> >  	 * that it's up-to-date. If not, it is going to be due to an error.
> >  	 */
> > -	if (unlikely(!PageUptodate(page)))
> > +	if (unlikely(!PageUptodate(page))) {
> > +		/*
> > +		 * The page was in cache and uptodate and now it is not.
> > +		 * Strange but possible since we didn't hold the page lock all
> > +		 * the time. Let's drop everything get the invalidate lock and
> > +		 * try again.
> > +		 */
> > +		if (!mapping_locked) {
> > +			unlock_page(page);
> > +			put_page(page);
> > +			goto retry_find;
> > +		}
> >  		goto page_not_uptodate;
> > +	}
> >  
> >  	/*
> >  	 * We've made it this far and we had to drop our mmap_lock, now is the
> > @@ -3026,6 +3079,8 @@ vm_fault_t filemap_fault(struct vm_fault *vmf)
> >  		unlock_page(page);
> >  		goto out_retry;
> >  	}
> > +	if (mapping_locked)
> > +		filemap_invalidate_unlock_shared(mapping);
> >  
> >  	/*
> >  	 * Found the page and have a reference on it.
> > @@ -3056,6 +3111,7 @@ vm_fault_t filemap_fault(struct vm_fault *vmf)
> >  
> >  	if (!error || error == AOP_TRUNCATED_PAGE)
> >  		goto retry_find;
> > +	filemap_invalidate_unlock_shared(mapping);
> 
> Hm.  I /think/ it's the case that mapping_locked==true always holds here
> because the new "The page was in cache and uptodate and now it is not."
> block above will take the invalidate_lock and retry pagecache_get_page,
> right?

Yes. page_not_uptodate block can only be entered with mapping_locked ==
true - the only place that can enter this block is:

        if (unlikely(!PageUptodate(page))) {
                /*
                 * The page was in cache and uptodate and now it is not.
                 * Strange but possible since we didn't hold the page lock all
                 * the time. Let's drop everything get the invalidate lock and
                 * try again.
                 */
                if (!mapping_locked) {
                        unlock_page(page);
                        put_page(page);
                        goto retry_find;
                }
                goto page_not_uptodate;
        }

> >  
> >  	return VM_FAULT_SIGBUS;
> >  
> > @@ -3067,6 +3123,8 @@ vm_fault_t filemap_fault(struct vm_fault *vmf)
> >  	 */
> >  	if (page)
> >  		put_page(page);
> > +	if (mapping_locked)
> > +		filemap_invalidate_unlock_shared(mapping);
> 
> Hm.  I think this looks ok, even though this patch now contains the
> subtlety that we've both hoisted the xfs mmaplock to page cache /and/
> reduced the scope of the invalidate_lock.
> 
> As for fancy things like remap_range, I think they're still safe with
> this latest iteration because those functions grab the invalidate_lock
> in exclusive mode and invalidate the mappings before proceeding, which
> means that other programs will never find the lockless path (i.e. page
> locked, uptodate, and attached to the mapping) and will instead block on
> the invalidate lock until the remap operation completes.   Is that
> right?

Correct. For operations such as hole punch or destination of remap_range,
we lock invalidate_lock exclusively and invalidate pagecache in the
involved range. No new pages can be created in that range until you drop
invalidate_lock (places creating pages without holding i_rwsem are read,
readahead, fault and all those take invalidate_lock when they should create
the page).

There's also the case someone pointed out that *source* of remap_range
needs to be protected (but only from modifications through mmap). This is
achieved by having invalidate_lock taken in .page_mkwrite handlers and
thus not impacted by these changes to filemap_fault().

								Honza
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR


_______________________________________________
Linux-f2fs-devel mailing list
Linux-f2fs-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel

  reply	other threads:[~2021-07-13 11:11 UTC|newest]

Thread overview: 47+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-07-12 16:55 [PATCH 0/14 v9] fs: Hole punch vs page cache filling races Jan Kara
2021-07-12 16:55 ` [f2fs-dev] " Jan Kara
2021-07-12 16:55 ` [PATCH 01/14] mm: Fix comments mentioning i_mutex Jan Kara
2021-07-12 16:55   ` [f2fs-dev] " Jan Kara
2021-07-12 16:55 ` [PATCH 02/14] documentation: Sync file_operations members with reality Jan Kara
2021-07-12 16:55   ` [f2fs-dev] " Jan Kara
2021-07-13  1:02   ` Darrick J. Wong
2021-07-13  1:02     ` [f2fs-dev] " Darrick J. Wong
2021-07-12 16:55 ` [PATCH 03/14] mm: Protect operations adding pages to page cache with invalidate_lock Jan Kara
2021-07-12 16:55   ` [f2fs-dev] " Jan Kara
2021-07-13  1:25   ` Darrick J. Wong
2021-07-13  1:25     ` [f2fs-dev] " Darrick J. Wong
2021-07-13 11:11     ` Jan Kara [this message]
2021-07-13 11:11       ` Jan Kara
2021-07-13  6:25   ` Christoph Hellwig
2021-07-13  6:25     ` [f2fs-dev] " Christoph Hellwig
2021-07-13 12:35     ` Jan Kara
2021-07-13 12:35       ` [f2fs-dev] " Jan Kara
2021-07-12 16:55 ` [PATCH 04/14] mm: Add functions to lock invalidate_lock for two mappings Jan Kara
2021-07-12 16:55   ` [f2fs-dev] " Jan Kara
2021-07-12 16:55 ` [PATCH 05/14] ext4: Convert to use mapping->invalidate_lock Jan Kara
2021-07-12 16:55   ` [f2fs-dev] " Jan Kara
2021-07-12 16:55 ` [PATCH 06/14] ext2: Convert to using invalidate_lock Jan Kara
2021-07-12 16:55   ` [f2fs-dev] " Jan Kara
2021-07-12 16:55 ` [PATCH 07/14] xfs: Refactor xfs_isilocked() Jan Kara
2021-07-12 16:55   ` [f2fs-dev] " Jan Kara
2021-07-12 16:55 ` [PATCH 08/14] xfs: Convert to use invalidate_lock Jan Kara
2021-07-12 16:55   ` [f2fs-dev] " Jan Kara
2021-07-12 16:56 ` [PATCH 09/14] xfs: Convert double locking of MMAPLOCK to use VFS helpers Jan Kara
2021-07-12 16:56   ` [f2fs-dev] " Jan Kara
2021-07-12 16:56 ` [PATCH 10/14] zonefs: Convert to using invalidate_lock Jan Kara
2021-07-12 16:56   ` [f2fs-dev] " Jan Kara
2021-07-12 16:56 ` [PATCH 11/14] f2fs: " Jan Kara
2021-07-12 16:56   ` [f2fs-dev] " Jan Kara
2021-07-12 16:56 ` [PATCH 12/14] fuse: " Jan Kara
2021-07-12 16:56   ` [f2fs-dev] " Jan Kara
2021-07-12 16:56 ` [PATCH 13/14] ceph: Fix race between hole punch and page fault Jan Kara
2021-07-12 16:56   ` [f2fs-dev] " Jan Kara
2021-07-12 16:56 ` [PATCH 14/14] cifs: " Jan Kara
2021-07-12 16:56   ` [f2fs-dev] " Jan Kara
  -- strict thread matches above, loose matches on Subject: below --
2021-07-15 13:40 [PATCH 0/14 v10] fs: Hole punch vs page cache filling races Jan Kara
2021-07-15 13:40 ` [PATCH 03/14] mm: Protect operations adding pages to page cache with invalidate_lock Jan Kara
2021-06-15  9:17 [PATCH 0/14 v8] fs: Hole punch vs page cache filling races Jan Kara
2021-06-15  9:17 ` [PATCH 03/14] mm: Protect operations adding pages to page cache with invalidate_lock Jan Kara
2021-06-16  5:33   ` Christoph Hellwig
2021-06-17 16:15   ` Darrick J. Wong
2021-06-07 14:52 [PATCH 0/14 v7] fs: Hole punch vs page cache filling races Jan Kara
2021-06-07 14:52 ` [PATCH 03/14] mm: Protect operations adding pages to page cache with invalidate_lock Jan Kara
2021-06-07 16:09   ` Darrick J. Wong
2021-06-08 12:19     ` Jan Kara

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20210713111139.GG12142@quack2.suse.cz \
    --to=jack@suse.cz \
    --cc=ceph-devel@vger.kernel.org \
    --cc=david@fromorbit.com \
    --cc=djwong@kernel.org \
    --cc=hch@infradead.org \
    --cc=hch@lst.de \
    --cc=linux-cifs@vger.kernel.org \
    --cc=linux-ext4@vger.kernel.org \
    --cc=linux-f2fs-devel@lists.sourceforge.net \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=linux-xfs@vger.kernel.org \
    --cc=tytso@mit.edu \
    --cc=willy@infradead.org \
    --subject='Re: [PATCH 03/14] mm: Protect operations adding pages to page cache with invalidate_lock' \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.