All of lore.kernel.org
 help / color / mirror / Atom feed
From: Dave Chinner <david@fromorbit.com>
To: Josef Bacik <josef@toxicpanda.com>
Cc: kernel-team@fb.com, hannes@cmpxchg.org,
	linux-kernel@vger.kernel.org, tj@kernel.org,
	akpm@linux-foundation.org, linux-fsdevel@vger.kernel.org,
	linux-btrfs@vger.kernel.org, riel@fb.com, linux-mm@kvack.org
Subject: Re: [PATCH 2/7] mm: drop mmap_sem for page cache read IO submission
Date: Fri, 19 Oct 2018 14:14:46 +1100	[thread overview]
Message-ID: <20181019031446.GH18822@dastard> (raw)
In-Reply-To: <20181018202318.9131-3-josef@toxicpanda.com>

On Thu, Oct 18, 2018 at 04:23:13PM -0400, Josef Bacik wrote:
> From: Johannes Weiner <hannes@cmpxchg.org>
> 
> Reads can take a long time, and if anybody needs to take a write lock on
> the mmap_sem it'll block any subsequent readers to the mmap_sem while
> the read is outstanding, which could cause long delays.  Instead drop
> the mmap_sem if we do any reads at all.
> 
> Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
> Signed-off-by: Josef Bacik <josef@toxicpanda.com>
> ---
.....
>  vm_fault_t filemap_fault(struct vm_fault *vmf)
>  {
>  	int error;
> +	struct mm_struct *mm = vmf->vma->vm_mm;
>  	struct file *file = vmf->vma->vm_file;
>  	struct address_space *mapping = file->f_mapping;
>  	struct file_ra_state *ra = &file->f_ra;
>  	struct inode *inode = mapping->host;
>  	pgoff_t offset = vmf->pgoff;
> +	int flags = vmf->flags;

local copy of flags.

>  	pgoff_t max_off;
>  	struct page *page;
>  	vm_fault_t ret = 0;
> @@ -2509,27 +2540,44 @@ vm_fault_t filemap_fault(struct vm_fault *vmf)
>  	 * Do we have something in the page cache already?
>  	 */
>  	page = find_get_page(mapping, offset);
> -	if (likely(page) && !(vmf->flags & FAULT_FLAG_TRIED)) {
> +	if (likely(page) && !(flags & FAULT_FLAG_TRIED)) {

Used here.

>  		/*
>  		 * We found the page, so try async readahead before
>  		 * waiting for the lock.
>  		 */
> -		do_async_mmap_readahead(vmf->vma, ra, file, page, offset);
> +		error = do_async_mmap_readahead(vmf->vma, ra, file, page, offset, vmf->flags);

Not here.

> +		if (error == -EAGAIN)
> +			goto out_retry_wait;
>  	} else if (!page) {
>  		/* No page in the page cache at all */
> -		do_sync_mmap_readahead(vmf->vma, ra, file, offset);
> -		count_vm_event(PGMAJFAULT);
> -		count_memcg_event_mm(vmf->vma->vm_mm, PGMAJFAULT);
>  		ret = VM_FAULT_MAJOR;
> +		count_vm_event(PGMAJFAULT);
> +		count_memcg_event_mm(mm, PGMAJFAULT);
> +		error = do_sync_mmap_readahead(vmf->vma, ra, file, offset, vmf->flags);

or here.

(Also, the vmf is passed through to where these flags
are used, so why is it passed as a separate flag parameter?)

> +		if (error == -EAGAIN)
> +			goto out_retry_wait;
>  retry_find:
>  		page = find_get_page(mapping, offset);
>  		if (!page)
>  			goto no_cached_page;
>  	}
>  
> -	if (!lock_page_or_retry(page, vmf->vma->vm_mm, vmf->flags)) {
> -		put_page(page);
> -		return ret | VM_FAULT_RETRY;
> +	if (!trylock_page(page)) {
> +		if (flags & FAULT_FLAG_ALLOW_RETRY) {
> +			if (flags & FAULT_FLAG_RETRY_NOWAIT)
> +				goto out_retry;
> +			up_read(&mm->mmap_sem);
> +			goto out_retry_wait;
> +		}
> +		if (flags & FAULT_FLAG_KILLABLE) {

but is used here...

> +			int ret = __lock_page_killable(page);
> +
> +			if (ret) {
> +				up_read(&mm->mmap_sem);
> +				goto out_retry;
> +			}
> +		} else
> +			__lock_page(page);
>  	}
>  
>  	/* Did it get truncated? */
> @@ -2607,6 +2655,19 @@ vm_fault_t filemap_fault(struct vm_fault *vmf)
>  	/* Things didn't work out. Return zero to tell the mm layer so. */
>  	shrink_readahead_size_eio(file, ra);
>  	return VM_FAULT_SIGBUS;
> +
> +out_retry_wait:
> +	if (page) {
> +		if (flags & FAULT_FLAG_KILLABLE)

and here.

Any reason for this discrepancy?

-Dave.
-- 
Dave Chinner
david@fromorbit.com

  reply	other threads:[~2018-10-19  3:15 UTC|newest]

Thread overview: 19+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-10-18 20:23 [PATCH 0/7][V3] drop the mmap_sem when doing IO in the fault path Josef Bacik
2018-10-18 20:23 ` [PATCH 1/7] mm: infrastructure for page fault page caching Josef Bacik
2018-10-18 20:23 ` [PATCH 2/7] mm: drop mmap_sem for page cache read IO submission Josef Bacik
2018-10-19  3:14   ` Dave Chinner [this message]
2018-10-18 20:23 ` [PATCH 3/7] mm: drop the mmap_sem in all read fault cases Josef Bacik
2018-10-19  3:21   ` Dave Chinner
2018-10-18 20:23 ` [PATCH 4/7] mm: use the cached page for filemap_fault Josef Bacik
2018-10-19  3:27   ` Dave Chinner
2018-10-18 20:23 ` [PATCH 5/7] mm: add a flag to indicate we used a cached page Josef Bacik
2018-10-19  3:34   ` Dave Chinner
2018-10-18 20:23 ` [PATCH 6/7] mm: allow ->page_mkwrite to do retries Josef Bacik
2018-10-19  3:36   ` Dave Chinner
2018-10-18 20:23 ` [PATCH 7/7] btrfs: drop mmap_sem in mkwrite for btrfs Josef Bacik
2018-10-19  3:48   ` Dave Chinner
2018-10-22 17:56     ` Josef Bacik
2018-10-22 21:31       ` Dave Chinner
2018-10-25 13:22   ` Jan Kara
2018-10-25 13:58     ` Josef Bacik
2018-10-26  9:47       ` Jan Kara

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20181019031446.GH18822@dastard \
    --to=david@fromorbit.com \
    --cc=akpm@linux-foundation.org \
    --cc=hannes@cmpxchg.org \
    --cc=josef@toxicpanda.com \
    --cc=kernel-team@fb.com \
    --cc=linux-btrfs@vger.kernel.org \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=riel@fb.com \
    --cc=tj@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.