linux-fsdevel.vger.kernel.org archive mirror
From: "Kirill A. Shutemov" <kirill@shutemov.name>
To: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
Cc: linux-mm@kvack.org, Andrew Morton <akpm@linux-foundation.org>,
	linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org,
	Linus Torvalds <torvalds@linux-foundation.org>,
	Alexander Viro <viro@zeniv.linux.org.uk>,
	Johannes Weiner <hannes@cmpxchg.org>,
	Steven Whitehouse <swhiteho@redhat.com>
Subject: Re: [PATCH] mm/filemap: do not allocate cache pages beyond end of file at read
Date: Mon, 28 Oct 2019 15:42:22 +0300
Message-ID: <20191028124222.ld6u3dhhujfqcn7w@box>
In-Reply-To: <157225677483.3442.4227193290486305330.stgit@buzz>

On Mon, Oct 28, 2019 at 12:59:34PM +0300, Konstantin Khlebnikov wrote:
> Page cache could contain pages beyond end of file during write or
> if read races with truncate. But generic_file_buffered_read() always
> allocates unneeded pages beyond eof if somebody reads here and one
> extra page at the end if file size is page-aligned.
> 
> Function generic_file_buffered_read() calls page_cache_sync_readahead()
> if page not found in cache and then do another lookup. Readahead checks
> file size in __do_page_cache_readahead() before allocating pages.
> After that generic_file_buffered_read() falls back to slow path and
> allocates page for ->readpage() without checking file size.
> 
> This patch checks file size before allocating page for ->readpage().
> 
> Signed-off-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
> ---
>  mm/filemap.c |    4 ++++
>  1 file changed, 4 insertions(+)
> 
> diff --git a/mm/filemap.c b/mm/filemap.c
> index 85b7d087eb45..92abf5f348a9 100644
> --- a/mm/filemap.c
> +++ b/mm/filemap.c
> @@ -2225,6 +2225,10 @@ static ssize_t generic_file_buffered_read(struct kiocb *iocb,
>  		goto out;
>  
>  no_cached_page:
> +		/* Do not allocate cache pages beyond end of file. */
> +		if (((loff_t)index << PAGE_SHIFT) >= i_size_read(inode))
> +			goto out;
> +
>  		/*
>  		 * Ok, it wasn't cached, so we need to create a new
>  		 * page..
> 
> 

CC Steven.

I've tried something of this sort back in 2013:

http://lore.kernel.org/r/1377099441-2224-1-git-send-email-kirill.shutemov@linux.intel.com

and I got pushback.

Apparently, some filesystems may not have a valid i_size before ->readpage().
Not sure if that's still the case...

Anyway, I don't think that's a valid reason for this inefficiency. These
filesystems have to have their own implementation of ->read_iter() to deal
with this.
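
For illustration, the check the patch adds can be sketched in userspace C
roughly as below. This is only a sketch: page_beyond_eof() and
SKETCH_PAGE_SHIFT are made-up names, a 4 KiB page size is assumed, and the
size is passed in as a plain argument rather than read with i_size_read().

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption: 4 KiB pages. The kernel's PAGE_SHIFT depends on the arch. */
#define SKETCH_PAGE_SHIFT 12

/*
 * Hypothetical helper, not a kernel function: returns true if the page at
 * 'index' starts at or beyond 'i_size' bytes, i.e. it would hold no file
 * data. Mirrors the comparison in the patch:
 *   ((loff_t)index << PAGE_SHIFT) >= i_size_read(inode)
 */
static bool page_beyond_eof(uint64_t index, int64_t i_size)
{
	return (int64_t)(index << SKETCH_PAGE_SHIFT) >= i_size;
}
```

With a page-aligned size of 8192 bytes, page index 2 starts exactly at EOF,
so the check is true for it. That is the "one extra page" case from the
commit message. Whether i_size can be trusted at this point for every
filesystem is exactly the open question above.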

-- 
 Kirill A. Shutemov

Thread overview: 18+ messages
2019-10-28  9:59 [PATCH] mm/filemap: do not allocate cache pages beyond end of file at read Konstantin Khlebnikov
2019-10-28 12:39 ` Linus Torvalds
2019-10-28 12:42 ` Kirill A. Shutemov [this message]
2019-10-28 12:47   ` Linus Torvalds
2019-10-28 12:57     ` Kirill A. Shutemov
2019-10-29 14:25       ` Konstantin Khlebnikov
2019-10-29 16:52         ` Linus Torvalds
2019-10-30  6:50           ` Kirill A. Shutemov
2019-10-30  7:02             ` Linus Torvalds
2019-10-30 10:34           ` Steven Whitehouse
2019-10-30 10:54             ` Linus Torvalds
2019-10-31 11:40               ` Steven Whitehouse
2019-11-22 23:59                 ` Andreas Grünbacher
2019-11-25 10:52                   ` Steven Whitehouse
2019-11-25 17:05                     ` Linus Torvalds
2019-11-27 15:41                       ` Steven Whitehouse
2019-11-27 16:29                         ` Andreas Gruenbacher
2019-11-27 17:29                         ` Linus Torvalds
