linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: Matthew Wilcox <willy@infradead.org>
To: David Howells <dhowells@redhat.com>
Cc: Trond Myklebust <trondmy@hammerspace.com>,
	Anna Schumaker <anna.schumaker@netapp.com>,
	Steve French <sfrench@samba.org>,
	Dominique Martinet <asmadeus@codewreck.org>,
	Alexander Viro <viro@zeniv.linux.org.uk>,
	Christoph Hellwig <hch@lst.de>,
	linux-mm@kvack.org, linux-cachefs@redhat.com,
	linux-afs@lists.infradead.org, linux-nfs@vger.kernel.org,
	linux-cifs@vger.kernel.org, ceph-devel@vger.kernel.org,
	v9fs-developer@lists.sourceforge.net,
	linux-fsdevel@vger.kernel.org, Jeff Layton <jlayton@redhat.com>,
	David Wysochanski <dwysocha@redhat.com>,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCH 03/33] mm: Implement readahead_control pageset expansion
Date: Wed, 17 Feb 2021 16:13:58 +0000	[thread overview]
Message-ID: <20210217161358.GM2858050@casper.infradead.org> (raw)
In-Reply-To: <161340389201.1303470.14353807284546854878.stgit@warthog.procyon.org.uk>

On Mon, Feb 15, 2021 at 03:44:52PM +0000, David Howells wrote:
> +++ b/include/linux/pagemap.h
> @@ -761,6 +761,8 @@ extern void __delete_from_page_cache(struct page *page, void *shadow);
>  int replace_page_cache_page(struct page *old, struct page *new, gfp_t gfp_mask);
>  void delete_from_page_cache_batch(struct address_space *mapping,
>  				  struct pagevec *pvec);
> +void readahead_expand(struct readahead_control *ractl,
> +		      loff_t new_start, size_t new_len);

If we're revising this patchset, I'd rather this lived with the other
readahead declarations, ie after the definition of readahead_control.

> +	/* Expand the trailing edge upwards */
> +	while (ractl->_nr_pages < new_nr_pages) {
> +		unsigned long index = ractl->_index + ractl->_nr_pages;
> +		struct page *page = xa_load(&mapping->i_pages, index);
> +
> +		if (page && !xa_is_value(page))
> +			return; /* Page apparently present */
> +
> +		page = __page_cache_alloc(gfp_mask);
> +		if (!page)
> +			return;
> +		if (add_to_page_cache_lru(page, mapping, index, gfp_mask) < 0) {
> +			put_page(page);
> +			return;
> +		}
> +		ractl->_nr_pages++;
> +	}

We're defeating the ondemand_readahead() algorithm here.  Let's suppose
userspace is doing 64kB reads, the filesystem is OrangeFS which only
wants to do 4MB reads, the page cache is initially empty and there's
only one thread doing a sequential read.  ondemand_readahead() calls
get_init_ra_size() which tells it to allocate 128kB and set the async
marker at 64kB.  Then orangefs calls readahead_expand() to allocate the
remainder of the 4MB.  After the app has read the first 64kB, it comes
back to read the next 64kB, sees the readahead marker and tries to trigger
the next batch of readahead, but it's already present, so it does nothing
(see page_cache_ra_unbounded() for what happens with pages present).

Then it keeps going through the 4MB that's been read, not seeing any more
readahead markers, gets to 4MB and asks for ... 256kB?  Not quite sure.
Anyway, it then has to wait for the next 4MB because the readahead didn't
overlap with the application processing.

So readahead_expand() needs to adjust the file's f_ra so that when the
application gets to 64kB, it kicks off the readahead of 4MB-8MB chunk (and
then when we get to 4MB+256kB, it kicks off the readahead of 8MB-12MB,
and so on).

Unless someone sees a better way to do this?  I don't
want to inadvertently break POSIX_FADV_WILLNEED which calls
force_page_cache_readahead() and should not perturb the kernel's
ondemand algorithm.  Perhaps we need to add an 'ra' pointer to the
ractl to indicate whether the file_ra_state should be updated by
readahead_expand()?


  parent reply	other threads:[~2021-02-17 16:17 UTC|newest]

Thread overview: 47+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-02-15 15:44 [PATCH 00/33] Network fs helper library & fscache kiocb API [ver #3] David Howells
2021-02-15 15:44 ` [PATCH 01/33] iov_iter: Add ITER_XARRAY David Howells
2021-02-15 15:44 ` [PATCH 02/33] mm: Add an unlock function for PG_private_2/PG_fscache David Howells
2021-02-16 10:26   ` Christoph Hellwig
2021-02-15 15:44 ` [PATCH 03/33] mm: Implement readahead_control pageset expansion David Howells
2021-02-16 10:32   ` Christoph Hellwig
2021-02-16 13:22     ` Matthew Wilcox
2021-02-17 14:36       ` Mike Marshall
2021-02-17 15:42       ` David Howells
2021-02-17 16:59         ` Mike Marshall
2021-02-17 22:20         ` David Howells
2021-02-16 11:48   ` David Howells
2021-02-17 16:13   ` Matthew Wilcox [this message]
2021-02-17 22:34   ` David Howells
2021-02-17 22:49     ` Matthew Wilcox
2021-02-18 17:47   ` David Howells
2021-02-15 15:45 ` [PATCH 04/33] vfs: Export rw_verify_area() for use by cachefiles David Howells
2021-02-16 10:26   ` Christoph Hellwig
2021-02-16 11:55   ` David Howells
2021-02-15 15:45 ` [PATCH 05/33] netfs: Make a netfs helper module David Howells
2021-02-15 15:45 ` [PATCH 06/33] netfs, mm: Move PG_fscache helper funcs to linux/netfs.h David Howells
2021-02-15 15:45 ` [PATCH 07/33] netfs, mm: Add unlock_page_fscache() and wait_on_page_fscache() David Howells
2021-02-15 15:45 ` [PATCH 08/33] netfs: Provide readahead and readpage netfs helpers David Howells
2021-02-15 15:45 ` [PATCH 09/33] netfs: Add tracepoints David Howells
2021-02-15 15:46 ` [PATCH 10/33] netfs: Gather stats David Howells
2021-02-15 15:46 ` [PATCH 11/33] netfs: Add write_begin helper David Howells
2021-02-15 15:46 ` [PATCH 12/33] netfs: Define an interface to talk to a cache David Howells
2021-02-15 15:46 ` [PATCH 13/33] netfs: Hold a ref on a page when PG_private_2 is set David Howells
2021-02-15 18:05 ` [PATCH 00/33] Network fs helper library & fscache kiocb API [ver #3] Jeff Layton
2021-02-16  0:40   ` Steve French
2021-02-16  2:10     ` Matthew Wilcox
2021-02-16  5:18       ` Steve French
2021-02-16  5:22       ` Steve French
2021-02-23 20:27         ` Matthew Wilcox
2021-02-24  4:57           ` Steve French
2021-02-24 13:32       ` David Howells
2021-02-24 15:51         ` Matthew Wilcox
2021-02-16 11:01     ` Jeff Layton
2021-02-15 22:46 ` [PATCH 34/33] netfs: Use in_interrupt() not in_softirq() David Howells
2021-02-16  8:42   ` Christoph Hellwig
2021-02-16  9:06     ` Sebastian Andrzej Siewior
2021-02-16  9:29   ` David Howells
2021-02-16  9:30     ` Christoph Hellwig
2021-02-18 14:02     ` [PATCH 34/33] netfs: Pass flag rather than use in_softirq() David Howells
2021-02-18 15:06       ` Marc Dionne
2021-02-18 15:16       ` Marc Dionne
2021-02-19  9:01       ` Sebastian Andrzej Siewior

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20210217161358.GM2858050@casper.infradead.org \
    --to=willy@infradead.org \
    --cc=anna.schumaker@netapp.com \
    --cc=asmadeus@codewreck.org \
    --cc=ceph-devel@vger.kernel.org \
    --cc=dhowells@redhat.com \
    --cc=dwysocha@redhat.com \
    --cc=hch@lst.de \
    --cc=jlayton@redhat.com \
    --cc=linux-afs@lists.infradead.org \
    --cc=linux-cachefs@redhat.com \
    --cc=linux-cifs@vger.kernel.org \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=linux-nfs@vger.kernel.org \
    --cc=sfrench@samba.org \
    --cc=trondmy@hammerspace.com \
    --cc=v9fs-developer@lists.sourceforge.net \
    --cc=viro@zeniv.linux.org.uk \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).