From: Matthew Wilcox <willy@infradead.org>
To: Shyam Prasad N <nspmangalore@gmail.com>
Cc: David Howells <dhowells@redhat.com>,
Steve French <smfrench@gmail.com>,
CIFS <linux-cifs@vger.kernel.org>
Subject: Re: Classification of reads within a filesystem
Date: Wed, 21 Jul 2021 18:52:05 +0100
Message-ID: <YPhexTyuuE0/Wxf5@casper.infradead.org>
In-Reply-To: <CANT5p=p+f6mrQqKULqJdbyDN-NJoQCsGruvVMH+BUJU0-n62rg@mail.gmail.com>

On Wed, Jul 21, 2021 at 10:38:59PM +0530, Shyam Prasad N wrote:
> Consider a scenario where a user/application issues a readahead/fadvise
> for large data ranges in advance (informing the kernel that they intend
> to read these data ranges soon). Depending on how much data these
> calls cover, they could keep the network quite busy for a network
> filesystem (or the disk for a block filesystem).
>
> I see some value if filesystems have the ability to differentiate the
> reads from regular buffered reads by users. In such cases, the
> filesystem can choose to throttle the readahead reads, so that there's
> a specified bandwidth that's still available for regular reads.
>
> I wanted to get your opinions about this, and on whether this can
> already be done in the VFS ->readahead and ->readpage calls in the
> filesystems.

This is something I have an interest in, but haven't had time to pursue.
The readahead code already has this information: the page cache calls
page_cache_sync_ra() if it needs the page right now, and
page_cache_async_ra() if it thinks it will need the page in the future.

ondemand_readahead() currently gets a true/false parameter
(hit_readahead_marker), although my folio patches change it to pass in
a folio or NULL. That is then *not* passed to the filesystem, but it
could be passed down in the ractl (struct readahead_control).
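
To make the ractl idea concrete, here is a rough user-space sketch, not
kernel code. Every name in it (ra_kind, ractl_model, fs_budget,
fs_admit) is made up for illustration, and the "budget of speculative
pages in flight" policy is only an assumption about what a network
filesystem might want to do with the sync/async classification.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical user-space model; none of these names are kernel API. */
enum ra_kind {
	RA_SYNC,	/* caller needs the page right now */
	RA_ASYNC,	/* speculative: page is expected to be needed soon */
};

/* Stand-in for the readahead control structure, with one extra field. */
struct ractl_model {
	unsigned long index;	/* first page of the request */
	unsigned long nr_pages;	/* pages requested */
	enum ra_kind kind;	/* the classification discussed above */
};

/* Assumed policy: a cap on speculative pages in flight per mount. */
struct fs_budget {
	unsigned long async_limit;
	unsigned long async_in_flight;
};

/*
 * Return how many pages the filesystem should issue now.  Sync requests
 * are never throttled; async requests are trimmed to the remaining budget.
 */
static unsigned long fs_admit(struct fs_budget *b, const struct ractl_model *ra)
{
	unsigned long room;

	assert(b != NULL && ra != NULL);
	if (ra->kind == RA_SYNC)
		return ra->nr_pages;

	room = b->async_limit > b->async_in_flight ?
		b->async_limit - b->async_in_flight : 0;
	if (ra->nr_pages < room)
		room = ra->nr_pages;
	b->async_in_flight += room;
	return room;
}
```

With a limit of 8, a 32-page sync request is admitted in full, while
successive 6-page async requests are admitted as 6, then 2, then 0
pages until something completes and the filesystem lowers
async_in_flight again (completion handling is left out of the sketch).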

There's also some tidying-up to be done around faulting. Currently
fault-around doesn't have a way to express "read me all the pages around
page N". Instead it just assumes that pages N-R/2 to N+R/2 are the
right ones to fetch, when it should be left up to the filesystem or the
readahead code to determine what window of pages to fetch.
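
As a sketch of the difference, again in user-space C with invented
names (page_window, window_fixed, pick_window_fn, window_extent16):
the first function is the fixed N-R/2 to N+R/2 policy, the second is
one example of what a filesystem might choose instead if it were
asked. The 16-page extent alignment is purely an assumption for the
example.

```c
#include <assert.h>

struct page_window {
	unsigned long start;	/* first page index, inclusive */
	unsigned long end;	/* last page index, inclusive */
};

/*
 * Fixed policy: centre a window of roughly r pages on page n, clamped
 * to the valid range [0, last].
 */
static struct page_window window_fixed(unsigned long n, unsigned long r,
				       unsigned long last)
{
	struct page_window w;
	unsigned long half = r / 2;

	assert(n <= last);
	w.start = n > half ? n - half : 0;
	w.end = (last - n > half) ? n + half : last;
	return w;
}

/* Hypothetical hook: the filesystem picks the window around page n. */
typedef struct page_window (*pick_window_fn)(unsigned long n,
					     unsigned long last);

/* Example policy: fetch the whole 16-page-aligned extent containing n. */
static struct page_window window_extent16(unsigned long n, unsigned long last)
{
	struct page_window w;

	assert(n <= last);
	w.start = n & ~15UL;
	w.end = w.start + 15 < last ? w.start + 15 : last;
	return w;
}
```

For N = 20 and R = 16 in a 101-page file, the fixed policy asks for
pages 12 to 28, straddling two extents, while the extent-aligned
policy asks for pages 16 to 31.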

Another thing I have an interest in doing, but have not had the
opportunity to pursue, is making ->readpage synchronous. The current MM
code always calls ->readahead first and only calls ->readpage if
->readahead fails. That means that all the async ->readpage work is
actually wrong; we want to return the best error possible from
->readpage, even if that means sleeping.
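
The calling pattern, modelled in user space with made-up names
(page_model, fs_ops_model, read_one_page and the four stub methods),
looks roughly like this. It is only a sketch of the ordering and of
why a synchronous ->readpage can report a precise error; it does not
mirror the real MM code paths.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of one page and a filesystem's two read methods. */
struct page_model {
	bool uptodate;
};

struct fs_ops_model {
	/* Best effort: no return value, so any error is dropped here. */
	void (*readahead)(struct page_model *page);
	/* Synchronous in this sketch: returns 0 or a negative errno. */
	int (*readpage)(struct page_model *page);
};

/*
 * Try ->readahead first; only if the page is still not uptodate, call
 * ->readpage and hand its error straight back to the caller.
 */
static int read_one_page(const struct fs_ops_model *ops,
			 struct page_model *page)
{
	assert(ops != NULL && page != NULL);
	if (ops->readahead)
		ops->readahead(page);
	if (page->uptodate)
		return 0;
	return ops->readpage(page);
}

/* Stub methods used to exercise the three interesting cases. */
static void ra_fails(struct page_model *page) { (void)page; }
static void ra_succeeds(struct page_model *page) { page->uptodate = true; }
static int rp_io_error(struct page_model *page) { (void)page; return -EIO; }
static int rp_ok(struct page_model *page) { page->uptodate = true; return 0; }
```

When the readahead stub succeeds, ->readpage is never reached; when it
fails, the caller sees exactly what ->readpage returned (-EIO or 0),
which is the "best error possible" property a synchronous ->readpage
would give.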

Oh ... except for swap. For NFS only, swap calls ->readpage, so it really
wants ->readpage to be async so it can kick off multiple pages and
then wait for the one it actually needs. That gets into a conversation
about how much we really care about swap-over-NFS, whether swap should
be using ->readpage or ->direct_IO, and whether swap should use the
file readahead code or its own virtual address based readahead code.
Most of those discussions are outside my area of expertise.