From: David Howells <email@example.com>
To: Matthew Wilcox <firstname.lastname@example.org>
Cc: email@example.com, firstname.lastname@example.org,
	Kent Overstreet <email@example.com>,
	Mike Marshall <firstname.lastname@example.org>
Subject: Re: The future of readahead
Date: Thu, 27 Aug 2020 18:02:18 +0100
Message-ID: <email@example.com>
In-Reply-To: <20200826193116.GU17456@casper.infradead.org>

Matthew Wilcox <firstname.lastname@example.org> wrote:

> So solving #2 and #3 looks like a new interface for filesystems to call:
>
> void readahead_expand(struct readahead_control *rac, loff_t start, u64 len);
> or possibly
> void readahead_expand(struct readahead_control *rac, pgoff_t start,
> 		unsigned int count);
>
> It might not actually expand the readahead attempt at all -- for example,
> if there's already a page in the page cache, or if it can't allocate
> memory.  But this puts the responsibility for allocating pages in the VFS,
> where it belongs.

This is exactly what the fscache read helper in my fscache rewrite is doing,
except that I'm doing it in fs/fscache/read_helper.c.

Have a look here:

	https://email@example.com/

and look for the fscache_read_helper() function.

Note that it's slightly complicated because it handles ->readpage(),
->readpages() and ->write_begin()[*].

[*] I want to be able to bring the granule into the cache for modification.
    Ideally I'd be able to see that the entire granule is going to get
    written over and skip the read - kind of like write_begin for a whole
    granule rather than a page.

Shaping the readahead request has the following issues:

 (1) The request may span multiple granules.

 (2) Those granules may be a mixture of cached and uncached.

 (3) The granule size may vary.

 (4) Granules fall on power-of-2 boundaries (for example 256K boundaries)
     within the file, but the request may not start on a boundary and may not
     end on one.
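To illustrate the boundary arithmetic behind issue (4), here is a minimal
userspace sketch that rounds a page range outwards to power-of-2 granule
boundaries.  The struct page_range type and the round_to_granules() name are
hypothetical and invented for this illustration; this is not code from
fs/fscache/read_helper.c.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical page-range type for illustration; not a kernel structure. */
struct page_range {
	uint64_t start;	/* index of the first page */
	uint64_t count;	/* number of pages */
};

/*
 * Round a request outwards so that it starts and ends on granule
 * boundaries.  granule_pages must be a power of two, e.g. 64 pages for a
 * 256K granule with 4K pages.
 */
static struct page_range round_to_granules(struct page_range req,
					   uint64_t granule_pages)
{
	uint64_t mask = granule_pages - 1;
	struct page_range out;
	uint64_t end;

	/* A power of two has exactly one bit set. */
	assert(granule_pages != 0 && (granule_pages & mask) == 0);

	out.start = req.start & ~mask;			/* round down */
	end = (req.start + req.count + mask) & ~mask;	/* round up */
	out.count = end - out.start;
	return out;
}
```

With 64-page granules, a request for 100 pages starting at page 70 would come
back as start 64, count 128, that is, pages 64 to 191.  The page the caller
originally asked for always stays inside the rounded range.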
To deal with this, fscache_read_helper() calls out to the cache backend
(fscache_shape_request()) and the netfs (req->ops->reshape()) to adjust the
read it's going to make.

Shaping the request may mean moving the start earlier as well as expanding or
contracting the size.  The only thing that's guaranteed is that the first page
of the request will be retained.

I also don't let a request cross a cached/uncached boundary, but rather cut
the request off there and return.  The filesystem can then generate a new
request and call back in.  (Note that I have to be able to keep track of the
filesystem's metadata so that I can reissue the request to the netfs in the
event that the cache suffers some sort of error).

What I was originally envisioning for the new ->readahead() interface is to
add a second aop that allows the shaping to be accessed by the VM, before it's
started pinning any pages.

The shaping parameters I think we need are:

	- The inode, for i_size and fscache cookie
	- The proposed page range

and what you would get back could be:

	- Shaped page range
	- Minimum I/O granularity[1]
	- Minimum preferred granularity[2]
	- Flag indicating if the pages can just be zero-filled[3]

[1] The filesystem doesn't want to read in smaller chunks than this.

[2] The cache doesn't want to read in smaller chunks than this, though in the
    cache's case, a partially read block is just abandoned for the moment.
    This number would allow the readahead algorithm to shorten the request if
    it can't allocate a page.

[3] If I know that the local i_size is much bigger than the i_size on the
    server, there's no need to download/read those pages and readahead can
    just clear them.  This is more applicable to write_begin() normally.

Now a chunk of this is in struct readahead_control, so it might be reasonable
to add the other bits there too.
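As a rough userspace sketch of what such a shaping call might hand back, the
code below rounds a proposed range to granule boundaries, cuts it at the
first cached/uncached transition and sets a zero-fill flag.  Everything here
is an assumption made for illustration: struct shape_result, shape_request(),
the per-granule "cached" map, the server_pages parameter and the two
granularity values are all hypothetical, and none of it is the actual
fscache_shape_request() or a proposed aop signature.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* All names here are hypothetical; none of them are existing kernel APIs. */
struct shape_result {
	uint64_t start;		/* shaped range: first page index */
	uint64_t count;		/* shaped range: number of pages */
	uint64_t min_io_pages;	/* minimum I/O granularity */
	uint64_t min_pref_pages;/* minimum preferred granularity */
	bool zero_fill;		/* pages can just be cleared */
};

/* Granules past the end of the map are treated as uncached. */
static bool granule_is_cached(const bool *map, size_t nr, uint64_t g)
{
	return g < nr && map[g];
}

static struct shape_result shape_request(uint64_t start, uint64_t count,
					 uint64_t granule_pages,
					 const bool *map, size_t nr_granules,
					 uint64_t server_pages)
{
	uint64_t mask = granule_pages - 1;
	uint64_t first, end, g;
	struct shape_result res;
	bool cached;

	assert(count > 0);
	assert(granule_pages != 0 && (granule_pages & mask) == 0);

	/* Move the start earlier and the end later to granule boundaries. */
	first = start & ~mask;
	end = (start + count + mask) & ~mask;

	/* Cut the request off at the first cached/uncached transition. */
	g = first / granule_pages;
	cached = granule_is_cached(map, nr_granules, g);
	for (g++; g * granule_pages < end; g++) {
		if (granule_is_cached(map, nr_granules, g) != cached) {
			end = g * granule_pages;
			break;
		}
	}

	res.start = first;
	res.count = end - first;
	res.min_io_pages = 1;			/* assumed value */
	res.min_pref_pages = granule_pages;	/* assumed: whole granules */
	/* Assumed rule: a range wholly past the server's EOF needs no read. */
	res.zero_fill = first >= server_pages;
	return res;
}
```

Because the range only ever grows backwards to the start of the first granule
and is never cut before the end of that granule, the first page of the
original request is always retained.  A caller that gets back a shortened
range would issue a new request for the remainder.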
Note that one thing I really would like to avoid having to do is to expand a
request forward, particularly if the main page of interest is precreated and
locked by the VM before calling the filesystem.  I would much rather the VM
created the pages, starting from the lowest-numbered.

Anyway, that's my 2p.

David