From: Gao Xiang <hsiangkao@linux.alibaba.com>
To: Matthew Wilcox <willy@infradead.org>, Theodore Ts'o <tytso@mit.edu>
Cc: lsf-pc@lists.linux-foundation.org, linux-fsdevel@vger.kernel.org,
	linux-mm@kvack.org, linux-block@vger.kernel.org
Subject: Re: [LSF/MM/BPF TOPIC] Cloud storage optimizations
Date: Wed, 1 Mar 2023 12:49:10 +0800
Message-ID: <c6612406-11c7-2158-5186-ebee72c9b698@linux.alibaba.com>
In-Reply-To: <Y/7WJMNLjrQ+/+Vs@casper.infradead.org>

Hi Matthew!

On 2023/3/1 12:35, Matthew Wilcox wrote:
> On Tue, Feb 28, 2023 at 10:52:15PM -0500, Theodore Ts'o wrote:
>> For example, most cloud storage devices are doing read-ahead to try to
>> anticipate read requests from the VM.  This can interfere with the
>> read-ahead being done by the guest kernel.  So being able to tell the
>> cloud storage device whether a particular read request stems from
>> read-ahead or not would be useful.  At the moment, as Matthew Wilcox
>> has pointed out, we use the read-ahead code path for synchronous
>> buffered reads.  So plumbing this information so it can be passed
>> through multiple levels of the mm, fs, and block layers will probably
>> be needed.
> 
> This shouldn't be _too_ painful.  For example, the NVMe driver already
> does the right thing:
> 
>          if (req->cmd_flags & (REQ_FAILFAST_DEV | REQ_RAHEAD))
>                  control |= NVME_RW_LR;
> 
>          if (req->cmd_flags & REQ_RAHEAD)
>                  dsmgmt |= NVME_RW_DSM_FREQ_PREFETCH;
> 
> (LR is Limited Retry; FREQ_PREFETCH is "Speculative read. The command
> is part of a prefetch operation")
> 
> The only problem is that the readahead code doesn't tell the filesystem
> whether the request is sync or async.  This should be a simple matter
> of adding a new 'bool async' to the readahead_control and then setting
> REQ_RAHEAD based on that, rather than on whether the request came in
> through readahead() or read_folio() (eg see mpage_readahead()).

Great!  In addition to that, and (somewhat) off topic: if we get a
"bool async" now, I think it will immediately have some users (such as
EROFS), since we'd like to do post-processing (such as decompression)
immediately in the same context for sync readahead (triggered by
missing pages) and leave it to another kworker for async readahead (I
think it's almost the same for decryption and verification).

So "bool async" would be quite useful on my side if it could be passed
down to the fs side.  I'd like to raise my hand for it.
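
To make the idea concrete, here is a rough userspace sketch of how a
filesystem might consume such a flag.  It is only a model, not kernel
code: struct ra_ctl_sketch, SKETCH_REQ_RAHEAD, enum post_ctx and both
helpers are hypothetical names made up for illustration, and the real
readahead_control has no 'async' member today.

```c
/* Userspace model only; none of these names are kernel API. */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for readahead_control plus the proposed flag. */
struct ra_ctl_sketch {
	bool async;	/* true: speculative readahead; false: a reader is waiting */
};

/* Hypothetical stand-in for REQ_RAHEAD; the bit value is arbitrary. */
#define SKETCH_REQ_RAHEAD (1u << 0)

/* Where post-processing (decompression, decryption, verification) runs. */
enum post_ctx {
	POST_INLINE,	/* immediately, in the same context */
	POST_KWORKER,	/* deferred to another kworker */
};

/* Mark the I/O as readahead only when the request really is async. */
static unsigned int sketch_bio_flags(const struct ra_ctl_sketch *rac)
{
	assert(rac != NULL);
	return rac->async ? SKETCH_REQ_RAHEAD : 0;
}

/* Sync readahead post-processes inline; async readahead defers the work. */
static enum post_ctx sketch_post_ctx(const struct ra_ctl_sketch *rac)
{
	assert(rac != NULL);
	return rac->async ? POST_KWORKER : POST_INLINE;
}
```

With .async = false the helpers return no readahead flag and
POST_INLINE; with .async = true they return SKETCH_REQ_RAHEAD and
POST_KWORKER.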

Thanks,
Gao Xiang

> 
> Another thing to fix is that SCSI doesn't do anything with the REQ_RAHEAD
> flag, so I presume T10 has some work to do (maybe they could borrow the
> Access Frequency field from NVMe, since that was what the drive vendors
> told us they wanted; maybe they changed their minds since).
