linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: Matthew Wilcox <willy@infradead.org>
To: David Howells <dhowells@redhat.com>
Cc: Steve French <smfrench@gmail.com>,
	Jeff Layton <jlayton@redhat.com>,
	Trond Myklebust <trondmy@hammerspace.com>,
	Anna Schumaker <anna.schumaker@netapp.com>,
	Steve French <sfrench@samba.org>,
	Dominique Martinet <asmadeus@codewreck.org>,
	CIFS <linux-cifs@vger.kernel.org>,
	ceph-devel@vger.kernel.org, linux-cachefs@redhat.com,
	Alexander Viro <viro@zeniv.linux.org.uk>,
	linux-mm <linux-mm@kvack.org>,
	linux-afs@lists.infradead.org,
	v9fs-developer@lists.sourceforge.net,
	Christoph Hellwig <hch@lst.de>,
	linux-fsdevel <linux-fsdevel@vger.kernel.org>,
	linux-nfs <linux-nfs@vger.kernel.org>,
	Linus Torvalds <torvalds@linux-foundation.org>,
	David Wysochanski <dwysocha@redhat.com>,
	LKML <linux-kernel@vger.kernel.org>,
	William Kucharski <william.kucharski@oracle.com>
Subject: Re: [PATCH 00/33] Network fs helper library & fscache kiocb API [ver #3]
Date: Wed, 24 Feb 2021 15:51:21 +0000	[thread overview]
Message-ID: <20210224155121.GQ2858050@casper.infradead.org> (raw)
In-Reply-To: <3743319.1614173522@warthog.procyon.org.uk>

On Wed, Feb 24, 2021 at 01:32:02PM +0000, David Howells wrote:
> Steve French <smfrench@gmail.com> wrote:
> 
> > This (readahead behavior improvements in Linux, on single large file
> > sequential read workloads like cp or grep) gets particularly interesting
> > with SMB3 as multichannel becomes more common.  With one channel having one
> > readahead request pending on the network is suboptimal - but not as bad as
> > when multichannel is negotiated. Interestingly in most cases two network
> > connections to the same server (different TCP sockets,but the same mount,
> > even in cases where only network adapter) can achieve better performance -
> > but still significantly lags Windows (and probably other clients) as in
> > Linux we don't keep multiple I/Os in flight at one time (unless different
> > files are being read at the same time by different threads).
> 
> I think it should be relatively straightforward to make the netfs_readahead()
> function generate multiple read requests.  If I wasn't handed sufficient pages
> by the VM upfront to do two or more read requests, I would need to do extra
> expansion.  There are a couple of ways this could be done:

I don't think this is a job for netfs_readahead().  We can get into a
similar situation with SSDs or RAID arrays where ideally we would have
several outstanding readahead requests.

If your drive is connected through a 1Gbps link (eg PCIe gen 1 x1) and
has a latency of 10ms seek time, with one outstanding read, each read
needs to be 12.5MB in size in order to saturate the bus.  If the device
supports 128 outstanding commands, each read need only be 100kB.

We need the core readahead code to handle this situation.  My suggestion
for doing this is to send off an extra readahead request every time we
hit a !Uptodate page.  It looks something like this (assuming the app
is processing the data fast and always hits the !Uptodate case) ...

1. hit 0,
	set readahead size to 64kB,
	mark 32kB as Readahead, send read for 0-64kB
	wait for 0-64kB to complete
2. hit 32kB (Readahead), no reads outstanding
	inc readahead size to 128kB,
	mark 128kB as Readahead, send read for 64k-192kB
3. hit 64kB (!Uptodate), one read outstanding
	mark 256kB as Readahead, send read for 192-320kB
	mark 384kB as Readahead, send read for 320-448kB
	wait for 64-192kB to complete
4. hit 128kB (Readahead), two reads outstanding
	inc readahead size to 256kB,
	mark 576kB as Readahead, send read for 448-704kB
5. hit 192kB (!Uptodate), three reads outstanding
	mark 832kB as Readahead, send read for 704-960kB
	mark 1088kB as Readahead, send read for 960-1216kB
	wait for 192-320kB to complete
6. hit 256kB (Readahead), four reads outstanding
	mark 1344kB as Readahead, send read for 1216-1472kB
7. hit 320kB (!Uptodate), five reads outstanding
	mark 1600kB as Readahead, send read for 1472-1728kB
	mark 1856kB as Readahead, send read for 1728-1984kB
	wait for 320-448kB to complete
8. hit 384kB (Readahead), five reads outstanding
	mark 2112kB as Readahead, send read for 1984-2240kB
9. hit 448kB (!Uptodate), six reads outstanding
	mark 2368kB as Readahead, send read for 2240-2496kB
	mark 2624kB as Readahead, send read for 2496-2752kB
	wait for 448-704kB to complete
10. hit 576kB (Readahead), seven reads outstanding
	mark 2880kB as Readahead, send read for 2752-3008kB

...

Once we stop hitting !Uptodate pages, we'll maintain the number of pages
marked as Readahead, and thus keep the number of readahead requests
at the level it determined was necessary to keep the link saturated.
I think we may need to put a parallelism cap in the bdi so that a device
which is just slow instead of at the end of a long fat pipe doesn't get
overwhelmed with requests.


  reply	other threads:[~2021-02-24 15:52 UTC|newest]

Thread overview: 47+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-02-15 15:44 [PATCH 00/33] Network fs helper library & fscache kiocb API [ver #3] David Howells
2021-02-15 15:44 ` [PATCH 01/33] iov_iter: Add ITER_XARRAY David Howells
2021-02-15 15:44 ` [PATCH 02/33] mm: Add an unlock function for PG_private_2/PG_fscache David Howells
2021-02-16 10:26   ` Christoph Hellwig
2021-02-15 15:44 ` [PATCH 03/33] mm: Implement readahead_control pageset expansion David Howells
2021-02-16 10:32   ` Christoph Hellwig
2021-02-16 13:22     ` Matthew Wilcox
2021-02-17 14:36       ` Mike Marshall
2021-02-17 15:42       ` David Howells
2021-02-17 16:59         ` Mike Marshall
2021-02-17 22:20         ` David Howells
2021-02-16 11:48   ` David Howells
2021-02-17 16:13   ` Matthew Wilcox
2021-02-17 22:34   ` David Howells
2021-02-17 22:49     ` Matthew Wilcox
2021-02-18 17:47   ` David Howells
2021-02-15 15:45 ` [PATCH 04/33] vfs: Export rw_verify_area() for use by cachefiles David Howells
2021-02-16 10:26   ` Christoph Hellwig
2021-02-16 11:55   ` David Howells
2021-02-15 15:45 ` [PATCH 05/33] netfs: Make a netfs helper module David Howells
2021-02-15 15:45 ` [PATCH 06/33] netfs, mm: Move PG_fscache helper funcs to linux/netfs.h David Howells
2021-02-15 15:45 ` [PATCH 07/33] netfs, mm: Add unlock_page_fscache() and wait_on_page_fscache() David Howells
2021-02-15 15:45 ` [PATCH 08/33] netfs: Provide readahead and readpage netfs helpers David Howells
2021-02-15 15:45 ` [PATCH 09/33] netfs: Add tracepoints David Howells
2021-02-15 15:46 ` [PATCH 10/33] netfs: Gather stats David Howells
2021-02-15 15:46 ` [PATCH 11/33] netfs: Add write_begin helper David Howells
2021-02-15 15:46 ` [PATCH 12/33] netfs: Define an interface to talk to a cache David Howells
2021-02-15 15:46 ` [PATCH 13/33] netfs: Hold a ref on a page when PG_private_2 is set David Howells
2021-02-15 18:05 ` [PATCH 00/33] Network fs helper library & fscache kiocb API [ver #3] Jeff Layton
2021-02-16  0:40   ` Steve French
2021-02-16  2:10     ` Matthew Wilcox
2021-02-16  5:18       ` Steve French
2021-02-16  5:22       ` Steve French
2021-02-23 20:27         ` Matthew Wilcox
2021-02-24  4:57           ` Steve French
2021-02-24 13:32       ` David Howells
2021-02-24 15:51         ` Matthew Wilcox [this message]
2021-02-16 11:01     ` Jeff Layton
2021-02-15 22:46 ` [PATCH 34/33] netfs: Use in_interrupt() not in_softirq() David Howells
2021-02-16  8:42   ` Christoph Hellwig
2021-02-16  9:06     ` Sebastian Andrzej Siewior
2021-02-16  9:29   ` David Howells
2021-02-16  9:30     ` Christoph Hellwig
2021-02-18 14:02     ` [PATCH 34/33] netfs: Pass flag rather than use in_softirq() David Howells
2021-02-18 15:06       ` Marc Dionne
2021-02-18 15:16       ` Marc Dionne
2021-02-19  9:01       ` Sebastian Andrzej Siewior

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20210224155121.GQ2858050@casper.infradead.org \
    --to=willy@infradead.org \
    --cc=anna.schumaker@netapp.com \
    --cc=asmadeus@codewreck.org \
    --cc=ceph-devel@vger.kernel.org \
    --cc=dhowells@redhat.com \
    --cc=dwysocha@redhat.com \
    --cc=hch@lst.de \
    --cc=jlayton@redhat.com \
    --cc=linux-afs@lists.infradead.org \
    --cc=linux-cachefs@redhat.com \
    --cc=linux-cifs@vger.kernel.org \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=linux-nfs@vger.kernel.org \
    --cc=sfrench@samba.org \
    --cc=smfrench@gmail.com \
    --cc=torvalds@linux-foundation.org \
    --cc=trondmy@hammerspace.com \
    --cc=v9fs-developer@lists.sourceforge.net \
    --cc=viro@zeniv.linux.org.uk \
    --cc=william.kucharski@oracle.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).