From: "NeilBrown" <neilb@suse.de>
To: "Miklos Szeredi" <miklos@szeredi.hu>
Cc: "Andrew Morton" <akpm@linux-foundation.org>,
	"Jaegeuk Kim" <jaegeuk@kernel.org>, "Chao Yu" <chao@kernel.org>,
	"Jeff Layton" <jlayton@kernel.org>,
	"Ilya Dryomov" <idryomov@gmail.com>,
	"Trond Myklebust" <trond.myklebust@hammerspace.com>,
	"Anna Schumaker" <anna.schumaker@netapp.com>,
	"Ryusuke Konishi" <konishi.ryusuke@gmail.com>,
	"Darrick J. Wong" <djwong@kernel.org>,
	"Philipp Reisner" <philipp.reisner@linbit.com>,
	"Lars Ellenberg" <lars.ellenberg@linbit.com>,
	"Paolo Valente" <paolo.valente@linaro.org>,
	"Jens Axboe" <axboe@kernel.dk>, "linux-mm" <linux-mm@kvack.org>,
	linux-nilfs@vger.kernel.org,
	"Linux NFS list" <linux-nfs@vger.kernel.org>,
	linux-fsdevel@vger.kernel.org,
	linux-f2fs-devel@lists.sourceforge.net,
	"Ext4" <linux-ext4@vger.kernel.org>,
	ceph-devel@vger.kernel.org, drbd-dev@lists.linbit.com,
	linux-kernel@vger.kernel.org, linux-block@vger.kernel.org
Subject: Re: [PATCH 1/9] Remove inode_congested()
Date: Sat, 29 Jan 2022 08:36:02 +1100
Message-ID: <164340576289.5493.5784848964540459557@noble.neil.brown.name>
In-Reply-To: <CAJfpegt-igF8HqsDUcMzfU0jYv8WpofLy0Uv0YnXLzsfx=tkGg@mail.gmail.com>

On Fri, 28 Jan 2022, Miklos Szeredi wrote:
> On Thu, 27 Jan 2022 at 03:47, NeilBrown <neilb@suse.de> wrote:
> >
> > inode_congested() reports if the backing-device for the inode is
> > congested.  Few bdi report congestion any more, only ceph, fuse, and
> > nfs.  Having support just for those is unlikely to be useful.
> >
> > The places which test inode_congested() or its variants, like
> > inode_write_congested(), avoid initiating IO if congestion is present.
> > We now have to rely on other places in the stack to back off, or abort
> > requests - we already do for everything except these 3 filesystems.
> >
> > So remove inode_congested() and related functions, and remove the call
> > sites, assuming that inode_congested() always returns 'false'.
> 
> Looks to me this is going to "break" fuse; e.g. readahead path will go
> ahead and try to submit more requests, even if the queue is getting
> congested.   In this case the readahead submission will eventually
> block, which is counterproductive.
> 
> I think we should *first* make sure all call sites are substituted
> with appropriate mechanisms in the affected filesystems and as a last
> step remove the superfluous bdi congestion mechanism.
> 
> You are saying that all fs except these three already have such
> mechanisms in place, right?  Can you elaborate on that?

Not much.  I haven't looked into how other filesystems cope; I just know
that they must, because no other filesystem ever has a congested bdi
(with one or two minor exceptions, like filesystems over drbd).

Surely read-ahead should never block.  If it hits congestion, the
read-ahead request should simply fail.  Block-based filesystems seem to
set REQ_RAHEAD, which might get mapped to REQ_FAILFAST_MASK, though I
don't know how that is ultimately used.

Maybe fuse and others should continue to track 'congestion' and reject
read-ahead requests when congested.  Maybe also skip WB_SYNC_NONE writes.
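
For illustration only, the idea above can be modelled in userspace C.
This is a sketch under stated assumptions, not fuse or kernel code:
every name in it (model_queue, model_submit_readahead, the MODEL_SYNC_*
modes, the congest_at threshold) is hypothetical, the counter is not
thread-safe, and MODEL_SYNC_NONE merely stands in for WB_SYNC_NONE.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical userspace model; none of these names are kernel APIs. */
enum model_sync_mode {
	MODEL_SYNC_NONE,	/* opportunistic writeback, may be skipped */
	MODEL_SYNC_ALL,		/* data-integrity writeback, must proceed */
};

struct model_queue {
	unsigned int in_flight;		/* submitted but not yet completed */
	unsigned int congest_at;	/* assumed congestion threshold */
};

/* The filesystem tracks congestion privately instead of via the bdi. */
static bool model_congested(const struct model_queue *q)
{
	return q->in_flight >= q->congest_at;
}

/* Read-ahead is optional work: fail immediately rather than block. */
static int model_submit_readahead(struct model_queue *q)
{
	if (model_congested(q))
		return -EAGAIN;
	q->in_flight++;
	return 0;
}

/* Returns true if the write was submitted, false if it was skipped. */
static bool model_submit_writeback(struct model_queue *q,
				   enum model_sync_mode mode)
{
	if (mode == MODEL_SYNC_NONE && model_congested(q))
		return false;
	q->in_flight++;
	return true;
}

/* Called when a request finishes, which may lift the congestion. */
static void model_complete(struct model_queue *q)
{
	assert(q->in_flight > 0);
	q->in_flight--;
}
```

In this model a rejected read-ahead costs nothing beyond the lost
prefetch, and only writes that are not required for data integrity are
skipped, so a caller never blocks on optional work.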

Or maybe this doesn't really matter in practice...  I wonder if we can
measure the usefulness of congestion tracking.

Thanks,
NeilBrown
