linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Michael Stapelberg <stapelberg+linux@google.com>
To: Jan Kara <jack@suse.cz>
Cc: Miklos Szeredi <miklos@szeredi.hu>,
	Andrew Morton <akpm@linux-foundation.org>,
	linux-kernel@vger.kernel.org, linux-mm <linux-mm@kvack.org>,
	linux-fsdevel@vger.kernel.org, Tejun Heo <tj@kernel.org>,
	Dennis Zhou <dennis@kernel.org>, Jens Axboe <axboe@kernel.dk>,
	Roman Gushchin <guro@fb.com>,
	Johannes Thumshirn <johannes.thumshirn@wdc.com>,
	Song Liu <song@kernel.org>, David Sterba <dsterba@suse.com>
Subject: Re: [PATCH] backing_dev_info: introduce min_bw/max_bw limits
Date: Tue, 22 Jun 2021 14:29:32 +0200	[thread overview]
Message-ID: <CAH9Oa-YxL1iu_TVn6bL3Nd4qzYSVDPaO9a96sX4u7dhq+ewasA@mail.gmail.com> (raw)
In-Reply-To: <20210622121205.GG14261@quack2.suse.cz>

Thanks for taking a look! Comments inline:

On Tue, 22 Jun 2021 at 14:12, Jan Kara <jack@suse.cz> wrote:
>
> On Mon 21-06-21 11:20:10, Michael Stapelberg wrote:
> > Hey Miklos
> >
> > On Fri, 18 Jun 2021 at 16:42, Miklos Szeredi <miklos@szeredi.hu> wrote:
> > >
> > > On Fri, 18 Jun 2021 at 10:31, Michael Stapelberg
> > > <stapelberg+linux@google.com> wrote:
> > >
> > > > Maybe, but I don’t have the expertise, motivation or time to
> > > > investigate this any further, let alone commit to get it done.
> > > > During our previous discussion I got the impression that nobody else
> > > > had any cycles for this either:
> > > > https://lore.kernel.org/linux-fsdevel/CANnVG6n=ySfe1gOr=0ituQidp56idGARDKHzP0hv=ERedeMrMA@mail.gmail.com/
> > > >
> > > > Have you had a look at the China LSF report at
> > > > http://bardofschool.blogspot.com/2011/?
> > > > The author of the heuristic has spent significant effort and time
> > > > coming up with what we currently have in the kernel:
> > > >
> > > > """
> > > > Fengguang said he draw more than 10K performance graphs and read even
> > > > more in the past year.
> > > > """
> > > >
> > > > This implies that making changes to the heuristic will not be a quick fix.
> > >
> > > Having a piece of kernel code sitting there that nobody is willing to
> > > fix is certainly not a great situation to be in.
> >
> > Agreed.
> >
> > >
> > > And introducing band aids is not going improve the above situation,
> > > more likely it will prolong it even further.
> >
> > Sounds like “Perfect is the enemy of good” to me: you’re looking for a
> > perfect hypothetical solution,
> > whereas we have a known-working low risk fix for a real problem.
> >
> > Could we find a solution where medium-/long-term, the code in question
> > is improved,
> > perhaps via a Summer Of Code project or similar community efforts,
> > but until then, we apply the patch at hand?
> >
> > As I mentioned, I think adding min/max limits can be useful regardless
> > of how the heuristic itself changes.
> >
> > If that turns out to be incorrect or undesired, we can still turn the
> > knobs into a no-op, if removal isn’t an option.
>
> Well, removal of added knobs is more or less out of question as it can
> break some userspace. Similarly making them no-op is problematic unless we
> are pretty certain it cannot break some existing setup. That's why we have
> to think twice (or better three times ;) before adding any knobs. Also
> honestly the knobs you suggest will be pretty hard to tune when there are
> multiple cgroups with writeback control involved (which can be affected by
> the same problems you observe as well). So I agree with Miklos that this is
> not the right way to go. Speaking of tunables, did you try tuning
> /sys/devices/virtual/bdi/<fuse-bdi>/min_ratio? I suspect that may
> workaround your problems...

Back then, I did try the various tunables (vm.dirty_ratio and
vm.dirty_background_ratio on the global level,
/sys/class/bdi/<bdi>/{min,max}_ratio on the file system level), and
they have had no observable effect on the problem at all in my tests.

>
> Looking into your original report and tracing you did (thanks for that,
> really useful), it seems that the problem is that writeback bandwidth is
> updated at most every 200ms (more frequent calls are just ignored) and are
> triggered only from balance_dirty_pages() (happen when pages are dirtied) and
> inode writeback code so if the workload tends to have short spikes of activity
> and extended periods of quiet time, then writeback bandwidth may indeed be
> seriously miscomputed because we just won't update writeback throughput
> after most of writeback has happened as you observed.
>
> I think the fix for this can be relatively simple. We just need to make
> sure we update writeback bandwidth reasonably quickly after the IO
> finishes. I'll write a patch and see if it helps.

Thank you! Please keep us posted.

      reply	other threads:[~2021-06-22 12:29 UTC|newest]

Thread overview: 7+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-06-17  9:53 [PATCH] backing_dev_info: introduce min_bw/max_bw limits Michael Stapelberg
2021-06-18  8:18 ` Miklos Szeredi
2021-06-18  8:31   ` Michael Stapelberg
2021-06-18 14:41     ` Miklos Szeredi
2021-06-21  9:20       ` Michael Stapelberg
2021-06-22 12:12         ` Jan Kara
2021-06-22 12:29           ` Michael Stapelberg [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=CAH9Oa-YxL1iu_TVn6bL3Nd4qzYSVDPaO9a96sX4u7dhq+ewasA@mail.gmail.com \
    --to=stapelberg+linux@google.com \
    --cc=akpm@linux-foundation.org \
    --cc=axboe@kernel.dk \
    --cc=dennis@kernel.org \
    --cc=dsterba@suse.com \
    --cc=guro@fb.com \
    --cc=jack@suse.cz \
    --cc=johannes.thumshirn@wdc.com \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=miklos@szeredi.hu \
    --cc=song@kernel.org \
    --cc=tj@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).