linux-block.vger.kernel.org archive mirror
From: Andrea Righi <righi.andrea@gmail.com>
To: Vivek Goyal <vgoyal@redhat.com>
Cc: Josef Bacik <josef@toxicpanda.com>, Tejun Heo <tj@kernel.org>,
	Li Zefan <lizefan@huawei.com>,
	Johannes Weiner <hannes@cmpxchg.org>,
	Jens Axboe <axboe@kernel.dk>, Dennis Zhou <dennis@kernel.org>,
	cgroups@vger.kernel.org, linux-block@vger.kernel.org,
	linux-kernel@vger.kernel.org
Subject: Re: [RFC PATCH 0/3] cgroup: fsio throttle controller
Date: Tue, 29 Jan 2019 19:39:38 +0100
Message-ID: <20190129183938.GA2960@xps-13>
In-Reply-To: <20190128192620.GB10240@redhat.com>

On Mon, Jan 28, 2019 at 02:26:20PM -0500, Vivek Goyal wrote:
> On Mon, Jan 28, 2019 at 06:41:29PM +0100, Andrea Righi wrote:
> > Hi Vivek,
> > 
> > sorry for the late reply.
> > 
> > On Mon, Jan 21, 2019 at 04:47:15PM -0500, Vivek Goyal wrote:
> > > On Sat, Jan 19, 2019 at 11:08:27AM +0100, Andrea Righi wrote:
> > > 
> > > [..]
> > > > Alright, let's skip the root cgroup for now. I think the point here is
> > > > if we want to provide sync() isolation among cgroups or not.
> > > > 
> > > > According to the manpage:
> > > > 
> > > >        sync()  causes  all  pending  modifications  to filesystem metadata and cached file data to be
> > > >        written to the underlying filesystems.
> > > > 
> > > > And:
> > > >        According to the standard specification (e.g., POSIX.1-2001), sync() schedules the writes, but
> > > >        may  return  before  the actual writing is done.  However Linux waits for I/O completions, and
> > > >        thus sync() or syncfs() provide the same guarantees as fsync called on every file in the  sys‐
> > > >        tem or filesystem respectively.
> > > > 
> > > > Excluding the root cgroup, do you think a sync() issued inside a
> > > > specific cgroup should wait for I/O completions only for the writes that
> > > > have been generated by that cgroup?
> > > 
> > > Can we account I/O towards the cgroup which issued "sync" only if the write
> > > rate of the sync cgroup is higher than that of the cgroup to which the page
> > > belongs? Would that solve the problem, assuming it's doable?
> > 
> > Maybe this would mitigate the problem, in part, but it doesn't solve it.
> > 
> > The thing is, if a dirty page belongs to a slow cgroup and a fast cgroup
> > issues "sync", the fast cgroup needs to wait a lot, because writeback is
> > happening at the speed of the slow cgroup.
> 
> Hi Andrea,
> 
> But that's true only for I/O which has already been submitted to block
> layer, right? Any new I/O yet to be submitted could still be attributed
> to faster cgroup requesting sync.

Right. If we could bump up the new I/O yet to be submitted I think we
could effectively prevent the priority inversion problem (the ongoing
writeback I/O should be negligible).

> 
> Unless cgroup limits are absurdly low, it should not take very
> long for already submitted I/O to finish. If so, then in practice, it
> might not be a big problem?

I was actually doing my tests with a very low limit (1MB/s both for rbps
and wbps), but this shows the problem very well I think.

Here's what I'm doing:

 [ slow cgroup (1MB/s read/write) ]

   $ cat /sys/fs/cgroup/unified/cg1/io.max
   259:0 rbps=1048576 wbps=1048576 riops=max wiops=max
   $ cat /proc/self/cgroup
   0::/cg1

   $ fio --rw=write --bs=1M --size=32M --numjobs=16 --name=writer --time_based --runtime=30

 [ fast cgroup (root cgroup, no limitation) ]

   # cat /proc/self/cgroup
   0::/

   # time sync
   real	9m32,618s
   user	0m0,000s
   sys	0m0,018s

With this simple test I can easily trigger hung task timeout warnings
and make the whole system totally sluggish (even the processes running
in the root cgroup).
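
For anyone who wants to reproduce the slow cgroup, a setup along these
lines should be close. This is only a sketch: it assumes cgroup v2 is
mounted at /sys/fs/cgroup/unified, that the io controller is available,
and that 259:0 is the block device backing the filesystem under test.
The helper names (limit_line, setup_slow_cgroup) are made up for
illustration, and the actual call is left commented out because it
needs root.

```shell
# Assumed mount point of the cgroup v2 hierarchy (matches the paths above).
CG_ROOT=${CG_ROOT:-/sys/fs/cgroup/unified}

# Hypothetical helper: build an io.max line with equal read/write byte limits.
limit_line() {
    printf '%s rbps=%s wbps=%s\n' "$1" "$2" "$2"
}

# Hypothetical helper: create cg1, limit it to 1 MiB/s, move this shell into it.
setup_slow_cgroup() {
    mkdir -p "$CG_ROOT/cg1"
    # Make the io controller available to child cgroups.
    echo "+io" > "$CG_ROOT/cgroup.subtree_control"
    # 259:0 is an assumption; use the MAJ:MIN of your own device.
    limit_line 259:0 1048576 > "$CG_ROOT/cg1/io.max"
    # Move the current shell into the throttled cgroup.
    echo $$ > "$CG_ROOT/cg1/cgroup.procs"
}

# setup_slow_cgroup   # uncomment and run as root
```

After that, running the fio command from the throttled shell and
"time sync" from a shell left in the root cgroup corresponds to the
test described above.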

When fio ends, writeback is still taking forever to complete, as you can
see by the insane amount of time that sync takes to complete.
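
As a rough back-of-the-envelope reading of that number: if we assume
writeback drains at exactly the wbps limit for the whole duration of
sync (an idealization, not something that was measured), the sync time
translates into the amount of dirty data that was still pending. The
variable names below are just for illustration.

```shell
# Idealized estimate: assumes writeback runs at exactly the wbps limit
# for the entire time that sync is blocked.
wbps=1048576                    # bytes/s, from io.max of cg1
sync_secs=$((9 * 60 + 32))      # "real 9m32,618s", truncated to whole seconds
dirty_bytes=$((wbps * sync_secs))
dirty_mib=$((dirty_bytes / 1048576))
echo "~${dirty_mib} MiB of dirty data pending behind the throttle"
```

Under that assumption, the root cgroup's sync ends up waiting for
several hundred MiB to trickle out at the slow cgroup's rate.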

-Andrea
