linux-kernel.vger.kernel.org archive mirror
From: Paolo Valente <paolo.valente@linaro.org>
To: Andrea Righi <righi.andrea@gmail.com>
Cc: Tejun Heo <tj@kernel.org>, Li Zefan <lizefan@huawei.com>,
	Johannes Weiner <hannes@cmpxchg.org>,
	Jens Axboe <axboe@kernel.dk>, Vivek Goyal <vgoyal@redhat.com>,
	Josef Bacik <josef@toxicpanda.com>,
	Dennis Zhou <dennis@kernel.org>,
	cgroups@vger.kernel.org, linux-block@vger.kernel.org,
	linux-kernel@vger.kernel.org
Subject: Re: [RFC PATCH 0/3] cgroup: fsio throttle controller
Date: Fri, 18 Jan 2019 12:04:17 +0100	[thread overview]
Message-ID: <E5872191-EE96-442E-B668-DD148157E665@linaro.org> (raw)
In-Reply-To: <20190118103127.325-1-righi.andrea@gmail.com>



> On 18 Jan 2019, at 11:31, Andrea Righi <righi.andrea@gmail.com> wrote:
> 
> This is a redesign of my old cgroup-io-throttle controller:
> https://lwn.net/Articles/330531/
> 
> I'm resuming this old patch to point out a problem that I think is still
> not solved completely.
> 
> = Problem =
> 
> The io.max controller works really well at limiting synchronous I/O
> (READs), but a lot of I/O requests are initiated outside the context of
> the process that is ultimately responsible for their creation (e.g.,
> WRITEs).
> 
> Throttling at the block layer in some cases is too late and we may end
> up slowing down processes that are not responsible for the I/O that
> is being processed at that level.
> 
> = Proposed solution =
> 
> The main idea of this controller is to split I/O measurement and I/O
> throttling: I/O is measured at the block layer for READs and at the page
> cache (dirty pages) for WRITEs, and processes are limited while they're
> generating I/O at the VFS level, based on the measured I/O.
> 
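[Editorial sketch: the measurement/throttling split described above can be
modeled as a per-cgroup token bucket. This is purely illustrative; the class
name `FsioBucket`, its methods, and the numbers are made up and are not taken
from the actual patches.]

```python
import time

class FsioBucket:
    """Illustrative per-cgroup budget: charge where I/O is measured,
    throttle where I/O is generated (the VFS layer)."""

    def __init__(self, rate_bps, burst_bytes):
        self.rate = rate_bps        # allowed bytes per second
        self.burst = burst_bytes    # maximum accumulated budget
        self.tokens = burst_bytes   # currently available budget
        self.last = time.monotonic()

    def _refill(self):
        # Replenish the budget at the configured rate, capped at the burst.
        now = time.monotonic()
        self.tokens = min(self.burst,
                          self.tokens + (now - self.last) * self.rate)
        self.last = now

    def charge(self, nbytes):
        """Account I/O where it happens: block layer for READs,
        page-cache dirtying for WRITEs. May drive the budget negative."""
        self._refill()
        self.tokens -= nbytes

    def throttle(self):
        """At the VFS layer, return how long (seconds) the generating
        task should sleep to repay any accumulated debt."""
        self._refill()
        if self.tokens >= 0:
            return 0.0
        return -self.tokens / self.rate
```

In this model, a negative token count is debt accumulated by past charges;
the sleep imposed at the VFS entry point spreads that debt over time, which
is also why bursts can still appear at writeback time (throttling never
happens at the block layer itself).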

Hi Andrea,
what about the case where two processes are dirtying the same
pages?  Which one will be charged?

Thanks,
Paolo

> = Example =
> 
> Here's a trivial example: create 2 cgroups, set an io.max limit of
> 10MB/s, run a write-intensive workload on both and after a while, from a
> root cgroup, run "sync".
> 
> # cat /proc/self/cgroup
> 0::/cg1
> # fio --rw=write --bs=1M --size=32M --numjobs=16 --name=seeker --time_based --runtime=30
> 
> # cat /proc/self/cgroup
> 0::/cg2
> # fio --rw=write --bs=1M --size=32M --numjobs=16 --name=seeker --time_based --runtime=30
> 
> - io.max controller:
> 
> # echo "259:0 rbps=10485760 wbps=10485760" > /sys/fs/cgroup/unified/cg1/io.max
> # echo "259:0 rbps=10485760 wbps=10485760" > /sys/fs/cgroup/unified/cg2/io.max
> 
> # cat /proc/self/cgroup
> 0::/
> # time sync
> 
> real	0m51,241s
> user	0m0,000s
> sys	0m0,113s
> 
> Ideally "sync" should complete almost immediately, because the root
> cgroup is unlimited and it's not doing any I/O at all, but instead it's
> blocked for more than 50 sec with io.max, because the writeback is
> throttled to satisfy the io.max limits.
> 
> - fsio controller:
> 
> # echo "259:0 10 10" > /sys/fs/cgroup/unified/cg1/fsio.max_mbs
> # echo "259:0 10 10" > /sys/fs/cgroup/unified/cg2/fsio.max_mbs
> 
> [you can find details about the syntax in the documentation patch]
> 
> # cat /proc/self/cgroup
> 0::/
> # time sync
> 
> real	0m0,146s
> user	0m0,003s
> sys	0m0,001s
> 
> = Questions =
> 
> Q: Do we need another controller?
> A: Probably not; I think it would be better to integrate this policy (or
>   something similar) into the current blkio controller. This series is just to
>   highlight the problem and get some ideas on how to address it.
> 
> Q: What about proportional limits / latency?
> A: It should be trivial to add latency-based limits if we integrate this in the
>   current I/O controller. As for proportional limits (weights): they're
>   strictly related to I/O scheduling, and since this controller doesn't touch
>   I/O dispatching policies, it's not trivial to implement proportional limits
>   (bandwidth limiting is definitely more straightforward).
> 
> Q: Applying delays at the VFS layer doesn't prevent I/O spikes during
>   writeback, right?
> A: Correct, the tradeoff here is to tolerate I/O bursts during writeback to
>   avoid priority inversion problems in the system.
> 
> Andrea Righi (3):
>  fsio-throttle: documentation
>  fsio-throttle: controller infrastructure
>  fsio-throttle: instrumentation
> 
> Documentation/cgroup-v1/fsio-throttle.txt | 142 +++++++++
> block/blk-core.c                          |  10 +
> include/linux/cgroup_subsys.h             |   4 +
> include/linux/fsio-throttle.h             |  43 +++
> include/linux/writeback.h                 |   7 +-
> init/Kconfig                              |  11 +
> kernel/cgroup/Makefile                    |   1 +
> kernel/cgroup/fsio-throttle.c             | 501 ++++++++++++++++++++++++++++++
> mm/filemap.c                              |  20 +-
> mm/page-writeback.c                       |  14 +-
> 10 files changed, 749 insertions(+), 4 deletions(-)
> 


  parent reply	other threads:[~2019-01-18 11:04 UTC|newest]

Thread overview: 19+ messages
2019-01-18 10:31 [RFC PATCH 0/3] cgroup: fsio throttle controller Andrea Righi
2019-01-18 10:31 ` [RFC PATCH 1/3] fsio-throttle: documentation Andrea Righi
2019-01-18 10:31 ` [RFC PATCH 2/3] fsio-throttle: controller infrastructure Andrea Righi
2019-01-18 10:31 ` [RFC PATCH 3/3] fsio-throttle: instrumentation Andrea Righi
2019-01-18 11:04 ` Paolo Valente [this message]
2019-01-18 11:10   ` [RFC PATCH 0/3] cgroup: fsio throttle controller Andrea Righi
2019-01-18 11:11     ` Paolo Valente
2019-01-18 16:35 ` Josef Bacik
2019-01-18 17:07   ` Paolo Valente
2019-01-18 17:12     ` Josef Bacik
2019-01-18 19:02     ` Andrea Righi
2019-01-18 18:44   ` Andrea Righi
2019-01-18 19:46     ` Josef Bacik
2019-01-19 10:08       ` Andrea Righi
2019-01-21 21:47         ` Vivek Goyal
2019-01-28 17:41           ` Andrea Righi
2019-01-28 19:26             ` Vivek Goyal
2019-01-29 18:39               ` Andrea Righi
2019-01-29 18:50                 ` Josef Bacik
