LKML Archive on lore.kernel.org
From: Jens Axboe <axboe@kernel.dk>
To: Jan Kara <jack@suse.cz>, Jens Axboe <axboe@fb.com>
Cc: linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org,
	linux-block@vger.kernel.org, dchinner@redhat.com,
	sedat.dilek@gmail.com
Subject: Re: [PATCH 7/8] wbt: add general throttling mechanism
Date: Tue, 3 May 2016 13:07:33 -0600
Message-ID: <5728F6F5.5050701@kernel.dk>
In-Reply-To: <5728EA8A.9040405@kernel.dk>

On 05/03/2016 12:14 PM, Jens Axboe wrote:
> On 05/03/2016 10:59 AM, Jens Axboe wrote:
>> On 05/03/2016 09:48 AM, Jan Kara wrote:
>>> On Tue 03-05-16 17:40:32, Jan Kara wrote:
>>>> On Tue 03-05-16 11:34:10, Jan Kara wrote:
>>>>> Yeah, once I've hunted down that regression with the old disk, I can
>>>>> have a look into how writeback throttling plays together with
>>>>> blkio-controller.
>>>>
>>>> So I've tried the following script (note that you need cgroup v2 for
>>>> writeback IO to be throttled):
>>>>
>>>> ---
>>>> mkdir /sys/fs/cgroup/group1
>>>> echo 1000 >/sys/fs/cgroup/group1/io.weight
>>>> dd if=/dev/zero of=/mnt/file1 bs=1M count=10000&
>>>> DD1=$!
>>>> echo $DD1 >/sys/fs/cgroup/group1/cgroup.procs
>>>>
>>>> mkdir /sys/fs/cgroup/group2
>>>> echo 100 >/sys/fs/cgroup/group2/io.weight
>>>> #echo "259:65536 wbps=5000000" >/sys/fs/cgroup/group2/io.max
>>>> echo "259:65536 wbps=max" >/sys/fs/cgroup/group2/io.max
>>>> dd if=/dev/zero of=/mnt/file2 bs=1M count=10000&
>>>> DD2=$!
>>>> echo $DD2 >/sys/fs/cgroup/group2/cgroup.procs
>>>>
>>>> while true; do
>>>>          sleep 1
>>>>          kill -USR1 $DD1
>>>>          kill -USR1 $DD2
>>>>          echo '======================================================='
>>>> done
>>>> ---
>>>>
>>>> and watched the progress of the dd processes in different cgroups.
>>>> The 1/10 weight difference has no effect with your writeback patches -
>>>> the situation after one minute:
>>>>
>>>> 3120+1 records in
>>>> 3120+1 records out
>>>> 3272392704 bytes (3.3 GB) copied, 63.7119 s, 51.4 MB/s
>>>> 3217+1 records in
>>>> 3217+1 records out
>>>> 3374010368 bytes (3.4 GB) copied, 63.5819 s, 53.1 MB/s
>>>>
>>>> I should add that even without your patches the progress doesn't quite
>>>> correspond to the weight ratio:
>>>
>>> Forgot to fill in corresponding data for unpatched kernel here:
>>>
>>> 5962+2 records in
>>> 5962+2 records out
>>> 6252281856 bytes (6.3 GB) copied, 64.1719 s, 97.4 MB/s
>>> 1502+0 records in
>>> 1502+0 records out
>>> 1574961152 bytes (1.6 GB) copied, 64.207 s, 24.5 MB/s
>>
>> Thanks for testing this, I'll see what we can do about that. It stands
>> to reason that we'll throttle a heavier writer more, statistically. But
>> I'm assuming this above test was run basically with just the writes
>> going, so no real competition? And hence we end up throttling them
>> equally much, destroying the weighting in the process. But for both
>> cases, we basically don't pay any attention to cgroup weights.
>>
>>>> but still there is a noticeable difference between cgroups with
>>>> different weights.
>>>>
>>>> OTOH blk-throttle combines well with your patches: Limiting one
>>>> cgroup to 5 MB/s results in numbers like:
>>>>
>>>> 3883+2 records in
>>>> 3883+2 records out
>>>> 4072091648 bytes (4.1 GB) copied, 36.6713 s, 111 MB/s
>>>> 413+0 records in
>>>> 413+0 records out
>>>> 433061888 bytes (433 MB) copied, 36.8939 s, 11.7 MB/s
>>>>
>>>> which is fine and comparable with the unpatched kernel. The higher
>>>> throughput number is because we do buffered writes and dd reports what
>>>> it wrote into the page cache. And it is no wonder blk-throttle combines
>>>> fine - it throttles bios, which happens before we reach the writeback
>>>> throttling mechanism.
>>
>> OK, that's good, at least that part works fine. And yes, the throttle
>> path is hit before we end up in the make_request_fn, which is where wbt
>> drops in.
>>
>>>> So I believe this demonstrates that your writeback throttling just
>>>> doesn't work well with a selective scheduling policy that happens
>>>> below it, because it can essentially lead to IO priority inversion
>>>> issues...
>>
>> Is this testing still done on the QD=1 ATA disk? Not too surprising that
>> this falls apart, since we have very little room to maneuver. I wonder
>> if a normal SATA with NCQ would behave better in this regard. I'll have
>> to test a bit and think about how we can best handle this case.
>
> I think what we'll do for now is just disable wbt IFF we have a non-root
> cgroup attached to CFQ. Done here:
>
> http://git.kernel.dk/cgit/linux-block/commit/?h=wb-buf-throttle&id=7315756efe76bbdf83076fc9dbc569bbb4da5d32

That was a bit too untested. This should be better; it taps into where
cfq normally notices a difference in blkcg:

http://git.kernel.dk/cgit/linux-block/commit/?h=wb-buf-throttle&id=9b89e1bb666bd036a4cb1313479435087fb86ba0
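
As a rough cross-check of the dd figures quoted above, the throughput
ratios can be compared against the configured 1000:100 io.weight ratio.
This is only a sketch: "ratio" is a hypothetical helper introduced here,
and the inputs are the MB/s values dd printed in the quoted runs. It
prints about 0.97 for the patched kernel and about 3.98 for the unpatched
one, against a configured ratio of 10:

```shell
#!/bin/sh
# Hypothetical helper (not part of the quoted test script):
# print the ratio of two numbers to two decimal places.
ratio() {
    awk -v a="$1" -v b="$2" 'BEGIN { printf "%.2f\n", a / b }'
}

# io.weight values from the quoted script: group1 = 1000, group2 = 100.
echo "configured weight ratio: $(ratio 1000 100)"
# dd rates (MB/s) reported with the writeback throttling patches applied.
echo "patched kernel:          $(ratio 51.4 53.1)"
# dd rates (MB/s) reported for the unpatched kernel.
echo "unpatched kernel:        $(ratio 97.4 24.5)"
```

Neither run gets near 10:1, but only the patched run loses the weighting
entirely.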


-- 
Jens Axboe

