Re: [PATCHSET block/for-next] IO cost model based work-conserving porportional controller

From: Paolo Valente <paolo.valente@linaro.org>
To: Tejun Heo <tj@kernel.org>
Cc: Jens Axboe <axboe@kernel.dk>,
	newella@fb.com, clm@fb.com, Josef Bacik <josef@toxicpanda.com>,
	dennisz@fb.com, Li Zefan <lizefan@huawei.com>,
	Johannes Weiner <hannes@cmpxchg.org>,
	linux-kernel <linux-kernel@vger.kernel.org>,
	linux-block <linux-block@vger.kernel.org>,
	kernel-team@fb.com, cgroups@vger.kernel.org, ast@kernel.org,
	daniel@iogearbox.net, kafai@fb.com, songliubraving@fb.com,
	yhs@fb.com, bpf@vger.kernel.org
Subject: Re: [PATCHSET block/for-next] IO cost model based work-conserving porportional controller
Date: Fri, 6 Sep 2019 11:07:17 +0200
Message-ID: <EFFA2298-8614-4AFC-9208-B36976F6548C@linaro.org>
In-Reply-To: <20190905165540.GJ2263813@devbig004.ftw2.facebook.com>

> On 5 Sep 2019, at 18:55, Tejun Heo <tj@kernel.org> wrote:
> 
> Hello, Paolo.
> 
> So, I'm currently verifying iocost in the FB fleet.  Around three
> thousand machines running v5.2 (+ some backports) with btrfs on a
> handful of different models of consumer grade SSDs.  I haven't seen
> complete loss of control as you're reporting.  Given that you're
> reporting the same thing on io.latency, which is deployed on multiple
> orders of magnitude more machines at this point, it's likely that
> there's something common affecting your test setup.

Yep, I had that doubt too, so I extended my tests to one more PC and
two more drives: a fast SAMSUNG NVMe SSD 970 PRO and a HITACHI
HTS72755 HDD, using the QoS configurations suggested in your last
email.  As for the filesystem, I'm interested in ext4 because it is
the most widely used filesystem and because, with some workloads, it
makes it hard to control I/O while keeping throughput high.  I'll
provide hw and sw details in my reply to your next question.  I'm
willing to run tests with btrfs too, at a later time.

io.cost misbehaves on the other PC and the other drives too.  In the
following table, each pair of numbers gives the target's throughput
followed by the total throughput:

                     none                io.cost                 bfq
                target     total     target     total     target     total
SAMSUNG SSD     11.373  3295.517      6.468  3273.892     10.802  1862.288
HITACHI HDD      0.026    11.531      0.042    30.299      0.067    76.642

With the SAMSUNG SSD, io.cost gives the target less throughput than
none does (bfq behaves badly too, but that is my problem).  On the
HDD, io.cost gives the target a little more than half the throughput
guaranteed by bfq, and reaches less than half the total throughput
reached by bfq.
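
The ratios behind these statements can be rechecked with a short
script.  This is only a sketch: the numbers are copied from the table
above, while the dictionary layout and the ratio() helper are
illustrative names, not part of the test script used for the
measurements.

```python
# (target throughput, total throughput) per configuration,
# copied verbatim from the table above.
results = {
    "SAMSUNG SSD": {
        "none": (11.373, 3295.517),
        "io.cost": (6.468, 3273.892),
        "bfq": (10.802, 1862.288),
    },
    "HITACHI HDD": {
        "none": (0.026, 11.531),
        "io.cost": (0.042, 30.299),
        "bfq": (0.067, 76.642),
    },
}

TARGET, TOTAL = 0, 1  # index into each (target, total) pair


def ratio(drive, a, b, which):
    """Throughput of configuration a divided by that of configuration b."""
    return results[drive][a][which] / results[drive][b][which]


for drive in results:
    print(
        f"{drive}: "
        f"io.cost/none target = {ratio(drive, 'io.cost', 'none', TARGET):.2f}, "
        f"io.cost/bfq target = {ratio(drive, 'io.cost', 'bfq', TARGET):.2f}, "
        f"io.cost/bfq total = {ratio(drive, 'io.cost', 'bfq', TOTAL):.2f}"
    )
```

On the HDD the io.cost/bfq ratios come out at about 0.63 for the
target and 0.40 for the total; on the SSD the io.cost/none target
ratio is about 0.57.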

I do agree that three thousand is an overwhelming number of machines,
and I'll probably never have that many resources for my tests.  Still,
it seems rather unlikely that two different PCs and three different
drives all suffer from a common anomaly that causes trouble only for
io.cost and io.latency.

I try never to rule out that I myself am the faulty link in the
chain.  But I'm running this test with the public script I mentioned
in my previous emails, and all steps seem correct.

>  Can you please
> describe your test configuration and if you aren't already try testing
> on btrfs?
> 

PC 1: Thinkpad W520, Ubuntu 18.04 (no configuration change w.r.t.
defaults), PLEXTOR SATA PX-256M5S SSD, HITACHI HTS72755 HDD, ext4.

PC 2: Thinkpad X1 Extreme, Ubuntu 19.04 (no configuration change
w.r.t. defaults), SAMSUNG NVMe SSD 970 PRO, ext4.

If you need more details, just ask.

Thanks,
Paolo

> Thanks.
> 
> -- 
> tejun