False waker detection in BFQ

* False waker detection in BFQ
@ 2021-05-05 16:20 Jan Kara
  2021-05-20 15:05 ` Paolo Valente
  0 siblings, 1 reply; 9+ messages in thread
From: Jan Kara @ 2021-05-05 16:20 UTC (permalink / raw)
  To: Paolo Valente; +Cc: linux-block

Hi Paolo!

I have two processes doing direct IO writes like:

dd if=/dev/zero of=/mnt/file$i bs=128k oflag=direct count=4000M

Now each of these processes belongs to a different cgroup and it has
different bfq.weight. I was looking into why these processes do not split
bandwidth according to BFQ weights. Or actually the bandwidth is split
accordingly initially but eventually degrades into 50/50 split. After some
debugging I've found out that due to luck, one of the processes is decided
to be a waker of the other process and at that point we loose isolation
between the two cgroups. This pretty reliably happens sometime during the
run of these two processes on my test VM. So can we tweak the waker logic
to reduce the chances for false positives? Essentially when there are only
two processes doing heavy IO against the device, the logic in
bfq_check_waker() is such that they are very likely to eventually become
wakers of one another. AFAICT the only condition that needs to get
fulfilled is that they need to submit IO within 4 ms of the completion of
IO of the other process 3 times.

								Honza
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 9+ messages in thread