hybrid polling on an nvme doesn't seem to work with iodepth > 1 on 5.10.0-rc5

* hybrid polling on an nvme doesn't seem to work with iodepth > 1 on 5.10.0-rc5
@ 2020-12-10 20:51 Andres Freund
  2020-12-10 23:12 ` Pavel Begunkov
  0 siblings, 1 reply; 14+ messages in thread
From: Andres Freund @ 2020-12-10 20:51 UTC (permalink / raw)
  To: linux-block, Jens Axboe

Hi,

When using hybrid polling (i.e echo 0 >
/sys/block/nvme1n1/queue/io_poll_delay) I see stalls with fio when using
an iodepth > 1. Sometimes fio hangs, other times the performance is
really poor. I reproduced this with SSDs from different vendors.

$ echo -1 | sudo tee /sys/block/nvme1n1/queue/io_poll_delay
$ fio --ioengine io_uring --rw write --filesize 1GB --overwrite=1 --name=test --direct=1 --bs=$((1024*4)) --time_based=1 --runtime=10 --hipri --iodepth 1
93.4k iops

$ fio --ioengine io_uring --rw write --filesize 1GB --overwrite=1 --name=test --direct=1 --bs=$((1024*4)) --time_based=1 --runtime=10 --hipri --iodepth 32
426k iops

$ echo 0 | sudo tee /sys/block/nvme1n1/queue/io_poll_delay
$ fio --ioengine io_uring --rw write --filesize 1GB --overwrite=1 --name=test --direct=1 --bs=$((1024*4)) --time_based=1 --runtime=10 --hipri --iodepth 1
94.3k iops

$ fio --ioengine io_uring --rw write --filesize 1GB --overwrite=1 --name=test --direct=1 --bs=$((1024*4)) --time_based=1 --runtime=10 --hipri --iodepth 32
167 iops
fio took 33s

However, if I ask fio / io_uring to perform all those IOs at once, the performance is pretty decent again (but obviously that's not that desirable)

$ fio --ioengine io_uring --rw write --filesize 1GB --overwrite=1 --name=test --direct=1 --bs=$((1024*4)) --time_based=1 --runtime=10 --hipri --iodepth 32 --iodepth_batch_submit=32 --iodepth_batch_complete_min=32
394k iops

So it looks like there's something wrong around tracking what needs to
be polled for in hybrid mode.

Greetings,

Andres Freund

^ permalink raw reply	[flat|nested] 14+ messages in thread