* [PATCH 0/5] Initial support for polled IO
From: Jens Axboe @ 2015-11-06 17:20 UTC
  To: linux-kernel, linux-block; +Cc: keith.busch, hch

Hi,

This is a basic framework for supporting polled IO in the block
layer stack, with added support for sync O_DIRECT IO and the NVMe
driver.

There are a few things missing to truly productize this, but it's
very useful for testing. For now, it's a per-device opt-in feature.
To enable it, you echo 1 to /sys/block/<dev>/queue/io_poll.

Some basic test results:

# dd if=/dev/nvme2n1 of=/dev/null bs=4096 iflag=direct count=200k
[...]
838860800 bytes (839 MB) copied, 3.98791 s, 210 MB/s
# echo 1 > /sys/block/nvme2n1/queue/io_poll
# dd if=/dev/nvme2n1 of=/dev/null bs=4096 iflag=direct count=200k
[...]
838860800 bytes (839 MB) copied, 2.15479 s, 389 MB/s

This is a DRAM-backed NVMe device; per-IO latency drops from ~19.5usec
to ~10.5usec.

# dd if=/dev/nvme0n1 of=/dev/null bs=4096 iflag=direct count=200k
[...]
838860800 bytes (839 MB) copied, 5.90349 s, 142 MB/s
# echo 1 > /sys/block/nvme0n1/queue/io_poll
# dd if=/dev/nvme0n1 of=/dev/null bs=4096 iflag=direct count=200k
838860800 bytes (839 MB) copied, 3.15852 s, 266 MB/s

Samsung NVMe device, ~28.8usec -> ~15.4usec.

# dd if=/dev/nvme1n1 of=/dev/null bs=4096 iflag=direct count=200k
[...]
838860800 bytes (839 MB) copied, 1.78069 s, 471 MB/s
# echo 1 > /sys/block/nvme1n1/queue/io_poll
# dd if=/dev/nvme1n1 of=/dev/null bs=4096 iflag=direct count=200k
[...]
838860800 bytes (839 MB) copied, 1.31546 s, 638 MB/s

Intel NVMe device, ~8.7usec -> ~6.4usec.

Three different devices, with notable wins on all of them. Contrary to
intuition, the slower devices sometimes benefit more: while waiting on
a slower completion, the processor drops into a deeper C-state, and
polling avoids paying that wakeup latency.

I'd like to get this framework in so we can more easily experiment
with polling. I've got another branch, mq-stats, that wires up scalable
collection of device IO completion statistics. We could potentially use
that to make smart decisions about when to poll and for how long. We'll
also work on enabling libaio support for this.

Thanks, Jens




* [PATCH 1/5] block: change ->make_request_fn() and users to return a queue cookie
From: Jens Axboe @ 2015-11-06 17:20 UTC
  To: linux-kernel, linux-block; +Cc: keith.busch, hch, Jens Axboe

No functional changes in this patch, but it prepares us for returning
a more useful cookie related to the IO that was queued up.

Signed-off-by: Jens Axboe <axboe@fb.com>
---
 arch/m68k/emu/nfblock.c                     |  3 ++-
 arch/powerpc/sysdev/axonram.c               |  5 +++--
 arch/xtensa/platforms/iss/simdisk.c         |  3 ++-
 block/blk-core.c                            | 26 ++++++++++++++++----------
 block/blk-mq.c                              | 26 ++++++++++++++------------
 drivers/block/brd.c                         |  5 +++--
 drivers/block/drbd/drbd_int.h               |  2 +-
 drivers/block/drbd/drbd_req.c               |  3 ++-
 drivers/block/null_blk.c                    |  3 ++-
 drivers/block/pktcdvd.c                     |  9 ++++-----
 drivers/block/ps3vram.c                     |  6 ++++--
 drivers/block/rsxx/dev.c                    |  5 +++--
 drivers/block/umem.c                        |  4 ++--
 drivers/block/zram/zram_drv.c               |  5 +++--
 drivers/lightnvm/rrpc.c                     |  9 +++++----
 drivers/md/bcache/request.c                 | 11 ++++++++---
 drivers/md/dm.c                             |  6 +++---
 drivers/md/md.c                             |  8 +++++---
 drivers/nvdimm/blk.c                        |  3 ++-
 drivers/nvdimm/btt.c                        |  3 ++-
 drivers/nvdimm/pmem.c                       |  3 ++-
 drivers/s390/block/dcssblk.c                |  8 +++++---
 drivers/s390/block/xpram.c                  |  5 +++--
 drivers/staging/lustre/lustre/llite/lloop.c |  5 +++--
 include/linux/blk_types.h                   | 24 ++++++++++++++++++++++++
 include/linux/blkdev.h                      |  4 ++--
 include/linux/fs.h                          |  2 +-
 include/linux/lightnvm.h                    |  2 +-
 28 files changed, 127 insertions(+), 71 deletions(-)

diff --git a/arch/m68k/emu/nfblock.c b/arch/m68k/emu/nfblock.c
index f2a00c591bf7..e9110b9b8bcd 100644
--- a/arch/m68k/emu/nfblock.c
+++ b/arch/m68k/emu/nfblock.c
@@ -59,7 +59,7 @@ struct nfhd_device {
 	struct gendisk *disk;
 };
 
-static void nfhd_make_request(struct request_queue *queue, struct bio *bio)
+static blk_qc_t nfhd_make_request(struct request_queue *queue, struct bio *bio)
 {
 	struct nfhd_device *dev = queue->queuedata;
 	struct bio_vec bvec;
@@ -77,6 +77,7 @@ static void nfhd_make_request(struct request_queue *queue, struct bio *bio)
 		sec += len;
 	}
 	bio_endio(bio);
+	return BLK_QC_T_NONE;
 }
 
 static int nfhd_getgeo(struct block_device *bdev, struct hd_geometry *geo)
diff --git a/arch/powerpc/sysdev/axonram.c b/arch/powerpc/sysdev/axonram.c
index d2b79bc336c1..7a399b4d60a0 100644
--- a/arch/powerpc/sysdev/axonram.c
+++ b/arch/powerpc/sysdev/axonram.c
@@ -103,7 +103,7 @@ axon_ram_irq_handler(int irq, void *dev)
  * axon_ram_make_request - make_request() method for block device
  * @queue, @bio: see blk_queue_make_request()
  */
-static void
+static blk_qc_t
 axon_ram_make_request(struct request_queue *queue, struct bio *bio)
 {
 	struct axon_ram_bank *bank = bio->bi_bdev->bd_disk->private_data;
@@ -120,7 +120,7 @@ axon_ram_make_request(struct request_queue *queue, struct bio *bio)
 	bio_for_each_segment(vec, bio, iter) {
 		if (unlikely(phys_mem + vec.bv_len > phys_end)) {
 			bio_io_error(bio);
-			return;
+			return BLK_QC_T_NONE;
 		}
 
 		user_mem = page_address(vec.bv_page) + vec.bv_offset;
@@ -133,6 +133,7 @@ axon_ram_make_request(struct request_queue *queue, struct bio *bio)
 		transfered += vec.bv_len;
 	}
 	bio_endio(bio);
+	return BLK_QC_T_NONE;
 }
 
 /**
diff --git a/arch/xtensa/platforms/iss/simdisk.c b/arch/xtensa/platforms/iss/simdisk.c
index fa84ca990caa..3c3ace2c46b6 100644
--- a/arch/xtensa/platforms/iss/simdisk.c
+++ b/arch/xtensa/platforms/iss/simdisk.c
@@ -101,7 +101,7 @@ static void simdisk_transfer(struct simdisk *dev, unsigned long sector,
 	spin_unlock(&dev->lock);
 }
 
-static void simdisk_make_request(struct request_queue *q, struct bio *bio)
+static blk_qc_t simdisk_make_request(struct request_queue *q, struct bio *bio)
 {
 	struct simdisk *dev = q->queuedata;
 	struct bio_vec bvec;
@@ -119,6 +119,7 @@ static void simdisk_make_request(struct request_queue *q, struct bio *bio)
 	}
 
 	bio_endio(bio);
+	return BLK_QC_T_NONE;
 }
 
 static int simdisk_open(struct block_device *bdev, fmode_t mode)
diff --git a/block/blk-core.c b/block/blk-core.c
index 89eec7965870..e93df6d386a0 100644
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -809,7 +809,7 @@ blk_init_queue_node(request_fn_proc *rfn, spinlock_t *lock, int node_id)
 }
 EXPORT_SYMBOL(blk_init_queue_node);
 
-static void blk_queue_bio(struct request_queue *q, struct bio *bio);
+static blk_qc_t blk_queue_bio(struct request_queue *q, struct bio *bio);
 
 struct request_queue *
 blk_init_allocated_queue(struct request_queue *q, request_fn_proc *rfn,
@@ -1678,7 +1678,7 @@ void init_request_from_bio(struct request *req, struct bio *bio)
 	blk_rq_bio_prep(req->q, req, bio);
 }
 
-static void blk_queue_bio(struct request_queue *q, struct bio *bio)
+static blk_qc_t blk_queue_bio(struct request_queue *q, struct bio *bio)
 {
 	const bool sync = !!(bio->bi_rw & REQ_SYNC);
 	struct blk_plug *plug;
@@ -1698,7 +1698,7 @@ static void blk_queue_bio(struct request_queue *q, struct bio *bio)
 	if (bio_integrity_enabled(bio) && bio_integrity_prep(bio)) {
 		bio->bi_error = -EIO;
 		bio_endio(bio);
-		return;
+		return BLK_QC_T_NONE;
 	}
 
 	if (bio->bi_rw & (REQ_FLUSH | REQ_FUA)) {
@@ -1713,7 +1713,7 @@ static void blk_queue_bio(struct request_queue *q, struct bio *bio)
 	 */
 	if (!blk_queue_nomerges(q)) {
 		if (blk_attempt_plug_merge(q, bio, &request_count, NULL))
-			return;
+			return BLK_QC_T_NONE;
 	} else
 		request_count = blk_plug_queued_count(q);
 
@@ -1791,6 +1791,8 @@ get_rq:
 out_unlock:
 		spin_unlock_irq(q->queue_lock);
 	}
+
+	return BLK_QC_T_NONE;
 }
 
 /*
@@ -1996,12 +1998,13 @@ end_io:
  * a lower device by calling into generic_make_request recursively, which
  * means the bio should NOT be touched after the call to ->make_request_fn.
  */
-void generic_make_request(struct bio *bio)
+blk_qc_t generic_make_request(struct bio *bio)
 {
 	struct bio_list bio_list_on_stack;
+	blk_qc_t ret = BLK_QC_T_NONE;
 
 	if (!generic_make_request_checks(bio))
-		return;
+		goto out;
 
 	/*
 	 * We only want one ->make_request_fn to be active at a time, else
@@ -2015,7 +2018,7 @@ void generic_make_request(struct bio *bio)
 	 */
 	if (current->bio_list) {
 		bio_list_add(current->bio_list, bio);
-		return;
+		goto out;
 	}
 
 	/* following loop may be a bit non-obvious, and so deserves some
@@ -2040,7 +2043,7 @@ void generic_make_request(struct bio *bio)
 
 		if (likely(blk_queue_enter(q, __GFP_WAIT) == 0)) {
 
-			q->make_request_fn(q, bio);
+			ret = q->make_request_fn(q, bio);
 
 			blk_queue_exit(q);
 
@@ -2053,6 +2056,9 @@ void generic_make_request(struct bio *bio)
 		}
 	} while (bio);
 	current->bio_list = NULL; /* deactivate */
+
+out:
+	return ret;
 }
 EXPORT_SYMBOL(generic_make_request);
 
@@ -2066,7 +2072,7 @@ EXPORT_SYMBOL(generic_make_request);
  * interfaces; @bio must be presetup and ready for I/O.
  *
  */
-void submit_bio(int rw, struct bio *bio)
+blk_qc_t submit_bio(int rw, struct bio *bio)
 {
 	bio->bi_rw |= rw;
 
@@ -2100,7 +2106,7 @@ void submit_bio(int rw, struct bio *bio)
 		}
 	}
 
-	generic_make_request(bio);
+	return generic_make_request(bio);
 }
 EXPORT_SYMBOL(submit_bio);
 
diff --git a/block/blk-mq.c b/block/blk-mq.c
index 1c27b3eaef64..65f43bd696a0 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -1235,7 +1235,7 @@ static int blk_mq_direct_issue_request(struct request *rq)
  * but will attempt to bypass the hctx queueing if we can go straight to
  * hardware for SYNC IO.
  */
-static void blk_mq_make_request(struct request_queue *q, struct bio *bio)
+static blk_qc_t blk_mq_make_request(struct request_queue *q, struct bio *bio)
 {
 	const int is_sync = rw_is_sync(bio->bi_rw);
 	const int is_flush_fua = bio->bi_rw & (REQ_FLUSH | REQ_FUA);
@@ -1249,7 +1249,7 @@ static void blk_mq_make_request(struct request_queue *q, struct bio *bio)
 
 	if (bio_integrity_enabled(bio) && bio_integrity_prep(bio)) {
 		bio_io_error(bio);
-		return;
+		return BLK_QC_T_NONE;
 	}
 
 	blk_queue_split(q, &bio, q->bio_split);
@@ -1257,13 +1257,13 @@ static void blk_mq_make_request(struct request_queue *q, struct bio *bio)
 	if (!is_flush_fua && !blk_queue_nomerges(q)) {
 		if (blk_attempt_plug_merge(q, bio, &request_count,
 					   &same_queue_rq))
-			return;
+			return BLK_QC_T_NONE;
 	} else
 		request_count = blk_plug_queued_count(q);
 
 	rq = blk_mq_map_request(q, bio, &data);
 	if (unlikely(!rq))
-		return;
+		return BLK_QC_T_NONE;
 
 	if (unlikely(is_flush_fua)) {
 		blk_mq_bio_to_request(rq, bio);
@@ -1302,11 +1302,11 @@ static void blk_mq_make_request(struct request_queue *q, struct bio *bio)
 			old_rq = rq;
 		blk_mq_put_ctx(data.ctx);
 		if (!old_rq)
-			return;
+			return BLK_QC_T_NONE;
 		if (!blk_mq_direct_issue_request(old_rq))
-			return;
+			return BLK_QC_T_NONE;
 		blk_mq_insert_request(old_rq, false, true, true);
-		return;
+		return BLK_QC_T_NONE;
 	}
 
 	if (!blk_mq_merge_queue_io(data.hctx, data.ctx, rq, bio)) {
@@ -1320,13 +1320,14 @@ run_queue:
 		blk_mq_run_hw_queue(data.hctx, !is_sync || is_flush_fua);
 	}
 	blk_mq_put_ctx(data.ctx);
+	return BLK_QC_T_NONE;
 }
 
 /*
  * Single hardware queue variant. This will attempt to use any per-process
  * plug for merging and IO deferral.
  */
-static void blk_sq_make_request(struct request_queue *q, struct bio *bio)
+static blk_qc_t blk_sq_make_request(struct request_queue *q, struct bio *bio)
 {
 	const int is_sync = rw_is_sync(bio->bi_rw);
 	const int is_flush_fua = bio->bi_rw & (REQ_FLUSH | REQ_FUA);
@@ -1339,18 +1340,18 @@ static void blk_sq_make_request(struct request_queue *q, struct bio *bio)
 
 	if (bio_integrity_enabled(bio) && bio_integrity_prep(bio)) {
 		bio_io_error(bio);
-		return;
+		return BLK_QC_T_NONE;
 	}
 
 	blk_queue_split(q, &bio, q->bio_split);
 
 	if (!is_flush_fua && !blk_queue_nomerges(q) &&
 	    blk_attempt_plug_merge(q, bio, &request_count, NULL))
-		return;
+		return BLK_QC_T_NONE;
 
 	rq = blk_mq_map_request(q, bio, &data);
 	if (unlikely(!rq))
-		return;
+		return BLK_QC_T_NONE;
 
 	if (unlikely(is_flush_fua)) {
 		blk_mq_bio_to_request(rq, bio);
@@ -1374,7 +1375,7 @@ static void blk_sq_make_request(struct request_queue *q, struct bio *bio)
 		}
 		list_add_tail(&rq->queuelist, &plug->mq_list);
 		blk_mq_put_ctx(data.ctx);
-		return;
+		return BLK_QC_T_NONE;
 	}
 
 	if (!blk_mq_merge_queue_io(data.hctx, data.ctx, rq, bio)) {
@@ -1389,6 +1390,7 @@ run_queue:
 	}
 
 	blk_mq_put_ctx(data.ctx);
+	return BLK_QC_T_NONE;
 }
 
 /*
diff --git a/drivers/block/brd.c b/drivers/block/brd.c
index b9794aeeb878..c9f9c30d6467 100644
--- a/drivers/block/brd.c
+++ b/drivers/block/brd.c
@@ -323,7 +323,7 @@ out:
 	return err;
 }
 
-static void brd_make_request(struct request_queue *q, struct bio *bio)
+static blk_qc_t brd_make_request(struct request_queue *q, struct bio *bio)
 {
 	struct block_device *bdev = bio->bi_bdev;
 	struct brd_device *brd = bdev->bd_disk->private_data;
@@ -358,9 +358,10 @@ static void brd_make_request(struct request_queue *q, struct bio *bio)
 
 out:
 	bio_endio(bio);
-	return;
+	return BLK_QC_T_NONE;
 io_error:
 	bio_io_error(bio);
+	return BLK_QC_T_NONE;
 }
 
 static int brd_rw_page(struct block_device *bdev, sector_t sector,
diff --git a/drivers/block/drbd/drbd_int.h b/drivers/block/drbd/drbd_int.h
index 015c6e91b756..e66d453a5f2b 100644
--- a/drivers/block/drbd/drbd_int.h
+++ b/drivers/block/drbd/drbd_int.h
@@ -1448,7 +1448,7 @@ extern int proc_details;
 /* drbd_req */
 extern void do_submit(struct work_struct *ws);
 extern void __drbd_make_request(struct drbd_device *, struct bio *, unsigned long);
-extern void drbd_make_request(struct request_queue *q, struct bio *bio);
+extern blk_qc_t drbd_make_request(struct request_queue *q, struct bio *bio);
 extern int drbd_read_remote(struct drbd_device *device, struct drbd_request *req);
 extern int is_valid_ar_handle(struct drbd_request *, sector_t);
 
diff --git a/drivers/block/drbd/drbd_req.c b/drivers/block/drbd/drbd_req.c
index 211592682169..3ae2c0086563 100644
--- a/drivers/block/drbd/drbd_req.c
+++ b/drivers/block/drbd/drbd_req.c
@@ -1494,7 +1494,7 @@ void do_submit(struct work_struct *ws)
 	}
 }
 
-void drbd_make_request(struct request_queue *q, struct bio *bio)
+blk_qc_t drbd_make_request(struct request_queue *q, struct bio *bio)
 {
 	struct drbd_device *device = (struct drbd_device *) q->queuedata;
 	unsigned long start_jif;
@@ -1510,6 +1510,7 @@ void drbd_make_request(struct request_queue *q, struct bio *bio)
 
 	inc_ap_bio(device);
 	__drbd_make_request(device, bio, start_jif);
+	return BLK_QC_T_NONE;
 }
 
 void request_timer_fn(unsigned long data)
diff --git a/drivers/block/null_blk.c b/drivers/block/null_blk.c
index 1c9e4fe5aa44..6255d1c4bba4 100644
--- a/drivers/block/null_blk.c
+++ b/drivers/block/null_blk.c
@@ -321,7 +321,7 @@ static struct nullb_queue *nullb_to_queue(struct nullb *nullb)
 	return &nullb->queues[index];
 }
 
-static void null_queue_bio(struct request_queue *q, struct bio *bio)
+static blk_qc_t null_queue_bio(struct request_queue *q, struct bio *bio)
 {
 	struct nullb *nullb = q->queuedata;
 	struct nullb_queue *nq = nullb_to_queue(nullb);
@@ -331,6 +331,7 @@ static void null_queue_bio(struct request_queue *q, struct bio *bio)
 	cmd->bio = bio;
 
 	null_handle_cmd(cmd);
+	return BLK_QC_T_NONE;
 }
 
 static int null_rq_prep_fn(struct request_queue *q, struct request *req)
diff --git a/drivers/block/pktcdvd.c b/drivers/block/pktcdvd.c
index 7be2375db7f2..a7f4abcedee1 100644
--- a/drivers/block/pktcdvd.c
+++ b/drivers/block/pktcdvd.c
@@ -2441,7 +2441,7 @@ static void pkt_make_request_write(struct request_queue *q, struct bio *bio)
 	}
 }
 
-static void pkt_make_request(struct request_queue *q, struct bio *bio)
+static blk_qc_t pkt_make_request(struct request_queue *q, struct bio *bio)
 {
 	struct pktcdvd_device *pd;
 	char b[BDEVNAME_SIZE];
@@ -2467,7 +2467,7 @@ static void pkt_make_request(struct request_queue *q, struct bio *bio)
 	 */
 	if (bio_data_dir(bio) == READ) {
 		pkt_make_request_read(pd, bio);
-		return;
+		return BLK_QC_T_NONE;
 	}
 
 	if (!test_bit(PACKET_WRITABLE, &pd->flags)) {
@@ -2499,13 +2499,12 @@ static void pkt_make_request(struct request_queue *q, struct bio *bio)
 		pkt_make_request_write(q, split);
 	} while (split != bio);
 
-	return;
+	return BLK_QC_T_NONE;
 end_io:
 	bio_io_error(bio);
+	return BLK_QC_T_NONE;
 }
 
-
-
 static void pkt_init_queue(struct pktcdvd_device *pd)
 {
 	struct request_queue *q = pd->disk->queue;
diff --git a/drivers/block/ps3vram.c b/drivers/block/ps3vram.c
index d89fcac59515..56847fcda086 100644
--- a/drivers/block/ps3vram.c
+++ b/drivers/block/ps3vram.c
@@ -598,7 +598,7 @@ out:
 	return next;
 }
 
-static void ps3vram_make_request(struct request_queue *q, struct bio *bio)
+static blk_qc_t ps3vram_make_request(struct request_queue *q, struct bio *bio)
 {
 	struct ps3_system_bus_device *dev = q->queuedata;
 	struct ps3vram_priv *priv = ps3_system_bus_get_drvdata(dev);
@@ -614,11 +614,13 @@ static void ps3vram_make_request(struct request_queue *q, struct bio *bio)
 	spin_unlock_irq(&priv->lock);
 
 	if (busy)
-		return;
+		return BLK_QC_T_NONE;
 
 	do {
 		bio = ps3vram_do_bio(dev, bio);
 	} while (bio);
+
+	return BLK_QC_T_NONE;
 }
 
 static int ps3vram_probe(struct ps3_system_bus_device *dev)
diff --git a/drivers/block/rsxx/dev.c b/drivers/block/rsxx/dev.c
index 3163e4cdc2cc..e1b8b7061d2f 100644
--- a/drivers/block/rsxx/dev.c
+++ b/drivers/block/rsxx/dev.c
@@ -145,7 +145,7 @@ static void bio_dma_done_cb(struct rsxx_cardinfo *card,
 	}
 }
 
-static void rsxx_make_request(struct request_queue *q, struct bio *bio)
+static blk_qc_t rsxx_make_request(struct request_queue *q, struct bio *bio)
 {
 	struct rsxx_cardinfo *card = q->queuedata;
 	struct rsxx_bio_meta *bio_meta;
@@ -199,7 +199,7 @@ static void rsxx_make_request(struct request_queue *q, struct bio *bio)
 	if (st)
 		goto queue_err;
 
-	return;
+	return BLK_QC_T_NONE;
 
 queue_err:
 	kmem_cache_free(bio_meta_pool, bio_meta);
@@ -207,6 +207,7 @@ req_err:
 	if (st)
 		bio->bi_error = st;
 	bio_endio(bio);
+	return BLK_QC_T_NONE;
 }
 
 /*----------------- Device Setup -------------------*/
diff --git a/drivers/block/umem.c b/drivers/block/umem.c
index 04d65790a886..7939b9f87441 100644
--- a/drivers/block/umem.c
+++ b/drivers/block/umem.c
@@ -524,7 +524,7 @@ static int mm_check_plugged(struct cardinfo *card)
 	return !!blk_check_plugged(mm_unplug, card, sizeof(struct blk_plug_cb));
 }
 
-static void mm_make_request(struct request_queue *q, struct bio *bio)
+static blk_qc_t mm_make_request(struct request_queue *q, struct bio *bio)
 {
 	struct cardinfo *card = q->queuedata;
 	pr_debug("mm_make_request %llu %u\n",
@@ -541,7 +541,7 @@ static void mm_make_request(struct request_queue *q, struct bio *bio)
 		activate(card);
 	spin_unlock_irq(&card->lock);
 
-	return;
+	return BLK_QC_T_NONE;
 }
 
 static irqreturn_t mm_interrupt(int irq, void *__card)
diff --git a/drivers/block/zram/zram_drv.c b/drivers/block/zram/zram_drv.c
index 9fa15bb9d118..4c99b6ba8681 100644
--- a/drivers/block/zram/zram_drv.c
+++ b/drivers/block/zram/zram_drv.c
@@ -894,7 +894,7 @@ out:
 /*
  * Handler function for all zram I/O requests.
  */
-static void zram_make_request(struct request_queue *queue, struct bio *bio)
+static blk_qc_t zram_make_request(struct request_queue *queue, struct bio *bio)
 {
 	struct zram *zram = queue->queuedata;
 
@@ -911,11 +911,12 @@ static void zram_make_request(struct request_queue *queue, struct bio *bio)
 
 	__zram_make_request(zram, bio);
 	zram_meta_put(zram);
-	return;
+	return BLK_QC_T_NONE;
 put_zram:
 	zram_meta_put(zram);
 error:
 	bio_io_error(bio);
+	return BLK_QC_T_NONE;
 }
 
 static void zram_slot_free_notify(struct block_device *bdev,
diff --git a/drivers/lightnvm/rrpc.c b/drivers/lightnvm/rrpc.c
index 64a888a5e9b3..7ba64c87ba1c 100644
--- a/drivers/lightnvm/rrpc.c
+++ b/drivers/lightnvm/rrpc.c
@@ -803,7 +803,7 @@ static int rrpc_submit_io(struct rrpc *rrpc, struct bio *bio,
 	return NVM_IO_OK;
 }
 
-static void rrpc_make_rq(struct request_queue *q, struct bio *bio)
+static blk_qc_t rrpc_make_rq(struct request_queue *q, struct bio *bio)
 {
 	struct rrpc *rrpc = q->queuedata;
 	struct nvm_rq *rqd;
@@ -811,21 +811,21 @@ static void rrpc_make_rq(struct request_queue *q, struct bio *bio)
 
 	if (bio->bi_rw & REQ_DISCARD) {
 		rrpc_discard(rrpc, bio);
-		return;
+		return BLK_QC_T_NONE;
 	}
 
 	rqd = mempool_alloc(rrpc->rq_pool, GFP_KERNEL);
 	if (!rqd) {
 		pr_err_ratelimited("rrpc: not able to queue bio.");
 		bio_io_error(bio);
-		return;
+		return BLK_QC_T_NONE;
 	}
 	memset(rqd, 0, sizeof(struct nvm_rq));
 
 	err = rrpc_submit_io(rrpc, bio, rqd, NVM_IOTYPE_NONE);
 	switch (err) {
 	case NVM_IO_OK:
-		return;
+		return BLK_QC_T_NONE;
 	case NVM_IO_ERR:
 		bio_io_error(bio);
 		break;
@@ -841,6 +841,7 @@ static void rrpc_make_rq(struct request_queue *q, struct bio *bio)
 	}
 
 	mempool_free(rqd, rrpc->rq_pool);
+	return BLK_QC_T_NONE;
 }
 
 static void rrpc_requeue(struct work_struct *work)
diff --git a/drivers/md/bcache/request.c b/drivers/md/bcache/request.c
index 8e9877b04637..25fa8445bb24 100644
--- a/drivers/md/bcache/request.c
+++ b/drivers/md/bcache/request.c
@@ -958,7 +958,8 @@ static void cached_dev_nodata(struct closure *cl)
 
 /* Cached devices - read & write stuff */
 
-static void cached_dev_make_request(struct request_queue *q, struct bio *bio)
+static blk_qc_t cached_dev_make_request(struct request_queue *q,
+					struct bio *bio)
 {
 	struct search *s;
 	struct bcache_device *d = bio->bi_bdev->bd_disk->private_data;
@@ -997,6 +998,8 @@ static void cached_dev_make_request(struct request_queue *q, struct bio *bio)
 		else
 			generic_make_request(bio);
 	}
+
+	return BLK_QC_T_NONE;
 }
 
 static int cached_dev_ioctl(struct bcache_device *d, fmode_t mode,
@@ -1070,7 +1073,8 @@ static void flash_dev_nodata(struct closure *cl)
 	continue_at(cl, search_free, NULL);
 }
 
-static void flash_dev_make_request(struct request_queue *q, struct bio *bio)
+static blk_qc_t flash_dev_make_request(struct request_queue *q,
+					     struct bio *bio)
 {
 	struct search *s;
 	struct closure *cl;
@@ -1093,7 +1097,7 @@ static void flash_dev_make_request(struct request_queue *q, struct bio *bio)
 		continue_at_nobarrier(&s->cl,
 				      flash_dev_nodata,
 				      bcache_wq);
-		return;
+		return BLK_QC_T_NONE;
 	} else if (rw) {
 		bch_keybuf_check_overlapping(&s->iop.c->moving_gc_keys,
 					&KEY(d->id, bio->bi_iter.bi_sector, 0),
@@ -1109,6 +1113,7 @@ static void flash_dev_make_request(struct request_queue *q, struct bio *bio)
 	}
 
 	continue_at(cl, search_free, NULL);
+	return BLK_QC_T_NONE;
 }
 
 static int flash_dev_ioctl(struct bcache_device *d, fmode_t mode,
diff --git a/drivers/md/dm.c b/drivers/md/dm.c
index 32440ad5f684..6e15f3565892 100644
--- a/drivers/md/dm.c
+++ b/drivers/md/dm.c
@@ -1755,7 +1755,7 @@ static void __split_and_process_bio(struct mapped_device *md,
  * The request function that just remaps the bio built up by
  * dm_merge_bvec.
  */
-static void dm_make_request(struct request_queue *q, struct bio *bio)
+static blk_qc_t dm_make_request(struct request_queue *q, struct bio *bio)
 {
 	int rw = bio_data_dir(bio);
 	struct mapped_device *md = q->queuedata;
@@ -1774,12 +1774,12 @@ static void dm_make_request(struct request_queue *q, struct bio *bio)
 			queue_io(md, bio);
 		else
 			bio_io_error(bio);
-		return;
+		return BLK_QC_T_NONE;
 	}
 
 	__split_and_process_bio(md, map, bio);
 	dm_put_live_table(md, srcu_idx);
-	return;
+	return BLK_QC_T_NONE;
 }
 
 int dm_request_based(struct mapped_device *md)
diff --git a/drivers/md/md.c b/drivers/md/md.c
index 3f9a514b5b9d..807095f4c793 100644
--- a/drivers/md/md.c
+++ b/drivers/md/md.c
@@ -250,7 +250,7 @@ static DEFINE_SPINLOCK(all_mddevs_lock);
  * call has finished, the bio has been linked into some internal structure
  * and so is visible to ->quiesce(), so we don't need the refcount any more.
  */
-static void md_make_request(struct request_queue *q, struct bio *bio)
+static blk_qc_t md_make_request(struct request_queue *q, struct bio *bio)
 {
 	const int rw = bio_data_dir(bio);
 	struct mddev *mddev = q->queuedata;
@@ -262,13 +262,13 @@ static void md_make_request(struct request_queue *q, struct bio *bio)
 	if (mddev == NULL || mddev->pers == NULL
 	    || !mddev->ready) {
 		bio_io_error(bio);
-		return;
+		return BLK_QC_T_NONE;
 	}
 	if (mddev->ro == 1 && unlikely(rw == WRITE)) {
 		if (bio_sectors(bio) != 0)
 			bio->bi_error = -EROFS;
 		bio_endio(bio);
-		return;
+		return BLK_QC_T_NONE;
 	}
 	smp_rmb(); /* Ensure implications of  'active' are visible */
 	rcu_read_lock();
@@ -302,6 +302,8 @@ static void md_make_request(struct request_queue *q, struct bio *bio)
 
 	if (atomic_dec_and_test(&mddev->active_io) && mddev->suspended)
 		wake_up(&mddev->sb_wait);
+
+	return BLK_QC_T_NONE;
 }
 
 /* mddev_suspend makes sure no new requests are submitted
diff --git a/drivers/nvdimm/blk.c b/drivers/nvdimm/blk.c
index 0df77cb07df6..91a336ea8c4f 100644
--- a/drivers/nvdimm/blk.c
+++ b/drivers/nvdimm/blk.c
@@ -161,7 +161,7 @@ static int nd_blk_do_bvec(struct nd_blk_device *blk_dev,
 	return err;
 }
 
-static void nd_blk_make_request(struct request_queue *q, struct bio *bio)
+static blk_qc_t nd_blk_make_request(struct request_queue *q, struct bio *bio)
 {
 	struct block_device *bdev = bio->bi_bdev;
 	struct gendisk *disk = bdev->bd_disk;
@@ -208,6 +208,7 @@ static void nd_blk_make_request(struct request_queue *q, struct bio *bio)
 
  out:
 	bio_endio(bio);
+	return BLK_QC_T_NONE;
 }
 
 static int nd_blk_rw_bytes(struct nd_namespace_common *ndns,
diff --git a/drivers/nvdimm/btt.c b/drivers/nvdimm/btt.c
index eae93ab8ffcd..efb2c1ceef98 100644
--- a/drivers/nvdimm/btt.c
+++ b/drivers/nvdimm/btt.c
@@ -1150,7 +1150,7 @@ static int btt_do_bvec(struct btt *btt, struct bio_integrity_payload *bip,
 	return ret;
 }
 
-static void btt_make_request(struct request_queue *q, struct bio *bio)
+static blk_qc_t btt_make_request(struct request_queue *q, struct bio *bio)
 {
 	struct bio_integrity_payload *bip = bio_integrity(bio);
 	struct btt *btt = q->queuedata;
@@ -1198,6 +1198,7 @@ static void btt_make_request(struct request_queue *q, struct bio *bio)
 
 out:
 	bio_endio(bio);
+	return BLK_QC_T_NONE;
 }
 
 static int btt_rw_page(struct block_device *bdev, sector_t sector,
diff --git a/drivers/nvdimm/pmem.c b/drivers/nvdimm/pmem.c
index 0ba6a978f227..3963b7533b65 100644
--- a/drivers/nvdimm/pmem.c
+++ b/drivers/nvdimm/pmem.c
@@ -64,7 +64,7 @@ static void pmem_do_bvec(struct pmem_device *pmem, struct page *page,
 	kunmap_atomic(mem);
 }
 
-static void pmem_make_request(struct request_queue *q, struct bio *bio)
+static blk_qc_t pmem_make_request(struct request_queue *q, struct bio *bio)
 {
 	bool do_acct;
 	unsigned long start;
@@ -84,6 +84,7 @@ static void pmem_make_request(struct request_queue *q, struct bio *bio)
 		wmb_pmem();
 
 	bio_endio(bio);
+	return BLK_QC_T_NONE;
 }
 
 static int pmem_rw_page(struct block_device *bdev, sector_t sector,
diff --git a/drivers/s390/block/dcssblk.c b/drivers/s390/block/dcssblk.c
index 5ed44fe21380..94a8f4ab57bc 100644
--- a/drivers/s390/block/dcssblk.c
+++ b/drivers/s390/block/dcssblk.c
@@ -27,7 +27,8 @@
 
 static int dcssblk_open(struct block_device *bdev, fmode_t mode);
 static void dcssblk_release(struct gendisk *disk, fmode_t mode);
-static void dcssblk_make_request(struct request_queue *q, struct bio *bio);
+static blk_qc_t dcssblk_make_request(struct request_queue *q,
+						struct bio *bio);
 static long dcssblk_direct_access(struct block_device *bdev, sector_t secnum,
 			 void __pmem **kaddr, unsigned long *pfn);
 
@@ -815,7 +816,7 @@ dcssblk_release(struct gendisk *disk, fmode_t mode)
 	up_write(&dcssblk_devices_sem);
 }
 
-static void
+static blk_qc_t
 dcssblk_make_request(struct request_queue *q, struct bio *bio)
 {
 	struct dcssblk_dev_info *dev_info;
@@ -874,9 +875,10 @@ dcssblk_make_request(struct request_queue *q, struct bio *bio)
 		bytes_done += bvec.bv_len;
 	}
 	bio_endio(bio);
-	return;
+	return BLK_QC_T_NONE;
 fail:
 	bio_io_error(bio);
+	return BLK_QC_T_NONE;
 }
 
 static long
diff --git a/drivers/s390/block/xpram.c b/drivers/s390/block/xpram.c
index 02871f1db562..288f59a4147b 100644
--- a/drivers/s390/block/xpram.c
+++ b/drivers/s390/block/xpram.c
@@ -181,7 +181,7 @@ static unsigned long xpram_highest_page_index(void)
 /*
  * Block device make request function.
  */
-static void xpram_make_request(struct request_queue *q, struct bio *bio)
+static blk_qc_t xpram_make_request(struct request_queue *q, struct bio *bio)
 {
 	xpram_device_t *xdev = bio->bi_bdev->bd_disk->private_data;
 	struct bio_vec bvec;
@@ -223,9 +223,10 @@ static void xpram_make_request(struct request_queue *q, struct bio *bio)
 		}
 	}
 	bio_endio(bio);
-	return;
+	return BLK_QC_T_NONE;
 fail:
 	bio_io_error(bio);
+	return BLK_QC_T_NONE;
 }
 
 static int xpram_getgeo(struct block_device *bdev, struct hd_geometry *geo)
diff --git a/drivers/staging/lustre/lustre/llite/lloop.c b/drivers/staging/lustre/lustre/llite/lloop.c
index e6974c36276d..fed50d538a41 100644
--- a/drivers/staging/lustre/lustre/llite/lloop.c
+++ b/drivers/staging/lustre/lustre/llite/lloop.c
@@ -333,7 +333,7 @@ static unsigned int loop_get_bio(struct lloop_device *lo, struct bio **req)
 	return count;
 }
 
-static void loop_make_request(struct request_queue *q, struct bio *old_bio)
+static blk_qc_t loop_make_request(struct request_queue *q, struct bio *old_bio)
 {
 	struct lloop_device *lo = q->queuedata;
 	int rw = bio_rw(old_bio);
@@ -364,9 +364,10 @@ static void loop_make_request(struct request_queue *q, struct bio *old_bio)
 		goto err;
 	}
 	loop_add_bio(lo, old_bio);
-	return;
+	return BLK_QC_T_NONE;
 err:
 	bio_io_error(old_bio);
+	return BLK_QC_T_NONE;
 }
 
 static inline void loop_handle_bio(struct lloop_device *lo, struct bio *bio)
diff --git a/include/linux/blk_types.h b/include/linux/blk_types.h
index e8130138f29d..641e5a3ed58c 100644
--- a/include/linux/blk_types.h
+++ b/include/linux/blk_types.h
@@ -244,4 +244,28 @@ enum rq_flag_bits {
 #define REQ_MQ_INFLIGHT		(1ULL << __REQ_MQ_INFLIGHT)
 #define REQ_NO_TIMEOUT		(1ULL << __REQ_NO_TIMEOUT)
 
+typedef unsigned int blk_qc_t;
+#define BLK_QC_T_NONE	-1U
+#define BLK_QC_T_SHIFT	16
+
+static inline bool blk_qc_t_valid(blk_qc_t cookie)
+{
+	return cookie != BLK_QC_T_NONE;
+}
+
+static inline blk_qc_t blk_tag_to_qc_t(unsigned int tag, unsigned int queue_num)
+{
+	return tag | (queue_num << BLK_QC_T_SHIFT);
+}
+
+static inline unsigned int blk_qc_t_to_queue_num(blk_qc_t cookie)
+{
+	return cookie >> BLK_QC_T_SHIFT;
+}
+
+static inline unsigned int blk_qc_t_to_tag(blk_qc_t cookie)
+{
+	return cookie & 0xffff;
+}
+
 #endif /* __LINUX_BLK_TYPES_H */
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index d045ca8487af..5ee0f5243025 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -209,7 +209,7 @@ static inline unsigned short req_get_ioprio(struct request *req)
 struct blk_queue_ctx;
 
 typedef void (request_fn_proc) (struct request_queue *q);
-typedef void (make_request_fn) (struct request_queue *q, struct bio *bio);
+typedef blk_qc_t (make_request_fn) (struct request_queue *q, struct bio *bio);
 typedef int (prep_rq_fn) (struct request_queue *, struct request *);
 typedef void (unprep_rq_fn) (struct request_queue *, struct request *);
 
@@ -761,7 +761,7 @@ static inline void rq_flush_dcache_pages(struct request *rq)
 
 extern int blk_register_queue(struct gendisk *disk);
 extern void blk_unregister_queue(struct gendisk *disk);
-extern void generic_make_request(struct bio *bio);
+extern blk_qc_t generic_make_request(struct bio *bio);
 extern void blk_rq_init(struct request_queue *q, struct request *rq);
 extern void blk_put_request(struct request *);
 extern void __blk_put_request(struct request_queue *, struct request *);
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 72d8a844c692..bcca36e4bc1e 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -2625,7 +2625,7 @@ static inline void remove_inode_hash(struct inode *inode)
 extern void inode_sb_list_add(struct inode *inode);
 
 #ifdef CONFIG_BLOCK
-extern void submit_bio(int, struct bio *);
+extern blk_qc_t submit_bio(int, struct bio *);
 extern int bdev_read_only(struct block_device *);
 #endif
 extern int set_blocksize(struct block_device *, int);
diff --git a/include/linux/lightnvm.h b/include/linux/lightnvm.h
index 5ebd70d12f35..69c9057e1ab8 100644
--- a/include/linux/lightnvm.h
+++ b/include/linux/lightnvm.h
@@ -426,7 +426,7 @@ static inline struct ppa_addr block_to_ppa(struct nvm_dev *dev,
 	return ppa;
 }
 
-typedef void (nvm_tgt_make_rq_fn)(struct request_queue *, struct bio *);
+typedef blk_qc_t (nvm_tgt_make_rq_fn)(struct request_queue *, struct bio *);
 typedef sector_t (nvm_tgt_capacity_fn)(void *);
 typedef int (nvm_tgt_end_io_fn)(struct nvm_rq *, int);
 typedef void *(nvm_tgt_init_fn)(struct nvm_dev *, struct gendisk *, int, int);
-- 
2.4.1.168.g1ea28e1
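
For reference, the helpers added to blk_types.h above just pack a 16-bit
tag and a hardware queue number into a single word. A minimal,
self-contained sketch of the round trip (plain C, values illustrative):

  #include <assert.h>

  typedef unsigned int blk_qc_t;
  #define BLK_QC_T_NONE   -1U
  #define BLK_QC_T_SHIFT  16

  int main(void)
  {
          /* tag 42 submitted on hardware queue 3 */
          blk_qc_t cookie = 42 | (3 << BLK_QC_T_SHIFT);

          assert(cookie != BLK_QC_T_NONE);          /* blk_qc_t_valid() */
          assert((cookie >> BLK_QC_T_SHIFT) == 3);  /* blk_qc_t_to_queue_num() */
          assert((cookie & 0xffff) == 42);          /* blk_qc_t_to_tag() */
          return 0;
  }

This also makes the encoding's implicit assumption visible: tags must fit
in 16 bits for the cookie to be unambiguous.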



* [PATCH 2/5] blk-mq: return tag/queue combo in the make_request_fn handlers
From: Jens Axboe @ 2015-11-06 17:20 UTC
  To: linux-kernel, linux-block; +Cc: keith.busch, hch, Jens Axboe

Return a cookie, blk_qc_t, from the blk-mq make_request functions that
allows a later caller to uniquely identify a specific IO. The cookie is
opaque to the caller, who can later pass it back to the block layer;
the block layer can then identify the hardware queue and request from
that cookie.

Signed-off-by: Jens Axboe <axboe@fb.com>
---
 block/blk-mq.c | 45 ++++++++++++++++++++++++++++-----------------
 1 file changed, 28 insertions(+), 17 deletions(-)

diff --git a/block/blk-mq.c b/block/blk-mq.c
index 65f43bd696a0..66f3cf9c436d 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -1198,7 +1198,7 @@ static struct request *blk_mq_map_request(struct request_queue *q,
 	return rq;
 }
 
-static int blk_mq_direct_issue_request(struct request *rq)
+static int blk_mq_direct_issue_request(struct request *rq, blk_qc_t *cookie)
 {
 	int ret;
 	struct request_queue *q = rq->q;
@@ -1209,6 +1209,7 @@ static int blk_mq_direct_issue_request(struct request *rq)
 		.list = NULL,
 		.last = 1
 	};
+	blk_qc_t new_cookie = blk_tag_to_qc_t(rq->tag, hctx->queue_num);
 
 	/*
 	 * For OK queue, we are done. For error, kill it. Any other
@@ -1216,18 +1217,21 @@ static int blk_mq_direct_issue_request(struct request *rq)
 	 * would have done
 	 */
 	ret = q->mq_ops->queue_rq(hctx, &bd);
-	if (ret == BLK_MQ_RQ_QUEUE_OK)
+	if (ret == BLK_MQ_RQ_QUEUE_OK) {
+		*cookie = new_cookie;
 		return 0;
-	else {
-		__blk_mq_requeue_request(rq);
+	}
 
-		if (ret == BLK_MQ_RQ_QUEUE_ERROR) {
-			rq->errors = -EIO;
-			blk_mq_end_request(rq, rq->errors);
-			return 0;
-		}
-		return -1;
+	__blk_mq_requeue_request(rq);
+
+	if (ret == BLK_MQ_RQ_QUEUE_ERROR) {
+		*cookie = BLK_QC_T_NONE;
+		rq->errors = -EIO;
+		blk_mq_end_request(rq, rq->errors);
+		return 0;
 	}
+
+	return -1;
 }
 
 /*
@@ -1244,6 +1248,7 @@ static blk_qc_t blk_mq_make_request(struct request_queue *q, struct bio *bio)
 	unsigned int request_count = 0;
 	struct blk_plug *plug;
 	struct request *same_queue_rq = NULL;
+	blk_qc_t cookie;
 
 	blk_queue_bounce(q, &bio);
 
@@ -1265,6 +1270,8 @@ static blk_qc_t blk_mq_make_request(struct request_queue *q, struct bio *bio)
 	if (unlikely(!rq))
 		return BLK_QC_T_NONE;
 
+	cookie = blk_tag_to_qc_t(rq->tag, data.hctx->queue_num);
+
 	if (unlikely(is_flush_fua)) {
 		blk_mq_bio_to_request(rq, bio);
 		blk_insert_flush(rq);
@@ -1302,11 +1309,11 @@ static blk_qc_t blk_mq_make_request(struct request_queue *q, struct bio *bio)
 			old_rq = rq;
 		blk_mq_put_ctx(data.ctx);
 		if (!old_rq)
-			return BLK_QC_T_NONE;
-		if (!blk_mq_direct_issue_request(old_rq))
-			return BLK_QC_T_NONE;
+			goto done;
+		if (!blk_mq_direct_issue_request(old_rq, &cookie))
+			goto done;
 		blk_mq_insert_request(old_rq, false, true, true);
-		return BLK_QC_T_NONE;
+		goto done;
 	}
 
 	if (!blk_mq_merge_queue_io(data.hctx, data.ctx, rq, bio)) {
@@ -1320,7 +1327,8 @@ run_queue:
 		blk_mq_run_hw_queue(data.hctx, !is_sync || is_flush_fua);
 	}
 	blk_mq_put_ctx(data.ctx);
-	return BLK_QC_T_NONE;
+done:
+	return cookie;
 }
 
 /*
@@ -1335,6 +1343,7 @@ static blk_qc_t blk_sq_make_request(struct request_queue *q, struct bio *bio)
 	unsigned int request_count = 0;
 	struct blk_map_ctx data;
 	struct request *rq;
+	blk_qc_t cookie;
 
 	blk_queue_bounce(q, &bio);
 
@@ -1353,6 +1362,8 @@ static blk_qc_t blk_sq_make_request(struct request_queue *q, struct bio *bio)
 	if (unlikely(!rq))
 		return BLK_QC_T_NONE;
 
+	cookie = blk_tag_to_qc_t(rq->tag, data.hctx->queue_num);
+
 	if (unlikely(is_flush_fua)) {
 		blk_mq_bio_to_request(rq, bio);
 		blk_insert_flush(rq);
@@ -1375,7 +1386,7 @@ static blk_qc_t blk_sq_make_request(struct request_queue *q, struct bio *bio)
 		}
 		list_add_tail(&rq->queuelist, &plug->mq_list);
 		blk_mq_put_ctx(data.ctx);
-		return BLK_QC_T_NONE;
+		return cookie;
 	}
 
 	if (!blk_mq_merge_queue_io(data.hctx, data.ctx, rq, bio)) {
@@ -1390,7 +1401,7 @@ run_queue:
 	}
 
 	blk_mq_put_ctx(data.ctx);
-	return BLK_QC_T_NONE;
+	return cookie;
 }
 
 /*
-- 
2.4.1.168.g1ea28e1
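
Note the error-path convention in blk_mq_direct_issue_request() above: on
a terminal error the cookie is forced to BLK_QC_T_NONE before the request
is ended, so no caller ends up polling for an IO that has already failed.
A toy model of that out-parameter convention (all names here are
illustrative, not kernel API):

  #include <stdio.h>

  typedef unsigned int blk_qc_t;
  #define BLK_QC_T_NONE   -1U

  enum { QUEUE_OK, QUEUE_ERROR, QUEUE_BUSY };

  /* The cookie is only a valid poll target when dispatch succeeded. */
  static int issue(int hw_result, unsigned int tag, unsigned int qnum,
                   blk_qc_t *cookie)
  {
          if (hw_result == QUEUE_OK) {
                  *cookie = tag | (qnum << 16);   /* pollable */
                  return 0;
          }
          if (hw_result == QUEUE_ERROR) {
                  *cookie = BLK_QC_T_NONE;        /* failed: nothing to poll */
                  return 0;                       /* request ended with -EIO */
          }
          return -1;                              /* busy: caller requeues */
  }

  int main(void)
  {
          blk_qc_t cookie;

          issue(QUEUE_ERROR, 7, 1, &cookie);
          printf("pollable: %s\n", cookie != BLK_QC_T_NONE ? "yes" : "no");
          return 0;
  }

Without the invalidation, a poller could spin on a tag that has already
been completed and possibly reused.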



* [PATCH 3/5] block: add block polling support
From: Jens Axboe @ 2015-11-06 17:20 UTC
  To: linux-kernel, linux-block; +Cc: keith.busch, hch, Jens Axboe

Add basic support for polling for a specific IO to complete. This uses
the cookie that blk-mq passes back, which the block layer hands to the
driver so it can spin on the completion of that specific request.

This will be combined with request latency tracking, so we can make
qualified decisions about when to poll and when not to. For now, for
benchmark purposes, we add a sysfs file that controls whether polling
is enabled or not.

Signed-off-by: Jens Axboe <axboe@fb.com>
---
 block/blk-core.c       | 41 +++++++++++++++++++++++++++++++++++++++++
 block/blk-mq-sysfs.c   | 10 ++++++++++
 block/blk-sysfs.c      | 35 +++++++++++++++++++++++++++++++++++
 include/linux/blk-mq.h | 10 ++++++++++
 include/linux/blkdev.h |  3 +++
 5 files changed, 99 insertions(+)

diff --git a/block/blk-core.c b/block/blk-core.c
index e93df6d386a0..fa36b4ff7d63 100644
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -3312,6 +3312,47 @@ void blk_finish_plug(struct blk_plug *plug)
 }
 EXPORT_SYMBOL(blk_finish_plug);
 
+bool blk_poll(struct request_queue *q, blk_qc_t cookie)
+{
+	struct blk_plug *plug;
+	long state;
+
+	if (!q->mq_ops || !q->mq_ops->poll || !blk_qc_t_valid(cookie) ||
+	    !test_bit(QUEUE_FLAG_POLL, &q->queue_flags))
+		return false;
+
+	plug = current->plug;
+	if (plug)
+		blk_flush_plug_list(plug, false);
+
+	state = current->state;
+	while (!need_resched()) {
+		unsigned int queue_num = blk_qc_t_to_queue_num(cookie);
+		struct blk_mq_hw_ctx *hctx = q->queue_hw_ctx[queue_num];
+		int ret;
+
+		hctx->poll_invoked++;
+
+		ret = q->mq_ops->poll(hctx, blk_qc_t_to_tag(cookie));
+		if (ret > 0) {
+			hctx->poll_success++;
+			set_current_state(TASK_RUNNING);
+			return true;
+		}
+
+		if (signal_pending_state(state, current))
+			set_current_state(TASK_RUNNING);
+
+		if (current->state == TASK_RUNNING)
+			return true;
+		if (ret < 0)
+			break;
+		cpu_relax();
+	}
+
+	return false;
+}
+
 #ifdef CONFIG_PM
 /**
  * blk_pm_runtime_init - Block layer runtime PM initialization routine
diff --git a/block/blk-mq-sysfs.c b/block/blk-mq-sysfs.c
index 6f57a110289c..1cf18784c5cf 100644
--- a/block/blk-mq-sysfs.c
+++ b/block/blk-mq-sysfs.c
@@ -174,6 +174,11 @@ static ssize_t blk_mq_sysfs_rq_list_show(struct blk_mq_ctx *ctx, char *page)
 	return ret;
 }
 
+static ssize_t blk_mq_hw_sysfs_poll_show(struct blk_mq_hw_ctx *hctx, char *page)
+{
+	return sprintf(page, "invoked=%lu, success=%lu\n", hctx->poll_invoked, hctx->poll_success);
+}
+
 static ssize_t blk_mq_hw_sysfs_queued_show(struct blk_mq_hw_ctx *hctx,
 					   char *page)
 {
@@ -295,6 +300,10 @@ static struct blk_mq_hw_ctx_sysfs_entry blk_mq_hw_sysfs_cpus = {
 	.attr = {.name = "cpu_list", .mode = S_IRUGO },
 	.show = blk_mq_hw_sysfs_cpus_show,
 };
+static struct blk_mq_hw_ctx_sysfs_entry blk_mq_hw_sysfs_poll = {
+	.attr = {.name = "io_poll", .mode = S_IRUGO },
+	.show = blk_mq_hw_sysfs_poll_show,
+};
 
 static struct attribute *default_hw_ctx_attrs[] = {
 	&blk_mq_hw_sysfs_queued.attr,
@@ -304,6 +313,7 @@ static struct attribute *default_hw_ctx_attrs[] = {
 	&blk_mq_hw_sysfs_tags.attr,
 	&blk_mq_hw_sysfs_cpus.attr,
 	&blk_mq_hw_sysfs_active.attr,
+	&blk_mq_hw_sysfs_poll.attr,
 	NULL,
 };
 
diff --git a/block/blk-sysfs.c b/block/blk-sysfs.c
index 31849e328b45..565b8dac5782 100644
--- a/block/blk-sysfs.c
+++ b/block/blk-sysfs.c
@@ -317,6 +317,34 @@ queue_rq_affinity_store(struct request_queue *q, const char *page, size_t count)
 	return ret;
 }
 
+static ssize_t queue_poll_show(struct request_queue *q, char *page)
+{
+	return queue_var_show(test_bit(QUEUE_FLAG_POLL, &q->queue_flags), page);
+}
+
+static ssize_t queue_poll_store(struct request_queue *q, const char *page,
+				size_t count)
+{
+	unsigned long poll_on;
+	ssize_t ret;
+
+	if (!q->mq_ops || !q->mq_ops->poll)
+		return -EINVAL;
+
+	ret = queue_var_store(&poll_on, page, count);
+	if (ret < 0)
+		return ret;
+
+	spin_lock_irq(q->queue_lock);
+	if (poll_on)
+		queue_flag_set(QUEUE_FLAG_POLL, q);
+	else
+		queue_flag_clear(QUEUE_FLAG_POLL, q);
+	spin_unlock_irq(q->queue_lock);
+
+	return ret;
+}
+
 static struct queue_sysfs_entry queue_requests_entry = {
 	.attr = {.name = "nr_requests", .mode = S_IRUGO | S_IWUSR },
 	.show = queue_requests_show,
@@ -442,6 +470,12 @@ static struct queue_sysfs_entry queue_random_entry = {
 	.store = queue_store_random,
 };
 
+static struct queue_sysfs_entry queue_poll_entry = {
+	.attr = {.name = "io_poll", .mode = S_IRUGO | S_IWUSR },
+	.show = queue_poll_show,
+	.store = queue_poll_store,
+};
+
 static struct attribute *default_attrs[] = {
 	&queue_requests_entry.attr,
 	&queue_ra_entry.attr,
@@ -466,6 +500,7 @@ static struct attribute *default_attrs[] = {
 	&queue_rq_affinity_entry.attr,
 	&queue_iostats_entry.attr,
 	&queue_random_entry.attr,
+	&queue_poll_entry.attr,
 	NULL,
 };
 
diff --git a/include/linux/blk-mq.h b/include/linux/blk-mq.h
index 83cc9d4e5455..daf17d70aeca 100644
--- a/include/linux/blk-mq.h
+++ b/include/linux/blk-mq.h
@@ -59,6 +59,9 @@ struct blk_mq_hw_ctx {
 
 	struct blk_mq_cpu_notifier	cpu_notifier;
 	struct kobject		kobj;
+
+	unsigned long		poll_invoked;
+	unsigned long		poll_success;
 };
 
 struct blk_mq_tag_set {
@@ -97,6 +100,8 @@ typedef void (exit_request_fn)(void *, struct request *, unsigned int,
 typedef void (busy_iter_fn)(struct blk_mq_hw_ctx *, struct request *, void *,
 		bool);
 typedef void (busy_tag_iter_fn)(struct request *, void *, bool);
+typedef int (poll_fn)(struct blk_mq_hw_ctx *, unsigned int);
+
 
 struct blk_mq_ops {
 	/*
@@ -114,6 +119,11 @@ struct blk_mq_ops {
 	 */
 	timeout_fn		*timeout;
 
+	/*
+	 * Called to poll for completion of a specific tag.
+	 */
+	poll_fn			*poll;
+
 	softirq_done_fn		*complete;
 
 	/*
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index 5ee0f5243025..3fe27f8d91f0 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -487,6 +487,7 @@ struct request_queue {
 #define QUEUE_FLAG_DEAD        19	/* queue tear-down finished */
 #define QUEUE_FLAG_INIT_DONE   20	/* queue is initialized */
 #define QUEUE_FLAG_NO_SG_MERGE 21	/* don't attempt to merge SG segments*/
+#define QUEUE_FLAG_POLL	       22	/* IO polling enabled if set */
 
 #define QUEUE_FLAG_DEFAULT	((1 << QUEUE_FLAG_IO_STAT) |		\
 				 (1 << QUEUE_FLAG_STACKABLE)	|	\
@@ -814,6 +815,8 @@ extern int blk_execute_rq(struct request_queue *, struct gendisk *,
 extern void blk_execute_rq_nowait(struct request_queue *, struct gendisk *,
 				  struct request *, int, rq_end_io_fn *);
 
+bool blk_poll(struct request_queue *q, blk_qc_t cookie);
+
 static inline struct request_queue *bdev_get_queue(struct block_device *bdev)
 {
 	return bdev->bd_disk->queue;	/* this is never NULL */
-- 
2.4.1.168.g1ea28e1
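
The per-hctx io_poll file added here gives a quick read on how effective
polling is. A hypothetical session (the sysfs path assumes blk-mq's
mq/<hctx> directory layout, and the counts shown are illustrative only):

  # echo 1 > /sys/block/nvme0n1/queue/io_poll
  # dd if=/dev/nvme0n1 of=/dev/null bs=4096 iflag=direct count=1k
  # cat /sys/block/nvme0n1/mq/0/io_poll
  invoked=5243, success=1024

Note that blk_poll() may call ->poll() several times before a given IO
completes, so invoked counts poll attempts, not IOs; success counts the
attempts that actually found a completion.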



* [PATCH 4/5] NVMe: add blk polling support
From: Jens Axboe @ 2015-11-06 17:20 UTC
  To: linux-kernel, linux-block; +Cc: keith.busch, hch, Jens Axboe

Add nvme_poll(), which will check a specific completion queue for
command completions. Wire that up to the new block layer poll
mechanism.

Later on we'll set up specific sq/cq pairs that don't have interrupts
enabled, so we can do more efficient polling. As of this patch, an
IRQ will still trigger on command completion.

Signed-off-by: Jens Axboe <axboe@fb.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 drivers/nvme/host/pci.c | 32 ++++++++++++++++++++++++++++----
 1 file changed, 28 insertions(+), 4 deletions(-)

diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
index e878590e71b6..4a715f49f5db 100644
--- a/drivers/nvme/host/pci.c
+++ b/drivers/nvme/host/pci.c
@@ -90,7 +90,7 @@ static struct class *nvme_class;
 
 static int __nvme_reset(struct nvme_dev *dev);
 static int nvme_reset(struct nvme_dev *dev);
-static int nvme_process_cq(struct nvme_queue *nvmeq);
+static void nvme_process_cq(struct nvme_queue *nvmeq);
 static void nvme_dead_ctrl(struct nvme_dev *dev);
 
 struct async_cmd_info {
@@ -935,7 +935,7 @@ static int nvme_queue_rq(struct blk_mq_hw_ctx *hctx,
 	return BLK_MQ_RQ_QUEUE_BUSY;
 }
 
-static int nvme_process_cq(struct nvme_queue *nvmeq)
+static void __nvme_process_cq(struct nvme_queue *nvmeq, unsigned int *tag)
 {
 	u16 head, phase;
 
@@ -953,6 +953,8 @@ static int nvme_process_cq(struct nvme_queue *nvmeq)
 			head = 0;
 			phase = !phase;
 		}
+		if (tag && *tag == cqe.command_id)
+			*tag = -1;
 		ctx = nvme_finish_cmd(nvmeq, cqe.command_id, &fn);
 		fn(nvmeq, ctx, &cqe);
 	}
@@ -964,14 +966,18 @@ static int nvme_process_cq(struct nvme_queue *nvmeq)
 	 * a big problem.
 	 */
 	if (head == nvmeq->cq_head && phase == nvmeq->cq_phase)
-		return 0;
+		return;
 
 	writel(head, nvmeq->q_db + nvmeq->dev->db_stride);
 	nvmeq->cq_head = head;
 	nvmeq->cq_phase = phase;
 
 	nvmeq->cqe_seen = 1;
-	return 1;
+}
+
+static void nvme_process_cq(struct nvme_queue *nvmeq)
+{
+	__nvme_process_cq(nvmeq, NULL);
 }
 
 static irqreturn_t nvme_irq(int irq, void *data)
@@ -995,6 +1001,23 @@ static irqreturn_t nvme_irq_check(int irq, void *data)
 	return IRQ_WAKE_THREAD;
 }
 
+static int nvme_poll(struct blk_mq_hw_ctx *hctx, unsigned int tag)
+{
+	struct nvme_queue *nvmeq = hctx->driver_data;
+
+	if ((le16_to_cpu(nvmeq->cqes[nvmeq->cq_head].status) & 1) ==
+	    nvmeq->cq_phase) {
+		spin_lock_irq(&nvmeq->q_lock);
+		__nvme_process_cq(nvmeq, &tag);
+		spin_unlock_irq(&nvmeq->q_lock);
+
+		if (tag == -1)
+			return 1;
+	}
+
+	return 0;
+}
+
 /*
  * Returns 0 on success.  If the result is negative, it's a Linux error code;
  * if the result is positive, it's an NVM Express status code
@@ -1654,6 +1677,7 @@ static struct blk_mq_ops nvme_mq_ops = {
 	.init_hctx	= nvme_init_hctx,
 	.init_request	= nvme_init_request,
 	.timeout	= nvme_timeout,
+	.poll		= nvme_poll,
 };
 
 static void nvme_dev_remove_admin(struct nvme_dev *dev)
-- 
2.4.1.168.g1ea28e1
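
The cheap pre-check in nvme_poll() above works because of the completion
queue's phase-tag protocol: an entry is new iff bit 0 of its status word
matches the consumer's current phase, and the expected phase flips each
time the ring wraps. A simplified, self-contained model (host-endian,
single consumer, no doorbell write):

  #include <stdio.h>

  #define Q_DEPTH 4

  struct cqe {
          unsigned short command_id;
          unsigned short status;          /* bit 0: phase tag */
  };

  int main(void)
  {
          /* two entries written with phase 1; the rest still show phase 0 */
          struct cqe cq[Q_DEPTH] = { { 10, 1 }, { 11, 1 }, { 0, 0 }, { 0, 0 } };
          unsigned int head = 0, phase = 1;

          while ((cq[head].status & 1) == phase) {
                  printf("completed command %u\n", cq[head].command_id);
                  if (++head == Q_DEPTH) {
                          head = 0;
                          phase = !phase; /* wrapped: new entries flip phase */
                  }
          }
          return 0;
  }

On top of this, __nvme_process_cq() matches cqe.command_id against the tag
from the cookie, which is how nvme_poll() can answer "did this specific
request complete" rather than just "did anything complete".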



* [PATCH 5/5] directio: add block polling support
From: Jens Axboe @ 2015-11-06 17:20 UTC
  To: linux-kernel, linux-block; +Cc: keith.busch, hch, Jens Axboe

This adds polling support for sync O_DIRECT reads and writes.

Signed-off-by: Jens Axboe <axboe@fb.com>
[hch: split from a larger patch, minor updates]
Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 fs/direct-io.c | 14 ++++++++++----
 1 file changed, 10 insertions(+), 4 deletions(-)

diff --git a/fs/direct-io.c b/fs/direct-io.c
index 3ae0e0427191..7025029c666f 100644
--- a/fs/direct-io.c
+++ b/fs/direct-io.c
@@ -109,6 +109,8 @@ struct dio_submit {
 struct dio {
 	int flags;			/* doesn't change */
 	int rw;
+	blk_qc_t bio_cookie;
+	struct block_device *bio_bdev;
 	struct inode *inode;
 	loff_t i_size;			/* i_size when submitted */
 	dio_iodone_t *end_io;		/* IO completion function */
@@ -397,11 +399,14 @@ static inline void dio_bio_submit(struct dio *dio, struct dio_submit *sdio)
 	if (dio->is_async && dio->rw == READ && dio->should_dirty)
 		bio_set_pages_dirty(bio);
 
-	if (sdio->submit_io)
+	if (sdio->submit_io) {
 		sdio->submit_io(dio->rw, bio, dio->inode,
 			       sdio->logical_offset_in_bio);
-	else
-		submit_bio(dio->rw, bio);
+		dio->bio_cookie = BLK_QC_T_NONE;
+	} else {
+		dio->bio_cookie = submit_bio(dio->rw, bio);
+		dio->bio_bdev = bio->bi_bdev;
+	}
 
 	sdio->bio = NULL;
 	sdio->boundary = 0;
@@ -440,7 +445,8 @@ static struct bio *dio_await_one(struct dio *dio)
 		__set_current_state(TASK_UNINTERRUPTIBLE);
 		dio->waiter = current;
 		spin_unlock_irqrestore(&dio->bio_lock, flags);
-		io_schedule();
+		if (!blk_poll(bdev_get_queue(dio->bio_bdev), dio->bio_cookie))
+			io_schedule();
 		/* wake up sets us TASK_RUNNING */
 		spin_lock_irqsave(&dio->bio_lock, flags);
 		dio->waiter = NULL;
-- 
2.4.1.168.g1ea28e1
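
From userspace nothing changes: any sync O_DIRECT read or write now takes
this path. A minimal sketch of an IO that would be polled for once
queue/io_poll is enabled (error handling mostly omitted; device path
illustrative):

  #define _GNU_SOURCE             /* O_DIRECT */
  #include <fcntl.h>
  #include <stdlib.h>
  #include <unistd.h>

  int main(void)
  {
          void *buf;
          int fd = open("/dev/nvme0n1", O_RDONLY | O_DIRECT);

          if (fd < 0 || posix_memalign(&buf, 4096, 4096) != 0)
                  return 1;
          /* O_DIRECT wants buffer, length and offset aligned; the
           * completion is found by blk_poll() in dio_await_one() instead
           * of waiting for the IRQ to wake us. */
          pread(fd, buf, 4096, 0);
          free(buf);
          close(fd);
          return 0;
  }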



* Re: [PATCH 0/5] Initial support for polled IO
From: Keith Busch @ 2015-11-06 22:41 UTC
  To: Jens Axboe; +Cc: linux-kernel, linux-block, hch

On Fri, Nov 06, 2015 at 10:20:18AM -0700, Jens Axboe wrote:
> Hi,
> 
> This is a basic framework for supporting polled IO in the block
> layer stack, with added support for sync O_DIRECT IO and the NVMe
> driver.
> 
> There are a few things missing to truly productize this, but it's
> very useful for testing. For now, it's a per-device opt-in feature.
> To enable it, you echo 1 to /sys/block/<dev>/queue/io_poll.

This really looks good. I verified similar performance improvements on
various NVMe controller types as well.

Acked-by: Keith Busch <keith.busch@intel.com>


* RE: [PATCH 4/5] NVMe: add blk polling support
From: Elliott, Robert (Persistent Memory) @ 2015-11-06 23:46 UTC
  To: Jens Axboe, linux-kernel, linux-block; +Cc: keith.busch, hch

> -----Original Message-----
> From: linux-kernel-owner@vger.kernel.org [mailto:linux-kernel-
> owner@vger.kernel.org] On Behalf Of Jens Axboe
> Sent: Friday, November 6, 2015 11:20 AM
...
> Subject: [PATCH 4/5] NVMe: add blk polling support
> 
> Add nvme_poll(), which will check a specific completion queue for
> command completions. Wire that up to the new block layer poll
> mechanism.
> 
> Later on we'll set up specific sq/cq pairs that don't have interrupts
> enabled, so we can do more efficient polling. As of this patch, an
> IRQ will still trigger on command completion.
...
> -static int nvme_process_cq(struct nvme_queue *nvmeq)
> +static void __nvme_process_cq(struct nvme_queue *nvmeq, unsigned int
> *tag)
>  {
>  	u16 head, phase;
> 
> @@ -953,6 +953,8 @@ static int nvme_process_cq(struct nvme_queue *nvmeq)
>  			head = 0;
>  			phase = !phase;
>  		}
> +		if (tag && *tag == cqe.command_id)
> +			*tag = -1;
>  		ctx = nvme_finish_cmd(nvmeq, cqe.command_id, &fn);
>  		fn(nvmeq, ctx, &cqe);
>  	}

The NVMe completion queue entries are 16 bytes long.  Although the
device will most likely write bytes 0..15 in a single PCIe Memory Write
transaction, it is allowed to write those bytes in any order.  It could
thus update the command identifier before the other bytes, causing this
code to process stale values in the remaining fields.

When using interrupts, the MSI-X interrupt ensures the whole entry
is updated first, since it is delivered with a PCIe Memory Write
transaction and upstream writes do not pass upstream writes.

The existing interrupt handler loops looking for additional
completions, so it is susceptible to this same problem - it's just
less likely to hit it.  There, the only exposure is completions that
arrive while the CPU is already in the interrupt handler.

---
Robert Elliott, HPE Persistent Memory
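
For reference, the 16-byte CQE under discussion, laid out as in the
driver's struct nvme_completion; the hazard is dword 3 landing before
dwords 0-2:

  struct nvme_completion {
          __le32  result;         /* dword 0: command-specific result */
          __u32   rsvd;           /* dword 1: reserved */
          __le16  sq_head;        /* dword 2: how much of the SQ may be reclaimed */
          __le16  sq_id;          /* dword 2: submission queue that generated this */
          __u16   command_id;     /* dword 3: id the poll loop matches on */
          __le16  status;         /* dword 3: status field, bit 0 is the phase tag */
  };

If the device posts dword 3 first, command_id and the phase tag can
appear valid while result, sq_head and sq_id still hold stale data.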





* Re: [PATCH 4/5] NVMe: add blk polling support
From: Keith Busch @ 2015-11-07  0:58 UTC
  To: Elliott, Robert (Persistent Memory)
  Cc: Jens Axboe, linux-kernel, linux-block, hch

On Fri, Nov 06, 2015 at 03:46:07PM -0800, Elliott, Robert (Persistent Memory) wrote:
> > -----Original Message-----
> > From: linux-kernel-owner@vger.kernel.org [mailto:linux-kernel-
> > owner@vger.kernel.org] On Behalf Of Jens Axboe
> > Sent: Friday, November 6, 2015 11:20 AM
> ...
> > Subject: [PATCH 4/5] NVMe: add blk polling support
> >
> > Add nvme_poll(), which will check a specific completion queue for
> > command completions. Wire that up to the new block layer poll
> > mechanism.
> >
> > Later on we'll set up specific sq/cq pairs that don't have interrupts
> > enabled, so we can do more efficient polling. As of this patch, an
> > IRQ will still trigger on command completion.
> ...
> > -static int nvme_process_cq(struct nvme_queue *nvmeq)
> > +static void __nvme_process_cq(struct nvme_queue *nvmeq, unsigned int
> > *tag)
> >  {
> >       u16 head, phase;
> >
> > @@ -953,6 +953,8 @@ static int nvme_process_cq(struct nvme_queue *nvmeq)
> >                       head = 0;
> >                       phase = !phase;
> >               }
> > +             if (tag && *tag == cqe.command_id)
> > +                     *tag = -1;
> >               ctx = nvme_finish_cmd(nvmeq, cqe.command_id, &fn);
> >               fn(nvmeq, ctx, &cqe);
> >       }
> 
> The NVMe completion queue entries are 16 bytes long.  Although it's
> most likely to write them from 0..15 in one PCIe Memory Write
> transaction, the NVMe device could write those bytes in any order.
> It could thus update the command identifier before the other bytes,
> causing this code to process invalid stale values in the other
> fields.

That's a very interesting point. We are okay if we can rely on the phase
bit, which we can by my reading of the spec. Coalescing would not work
if the driver can observe a new phase in a partially written CQE.


* Re: [PATCH 0/5] Initial support for polled IO
From: Christoph Hellwig @ 2015-11-07  8:39 UTC
  To: Jens Axboe; +Cc: linux-kernel, linux-block, keith.busch

Hi Jens,

this looks great.  We're seeing some similarly good results.  I don't
think this is actually usable for applications as-is, but it's a) very
useful for benchmarking, and b) allows us to develop the proper
interfaces on top.  So let's ship it!

Acked-by: Christoph Hellwig <hch@lst.de>


* RE: [PATCH 4/5] NVMe: add blk polling support
From: Elliott, Robert (Persistent Memory) @ 2015-11-12  1:40 UTC
  To: Keith Busch
  Cc: Jens Axboe, linux-kernel, linux-block, hch, Olson, Nanci (HP Servers)



> -----Original Message-----
> From: Keith Busch [mailto:keith.busch@intel.com]
> Sent: Friday, November 6, 2015 6:58 PM
> Subject: Re: [PATCH 4/5] NVMe: add blk polling support
> 
> On Fri, Nov 06, 2015 at 03:46:07PM -0800, Elliott, Robert wrote:
> > > -----Original Message-----
> > > From: linux-kernel-owner@vger.kernel.org [mailto:linux-kernel-
> > > owner@vger.kernel.org] On Behalf Of Jens Axboe
> > > Sent: Friday, November 6, 2015 11:20 AM
> > ...
> > > Subject: [PATCH 4/5] NVMe: add blk polling support
> > >
> > > @@ -953,6 +953,8 @@ static int nvme_process_cq(struct nvme_queue
> *nvmeq)
> > >                       head = 0;
> > >                       phase = !phase;
> > >               }
> > > +             if (tag && *tag == cqe.command_id)
> > > +                     *tag = -1;
> > >               ctx = nvme_finish_cmd(nvmeq, cqe.command_id, &fn);
> > >               fn(nvmeq, ctx, &cqe);
> > >       }
> >
> > The NVMe completion queue entries are 16 bytes long.  Although it's
> > most likely to write them from 0..15 in one PCIe Memory Write
> > transaction, the NVMe device could write those bytes in any order.
> > It could thus update the command identifier before the other bytes,
> > causing this code to process invalid stale values in the other
> > fields.
> 
> That's a very interesting point. We are okay if we can rely on the phase
> bit, which we can by my reading of the spec. Coalescing would not work
> if the driver can observe a new phase in a partially written CQE.

The Phase bit is in the same Dword as the Command Identifier, so
it doesn't help.  If that Dword were written first, then the
preceding three Dwords might not be valid yet.

I'll ask our NVMe representative to propose wording like this:

  4.6 Completion Queue Entry
  ...
  The controller shall write the Dwords in a CQE from lowest to highest 
  (e.g., if the CQE is 16 bytes, write Dword 0, then Dword 1, then
  Dword 2, then Dword 3).  This ensures that the first three Dwords are
  valid when the Phase Tag and Command Identifier are valid. Additional
  Dwords, if any, are not valid at that time.

and refer to that rule in 7.1 where host processing of completions
is discussed.

---
Robert Elliott, HPE Persistent Memory



