* [PATCH v1 0/9] block & aio: kernel aio and loop mq conversion
From: Ming Lei @ 2014-08-14 15:50 UTC
  To: Jens Axboe, linux-kernel, Andrew Morton, Dave Kleikamp
  Cc: Zach Brown, Benjamin LaHaise, Christoph Hellwig, Kent Overstreet,
	linux-aio, linux-fsdevel, Dave Chinner

Hi,

The first two patches introduce kernel AIO support. Most of the code
is borrowed from Dave's work last year, and thanks to ITER_BVEC
it is now much simpler to implement kernel AIO. The AIO model
is well suited to in-kernel users and can
help improve both throughput and CPU utilization. Many
kernel components should benefit from it, such as:
	- the loop driver,
	- all kinds of kernel I/O target drivers (SCSI, USB storage
	or UAS, ...)
	- kernel socket users, which might benefit from it too once
	socket AIO is mature

The following six patches convert the current loop driver to blk-mq:
	- loop's scalability improves considerably
	- the loop driver becomes much simpler, so the conversion can
	be thought of as a cleanup too

The 9th patch uses kernel AIO with O_DIRECT to improve loop's
performance in the single-job case, and also avoids the double
caching issue in the loop driver.

With this change, the loop block device's performance is doubled in my
fio test (randread, single job, libaio). If more fio jobs are used,
the throughput improves much more because of blk-mq.

Given that loop is used quite widely, especially in VM environments,
and that the change is quite small, I hope it can be merged eventually.

V1:
	- improve failure path in aio_kernel_submit()


 block/blk-mq.c            |   11 +-
 block/blk-mq.h            |    1 -
 drivers/block/loop.c      |  474 +++++++++++++++++++++++++--------------------
 drivers/block/loop.h      |   15 +-
 fs/aio.c                  |  121 ++++++++++++
 fs/direct-io.c            |    9 +-
 include/linux/aio.h       |   15 +-
 include/linux/blk-mq.h    |   13 ++
 include/uapi/linux/loop.h |    1 +
 9 files changed, 439 insertions(+), 221 deletions(-)


Thanks,
--
Ming Lei



* Re: [PATCH v1 5/9] block: loop: convert to blk-mq
From: Maxim Patlasov @ 2014-08-29 10:41 UTC
  To: Zach Brown
  Cc: Ming Lei, Benjamin LaHaise, axboe, Christoph Hellwig,
	linux-kernel, Andrew Morton, Dave Kleikamp, Kent Overstreet, AIO,
	linux-fsdevel, Dave Chinner

On 8/28/14, Zach Brown <zab@zabbo.net> wrote:

> On Wed, Aug 27, 2014 at 09:19:36PM +0400, Maxim Patlasov wrote:
>> On 08/27/2014 08:29 PM, Benjamin LaHaise wrote:
>>> On Wed, Aug 27, 2014 at 08:08:59PM +0400, Maxim Patlasov wrote:
>>> ...
>>>> 1) /dev/loop0 of 3.17.0-rc1 with Ming's patches applied -- 11K iops
>>>> 2) the same as above, but call loop_queue_work() directly from
>>>> loop_queue_rq() -- 270K iops
>>>> 3) /dev/nullb0 of 3.17.0-rc1 -- 380K iops
>>>>
>>>> Taking into account such a big difference (11K vs. 270K), would it be
>>>> worthwhile to implement a pure non-blocking version of
>>>> aio_kernel_submit() that returns an error if blocking is needed? Then
>>>> the loop driver (or any other in-kernel user) might first try that
>>>> non-blocking submit as a fast path and, only if it fails, fall back
>>>> to queueing.
>>> What filesystem is the backing file for loop0 on?  O_DIRECT access as
>>> Ming's patches use should be non-blocking, and if not, that's something
>>> to fix.
>> I used loop0 directly on top of null_blk driver (because my goal was to
>> measure the overhead of processing requests in a separate thread).
> The relative overhead while doing nothing else.  While zooming way down
> into micro benchmarks is fun and all, testing on an fs on brd might be
> more representative and so more compelling.
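
The fast-path idea quoted above (try a non-blocking submit first, and queue the request only if that fails) can be sketched as a small userspace model. This is only an illustration, not kernel code: `submit_nonblocking()`, `WouldBlock`, `queue_rq()` and `run_worker()` are hypothetical names, and the `can_complete_inline` callback stands in for whatever condition would make a real submit block.

```python
from collections import deque


class WouldBlock(Exception):
    """Raised by the hypothetical non-blocking submit when it would have to sleep."""


def submit_nonblocking(request, can_complete_inline):
    # Stand-in for a non-blocking submit variant (an assumption, not an existing API):
    # it refuses to proceed instead of blocking.
    if not can_complete_inline(request):
        raise WouldBlock(request)
    return f"done:{request}"


def queue_rq(request, can_complete_inline, deferred):
    """Try the fast path first; hand the request to a worker only on WouldBlock."""
    try:
        return "inline", submit_nonblocking(request, can_complete_inline)
    except WouldBlock:
        deferred.append(request)  # slow path: a worker picks this up later
        return "queued", None


def run_worker(deferred):
    # The worker runs in a context that is allowed to block,
    # so it completes everything that was deferred.
    results = []
    while deferred:
        results.append(f"done:{deferred.popleft()}")
    return results
```

In this model only the requests that cannot complete inline pay for the hand-off to the worker, which is the property the proposal is after.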

The measurements on an fs on brd are even more outrageous (using the
same fio script I posted a few messages above):

1) Baseline, no loopback device involved:

fio on /dev/ram0:                           467K iops
fio on ext4 over /dev/ram0:                 378K iops

2) Loopback device from 3.17.0-rc1 with Ming's patches (v1) applied:

fio on /dev/loop0 over /dev/ram0:            10K iops
fio on ext4 over /dev/loop0 over /dev/ram0:   9K iops

3) The same as above, but avoiding the extra context switch (calling
loop_queue_work() directly from loop_queue_rq()):

fio on /dev/loop0 over /dev/ram0:           267K iops
fio on ext4 over /dev/loop0 over /dev/ram0: 223K iops

The problem is not the huge relative overhead while doing nothing
else. Rather, it is the extra latency introduced (~100 microseconds on
the commodity h/w I used), which might be noticeable on modern SSDs (and
h/w RAIDs with caching).
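
As a rough cross-check of the numbers above, the following back-of-envelope calculation estimates the throughput if each request pays an extra ~100 microseconds. It assumes that effectively one request is in flight at a time, which is an assumption about the test setup and not something stated in this message; the function name is illustrative.

```python
def iops_serialized(per_request_seconds):
    """IOPS when only one request is in flight at a time (assumed here)."""
    return 1.0 / per_request_seconds


# Figures taken from the measurements above.
direct_iops = 267_000    # loop0 over ram0, loop_queue_work() called directly
extra_latency = 100e-6   # ~100 microseconds added by the extra context switch

base_service_time = 1.0 / direct_iops          # seconds per request without the hop
with_hop = base_service_time + extra_latency   # seconds per request with the hop

print(f"service time without hop: {base_service_time * 1e6:.2f} us")
print(f"estimated iops with hop:  {iops_serialized(with_hop):,.0f}")
```

Under that assumption the estimate comes out a little under 10K iops, which is in line with the 10K measured in case 2.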

Thanks,
Maxim

Thread overview: 75+ messages
2014-08-14 15:50 [PATCH v1 0/9] block & aio: kernel aio and loop mq conversion Ming Lei
2014-08-14 15:50 ` Ming Lei
2014-08-14 15:50 ` [PATCH v1 1/9] aio: add aio_kernel_() interface Ming Lei
2014-08-14 15:50   ` Ming Lei
2014-08-14 18:07   ` Zach Brown
2014-08-14 18:07     ` Zach Brown
2014-08-15 13:20     ` Ming Lei
2014-08-15 13:20       ` Ming Lei
2014-08-14 15:50 ` [PATCH v1 2/9] fd/direct-io: introduce should_dirty for kernel aio Ming Lei
2014-08-14 15:50   ` Ming Lei
2014-08-14 15:50 ` [PATCH v1 3/9] blk-mq: export blk_mq_freeze_queue and blk_mq_unfreeze_queue Ming Lei
2014-08-14 15:50   ` Ming Lei
2014-08-14 15:50 ` [PATCH v1 4/9] blk-mq: introduce init_flush_rq_fn callback in 'blk_mq_ops' Ming Lei
2014-08-14 15:50   ` Ming Lei
2014-08-15 16:19   ` Jens Axboe
2014-08-15 16:19     ` Jens Axboe
2014-08-16  7:49     ` Ming Lei
2014-08-16  7:49       ` Ming Lei
2014-08-17 18:39       ` Jens Axboe
2014-08-17 18:39         ` Jens Axboe
2014-08-14 15:50 ` [PATCH v1 5/9] block: loop: convert to blk-mq Ming Lei
2014-08-14 15:50   ` Ming Lei
2014-08-15 16:31   ` Christoph Hellwig
2014-08-15 16:31     ` Christoph Hellwig
2014-08-15 16:36     ` Jens Axboe
2014-08-15 16:46       ` Jens Axboe
2014-08-16  8:06         ` Ming Lei
2014-08-16  8:06           ` Ming Lei
2014-08-17 17:48           ` Jens Axboe
2014-08-17 17:48             ` Jens Axboe
2014-08-18  1:22             ` Ming Lei
2014-08-18  1:22               ` Ming Lei
2014-08-18 11:53               ` Ming Lei
2014-08-18 11:53                 ` Ming Lei
2014-08-19 20:50                 ` Jens Axboe
2014-08-20  1:23                   ` Ming Lei
2014-08-20 16:09                     ` Jens Axboe
2014-08-21  2:54                       ` Ming Lei
2014-08-21  2:58                         ` Jens Axboe
2014-08-21  3:13                           ` Ming Lei
2014-08-21  3:15                             ` Ming Lei
2014-08-21  3:16                             ` Jens Axboe
2014-08-21  3:34                           ` Ming Lei
2014-08-21  5:44                   ` Ming Lei
2014-08-27 16:08                     ` Maxim Patlasov
2014-08-27 16:08                       ` Maxim Patlasov
2014-08-27 16:29                       ` Benjamin LaHaise
2014-08-27 16:29                         ` Benjamin LaHaise
2014-08-27 17:19                         ` Maxim Patlasov
2014-08-27 17:19                           ` Maxim Patlasov
2014-08-27 17:56                           ` Zach Brown
2014-08-27 17:56                             ` Zach Brown
2014-08-28  2:10                             ` Ming Lei
2014-08-28  2:10                               ` Ming Lei
2014-08-28  2:06                       ` Ming Lei
2014-08-28  2:06                         ` Ming Lei
2014-08-29 11:14                         ` Maxim Patlasov
2014-08-29 11:14                           ` Maxim Patlasov
2014-08-14 15:50 ` [PATCH v1 6/9] block: loop: say goodby to bio Ming Lei
2014-08-14 15:50   ` Ming Lei
2014-08-14 15:50 ` [PATCH v1 7/9] block: loop: introduce lo_discard() and lo_req_flush() Ming Lei
2014-08-14 15:50   ` Ming Lei
2014-08-14 15:50 ` [PATCH v1 8/9] block: loop: don't handle REQ_FUA explicitly Ming Lei
2014-08-14 15:50   ` Ming Lei
2014-08-14 15:50 ` [PATCH v1 9/9] block: loop: support to submit I/O via kernel aio based Ming Lei
2014-08-14 15:50   ` Ming Lei
2014-08-14 16:53 ` [PATCH v1 0/9] block & aio: kernel aio and loop mq conversion Jens Axboe
2014-08-14 16:53   ` Jens Axboe
2014-08-15 12:59   ` Ming Lei
2014-08-15 12:59     ` Ming Lei
2014-08-15 13:11     ` Christoph Hellwig
2014-08-15 13:11       ` Christoph Hellwig
2014-08-15 14:32       ` Ming Lei
2014-08-15 14:32         ` Ming Lei
2014-08-29 10:41 [PATCH v1 5/9] block: loop: convert to blk-mq Maxim Patlasov
