From: Maxim Patlasov <mpatlasov@parallels.com>
To: Ming Lei <ming.lei@canonical.com>
Cc: Jens Axboe <axboe@kernel.dk>,
	Christoph Hellwig <hch@infradead.org>,
	"Linux Kernel Mailing List" <linux-kernel@vger.kernel.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	Dave Kleikamp <dave.kleikamp@oracle.com>,
	"Zach Brown" <zab@zabbo.net>, Benjamin LaHaise <bcrl@kvack.org>,
	Kent Overstreet <kmo@daterainc.com>, AIO <linux-aio@kvack.org>,
	Linux FS Devel <linux-fsdevel@vger.kernel.org>,
	Dave Chinner <david@fromorbit.com>
Subject: Re: [PATCH v1 5/9] block: loop: convert to blk-mq
Date: Fri, 29 Aug 2014 15:14:45 +0400
Message-ID: <540060A5.1030502@parallels.com>
In-Reply-To: <CACVXFVOx6n88kFhCk60YOfVa22qw1B3Gs0+Mj_ng2T5zy1Wb3A@mail.gmail.com>

On 08/28/2014 06:06 AM, Ming Lei wrote:
> On 8/28/14, Maxim Patlasov <mpatlasov@parallels.com> wrote:
>> On 08/21/2014 09:44 AM, Ming Lei wrote:
>>> On Wed, Aug 20, 2014 at 4:50 AM, Jens Axboe <axboe@kernel.dk> wrote:
>>>
>>>> Reworked a bit more:
>>>>
>>>> http://git.kernel.dk/?p=linux-block.git;a=commit;h=a323185a761b9a54dc340d383695b4205ea258b6
>>> One big problem of the commit is that it is basically a serialized
>>> workqueue
>>> because of single &hctx->run_work, and per-req work_struct has to be
>>> used for concurrent implementation.  So looks the approach isn't flexible
>>> enough compared with doing that in driver, or any idea about how to fix
>>> that?
>>>
>> I'm interested what's the price of handling requests in a separate
>> thread at large. I used the following fio script:
>>
>>       [global]
>>       direct=1
>>       bsrange=512-512
>>       timeout=10
>>       numjobs=1
>>       ioengine=sync
>>
>>       filename=/dev/loop0 # or /dev/nullb0
>>
>>       [f1]
>>       rw=randwrite
>>
>> to compare the performance of:
>>
>> 1) /dev/loop0 of 3.17.0-rc1 with Ming's patches applied -- 11K iops
> If you enable BLK_MQ_F_WQ_CONTEXT, it isn't strange to see this
> result since blk-mq implements a serialized workqueue.

BLK_MQ_F_WQ_CONTEXT is not in 3.17.0-rc1, so I couldn't enable it.

>
>> 2) the same as above, but call loop_queue_work() directly from
>> loop_queue_rq() -- 270K iops
>> 3) /dev/nullb0 of 3.17.0-rc1 -- 380K iops
> In my recent investigation and discussion with Jens, using workqueue
> may introduce some regression for cases like loop over null_blk, tmpfs.
>
> And 270K vs. 380K is a bit similar with my result, and it was observed that
> context switch is increased by more than 50% with introducing workqueue.

The figures are similar, but the comparison is not. Both 270K and 380K
refer to configurations where no extra context switch is involved.
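
As a rough way to read these figures: assuming the single synchronous
fio job (numjobs=1, ioengine=sync) keeps only one request in flight, the
average time per request is about 1/iops. A minimal C sketch of that
arithmetic follows; the helper names are made up for illustration and
are not part of fio or the kernel:

```c
#include <stdio.h>

/*
 * Assumption: one synchronous job means at most one request in flight,
 * so the mean time per request is approximately 1/iops.
 */
double usec_per_request(double iops)
{
    return 1e6 / iops;
}

/* Extra time per request of a slower setup compared to a faster one. */
double extra_usec(double slow_iops, double fast_iops)
{
    return usec_per_request(slow_iops) - usec_per_request(fast_iops);
}

/* Print the per-request cost for one measured configuration. */
void report(const char *name, double iops)
{
    printf("%-28s %8.0f iops  %6.2f us/request\n",
           name, iops, usec_per_request(iops));
}
```

Under that assumption the three results correspond to roughly 90.9 us
(11K), 3.7 us (270K) and 2.6 us (380K) per request, so nearly all of the
per-request time in case 1 would come from the hand-off to a separate
thread rather than from the loop driver itself.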

>
> I will post V3 which will use previous kthread, with blk-mq & kernel aio, which
> should make full use of blk-mq and kernel aio, and won't introduce regression
> for cases like above.

That would be great!

>
>> Taking into account so big difference (11K vs. 270K), would it be worthy
>> to implement pure non-blocking version of aio_kernel_submit() returning
>> error if blocking needed? Then loop driver (or any other in-kernel user)
> The kernel aio submit is very similar with user space's implementation,
> except for block plug&unplug usage in user space aio submit path.
>
> If it is blocked in aio_kernel_submit(), you should observe similar thing
> with io_submit() too.

Yes, I agree. My point was that there is room for optimization, as my
experiments demonstrate. The question is whether it is worth
complicating kernel aio (and fs-specific code too) for the sake of that
optimization.

In fact, in a simple case like a block fs on top of a loopback device on
top of a file on another block fs, what kernel aio does for the loopback
driver is a subtle way of converting incoming bio-s to outgoing bio-s. If
you know where the image file is placed (e.g. via fiemap), such a
conversion may be done with zero overhead, and anything that makes the
overhead noticeable is suspicious. And it is easy to imagine other
use-cases where that extra context switch is avoidable.
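
To illustrate the kind of conversion meant here, a minimal userspace
sketch in C. It assumes the extents of the image file have already been
collected (for example with the FIEMAP ioctl) and that the loop device
has no offset into the file. struct img_extent and img_remap() are
hypothetical names, not kernel or loop driver APIs; the three fields
mirror fe_logical, fe_physical and fe_length of struct fiemap_extent.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified extent record (see struct fiemap_extent). */
struct img_extent {
    uint64_t logical;   /* byte offset within the image file */
    uint64_t physical;  /* byte offset on the underlying block device */
    uint64_t length;    /* extent length in bytes */
};

/*
 * Translate a byte offset on the loop device into a byte offset on the
 * backing device. On success returns 0, stores the translated offset in
 * *phys and the number of bytes that stay contiguous from there in
 * *contig. Returns -1 if the offset is in a hole or not mapped at all,
 * in which case the caller would have to take the regular path.
 */
int img_remap(const struct img_extent *ext, size_t n, uint64_t off,
              uint64_t *phys, uint64_t *contig)
{
    for (size_t i = 0; i < n; i++) {
        if (off >= ext[i].logical &&
            off - ext[i].logical < ext[i].length) {
            uint64_t delta = off - ext[i].logical;

            *phys = ext[i].physical + delta;
            *contig = ext[i].length - delta;
            return 0;
        }
    }
    return -1;
}
```

The lookup is a plain linear scan to keep the sketch short; it does not
address how a real implementation would keep the mapping valid while the
image file changes underneath it.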

Thanks,
Maxim
