From: Rafal Mielniczuk <rafal.mielniczuk@citrix.com>
To: Jens Axboe <axboe@fb.com>,
	Marcus Granado <Marcus.Granado@citrix.com>,
	"Bob Liu" <bob.liu@oracle.com>,
	Arianna Avanzini <avanzini.arianna@gmail.com>
Cc: Felipe Franciosi <felipe.franciosi@citrix.com>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"Christoph Hellwig" <hch@infradead.org>,
	David Vrabel <david.vrabel@citrix.com>,
	"xen-devel@lists.xenproject.org" <xen-devel@lists.xenproject.org>,
	"boris.ostrovsky@oracle.com" <boris.ostrovsky@oracle.com>,
	Jonathan Davies <Jonathan.Davies@citrix.com>
Subject: Re: [Xen-devel] [PATCH RFC v2 0/5] Multi-queue support for xen-blkfront and xen-blkback
Date: Mon, 10 Aug 2015 11:03:21 +0000	[thread overview]
Message-ID: <A1D98E0E70C35541AEBDE192A520C5434DABD6@AMSPEX01CL03.citrite.net> (raw)
In-Reply-To: <55935848.7080909@fb.com>

On 01/07/15 04:03, Jens Axboe wrote:
> On 06/30/2015 08:21 AM, Marcus Granado wrote:
>> Hi,
>>
>> Our measurements for the multiqueue patch indicate a clear improvement
>> in iops when more queues are used.
>>
>> The measurements were obtained under the following conditions:
>>
>> - using blkback as the dom0 backend with the multiqueue patch applied to
>> a dom0 kernel 4.0 on 8 vcpus.
>>
>> - using a recent Ubuntu 15.04 kernel 3.19 with the multiqueue frontend
>> patches applied, used as a guest on 4 vcpus.
>>
>> - using a Micron RealSSD P320h as the underlying local storage on a Dell
>> PowerEdge R720 with 2 Xeon E5-2643 v2 cpus.
>>
>> - fio 2.2.7-22-g36870 as the generator of synthetic loads in the guest.
>> We used direct_io to skip caching in the guest and ran fio for 60s
>> reading a number of block sizes ranging from 512 bytes to 4MiB. Queue
>> depth of 32 for each queue was used to saturate individual vcpus in the
>> guest.
>>
>> We were interested in observing storage iops for different values of
>> block sizes. Our expectation was that iops would improve when increasing
>> the number of queues, because both the guest and dom0 would be able to
>> make use of more vcpus to handle these requests.
>>
>> These are the results (as aggregate iops for all the fio threads) that
>> we got for the conditions above with sequential reads:
>>
>> fio_threads  io_depth  block_size   1-queue_iops  8-queue_iops
>>      8           32       512           158K         264K
>>      8           32        1K           157K         260K
>>      8           32        2K           157K         258K
>>      8           32        4K           148K         257K
>>      8           32        8K           124K         207K
>>      8           32       16K            84K         105K
>>      8           32       32K            50K          54K
>>      8           32       64K            24K          27K
>>      8           32      128K            11K          13K
>>
>> 8-queue iops was better than single queue iops for all the block sizes.
>> There were very good improvements as well for sequential writes with
>> block size 4K (from 80K iops with single queue to 230K iops with 8
>> queues), and no regressions were visible in any measurement performed.
> Great results! And I don't know why this code has lingered for so long, 
> so thanks for helping get some attention to this again.
>
> Personally I'd be really interested in the results for the same set of 
> tests, but without the blk-mq patches. Do you have them, or could you 
> potentially run them?
>
Hello,

We reran the tests for sequential reads with identical settings, but with Bob Liu's multiqueue patches reverted from the dom0 and guest kernels.
The results we obtained were *better* than the results we got with the multiqueue patches applied:

fio_threads  io_depth  block_size   1-queue_iops  8-queue_iops  *no-mq-patches_iops*
     8           32       512           158K         264K         321K
     8           32        1K           157K         260K         328K
     8           32        2K           157K         258K         336K
     8           32        4K           148K         257K         308K
     8           32        8K           124K         207K         188K
     8           32       16K            84K         105K         82K
     8           32       32K            50K          54K         36K
     8           32       64K            24K          27K         16K
     8           32      128K            11K          13K         11K
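
(For reference, each row above is a separate 60-second direct I/O
sequential read run. A rough sketch of how such a sweep can be driven
from the guest is below; the /dev/xvdb device path, the libaio
ioengine and the job names are assumptions for illustration, not
necessarily the exact invocation used on every host.)

#!/usr/bin/env python3
# Sketch: sweep sequential-read block sizes with fio inside the guest.
# Assumptions: fio is installed and /dev/xvdb is the test block device
# (the workload below is read-only).
import subprocess

DEVICE = "/dev/xvdb"   # assumed guest block device
BLOCK_SIZES = ["512", "1k", "2k", "4k", "8k", "16k", "32k", "64k", "128k"]

for bs in BLOCK_SIZES:
    # 8 jobs x iodepth 32, direct I/O, 60s time-based sequential reads,
    # matching the conditions described earlier in the thread.
    cmd = [
        "fio",
        "--name=seqread-" + bs,
        "--filename=" + DEVICE,
        "--rw=read",
        "--bs=" + bs,
        "--direct=1",
        "--ioengine=libaio",
        "--iodepth=32",
        "--numjobs=8",
        "--runtime=60",
        "--time_based",
        "--group_reporting",
    ]
    print("Running:", " ".join(cmd))
    subprocess.run(cmd, check=True)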

We noticed that requests are not merged by the guest when the multiqueue patches are applied,
which results in a regression for small block sizes (the RealSSD P320h's optimal block size is around 32-64KB).
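
One way to see this is to compare the merged-request counters in
/proc/diskstats before and after a run. A minimal sketch of that
check (the xvdb device name is an assumption):

#!/usr/bin/env python3
# Sketch: print the read counters for a device from /proc/diskstats.
# Per Documentation/iostats.txt the fields are: major minor name
# reads-completed reads-merged sectors-read ...
DEVICE = "xvdb"   # assumed guest device name

def read_counters(device):
    with open("/proc/diskstats") as f:
        for line in f:
            fields = line.split()
            if fields[2] == device:
                return int(fields[3]), int(fields[4])   # completed, merged
    raise RuntimeError("device %s not found in /proc/diskstats" % device)

reads_completed, reads_merged = read_counters(DEVICE)
print("%s: %d reads completed, %d reads merged"
      % (DEVICE, reads_completed, reads_merged))

If the reads-merged counter barely moves during a run, the requests
are being submitted to the backend unmerged.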

We observed a similar regression with a Dell MZ-5EA1000-0D3 100 GB 2.5" internal SSD.

As I understand it, the blk-mq layer bypasses the I/O scheduler, which also effectively disables merges.
Could you explain why it is difficult to enable merging in the blk-mq layer?
That could help close the performance gap we observed.
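
In case it helps rule out a configuration issue on our side, below is
a small sketch for dumping the merge-related queue attributes from
sysfs in the guest (the xvdb device name is again an assumption; with
blk-mq in these kernels the scheduler attribute typically just
reports "none"):

#!/usr/bin/env python3
# Sketch: dump merge-related queue attributes of a block device via sysfs.
import os

DEVICE = "xvdb"   # assumed guest device name
QUEUE_DIR = "/sys/block/%s/queue" % DEVICE

for attr in ("scheduler", "nomerges", "nr_requests"):
    path = os.path.join(QUEUE_DIR, attr)
    try:
        with open(path) as f:
            print("%s: %s" % (attr, f.read().strip()))
    except OSError as err:
        print("%s: unavailable (%s)" % (attr, err))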

Otherwise, the tests show that the multiqueue patches do not improve performance,
at least for sequential read/write operations.

Rafal



