From: "Elliott, Robert (Server Storage)" <Elliott@hp.com>
To: Jens Axboe <axboe@kernel.dk>,
	Bart Van Assche <bvanassche@acm.org>,
	Christoph Hellwig <hch@lst.de>,
	James Bottomley <James.Bottomley@HansenPartnership.com>,
	"scameron@beardog.cce.hp.com" <scameron@beardog.cce.hp.com>
Cc: Bart Van Assche <bvanassche@fusionio.com>,
	"linux-scsi@vger.kernel.org" <linux-scsi@vger.kernel.org>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>
Subject: RE: scsi-mq
Date: Thu, 19 Jun 2014 00:58:02 +0000
Message-ID: <94D0CD8314A33A4D9D801C0FE68B402958B3D123@G9W0745.americas.hpqcorp.net>
In-Reply-To: <53A10B3A.6050705@kernel.dk>



> -----Original Message-----
> From: Jens Axboe [mailto:axboe@kernel.dk]
> Sent: Tuesday, 17 June, 2014 10:45 PM
> To: Bart Van Assche; Christoph Hellwig; James Bottomley
> Cc: Bart Van Assche; Elliott, Robert (Server Storage);
> linux-scsi@vger.kernel.org; linux-kernel@vger.kernel.org
> Subject: Re: scsi-mq
> 
> On 2014-06-17 07:27, Bart Van Assche wrote:
> > On 06/12/14 15:48, Christoph Hellwig wrote:
> >> Bart and Robert have helped with some very detailed measurements that they
> >> might be able to send in reply to this, although these usually involve
> >> significantly reworked low level drivers to avoid other bottle necks.
> >
> > In case someone would like to see the results of the measurements I ran,
> > these results can be found here:
> > https://docs.google.com/file/d/0B1YQOreL3_FxUXFMSjhmNDBNNTg.
> >
> > Two important conclusions from the data in that PDF document are as
> > follows:
> > - A small but significant performance improvement for the traditional
> >    SCSI mid-layer (use_blk_mq=N).
> > - A very significant performance improvement for multithreaded
> >    workloads with use_blk_mq=Y. As an example, the number of I/O
> >    operations per second reported for the random write test increased
> >    by 170%. That means 2.7 times the performance
> >    of use_blk_mq=N.
> 
> Thanks for posting these numbers, Bart. The CPU utilization and IOPS
> send a very clear message. The only mystery is why the single-threaded
> performance is down. That we need to get sorted, but it's not a show
> stopper for inclusion.
> 
> If you run the single threaded tests and watch for queue depths, is
> there a difference between blk-mq=y/scsi-mq and the stock kernel?
> 
> > I think this means the scsi-mq patches are ready for wider use.
> 
> I would agree. James, I haven't seen any comments from you on this yet.
> I've run various bits of scsi-mq testing as well, and no ill effects
> seen. On top of that, Christoph's patches are nicely separated and have
> general benefits even for the non-blk-mq cases. Time to shove them into
> the queue for the next merge window?
> 
> --
> Jens Axboe

We've been testing the hpsa driver extensively with the scsi-mq-wip trees.
I don't have numbers with the latest scsi-mq tree yet, but here are some
performance numbers from scsi-mq-wip.5 through 7.  

scsi-mq slightly underperformed non-scsi-mq when using multiple devices:
* normal		975K IOPS (16 devices each made from 1 drive)
* scsi-mq-wip.5	905K IOPS (16 devices each made from 1 drive)
* scsi-mq-wip.6+	969K IOPS (16 devices... 3 threads per device)

but was much better when using a single device:
* normal		166K IOPS (1 device made from 8 drives, 1 thread)          
* normal		266K IOPS (1 device made from 8 drives, 12 threads)
* scsi-mq-wip.5	880K IOPS (1 device made from 8 drives, 12 threads)

* normal		266K IOPS (1 device made from 16 drives, 12 threads)
* scsi-mq-wip.5	973K IOPS (1 device made from 16 drives, 12 threads)
* scsi-mq-wip.6+	979K IOPS (1 device made from 16 drives, 12 threads)


The headline improvement is that one device can reach the same performance 
as multiple devices - no more bottleneck in per-device queue locks limiting 
performance to around 266K IOPS per device.  Even the scsi_debug driver in
fake_rw mode hits that limit.
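
As a quick sanity check on the ratios implied by the numbers above, here
is a small Python sketch. The IOPS figures are the ones reported in this
message; the dictionary and function names are illustrative only and do
not come from any tool used in the testing.

```python
# IOPS figures (in thousands) copied from the results in this message.
# Names below are illustrative, not part of any benchmark tooling.
SINGLE_DEVICE_16_DRIVES = {
    "normal": 266,
    "scsi-mq-wip.5": 973,
    "scsi-mq-wip.6+": 979,
}
MULTI_DEVICE_16 = {
    "normal": 975,
    "scsi-mq-wip.5": 905,
    "scsi-mq-wip.6+": 969,
}


def ratio_to_baseline(results, baseline="normal"):
    """Return each kernel's IOPS divided by the baseline kernel's IOPS."""
    base = results[baseline]
    return {name: iops / base
            for name, iops in results.items() if name != baseline}


if __name__ == "__main__":
    cases = (
        ("1 device from 16 drives, 12 threads", SINGLE_DEVICE_16_DRIVES),
        ("16 devices", MULTI_DEVICE_16),
    )
    for label, results in cases:
        for name, ratio in ratio_to_baseline(results).items():
            print(f"{label}: {name} = {ratio:.2f}x normal")
```

For the single 16-drive device this works out to roughly 3.7x the
normal kernel; for 16 separate devices the scsi-mq trees land at about
0.93x (wip.5) and 0.99x (wip.6+) of normal.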

hpsa is limited to one submission queue, so submissions from multiple CPUs 
still meet inside the driver - SCSI Express will keep them isolated all 
the way.  hpsa supports one completion queue per CPU, so completions are 
already isolated.

The blk-mq bitmap tag allocator is working much better than its 
predecessor, but some combinations of active CPUs and devices still 
result in low queue depths for some devices.

We haven't fully tested cases where the hardware interrupt is handled
on a different CPU than the one where the block layer wants to run its
completion processing per rq_affinity. Completion processing was
previously scheduled as a softirq, but is now handled directly in
hardirq processing with IPIs.  This changes the CPU utilization %soft
and %hard metrics:
* normal 	5% hard, 25% soft
* scsi-mq	30% hard, 0% soft
(with something like 5% usr, 55% sys, 8% iowait, 2% idle)
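
For anyone wanting to reproduce the hard/soft split without a
monitoring tool, the following Python sketch derives the two
percentages from a pair of /proc/stat samples. It is not how the
figures above were collected (the message does not say which tool was
used); it assumes the aggregate "cpu" line carries at least the first
eight counters in the order documented in proc(5). The function names
and the sample counter values are made up for illustration.

```python
# Sketch: compute %hard (irq) and %soft (softirq) between two /proc/stat
# samples. Assumed field order of the aggregate "cpu" line, per proc(5):
#   user nice system idle iowait irq softirq steal [guest guest_nice]
# Guest time is already included in user time, so it is not summed again.
FIELDS = ("user", "nice", "system", "idle", "iowait", "irq", "softirq", "steal")


def parse_cpu_line(stat_text):
    """Return the aggregate cpu counters from /proc/stat text as a dict."""
    for line in stat_text.splitlines():
        parts = line.split()
        if parts and parts[0] == "cpu":
            values = [int(v) for v in parts[1:1 + len(FIELDS)]]
            return dict(zip(FIELDS, values))
    raise ValueError("no aggregate 'cpu' line found")


def irq_percentages(before_text, after_text):
    """Return (hard_pct, soft_pct) of total CPU time between two samples."""
    before = parse_cpu_line(before_text)
    after = parse_cpu_line(after_text)
    delta = {key: after[key] - before[key] for key in FIELDS}
    total = sum(delta.values())
    if total <= 0:
        raise ValueError("samples must be taken at different times")
    return (100.0 * delta["irq"] / total,
            100.0 * delta["softirq"] / total)


if __name__ == "__main__":
    # Made-up samples; on a live system, read /proc/stat twice instead.
    first = "cpu  100 0 200 1000 50 10 20 0 0 0\n"
    second = "cpu  105 0 255 1002 58 40 20 0 0 0\n"
    hard, soft = irq_percentages(first, second)
    print(f"hard {hard:.1f}%  soft {soft:.1f}%")
```

The made-up samples were chosen so the deltas mirror the scsi-mq row
above (30% hard, 0% soft).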


Configuration:
* HP ProLiant DL380p Gen8 with 6 hyperthreaded CPU cores (12 logical CPUs)
* lockless hpsa driver (forthcoming patches with performance 
  improvements such as eliminating locks, plus improved error handling)
* Smart Array P431 RAID controller
* 16 12 Gb/s SAS SSDs
* fio: 4 KiB random reads with options:
  direct=1, ioengine=libaio, norandommap, randrepeat=0,
  iodepth=96 or 1024, numjobs=1 or 12, thread, 
  cpus_allowed=0-11, cpus_allowed_policy=split,
  iodepth_batch=4, iodepth_batch_complete=4, userspace_reap,
  bs=4096, rw=randread,
  time_based, group_reporting, gtod_reduce
* block layer queue parameters:
  nr_requests=1011, add_random=0
  nomerges=2, rq_affinity=2, max_sectors_kb=max_hw_sectors_kb
* irqbalance 1.0.4, an old version that still honors
  /proc/irq/NN/affinity_hint (newer versions ignore it by default)
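
To make the fio settings above easier to reuse, here is a Python sketch
that renders them as a job file. Several things in it are assumptions
and not part of the original test setup: the job section name, the
`filename` and `runtime` arguments (the message does not say which
device node or run length was used), and writing gtod_reduce as
`gtod_reduce=1`. A runtime is included because time_based only has an
effect together with one.

```python
# Sketch: render the fio options listed in this message as a job file.
# `filename` and `runtime` are hypothetical placeholders; the message does
# not state the device node or the run length used in the tests.
def fio_job(filename, runtime, iodepth, numjobs):
    """Return fio job-file text for the 4 KiB random read test."""
    options = [
        "direct=1",
        "ioengine=libaio",
        "norandommap",
        "randrepeat=0",
        f"iodepth={iodepth}",        # 96 or 1024 in the reported tests
        f"numjobs={numjobs}",        # 1 or 12 in the reported tests
        "thread",
        "cpus_allowed=0-11",
        "cpus_allowed_policy=split",
        "iodepth_batch=4",
        "iodepth_batch_complete=4",
        "userspace_reap",
        "bs=4096",
        "rw=randread",
        "time_based",
        f"runtime={runtime}",        # assumption: not given in the message
        "group_reporting",
        "gtod_reduce=1",
        f"filename={filename}",      # assumption: not given in the message
    ]
    # The section name is arbitrary and chosen here for illustration.
    return "[randread-4k]\n" + "\n".join(options) + "\n"


if __name__ == "__main__":
    print(fio_job("/dev/sdX", runtime=60, iodepth=96, numjobs=12), end="")
```

The output can be saved to a file and passed to fio; the block layer
queue parameters listed above still have to be set separately under
/sys/block/<dev>/queue/.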

---
Rob Elliott    HP Server Storage



