* RE: Device or HBA level QD throttling creates randomness in sequential workload
@ 2016-10-24 18:54 Kashyap Desai
  2016-10-26 20:56 ` Omar Sandoval
  2016-10-31 17:24 ` Jens Axboe
  0 siblings, 2 replies; 13+ messages in thread
From: Kashyap Desai @ 2016-10-24 18:54 UTC (permalink / raw)
  To: Omar Sandoval
  Cc: linux-scsi, linux-kernel, linux-block, axboe, Christoph Hellwig,
	paolo.valente

> -----Original Message-----
> From: Omar Sandoval [mailto:osandov@osandov.com]
> Sent: Monday, October 24, 2016 9:11 PM
> To: Kashyap Desai
> Cc: linux-scsi@vger.kernel.org; linux-kernel@vger.kernel.org; linux-
> block@vger.kernel.org; axboe@kernel.dk; Christoph Hellwig;
> paolo.valente@linaro.org
> Subject: Re: Device or HBA level QD throttling creates randomness in
> sequential workload
>
> On Mon, Oct 24, 2016 at 06:35:01PM +0530, Kashyap Desai wrote:
> > >
> > > On Fri, Oct 21, 2016 at 05:43:35PM +0530, Kashyap Desai wrote:
> > > > Hi -
> > > >
> > > > I found the conversation below, and it is along the same lines as
> > > > the input I wanted from the mailing list.
> > > >
> > > > http://marc.info/?l=linux-kernel&m=147569860526197&w=2
> > > >
> > > > I can do testing on any WIP item, as Omar mentioned in the above
> > > > discussion.
> > > > https://github.com/osandov/linux/tree/blk-mq-iosched
> >
> > I tried building a kernel from this repo, but it looks like the system
> > fails to boot due to some changes in the <block> layer.
>
> Did you build the most up-to-date version of that branch? I've been
> force pushing to it, so the commit id that you built would be useful.
> What boot failure are you seeing?

Below is the latest commit on the repo.
commit b077a9a5149f17ccdaa86bc6346fa256e3c1feda
Author: Omar Sandoval <osandov@fb.com>
Date:   Tue Sep 20 11:20:03 2016 -0700

    [WIP] blk-mq: limit bio queue depth

I have the latest 4.9/scsi-next repo maintained by Martin, which boots
fine. The only delta is that "CONFIG_SBITMAP" is enabled in the WIP
blk-mq-iosched branch. I could not capture any meaningful data on the
boot hang, so I am going to try one more time tomorrow.


>
> > >
> > > Are you using blk-mq for this disk? If not, then the work there
> > > won't affect you.
> >
> > YES. I am using blk-mq for my test. I also confirm that if use_blk_mq
> > is disabled, the sequential workload issue is not seen and <cfq>
> > scheduling works well.
>
> Ah, okay, perfect. Can you send the fio job file you're using? Hard to
> tell exactly what's going on without the details. A sequential workload
> with just one submitter is about as easy as it gets, so this _should_ be
> behaving nicely.

<FIO script>

; setup numa policy for each thread
; 'numactl --show' to determine the maximum numa nodes
[global]
ioengine=libaio
buffered=0
rw=write
bssplit=4K/100
iodepth=256
numjobs=1
direct=1
runtime=60s
allow_mounted_write=0

[job1]
filename=/dev/sdd
..
[job24]
filename=/dev/sdaa
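
(For reference: saved as a job file, e.g. seq-write.fio (the file name is
just an example), I run it as below, editing iodepth in the [global]
section between runs to sweep the queue depth.)

    fio seq-write.fio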

When I set /sys/module/scsi_mod/parameters/use_blk_mq = 1, below is the
io scheduler detail (it is in blk-mq mode):
/sys/devices/pci0000:00/0000:00:02.0/0000:02:00.0/host10/target10:2:13/10:2:13:0/block/sdq/queue/scheduler:none

When I set /sys/module/scsi_mod/parameters/use_blk_mq = 0, the io
scheduler picked by the SML is <cfq>:
/sys/devices/pci0000:00/0000:00:02.0/0000:02:00.0/host10/target10:2:13/10:2:13:0/block/sdq/queue/scheduler:noop deadline [cfq]
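
(For reference, these are the commands I use to check the mode and the
scheduler; the short /sys/block path below is equivalent to the long PCI
path above, and use_blk_mq can also be selected at boot via the
scsi_mod.use_blk_mq kernel command-line parameter.)

    cat /sys/module/scsi_mod/parameters/use_blk_mq
    cat /sys/block/sdq/queue/scheduler
    # legacy (non-mq) path only; blk-mq reports "none" on this kernel:
    echo cfq > /sys/block/sdq/queue/scheduler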

I see that blk-mq performance is very low for the sequential write
workload, and I confirm that blk-mq converts the sequential workload into
a random stream, due to the io-scheduler change between blk-mq and the
legacy block layer.
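
(This is how I cross-check the dispatch pattern with blktrace: the "D"
action marks a request being issued to the driver, and the sector column
shows whether dispatches are sequential. /dev/sdd is one of the JBOD
devices from the job file above.)

    blktrace -d /dev/sdd -o - | blkparse -i - | grep ' D ' | head -n 20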

>
> > >
> > > > Is there any workaround/alternative in the latest upstream kernel,
> > > > if a user wants to see limited penalty for sequential workloads on
> > > > HDD?
> > > >
> > > > ` Kashyap
> > > >
>
> P.S., your emails are being marked as spam by Gmail. Actually, Gmail
> seems to mark just about everything I get from Broadcom as spam due to
> failed DMARC.
>
> --
> Omar

* RE: Device or HBA level QD throttling creates randomness in sequential workload
@ 2016-10-21 12:13 Kashyap Desai
  2016-10-21 21:31 ` Omar Sandoval
  0 siblings, 1 reply; 13+ messages in thread
From: Kashyap Desai @ 2016-10-21 12:13 UTC (permalink / raw)
  To: linux-scsi, linux-kernel, linux-block
  Cc: axboe, Christoph Hellwig, paolo.valente, osandov

Hi -

I found the conversation below, and it is along the same lines as the
input I wanted from the mailing list.

http://marc.info/?l=linux-kernel&m=147569860526197&w=2

I can do testing on any WIP item, as Omar mentioned in the above discussion.
https://github.com/osandov/linux/tree/blk-mq-iosched

Is there any workaround/alternative in the latest upstream kernel, if a
user wants to see limited penalty for sequential workloads on HDD?
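
(One knob I can already experiment with from userspace is the per-device
QD, via the standard SCSI sysfs attribute, e.g. for one of the JBODs:)

    echo 64 > /sys/block/sdd/device/queue_depth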

` Kashyap

> -----Original Message-----
> From: Kashyap Desai [mailto:kashyap.desai@broadcom.com]
> Sent: Thursday, October 20, 2016 3:39 PM
> To: linux-scsi@vger.kernel.org
> Subject: Device or HBA level QD throttling creates randomness in
> sequential workload
>
> [ Apologies if you find more than one instance of my email.
> My web-based email client has some issue, so I am now trying git
> send-email. ]
>
> Hi,
>
> I am doing some performance tuning in the MR driver to understand how
> sdev queue depth and hba queue depth play a role in IO submission from
> the layer above.
> I have 24 JBODs connected to an MR 12Gb/s controller, and I see the
> performance for the 4K sequential workload as below.
>
> The HBA QD for the MR controller is 4065, and the per-device QD is set
> to 32.
>
> queue depth from <fio> 256 reports 300K IOPS
> queue depth from <fio> 128 reports 330K IOPS
> queue depth from <fio> 64 reports 360K IOPS
> queue depth from <fio> 32 reports 510K IOPS
>
> In the MR driver I added a debug print and confirmed that more IO comes
> to the driver as random IO whenever I have a <fio> queue depth above 32.
>
> I have debugged using the scsi logging level and blktrace as well. Below
> is a snippet of the logs using the scsi logging level. In summary, if
> the SML does flow control of IO due to device QD or HBA QD, the IO
> coming to the LLD has a more random pattern.
>
> I see that the IO coming to the driver is not sequential.
>
> [79546.912041] sd 18:2:21:0: [sdy] tag#854 CDB: Write(10) 2a 00 00 03 c0 3b 00 00 01 00
> [79546.912049] sd 18:2:21:0: [sdy] tag#855 CDB: Write(10) 2a 00 00 03 c0 3c 00 00 01 00
> [79546.912053] sd 18:2:21:0: [sdy] tag#886 CDB: Write(10) 2a 00 00 03 c0 5b 00 00 01 00
>
> <KD> After LBA "00 03 c0 3c", the next command is for LBA "00 03 c0 5b".
> Two sequential streams are overlapped due to sdev QD throttling.
>
> [79546.912056] sd 18:2:21:0: [sdy] tag#887 CDB: Write(10) 2a 00 00 03 c0 5c 00 00 01 00
> [79546.912250] sd 18:2:21:0: [sdy] tag#856 CDB: Write(10) 2a 00 00 03 c0 3d 00 00 01 00
> [79546.912257] sd 18:2:21:0: [sdy] tag#888 CDB: Write(10) 2a 00 00 03 c0 5d 00 00 01 00
> [79546.912259] sd 18:2:21:0: [sdy] tag#857 CDB: Write(10) 2a 00 00 03 c0 3e 00 00 01 00
> [79546.912268] sd 18:2:21:0: [sdy] tag#858 CDB: Write(10) 2a 00 00 03 c0 3f 00 00 01 00
>
> If scsi_request_fn() breaks due to unavailability of the device queue
> (due to the check below), will there be any side effect like the one I
> observe?
>
>                 if (!scsi_dev_queue_ready(q, sdev))
>                         break;
>
> If I reduce the HBA QD and make sure IO from the layer above is
> throttled due to the HBA QD, there is the same impact.
> The MR driver uses a host-wide shared tag map.
>
> Can someone help me understand whether this can be made tunable in the
> LLD via additional settings, or whether it is expected behavior? The
> problem I am facing is that I am not able to figure out the optimal
> device queue depth for different configurations and workloads.
>
> Thanks, Kashyap

