* Adjusting minimum packet size or "wait to merge requests" in SRP
From: Chris Worley @ 2009-10-28 18:47 UTC
  To: linux-rdma-u79uwXL29TY76Z2rM5mHXA, scst-devel

It appears that SRP tries to coalesce and fragment initiator I/O
requests into 64KB packets, as that looks to be the size requested
to/from the device on the target side (and the I/O scheduler is
disabled on the target).

Is there a way to control this, where no coalescing occurs when
latency is an issue and requests are small, and no fragmentation
occurs when requests are large?

Or, am I totally wrong in my assumption that SRP is coalescing/fragmenting data?

Thanks,

Chris
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


* Re: Adjusting minimum packet size or "wait to merge requests" in SRP
From: Bart Van Assche @ 2009-10-28 19:14 UTC
  To: Chris Worley; +Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, scst-devel

On Wed, Oct 28, 2009 at 7:47 PM, Chris Worley <worleys-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote:
> It appears that SRP tries to coalesce and fragment initiator I/O
> requests into 64KB packets, as that looks to be the size requested
> to/from the device on the target side (and the I/O scheduler is
> disabled on the target).
>
> Is there a way to control this, where no coalescing occurs when
> latency is an issue and requests are small, and no fragmentation
> occurs when requests are large?
>
> Or, am I totally wrong in my assumption that SRP is coalescing/fragmenting data?

Regarding avoiding coalescing of I/O requests: which I/O scheduler is
being used on the initiator system and how has it been configured via
sysfs?

Adjusting the constant MAX_RDMA_SIZE in scst/srpt/src/ib_srpt.h might
help to avoid fragmentation of large requests by the SRP protocol.
Please post a follow-up message to the mailing list with your findings
so that MAX_RDMA_SIZE can be converted from a compile-time constant
to a sysfs variable if this would be useful.

Bart.

* Re: Adjusting minimum packet size or "wait to merge requests" in SRP
From: Chris Worley @ 2009-10-28 19:38 UTC
  To: Bart Van Assche; +Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, scst-devel

On Wed, Oct 28, 2009 at 1:14 PM, Bart Van Assche
<bart.vanassche-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote:
> On Wed, Oct 28, 2009 at 7:47 PM, Chris Worley <worleys-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote:
>> It appears that SRP tries to coalesce and fragment initiator I/O
>> requests into 64KB packets, as that looks to be the size requested
>> to/from the device on the target side (and the I/O scheduler is
>> disabled on the target).
>>
>> Is there a way to control this, where no coalescing occurs when
>> latency is an issue and requests are small, and no fragmentation
>> occurs when requests are large?
>>
>> Or, am I totally wrong in my assumption that SRP is coalescing/fragmenting data?
>
> Regarding avoiding coalescing of I/O requests: which I/O scheduler is
> being used on the initiator system and how has it been configured via
> sysfs?

There is no scheduler running on either target or initiator on the
drives in question (sorry I worded that incorrectly initially), or so
I've been told (this information is second-hand).  I did see iostat
output from the initiator in his case, where there were long waits and
service times that I'm guessing were due to some coalescing/merging.
There was also a hint in the iostat output that a scheduler was
enabled, as there were non-zero values (occasionally) under the
[rw]qm/s columns, which, if I understand iostat correctly, means there
is a scheduler merging results.

So you're saying there is no hold-off for merging on the initiator
side of the IB/SRP stack?
>
> Adjusting the constant MAX_RDMA_SIZE in scst/srpt/src/ib_srpt.h might
> help to avoid fragmentation of large requests by the SRP protocol.
> Please post a follow-up message to the mailing list with your findings
> so that MAX_RDMA_SIZE can be converted from a compile-time constant
> to a sysfs variable if this would be useful.

Will do.

Thanks,

Chris
>
> Bart.
>

* Re: Adjusting minimum packet size or "wait to merge requests" in SRP
From: Roland Dreier @ 2009-10-28 19:51 UTC
  To: Chris Worley; +Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, scst-devel


 > It appears that SRP tries to coalesce and fragment initiator I/O
 > requests into 64KB packets, as that looks to be the size requested
 > to/from the device on the target side (and the I/O scheduler is
 > disabled on the target).

There is no code in the SRP initiator that does anything to change IO
requests that I know of.  So I think this is happening somewhere higher
in the stack.

 - R.

* Re: Adjusting minimum packet size or "wait to merge requests" in SRP
From: David Dillow @ 2009-10-28 19:58 UTC
  To: Chris Worley
  Cc: Bart Van Assche, linux-rdma-u79uwXL29TY76Z2rM5mHXA, scst-devel

On Wed, 2009-10-28 at 13:38 -0600, Chris Worley wrote:
> There is no scheduler running on either target or initiator on the
> drives in question (sorry I worded that incorrectly initially), or so
> I've been told (this information is second-hand). 

So, noop scheduler, then?

Under noop, the block layer will send requests as soon as it can without
merging. If it has more requests outstanding than the queue length on
the SRP initiator, then it will merge the new request with the queued
ones if possible.
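
The behaviour described above can be pictured with a toy model. This is
only a sketch, not the kernel's implementation: the class name, the
back-merge-only rule, and the sector units are assumptions made for
illustration, and the cap that max_sectors_kb puts on a merged request
is ignored.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Request:
    start: int    # first sector of the request
    sectors: int  # length in sectors

    @property
    def end(self) -> int:
        return self.start + self.sectors


class ToyNoopQueue:
    """Hypothetical model: dispatch at once, merge only when the queue is full."""

    def __init__(self, queue_depth: int) -> None:
        self.queue_depth = queue_depth
        self.in_flight: List[Request] = []  # already handed to the initiator
        self.waiting: List[Request] = []    # held back because no slot is free

    def submit(self, req: Request) -> str:
        # A slot is free: send immediately, with no merging and no hold-off.
        if len(self.in_flight) < self.queue_depth:
            self.in_flight.append(req)
            return "dispatched"
        # No slot is free: try to append to a waiting request that ends
        # exactly where the new one starts (a back merge).
        for queued in self.waiting:
            if queued.end == req.start:
                queued.sectors += req.sectors
                return "merged"
        self.waiting.append(req)
        return "queued"
```

With a queue depth of 1, the first request is dispatched, the second
waits, and a third that is contiguous with the second is merged into it.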

>  I did see iostat
> output from the initiator in his case, where there were long waits and
> service times that I'm guessing were due to some coalescing/merging.
> There was also a hint in the iostat output that a scheduler was
> enabled, as there were non-zero values (occasionally) under the
> [rw]qm/s columns, which, if I understand iostat correctly, means there
> is a scheduler merging results.
> 
> So you're saying there is no hold-off for merging on the initiator
> side of the IB/SRP stack?

The SRP initiator just hands off requests as quick as they are sent to
it by the block layer. You can control how big those requests are by
tuning /sys/block/$DEV/queue/max_sectors_kb up to .../max_hw_sectors_kb
which gets set by the max_sect parameter when adding the SRP target.

You can get some hold-off potentially by using a non-noop scheduler for
the block device, see /sys/block/$DEV/queue/scheduler. 'as' or
'deadline' may fit your bill, but they have a habit of breaking up
requests into smaller chunks.
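
A small sketch for checking these settings on a given device. The
function name and the sysfs_root parameter are made up for illustration
(the parameter only exists so the code can be pointed at a test
directory); the three files read are the ones named above.

```python
from pathlib import Path
from typing import Dict, Union


def read_queue_settings(dev: str, sysfs_root: str = "/sys") -> Dict[str, Union[int, str]]:
    """Read the request-size limits and the active scheduler for one block device."""
    queue = Path(sysfs_root) / "block" / dev / "queue"
    scheduler_line = (queue / "scheduler").read_text().strip()
    # The scheduler file lists the available elevators and marks the
    # active one with square brackets, e.g. "noop deadline [cfq]".
    # If nothing is bracketed, fall back to the whole line.
    active = next(
        (word.strip("[]") for word in scheduler_line.split() if word.startswith("[")),
        scheduler_line,
    )
    return {
        "max_sectors_kb": int((queue / "max_sectors_kb").read_text()),
        "max_hw_sectors_kb": int((queue / "max_hw_sectors_kb").read_text()),
        "scheduler": active,
    }
```

For example, read_queue_settings("sdb") would report the limits for
/dev/sdb, where "sdb" is just a placeholder device name.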

Also, you want 'options ib_srp srp_sg_tablesize=255'
in /etc/modprobe.conf, as by default it only allows 12 scatter/gather
entries, which will only guarantee a 48KB request size. Using 255
guarantees you can send a 1020KB request. Of course, if the pages
coalesce in the request, you can send much larger requests before
running out of S/G entries. max_sectors_kb will limit what gets sent in
either case.
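
The arithmetic behind those figures, as a sketch. It assumes 4 KB pages
and the worst case of one non-contiguous page per scatter/gather entry;
the function names are illustrative only.

```python
PAGE_KB = 4  # assumption: 4 KB pages


def guaranteed_request_kb(sg_tablesize: int, page_kb: int = PAGE_KB) -> int:
    """Request size that always fits, with one page per S/G entry."""
    return sg_tablesize * page_kb


def worst_case_request_kb(sg_tablesize: int, max_sectors_kb: int,
                          page_kb: int = PAGE_KB) -> int:
    """Guaranteed size after max_sectors_kb has been applied as a cap."""
    return min(guaranteed_request_kb(sg_tablesize, page_kb), max_sectors_kb)


print(guaranteed_request_kb(12))        # 48
print(guaranteed_request_kb(255))       # 1020
print(worst_case_request_kb(255, 512))  # 512
```

Requests whose pages happen to be physically contiguous can exceed the
guaranteed figure, but never max_sectors_kb.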

-- 
Dave Dillow
National Center for Computational Science
Oak Ridge National Laboratory
(865) 241-6602 office



* Re: Adjusting minimum packet size or "wait to merge requests" in SRP
From: Chris Worley @ 2009-10-28 20:25 UTC
  To: David Dillow
  Cc: Bart Van Assche, linux-rdma-u79uwXL29TY76Z2rM5mHXA, scst-devel

On Wed, Oct 28, 2009 at 1:58 PM, David Dillow <dillowda-1Heg1YXhbW8@public.gmane.org> wrote:
> On Wed, 2009-10-28 at 13:38 -0600, Chris Worley wrote:
>> There is no scheduler running on either target or initiator on the
>> drives in question (sorry I worded that incorrectly initially), or so
>> I've been told (this information is second-hand).
>
> So, noop scheduler, then?

Yes, "elevator=noop" on both sides.  Again, sorry to be unclear about that.

>
> Under noop, the block layer will send requests as soon as it can without
> merging. If it has more requests outstanding than the queue length on
> the SRP initiator, then it will merge the new request with the queued
> ones if possible.

So, noop will merge requests when the queue is full, but not hold-off to merge?

<snip>
>
> The SRP initiator just hands off requests as quick as they are sent to
> it by the block layer. You can control how big those requests are by
> tuning /sys/block/$DEV/queue/max_sectors_kb up to .../max_hw_sectors_kb
> which gets set by the max_sect parameter when adding the SRP target.

So the block layer may also hold-off on small requests, and decreasing
max_sectors_kb will force it to flush to the SRP initiator ASAP (or is
this just used for fragmentation of large requests)?

Note that I'm trying to minimize latency for very small requests.

Thanks,

Chris

* Re: Adjusting minimum packet size or "wait to merge requests" in SRP
From: David Dillow @ 2009-10-28 21:05 UTC
  To: Chris Worley
  Cc: Bart Van Assche, linux-rdma-u79uwXL29TY76Z2rM5mHXA, scst-devel

On Wed, 2009-10-28 at 16:25 -0400, Chris Worley wrote:
> On Wed, Oct 28, 2009 at 1:58 PM, David Dillow <dillowda-1Heg1YXhbW8@public.gmane.org> wrote:
> > Under noop, the block layer will send requests as soon as it can without
> > merging. If it has more requests outstanding than the queue length on
> > the SRP initiator, then it will merge the new request with the queued
> > ones if possible.
> 
> So, noop will merge requests when the queue is full, but not hold-off
> to merge?

Correct.

> > The SRP initiator just hands off requests as quick as they are sent to
> > it by the block layer. You can control how big those requests are by
> > tuning /sys/block/$DEV/queue/max_sectors_kb up to .../max_hw_sectors_kb
> > which gets set by the max_sect parameter when adding the SRP target.
> 
> So the block layer may also hold-off on small requests, and decreasing
> max_sectors_kb will force it to flush to the SRP initiator ASAP (or is
> this just used for fragmentation of large requests)?

It is just used for breaking up large requests. The deadline, as, and
cfq schedulers may have some hold-off -- I've not checked -- but noop
does not.

You can check the length of the queue by looking
at /sys/class/scsi_disk/$TARGET/device/queue_depth.

That may well be 63, which is the maximum queue depth for the SRP
initiator unless you patch the source. Keep in mind that those 63
requests are shared across all LUNs on that connection, so you may queue
up before that, if you are driving many LUNs.
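
To make the sharing concrete, a sketch that treats the 63 slots as one
pool for the whole connection. The function name and the LUN labels are
hypothetical; 63 is the figure quoted above for an unpatched initiator.

```python
from typing import Dict

SRP_QUEUE_DEPTH = 63  # figure quoted above for an unpatched initiator


def free_slots(outstanding_per_lun: Dict[str, int],
               depth: int = SRP_QUEUE_DEPTH) -> int:
    """Slots left on the connection; 0 means new requests have to wait."""
    return max(depth - sum(outstanding_per_lun.values()), 0)


print(free_slots({"lun0": 10}))                # 53
print(free_slots({"lun0": 32, "lun1": 31}))    # 0
```

In the second call neither LUN is anywhere near 63 outstanding requests
on its own, yet the connection has no slots left.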

> Note that I'm trying to minimize latency for very small requests.

Reads or writes?
Are you doing direct IO or plain read/write?
File system or block device access?
Are you using the SCSI devices (/dev/sda etc) or DM multipath
(/dev/mpath/*)?

The SRP initiator is playing the cards it has been dealt, but you could
be getting coalescing from the rest of the system -- for example, I have
no idea if the SRP target code will do read ahead and turn a 4KB request
into a 64KB one -- I suspect it is possible. You can also turn on SCSI
logging to see what is being handed to the initiator, to be sure on
which side of the connection this is occurring.
-- 
Dave Dillow
National Center for Computational Science
Oak Ridge National Laboratory
(865) 241-6602 office



* Re: [Scst-devel] Adjusting minimum packet size or "wait to merge requests" in SRP
From: Vladislav Bolkhovitin @ 2009-10-29 18:30 UTC
  To: Chris Worley; +Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, scst-devel

Chris Worley, on 10/28/2009 09:47 PM wrote:
> It appears that SRP tries to coalesce and fragment initiator I/O
> requests into 64KB packets, as that looks to be the size requested
> to/from the device on the target side (and the I/O scheduler is
> disabled on the target).
> 
> Is there a way to control this, where no coalescing occurs when
> latency is an issue and requests are small, and no fragmentation
> occurs when requests are large?
> 
> Or, am I totally wrong in my assumption that SRP is coalescing/fragmenting data?

You can see the size of the requests you are receiving on the target
side at any time, either by enabling "scsi" logging (hopefully you know
how to do that) or by looking in /proc/scsi_tgt/sgv. The latter file
shows general statistics for power-of-2 allocations, e.g. a request for
10K will increment the 16K row.
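
The rounding implied by that example can be sketched as follows. The
function name is made up, and the actual row layout of
/proc/scsi_tgt/sgv (including its smallest and largest rows) is not
shown in this thread, so treat this only as an illustration of the
power-of-2 bucketing.

```python
def sgv_row_kb(request_kb: int) -> int:
    """Smallest power of two (in KB) that is >= the request size."""
    row = 1
    while row < request_kb:
        row *= 2
    return row


print(sgv_row_kb(10))  # 16
print(sgv_row_kb(64))  # 64
print(sgv_row_kb(65))  # 128
```

So a target that only ever shows hits in the 64K row is seeing requests
larger than 32K and no larger than 64K.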

Vlad
