linux-block.vger.kernel.org archive mirror
* confusion about nr of pending I/O requests
@ 2018-12-18 12:45 Paolo Valente
  2018-12-18 12:49 ` Paolo Valente
  2018-12-18 18:50 ` Jens Axboe
  0 siblings, 2 replies; 8+ messages in thread
From: Paolo Valente @ 2018-12-18 12:45 UTC (permalink / raw)
  To: Jens Axboe, linux-block; +Cc: Linus Walleij, Mark Brown, Ulf Hansson

Hi Jens,
sorry for the following silly question, but maybe you can quickly
resolve a doubt that would otherwise take me much more time to
investigate.

While doing some tests with scsi_debug, I've just seen that, at least
with direct I/O, the maximum number of pending I/O requests (at least
in the I/O schedulers) unexpectedly equals the queue depth of the
drive, and not
/sys/block/<dev>/queue/nr_requests

For example, after:

sudo modprobe scsi_debug max_queue=4

and with fio executed as follows:

job: (g=0): rw=read, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=20
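
(For reference, the job line above corresponds roughly to the following
invocation; the target device is illustrative:)

fio --name=job --filename=/dev/sdd --direct=1 --rw=read --bs=4k \
    --ioengine=libaio --iodepth=20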

I get this periodic trace, where four insertions are followed by four
completions, and so on, until the end of the I/O.  This trace was taken
with none, but the result is the same with bfq.

             fio-20275 [001] d...  7560.655213:   8,48   I   R 281088 + 8 [fio]
             fio-20275 [001] d...  7560.655288:   8,48   I   R 281096 + 8 [fio]
             fio-20275 [001] d...  7560.655311:   8,48   I   R 281104 + 8 [fio]
             fio-20275 [001] d...  7560.655331:   8,48   I   R 281112 + 8 [fio]
          <idle>-0     [001] d.h.  7560.749868:   8,48   C   R 281088 + 8 [0]
          <idle>-0     [001] dNh.  7560.749912:   8,48   C   R 281096 + 8 [0]
          <idle>-0     [001] dNh.  7560.749928:   8,48   C   R 281104 + 8 [0]
          <idle>-0     [001] dNh.  7560.749934:   8,48   C   R 281112 + 8 [0]
             fio-20275 [001] d...  7560.750023:   8,48   I   R 281120 + 8 [fio]
             fio-20275 [001] d...  7560.750196:   8,48   I   R 281128 + 8 [fio]
             fio-20275 [001] d...  7560.750229:   8,48   I   R 281136 + 8 [fio]
             fio-20275 [001] d...  7560.750250:   8,48   I   R 281144 + 8 [fio]
          <idle>-0     [001] d.h.  7560.842510:   8,48   C   R 281120 + 8 [0]
          <idle>-0     [001] dNh.  7560.842551:   8,48   C   R 281128 + 8 [0]
          <idle>-0     [001] dNh.  7560.842556:   8,48   C   R 281136 + 8 [0]
          <idle>-0     [001] dNh.  7560.842562:   8,48   C   R 281144 + 8 [0]
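
(For reference, a trace like this can be captured with the ftrace blk
tracer and then filtered on the action column, roughly as follows; the
device name is illustrative:)

echo blk > /sys/kernel/debug/tracing/current_tracer
echo 1 > /sys/block/sdd/trace/enable
grep -E '\s[IDC]\s+[RW]' /sys/kernel/debug/tracing/trace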

Shouldn't the total number of pending requests reach
/sys/block/<dev>/queue/nr_requests?

The latter is of course equal to 8.
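
(Both values can be read directly from sysfs; sdd again stands for the
scsi_debug disk:)

cat /sys/block/sdd/device/queue_depth    # 4
cat /sys/block/sdd/queue/nr_requests     # 8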

Thanks,
Paolo



* Re: confusion about nr of pending I/O requests
  2018-12-18 12:45 confusion about nr of pending I/O requests Paolo Valente
@ 2018-12-18 12:49 ` Paolo Valente
  2018-12-18 18:50 ` Jens Axboe
  1 sibling, 0 replies; 8+ messages in thread
From: Paolo Valente @ 2018-12-18 12:49 UTC (permalink / raw)
  To: Jens Axboe, linux-block; +Cc: Linus Walleij, Mark Brown, Ulf Hansson



> On 18 Dec 2018, at 13:45, Paolo Valente <paolo.valente@linaro.org> wrote:
> 
> [...]
> 
> Shouldn't the total number of pending requests reach
> /sys/block/<dev>/queue/nr_requests?
> 
> The latter is of course equal to 8.
> 

I forgot to mention: I did this on a 4.18 kernel.  Maybe this is
something that has been changed/fixed since?

Thanks,
Paolo




* Re: confusion about nr of pending I/O requests
  2018-12-18 12:45 confusion about nr of pending I/O requests Paolo Valente
  2018-12-18 12:49 ` Paolo Valente
@ 2018-12-18 18:50 ` Jens Axboe
  2018-12-18 23:35   ` Paolo Valente
  2018-12-19  3:45   ` Ming Lei
  1 sibling, 2 replies; 8+ messages in thread
From: Jens Axboe @ 2018-12-18 18:50 UTC (permalink / raw)
  To: Paolo Valente, linux-block; +Cc: Linus Walleij, Mark Brown, Ulf Hansson

On 12/18/18 5:45 AM, Paolo Valente wrote:
> [...]
> 
> Shouldn't the total number of pending requests reach
> /sys/block/<dev>/queue/nr_requests?
> 
> The latter is of course equal to 8.

With a scheduler, the depth is what the scheduler provides. You cannot
exceed the hardware queue depth in any situation. You just have 8
requests available for scheduling, with a max of 4 being inflight on
the device side.

If both were 4, for instance, then you would have nothing to schedule
with, as all of them could reside on the hardware side. That's why
the scheduler defaults to twice the hardware queue depth.
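
(For reference, this default comes from blk_mq_init_sched(); roughly,
as the snippet appears in block/blk-mq-sched.c of that era:)

        /*
         * Default to double of smaller one between hw queue_depth and 128,
         * since we don't split into sync/async like the old code did.
         * Additionally, this is a per-hw queue depth.
         */
        q->nr_requests = 2 * min_t(unsigned int, q->tag_set->queue_depth,
                                   BLKDEV_MAX_RQ);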

-- 
Jens Axboe



* Re: confusion about nr of pending I/O requests
  2018-12-18 18:50 ` Jens Axboe
@ 2018-12-18 23:35   ` Paolo Valente
  2018-12-19  3:45   ` Ming Lei
  1 sibling, 0 replies; 8+ messages in thread
From: Paolo Valente @ 2018-12-18 23:35 UTC (permalink / raw)
  To: Jens Axboe; +Cc: linux-block, Linus Walleij, Mark Brown, Ulf Hansson



> On 18 Dec 2018, at 19:50, Jens Axboe <axboe@kernel.dk> wrote:
> 
> On 12/18/18 5:45 AM, Paolo Valente wrote:
>> [...]
> 
> With a scheduler, the depth is what the scheduler provides. You cannot
> exceed the hardware queue depth in any situation. You just have 8
> requests available for scheduling, with a max of 4 being inflight on
> the device side.
> 
> If both were 4, for instance, then you would have nothing to schedule
> with, as all of them could reside on the hardware side. That's why
> the scheduler defaults to twice the hardware queue depth.
> 

That's exactly what I expected, thanks.  But it is not what happened.
Let me also add dispatch lines to my filtered-trace snippet.  The
repetitive pattern becomes:

             fio-5180  [001] d...   786.931956:   8,48   I  WS 333824 + 1024 [fio]
             fio-5180  [001] d...   786.932010:   8,48   D  WS 333824 + 1024 [fio]
             fio-5180  [001] d...   786.932137:   8,48   I  WS 334848 + 1024 [fio]
             fio-5180  [001] d...   786.932160:   8,48   D  WS 334848 + 1024 [fio]
             fio-5180  [001] d...   786.932318:   8,48   I  WS 335872 + 1024 [fio]
             fio-5180  [001] d...   786.932354:   8,48   D  WS 335872 + 1024 [fio]
             fio-5180  [001] d...   786.932467:   8,48   I  WS 336896 + 1024 [fio]
             fio-5180  [001] d...   786.932489:   8,48   D  WS 336896 + 1024 [fio]
          <idle>-0     [001] d.h.   787.023945:   8,48   C  WS 333824 + 1024 [0]
          <idle>-0     [001] d.h.   787.023978:   8,48   C  WS 334848 + 1024 [0]
          <idle>-0     [001] d.h.   787.024080:   8,48   C  WS 335872 + 1024 [0]
          <idle>-0     [001] d.h.   787.024237:   8,48   C  WS 336896 + 1024 [0]

So, after the four dispatches, there are 0 requests in the scheduler,
and 4 requests inflight.  But *no* new request is inserted into the
scheduler before *all* four inflight requests are completed.

So the total number of available requests seems to be 4, not 8.

Am I missing something?

Thanks,
Paolo




* Re: confusion about nr of pending I/O requests
  2018-12-18 18:50 ` Jens Axboe
  2018-12-18 23:35   ` Paolo Valente
@ 2018-12-19  3:45   ` Ming Lei
  2018-12-19  6:17     ` Paolo Valente
  1 sibling, 1 reply; 8+ messages in thread
From: Ming Lei @ 2018-12-19  3:45 UTC (permalink / raw)
  To: Jens Axboe
  Cc: Paolo Valente, linux-block, Linus Walleij, Mark Brown, Ulf Hansson

On Wed, Dec 19, 2018 at 2:52 AM Jens Axboe <axboe@kernel.dk> wrote:
>
> On 12/18/18 5:45 AM, Paolo Valente wrote:
> > [...]
>
> With a scheduler, the depth is what the scheduler provides. You cannot
> exceed the hardware queue depth in any situation. You just have 8
> requests available for scheduling, with a max of 4 being inflight on
> the device side.
>
> If both were 4, for instance, then you would have nothing to schedule
> with, as all of them could reside on the hardware side. That's why
> the scheduler defaults to twice the hardware queue depth.

The default of twice the hw queue depth might not be reasonable for
multiple LUNs.

Maybe it should be set to twice sdev->queue_depth for SCSI, or to
hw queue depth / hctx->nr_active.  But either way may become
complicated, because both can be adjusted at runtime.
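
(A rough sketch of the two alternatives, just to illustrate -- this is
hypothetical code, not anything in the tree; hw_depth and nr_active
stand for the host-wide hw queue depth and the number of active queues:)

        /* Per-LUN variant: scale from the LUN's own budget. */
        q->nr_requests = 2 * sdev->queue_depth;

        /* Fair-share variant: split the host-wide depth evenly. */
        q->nr_requests = 2 * (hw_depth / max_t(unsigned int, nr_active, 1U));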

Thanks,
Ming Lei


* Re: confusion about nr of pending I/O requests
  2018-12-19  3:45   ` Ming Lei
@ 2018-12-19  6:17     ` Paolo Valente
  2018-12-19 10:32       ` Ming Lei
  0 siblings, 1 reply; 8+ messages in thread
From: Paolo Valente @ 2018-12-19  6:17 UTC (permalink / raw)
  To: Ming Lei; +Cc: Jens Axboe, linux-block, Linus Walleij, Mark Brown, Ulf Hansson



> On 19 Dec 2018, at 04:45, Ming Lei <tom.leiming@gmail.com> wrote:
> 
> On Wed, Dec 19, 2018 at 2:52 AM Jens Axboe <axboe@kernel.dk> wrote:
>> 
>> [...]
>> 
>> If both were 4, for instance, then you would have nothing to schedule
>> with, as all of them could reside on the hardware side. That's why
>> the scheduler defaults to twice the hardware queue depth.
> 
> The default of twice the hw queue depth might not be reasonable for
> multiple LUNs.
> 
> Maybe it should be set to twice sdev->queue_depth for SCSI, or to
> hw queue depth / hctx->nr_active.  But either way may become
> complicated, because both can be adjusted at runtime.
> 

Could you please explain why it is not working (if it is not working)
in my example, where there should be only one LUN?

Thanks,
Paolo

> Thanks,
> Ming Lei



* Re: confusion about nr of pending I/O requests
  2018-12-19  6:17     ` Paolo Valente
@ 2018-12-19 10:32       ` Ming Lei
  2018-12-19 11:45         ` Paolo Valente
  0 siblings, 1 reply; 8+ messages in thread
From: Ming Lei @ 2018-12-19 10:32 UTC (permalink / raw)
  To: Paolo Valente
  Cc: Jens Axboe, linux-block, Linus Walleij, Mark Brown, Ulf Hansson

On Wed, Dec 19, 2018 at 2:18 PM Paolo Valente <paolo.valente@linaro.org> wrote:
> [...]
> 
> Could you please explain why it is not working (if it is not working)
> in my example, where there should be only one LUN?

I didn't say it isn't working; I meant that it isn't perfect.

The hardware queue depth is host-wide, which means it is shared by all
LUNs.  Of course, lots of LUNs may be attached to one single HBA.  You
can easily set this up via 'modprobe scsi_debug max_luns=16 max_queue=4';
all 16 LUNs then share the 4 tags.
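
(One way to observe the sharing -- host and device names depend on the
setup, so the paths below are illustrative:)

modprobe scsi_debug max_luns=16 max_queue=4
cat /sys/class/scsi_host/host*/can_queue    # the scsi_debug host shows 4
cat /sys/block/sd*/device/queue_depth       # per-LUN queue depths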

Thanks,
Ming Lei


* Re: confusion about nr of pending I/O requests
  2018-12-19 10:32       ` Ming Lei
@ 2018-12-19 11:45         ` Paolo Valente
  0 siblings, 0 replies; 8+ messages in thread
From: Paolo Valente @ 2018-12-19 11:45 UTC (permalink / raw)
  To: Ming Lei; +Cc: Jens Axboe, linux-block, Linus Walleij, Mark Brown, Ulf Hansson



> On 19 Dec 2018, at 11:32, Ming Lei <tom.leiming@gmail.com> wrote:
> 
> [...]
> 
> I didn't say it isn't working; I meant that it isn't perfect.
> 
> The hardware queue depth is host-wide, which means it is shared by all
> LUNs.  Of course, lots of LUNs may be attached to one single HBA.  You
> can easily set this up via 'modprobe scsi_debug max_luns=16 max_queue=4';
> all 16 LUNs then share the 4 tags.
> 

Ok, so you are talking about the opposite problem, in a sense.  What
I'm saying here is that the tags should be 8, as Jens pointed out, but
they are 4.

Thanks,
Paolo

> Thanks,
> Ming Lei


