LKML Archive on lore.kernel.org
 help / color / Atom feed
* Re: Slow I/O on USB media after commit f664a3cc17b7d0a2bc3b3ab96181e1029b0ec0e6
       [not found] <0cd6ac36b7ab644576fc0f3f5bd4a880c33855d1.camel@unipv.it>
@ 2019-11-05 18:31 ` Alan Stern
  2019-11-05 23:29   ` Jens Axboe
  0 siblings, 1 reply; 15+ messages in thread
From: Alan Stern @ 2019-11-05 18:31 UTC (permalink / raw)
  To: Andrea Vai
  Cc: Jens Axboe, Damien Le Moal, Johannes Thumshirn, USB list,
	SCSI development list, Himanshu Madhani, Hannes Reinecke,
	Ming Lei, Omar Sandoval, Martin K. Petersen, Greg KH,
	Hans Holmberg, Kernel development list

On Tue, 5 Nov 2019, Andrea Vai wrote:

> Il giorno lun, 04/11/2019 alle 13.20 -0500, Alan Stern ha scritto:

> > You should be able to do something like this:
> > 
> >         cd linux
> >         patch -p1 </path/to/patch2
> > 
> > and that should work with no errors.  You don't need to use git to 
> > apply a patch.
> > 
> > In case that patch2 file was mangled somewhere along the way, I
> > have 
> > attached a copy to this message.
> 
> Ok, so the "patch" command worked, the kernel compiled and ran, but
> the test still failed (273, 108, 104, 260, 177, 236, 179, 1123, 289,
> 873 seconds to copy a 500MB file, vs. ~30 seconds with the "good"
> kernel).
> 
> Let me know what else could I do,

I'm out of suggestions.  If anyone else knows how to make a kernel with 
no legacy queuing support -- only multiqueue -- issue I/O requests 
sequentially, please speak up.

In the absence of any responses, after a week or so I will submit a
patch to revert the f664a3cc17b7 ("scsi: kill off the legacy IO path")  
commit.

Alan Stern


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: Slow I/O on USB media after commit f664a3cc17b7d0a2bc3b3ab96181e1029b0ec0e6
  2019-11-05 18:31 ` Slow I/O on USB media after commit f664a3cc17b7d0a2bc3b3ab96181e1029b0ec0e6 Alan Stern
@ 2019-11-05 23:29   ` Jens Axboe
  2019-11-06 16:03     ` Alan Stern
  0 siblings, 1 reply; 15+ messages in thread
From: Jens Axboe @ 2019-11-05 23:29 UTC (permalink / raw)
  To: Alan Stern, Andrea Vai
  Cc: Damien Le Moal, Johannes Thumshirn, USB list,
	SCSI development list, Himanshu Madhani, Hannes Reinecke,
	Ming Lei, Omar Sandoval, Martin K. Petersen, Greg KH,
	Hans Holmberg, Kernel development list

On 11/5/19 11:31 AM, Alan Stern wrote:
> On Tue, 5 Nov 2019, Andrea Vai wrote:
> 
>> Il giorno lun, 04/11/2019 alle 13.20 -0500, Alan Stern ha scritto:
> 
>>> You should be able to do something like this:
>>>
>>>          cd linux
>>>          patch -p1 </path/to/patch2
>>>
>>> and that should work with no errors.  You don't need to use git to
>>> apply a patch.
>>>
>>> In case that patch2 file was mangled somewhere along the way, I
>>> have
>>> attached a copy to this message.
>>
>> Ok, so the "patch" command worked, the kernel compiled and ran, but
>> the test still failed (273, 108, 104, 260, 177, 236, 179, 1123, 289,
>> 873 seconds to copy a 500MB file, vs. ~30 seconds with the "good"
>> kernel).
>>
>> Let me know what else could I do,
> 
> I'm out of suggestions.  If anyone else knows how to make a kernel with
> no legacy queuing support -- only multiqueue -- issue I/O requests
> sequentially, please speak up.

Do we know for a fact that the device needs strictly serialized requests
to not stall? And writes in particular? I won't comment on how broken
that is, just trying to establish this as the problem that's making this
particular device be slow?

I've lost track of this thread, but has mq-deadline been tried as the
IO scheduler? We do have support for strictly serialized (writes)
since that's required for zoned device, wouldn't be hard at all to make
this cover a blacklisted device like this one.

> In the absence of any responses, after a week or so I will submit a
> patch to revert the f664a3cc17b7 ("scsi: kill off the legacy IO path")
> commit.

That's not going to be feasible.

-- 
Jens Axboe


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: Slow I/O on USB media after commit f664a3cc17b7d0a2bc3b3ab96181e1029b0ec0e6
  2019-11-05 23:29   ` Jens Axboe
@ 2019-11-06 16:03     ` Alan Stern
  2019-11-06 22:13       ` Damien Le Moal
  0 siblings, 1 reply; 15+ messages in thread
From: Alan Stern @ 2019-11-06 16:03 UTC (permalink / raw)
  To: Jens Axboe
  Cc: Andrea Vai, Damien Le Moal, Johannes Thumshirn, USB list,
	SCSI development list, Himanshu Madhani, Hannes Reinecke,
	Ming Lei, Omar Sandoval, Martin K. Petersen, Greg KH,
	Hans Holmberg, Kernel development list

On Tue, 5 Nov 2019, Jens Axboe wrote:

> On 11/5/19 11:31 AM, Alan Stern wrote:
> > On Tue, 5 Nov 2019, Andrea Vai wrote:
> > 
> >> Il giorno lun, 04/11/2019 alle 13.20 -0500, Alan Stern ha scritto:
> > 
> >>> You should be able to do something like this:
> >>>
> >>>          cd linux
> >>>          patch -p1 </path/to/patch2
> >>>
> >>> and that should work with no errors.  You don't need to use git to
> >>> apply a patch.
> >>>
> >>> In case that patch2 file was mangled somewhere along the way, I
> >>> have
> >>> attached a copy to this message.
> >>
> >> Ok, so the "patch" command worked, the kernel compiled and ran, but
> >> the test still failed (273, 108, 104, 260, 177, 236, 179, 1123, 289,
> >> 873 seconds to copy a 500MB file, vs. ~30 seconds with the "good"
> >> kernel).
> >>
> >> Let me know what else could I do,
> > 
> > I'm out of suggestions.  If anyone else knows how to make a kernel with
> > no legacy queuing support -- only multiqueue -- issue I/O requests
> > sequentially, please speak up.
> 
> Do we know for a fact that the device needs strictly serialized requests
> to not stall?

Not exactly, but that is far and away the most likely explanation for
the device's behavior.  We tried making a bunch of changes, some of
which helped a little bit, but all of them left a very large
performance gap.  I/O monitoring showed that the only noticeable
difference in the kernel-device interaction caused by the $SUBJECT
commit was the non-sequential access pattern.

> And writes in particular?

Andrea has tested only the write behavior.  Possibly reading will be
affected too, but my guess is that it won't be.

> I won't comment on how broken
> that is, just trying to establish this as the problem that's making this
> particular device be slow?

It seems reasonable that the access pattern could make a significant
difference.  The device's behavior suggests that it buffers incoming
data and pauses from time to time to write the accumulated data into
non-volatile storage.  If its algorithm for allocating, erasing, and
writing data blocks is optimized for the sequential case, you can
easily imagine that non-sequential accesses would cause it to pause
more often and for longer times -- which is exactly what we observed.
These extra pauses are what resulted in the overall performance 
decrease.

So far we have had no way to perform a direct test.  That is, we don't
know of any setting that would change a single kernel between
sequential and non-sequential access.  If you can suggest a simple way
to force a kernel without the $SUBJECT commit to do non-sequential
writes, I'm sure Andrea will be happy to try it out and see if it
causes a slowdown.

> I've lost track of this thread, but has mq-deadline been tried as the
> IO scheduler? We do have support for strictly serialized (writes)
> since that's required for zoned device, wouldn't be hard at all to make
> this cover a blacklisted device like this one.

Please spell out the exact procedure in detail so that Andrea can try 
it.  He's not a kernel hacker, and I know very little about the block 
layer.

Alan Stern


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: Slow I/O on USB media after commit f664a3cc17b7d0a2bc3b3ab96181e1029b0ec0e6
  2019-11-06 16:03     ` Alan Stern
@ 2019-11-06 22:13       ` Damien Le Moal
  2019-11-07  7:04         ` Andrea Vai
  0 siblings, 1 reply; 15+ messages in thread
From: Damien Le Moal @ 2019-11-06 22:13 UTC (permalink / raw)
  To: Alan Stern, Jens Axboe
  Cc: Andrea Vai, Johannes Thumshirn, USB list, SCSI development list,
	Himanshu Madhani, Hannes Reinecke, Ming Lei, Omar Sandoval,
	Martin K. Petersen, Greg KH, Hans Holmberg,
	Kernel development list

On 2019/11/07 1:04, Alan Stern wrote:
> On Tue, 5 Nov 2019, Jens Axboe wrote:
> 
>> On 11/5/19 11:31 AM, Alan Stern wrote:
>>> On Tue, 5 Nov 2019, Andrea Vai wrote:
>>>
>>>> Il giorno lun, 04/11/2019 alle 13.20 -0500, Alan Stern ha scritto:
>>>
>>>>> You should be able to do something like this:
>>>>>
>>>>>          cd linux
>>>>>          patch -p1 </path/to/patch2
>>>>>
>>>>> and that should work with no errors.  You don't need to use git to
>>>>> apply a patch.
>>>>>
>>>>> In case that patch2 file was mangled somewhere along the way, I
>>>>> have
>>>>> attached a copy to this message.
>>>>
>>>> Ok, so the "patch" command worked, the kernel compiled and ran, but
>>>> the test still failed (273, 108, 104, 260, 177, 236, 179, 1123, 289,
>>>> 873 seconds to copy a 500MB file, vs. ~30 seconds with the "good"
>>>> kernel).
>>>>
>>>> Let me know what else could I do,
>>>
>>> I'm out of suggestions.  If anyone else knows how to make a kernel with
>>> no legacy queuing support -- only multiqueue -- issue I/O requests
>>> sequentially, please speak up.
>>
>> Do we know for a fact that the device needs strictly serialized requests
>> to not stall?
> 
> Not exactly, but that is far and away the most likely explanation for
> the device's behavior.  We tried making a bunch of changes, some of
> which helped a little bit, but all of them left a very large
> performance gap.  I/O monitoring showed that the only noticeable
> difference in the kernel-device interaction caused by the $SUBJECT
> commit was the non-sequential access pattern.
> 
>> And writes in particular?
> 
> Andrea has tested only the write behavior.  Possibly reading will be
> affected too, but my guess is that it won't be.
> 
>> I won't comment on how broken
>> that is, just trying to establish this as the problem that's making this
>> particular device be slow?
> 
> It seems reasonable that the access pattern could make a significant
> difference.  The device's behavior suggests that it buffers incoming
> data and pauses from time to time to write the accumulated data into
> non-volatile storage.  If its algorithm for allocating, erasing, and
> writing data blocks is optimized for the sequential case, you can
> easily imagine that non-sequential accesses would cause it to pause
> more often and for longer times -- which is exactly what we observed.
> These extra pauses are what resulted in the overall performance 
> decrease.
> 
> So far we have had no way to perform a direct test.  That is, we don't
> know of any setting that would change a single kernel between
> sequential and non-sequential access.  If you can suggest a simple way
> to force a kernel without the $SUBJECT commit to do non-sequential
> writes, I'm sure Andrea will be happy to try it out and see if it
> causes a slowdown.
> 
>> I've lost track of this thread, but has mq-deadline been tried as the
>> IO scheduler? We do have support for strictly serialized (writes)
>> since that's required for zoned device, wouldn't be hard at all to make
>> this cover a blacklisted device like this one.
> 
> Please spell out the exact procedure in detail so that Andrea can try 
> it.  He's not a kernel hacker, and I know very little about the block 
> layer.

Please simply try your write tests after doing this:

echo mq-deadline > /sys/block/<name of your USB disk>/queue/scheduler

And confirm that mq-deadline is selected with:

cat /sys/block/<name of your USB disk>/queue/scheduler
[mq-deadline] kyber bfq none


> 
> Alan Stern
> 
> 


-- 
Damien Le Moal
Western Digital Research

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: Slow I/O on USB media after commit f664a3cc17b7d0a2bc3b3ab96181e1029b0ec0e6
  2019-11-06 22:13       ` Damien Le Moal
@ 2019-11-07  7:04         ` Andrea Vai
  2019-11-07  7:54           ` Damien Le Moal
  0 siblings, 1 reply; 15+ messages in thread
From: Andrea Vai @ 2019-11-07  7:04 UTC (permalink / raw)
  To: Damien Le Moal, Alan Stern, Jens Axboe
  Cc: Johannes Thumshirn, USB list, SCSI development list,
	Himanshu Madhani, Hannes Reinecke, Ming Lei, Omar Sandoval,
	Martin K. Petersen, Greg KH, Hans Holmberg,
	Kernel development list

Il giorno mer, 06/11/2019 alle 22.13 +0000, Damien Le Moal ha scritto:
> 
> 
> Please simply try your write tests after doing this:
> 
> echo mq-deadline > /sys/block/<name of your USB
> disk>/queue/scheduler
> 
> And confirm that mq-deadline is selected with:
> 
> cat /sys/block/<name of your USB disk>/queue/scheduler
> [mq-deadline] kyber bfq none

ok, which kernel should I test with this: the fresh git cloned, or the
one just patched with Alan's patch, or doesn't matter which one?

Thanks, and bye,
Andrea


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: Slow I/O on USB media after commit f664a3cc17b7d0a2bc3b3ab96181e1029b0ec0e6
  2019-11-07  7:04         ` Andrea Vai
@ 2019-11-07  7:54           ` Damien Le Moal
  2019-11-07 18:59             ` Andrea Vai
  0 siblings, 1 reply; 15+ messages in thread
From: Damien Le Moal @ 2019-11-07  7:54 UTC (permalink / raw)
  To: Andrea Vai, Alan Stern, Jens Axboe
  Cc: Johannes Thumshirn, USB list, SCSI development list,
	Himanshu Madhani, Hannes Reinecke, Ming Lei, Omar Sandoval,
	Martin K. Petersen, Greg KH, Hans Holmberg,
	Kernel development list

On 2019/11/07 16:04, Andrea Vai wrote:
> Il giorno mer, 06/11/2019 alle 22.13 +0000, Damien Le Moal ha scritto:
>>
>>
>> Please simply try your write tests after doing this:
>>
>> echo mq-deadline > /sys/block/<name of your USB
>> disk>/queue/scheduler
>>
>> And confirm that mq-deadline is selected with:
>>
>> cat /sys/block/<name of your USB disk>/queue/scheduler
>> [mq-deadline] kyber bfq none
> 
> ok, which kernel should I test with this: the fresh git cloned, or the
> one just patched with Alan's patch, or doesn't matter which one?

Probably all of them to see if there are any differences.

> 
> Thanks, and bye,
> Andrea
> 
> 


-- 
Damien Le Moal
Western Digital Research

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: Slow I/O on USB media after commit f664a3cc17b7d0a2bc3b3ab96181e1029b0ec0e6
  2019-11-07  7:54           ` Damien Le Moal
@ 2019-11-07 18:59             ` Andrea Vai
  2019-11-08  8:42               ` Damien Le Moal
  2019-11-09 22:28               ` Ming Lei
  0 siblings, 2 replies; 15+ messages in thread
From: Andrea Vai @ 2019-11-07 18:59 UTC (permalink / raw)
  To: Damien Le Moal
  Cc: Alan Stern, Jens Axboe, Johannes Thumshirn, USB list,
	SCSI development list, Himanshu Madhani, Hannes Reinecke,
	Ming Lei, Omar Sandoval, Martin K. Petersen, Greg KH,
	Hans Holmberg, Kernel development list

[Sorry for the duplicate message, it didn't reach the lists due to
html formatting]
Il giorno gio 7 nov 2019 alle ore 08:54 Damien Le Moal
<Damien.LeMoal@wdc.com> ha scritto:
>
> On 2019/11/07 16:04, Andrea Vai wrote:
> > Il giorno mer, 06/11/2019 alle 22.13 +0000, Damien Le Moal ha scritto:
> >>
> >>
> >> Please simply try your write tests after doing this:
> >>
> >> echo mq-deadline > /sys/block/<name of your USB
> >> disk>/queue/scheduler
> >>
> >> And confirm that mq-deadline is selected with:
> >>
> >> cat /sys/block/<name of your USB disk>/queue/scheduler
> >> [mq-deadline] kyber bfq none
> >
> > ok, which kernel should I test with this: the fresh git cloned, or the
> > one just patched with Alan's patch, or doesn't matter which one?
>
> Probably all of them to see if there are any differences.

with both kernels, the output of
cat /sys/block/sdh/queue/schedule

already contains [mq-deadline]: is it correct to assume that the echo
command and the subsequent testing is useless? What to do now?

Thanks, and bye
Andrea

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: Slow I/O on USB media after commit f664a3cc17b7d0a2bc3b3ab96181e1029b0ec0e6
  2019-11-07 18:59             ` Andrea Vai
@ 2019-11-08  8:42               ` Damien Le Moal
  2019-11-08 14:33                 ` Jens Axboe
  2019-11-09 10:09                 ` Ming Lei
  2019-11-09 22:28               ` Ming Lei
  1 sibling, 2 replies; 15+ messages in thread
From: Damien Le Moal @ 2019-11-08  8:42 UTC (permalink / raw)
  To: Andrea Vai
  Cc: Alan Stern, Jens Axboe, Johannes Thumshirn, USB list,
	SCSI development list, Himanshu Madhani, Hannes Reinecke,
	Ming Lei, Omar Sandoval, Martin K. Petersen, Greg KH,
	Hans Holmberg, Kernel development list

On 2019/11/08 4:00, Andrea Vai wrote:
> [Sorry for the duplicate message, it didn't reach the lists due to
> html formatting]
> Il giorno gio 7 nov 2019 alle ore 08:54 Damien Le Moal
> <Damien.LeMoal@wdc.com> ha scritto:
>>
>> On 2019/11/07 16:04, Andrea Vai wrote:
>>> Il giorno mer, 06/11/2019 alle 22.13 +0000, Damien Le Moal ha scritto:
>>>>
>>>>
>>>> Please simply try your write tests after doing this:
>>>>
>>>> echo mq-deadline > /sys/block/<name of your USB
>>>> disk>/queue/scheduler
>>>>
>>>> And confirm that mq-deadline is selected with:
>>>>
>>>> cat /sys/block/<name of your USB disk>/queue/scheduler
>>>> [mq-deadline] kyber bfq none
>>>
>>> ok, which kernel should I test with this: the fresh git cloned, or the
>>> one just patched with Alan's patch, or doesn't matter which one?
>>
>> Probably all of them to see if there are any differences.
> 
> with both kernels, the output of
> cat /sys/block/sdh/queue/schedule
> 
> already contains [mq-deadline]: is it correct to assume that the echo
> command and the subsequent testing is useless? What to do now?

Probably, yes. Have you obtained a blktrace of the workload during these
tests ? Any significant difference in the IO pattern (IO size and
randomness) and IO timing (any device idle time where the device has no
command to process) ? Asking because the problem may be above the block
layer, with the file system for instance.

> 
> Thanks, and bye
> Andrea
> 


-- 
Damien Le Moal
Western Digital Research

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: Slow I/O on USB media after commit f664a3cc17b7d0a2bc3b3ab96181e1029b0ec0e6
  2019-11-08  8:42               ` Damien Le Moal
@ 2019-11-08 14:33                 ` Jens Axboe
  2019-11-11 10:46                   ` Andrea Vai
  2019-11-09 10:09                 ` Ming Lei
  1 sibling, 1 reply; 15+ messages in thread
From: Jens Axboe @ 2019-11-08 14:33 UTC (permalink / raw)
  To: Damien Le Moal, Andrea Vai
  Cc: Alan Stern, Johannes Thumshirn, USB list, SCSI development list,
	Himanshu Madhani, Hannes Reinecke, Ming Lei, Omar Sandoval,
	Martin K. Petersen, Greg KH, Hans Holmberg,
	Kernel development list

On 11/8/19 1:42 AM, Damien Le Moal wrote:
> On 2019/11/08 4:00, Andrea Vai wrote:
>> [Sorry for the duplicate message, it didn't reach the lists due to
>> html formatting]
>> Il giorno gio 7 nov 2019 alle ore 08:54 Damien Le Moal
>> <Damien.LeMoal@wdc.com> ha scritto:
>>>
>>> On 2019/11/07 16:04, Andrea Vai wrote:
>>>> Il giorno mer, 06/11/2019 alle 22.13 +0000, Damien Le Moal ha scritto:
>>>>>
>>>>>
>>>>> Please simply try your write tests after doing this:
>>>>>
>>>>> echo mq-deadline > /sys/block/<name of your USB
>>>>> disk>/queue/scheduler
>>>>>
>>>>> And confirm that mq-deadline is selected with:
>>>>>
>>>>> cat /sys/block/<name of your USB disk>/queue/scheduler
>>>>> [mq-deadline] kyber bfq none
>>>>
>>>> ok, which kernel should I test with this: the fresh git cloned, or the
>>>> one just patched with Alan's patch, or doesn't matter which one?
>>>
>>> Probably all of them to see if there are any differences.
>>
>> with both kernels, the output of
>> cat /sys/block/sdh/queue/schedule
>>
>> already contains [mq-deadline]: is it correct to assume that the echo
>> command and the subsequent testing is useless? What to do now?
> 
> Probably, yes. Have you obtained a blktrace of the workload during these
> tests ? Any significant difference in the IO pattern (IO size and
> randomness) and IO timing (any device idle time where the device has no
> command to process) ? Asking because the problem may be above the block
> layer, with the file system for instance.

blktrace would indeed be super useful, especially if you can do that
with a kernel that's fast for you, and one with the current kernel
where it's slow.

Given that your device is sdh, you simply do:

# blktrace /dev/sdh

and then run the test, then ctrl-c the blktrace. Then do:

# blkparse sdh > output

and save that output file. Do both runs, and bzip2 them up. The shorter
the run you can reproduce with the better, to cut down on the size of
the traces.

-- 
Jens Axboe


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: Slow I/O on USB media after commit f664a3cc17b7d0a2bc3b3ab96181e1029b0ec0e6
  2019-11-08  8:42               ` Damien Le Moal
  2019-11-08 14:33                 ` Jens Axboe
@ 2019-11-09 10:09                 ` Ming Lei
  1 sibling, 0 replies; 15+ messages in thread
From: Ming Lei @ 2019-11-09 10:09 UTC (permalink / raw)
  To: Damien Le Moal
  Cc: Andrea Vai, Alan Stern, Jens Axboe, Johannes Thumshirn, USB list,
	SCSI development list, Himanshu Madhani, Hannes Reinecke,
	Omar Sandoval, Martin K. Petersen, Greg KH, Hans Holmberg,
	Kernel development list

On Fri, Nov 08, 2019 at 08:42:53AM +0000, Damien Le Moal wrote:
> On 2019/11/08 4:00, Andrea Vai wrote:
> > [Sorry for the duplicate message, it didn't reach the lists due to
> > html formatting]
> > Il giorno gio 7 nov 2019 alle ore 08:54 Damien Le Moal
> > <Damien.LeMoal@wdc.com> ha scritto:
> >>
> >> On 2019/11/07 16:04, Andrea Vai wrote:
> >>> Il giorno mer, 06/11/2019 alle 22.13 +0000, Damien Le Moal ha scritto:
> >>>>
> >>>>
> >>>> Please simply try your write tests after doing this:
> >>>>
> >>>> echo mq-deadline > /sys/block/<name of your USB
> >>>> disk>/queue/scheduler
> >>>>
> >>>> And confirm that mq-deadline is selected with:
> >>>>
> >>>> cat /sys/block/<name of your USB disk>/queue/scheduler
> >>>> [mq-deadline] kyber bfq none
> >>>
> >>> ok, which kernel should I test with this: the fresh git cloned, or the
> >>> one just patched with Alan's patch, or doesn't matter which one?
> >>
> >> Probably all of them to see if there are any differences.
> > 
> > with both kernels, the output of
> > cat /sys/block/sdh/queue/schedule
> > 
> > already contains [mq-deadline]: is it correct to assume that the echo
> > command and the subsequent testing is useless? What to do now?
> 
> Probably, yes. Have you obtained a blktrace of the workload during these
> tests ? Any significant difference in the IO pattern (IO size and
> randomness) and IO timing (any device idle time where the device has no
> command to process) ? Asking because the problem may be above the block
> layer, with the file system for instance.

You may get the IO pattern via the previous trace 

https://lore.kernel.org/linux-usb/20190710024439.GA2621@ming.t460p/

IMO, if it is related write order, one possibility could be that
the queue lock is killed in .make_request_fn().


Thanks,
Ming


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: Slow I/O on USB media after commit f664a3cc17b7d0a2bc3b3ab96181e1029b0ec0e6
  2019-11-07 18:59             ` Andrea Vai
  2019-11-08  8:42               ` Damien Le Moal
@ 2019-11-09 22:28               ` Ming Lei
  2019-11-11 10:50                 ` Andrea Vai
  1 sibling, 1 reply; 15+ messages in thread
From: Ming Lei @ 2019-11-09 22:28 UTC (permalink / raw)
  To: Andrea Vai
  Cc: Damien Le Moal, Alan Stern, Jens Axboe, Johannes Thumshirn,
	USB list, SCSI development list, Himanshu Madhani,
	Hannes Reinecke, Omar Sandoval, Martin K. Petersen, Greg KH,
	Hans Holmberg, Kernel development list

On Thu, Nov 07, 2019 at 07:59:44PM +0100, Andrea Vai wrote:
> [Sorry for the duplicate message, it didn't reach the lists due to
> html formatting]
> Il giorno gio 7 nov 2019 alle ore 08:54 Damien Le Moal
> <Damien.LeMoal@wdc.com> ha scritto:
> >
> > On 2019/11/07 16:04, Andrea Vai wrote:
> > > Il giorno mer, 06/11/2019 alle 22.13 +0000, Damien Le Moal ha scritto:
> > >>
> > >>
> > >> Please simply try your write tests after doing this:
> > >>
> > >> echo mq-deadline > /sys/block/<name of your USB
> > >> disk>/queue/scheduler
> > >>
> > >> And confirm that mq-deadline is selected with:
> > >>
> > >> cat /sys/block/<name of your USB disk>/queue/scheduler
> > >> [mq-deadline] kyber bfq none
> > >
> > > ok, which kernel should I test with this: the fresh git cloned, or the
> > > one just patched with Alan's patch, or doesn't matter which one?
> >
> > Probably all of them to see if there are any differences.
> 
> with both kernels, the output of
> cat /sys/block/sdh/queue/schedule
> 
> already contains [mq-deadline]: is it correct to assume that the echo
> command and the subsequent testing is useless? What to do now?

Another thing we could try is to use 'none' via the following command:

 echo none > /sys/block/sdh/queue/scheduler  #suppose 'sdh' points to the usb storage disk

Because USB storage HBA is single hw queue, which depth is 1. This way
should change to dispatch IO in the order of bio submission.

Andrea, could you switch io scheduler to none and update us if difference
can be made?

Thanks,
Ming


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: Slow I/O on USB media after commit f664a3cc17b7d0a2bc3b3ab96181e1029b0ec0e6
  2019-11-08 14:33                 ` Jens Axboe
@ 2019-11-11 10:46                   ` Andrea Vai
  0 siblings, 0 replies; 15+ messages in thread
From: Andrea Vai @ 2019-11-11 10:46 UTC (permalink / raw)
  To: Jens Axboe, Damien Le Moal
  Cc: Alan Stern, Johannes Thumshirn, USB list, SCSI development list,
	Himanshu Madhani, Hannes Reinecke, Ming Lei, Omar Sandoval,
	Martin K. Petersen, Greg KH, Hans Holmberg,
	Kernel development list

Il giorno ven, 08/11/2019 alle 07.33 -0700, Jens Axboe ha scritto:
> On 11/8/19 1:42 AM, Damien Le Moal wrote:
> > On 2019/11/08 4:00, Andrea Vai wrote:
> >> [Sorry for the duplicate message, it didn't reach the lists due
> to
> >> html formatting]
> >> Il giorno gio 7 nov 2019 alle ore 08:54 Damien Le Moal
> >> <Damien.LeMoal@wdc.com> ha scritto:
> >>>
> >>> On 2019/11/07 16:04, Andrea Vai wrote:
> >>>> Il giorno mer, 06/11/2019 alle 22.13 +0000, Damien Le Moal ha
> scritto:
> >>>>>
> >>>>>
> >>>>> Please simply try your write tests after doing this:
> >>>>>
> >>>>> echo mq-deadline > /sys/block/<name of your USB
> >>>>> disk>/queue/scheduler
> >>>>>
> >>>>> And confirm that mq-deadline is selected with:
> >>>>>
> >>>>> cat /sys/block/<name of your USB disk>/queue/scheduler
> >>>>> [mq-deadline] kyber bfq none
> >>>>
> >>>> ok, which kernel should I test with this: the fresh git cloned,
> or the
> >>>> one just patched with Alan's patch, or doesn't matter which
> one?
> >>>
> >>> Probably all of them to see if there are any differences.
> >>
> >> with both kernels, the output of
> >> cat /sys/block/sdh/queue/schedule
> >>
> >> already contains [mq-deadline]: is it correct to assume that the
> echo
> >> command and the subsequent testing is useless? What to do now?
> > 
> > Probably, yes. Have you obtained a blktrace of the workload during
> these
> > tests ? Any significant difference in the IO pattern (IO size and
> > randomness) and IO timing (any device idle time where the device
> has no
> > command to process) ? Asking because the problem may be above the
> block
> > layer, with the file system for instance.
> 
> blktrace would indeed be super useful, especially if you can do that
> with a kernel that's fast for you, and one with the current kernel
> where it's slow.
> 
> Given that your device is sdh, you simply do:
> 
> # blktrace /dev/sdh
> 
> and then run the test, then ctrl-c the blktrace. Then do:
> 
> # blkparse sdh > output
> 
> and save that output file. Do both runs, and bzip2 them up. The
> shorter
> the run you can reproduce with the better, to cut down on the size
> of
> the traces.

Sorry, the next message from Ming...

-----
You may get the IO pattern via the previous trace 
https://lore.kernel.org/linux-usb/20190710024439.GA2621@ming.t460p/

IMO, if it is related write order, one possibility could be that
the queue lock is killed in .make_request_fn().
-----

...made me wonder if I should really do the blkparse trace test, or
not. So please confirm if it's needed (testing is quite time-consuming 
, so I'd like to do it if it's needed).

Thanks, and bye,
Andrea


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: Slow I/O on USB media after commit f664a3cc17b7d0a2bc3b3ab96181e1029b0ec0e6
  2019-11-09 22:28               ` Ming Lei
@ 2019-11-11 10:50                 ` Andrea Vai
  2019-11-11 11:05                   ` Ming Lei
  0 siblings, 1 reply; 15+ messages in thread
From: Andrea Vai @ 2019-11-11 10:50 UTC (permalink / raw)
  To: Ming Lei
  Cc: Damien Le Moal, Alan Stern, Jens Axboe, Johannes Thumshirn,
	USB list, SCSI development list, Himanshu Madhani,
	Hannes Reinecke, Omar Sandoval, Martin K. Petersen, Greg KH,
	Hans Holmberg, Kernel development list

Il giorno dom, 10/11/2019 alle 06.28 +0800, Ming Lei ha scritto:
> On Thu, Nov 07, 2019 at 07:59:44PM +0100, Andrea Vai wrote:
> > [Sorry for the duplicate message, it didn't reach the lists due to
> > html formatting]
> > Il giorno gio 7 nov 2019 alle ore 08:54 Damien Le Moal
> > <Damien.LeMoal@wdc.com> ha scritto:
> > >
> > > On 2019/11/07 16:04, Andrea Vai wrote:
> > > > Il giorno mer, 06/11/2019 alle 22.13 +0000, Damien Le Moal ha
> scritto:
> > > >>
> > > >>
> > > >> Please simply try your write tests after doing this:
> > > >>
> > > >> echo mq-deadline > /sys/block/<name of your USB
> > > >> disk>/queue/scheduler
> > > >>
> > > >> And confirm that mq-deadline is selected with:
> > > >>
> > > >> cat /sys/block/<name of your USB disk>/queue/scheduler
> > > >> [mq-deadline] kyber bfq none
> > > >
> > > > ok, which kernel should I test with this: the fresh git
> cloned, or the
> > > > one just patched with Alan's patch, or doesn't matter which
> one?
> > >
> > > Probably all of them to see if there are any differences.
> > 
> > with both kernels, the output of
> > cat /sys/block/sdh/queue/schedule
> > 
> > already contains [mq-deadline]: is it correct to assume that the
> echo
> > command and the subsequent testing is useless? What to do now?
> 
> Another thing we could try is to use 'none' via the following
> command:
> 
>  echo none > /sys/block/sdh/queue/scheduler  #suppose 'sdh' points
> to the usb storage disk
> 
> Because USB storage HBA is single hw queue, which depth is 1. This
> way
> should change to dispatch IO in the order of bio submission.
> 
> Andrea, could you switch io scheduler to none and update us if
> difference
> can be made?

Of course I would to it, but I see that with the "good" kernel the
output of "cat /sys/block/sdf/queue/scheduler" (yes, now it's sdf) is

noop deadline [cfq]

, i.e. it doesn't show "none". Does it matter? (sorry if it's a silly
question)

Thanks, and bye
Andrea



^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: Slow I/O on USB media after commit f664a3cc17b7d0a2bc3b3ab96181e1029b0ec0e6
  2019-11-11 10:50                 ` Andrea Vai
@ 2019-11-11 11:05                   ` Ming Lei
  2019-11-11 11:13                     ` Andrea Vai
  0 siblings, 1 reply; 15+ messages in thread
From: Ming Lei @ 2019-11-11 11:05 UTC (permalink / raw)
  To: Andrea Vai
  Cc: Damien Le Moal, Alan Stern, Jens Axboe, Johannes Thumshirn,
	USB list, SCSI development list, Himanshu Madhani,
	Hannes Reinecke, Omar Sandoval, Martin K. Petersen, Greg KH,
	Hans Holmberg, Kernel development list

On Mon, Nov 11, 2019 at 11:50:49AM +0100, Andrea Vai wrote:
> Il giorno dom, 10/11/2019 alle 06.28 +0800, Ming Lei ha scritto:
> > On Thu, Nov 07, 2019 at 07:59:44PM +0100, Andrea Vai wrote:
> > > [Sorry for the duplicate message, it didn't reach the lists due to
> > > html formatting]
> > > Il giorno gio 7 nov 2019 alle ore 08:54 Damien Le Moal
> > > <Damien.LeMoal@wdc.com> ha scritto:
> > > >
> > > > On 2019/11/07 16:04, Andrea Vai wrote:
> > > > > Il giorno mer, 06/11/2019 alle 22.13 +0000, Damien Le Moal ha
> > scritto:
> > > > >>
> > > > >>
> > > > >> Please simply try your write tests after doing this:
> > > > >>
> > > > >> echo mq-deadline > /sys/block/<name of your USB
> > > > >> disk>/queue/scheduler
> > > > >>
> > > > >> And confirm that mq-deadline is selected with:
> > > > >>
> > > > >> cat /sys/block/<name of your USB disk>/queue/scheduler
> > > > >> [mq-deadline] kyber bfq none
> > > > >
> > > > > ok, which kernel should I test with this: the fresh git
> > cloned, or the
> > > > > one just patched with Alan's patch, or doesn't matter which
> > one?
> > > >
> > > > Probably all of them to see if there are any differences.
> > > 
> > > with both kernels, the output of
> > > cat /sys/block/sdh/queue/schedule
> > > 
> > > already contains [mq-deadline]: is it correct to assume that the
> > echo
> > > command and the subsequent testing is useless? What to do now?
> > 
> > Another thing we could try is to use 'none' via the following
> > command:
> > 
> >  echo none > /sys/block/sdh/queue/scheduler  #suppose 'sdh' points
> > to the usb storage disk
> > 
> > Because USB storage HBA is single hw queue, which depth is 1. This
> > way
> > should change to dispatch IO in the order of bio submission.
> > 
> > Andrea, could you switch io scheduler to none and update us if
> > difference
> > can be made?
> 
> Of course I would to it, but I see that with the "good" kernel the
> output of "cat /sys/block/sdf/queue/scheduler" (yes, now it's sdf) is
> 
> noop deadline [cfq]

Not sure if cfq makes a difference, and I guess you may get same result
with noop or deadline. However, if you only see good write performance with
cfq, you may try 'bfq' and see if it works as cfq.

> 
> , i.e. it doesn't show "none". Does it matter? (sorry if it's a silly
> question)

We are talking about new kernel in which there can't be 'noop deadline [cfq]'
any more. And you should see the following output from '/sys/block/sdf/queue/scheduler'
in the new kernel:

	[mq-deadline] kyber bfq none


thanks,
Ming


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: Slow I/O on USB media after commit f664a3cc17b7d0a2bc3b3ab96181e1029b0ec0e6
  2019-11-11 11:05                   ` Ming Lei
@ 2019-11-11 11:13                     ` Andrea Vai
  0 siblings, 0 replies; 15+ messages in thread
From: Andrea Vai @ 2019-11-11 11:13 UTC (permalink / raw)
  To: Ming Lei
  Cc: Damien Le Moal, Alan Stern, Jens Axboe, Johannes Thumshirn,
	USB list, SCSI development list, Himanshu Madhani,
	Hannes Reinecke, Omar Sandoval, Martin K. Petersen, Greg KH,
	Hans Holmberg, Kernel development list

Il giorno lun, 11/11/2019 alle 19.05 +0800, Ming Lei ha scritto:
> On Mon, Nov 11, 2019 at 11:50:49AM +0100, Andrea Vai wrote:
> > Il giorno dom, 10/11/2019 alle 06.28 +0800, Ming Lei ha scritto:
> > > On Thu, Nov 07, 2019 at 07:59:44PM +0100, Andrea Vai wrote:
> > > > [Sorry for the duplicate message, it didn't reach the lists
> due to
> > > > html formatting]
> > > > Il giorno gio 7 nov 2019 alle ore 08:54 Damien Le Moal
> > > > <Damien.LeMoal@wdc.com> ha scritto:
> > > > >
> > > > > On 2019/11/07 16:04, Andrea Vai wrote:
> > > > > > Il giorno mer, 06/11/2019 alle 22.13 +0000, Damien Le Moal
> ha
> > > scritto:
> > > > > >>
> > > > > >>
> > > > > >> Please simply try your write tests after doing this:
> > > > > >>
> > > > > >> echo mq-deadline > /sys/block/<name of your USB
> > > > > >> disk>/queue/scheduler
> > > > > >>
> > > > > >> And confirm that mq-deadline is selected with:
> > > > > >>
> > > > > >> cat /sys/block/<name of your USB disk>/queue/scheduler
> > > > > >> [mq-deadline] kyber bfq none
> > > > > >
> > > > > > ok, which kernel should I test with this: the fresh git
> > > cloned, or the
> > > > > > one just patched with Alan's patch, or doesn't matter
> which
> > > one?
> > > > >
> > > > > Probably all of them to see if there are any differences.
> > > > 
> > > > with both kernels, the output of
> > > > cat /sys/block/sdh/queue/schedule
> > > > 
> > > > already contains [mq-deadline]: is it correct to assume that
> the
> > > echo
> > > > command and the subsequent testing is useless? What to do now?
> > > 
> > > Another thing we could try is to use 'none' via the following
> > > command:
> > > 
> > >  echo none > /sys/block/sdh/queue/scheduler  #suppose 'sdh'
> points
> > > to the usb storage disk
> > > 
> > > Because USB storage HBA is single hw queue, which depth is 1.
> This
> > > way
> > > should change to dispatch IO in the order of bio submission.
> > > 
> > > Andrea, could you switch io scheduler to none and update us if
> > > difference
> > > can be made?
> > 
> > Of course I would to it, but I see that with the "good" kernel the
> > output of "cat /sys/block/sdf/queue/scheduler" (yes, now it's sdf)
> is
> > 
> > noop deadline [cfq]
> 
> Not sure if cfq makes a difference, and I guess you may get same
> result
> with noop or deadline. However, if you only see good write
> performance with
> cfq, you may try 'bfq' and see if it works as cfq.
> 
> > 
> > , i.e. it doesn't show "none". Does it matter? (sorry if it's a
> silly
> > question)
> 
> We are talking about new kernel in which there can't be 'noop
> deadline [cfq]'
> any more. And you should see the following output from
> '/sys/block/sdf/queue/scheduler'
> in the new kernel:
> 
> 	[mq-deadline] kyber bfq none
> 
> 

ok sorry I misunderstood, assumed you wanted me to compare the "none"
setting in the old kernel with the same setting in the new kernel. Now
it's clear to me that you want me to compare the different scheduler
settings in the new kernel.

Thanks, and bye
Andrea


^ permalink raw reply	[flat|nested] 15+ messages in thread

end of thread, back to index

Thread overview: 15+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
     [not found] <0cd6ac36b7ab644576fc0f3f5bd4a880c33855d1.camel@unipv.it>
2019-11-05 18:31 ` Slow I/O on USB media after commit f664a3cc17b7d0a2bc3b3ab96181e1029b0ec0e6 Alan Stern
2019-11-05 23:29   ` Jens Axboe
2019-11-06 16:03     ` Alan Stern
2019-11-06 22:13       ` Damien Le Moal
2019-11-07  7:04         ` Andrea Vai
2019-11-07  7:54           ` Damien Le Moal
2019-11-07 18:59             ` Andrea Vai
2019-11-08  8:42               ` Damien Le Moal
2019-11-08 14:33                 ` Jens Axboe
2019-11-11 10:46                   ` Andrea Vai
2019-11-09 10:09                 ` Ming Lei
2019-11-09 22:28               ` Ming Lei
2019-11-11 10:50                 ` Andrea Vai
2019-11-11 11:05                   ` Ming Lei
2019-11-11 11:13                     ` Andrea Vai

LKML Archive on lore.kernel.org

Archives are clonable:
	git clone --mirror https://lore.kernel.org/lkml/0 lkml/git/0.git
	git clone --mirror https://lore.kernel.org/lkml/1 lkml/git/1.git
	git clone --mirror https://lore.kernel.org/lkml/2 lkml/git/2.git
	git clone --mirror https://lore.kernel.org/lkml/3 lkml/git/3.git
	git clone --mirror https://lore.kernel.org/lkml/4 lkml/git/4.git
	git clone --mirror https://lore.kernel.org/lkml/5 lkml/git/5.git
	git clone --mirror https://lore.kernel.org/lkml/6 lkml/git/6.git
	git clone --mirror https://lore.kernel.org/lkml/7 lkml/git/7.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 lkml lkml/ https://lore.kernel.org/lkml \
		linux-kernel@vger.kernel.org
	public-inbox-index lkml

Example config snippet for mirrors

Newsgroup available over NNTP:
	nntp://nntp.lore.kernel.org/org.kernel.vger.linux-kernel


AGPL code for this site: git clone https://public-inbox.org/public-inbox.git