linux-scsi.vger.kernel.org archive mirror
* Re: Slow I/O on USB media after commit f664a3cc17b7d0a2bc3b3ab96181e1029b0ec0e6
       [not found] <307581a490b610c3025ee80f79a465a89d68ed19.camel@unipv.it>
@ 2019-08-20 17:13 ` Alan Stern
  2019-08-23 10:39   ` Andrea Vai
  0 siblings, 1 reply; 102+ messages in thread
From: Alan Stern @ 2019-08-20 17:13 UTC (permalink / raw)
  To: Andrea Vai
  Cc: Johannes Thumshirn, Jens Axboe, linux-usb, linux-scsi,
	Himanshu Madhani, Hannes Reinecke, Ming Lei, Omar Sandoval,
	Martin K. Petersen, Greg KH

On Mon, 19 Aug 2019, Andrea Vai wrote:

> Hi Alan,
>   I attach the two traces, collected as follows:
> 
> - start the trace;
> - wait 10 seconds;
> - plug the drive;
> - wait 5 seconds;
> - mount the drive;
> - wait 5 seconds;
> - copy a 500 byte file;
> - wait 5 seconds;
> - unmount the drive;
> - wait 5 seconds;
> - stop the trace.

Still no noticeable differences between the two traces.  They both 
include a 1.2 second delay shortly after the writing starts, and the 
initialization sequences are the same.

I really don't know where to look for this.  The only thing I can think
of at this point is to repeat this test, but using a file large enough
for the difference in writing speed to show up plainly.

By the way, it would be best to run the tests with the smallest
possible number of other USB devices plugged in.  None at all, if you
can arrange it.

Alan Stern



* Re: Slow I/O on USB media after commit f664a3cc17b7d0a2bc3b3ab96181e1029b0ec0e6
  2019-08-20 17:13 ` Slow I/O on USB media after commit f664a3cc17b7d0a2bc3b3ab96181e1029b0ec0e6 Alan Stern
@ 2019-08-23 10:39   ` Andrea Vai
  2019-08-23 20:42     ` Alan Stern
  0 siblings, 1 reply; 102+ messages in thread
From: Andrea Vai @ 2019-08-23 10:39 UTC (permalink / raw)
  To: Alan Stern
  Cc: Johannes Thumshirn, Jens Axboe, linux-usb, linux-scsi,
	Himanshu Madhani, Hannes Reinecke, Ming Lei, Omar Sandoval,
	Martin K. Petersen, Greg KH

[-- Attachment #1: Type: text/plain, Size: 4851 bytes --]

On Tue, 20/08/2019 at 13.13 -0400, Alan Stern wrote:
> On Mon, 19 Aug 2019, Andrea Vai wrote:
> 
> > Hi Alan,
> >   I attach the two traces, collected as follows:
> > 
> > - start the trace;
> > - wait 10 seconds;
> > - plug the drive;
> > - wait 5 seconds;
> > - mount the drive;
> > - wait 5 seconds;
> > - copy a 500 byte file;
> > - wait 5 seconds;
> > - unmount the drive;
> > - wait 5 seconds;
> > - stop the trace.
> 
> Still no noticeable differences between the two traces.  They both 
> include a 1.2 second delay shortly after the writing starts, and
> the 
> initialization sequences are the same.
> 
> I really don't know where to look for this.  The only thing I can
> think
> of at this point is to repeat this test, but using a file large
> enough
> for the difference in writing speed to show up plainly.
> 
> By the way, it would be best to run the tests with the smallest
> possible number of other USB devices plugged in.  None at all, if
> you
> can arrange it.

Thanks, I went some steps further on this.
The following considerations all apply to the "bad" kernel.

Increasing the file size led me to find that with a file smaller
than roughly 10MB the problem does not happen.

I obtained these results by running sets of 10 trials for each file
size: 1kB, 10kB, 100kB, 1MB, 10MB, 100MB and 500MB (so we have 70
usbmon logs for these). Let's call a copy "fast" if it takes
(roughly(*)) no longer than any other trial in its set, and "slow"
otherwise (i.e. one or more trials in its set are (noticeably(*))
faster). In each set with a file size of 10MB or more the behavior
varies a lot: sometimes the copy is still "fast", sometimes it is
"slow". The frequency of the "slow" copies increases with the file
size. Also, the times of the "slow" copies within a set can differ
widely.

Also, I found that if the file is not present on the target location
(i.e. the USB pendrive), the problem does not happen (I have ten
usbmon logs here, taken in the worst scenario (500MB filesize)).

Tell me which log(s) you would like me to send you. Here is a summary
of all the sets of trials, and the time each copy took to complete (in
seconds):

1kB: 26, 27, 26, 26, 27, 26, 26, 27, 26, 27
10kB: 27, 27, 26, 26, 27, 26, 27, 26, 27, 27
100kB: 26, 26, 26, 27, 26, 26, 26, 27, 27, 27
1MB: 26, 27, 27, 27, 27, 27, 27, 27, 27, 26
10MB: 27, 31, 37, 27, 38, 27, 39, 27, 30, 28
100MB: 32, 32, 144, 32, 145, 32, 123, 32, 153, 123
500MB: 56, 1396, 747, 131, 795, 764, 292, 1021, 807, 516

Also, note that the first copy is always "fast", because each file was
initially not present on the pendrive. As mentioned, I also ran one set
of 10 trials in which I deleted the file on the pendrive before copying
it again, and the results are

500MB: 56, 56, 57, 57, 56, 56, 60, 25***, 55, 56 (***ignore the "fake"
25s: I forgot to plug in the pendrive :-/ )

I have made a script to semi-automate all the tests I have done. I
attach the script here, so anyone interested can check it for
mistakes (remember I am not very skilled, so I may have written buggy
code, made wrong assumptions, etc.). Please note that I decreased the
time between the trace start and the drive plugging from 10s to 5s
(simply to reduce the time I spend watching the countdown). Of
course I can repeat the test(s) you need with a larger value of
$wait.

The script has been run with the command

# for k in {1..10}; do size=1000; ./test_usbmon $size && ping -a -c 5 8.8.8.8 ; done
(example for 1kB filesize)

or, in the set of "delete before copy",

# for k in {1..10}; do size=500000000; ./cancellaTestFile $size && ./test_usbmon $size && ping -a -c 5 8.8.8.8 ; done

The ping command is there just to have a sound alarm when finished.

I also attach the script to delete the file ("cancellaTestFile").

I took care to plug in the pendrive exactly at the end of the countdown,
to make the times in the logs easier for you to locate and work with.

I have also saved all the terminal output of the script.
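
For readers without the attachment, the procedure described in this
thread can be sketched roughly as below. This is not the attached
test_usbmon script: DEV, MNT, SRC, OUT and WAIT are hypothetical
placeholders, and the sketch assumes the usbmon module is loaded and
debugfs is mounted at /sys/kernel/debug. With DRY_RUN=1 it only prints
the steps instead of running them.

```shell
#!/bin/sh
# Hedged sketch of one traced copy trial; NOT the attached test_usbmon script.
# DEV, MNT, SRC, OUT and WAIT are hypothetical placeholders.
DEV=${DEV:-/dev/sdX1}
MNT=${MNT:-/mnt/pendrive}
SRC=${SRC:-./testfile}
OUT=${OUT:-bad.mon.out_example}
WAIT=${WAIT:-5}

# Run a command, or just print it when DRY_RUN=1.
step() {
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

run_trial() {
    start=$(date +%s)
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "cat /sys/kernel/debug/usb/usbmon/0u > $OUT &"
    else
        # "0u" captures all buses in usbmon's text format.
        cat /sys/kernel/debug/usb/usbmon/0u > "$OUT" &
        trace_pid=$!
    fi
    step sleep "$WAIT"
    echo "plug the drive now"
    step sleep "$WAIT"
    step mount "$DEV" "$MNT"
    step sleep "$WAIT"
    step cp "$SRC" "$MNT/"
    step sleep "$WAIT"
    step umount "$MNT"      # unmounting flushes cached data to the drive
    step sleep "$WAIT"
    [ "${DRY_RUN:-0}" = 1 ] || kill "$trace_pid"
    end=$(date +%s)
    echo "elapsed: $((end - start)) s"
}
```

Running "DRY_RUN=1 run_trial" lists the steps without touching any
device; a real run needs root for the trace, mount and umount.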

Last note: I ran all the tests without any other USB device connected
but the pendrive (well, actually there is a card reader connected to
the internal USB connector, but on another bus. I didn't want to open
the case and disconnect it but of course I can do it if needed).
Thanks for pointing it out.

Thanks, and bye
Andrea

(*) as an example, in a set where the total elapsed times in
seconds are

26, 27, 27, 27, 27, 27, 27, 27, 27, 26

I have considered all of the copies to be "fast", while in the set

32, 32, 144, 32, 145, 32, 123, 32, 153, 123

I have considered 5 of the copies "fast" (the ones that took 32
seconds) and the others "slow". I am not going to go into standard
deviation analysis etc., but if you like I can provide more rigorous,
detailed data :-)
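
The informal "fast"/"slow" split in the footnote can be approximated
mechanically. The sketch below is only an illustration: the function
name count_slow and the rule "slow means more than FACTOR times the
fastest trial in the set", with FACTOR=1.5, are assumptions and not the
definition used in the message.

```shell
# count_slow: read comma-separated times (seconds) on stdin and print
# "fast=N slow=M". A trial counts as slow when it exceeds FACTOR times
# the fastest trial in the set; FACTOR=1.5 is an assumed threshold.
count_slow() {
    tr -d ' ' | tr ',' '\n' | LC_ALL=C awk -v factor="${FACTOR:-1.5}" '
        NF { t[n++] = $1 + 0; if (n == 1 || $1 + 0 < min) min = $1 + 0 }
        END {
            for (i = 0; i < n; i++) if (t[i] > factor * min) slow++
            printf "fast=%d slow=%d\n", n - slow, slow
        }'
}

echo "26, 27, 27, 27, 27, 27, 27, 27, 27, 26" | count_slow      # fast=10 slow=0
echo "32, 32, 144, 32, 145, 32, 123, 32, 153, 123" | count_slow # fast=5 slow=5
```

With this threshold the two example sets come out as in the footnote
(all fast, and 5 fast / 5 slow respectively).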


[-- Attachment #2: test_usbmon --]
[-- Type: application/x-shellscript, Size: 1607 bytes --]

[-- Attachment #3: cancellaTestFile --]
[-- Type: application/x-shellscript, Size: 296 bytes --]


* Re: Slow I/O on USB media after commit f664a3cc17b7d0a2bc3b3ab96181e1029b0ec0e6
  2019-08-23 10:39   ` Andrea Vai
@ 2019-08-23 20:42     ` Alan Stern
  2019-08-26  6:09       ` Andrea Vai
  0 siblings, 1 reply; 102+ messages in thread
From: Alan Stern @ 2019-08-23 20:42 UTC (permalink / raw)
  To: Andrea Vai
  Cc: Johannes Thumshirn, Jens Axboe, linux-usb, linux-scsi,
	Himanshu Madhani, Hannes Reinecke, Ming Lei, Omar Sandoval,
	Martin K. Petersen, Greg KH

On Fri, 23 Aug 2019, Andrea Vai wrote:

> [...]

Wow, that sounds like a lot of work.

Let's start with the 39-second run for the 10-MB file.  If you can put 
the trace files on a server somewhere, available for downloading, that 
would avoid sending a lot of uninteresting data to the mailing list.

Odd that the delays never occur when you're writing a new file.  (If
nothing else, that gives you a way to work around the problem!)  It's
hard to say what it means, though.  Maybe the flash drive doesn't like 
overwriting used blocks.

Alan Stern



* Re: Slow I/O on USB media after commit f664a3cc17b7d0a2bc3b3ab96181e1029b0ec0e6
  2019-08-23 20:42     ` Alan Stern
@ 2019-08-26  6:09       ` Andrea Vai
  2019-08-26 16:33         ` Alan Stern
  0 siblings, 1 reply; 102+ messages in thread
From: Andrea Vai @ 2019-08-26  6:09 UTC (permalink / raw)
  To: Alan Stern
  Cc: Johannes Thumshirn, Jens Axboe, linux-usb, linux-scsi,
	Himanshu Madhani, Hannes Reinecke, Ming Lei, Omar Sandoval,
	Martin K. Petersen, Greg KH

[Sorry, I had previously sent this message to the list but it was
rejected. Apologies for any duplicates.]

On Fri, 23/08/2019 at 16.42 -0400, Alan Stern wrote:
> [...]
> 
> Wow, that sounds like a lot of work.

just a drop in the ocean, compared to yours :-)

> Let's start with the 39-second run for the 10-MB file.  If you can
> put 
> the trace files on a server somewhere, available for downloading,
> that 
> would avoid sending a lot of uninteresting data to the mailing list.

ok, so you can grab them at

http://fisica.unipv.it/transfer/usbmon_logs.zip

(they will be automatically removed from there in a couple of weeks).

For each size there is a .txt file (which contains the terminal
output) and 10 bad.mon.out_.... trace files. The file suffix "NonCanc"
means the file was not deleted before the copy, while "Canc" means
the opposite.

Each trace file name contains a timestamp that is also referenced
inside the txt, so if you want e.g. the 39-sec trial for the 10MB file
size you have to open the ...10MB....txt, search for the 39 seconds
total time string ("Dopo stop trace: 39", Italian for "After stop
trace: 39"), go back to the beginning of that trial, a dozen rows
earlier, take note of the timestamp, and open the corresponding
bad.mon.out file (of course, if several trials have the same time, you
have to identify the right one by counting its position (7th in the
example above)).

To make it simpler:

$ seconds=39; size=10MB; grep -B14 "Dopo stop trace: $seconds" log_10trials_"$size"_NonCanc.txt

should show you the part(s) you need more directly.

> Odd that the delays never occur when you're writing a new file.  (If
> nothing else, that gives you a way to work around the problem!) 

Thank you, didn't realize that :-) I will try it.

Thanks, and bye
Andrea


-- 
Andrea Vai
University of Pavia
Department of Physics
Via Bassi, 6
27100 Pavia PV
Tel. +39 0382 987489
Mob. +39 328 3354086

http://fisica.unipv.it
http://www.andreavai.it



* Re: Slow I/O on USB media after commit f664a3cc17b7d0a2bc3b3ab96181e1029b0ec0e6
  2019-08-26  6:09       ` Andrea Vai
@ 2019-08-26 16:33         ` Alan Stern
  2019-09-18 15:25           ` Andrea Vai
  0 siblings, 1 reply; 102+ messages in thread
From: Alan Stern @ 2019-08-26 16:33 UTC (permalink / raw)
  To: Andrea Vai
  Cc: Johannes Thumshirn, Jens Axboe, linux-usb, linux-scsi,
	Himanshu Madhani, Hannes Reinecke, Ming Lei, Omar Sandoval,
	Martin K. Petersen, Greg KH

On Mon, 26 Aug 2019, Andrea Vai wrote:

> ok, so you can grab them at
> 
> http://fisica.unipv.it/transfer/usbmon_logs.zip
> 
> (they will be automatically removed from there in a couple of weeks).
> 
> For each size there is a .txt file (which contains the terminal
> output) and 10 bad.mon.out_.... trace files. The file suffix "NonCanc"
> means there has not been file deletion before copy; while "Canc" means
> the opposite.
> 
> Each trace file name is identified by a timestamp that is also
> referenced inside the txt, so if you want to get i.e. the 39-sec trial
> for the 10MB filesize you have to open the ...10MB....txt, search for
> the 39 seconds total time string ("Dopo stop trace: 39"), look at the
> beginning of that trial, a dozen rows before, take note of the
> timestamp, and open the corresponding bad.mon.out file (of course, if
> there are more trials with the same time, you have to identify it by
> counting its position (7th in the example above)).
> 
> To make it more simple:
> 
> $ seconds=39; size=10MB; grep -B14 "Dopo stop trace: $seconds" log_10trials_"$size"_NonCanc.txt
> 
> should show you more straightly the part(s) you need.
> 
> > Odd that the delays never occur when you're writing a new file.  (If
> > nothing else, that gives you a way to work around the problem!) 
> 
> Thank you, didn't realize that :-) I will try it.

In fact, even the traces where the file doesn't exist beforehand show 
some delays.  Just not as many delays as the traces where the file does 
exist.  And again, each delay is in the middle of a write command, not 
between commands.

I suppose changes to the upper software layers could affect which
blocks are assigned when a new file is written.  Perhaps one kernel
re-uses the same old blocks that had been previously occupied and the
other kernel allocates a completely new set of blocks.  That might
change the drive's behavior.  The quick way to tell is to record two
usbmon traces, one under the "good" kernel and one under the "bad"  
kernel, where each test involves writing over a file that already
exists (say, 50 MB) -- the same file for both tests.  The block numbers
will appear in the traces.

Also, I wonder if changing the size of the data transfers would
make any difference.  This is easy to try; just write "64" to
/sys/block/sd?/queue/max_sectors_kb (where the ? is the appropriate
drive letter) after the drive is plugged in but before the test starts.
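
The sysfs write suggested above could be wrapped as follows. This is a
minimal sketch: the function name set_max_sectors_kb and its optional
SYSFS_ROOT argument are hypothetical, the latter added only so the
function can be tried against a scratch directory, and "sdX" is a
placeholder for the real device name.

```shell
# set_max_sectors_kb DEV KB [SYSFS_ROOT]
# Writes KB to <root>/block/DEV/queue/max_sectors_kb and prints the old
# and new values. SYSFS_ROOT defaults to /sys.
set_max_sectors_kb() {
    dev=$1
    kb=$2
    root=${3:-/sys}
    attr="$root/block/$dev/queue/max_sectors_kb"
    [ -w "$attr" ] || { echo "cannot write $attr" >&2; return 1; }
    old=$(cat "$attr")
    echo "$kb" > "$attr"
    echo "$dev: max_sectors_kb $old -> $(cat "$attr")"
}

# Example (as root, after plugging in the drive; sdX is a placeholder):
#   set_max_sectors_kb sdX 64
```

The setting belongs to the plugged-in device instance, so it has to be
applied again each time the drive is reconnected.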

Alan Stern



* Re: Slow I/O on USB media after commit f664a3cc17b7d0a2bc3b3ab96181e1029b0ec0e6
  2019-08-26 16:33         ` Alan Stern
@ 2019-09-18 15:25           ` Andrea Vai
  2019-09-18 16:30             ` Alan Stern
  0 siblings, 1 reply; 102+ messages in thread
From: Andrea Vai @ 2019-09-18 15:25 UTC (permalink / raw)
  To: Alan Stern
  Cc: Johannes Thumshirn, Jens Axboe, linux-usb, linux-scsi,
	Himanshu Madhani, Hannes Reinecke, Ming Lei, Omar Sandoval,
	Martin K. Petersen, Greg KH

On Mon, 26 Aug 2019 at 18:33, Alan Stern <stern@rowland.harvard.edu>
wrote:
> [...]
> In fact, even the traces where the file doesn't exist beforehand
> show 
> some delays.  Just not as many delays as the traces where the file
> does 
> exist.  And again, each delay is in the middle of a write command,
> not 
> between commands.
> 
> I suppose changes to the upper software layers could affect which
> blocks are assigned when a new file is written.  Perhaps one kernel
> re-uses the same old blocks that had been previously occupied and
> the
> other kernel allocates a completely new set of blocks.  That might
> change the drive's behavior.  The quick way to tell is to record two
> usbmon traces, one under the "good" kernel and one under the "bad"  
> kernel, where each test involves writing over a file that already
> exists (say, 50 MB) -- the same file for both tests.  The block
> numbers
> will appear in the traces.

ok, I performed 10 tests for each kernel, so we have 20 traces.
 
> Also, I wonder if the changing the size of the data transfers would
> make any difference.  This is easy to try; just write "64" to
> /sys/block/sd?/queue/max_sectors_kb (where the ? is the appropriate
> drive letter) after the drive is plugged in but before the test
> starts.

ok, so I repeated the tests above for the "64" case (it was
initially set to "120", in case that is relevant), giving 40 tests named

bad.mon.out_50000000_64_TIMESTAMP
bad.mon.out_50000000_non64_TIMESTAMP
good.mon.out_50000000_64_TIMESTAMP
good.mon.out_50000000_non64_TIMESTAMP

where "64" denotes the ones done with that value in max_sectors_kb,
and "non64" ("not64" in the summary below) the ones without it (as far
as I can tell, it has always been "120").

So, we have 40 traces total. Each set of 10 trials is identified by
a text file, which contains the output log of the test script (and the
timestamps), also available in the download zipfile.

To summarize, the times are as follows (in seconds):

BAD:
  Logs: log_10trials_50MB_BAD_NonCanc_64.txt,
log_10trials_50MB_BAD_NonCanc_non64.txt
  64: 34, 34, 35, 39, 37, 32, 42, 44, 43, 40
  not64: 61, 71, 59, 71, 62, 75, 62, 70, 62, 68
GOOD:
  Logs: log_10trials_50MB_GOOD_NonCanc_64.txt,
log_10trials_50MB_GOOD_NonCanc_non64.txt
  64: 34, 32, 35, 34, 35, 33, 34, 33, 33, 33
  not64: 32, 30, 32, 31, 31, 30, 32, 30, 32, 31
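
To compare the four sets at a glance, their means can be computed with
a small helper. The function name mean is an assumption made for this
sketch; the numbers are the ones listed above.

```shell
# mean: print the arithmetic mean (one decimal) of comma-separated
# numbers read from stdin.
mean() {
    tr -d ' ' | tr ',' '\n' |
        LC_ALL=C awk 'NF { s += $1; n++ } END { if (n) printf "%.1f\n", s / n }'
}

echo "34, 34, 35, 39, 37, 32, 42, 44, 43, 40" | mean   # BAD,  64:    38.0
echo "61, 71, 59, 71, 62, 75, 62, 70, 62, 68" | mean   # BAD,  not64: 66.1
echo "34, 32, 35, 34, 35, 33, 34, 33, 33, 33" | mean   # GOOD, 64:    33.6
echo "32, 30, 32, 31, 31, 30, 32, 30, 32, 31" | mean   # GOOD, not64: 31.1
```

So on the bad kernel the "64" setting brings the mean from about 66 s
down to about 38 s, while on the good kernel it goes from about 31 s to
about 34 s.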

Finally, one note about the workaround proposed by Alan, "delete the
file before copying". My original problem occurred while using
backup software (dar - see http://dar.linux.free.fr/). So I have now
tried doing the backup after deleting the existing file beforehand,
and it still takes a lot of time with the bad kernel: a 900 file backup
takes ~160sec with the GOOD kernel, and >40min with the BAD kernel. I
also tried the "64" tweak with the BAD kernel and the time drops to
~300s. Then I also tried the "64" case with the good kernel, and it
took ~140s. Detailed data:

GOOD (not "64"): 155s, 151s
GOOD ("64"): 142s, 141s

BAD (not "64"): 47minutes, 43minutes
BAD ("64"): 315s, 288s, 268s, 239s, 302s

The command I ran is:
$ SECONDS=0; rm /run/media/andrea/BAK_ANDVAI/aero.1.dar && dar -c /run/media/andrea/BAK_ANDVAI/aero -R /home/andrea/Musica/MP3/Aerosmith && umount /run/media/andrea/BAK_ANDVAI; echo "Ci ho messo: $SECONDS secondi."

Speculations:
- It seems the "64" value plays a role, more evident on "bad" kernels
(~halves the time) and less (but still present?) on "good" kernels;
- dar with the bad kernel, with the "delete beforehand" action, is
still an order of magnitude slower than with the good kernel (so, it
behaves the same way as in the "overwrite" case). Maybe it depends on
the way dar itself writes data... I don't know whether you can work
that out, or whether we should ask the dar developer(s) to shed some
light on it.

You can grab the traces at

http://fisica.unipv.it/transfer/usbmon_logs_2.zip

Thanks, and bye
Andrea



* Re: Slow I/O on USB media after commit f664a3cc17b7d0a2bc3b3ab96181e1029b0ec0e6
  2019-09-18 15:25           ` Andrea Vai
@ 2019-09-18 16:30             ` Alan Stern
  2019-09-19  7:33               ` Andrea Vai
  2019-09-19  8:26               ` Damien Le Moal
  0 siblings, 2 replies; 102+ messages in thread
From: Alan Stern @ 2019-09-18 16:30 UTC (permalink / raw)
  To: Andrea Vai
  Cc: Johannes Thumshirn, Jens Axboe, USB list, SCSI development list,
	Himanshu Madhani, Hannes Reinecke, Ming Lei, Omar Sandoval,
	Martin K. Petersen, Greg KH

On Wed, 18 Sep 2019, Andrea Vai wrote:

> > Also, I wonder if the changing the size of the data transfers would
> > make any difference.  This is easy to try; just write "64" to
> > /sys/block/sd?/queue/max_sectors_kb (where the ? is the appropriate
> > drive letter) after the drive is plugged in but before the test
> > starts.
> 
> ok, so I duplicated the tests above for the "64" case (it was
> initially set as "120", if it is relevant to know), leading to 40 tests named as
> 
> bad.mon.out_50000000_64_TIMESTAMP
> bad.mon.out_50000000_non64_TIMESTAMP
> good.mon.out_50000000_64_TIMESTAMP
> good.mon.out_50000000_non64_TIMESTAMP
> 
> where "64" denotes the ones done with that value in max_sectors_kb,
> and "not64" the ones without it (as far as I can tell, it has been
> always "120").
> 
> So, we have 40 traces total. Each set of 10 trials is identified by
> a text file, which contains the output log of the test script (and the
> timestamps), also available in the download zipfile.
> 
> Just to summarize here the times, they are respectively (number
> expressed  in seconds):
> 
> BAD:
>   Logs: log_10trials_50MB_BAD_NonCanc_64.txt,
> log_10trials_50MB_BAD_NonCanc_non64.txt
>   64: 34, 34, 35, 39, 37, 32, 42, 44, 43, 40
>   not64: 61, 71, 59, 71, 62, 75, 62, 70, 62, 68
> GOOD:
>   Logs: log_10trials_50MB_GOOD_NonCanc_64.txt,
> log_10trials_50MB_GOOD_NonCanc_non64.txt
>   64: 34, 32, 35, 34, 35, 33, 34, 33, 33, 33
>   not64: 32, 30, 32, 31, 31, 30, 32, 30, 32, 31

The improvement from using "64" with the bad kernel is quite large.  
That alone would be a big help for you.

However, I did see what appears to be a very significant difference 
between the bad and good kernel traces.  It has to do with the order in 
which the blocks are accessed.

Here is an extract from one of the bad traces.  I have erased all the 
information except for the columns containing the block numbers to be 
written:

00019628 00
00019667 00
00019628 80
00019667 80
00019629 00
00019668 00
00019629 80
00019668 80

Here is the equivalent portion from one of the good traces:

00019628 00
00019628 80
00019629 00
00019629 80
0001962a 00
0001962a 80
0001962b 00
0001962b 80

Notice that under the good kernel, the block numbers increase
monotonically in a single sequence.  But under the bad kernel, the
block numbers are not monotonic -- it looks like there are two separate
threads each with its own strictly increasing sequence.
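
A rough way to check an extract like the ones above for ordering is to
count how often an entry is lower than the one before it. This is only
a sketch: the function name count_backward is hypothetical, and it
assumes every line holds two fixed-width, lowercase hex columns that
keep their ordering when joined into one string.

```shell
# count_backward: read "XXXXXXXX YY" hex pairs on stdin and print how
# many entries are lower than the entry immediately before them.
# The two columns are joined and compared as fixed-width hex strings.
count_backward() {
    LC_ALL=C awk '
        NF >= 2 {
            key = $1 $2 ""
            if (seen && key < prev) back++
            prev = key
            seen = 1
        }
        END { print back + 0 }'
}

# Extract from the bad trace: prints 3
printf '%s\n' "00019628 00" "00019667 00" "00019628 80" "00019667 80" \
    "00019629 00" "00019668 00" "00019629 80" "00019668 80" | count_backward

# Extract from the good trace: prints 0
printf '%s\n' "00019628 00" "00019628 80" "00019629 00" "00019629 80" \
    "0001962a 00" "0001962a 80" "0001962b 00" "0001962b 80" | count_backward
```

A count of zero means the extract is monotonic; any other value means
the writes jump backwards at least once, as in the bad trace.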

This is exactly the sort of difference one might expect to see from
the commit f664a3cc17b7 ("scsi: kill off the legacy IO path") you
identified as the cause of the problem.  With multiqueue I/O, it's not 
surprising to see multiple sequences of block numbers.

And it's not at all surprising that a consumer-grade USB storage device
might do a much worse job of handling non-sequential writes than 
sequential ones.

Which leads to a simple question for the SCSI or block-layer 
maintainers:  Is there a sysfs setting Andrea can tweak which will 
effectively restrict a particular disk device down to a single I/O
queue, forcing sequential access?

Alan Stern


^ permalink raw reply	[flat|nested] 102+ messages in thread

* Re: Slow I/O on USB media after commit f664a3cc17b7d0a2bc3b3ab96181e1029b0ec0e6
  2019-09-18 16:30             ` Alan Stern
@ 2019-09-19  7:33               ` Andrea Vai
  2019-09-19 17:54                 ` Alan Stern
  2019-09-19  8:26               ` Damien Le Moal
  1 sibling, 1 reply; 102+ messages in thread
From: Andrea Vai @ 2019-09-19  7:33 UTC (permalink / raw)
  To: Alan Stern
  Cc: Johannes Thumshirn, Jens Axboe, USB list, SCSI development list,
	Himanshu Madhani, Hannes Reinecke, Ming Lei, Omar Sandoval,
	Martin K. Petersen, Greg KH

On Wed, 18/09/2019 at 12.30 -0400, Alan Stern wrote:
> On Wed, 18 Sep 2019, Andrea Vai wrote:
> [...]
> > BAD:
> >   Logs: log_10trials_50MB_BAD_NonCanc_64.txt,
> > log_10trials_50MB_BAD_NonCanc_non64.txt
> >   64: 34, 34, 35, 39, 37, 32, 42, 44, 43, 40
> >   not64: 61, 71, 59, 71, 62, 75, 62, 70, 62, 68
> > GOOD:
> >   Logs: log_10trials_50MB_GOOD_NonCanc_64.txt,
> > log_10trials_50MB_GOOD_NonCanc_non64.txt
> >   64: 34, 32, 35, 34, 35, 33, 34, 33, 33, 33
> >   not64: 32, 30, 32, 31, 31, 30, 32, 30, 32, 31
> 
> The improvement from using "64" with the bad kernel is quite
> large.  
> That alone would be a big help for you.

Well, not so much, actually, because the backup would take twice as
long, which is quite annoying for me. But apart from that, and apart
from the efforts of Alan and the other people following this issue
(thanks), I would like to point out something I am not sure I have
ever made clear about my support request: I understand that my
problem is quite specific, and I don't want anyone to waste their time
helping specifically *me* (I can buy another drive, use the "64"
tweak, or find some other workaround). But since we have identified
the problem as kernel-related, I am worried about other users, maybe
new to Linux, who may hit the same problem; all they would see is that
Linux is extremely slow at copying files to some USB media. So, among
all the technical comments, I would like to make clear (if it's not
already clear) that in my opinion it is important to solve the problem
without requiring user workarounds. Does that make sense? Are we
moving towards that goal?

BTW, another question: Alan refers to the slow media as a "consumer-
grade USB storage device". What could I do to identify and buy a "good
media"? Are there any features to look for?

Many thanks, and sorry if I ask anything obvious.
Bye,
Andrea


^ permalink raw reply	[flat|nested] 102+ messages in thread

* Re: Slow I/O on USB media after commit f664a3cc17b7d0a2bc3b3ab96181e1029b0ec0e6
  2019-09-18 16:30             ` Alan Stern
  2019-09-19  7:33               ` Andrea Vai
@ 2019-09-19  8:26               ` Damien Le Moal
  2019-09-19  8:55                 ` Ming Lei
  2019-09-19 14:01                 ` Alan Stern
  1 sibling, 2 replies; 102+ messages in thread
From: Damien Le Moal @ 2019-09-19  8:26 UTC (permalink / raw)
  To: Alan Stern, Andrea Vai
  Cc: Johannes Thumshirn, Jens Axboe, USB list, SCSI development list,
	Himanshu Madhani, Hannes Reinecke, Ming Lei, Omar Sandoval,
	Martin K. Petersen, Greg KH, Hans Holmberg

On 2019/09/18 18:30, Alan Stern wrote:
> On Wed, 18 Sep 2019, Andrea Vai wrote:
> 
>>> Also, I wonder if the changing the size of the data transfers would
>>> make any difference.  This is easy to try; just write "64" to
>>> /sys/block/sd?/queue/max_sectors_kb (where the ? is the appropriate
>>> drive letter) after the drive is plugged in but before the test
>>> starts.
>>
>> ok, so I duplicated the tests above for the "64" case (it was
>> initially set as "120", if it is relevant to know), leading to 40 tests named as
>>
>> bad.mon.out_50000000_64_TIMESTAMP
>> bad.mon.out_50000000_non64_TIMESTAMP
>> good.mon.out_50000000_64_TIMESTAMP
>> good.mon.out_50000000_non64_TIMESTAMP
>>
>> where "64" denotes the ones done with that value in max_sectors_kb,
>> and "not64" the ones without it (as far as I can tell, it has been
>> always "120").
>>
>> So, we have 40 traces total. Each set of 10 trials is identified by
>> a text file, which contains the output log of the test script (and the
>> timestamps), also available in the download zipfile.
>>
>> Just to summarize here the times, they are respectively (number
>> expressed  in seconds):
>>
>> BAD:
>>   Logs: log_10trials_50MB_BAD_NonCanc_64.txt,
>> log_10trials_50MB_BAD_NonCanc_non64.txt
>>   64: 34, 34, 35, 39, 37, 32, 42, 44, 43, 40
>>   not64: 61, 71, 59, 71, 62, 75, 62, 70, 62, 68
>> GOOD:
>>   Logs: log_10trials_50MB_GOOD_NonCanc_64.txt,
>> log_10trials_50MB_GOOD_NonCanc_non64.txt
>>   64: 34, 32, 35, 34, 35, 33, 34, 33, 33, 33
>>   not64: 32, 30, 32, 31, 31, 30, 32, 30, 32, 31
> 
> The improvement from using "64" with the bad kernel is quite large.  
> That alone would be a big help for you.
> 
> However, I did see what appears to be a very significant difference 
> between the bad and good kernel traces.  It has to do with the order in 
> which the blocks are accessed.
> 
> Here is an extract from one of the bad traces.  I have erased all the 
> information except for the columns containing the block numbers to be 
> written:
> 
> 00019628 00
> 00019667 00
> 00019628 80
> 00019667 80
> 00019629 00
> 00019668 00
> 00019629 80
> 00019668 80
> 
> Here is the equivalent portion from one of the good traces:
> 
> 00019628 00
> 00019628 80
> 00019629 00
> 00019629 80
> 0001962a 00
> 0001962a 80
> 0001962b 00
> 0001962b 80
> 
> Notice that under the good kernel, the block numbers increase
> monotonically in a single sequence.  But under the bad kernel, the
> block numbers are not monotonic -- it looks like there are two separate
> threads each with its own strictly increasing sequence.
> 
> This is exactly the sort of difference one might expect to see from
> the commit f664a3cc17b7 ("scsi: kill off the legacy IO path") you
> identified as the cause of the problem.  With multiqueue I/O, it's not 
> surprising to see multiple sequences of block numbers.
> 
> And it's not at all surprising that a consumer-grade USB storage device
> might do a much worse job of handling non-sequential writes than 
> sequential ones.
> 
> Which leads to a simple question for the SCSI or block-layer 
> maintainers:  Is there a sysfs setting Andrea can tweak which will 
> effectively restrict a particular disk device down to a single I/O
> queue, forcing sequential access?

The scheduling inefficiency you are seeing may be coming from the fact that the
block layer does a direct issue of requests, bypassing the elevator, under some
conditions. One of these is sync requests on a multiqueue device. We hit this
problem on Zoned disks which can easily return an error for write requests
without the elevator throttling writes per zones (zone write locking). This
problem was discovered by Hans (on CC).

I discussed this with Hannes yesterday and we think we have a fix, but we'll
need to do a lot of testing as all block devices are potentially impacted by the
change, including stacked drivers (DM). Performance regression is scary with any
change in that area (see blk_mq_make_request() and use of
blk_mq_try_issue_directly() vs blk_mq_sched_insert_request()).


-- 
Damien Le Moal
Western Digital Research

^ permalink raw reply	[flat|nested] 102+ messages in thread

* Re: Slow I/O on USB media after commit f664a3cc17b7d0a2bc3b3ab96181e1029b0ec0e6
  2019-09-19  8:26               ` Damien Le Moal
@ 2019-09-19  8:55                 ` Ming Lei
  2019-09-19  9:09                   ` Damien Le Moal
  2019-09-19 14:01                 ` Alan Stern
  1 sibling, 1 reply; 102+ messages in thread
From: Ming Lei @ 2019-09-19  8:55 UTC (permalink / raw)
  To: Damien Le Moal
  Cc: Alan Stern, Andrea Vai, Johannes Thumshirn, Jens Axboe, USB list,
	SCSI development list, Himanshu Madhani, Hannes Reinecke,
	Omar Sandoval, Martin K. Petersen, Greg KH, Hans Holmberg

On Thu, Sep 19, 2019 at 08:26:32AM +0000, Damien Le Moal wrote:
> On 2019/09/18 18:30, Alan Stern wrote:
> > On Wed, 18 Sep 2019, Andrea Vai wrote:
> > 
> >>> Also, I wonder if the changing the size of the data transfers would
> >>> make any difference.  This is easy to try; just write "64" to
> >>> /sys/block/sd?/queue/max_sectors_kb (where the ? is the appropriate
> >>> drive letter) after the drive is plugged in but before the test
> >>> starts.
> >>
> >> ok, so I duplicated the tests above for the "64" case (it was
> >> initially set as "120", if it is relevant to know), leading to 40 tests named as
> >>
> >> bad.mon.out_50000000_64_TIMESTAMP
> >> bad.mon.out_50000000_non64_TIMESTAMP
> >> good.mon.out_50000000_64_TIMESTAMP
> >> good.mon.out_50000000_non64_TIMESTAMP
> >>
> >> where "64" denotes the ones done with that value in max_sectors_kb,
> >> and "not64" the ones without it (as far as I can tell, it has been
> >> always "120").
> >>
> >> So, we have 40 traces total. Each set of 10 trials is identified by
> >> a text file, which contains the output log of the test script (and the
> >> timestamps), also available in the download zipfile.
> >>
> >> Just to summarize here the times, they are respectively (number
> >> expressed  in seconds):
> >>
> >> BAD:
> >>   Logs: log_10trials_50MB_BAD_NonCanc_64.txt,
> >> log_10trials_50MB_BAD_NonCanc_non64.txt
> >>   64: 34, 34, 35, 39, 37, 32, 42, 44, 43, 40
> >>   not64: 61, 71, 59, 71, 62, 75, 62, 70, 62, 68
> >> GOOD:
> >>   Logs: log_10trials_50MB_GOOD_NonCanc_64.txt,
> >> log_10trials_50MB_GOOD_NonCanc_non64.txt
> >>   64: 34, 32, 35, 34, 35, 33, 34, 33, 33, 33
> >>   not64: 32, 30, 32, 31, 31, 30, 32, 30, 32, 31
> > 
> > The improvement from using "64" with the bad kernel is quite large.  
> > That alone would be a big help for you.
> > 
> > However, I did see what appears to be a very significant difference 
> > between the bad and good kernel traces.  It has to do with the order in 
> > which the blocks are accessed.
> > 
> > Here is an extract from one of the bad traces.  I have erased all the 
> > information except for the columns containing the block numbers to be 
> > written:
> > 
> > 00019628 00
> > 00019667 00
> > 00019628 80
> > 00019667 80
> > 00019629 00
> > 00019668 00
> > 00019629 80
> > 00019668 80
> > 
> > Here is the equivalent portion from one of the good traces:
> > 
> > 00019628 00
> > 00019628 80
> > 00019629 00
> > 00019629 80
> > 0001962a 00
> > 0001962a 80
> > 0001962b 00
> > 0001962b 80
> > 
> > Notice that under the good kernel, the block numbers increase
> > monotonically in a single sequence.  But under the bad kernel, the
> > block numbers are not monotonic -- it looks like there are two separate
> > threads each with its own strictly increasing sequence.
> > 
> > This is exactly the sort of difference one might expect to see from
> > the commit f664a3cc17b7 ("scsi: kill off the legacy IO path") you
> > identified as the cause of the problem.  With multiqueue I/O, it's not 
> > surprising to see multiple sequences of block numbers.
> > 
> > And it's not at all surprising that a consumer-grade USB storage device
> > might do a much worse job of handling non-sequential writes than 
> > sequential ones.
> > 
> > Which leads to a simple question for the SCSI or block-layer 
> > maintainers:  Is there a sysfs setting Andrea can tweak which will 
> > effectively restrict a particular disk device down to a single I/O
> > queue, forcing sequential access?
> 
> The scheduling inefficiency you are seeing may be coming from the fact that the
> block layer does a direct issue of requests, bypassing the elevator, under some
> conditions. One of these is sync requests on a multiqueue device. We hit this
> problem on Zoned disks which can easily return an error for write requests
> without the elevator throttling writes per zones (zone write locking). This
> problem was discovered by Hans (on CC).
> 
> I discussed this with Hannes yesterday and we think we have a fix, but we'll
> need to do a lot of testing as all block devices are potentially impacted by the
> change, including stacked drivers (DM). Performance regression is scary with any
> change in that area (see blk_mq_make_request() and use of
> blk_mq_try_issue_directly() vs blk_mq_sched_insert_request()).

Not sure this one is the same as yours. For USB, mq-deadline is used by
default, and direct issue won't be possible. Direct issue is only used
in the case of none or the underlying queues of DM multipath.
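
For anyone who wants to confirm which elevator a given disk is using,
the active one is shown in square brackets in
/sys/block/<dev>/queue/scheduler. Below is a minimal userspace sketch
for reading it; the function names are invented for this example, and
the sample strings in the comments only illustrate the file's format.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/*
 * Extract the bracketed (active) scheduler name from a line such as
 * "[mq-deadline] kyber bfq none".
 * Returns 0 on success, -1 if there is no bracketed name or it does
 * not fit in out.
 */
int active_scheduler(const char *line, char *out, size_t outlen)
{
    const char *lb = strchr(line, '[');
    const char *rb = lb ? strchr(lb, ']') : NULL;
    size_t len;

    if (!lb || !rb)
        return -1;
    len = (size_t)(rb - lb - 1);
    if (len + 1 > outlen)
        return -1;
    memcpy(out, lb + 1, len);
    out[len] = '\0';
    return 0;
}

/* Read the scheduler file for a disk such as "sdb" and parse it. */
int read_active_scheduler(const char *dev, char *out, size_t outlen)
{
    char path[128], line[256];
    FILE *f;
    int ret = -1;

    snprintf(path, sizeof(path), "/sys/block/%s/queue/scheduler", dev);
    f = fopen(path, "r");
    if (!f)
        return -1;
    if (fgets(line, sizeof(line), f))
        ret = active_scheduler(line, out, outlen);
    fclose(f);
    return ret;
}
```

For example, read_active_scheduler("sdb", buf, sizeof(buf)) would be
expected to fill buf with "mq-deadline" if that is the active
scheduler for sdb.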

thanks, 
Ming

^ permalink raw reply	[flat|nested] 102+ messages in thread

* Re: Slow I/O on USB media after commit f664a3cc17b7d0a2bc3b3ab96181e1029b0ec0e6
  2019-09-19  8:55                 ` Ming Lei
@ 2019-09-19  9:09                   ` Damien Le Moal
  2019-09-19  9:21                     ` Ming Lei
  0 siblings, 1 reply; 102+ messages in thread
From: Damien Le Moal @ 2019-09-19  9:09 UTC (permalink / raw)
  To: Ming Lei
  Cc: Alan Stern, Andrea Vai, Johannes Thumshirn, Jens Axboe, USB list,
	SCSI development list, Himanshu Madhani, Hannes Reinecke,
	Omar Sandoval, Martin K. Petersen, Greg KH, Hans Holmberg

On 2019/09/19 10:56, Ming Lei wrote:
> On Thu, Sep 19, 2019 at 08:26:32AM +0000, Damien Le Moal wrote:
>> On 2019/09/18 18:30, Alan Stern wrote:
>>> On Wed, 18 Sep 2019, Andrea Vai wrote:
>>>
>>>>> Also, I wonder if the changing the size of the data transfers would
>>>>> make any difference.  This is easy to try; just write "64" to
>>>>> /sys/block/sd?/queue/max_sectors_kb (where the ? is the appropriate
>>>>> drive letter) after the drive is plugged in but before the test
>>>>> starts.
>>>>
>>>> ok, so I duplicated the tests above for the "64" case (it was
>>>> initially set as "120", if it is relevant to know), leading to 40 tests named as
>>>>
>>>> bad.mon.out_50000000_64_TIMESTAMP
>>>> bad.mon.out_50000000_non64_TIMESTAMP
>>>> good.mon.out_50000000_64_TIMESTAMP
>>>> good.mon.out_50000000_non64_TIMESTAMP
>>>>
>>>> where "64" denotes the ones done with that value in max_sectors_kb,
>>>> and "not64" the ones without it (as far as I can tell, it has been
>>>> always "120").
>>>>
>>>> So, we have 40 traces total. Each set of 10 trials is identified by
>>>> a text file, which contains the output log of the test script (and the
>>>> timestamps), also available in the download zipfile.
>>>>
>>>> Just to summarize here the times, they are respectively (number
>>>> expressed  in seconds):
>>>>
>>>> BAD:
>>>>   Logs: log_10trials_50MB_BAD_NonCanc_64.txt,
>>>> log_10trials_50MB_BAD_NonCanc_non64.txt
>>>>   64: 34, 34, 35, 39, 37, 32, 42, 44, 43, 40
>>>>   not64: 61, 71, 59, 71, 62, 75, 62, 70, 62, 68
>>>> GOOD:
>>>>   Logs: log_10trials_50MB_GOOD_NonCanc_64.txt,
>>>> log_10trials_50MB_GOOD_NonCanc_non64.txt
>>>>   64: 34, 32, 35, 34, 35, 33, 34, 33, 33, 33
>>>>   not64: 32, 30, 32, 31, 31, 30, 32, 30, 32, 31
>>>
>>> The improvement from using "64" with the bad kernel is quite large.  
>>> That alone would be a big help for you.
>>>
>>> However, I did see what appears to be a very significant difference 
>>> between the bad and good kernel traces.  It has to do with the order in 
>>> which the blocks are accessed.
>>>
>>> Here is an extract from one of the bad traces.  I have erased all the 
>>> information except for the columns containing the block numbers to be 
>>> written:
>>>
>>> 00019628 00
>>> 00019667 00
>>> 00019628 80
>>> 00019667 80
>>> 00019629 00
>>> 00019668 00
>>> 00019629 80
>>> 00019668 80
>>>
>>> Here is the equivalent portion from one of the good traces:
>>>
>>> 00019628 00
>>> 00019628 80
>>> 00019629 00
>>> 00019629 80
>>> 0001962a 00
>>> 0001962a 80
>>> 0001962b 00
>>> 0001962b 80
>>>
>>> Notice that under the good kernel, the block numbers increase
>>> monotonically in a single sequence.  But under the bad kernel, the
>>> block numbers are not monotonic -- it looks like there are two separate
>>> threads each with its own strictly increasing sequence.
>>>
>>> This is exactly the sort of difference one might expect to see from
>>> the commit f664a3cc17b7 ("scsi: kill off the legacy IO path") you
>>> identified as the cause of the problem.  With multiqueue I/O, it's not 
>>> surprising to see multiple sequences of block numbers.
>>>
>>> And it's not at all surprising that a consumer-grade USB storage device
>>> might do a much worse job of handling non-sequential writes than 
>>> sequential ones.
>>>
>>> Which leads to a simple question for the SCSI or block-layer 
>>> maintainers:  Is there a sysfs setting Andrea can tweak which will 
>>> effectively restrict a particular disk device down to a single I/O
>>> queue, forcing sequential access?
>>
>> The scheduling inefficiency you are seeing may be coming from the fact that the
>> block layer does a direct issue of requests, bypassing the elevator, under some
>> conditions. One of these is sync requests on a multiqueue device. We hit this
>> problem on Zoned disks which can easily return an error for write requests
>> without the elevator throttling writes per zones (zone write locking). This
>> problem was discovered by Hans (on CC).
>>
>> I discussed this with Hannes yesterday and we think we have a fix, but we'll
>> need to do a lot of testing as all block devices are potentially impacted by the
>> change, including stacked drivers (DM). Performance regression is scary with any
>> change in that area (see blk_mq_make_request() and use of
>> blk_mq_try_issue_directly() vs blk_mq_sched_insert_request()).
> 
> Not sure this one is the same as yours. For USB, mq-deadline is used by
> default, and direct issue won't be possible. Direct issue is only used
> in the case of none or the underlying queues of DM multipath.

For a multi-queue zoned disk, mq-deadline is also set, but we have observed
unaligned write IO errors for sync writes because of mq-deadline being bypassed
and as a result zones not being write-locked.

In blk_mq_make_request(), at the end, you have:

	} else if ((q->nr_hw_queues > 1 && is_sync) || (!q->elevator &&
			!data.hctx->dispatch_busy)) {
		blk_mq_try_issue_directly(data.hctx, rq, &cookie);
	} else {
		blk_mq_sched_insert_request(rq, false, true, true);
	}

Which I read as "for a sync req on a multi-queue device, direct issue",
regardless of the elevator being none or something else.

The correct test should probably be:

	} else if (!q->elevator &&
		   ((q->nr_hw_queues > 1 && is_sync) ||
		    !data.hctx->dispatch_busy)) {
		blk_mq_try_issue_directly(data.hctx, rq, &cookie);
	} else {
		blk_mq_sched_insert_request(rq, false, true, true);
	}

That is, never bypass the elevator if one is set. Thoughts ?
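
To make the difference between the two tests easy to see, here is a
small userspace model of both conditions. It is only a sketch: struct
issue_ctx and the two function names are invented for this example and
merely stand in for the kernel fields named in the comments.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the state the kernel test looks at. */
struct issue_ctx {
    unsigned int nr_hw_queues;  /* q->nr_hw_queues */
    bool is_sync;               /* the request is synchronous */
    bool has_elevator;          /* q->elevator is set */
    bool dispatch_busy;         /* data.hctx->dispatch_busy is nonzero */
};

/* Current test: a sync request on a multi-queue device is always
 * issued directly, whether or not an elevator is set. */
bool direct_issue_current(const struct issue_ctx *c)
{
    return (c->nr_hw_queues > 1 && c->is_sync) ||
           (!c->has_elevator && !c->dispatch_busy);
}

/* Proposed test: direct issue is considered only without an elevator. */
bool direct_issue_proposed(const struct issue_ctx *c)
{
    return !c->has_elevator &&
           ((c->nr_hw_queues > 1 && c->is_sync) || !c->dispatch_busy);
}
```

With nr_hw_queues = 4, is_sync = true and has_elevator = true, the
current test returns true (the elevator is bypassed) and the proposed
one returns false. With nr_hw_queues = 1 and an elevator set, both
return false.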

-- 
Damien Le Moal
Western Digital Research

^ permalink raw reply	[flat|nested] 102+ messages in thread

* Re: Slow I/O on USB media after commit f664a3cc17b7d0a2bc3b3ab96181e1029b0ec0e6
  2019-09-19  9:09                   ` Damien Le Moal
@ 2019-09-19  9:21                     ` Ming Lei
  0 siblings, 0 replies; 102+ messages in thread
From: Ming Lei @ 2019-09-19  9:21 UTC (permalink / raw)
  To: Damien Le Moal
  Cc: Alan Stern, Andrea Vai, Johannes Thumshirn, Jens Axboe, USB list,
	SCSI development list, Himanshu Madhani, Hannes Reinecke,
	Omar Sandoval, Martin K. Petersen, Greg KH, Hans Holmberg

On Thu, Sep 19, 2019 at 09:09:33AM +0000, Damien Le Moal wrote:
> On 2019/09/19 10:56, Ming Lei wrote:
> > On Thu, Sep 19, 2019 at 08:26:32AM +0000, Damien Le Moal wrote:
> >> On 2019/09/18 18:30, Alan Stern wrote:
> >>> On Wed, 18 Sep 2019, Andrea Vai wrote:
> >>>
> >>>>> Also, I wonder if the changing the size of the data transfers would
> >>>>> make any difference.  This is easy to try; just write "64" to
> >>>>> /sys/block/sd?/queue/max_sectors_kb (where the ? is the appropriate
> >>>>> drive letter) after the drive is plugged in but before the test
> >>>>> starts.
> >>>>
> >>>> ok, so I duplicated the tests above for the "64" case (it was
> >>>> initially set as "120", if it is relevant to know), leading to 40 tests named as
> >>>>
> >>>> bad.mon.out_50000000_64_TIMESTAMP
> >>>> bad.mon.out_50000000_non64_TIMESTAMP
> >>>> good.mon.out_50000000_64_TIMESTAMP
> >>>> good.mon.out_50000000_non64_TIMESTAMP
> >>>>
> >>>> where "64" denotes the ones done with that value in max_sectors_kb,
> >>>> and "not64" the ones without it (as far as I can tell, it has been
> >>>> always "120").
> >>>>
> >>>> So, we have 40 traces total. Each set of 10 trials is identified by
> >>>> a text file, which contains the output log of the test script (and the
> >>>> timestamps), also available in the download zipfile.
> >>>>
> >>>> Just to summarize here the times, they are respectively (number
> >>>> expressed  in seconds):
> >>>>
> >>>> BAD:
> >>>>   Logs: log_10trials_50MB_BAD_NonCanc_64.txt,
> >>>> log_10trials_50MB_BAD_NonCanc_non64.txt
> >>>>   64: 34, 34, 35, 39, 37, 32, 42, 44, 43, 40
> >>>>   not64: 61, 71, 59, 71, 62, 75, 62, 70, 62, 68
> >>>> GOOD:
> >>>>   Logs: log_10trials_50MB_GOOD_NonCanc_64.txt,
> >>>> log_10trials_50MB_GOOD_NonCanc_non64.txt
> >>>>   64: 34, 32, 35, 34, 35, 33, 34, 33, 33, 33
> >>>>   not64: 32, 30, 32, 31, 31, 30, 32, 30, 32, 31
> >>>
> >>> The improvement from using "64" with the bad kernel is quite large.  
> >>> That alone would be a big help for you.
> >>>
> >>> However, I did see what appears to be a very significant difference 
> >>> between the bad and good kernel traces.  It has to do with the order in 
> >>> which the blocks are accessed.
> >>>
> >>> Here is an extract from one of the bad traces.  I have erased all the 
> >>> information except for the columns containing the block numbers to be 
> >>> written:
> >>>
> >>> 00019628 00
> >>> 00019667 00
> >>> 00019628 80
> >>> 00019667 80
> >>> 00019629 00
> >>> 00019668 00
> >>> 00019629 80
> >>> 00019668 80
> >>>
> >>> Here is the equivalent portion from one of the good traces:
> >>>
> >>> 00019628 00
> >>> 00019628 80
> >>> 00019629 00
> >>> 00019629 80
> >>> 0001962a 00
> >>> 0001962a 80
> >>> 0001962b 00
> >>> 0001962b 80
> >>>
> >>> Notice that under the good kernel, the block numbers increase
> >>> monotonically in a single sequence.  But under the bad kernel, the
> >>> block numbers are not monotonic -- it looks like there are two separate
> >>> threads each with its own strictly increasing sequence.
> >>>
> >>> This is exactly the sort of difference one might expect to see from
> >>> the commit f664a3cc17b7 ("scsi: kill off the legacy IO path") you
> >>> identified as the cause of the problem.  With multiqueue I/O, it's not 
> >>> surprising to see multiple sequences of block numbers.
> >>>
> >>> And it's not at all surprising that a consumer-grade USB storage device
> >>> might do a much worse job of handling non-sequential writes than 
> >>> sequential ones.
> >>>
> >>> Which leads to a simple question for the SCSI or block-layer 
> >>> maintainers:  Is there a sysfs setting Andrea can tweak which will 
> >>> effectively restrict a particular disk device down to a single I/O
> >>> queue, forcing sequential access?
> >>
> >> The scheduling inefficiency you are seeing may be coming from the fact that the
> >> block layer does a direct issue of requests, bypassing the elevator, under some
> >> conditions. One of these is sync requests on a multiqueue device. We hit this
> >> problem on Zoned disks which can easily return an error for write requests
> >> without the elevator throttling writes per zones (zone write locking). This
> >> problem was discovered by Hans (on CC).
> >>
> >> I discussed this with Hannes yesterday and we think we have a fix, but we'll
> >> need to do a lot of testing as all block devices are potentially impacted by the
> >> change, including stacked drivers (DM). Performance regression is scary with any
> >> change in that area (see blk_mq_make_request() and use of
> >> blk_mq_try_issue_directly() vs blk_mq_sched_insert_request()).
> > 
> > Not sure this one is the same as yours. For USB, mq-deadline is used by
> > default, and direct issue won't be possible. Direct issue is only used
> > in the case of none or the underlying queues of DM multipath.
> 
> For a multi-queue zoned disk, mq-deadline is also set, but we have observed
> unaligned write IO errors for sync writes because of mq-deadline being bypassed
> and as a result zones not being write-locked.
> 
> In blk_mq_make_request(), at the end, you have:
> 
> 	} else if ((q->nr_hw_queues > 1 && is_sync) || (!q->elevator &&
> 			!data.hctx->dispatch_busy)) {
> 		blk_mq_try_issue_directly(data.hctx, rq, &cookie);
> 	} else {
> 		blk_mq_sched_insert_request(rq, false, true, true);
> 	}
> 
> Which I read as "for a sync req on a multi-queue device, direct issue",
> regardless of the elevator being none or something else.

Yeah, it looks like the elevator is bypassed in the above case, which
seems to be a bug. But USB storage has only a single queue.

> 
> The correct test should probably be:
> 
> 	} else if (!q->elevator &&
> 		   ((q->nr_hw_queues > 1 && is_sync) ||
> 		    !data.hctx->dispatch_busy)) {
> 		blk_mq_try_issue_directly(data.hctx, rq, &cookie);
> 	} else {
> 		blk_mq_sched_insert_request(rq, false, true, true);
> 	}
> 
> That is, never bypass the elevator if one is set. Thoughts ?

IMO, the elevator shouldn't be bypassed at any time. It looks like it is
bypassed in the following branch too, though that branch may not be
reached for zoned devices.

blk_mq_make_request()
	...
	} else if (plug && !blk_queue_nomerges(q)) {
		...
		if (same_queue_rq) {
			data.hctx = same_queue_rq->mq_hctx;
			trace_block_unplug(q, 1, true);
			blk_mq_try_issue_directly(data.hctx, same_queue_rq,
					&cookie);
		}
	}


Thanks,
Ming

^ permalink raw reply	[flat|nested] 102+ messages in thread

* Re: Slow I/O on USB media after commit f664a3cc17b7d0a2bc3b3ab96181e1029b0ec0e6
  2019-09-19  8:26               ` Damien Le Moal
  2019-09-19  8:55                 ` Ming Lei
@ 2019-09-19 14:01                 ` Alan Stern
  2019-09-19 14:14                   ` Damien Le Moal
  1 sibling, 1 reply; 102+ messages in thread
From: Alan Stern @ 2019-09-19 14:01 UTC (permalink / raw)
  To: Damien Le Moal
  Cc: Andrea Vai, Johannes Thumshirn, Jens Axboe, USB list,
	SCSI development list, Himanshu Madhani, Hannes Reinecke,
	Ming Lei, Omar Sandoval, Martin K. Petersen, Greg KH,
	Hans Holmberg

On Thu, 19 Sep 2019, Damien Le Moal wrote:

> > This is exactly the sort of difference one might expect to see from
> > the commit f664a3cc17b7 ("scsi: kill off the legacy IO path") you
> > identified as the cause of the problem.  With multiqueue I/O, it's not 
> > surprising to see multiple sequences of block numbers.
> > 
> > And it's not at all surprising that a consumer-grade USB storage device
> > might do a much worse job of handling non-sequential writes than 
> > sequential ones.
> > 
> > Which leads to a simple question for the SCSI or block-layer 
> > maintainers:  Is there a sysfs setting Andrea can tweak which will 
> > effectively restrict a particular disk device down to a single I/O
> > queue, forcing sequential access?
> 
> The scheduling inefficiency you are seeing may be coming from the fact that the
> block layer does a direct issue of requests, bypassing the elevator, under some
> conditions. One of these is sync requests on a multiqueue device. We hit this
> problem on Zoned disks which can easily return an error for write requests
> without the elevator throttling writes per zones (zone write locking). This
> problem was discovered by Hans (on CC).

Is there any way for Andrea to check whether this is the underlying
cause?

> I discussed this with Hannes yesterday and we think we have a fix, but we'll
> need to do a lot of testing as all block devices are potentially impacted by the
> change, including stacked drivers (DM). Performance regression is scary with any
> change in that area (see blk_mq_make_request() and use of
> blk_mq_try_issue_directly() vs blk_mq_sched_insert_request()).

No doubt Andrea will be happy to test your fix when it's ready.

Alan Stern


^ permalink raw reply	[flat|nested] 102+ messages in thread

* Re: Slow I/O on USB media after commit f664a3cc17b7d0a2bc3b3ab96181e1029b0ec0e6
  2019-09-19 14:01                 ` Alan Stern
@ 2019-09-19 14:14                   ` Damien Le Moal
  2019-09-20  7:03                     ` Andrea Vai
  0 siblings, 1 reply; 102+ messages in thread
From: Damien Le Moal @ 2019-09-19 14:14 UTC (permalink / raw)
  To: Alan Stern
  Cc: Andrea Vai, Johannes Thumshirn, Jens Axboe, USB list,
	SCSI development list, Himanshu Madhani, Hannes Reinecke,
	Ming Lei, Omar Sandoval, Martin K. Petersen, Greg KH,
	Hans Holmberg

On 2019/09/19 16:01, Alan Stern wrote:
> On Thu, 19 Sep 2019, Damien Le Moal wrote:
> 
>>> This is exactly the sort of difference one might expect to see from
>>> the commit f664a3cc17b7 ("scsi: kill off the legacy IO path") you
>>> identified as the cause of the problem.  With multiqueue I/O, it's not 
>>> surprising to see multiple sequences of block numbers.
>>>
>>> And it's not at all surprising that a consumer-grade USB storage device
>>> might do a much worse job of handling non-sequential writes than 
>>> sequential ones.
>>>
>>> Which leads to a simple question for the SCSI or block-layer 
>>> maintainers:  Is there a sysfs setting Andrea can tweak which will 
>>> effectively restrict a particular disk device down to a single I/O
>>> queue, forcing sequential access?
>>
>> The scheduling inefficiency you are seeing may be coming from the fact that the
>> block layer does a direct issue of requests, bypassing the elevator, under some
>> conditions. One of these is sync requests on a multiqueue device. We hit this
>> problem on Zoned disks which can easily return an error for write requests
>> without the elevator throttling writes per zones (zone write locking). This
>> problem was discovered by Hans (on CC).
> 
> Is there any way for Andrea to check whether this is the underlying
> cause?
> 
>> I discussed this with Hannes yesterday and we think we have a fix, but we'll
>> need to do a lot of testing as all block devices are potentially impacted by the
>> change, including stacked drivers (DM). Performance regression is scary with any
>> change in that area (see blk_mq_make_request() and use of
>> blk_mq_try_issue_directly() vs blk_mq_sched_insert_request()).
> 
> No doubt Andrea will be happy to test your fix when it's ready.

Hannes posted an RFC series:

https://www.spinics.net/lists/linux-scsi/msg133848.html

Andrea can try it. But if the USB device is not multi-queue, this fix will
probably have no effect.

Andrea,

What is the device in question ? Is it a USB external HDD ? SSD ? Flash key ?

Best regards.


-- 
Damien Le Moal
Western Digital Research

^ permalink raw reply	[flat|nested] 102+ messages in thread

* Re: Slow I/O on USB media after commit f664a3cc17b7d0a2bc3b3ab96181e1029b0ec0e6
  2019-09-19  7:33               ` Andrea Vai
@ 2019-09-19 17:54                 ` Alan Stern
  2019-09-20  7:25                   ` Andrea Vai
  0 siblings, 1 reply; 102+ messages in thread
From: Alan Stern @ 2019-09-19 17:54 UTC (permalink / raw)
  To: Andrea Vai
  Cc: Johannes Thumshirn, Jens Axboe, USB list, SCSI development list,
	Himanshu Madhani, Hannes Reinecke, Ming Lei, Omar Sandoval,
	Martin K. Petersen, Greg KH

On Thu, 19 Sep 2019, Andrea Vai wrote:

> BTW, another question: Alan refers to the slow media as a "consumer-
> grade USB storage device". What could I do to identify and buy a "good
> media"? Are there any features to look for?

In general, USB flash drives should not be expected to work as well as 
an actual disk drive connected over USB.

Alan Stern


^ permalink raw reply	[flat|nested] 102+ messages in thread

* Re: Slow I/O on USB media after commit f664a3cc17b7d0a2bc3b3ab96181e1029b0ec0e6
  2019-09-19 14:14                   ` Damien Le Moal
@ 2019-09-20  7:03                     ` Andrea Vai
  2019-09-25 19:30                       ` Alan Stern
  0 siblings, 1 reply; 102+ messages in thread
From: Andrea Vai @ 2019-09-20  7:03 UTC (permalink / raw)
  To: Damien Le Moal, Alan Stern
  Cc: Johannes Thumshirn, Jens Axboe, USB list, SCSI development list,
	Himanshu Madhani, Hannes Reinecke, Ming Lei, Omar Sandoval,
	Martin K. Petersen, Greg KH, Hans Holmberg

On Thu, 19/09/2019 at 14.14 +0000, Damien Le Moal wrote:
> On 2019/09/19 16:01, Alan Stern wrote:
> [...]
> > No doubt Andrea will be happy to test your fix when it's ready.

Yes, of course.

> 
> Hannes posted an RFC series:
> 
> https://www.spinics.net/lists/linux-scsi/msg133848.html
> 
> Andrea can try it.

Ok, but I would need some instructions please, because I am not able
to understand how to "try it". Sorry for that.

> Andrea,
> 
> What is the device in question ? Is it a USB external HDD ? SSD ?
> Flash key ?

It is a USB flash key (a.k.a. pendrive, flash drive, etc.):

ID 0951:1666 Kingston Technology DataTraveler 100 G3/G4/SE9 G2

Thanks, and bye
Andrea


^ permalink raw reply	[flat|nested] 102+ messages in thread

* Re: Slow I/O on USB media after commit f664a3cc17b7d0a2bc3b3ab96181e1029b0ec0e6
  2019-09-19 17:54                 ` Alan Stern
@ 2019-09-20  7:25                   ` Andrea Vai
  2019-09-20  7:44                     ` Greg KH
  0 siblings, 1 reply; 102+ messages in thread
From: Andrea Vai @ 2019-09-20  7:25 UTC (permalink / raw)
  To: Alan Stern
  Cc: Johannes Thumshirn, Jens Axboe, USB list, SCSI development list,
	Himanshu Madhani, Hannes Reinecke, Ming Lei, Omar Sandoval,
	Martin K. Petersen, Greg KH

On Thu, 19/09/2019 at 13.54 -0400, Alan Stern wrote:
> 
> In general, USB flash drives should not be expected to work as well
> as 
> an actual disk drive connected over USB.

Ok, so I think I'll buy some different hardware. Would an SSD drive
(connected over USB) behave like a flash drive or like an "actual disk
drive" from this point of view?

Many thanks, and bye
Andrea


^ permalink raw reply	[flat|nested] 102+ messages in thread

* Re: Slow I/O on USB media after commit f664a3cc17b7d0a2bc3b3ab96181e1029b0ec0e6
  2019-09-20  7:25                   ` Andrea Vai
@ 2019-09-20  7:44                     ` Greg KH
  0 siblings, 0 replies; 102+ messages in thread
From: Greg KH @ 2019-09-20  7:44 UTC (permalink / raw)
  To: Andrea Vai
  Cc: Alan Stern, Johannes Thumshirn, Jens Axboe, USB list,
	SCSI development list, Himanshu Madhani, Hannes Reinecke,
	Ming Lei, Omar Sandoval, Martin K. Petersen

On Fri, Sep 20, 2019 at 09:25:17AM +0200, Andrea Vai wrote:
> On Thu, 19/09/2019 at 13.54 -0400, Alan Stern wrote:
> > 
> > In general, USB flash drives should not be expected to work as well
> > as 
> > an actual disk drive connected over USB.
> 
> Ok, so I think I'll buy some different hardware. Would an SSD drive
> (connected over USB) behave like a flash drive or like an "actual disk
> drive" from this point of view?

It all depends on the drive.  Some are a lot better than others, and
it's almost impossible to tell until you buy the thing and try it out :(

^ permalink raw reply	[flat|nested] 102+ messages in thread

* Re: Slow I/O on USB media after commit f664a3cc17b7d0a2bc3b3ab96181e1029b0ec0e6
  2019-09-20  7:03                     ` Andrea Vai
@ 2019-09-25 19:30                       ` Alan Stern
  2019-09-25 19:36                         ` Jens Axboe
  0 siblings, 1 reply; 102+ messages in thread
From: Alan Stern @ 2019-09-25 19:30 UTC (permalink / raw)
  To: Andrea Vai
  Cc: Damien Le Moal, Johannes Thumshirn, Jens Axboe, USB list,
	SCSI development list, Himanshu Madhani, Hannes Reinecke,
	Ming Lei, Omar Sandoval, Martin K. Petersen, Greg KH,
	Hans Holmberg

[-- Attachment #1: Type: TEXT/PLAIN, Size: 920 bytes --]

On Fri, 20 Sep 2019, Andrea Vai wrote:

> On Thu, 19/09/2019 at 14.14 +0000, Damien Le Moal wrote:
> > On 2019/09/19 16:01, Alan Stern wrote:
> > [...]
> > > No doubt Andrea will be happy to test your fix when it's ready.
> 
> Yes, of course.
> 
> > 
> > Hannes posted an RFC series:
> > 
> > https://www.spinics.net/lists/linux-scsi/msg133848.html
> > 
> > Andrea can try it.
> 
> Ok, but I would need some instructions please, because I am not able
> to understand how to "try it". Sorry for that.

I have attached the two patches to this email.  You should start with a 
recent kernel source tree and apply the patches by doing:

	git apply patch1 patch2

or something similar.  Then build a kernel from the new source code and 
test it.

Ultimately, if nobody can find a way to restore the sequential I/O 
behavior we had prior to commit f664a3cc17b7, that commit may have to 
be reverted.

Alan Stern

[-- Attachment #2: Type: TEXT/PLAIN, Size: 980 bytes --]

From: Hannes Reinecke <hare@suse.com>

When blk_mq_request_issue_directly() returns BLK_STS_RESOURCE we
need to requeue the I/O, but adding it to the global request list
will mess up with the passed-in request list. So re-add the request
to the original list and leave it to the caller to handle situations
where the list wasn't completely emptied.

Signed-off-by: Hannes Reinecke <hare@suse.com>
---
 block/blk-mq.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/block/blk-mq.c b/block/blk-mq.c
index b038ec680e84..44ff3c1442a4 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -1899,8 +1899,7 @@ void blk_mq_try_issue_list_directly(struct blk_mq_hw_ctx *hctx,
 		if (ret != BLK_STS_OK) {
 			if (ret == BLK_STS_RESOURCE ||
 					ret == BLK_STS_DEV_RESOURCE) {
-				blk_mq_request_bypass_insert(rq,
-							list_empty(list));
+				list_add(list, &rq->queuelist);
 				break;
 			}
 			blk_mq_end_request(rq, ret);
-- 
2.16.4

[-- Attachment #3: Type: TEXT/PLAIN, Size: 1721 bytes --]

From: Hannes Reinecke <hare@suse.com>

A scheduler might be attached even for devices exposing more than
one hardware queue, so the check for the number of hardware queue
is pointless and should be removed.

Signed-off-by: Hannes Reinecke <hare@suse.com>
---
 block/blk-mq.c | 6 +-----
 1 file changed, 1 insertion(+), 5 deletions(-)

diff --git a/block/blk-mq.c b/block/blk-mq.c
index 44ff3c1442a4..faab542e4836 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -1931,7 +1931,6 @@ static void blk_add_rq_to_plug(struct blk_plug *plug, struct request *rq)
 
 static blk_qc_t blk_mq_make_request(struct request_queue *q, struct bio *bio)
 {
-	const int is_sync = op_is_sync(bio->bi_opf);
 	const int is_flush_fua = op_is_flush(bio->bi_opf);
 	struct blk_mq_alloc_data data = { .flags = 0};
 	struct request *rq;
@@ -1977,7 +1976,7 @@ static blk_qc_t blk_mq_make_request(struct request_queue *q, struct bio *bio)
 		/* bypass scheduler for flush rq */
 		blk_insert_flush(rq);
 		blk_mq_run_hw_queue(data.hctx, true);
-	} else if (plug && (q->nr_hw_queues == 1 || q->mq_ops->commit_rqs)) {
+	} else if (plug && q->mq_ops->commit_rqs) {
 		/*
 		 * Use plugging if we have a ->commit_rqs() hook as well, as
 		 * we know the driver uses bd->last in a smart fashion.
@@ -2020,9 +2019,6 @@ static blk_qc_t blk_mq_make_request(struct request_queue *q, struct bio *bio)
 			blk_mq_try_issue_directly(data.hctx, same_queue_rq,
 					&cookie);
 		}
-	} else if ((q->nr_hw_queues > 1 && is_sync) || (!q->elevator &&
-			!data.hctx->dispatch_busy)) {
-		blk_mq_try_issue_directly(data.hctx, rq, &cookie);
 	} else {
 		blk_mq_sched_insert_request(rq, false, true, true);
 	}
-- 
2.16.4

^ permalink raw reply related	[flat|nested] 102+ messages in thread

* Re: Slow I/O on USB media after commit f664a3cc17b7d0a2bc3b3ab96181e1029b0ec0e6
  2019-09-25 19:30                       ` Alan Stern
@ 2019-09-25 19:36                         ` Jens Axboe
  2019-09-27 15:47                           ` Andrea Vai
  0 siblings, 1 reply; 102+ messages in thread
From: Jens Axboe @ 2019-09-25 19:36 UTC (permalink / raw)
  To: Alan Stern, Andrea Vai
  Cc: Damien Le Moal, Johannes Thumshirn, USB list,
	SCSI development list, Himanshu Madhani, Hannes Reinecke,
	Ming Lei, Omar Sandoval, Martin K. Petersen, Greg KH,
	Hans Holmberg

On 9/25/19 9:30 PM, Alan Stern wrote:
> On Fri, 20 Sep 2019, Andrea Vai wrote:
> 
>> On Thu, 19/09/2019 at 14.14 +0000, Damien Le Moal wrote:
>>> On 2019/09/19 16:01, Alan Stern wrote:
>>> [...]
>>>> No doubt Andrea will be happy to test your fix when it's ready.
>>
>> Yes, of course.
>>
>>>
>>> Hannes posted an RFC series:
>>>
>>> https://www.spinics.net/lists/linux-scsi/msg133848.html
>>>
>>> Andrea can try it.
>>
>> Ok, but I would need some instructions please, because I am not able
>> to understand how to "try it". Sorry for that.
> 
> I have attached the two patches to this email.  You should start with a
> recent kernel source tree and apply the patches by doing:
> 
> 	git apply patch1 patch2
> 
> or something similar.  Then build a kernel from the new source code and
> test it.
> 
> Ultimately, if nobody can find a way to restore the sequential I/O
> behavior we had prior to commit f664a3cc17b7, that commit may have to
> be reverted.

Don't use patch1, it's buggy. patch2 should be enough to test the theory.

-- 
Jens Axboe


^ permalink raw reply	[flat|nested] 102+ messages in thread

* Re: Slow I/O on USB media after commit f664a3cc17b7d0a2bc3b3ab96181e1029b0ec0e6
  2019-09-25 19:36                         ` Jens Axboe
@ 2019-09-27 15:47                           ` Andrea Vai
  2019-11-04 16:00                             ` Andrea Vai
  0 siblings, 1 reply; 102+ messages in thread
From: Andrea Vai @ 2019-09-27 15:47 UTC (permalink / raw)
  To: Jens Axboe, Alan Stern
  Cc: Damien Le Moal, Johannes Thumshirn, USB list,
	SCSI development list, Himanshu Madhani, Hannes Reinecke,
	Ming Lei, Omar Sandoval, Martin K. Petersen, Greg KH,
	Hans Holmberg

On Wed, 25/09/2019 at 21.36 +0200, Jens Axboe wrote:
> On 9/25/19 9:30 PM, Alan Stern wrote:
> [...]
> > 
> > I have attached the two patches to this email.  You should start
> with a
> > recent kernel source tree and apply the patches by doing:
> > 
> > 	git apply patch1 patch2
> > 
> > or something similar.  Then build a kernel from the new source
> code and
> > test it.
> > 
> > Ultimately, if nobody can find a way to restore the sequential I/O
> > behavior we had prior to commit f664a3cc17b7, that commit may have
> to
> > be reverted.
> 
> Don't use patch1, it's buggy. patch2 should be enough to test the
> theory.

Sorry, but if I cd into the "linux" directory and run the command

# git apply -v patch2

the result is that the patch cannot be applied correctly:

------------------------------------------------------------------------------
Checking patch block/blk-mq.c...
error: while searching for:
?
static blk_qc_t blk_mq_make_request(struct request_queue *q, struct bio *bio)?
{?
	const int is_sync = op_is_sync(bio->bi_opf);?
	const int is_flush_fua = op_is_flush(bio->bi_opf);?
	struct blk_mq_alloc_data data = { .flags = 0};?
	struct request *rq;?

error: patch failed: block/blk-mq.c:1931
error: block/blk-mq.c: patch does not apply
------------------------------------------------------------------------------

The "linux" directory is the one generated by a fresh git clone:

git clone git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

What am I doing wrong?

Thanks, and bye
Andrea


^ permalink raw reply	[flat|nested] 102+ messages in thread

* Re: Slow I/O on USB media after commit f664a3cc17b7d0a2bc3b3ab96181e1029b0ec0e6
  2019-09-27 15:47                           ` Andrea Vai
@ 2019-11-04 16:00                             ` Andrea Vai
  2019-11-04 18:20                               ` Alan Stern
  0 siblings, 1 reply; 102+ messages in thread
From: Andrea Vai @ 2019-11-04 16:00 UTC (permalink / raw)
  To: Jens Axboe, Alan Stern
  Cc: Damien Le Moal, Johannes Thumshirn, USB list,
	SCSI development list, Himanshu Madhani, Hannes Reinecke,
	Ming Lei, Omar Sandoval, Martin K. Petersen, Greg KH,
	Hans Holmberg

[-- Attachment #1: Type: text/plain, Size: 2368 bytes --]

On Fri, 27/09/2019 at 17.47 +0200, Andrea Vai wrote:
> On Wed, 25/09/2019 at 21.36 +0200, Jens Axboe wrote:
> > On 9/25/19 9:30 PM, Alan Stern wrote:
> > [...]
> > > 
> > > I have attached the two patches to this email.  You should start
> > with a
> > > recent kernel source tree and apply the patches by doing:
> > > 
> > > 	git apply patch1 patch2
> > > 
> > > or something similar.  Then build a kernel from the new source
> > code and
> > > test it.
> > > 
> > > Ultimately, if nobody can find a way to restore the sequential
> I/O
> > > behavior we had prior to commit f664a3cc17b7, that commit may
> have
> > to
> > > be reverted.
> > 
> > Don't use patch1, it's buggy. patch2 should be enough to test the
> > theory.

As I didn't have any answer, I am quoting my last reply here:

> 
> Sorry, but if I cd into the "linux" directory and run the command
> 
> # git apply -v patch2
> 
> the result is that the patch cannot be applied correctly:
> 
> --------------------------------------------------------------------
> ----------
> Checking patch block/blk-mq.c...
> error: while searching for:
> ?
> static blk_qc_t blk_mq_make_request(struct request_queue *q, struct
> bio *bio)?
> {?
> 	const int is_sync = op_is_sync(bio->bi_opf);?
> 	const int is_flush_fua = op_is_flush(bio->bi_opf);?
> 	struct blk_mq_alloc_data data = { .flags = 0};?
> 	struct request *rq;?
> 
> error: patch failed: block/blk-mq.c:1931
> error: block/blk-mq.c: patch does not apply
> --------------------------------------------------------------------
> ----------
> 
> The "linux" directory is the one generated by a fresh git clone:
> 
> git clone
> git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
> 
> What am I doing wrong?
> 

Meanwhile, Alan tried to help me and gave me another patch (attached),
which doesn't work either, but gives a different error: "The git diff
header does not contain information about the file once removed 1
initial component of the path (row 14)" (actually, this is my
translation from the original message in Italian: "error:
l'intestazione git diff non riporta le informazioni sul file una volta
rimosso 1 componente iniziale del percorso (riga 14)")

I tested the two patches after a fresh git clone today, a few minutes
ago.

What can I do?

Thank you,
Bye
Andrea

[-- Attachment #2: patch2_alan --]
[-- Type: message/rfc822, Size: 296 bytes --]

From: Hannes Reinecke <hare@suse.com>
Subject: No Subject
Date: Mon, 04 Nov 2019 16:58:21 +0100
Message-ID: <fe072bc69e13435573d824133c3981f8841cf2c7.camel@suse.com>



^ permalink raw reply	[flat|nested] 102+ messages in thread

* Re: Slow I/O on USB media after commit f664a3cc17b7d0a2bc3b3ab96181e1029b0ec0e6
  2019-11-04 16:00                             ` Andrea Vai
@ 2019-11-04 18:20                               ` Alan Stern
  2019-11-05 11:48                                 ` Andrea Vai
  0 siblings, 1 reply; 102+ messages in thread
From: Alan Stern @ 2019-11-04 18:20 UTC (permalink / raw)
  To: Andrea Vai
  Cc: Jens Axboe, Damien Le Moal, Johannes Thumshirn, USB list,
	SCSI development list, Himanshu Madhani, Hannes Reinecke,
	Ming Lei, Omar Sandoval, Martin K. Petersen, Greg KH,
	Hans Holmberg

[-- Attachment #1: Type: TEXT/PLAIN, Size: 1088 bytes --]

On Mon, 4 Nov 2019, Andrea Vai wrote:

> > The "linux" directory is the one generated by a fresh git clone:
> > 
> > git clone
> > git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
> > 
> > What am I doing wrong?
> > 
> 
> Meanwhile, Alan tried to help me and gave me another patch (attached),
> which doesn't work too, but gives a different error: "The git diff
> header does not contain information about the file once removed 1
> initial component of the path (row 14)" (actually, this is my
> translation from the original message in Italian: "error:
> l'intestazione git diff non riporta le informazioni sul file una volta
> rimosso 1 componente iniziale del percorso (riga 14)")
> 
> I tested the two patches after a fresh git clone today, a few minutes
> ago.
> 
> What can I do?

You should be able to do something like this:

	cd linux
	patch -p1 </path/to/patch2

and that should work with no errors.  You don't need to use git to 
apply a patch.

In case that patch2 file was mangled somewhere along the way, I have 
attached a copy to this message.
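
Before touching the tree, the patch can be checked first.  What follows is 
only a sketch, assuming a POSIX shell and GNU patch, whose --dry-run 
option reports what would happen without changing any file.  The helper 
names has_cr and check_patch are made up for illustration.  The 
carriage-return check is a guess at one way a patch can get mangled in 
transit (the "?" at the end of each context line in the earlier git 
apply output would be consistent with it); it is not a confirmed 
diagnosis.

```shell
#!/bin/sh
# has_cr FILE: succeed if FILE contains at least one carriage return,
# a common sign that a mail client rewrote the line endings.
has_cr() {
    grep -q "$(printf '\r')" "$1"
}

# check_patch TREE PATCHFILE: dry-run PATCHFILE against TREE.
# PATCHFILE must be an absolute path because of the cd below.
check_patch() {
    if has_cr "$2"; then
        echo "warning: $2 contains carriage returns" >&2
    fi
    (cd "$1" && patch -p1 --dry-run < "$2")
}

# Example (both paths are placeholders):
#   check_patch /path/to/linux /path/to/patch2
```

If the dry run reports no failed hunks, the same patch command without 
--dry-run should apply cleanly.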

Alan Stern

[-- Attachment #2: Type: TEXT/PLAIN, Size: 1777 bytes --]

From: Hannes Reinecke <hare@suse.com>

A scheduler might be attached even for devices exposing more than
one hardware queue, so the check for the number of hardware queue
is pointless and should be removed.

Signed-off-by: Hannes Reinecke <hare@suse.com>
---
 block/blk-mq.c |    7 +------
 1 file changed, 1 insertion(+), 6 deletions(-)

diff --git a/block/blk-mq.c b/block/blk-mq.c
index 44ff3c1442a4..faab542e4836 100644
Index: usb-devel/block/blk-mq.c
===================================================================
--- usb-devel.orig/block/blk-mq.c
+++ usb-devel/block/blk-mq.c
@@ -1946,7 +1946,6 @@ static void blk_add_rq_to_plug(struct bl
 
 static blk_qc_t blk_mq_make_request(struct request_queue *q, struct bio *bio)
 {
-	const int is_sync = op_is_sync(bio->bi_opf);
 	const int is_flush_fua = op_is_flush(bio->bi_opf);
 	struct blk_mq_alloc_data data = { .flags = 0};
 	struct request *rq;
@@ -1992,8 +1991,7 @@ static blk_qc_t blk_mq_make_request(stru
 		/* bypass scheduler for flush rq */
 		blk_insert_flush(rq);
 		blk_mq_run_hw_queue(data.hctx, true);
-	} else if (plug && (q->nr_hw_queues == 1 || q->mq_ops->commit_rqs ||
-				!blk_queue_nonrot(q))) {
+	} else if (plug && (q->mq_ops->commit_rqs || !blk_queue_nonrot(q))) {
 		/*
 		 * Use plugging if we have a ->commit_rqs() hook as well, as
 		 * we know the driver uses bd->last in a smart fashion.
@@ -2041,9 +2039,6 @@ static blk_qc_t blk_mq_make_request(stru
 			blk_mq_try_issue_directly(data.hctx, same_queue_rq,
 					&cookie);
 		}
-	} else if ((q->nr_hw_queues > 1 && is_sync) ||
-			!data.hctx->dispatch_busy) {
-		blk_mq_try_issue_directly(data.hctx, rq, &cookie);
 	} else {
 		blk_mq_sched_insert_request(rq, false, true, true);
 	}

^ permalink raw reply	[flat|nested] 102+ messages in thread

* Re: Slow I/O on USB media after commit f664a3cc17b7d0a2bc3b3ab96181e1029b0ec0e6
  2019-11-04 18:20                               ` Alan Stern
@ 2019-11-05 11:48                                 ` Andrea Vai
  2019-11-05 18:31                                   ` Alan Stern
  0 siblings, 1 reply; 102+ messages in thread
From: Andrea Vai @ 2019-11-05 11:48 UTC (permalink / raw)
  To: Alan Stern
  Cc: Jens Axboe, Damien Le Moal, Johannes Thumshirn, USB list,
	SCSI development list, Himanshu Madhani, Hannes Reinecke,
	Ming Lei, Omar Sandoval, Martin K. Petersen, Greg KH,
	Hans Holmberg

On Mon, 04/11/2019 at 13.20 -0500, Alan Stern wrote:
> On Mon, 4 Nov 2019, Andrea Vai wrote:
> 
> > > The "linux" directory is the one generated by a fresh git clone:
> > > 
> > > git clone
> > > git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
> > > 
> > > What am I doing wrong?
> > > 
> > 
> > Meanwhile, Alan tried to help me and gave me another patch
> (attached),
> > which doesn't work too, but gives a different error: "The git diff
> > header does not contain information about the file once removed 1
> > initial component of the path (row 14)" (actually, this is my
> > translation from the original message in Italian: "error:
> > l'intestazione git diff non riporta le informazioni sul file una
> volta
> > rimosso 1 componente iniziale del percorso (riga 14)")
> > 
> > I tested the two patches after a fresh git clone today, a few
> minutes
> > ago.
> > 
> > What can I do?
> 
> You should be able to do something like this:
> 
>         cd linux
>         patch -p1 </path/to/patch2
> 
> and that should work with no errors.  You don't need to use git to 
> apply a patch.
> 
> In case that patch2 file was mangled somewhere along the way, I
> have 
> attached a copy to this message.

Ok, so the "patch" command worked, the kernel compiled and ran, but
the test still failed (273, 108, 104, 260, 177, 236, 179, 1123, 289,
873 seconds to copy a 500MB file, vs. ~30 seconds with the "good"
kernel).
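
To put those numbers in perspective, here is a small sketch that
summarises the ten runs. It assumes a POSIX shell with awk; the
function name "summarise" is only illustrative, and "MB/s" is simply
the 500MB file size divided by the copy time, not a measured device
rate.

```shell
#!/bin/sh
# summarise SIZE_MB BASELINE_SECONDS TIME...
# Prints min/max/mean copy time and the mean throughput compared
# with the baseline ("good" kernel) copy time.
summarise() {
    size_mb=$1; baseline=$2; shift 2
    printf '%s\n' "$@" | awk -v size="$size_mb" -v base="$baseline" '
        {
            sum += $1
            if (NR == 1 || $1 < min) min = $1
            if ($1 > max) max = $1
        }
        END {
            mean = sum / NR
            printf "runs=%d min=%ds max=%ds mean=%.1fs\n", NR, min, max, mean
            printf "mean throughput=%.2f MB/s (baseline %.2f MB/s, %.1fx slower)\n",
                   size / mean, size / base, mean / base
        }'
}

summarise 500 30 273 108 104 260 177 236 179 1123 289 873
# runs=10 min=104s max=1123s mean=362.2s
# mean throughput=1.38 MB/s (baseline 16.67 MB/s, 12.1x slower)
```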

Let me know what else I could do.

Thanks, and bye
Andrea


^ permalink raw reply	[flat|nested] 102+ messages in thread

* Re: Slow I/O on USB media after commit f664a3cc17b7d0a2bc3b3ab96181e1029b0ec0e6
  2019-11-05 11:48                                 ` Andrea Vai
@ 2019-11-05 18:31                                   ` Alan Stern
  2019-11-05 23:29                                     ` Jens Axboe
  0 siblings, 1 reply; 102+ messages in thread
From: Alan Stern @ 2019-11-05 18:31 UTC (permalink / raw)
  To: Andrea Vai
  Cc: Jens Axboe, Damien Le Moal, Johannes Thumshirn, USB list,
	SCSI development list, Himanshu Madhani, Hannes Reinecke,
	Ming Lei, Omar Sandoval, Martin K. Petersen, Greg KH,
	Hans Holmberg, Kernel development list

On Tue, 5 Nov 2019, Andrea Vai wrote:

> On Mon, 04/11/2019 at 13.20 -0500, Alan Stern wrote:

> > You should be able to do something like this:
> > 
> >         cd linux
> >         patch -p1 </path/to/patch2
> > 
> > and that should work with no errors.  You don't need to use git to 
> > apply a patch.
> > 
> > In case that patch2 file was mangled somewhere along the way, I
> > have 
> > attached a copy to this message.
> 
> Ok, so the "patch" command worked, the kernel compiled and ran, but
> the test still failed (273, 108, 104, 260, 177, 236, 179, 1123, 289,
> 873 seconds to copy a 500MB file, vs. ~30 seconds with the "good"
> kernel).
> 
> Let me know what else could I do,

I'm out of suggestions.  If anyone else knows how to make a kernel with 
no legacy queuing support -- only multiqueue -- issue I/O requests 
sequentially, please speak up.

In the absence of any responses, after a week or so I will submit a
patch to revert the f664a3cc17b7 ("scsi: kill off the legacy IO path")  
commit.

Alan Stern


^ permalink raw reply	[flat|nested] 102+ messages in thread

* Re: Slow I/O on USB media after commit f664a3cc17b7d0a2bc3b3ab96181e1029b0ec0e6
  2019-11-05 18:31                                   ` Alan Stern
@ 2019-11-05 23:29                                     ` Jens Axboe
  2019-11-06 16:03                                       ` Alan Stern
  0 siblings, 1 reply; 102+ messages in thread
From: Jens Axboe @ 2019-11-05 23:29 UTC (permalink / raw)
  To: Alan Stern, Andrea Vai
  Cc: Damien Le Moal, Johannes Thumshirn, USB list,
	SCSI development list, Himanshu Madhani, Hannes Reinecke,
	Ming Lei, Omar Sandoval, Martin K. Petersen, Greg KH,
	Hans Holmberg, Kernel development list

On 11/5/19 11:31 AM, Alan Stern wrote:
> On Tue, 5 Nov 2019, Andrea Vai wrote:
> 
>> On Mon, 04/11/2019 at 13.20 -0500, Alan Stern wrote:
> 
>>> You should be able to do something like this:
>>>
>>>          cd linux
>>>          patch -p1 </path/to/patch2
>>>
>>> and that should work with no errors.  You don't need to use git to
>>> apply a patch.
>>>
>>> In case that patch2 file was mangled somewhere along the way, I
>>> have
>>> attached a copy to this message.
>>
>> Ok, so the "patch" command worked, the kernel compiled and ran, but
>> the test still failed (273, 108, 104, 260, 177, 236, 179, 1123, 289,
>> 873 seconds to copy a 500MB file, vs. ~30 seconds with the "good"
>> kernel).
>>
>> Let me know what else could I do,
> 
> I'm out of suggestions.  If anyone else knows how to make a kernel with
> no legacy queuing support -- only multiqueue -- issue I/O requests
> sequentially, please speak up.

Do we know for a fact that the device needs strictly serialized requests
to not stall? And writes in particular? I won't comment on how broken
that is, just trying to establish this as the problem that's making this
particular device be slow?

I've lost track of this thread, but has mq-deadline been tried as the
IO scheduler? We do have support for strictly serialized (writes)
since that's required for zoned device, wouldn't be hard at all to make
this cover a blacklisted device like this one.

> In the absence of any responses, after a week or so I will submit a
> patch to revert the f664a3cc17b7 ("scsi: kill off the legacy IO path")
> commit.

That's not going to be feasible.

-- 
Jens Axboe


^ permalink raw reply	[flat|nested] 102+ messages in thread

* Re: Slow I/O on USB media after commit f664a3cc17b7d0a2bc3b3ab96181e1029b0ec0e6
  2019-11-05 23:29                                     ` Jens Axboe
@ 2019-11-06 16:03                                       ` Alan Stern
  2019-11-06 22:13                                         ` Damien Le Moal
  0 siblings, 1 reply; 102+ messages in thread
From: Alan Stern @ 2019-11-06 16:03 UTC (permalink / raw)
  To: Jens Axboe
  Cc: Andrea Vai, Damien Le Moal, Johannes Thumshirn, USB list,
	SCSI development list, Himanshu Madhani, Hannes Reinecke,
	Ming Lei, Omar Sandoval, Martin K. Petersen, Greg KH,
	Hans Holmberg, Kernel development list

On Tue, 5 Nov 2019, Jens Axboe wrote:

> On 11/5/19 11:31 AM, Alan Stern wrote:
> > On Tue, 5 Nov 2019, Andrea Vai wrote:
> > 
> >> On Mon, 04/11/2019 at 13.20 -0500, Alan Stern wrote:
> > 
> >>> You should be able to do something like this:
> >>>
> >>>          cd linux
> >>>          patch -p1 </path/to/patch2
> >>>
> >>> and that should work with no errors.  You don't need to use git to
> >>> apply a patch.
> >>>
> >>> In case that patch2 file was mangled somewhere along the way, I
> >>> have
> >>> attached a copy to this message.
> >>
> >> Ok, so the "patch" command worked, the kernel compiled and ran, but
> >> the test still failed (273, 108, 104, 260, 177, 236, 179, 1123, 289,
> >> 873 seconds to copy a 500MB file, vs. ~30 seconds with the "good"
> >> kernel).
> >>
> >> Let me know what else could I do,
> > 
> > I'm out of suggestions.  If anyone else knows how to make a kernel with
> > no legacy queuing support -- only multiqueue -- issue I/O requests
> > sequentially, please speak up.
> 
> Do we know for a fact that the device needs strictly serialized requests
> to not stall?

Not exactly, but that is far and away the most likely explanation for
the device's behavior.  We tried making a bunch of changes, some of
which helped a little bit, but all of them left a very large
performance gap.  I/O monitoring showed that the only noticeable
difference in the kernel-device interaction caused by the $SUBJECT
commit was the non-sequential access pattern.
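
As an illustration of what "non-sequential" means here, the sketch
below counts how many writes in a log do not start where the previous
write ended.  The input format (one write per line, "start_sector
sector_count") is hypothetical; it is not the format of any trace
posted in this thread, and a real usbmon or blktrace log would have to
be reduced to it first.  The function name is made up as well.

```shell
#!/bin/sh
# count_nonsequential FILE
# FILE holds one write per line: "<start_sector> <sector_count>".
# Prints the number of writes whose start sector differs from the
# end of the write before it.
count_nonsequential() {
    awk '
        NR > 1 && $1 != next_lba { n++ }
        { next_lba = $1 + $2 }
        END { print n + 0 }
    ' "$1"
}
```

A count of zero means the log is purely sequential; a large count on
one kernel but not the other would match the pattern described above.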

> And writes in particular?

Andrea has tested only the write behavior.  Possibly reading will be
affected too, but my guess is that it won't be.

> I won't comment on how broken
> that is, just trying to establish this as the problem that's making this
> particular device be slow?

It seems reasonable that the access pattern could make a significant
difference.  The device's behavior suggests that it buffers incoming
data and pauses from time to time to write the accumulated data into
non-volatile storage.  If its algorithm for allocating, erasing, and
writing data blocks is optimized for the sequential case, you can
easily imagine that non-sequential accesses would cause it to pause
more often and for longer times -- which is exactly what we observed.
These extra pauses are what resulted in the overall performance 
decrease.

So far we have had no way to perform a direct test.  That is, we don't
know of any setting that would change a single kernel between
sequential and non-sequential access.  If you can suggest a simple way
to force a kernel without the $SUBJECT commit to do non-sequential
writes, I'm sure Andrea will be happy to try it out and see if it
causes a slowdown.

> I've lost track of this thread, but has mq-deadline been tried as the
> IO scheduler? We do have support for strictly serialized (writes)
> since that's required for zoned device, wouldn't be hard at all to make
> this cover a blacklisted device like this one.

Please spell out the exact procedure in detail so that Andrea can try 
it.  He's not a kernel hacker, and I know very little about the block 
layer.

Alan Stern


^ permalink raw reply	[flat|nested] 102+ messages in thread

* Re: Slow I/O on USB media after commit f664a3cc17b7d0a2bc3b3ab96181e1029b0ec0e6
  2019-11-06 16:03                                       ` Alan Stern
@ 2019-11-06 22:13                                         ` Damien Le Moal
  2019-11-07  7:04                                           ` Andrea Vai
  0 siblings, 1 reply; 102+ messages in thread
From: Damien Le Moal @ 2019-11-06 22:13 UTC (permalink / raw)
  To: Alan Stern, Jens Axboe
  Cc: Andrea Vai, Johannes Thumshirn, USB list, SCSI development list,
	Himanshu Madhani, Hannes Reinecke, Ming Lei, Omar Sandoval,
	Martin K. Petersen, Greg KH, Hans Holmberg,
	Kernel development list

On 2019/11/07 1:04, Alan Stern wrote:
> On Tue, 5 Nov 2019, Jens Axboe wrote:
> 
>> On 11/5/19 11:31 AM, Alan Stern wrote:
>>> On Tue, 5 Nov 2019, Andrea Vai wrote:
>>>
>>>> On Mon, 04/11/2019 at 13.20 -0500, Alan Stern wrote:
>>>
>>>>> You should be able to do something like this:
>>>>>
>>>>>          cd linux
>>>>>          patch -p1 </path/to/patch2
>>>>>
>>>>> and that should work with no errors.  You don't need to use git to
>>>>> apply a patch.
>>>>>
>>>>> In case that patch2 file was mangled somewhere along the way, I
>>>>> have
>>>>> attached a copy to this message.
>>>>
>>>> Ok, so the "patch" command worked, the kernel compiled and ran, but
>>>> the test still failed (273, 108, 104, 260, 177, 236, 179, 1123, 289,
>>>> 873 seconds to copy a 500MB file, vs. ~30 seconds with the "good"
>>>> kernel).
>>>>
>>>> Let me know what else could I do,
>>>
>>> I'm out of suggestions.  If anyone else knows how to make a kernel with
>>> no legacy queuing support -- only multiqueue -- issue I/O requests
>>> sequentially, please speak up.
>>
>> Do we know for a fact that the device needs strictly serialized requests
>> to not stall?
> 
> Not exactly, but that is far and away the most likely explanation for
> the device's behavior.  We tried making a bunch of changes, some of
> which helped a little bit, but all of them left a very large
> performance gap.  I/O monitoring showed that the only noticeable
> difference in the kernel-device interaction caused by the $SUBJECT
> commit was the non-sequential access pattern.
> 
>> And writes in particular?
> 
> Andrea has tested only the write behavior.  Possibly reading will be
> affected too, but my guess is that it won't be.
> 
>> I won't comment on how broken
>> that is, just trying to establish this as the problem that's making this
>> particular device be slow?
> 
> It seems reasonable that the access pattern could make a significant
> difference.  The device's behavior suggests that it buffers incoming
> data and pauses from time to time to write the accumulated data into
> non-volatile storage.  If its algorithm for allocating, erasing, and
> writing data blocks is optimized for the sequential case, you can
> easily imagine that non-sequential accesses would cause it to pause
> more often and for longer times -- which is exactly what we observed.
> These extra pauses are what resulted in the overall performance 
> decrease.
> 
> So far we have had no way to perform a direct test.  That is, we don't
> know of any setting that would change a single kernel between
> sequential and non-sequential access.  If you can suggest a simple way
> to force a kernel without the $SUBJECT commit to do non-sequential
> writes, I'm sure Andrea will be happy to try it out and see if it
> causes a slowdown.
> 
>> I've lost track of this thread, but has mq-deadline been tried as the
>> IO scheduler? We do have support for strictly serialized (writes)
>> since that's required for zoned device, wouldn't be hard at all to make
>> this cover a blacklisted device like this one.
> 
> Please spell out the exact procedure in detail so that Andrea can try 
> it.  He's not a kernel hacker, and I know very little about the block 
> layer.

Please simply try your write tests after doing this:

echo mq-deadline > /sys/block/<name of your USB disk>/queue/scheduler

And confirm that mq-deadline is selected with:

cat /sys/block/<name of your USB disk>/queue/scheduler
[mq-deadline] kyber bfq none
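
The two commands above can be wrapped in a small helper that also checks
the result. This is only a sketch: the function names are hypothetical,
writing to sysfs needs root, and it assumes the active scheduler is the
one shown in square brackets, as in the sample output above.

```shell
# Print the active scheduler (the bracketed name) from the contents of
# a queue/scheduler file read on stdin.
active_scheduler() {
    sed -n 's/.*\[\([^]]*\)].*/\1/p'
}

# Select a scheduler for a disk and confirm that the kernel accepted it.
# Hypothetical helper; needs root. Usage: set_scheduler sdh mq-deadline
set_scheduler() {
    disk=$1
    want=$2
    file=/sys/block/$disk/queue/scheduler
    echo "$want" > "$file" || return 1
    [ "$(active_scheduler < "$file")" = "$want" ]
}
```

For example, "set_scheduler sdh mq-deadline && echo ok" prints "ok" only
if the scheduler file then reports [mq-deadline].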


> 
> Alan Stern
> 
> 


-- 
Damien Le Moal
Western Digital Research


* Re: Slow I/O on USB media after commit f664a3cc17b7d0a2bc3b3ab96181e1029b0ec0e6
  2019-11-06 22:13                                         ` Damien Le Moal
@ 2019-11-07  7:04                                           ` Andrea Vai
  2019-11-07  7:54                                             ` Damien Le Moal
  0 siblings, 1 reply; 102+ messages in thread
From: Andrea Vai @ 2019-11-07  7:04 UTC (permalink / raw)
  To: Damien Le Moal, Alan Stern, Jens Axboe
  Cc: Johannes Thumshirn, USB list, SCSI development list,
	Himanshu Madhani, Hannes Reinecke, Ming Lei, Omar Sandoval,
	Martin K. Petersen, Greg KH, Hans Holmberg,
	Kernel development list

On Wed, 06/11/2019 at 22.13 +0000, Damien Le Moal wrote:
> 
> 
> Please simply try your write tests after doing this:
> 
> echo mq-deadline > /sys/block/<name of your USB
> disk>/queue/scheduler
> 
> And confirm that mq-deadline is selected with:
> 
> cat /sys/block/<name of your USB disk>/queue/scheduler
> [mq-deadline] kyber bfq none

OK, which kernel should I test this with: the freshly cloned git tree,
the one patched with Alan's patch, or does it not matter?

Thanks, and bye,
Andrea



* Re: Slow I/O on USB media after commit f664a3cc17b7d0a2bc3b3ab96181e1029b0ec0e6
  2019-11-07  7:04                                           ` Andrea Vai
@ 2019-11-07  7:54                                             ` Damien Le Moal
  2019-11-07 18:59                                               ` Andrea Vai
  0 siblings, 1 reply; 102+ messages in thread
From: Damien Le Moal @ 2019-11-07  7:54 UTC (permalink / raw)
  To: Andrea Vai, Alan Stern, Jens Axboe
  Cc: Johannes Thumshirn, USB list, SCSI development list,
	Himanshu Madhani, Hannes Reinecke, Ming Lei, Omar Sandoval,
	Martin K. Petersen, Greg KH, Hans Holmberg,
	Kernel development list

On 2019/11/07 16:04, Andrea Vai wrote:
> Il giorno mer, 06/11/2019 alle 22.13 +0000, Damien Le Moal ha scritto:
>>
>>
>> Please simply try your write tests after doing this:
>>
>> echo mq-deadline > /sys/block/<name of your USB
>> disk>/queue/scheduler
>>
>> And confirm that mq-deadline is selected with:
>>
>> cat /sys/block/<name of your USB disk>/queue/scheduler
>> [mq-deadline] kyber bfq none
> 
> ok, which kernel should I test with this: the fresh git cloned, or the
> one just patched with Alan's patch, or doesn't matter which one?

Probably all of them to see if there are any differences.

> 
> Thanks, and bye,
> Andrea
> 
> 


-- 
Damien Le Moal
Western Digital Research


* Re: Slow I/O on USB media after commit f664a3cc17b7d0a2bc3b3ab96181e1029b0ec0e6
  2019-11-07  7:54                                             ` Damien Le Moal
@ 2019-11-07 18:59                                               ` Andrea Vai
  2019-11-08  8:42                                                 ` Damien Le Moal
  2019-11-09 22:28                                                 ` Ming Lei
  0 siblings, 2 replies; 102+ messages in thread
From: Andrea Vai @ 2019-11-07 18:59 UTC (permalink / raw)
  To: Damien Le Moal
  Cc: Alan Stern, Jens Axboe, Johannes Thumshirn, USB list,
	SCSI development list, Himanshu Madhani, Hannes Reinecke,
	Ming Lei, Omar Sandoval, Martin K. Petersen, Greg KH,
	Hans Holmberg, Kernel development list

[Sorry for the duplicate message, it didn't reach the lists due to
html formatting]
On Thu, 7 Nov 2019 at 08:54, Damien Le Moal
<Damien.LeMoal@wdc.com> wrote:
>
> On 2019/11/07 16:04, Andrea Vai wrote:
> > Il giorno mer, 06/11/2019 alle 22.13 +0000, Damien Le Moal ha scritto:
> >>
> >>
> >> Please simply try your write tests after doing this:
> >>
> >> echo mq-deadline > /sys/block/<name of your USB
> >> disk>/queue/scheduler
> >>
> >> And confirm that mq-deadline is selected with:
> >>
> >> cat /sys/block/<name of your USB disk>/queue/scheduler
> >> [mq-deadline] kyber bfq none
> >
> > ok, which kernel should I test with this: the fresh git cloned, or the
> > one just patched with Alan's patch, or doesn't matter which one?
>
> Probably all of them to see if there are any differences.

With both kernels, the output of
cat /sys/block/sdh/queue/scheduler

already contains [mq-deadline]: is it correct to assume that the echo
command and the subsequent testing are pointless? What should I do now?

Thanks, and bye
Andrea


* Re: Slow I/O on USB media after commit f664a3cc17b7d0a2bc3b3ab96181e1029b0ec0e6
  2019-11-07 18:59                                               ` Andrea Vai
@ 2019-11-08  8:42                                                 ` Damien Le Moal
  2019-11-08 14:33                                                   ` Jens Axboe
  2019-11-09 10:09                                                   ` Ming Lei
  2019-11-09 22:28                                                 ` Ming Lei
  1 sibling, 2 replies; 102+ messages in thread
From: Damien Le Moal @ 2019-11-08  8:42 UTC (permalink / raw)
  To: Andrea Vai
  Cc: Alan Stern, Jens Axboe, Johannes Thumshirn, USB list,
	SCSI development list, Himanshu Madhani, Hannes Reinecke,
	Ming Lei, Omar Sandoval, Martin K. Petersen, Greg KH,
	Hans Holmberg, Kernel development list

On 2019/11/08 4:00, Andrea Vai wrote:
> [Sorry for the duplicate message, it didn't reach the lists due to
> html formatting]
> Il giorno gio 7 nov 2019 alle ore 08:54 Damien Le Moal
> <Damien.LeMoal@wdc.com> ha scritto:
>>
>> On 2019/11/07 16:04, Andrea Vai wrote:
>>> Il giorno mer, 06/11/2019 alle 22.13 +0000, Damien Le Moal ha scritto:
>>>>
>>>>
>>>> Please simply try your write tests after doing this:
>>>>
>>>> echo mq-deadline > /sys/block/<name of your USB
>>>> disk>/queue/scheduler
>>>>
>>>> And confirm that mq-deadline is selected with:
>>>>
>>>> cat /sys/block/<name of your USB disk>/queue/scheduler
>>>> [mq-deadline] kyber bfq none
>>>
>>> ok, which kernel should I test with this: the fresh git cloned, or the
>>> one just patched with Alan's patch, or doesn't matter which one?
>>
>> Probably all of them to see if there are any differences.
> 
> with both kernels, the output of
> cat /sys/block/sdh/queue/schedule
> 
> already contains [mq-deadline]: is it correct to assume that the echo
> command and the subsequent testing is useless? What to do now?

Probably, yes. Have you obtained a blktrace of the workload during these
tests? Is there any significant difference in the IO pattern (IO size and
randomness) or in the IO timing (any idle time where the device has no
command to process)? I am asking because the problem may be above the
block layer, in the file system for instance.
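
One rough way to quantify "randomness" from a parsed trace is to count
dispatched writes that do not start where the previous one ended. The
sketch below assumes the default blkparse line layout (device, cpu,
sequence, timestamp, pid, action, rwbs, sector, "+", sector count); the
function name is hypothetical and a custom blkparse format would need
different field numbers.

```shell
# Count dispatched writes and how many of them do not start where the
# previous dispatched write ended.
# Assumes the default blkparse line layout:
#   dev cpu seq time pid action rwbs sector + nsectors [process]
count_nonsequential_writes() {
    awk '$6 == "D" && $7 ~ /W/ && $9 == "+" {
             total++
             if (seen && $8 != next_sector) jumps++
             next_sector = $8 + $10
             seen = 1
         }
         END { printf "writes=%d nonsequential=%d\n", total, jumps }' "$@"
}
```

Running it on the parsed output of a fast run and of a slow run gives two
counts that can be compared directly.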

> 
> Thanks, and bye
> Andrea
> 


-- 
Damien Le Moal
Western Digital Research


* Re: Slow I/O on USB media after commit f664a3cc17b7d0a2bc3b3ab96181e1029b0ec0e6
  2019-11-08  8:42                                                 ` Damien Le Moal
@ 2019-11-08 14:33                                                   ` Jens Axboe
  2019-11-11 10:46                                                     ` Andrea Vai
  2019-11-09 10:09                                                   ` Ming Lei
  1 sibling, 1 reply; 102+ messages in thread
From: Jens Axboe @ 2019-11-08 14:33 UTC (permalink / raw)
  To: Damien Le Moal, Andrea Vai
  Cc: Alan Stern, Johannes Thumshirn, USB list, SCSI development list,
	Himanshu Madhani, Hannes Reinecke, Ming Lei, Omar Sandoval,
	Martin K. Petersen, Greg KH, Hans Holmberg,
	Kernel development list

On 11/8/19 1:42 AM, Damien Le Moal wrote:
> On 2019/11/08 4:00, Andrea Vai wrote:
>> [Sorry for the duplicate message, it didn't reach the lists due to
>> html formatting]
>> Il giorno gio 7 nov 2019 alle ore 08:54 Damien Le Moal
>> <Damien.LeMoal@wdc.com> ha scritto:
>>>
>>> On 2019/11/07 16:04, Andrea Vai wrote:
>>>> Il giorno mer, 06/11/2019 alle 22.13 +0000, Damien Le Moal ha scritto:
>>>>>
>>>>>
>>>>> Please simply try your write tests after doing this:
>>>>>
>>>>> echo mq-deadline > /sys/block/<name of your USB
>>>>> disk>/queue/scheduler
>>>>>
>>>>> And confirm that mq-deadline is selected with:
>>>>>
>>>>> cat /sys/block/<name of your USB disk>/queue/scheduler
>>>>> [mq-deadline] kyber bfq none
>>>>
>>>> ok, which kernel should I test with this: the fresh git cloned, or the
>>>> one just patched with Alan's patch, or doesn't matter which one?
>>>
>>> Probably all of them to see if there are any differences.
>>
>> with both kernels, the output of
>> cat /sys/block/sdh/queue/schedule
>>
>> already contains [mq-deadline]: is it correct to assume that the echo
>> command and the subsequent testing is useless? What to do now?
> 
> Probably, yes. Have you obtained a blktrace of the workload during these
> tests ? Any significant difference in the IO pattern (IO size and
> randomness) and IO timing (any device idle time where the device has no
> command to process) ? Asking because the problem may be above the block
> layer, with the file system for instance.

blktrace would indeed be super useful, especially if you can do that
with a kernel that's fast for you, and one with the current kernel
where it's slow.

Given that your device is sdh, you simply do:

# blktrace /dev/sdh

and then run the test, then ctrl-c the blktrace. Then do:

# blkparse sdh > output

and save that output file. Do both runs, and bzip2 them up. The shorter
the run you can reproduce with the better, to cut down on the size of
the traces.
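
The steps above could be scripted roughly as follows. This is a sketch,
not a tested procedure: the "good"/"bad" labels and the fixed trace
duration are assumptions, and it uses blktrace's -w option to stop after
a number of seconds instead of ctrl-c. Setting DRY_RUN only prints the
commands.

```shell
# Run a command, or only print it when DRY_RUN is set.
run() {
    if [ -n "${DRY_RUN:-}" ]; then echo "$*"; else "$@"; fi
}

# Trace DEV for SECS seconds, then parse and compress the result.
# LABEL (e.g. "good" or "bad") is a hypothetical naming convention.
# Usage (as root): trace_run sdh good 60
trace_run() {
    dev=$1
    label=$2
    secs=$3
    run blktrace -d "/dev/$dev" -o "$label" -w "$secs" &&
    run blkparse -i "$label" -o "$label.txt" &&
    run bzip2 -f "$label.txt"
}
```

The copy test has to be started from another terminal while the trace is
running, and the duration should be long enough to cover it.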

-- 
Jens Axboe



* Re: Slow I/O on USB media after commit f664a3cc17b7d0a2bc3b3ab96181e1029b0ec0e6
  2019-11-08  8:42                                                 ` Damien Le Moal
  2019-11-08 14:33                                                   ` Jens Axboe
@ 2019-11-09 10:09                                                   ` Ming Lei
  1 sibling, 0 replies; 102+ messages in thread
From: Ming Lei @ 2019-11-09 10:09 UTC (permalink / raw)
  To: Damien Le Moal
  Cc: Andrea Vai, Alan Stern, Jens Axboe, Johannes Thumshirn, USB list,
	SCSI development list, Himanshu Madhani, Hannes Reinecke,
	Omar Sandoval, Martin K. Petersen, Greg KH, Hans Holmberg,
	Kernel development list

On Fri, Nov 08, 2019 at 08:42:53AM +0000, Damien Le Moal wrote:
> On 2019/11/08 4:00, Andrea Vai wrote:
> > [Sorry for the duplicate message, it didn't reach the lists due to
> > html formatting]
> > Il giorno gio 7 nov 2019 alle ore 08:54 Damien Le Moal
> > <Damien.LeMoal@wdc.com> ha scritto:
> >>
> >> On 2019/11/07 16:04, Andrea Vai wrote:
> >>> Il giorno mer, 06/11/2019 alle 22.13 +0000, Damien Le Moal ha scritto:
> >>>>
> >>>>
> >>>> Please simply try your write tests after doing this:
> >>>>
> >>>> echo mq-deadline > /sys/block/<name of your USB
> >>>> disk>/queue/scheduler
> >>>>
> >>>> And confirm that mq-deadline is selected with:
> >>>>
> >>>> cat /sys/block/<name of your USB disk>/queue/scheduler
> >>>> [mq-deadline] kyber bfq none
> >>>
> >>> ok, which kernel should I test with this: the fresh git cloned, or the
> >>> one just patched with Alan's patch, or doesn't matter which one?
> >>
> >> Probably all of them to see if there are any differences.
> > 
> > with both kernels, the output of
> > cat /sys/block/sdh/queue/schedule
> > 
> > already contains [mq-deadline]: is it correct to assume that the echo
> > command and the subsequent testing is useless? What to do now?
> 
> Probably, yes. Have you obtained a blktrace of the workload during these
> tests ? Any significant difference in the IO pattern (IO size and
> randomness) and IO timing (any device idle time where the device has no
> command to process) ? Asking because the problem may be above the block
> layer, with the file system for instance.

You may get the IO pattern via the previous trace 

https://lore.kernel.org/linux-usb/20190710024439.GA2621@ming.t460p/

IMO, if it is related to write order, one possibility is that
the queue lock has been removed from .make_request_fn().


Thanks,
Ming



* Re: Slow I/O on USB media after commit f664a3cc17b7d0a2bc3b3ab96181e1029b0ec0e6
  2019-11-07 18:59                                               ` Andrea Vai
  2019-11-08  8:42                                                 ` Damien Le Moal
@ 2019-11-09 22:28                                                 ` Ming Lei
  2019-11-11 10:50                                                   ` Andrea Vai
  2019-11-22 19:16                                                   ` Andrea Vai
  1 sibling, 2 replies; 102+ messages in thread
From: Ming Lei @ 2019-11-09 22:28 UTC (permalink / raw)
  To: Andrea Vai
  Cc: Damien Le Moal, Alan Stern, Jens Axboe, Johannes Thumshirn,
	USB list, SCSI development list, Himanshu Madhani,
	Hannes Reinecke, Omar Sandoval, Martin K. Petersen, Greg KH,
	Hans Holmberg, Kernel development list

On Thu, Nov 07, 2019 at 07:59:44PM +0100, Andrea Vai wrote:
> [Sorry for the duplicate message, it didn't reach the lists due to
> html formatting]
> Il giorno gio 7 nov 2019 alle ore 08:54 Damien Le Moal
> <Damien.LeMoal@wdc.com> ha scritto:
> >
> > On 2019/11/07 16:04, Andrea Vai wrote:
> > > Il giorno mer, 06/11/2019 alle 22.13 +0000, Damien Le Moal ha scritto:
> > >>
> > >>
> > >> Please simply try your write tests after doing this:
> > >>
> > >> echo mq-deadline > /sys/block/<name of your USB
> > >> disk>/queue/scheduler
> > >>
> > >> And confirm that mq-deadline is selected with:
> > >>
> > >> cat /sys/block/<name of your USB disk>/queue/scheduler
> > >> [mq-deadline] kyber bfq none
> > >
> > > ok, which kernel should I test with this: the fresh git cloned, or the
> > > one just patched with Alan's patch, or doesn't matter which one?
> >
> > Probably all of them to see if there are any differences.
> 
> with both kernels, the output of
> cat /sys/block/sdh/queue/schedule
> 
> already contains [mq-deadline]: is it correct to assume that the echo
> command and the subsequent testing is useless? What to do now?

Another thing we could try is to use 'none' via the following command:

 echo none > /sys/block/sdh/queue/scheduler  #suppose 'sdh' points to the usb storage disk

The USB storage HBA has a single hw queue whose depth is 1, so this
should make the kernel dispatch IO in the order of bio submission.

Andrea, could you switch the io scheduler to none and let us know
whether it makes a difference?
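
A comparison across all schedulers offered by the new kernel could be
sketched like this. The helper names, the use of cp plus sync as the
timed workload, and the mount point argument are assumptions; it needs
root and says nothing about run-to-run variation.

```shell
# List every scheduler named in a queue/scheduler line read on stdin.
available_schedulers() {
    tr -d '[]' | tr -s ' ' '\n' | sed '/^$/d'
}

# Time one copy per scheduler. Hypothetical helper; needs root.
# Usage: compare_schedulers sdh /path/to/file /mnt/usb
compare_schedulers() {
    disk=$1
    src=$2
    mnt=$3
    file=/sys/block/$disk/queue/scheduler
    for s in $(available_schedulers < "$file"); do
        echo "$s" > "$file" || continue
        start=$(date +%s)
        cp "$src" "$mnt/" && sync
        echo "$s: $(( $(date +%s) - start )) s"
    done
}
```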

Thanks,
Ming



* Re: Slow I/O on USB media after commit f664a3cc17b7d0a2bc3b3ab96181e1029b0ec0e6
  2019-11-08 14:33                                                   ` Jens Axboe
@ 2019-11-11 10:46                                                     ` Andrea Vai
  0 siblings, 0 replies; 102+ messages in thread
From: Andrea Vai @ 2019-11-11 10:46 UTC (permalink / raw)
  To: Jens Axboe, Damien Le Moal
  Cc: Alan Stern, Johannes Thumshirn, USB list, SCSI development list,
	Himanshu Madhani, Hannes Reinecke, Ming Lei, Omar Sandoval,
	Martin K. Petersen, Greg KH, Hans Holmberg,
	Kernel development list

On Fri, 08/11/2019 at 07.33 -0700, Jens Axboe wrote:
> On 11/8/19 1:42 AM, Damien Le Moal wrote:
> > On 2019/11/08 4:00, Andrea Vai wrote:
> >> [Sorry for the duplicate message, it didn't reach the lists due
> to
> >> html formatting]
> >> Il giorno gio 7 nov 2019 alle ore 08:54 Damien Le Moal
> >> <Damien.LeMoal@wdc.com> ha scritto:
> >>>
> >>> On 2019/11/07 16:04, Andrea Vai wrote:
> >>>> Il giorno mer, 06/11/2019 alle 22.13 +0000, Damien Le Moal ha
> scritto:
> >>>>>
> >>>>>
> >>>>> Please simply try your write tests after doing this:
> >>>>>
> >>>>> echo mq-deadline > /sys/block/<name of your USB
> >>>>> disk>/queue/scheduler
> >>>>>
> >>>>> And confirm that mq-deadline is selected with:
> >>>>>
> >>>>> cat /sys/block/<name of your USB disk>/queue/scheduler
> >>>>> [mq-deadline] kyber bfq none
> >>>>
> >>>> ok, which kernel should I test with this: the fresh git cloned,
> or the
> >>>> one just patched with Alan's patch, or doesn't matter which
> one?
> >>>
> >>> Probably all of them to see if there are any differences.
> >>
> >> with both kernels, the output of
> >> cat /sys/block/sdh/queue/schedule
> >>
> >> already contains [mq-deadline]: is it correct to assume that the
> echo
> >> command and the subsequent testing is useless? What to do now?
> > 
> > Probably, yes. Have you obtained a blktrace of the workload during
> these
> > tests ? Any significant difference in the IO pattern (IO size and
> > randomness) and IO timing (any device idle time where the device
> has no
> > command to process) ? Asking because the problem may be above the
> block
> > layer, with the file system for instance.
> 
> blktrace would indeed be super useful, especially if you can do that
> with a kernel that's fast for you, and one with the current kernel
> where it's slow.
> 
> Given that your device is sdh, you simply do:
> 
> # blktrace /dev/sdh
> 
> and then run the test, then ctrl-c the blktrace. Then do:
> 
> # blkparse sdh > output
> 
> and save that output file. Do both runs, and bzip2 them up. The
> shorter
> the run you can reproduce with the better, to cut down on the size
> of
> the traces.

Sorry, the next message from Ming...

-----
You may get the IO pattern via the previous trace 
https://lore.kernel.org/linux-usb/20190710024439.GA2621@ming.t460p/

IMO, if it is related write order, one possibility could be that
the queue lock is killed in .make_request_fn().
-----

...made me wonder whether I should really do the blktrace/blkparse test
or not. So please confirm whether it's needed (testing is quite
time-consuming, so I'd like to do it only if it's needed).

Thanks, and bye,
Andrea



* Re: Slow I/O on USB media after commit f664a3cc17b7d0a2bc3b3ab96181e1029b0ec0e6
  2019-11-09 22:28                                                 ` Ming Lei
@ 2019-11-11 10:50                                                   ` Andrea Vai
  2019-11-11 11:05                                                     ` Ming Lei
  2019-11-22 19:16                                                   ` Andrea Vai
  1 sibling, 1 reply; 102+ messages in thread
From: Andrea Vai @ 2019-11-11 10:50 UTC (permalink / raw)
  To: Ming Lei
  Cc: Damien Le Moal, Alan Stern, Jens Axboe, Johannes Thumshirn,
	USB list, SCSI development list, Himanshu Madhani,
	Hannes Reinecke, Omar Sandoval, Martin K. Petersen, Greg KH,
	Hans Holmberg, Kernel development list

On Sun, 10/11/2019 at 06.28 +0800, Ming Lei wrote:
> On Thu, Nov 07, 2019 at 07:59:44PM +0100, Andrea Vai wrote:
> > [Sorry for the duplicate message, it didn't reach the lists due to
> > html formatting]
> > Il giorno gio 7 nov 2019 alle ore 08:54 Damien Le Moal
> > <Damien.LeMoal@wdc.com> ha scritto:
> > >
> > > On 2019/11/07 16:04, Andrea Vai wrote:
> > > > Il giorno mer, 06/11/2019 alle 22.13 +0000, Damien Le Moal ha
> scritto:
> > > >>
> > > >>
> > > >> Please simply try your write tests after doing this:
> > > >>
> > > >> echo mq-deadline > /sys/block/<name of your USB
> > > >> disk>/queue/scheduler
> > > >>
> > > >> And confirm that mq-deadline is selected with:
> > > >>
> > > >> cat /sys/block/<name of your USB disk>/queue/scheduler
> > > >> [mq-deadline] kyber bfq none
> > > >
> > > > ok, which kernel should I test with this: the fresh git
> cloned, or the
> > > > one just patched with Alan's patch, or doesn't matter which
> one?
> > >
> > > Probably all of them to see if there are any differences.
> > 
> > with both kernels, the output of
> > cat /sys/block/sdh/queue/schedule
> > 
> > already contains [mq-deadline]: is it correct to assume that the
> echo
> > command and the subsequent testing is useless? What to do now?
> 
> Another thing we could try is to use 'none' via the following
> command:
> 
>  echo none > /sys/block/sdh/queue/scheduler  #suppose 'sdh' points
> to the usb storage disk
> 
> Because USB storage HBA is single hw queue, which depth is 1. This
> way
> should change to dispatch IO in the order of bio submission.
> 
> Andrea, could you switch io scheduler to none and update us if
> difference
> can be made?

Of course I will do it, but I see that with the "good" kernel the
output of "cat /sys/block/sdf/queue/scheduler" (yes, now it's sdf) is

noop deadline [cfq]

, i.e. it doesn't show "none". Does it matter? (sorry if it's a silly
question)

Thanks, and bye
Andrea




* Re: Slow I/O on USB media after commit f664a3cc17b7d0a2bc3b3ab96181e1029b0ec0e6
  2019-11-11 10:50                                                   ` Andrea Vai
@ 2019-11-11 11:05                                                     ` Ming Lei
  2019-11-11 11:13                                                       ` Andrea Vai
  0 siblings, 1 reply; 102+ messages in thread
From: Ming Lei @ 2019-11-11 11:05 UTC (permalink / raw)
  To: Andrea Vai
  Cc: Damien Le Moal, Alan Stern, Jens Axboe, Johannes Thumshirn,
	USB list, SCSI development list, Himanshu Madhani,
	Hannes Reinecke, Omar Sandoval, Martin K. Petersen, Greg KH,
	Hans Holmberg, Kernel development list

On Mon, Nov 11, 2019 at 11:50:49AM +0100, Andrea Vai wrote:
> Il giorno dom, 10/11/2019 alle 06.28 +0800, Ming Lei ha scritto:
> > On Thu, Nov 07, 2019 at 07:59:44PM +0100, Andrea Vai wrote:
> > > [Sorry for the duplicate message, it didn't reach the lists due to
> > > html formatting]
> > > Il giorno gio 7 nov 2019 alle ore 08:54 Damien Le Moal
> > > <Damien.LeMoal@wdc.com> ha scritto:
> > > >
> > > > On 2019/11/07 16:04, Andrea Vai wrote:
> > > > > Il giorno mer, 06/11/2019 alle 22.13 +0000, Damien Le Moal ha
> > scritto:
> > > > >>
> > > > >>
> > > > >> Please simply try your write tests after doing this:
> > > > >>
> > > > >> echo mq-deadline > /sys/block/<name of your USB
> > > > >> disk>/queue/scheduler
> > > > >>
> > > > >> And confirm that mq-deadline is selected with:
> > > > >>
> > > > >> cat /sys/block/<name of your USB disk>/queue/scheduler
> > > > >> [mq-deadline] kyber bfq none
> > > > >
> > > > > ok, which kernel should I test with this: the fresh git
> > cloned, or the
> > > > > one just patched with Alan's patch, or doesn't matter which
> > one?
> > > >
> > > > Probably all of them to see if there are any differences.
> > > 
> > > with both kernels, the output of
> > > cat /sys/block/sdh/queue/schedule
> > > 
> > > already contains [mq-deadline]: is it correct to assume that the
> > echo
> > > command and the subsequent testing is useless? What to do now?
> > 
> > Another thing we could try is to use 'none' via the following
> > command:
> > 
> >  echo none > /sys/block/sdh/queue/scheduler  #suppose 'sdh' points
> > to the usb storage disk
> > 
> > Because USB storage HBA is single hw queue, which depth is 1. This
> > way
> > should change to dispatch IO in the order of bio submission.
> > 
> > Andrea, could you switch io scheduler to none and update us if
> > difference
> > can be made?
> 
> Of course I would to it, but I see that with the "good" kernel the
> output of "cat /sys/block/sdf/queue/scheduler" (yes, now it's sdf) is
> 
> noop deadline [cfq]

Not sure if cfq makes a difference; I guess you may get the same result
with noop or deadline. However, if you only see good write performance with
cfq, you may try 'bfq' and see if it behaves like cfq.

> 
> , i.e. it doesn't show "none". Does it matter? (sorry if it's a silly
> question)

We are talking about the new kernel, in which 'noop deadline [cfq]' can
no longer appear. You should see the following output from
'/sys/block/sdf/queue/scheduler' in the new kernel:

	[mq-deadline] kyber bfq none


thanks,
Ming



* Re: Slow I/O on USB media after commit f664a3cc17b7d0a2bc3b3ab96181e1029b0ec0e6
  2019-11-11 11:05                                                     ` Ming Lei
@ 2019-11-11 11:13                                                       ` Andrea Vai
  0 siblings, 0 replies; 102+ messages in thread
From: Andrea Vai @ 2019-11-11 11:13 UTC (permalink / raw)
  To: Ming Lei
  Cc: Damien Le Moal, Alan Stern, Jens Axboe, Johannes Thumshirn,
	USB list, SCSI development list, Himanshu Madhani,
	Hannes Reinecke, Omar Sandoval, Martin K. Petersen, Greg KH,
	Hans Holmberg, Kernel development list

On Mon, 11/11/2019 at 19.05 +0800, Ming Lei wrote:
> On Mon, Nov 11, 2019 at 11:50:49AM +0100, Andrea Vai wrote:
> > Il giorno dom, 10/11/2019 alle 06.28 +0800, Ming Lei ha scritto:
> > > On Thu, Nov 07, 2019 at 07:59:44PM +0100, Andrea Vai wrote:
> > > > [Sorry for the duplicate message, it didn't reach the lists
> due to
> > > > html formatting]
> > > > Il giorno gio 7 nov 2019 alle ore 08:54 Damien Le Moal
> > > > <Damien.LeMoal@wdc.com> ha scritto:
> > > > >
> > > > > On 2019/11/07 16:04, Andrea Vai wrote:
> > > > > > Il giorno mer, 06/11/2019 alle 22.13 +0000, Damien Le Moal
> ha
> > > scritto:
> > > > > >>
> > > > > >>
> > > > > >> Please simply try your write tests after doing this:
> > > > > >>
> > > > > >> echo mq-deadline > /sys/block/<name of your USB
> > > > > >> disk>/queue/scheduler
> > > > > >>
> > > > > >> And confirm that mq-deadline is selected with:
> > > > > >>
> > > > > >> cat /sys/block/<name of your USB disk>/queue/scheduler
> > > > > >> [mq-deadline] kyber bfq none
> > > > > >
> > > > > > ok, which kernel should I test with this: the fresh git
> > > cloned, or the
> > > > > > one just patched with Alan's patch, or doesn't matter
> which
> > > one?
> > > > >
> > > > > Probably all of them to see if there are any differences.
> > > > 
> > > > with both kernels, the output of
> > > > cat /sys/block/sdh/queue/schedule
> > > > 
> > > > already contains [mq-deadline]: is it correct to assume that
> the
> > > echo
> > > > command and the subsequent testing is useless? What to do now?
> > > 
> > > Another thing we could try is to use 'none' via the following
> > > command:
> > > 
> > >  echo none > /sys/block/sdh/queue/scheduler  #suppose 'sdh'
> points
> > > to the usb storage disk
> > > 
> > > Because USB storage HBA is single hw queue, which depth is 1.
> This
> > > way
> > > should change to dispatch IO in the order of bio submission.
> > > 
> > > Andrea, could you switch io scheduler to none and update us if
> > > difference
> > > can be made?
> > 
> > Of course I would to it, but I see that with the "good" kernel the
> > output of "cat /sys/block/sdf/queue/scheduler" (yes, now it's sdf)
> is
> > 
> > noop deadline [cfq]
> 
> Not sure if cfq makes a difference, and I guess you may get same
> result
> with noop or deadline. However, if you only see good write
> performance with
> cfq, you may try 'bfq' and see if it works as cfq.
> 
> > 
> > , i.e. it doesn't show "none". Does it matter? (sorry if it's a
> silly
> > question)
> 
> We are talking about new kernel in which there can't be 'noop
> deadline [cfq]'
> any more. And you should see the following output from
> '/sys/block/sdf/queue/scheduler'
> in the new kernel:
> 
> 	[mq-deadline] kyber bfq none
> 
> 

OK, sorry, I misunderstood: I assumed you wanted me to compare the
"none" setting in the old kernel with the same setting in the new
kernel. Now it's clear to me that you want me to compare the different
scheduler settings in the new kernel.

Thanks, and bye
Andrea



* Re: Slow I/O on USB media after commit f664a3cc17b7d0a2bc3b3ab96181e1029b0ec0e6
  2019-11-09 22:28                                                 ` Ming Lei
  2019-11-11 10:50                                                   ` Andrea Vai
@ 2019-11-22 19:16                                                   ` Andrea Vai
  2019-11-23  7:28                                                     ` Ming Lei
  1 sibling, 1 reply; 102+ messages in thread
From: Andrea Vai @ 2019-11-22 19:16 UTC (permalink / raw)
  To: Ming Lei
  Cc: Damien Le Moal, Alan Stern, Jens Axboe, Johannes Thumshirn,
	USB list, SCSI development list, Himanshu Madhani,
	Hannes Reinecke, Omar Sandoval, Martin K. Petersen, Greg KH,
	Hans Holmberg, Kernel development list

On Sun, 10/11/2019 at 06.28 +0800, Ming Lei wrote:
> Another thing we could try is to use 'none' via the following
> command:
> 
>  echo none > /sys/block/sdh/queue/scheduler  #suppose 'sdh' points
> to the usb storage disk
> 
> Because USB storage HBA is single hw queue, which depth is 1. This
> way
> should change to dispatch IO in the order of bio submission.
> 
> Andrea, could you switch io scheduler to none and update us if
> difference
> can be made?

Using the new kernel, there is indeed a difference because the time to
copy a file is 1800 seconds with [mq-deadline], and 340 seconds with
[none]. But that is still far away from the old kernel, which performs
the copy of the same file in 76 seconds.

Side notes:

- The numbers above are averages over 100 trials for each situation.
As previously noted in this thread, with the new kernel the times also
vary widely between trials in the same situation. With the old kernel
the standard deviation of the times over a set of 100 trials is much
smaller (some mean/sigma values: m=1800->s=530; m=340->s=131;
m=76->s=13).

- The transferred file was 1GB in these trials. Smaller files don't
always show appreciable differences, but if you want I can also provide
those data. Of course, I can also provide the raw data for each set of
trials.
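
For reference, mean/sigma figures like the ones above can be computed
from a file with one copy time per line. This is a minimal sketch: the
function name and the times.txt file are hypothetical, and it uses the
population formula, which may or may not be the one behind the numbers
quoted here.

```shell
# Print the mean and standard deviation of one number per line on stdin.
# Uses the population formula; whether the figures in this thread use
# the population or the sample formula is not stated.
mean_sd() {
    awk 'NF { s += $1; q += $1 * $1; n++ }
         END {
             if (n == 0) exit 1
             m = s / n
             v = q / n - m * m
             if (v < 0) v = 0      # guard against rounding error
             printf "mean=%.2f sd=%.2f\n", m, sqrt(v)
         }'
}
```

Usage: "mean_sd < times.txt", once per kernel and scheduler combination.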

Thanks,
and bye,

Andrea



* Re: Slow I/O on USB media after commit f664a3cc17b7d0a2bc3b3ab96181e1029b0ec0e6
  2019-11-22 19:16                                                   ` Andrea Vai
@ 2019-11-23  7:28                                                     ` Ming Lei
  2019-11-23 15:44                                                       ` Andrea Vai
  0 siblings, 1 reply; 102+ messages in thread
From: Ming Lei @ 2019-11-23  7:28 UTC (permalink / raw)
  To: Andrea Vai
  Cc: Damien Le Moal, Alan Stern, Jens Axboe, Johannes Thumshirn,
	USB list, SCSI development list, Himanshu Madhani,
	Hannes Reinecke, Omar Sandoval, Martin K. Petersen, Greg KH,
	Hans Holmberg, Kernel development list

On Fri, Nov 22, 2019 at 08:16:30PM +0100, Andrea Vai wrote:
> Il giorno dom, 10/11/2019 alle 06.28 +0800, Ming Lei ha scritto:
> > Another thing we could try is to use 'none' via the following
> > command:
> > 
> >  echo none > /sys/block/sdh/queue/scheduler  #suppose 'sdh' points
> > to the usb storage disk
> > 
> > Because USB storage HBA is single hw queue, which depth is 1. This
> > way
> > should change to dispatch IO in the order of bio submission.
> > 
> > Andrea, could you switch io scheduler to none and update us if
> > difference
> > can be made?
> 
> Using the new kernel, there is indeed a difference because the time to
> copy a file is 1800 seconds with [mq-deadline], and 340 seconds with
> [none]. But that is still far away from the old kernel, which performs
> the copy of the same file in 76 seconds.

Please post the log of 'lsusb -v', and I will try to make a patch for
addressing the issue.
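
For reference, the active scheduler can be read back from the same sysfs
file that the `echo none` command writes to; the kernel marks it with
square brackets, which is presumably why the names appear as
[mq-deadline] and [none] above. A small Python sketch, assuming the disk
is 'sdh' as in the quoted command (the helper names are made up for
illustration), that also relates the three quoted copy times to each
other:

```python
from pathlib import Path


def active_scheduler(sysfs_text):
    """Return the bracketed scheduler name, e.g. 'none' for 'mq-deadline [none]'."""
    for token in sysfs_text.split():
        if token.startswith("[") and token.endswith("]"):
            return token[1:-1]
    return None


def read_active_scheduler(disk):
    """Read /sys/block/<disk>/queue/scheduler; 'disk' is an assumed name such as 'sdh'."""
    path = Path(f"/sys/block/{disk}/queue/scheduler")
    return active_scheduler(path.read_text())


def slowdown(seconds, baseline):
    """How many times longer a copy took compared with the baseline."""
    return seconds / baseline


if __name__ == "__main__":
    # Copy times in seconds, as quoted in the message above.
    times = {"mq-deadline": 1800, "none": 340, "old kernel": 76}
    for name, seconds in times.items():
        ratio = slowdown(seconds, times["old kernel"])
        print(f"{name}: {seconds}s, {ratio:.1f}x the old kernel")
```

On the quoted figures this gives roughly 23.7x for mq-deadline and 4.5x
for none, relative to the old kernel.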


thanks, 
Ming



* Re: Slow I/O on USB media after commit f664a3cc17b7d0a2bc3b3ab96181e1029b0ec0e6
  2019-11-23  7:28                                                     ` Ming Lei
@ 2019-11-23 15:44                                                       ` Andrea Vai
  2019-11-25  3:54                                                         ` Ming Lei
  0 siblings, 1 reply; 102+ messages in thread
From: Andrea Vai @ 2019-11-23 15:44 UTC (permalink / raw)
  To: Ming Lei
  Cc: Damien Le Moal, Alan Stern, Jens Axboe, Johannes Thumshirn,
	USB list, SCSI development list, Himanshu Madhani,
	Hannes Reinecke, Omar Sandoval, Martin K. Petersen, Greg KH,
	Hans Holmberg, Kernel development list

[-- Attachment #1: Type: text/plain, Size: 201 bytes --]

On Sat, 2019-11-23 at 15:28 +0800, Ming Lei wrote:
> 
> Please post the log of 'lsusb -v', and I will try to make a patch for
> addressing the issue.

attached,

Thanks, and bye
Andrea

[-- Attachment #2: lsusb.txt --]
[-- Type: text/plain, Size: 41247 bytes --]

can't get debug descriptor: Resource temporarily unavailable

Bus 002 Device 002: ID 8087:8000 Intel Corp. 
Device Descriptor:
  bLength                18
  bDescriptorType         1
  bcdUSB               2.00
  bDeviceClass            9 Hub
  bDeviceSubClass         0 
  bDeviceProtocol         1 Single TT
  bMaxPacketSize0        64
  idVendor           0x8087 Intel Corp.
  idProduct          0x8000 
  bcdDevice            0.04
  iManufacturer           0 
  iProduct                0 
  iSerial                 0 
  bNumConfigurations      1
  Configuration Descriptor:
    bLength                 9
    bDescriptorType         2
    wTotalLength       0x0019
    bNumInterfaces          1
    bConfigurationValue     1
    iConfiguration          0 
    bmAttributes         0xe0
      Self Powered
      Remote Wakeup
    MaxPower                0mA
    Interface Descriptor:
      bLength                 9
      bDescriptorType         4
      bInterfaceNumber        0
      bAlternateSetting       0
      bNumEndpoints           1
      bInterfaceClass         9 Hub
      bInterfaceSubClass      0 
      bInterfaceProtocol      0 Full speed (or root) hub
      iInterface              0 
      Endpoint Descriptor:
        bLength                 7
        bDescriptorType         5
        bEndpointAddress     0x81  EP 1 IN
        bmAttributes            3
          Transfer Type            Interrupt
          Synch Type               None
          Usage Type               Data
        wMaxPacketSize     0x0001  1x 1 bytes
        bInterval              12
Hub Descriptor:
  bLength               9
  bDescriptorType      41
  nNbrPorts             6
  wHubCharacteristic 0x0009
    Per-port power switching
    Per-port overcurrent protection
    TT think time 8 FS bits
  bPwrOn2PwrGood        0 * 2 milli seconds
  bHubContrCurrent      0 milli Ampere
  DeviceRemovable    0x00
  PortPwrCtrlMask    0xff
 Hub Port Status:
   Port 1: 0000.0100 power
   Port 2: 0000.0100 power
   Port 3: 0000.0100 power
   Port 4: 0000.0100 power
   Port 5: 0000.0100 power
   Port 6: 0000.0100 power
Device Qualifier (for other device speed):
  bLength                10
  bDescriptorType         6
  bcdUSB               2.00
  bDeviceClass            9 Hub
  bDeviceSubClass         0 
  bDeviceProtocol         0 Full speed (or root) hub
  bMaxPacketSize0        64
  bNumConfigurations      1
Device Status:     0x0001
  Self Powered

Bus 002 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
Device Descriptor:
  bLength                18
  bDescriptorType         1
  bcdUSB               2.00
  bDeviceClass            9 Hub
  bDeviceSubClass         0 
  bDeviceProtocol         0 Full speed (or root) hub
  bMaxPacketSize0        64
  idVendor           0x1d6b Linux Foundation
  idProduct          0x0002 2.0 root hub
  bcdDevice            5.03
  iManufacturer           3 Linux 5.3.8-200.fc30.x86_64 ehci_hcd
  iProduct                2 EHCI Host Controller
  iSerial                 1 0000:00:1d.0
  bNumConfigurations      1
  Configuration Descriptor:
    bLength                 9
    bDescriptorType         2
    wTotalLength       0x0019
    bNumInterfaces          1
    bConfigurationValue     1
    iConfiguration          0 
    bmAttributes         0xe0
      Self Powered
      Remote Wakeup
    MaxPower                0mA
    Interface Descriptor:
      bLength                 9
      bDescriptorType         4
      bInterfaceNumber        0
      bAlternateSetting       0
      bNumEndpoints           1
      bInterfaceClass         9 Hub
      bInterfaceSubClass      0 
      bInterfaceProtocol      0 Full speed (or root) hub
      iInterface              0 
      Endpoint Descriptor:
        bLength                 7
        bDescriptorType         5
        bEndpointAddress     0x81  EP 1 IN
        bmAttributes            3
          Transfer Type            Interrupt
          Synch Type               None
          Usage Type               Data
        wMaxPacketSize     0x0004  1x 4 bytes
        bInterval              12
Hub Descriptor:
  bLength               9
  bDescriptorType      41
can't get device qualifier: Resource temporarily unavailable
can't get debug descriptor: Resource temporarily unavailable
can't get device qualifier: Resource temporarily unavailable
can't get debug descriptor: Resource temporarily unavailable
  nNbrPorts             2
  wHubCharacteristic 0x000a
    No power switching (usb 1.0)
    Per-port overcurrent protection
  bPwrOn2PwrGood       10 * 2 milli seconds
  bHubContrCurrent      0 milli Ampere
  DeviceRemovable    0x02
  PortPwrCtrlMask    0xff
 Hub Port Status:
   Port 1: 0000.0507 highspeed power suspend enable connect
   Port 2: 0000.0100 power
Device Status:     0x0001
  Self Powered

Bus 003 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
Device Descriptor:
  bLength                18
  bDescriptorType         1
  bcdUSB               2.00
  bDeviceClass            9 Hub
  bDeviceSubClass         0 
  bDeviceProtocol         0 Full speed (or root) hub
  bMaxPacketSize0        64
  idVendor           0x1d6b Linux Foundation
  idProduct          0x0002 2.0 root hub
  bcdDevice            5.03
  iManufacturer           3 Linux 5.3.8-200.fc30.x86_64 ehci_hcd
  iProduct                2 EHCI Host Controller
  iSerial                 1 0000:05:00.2
  bNumConfigurations      1
  Configuration Descriptor:
    bLength                 9
    bDescriptorType         2
    wTotalLength       0x0019
    bNumInterfaces          1
    bConfigurationValue     1
    iConfiguration          0 
    bmAttributes         0xe0
      Self Powered
      Remote Wakeup
    MaxPower                0mA
    Interface Descriptor:
      bLength                 9
      bDescriptorType         4
      bInterfaceNumber        0
      bAlternateSetting       0
      bNumEndpoints           1
      bInterfaceClass         9 Hub
      bInterfaceSubClass      0 
      bInterfaceProtocol      0 Full speed (or root) hub
      iInterface              0 
      Endpoint Descriptor:
        bLength                 7
        bDescriptorType         5
        bEndpointAddress     0x81  EP 1 IN
        bmAttributes            3
          Transfer Type            Interrupt
          Synch Type               None
          Usage Type               Data
        wMaxPacketSize     0x0004  1x 4 bytes
        bInterval              12
Hub Descriptor:
  bLength               9
  bDescriptorType      41
  nNbrPorts             4
  wHubCharacteristic 0x000a
    No power switching (usb 1.0)
    Per-port overcurrent protection
  bPwrOn2PwrGood       10 * 2 milli seconds
  bHubContrCurrent      0 milli Ampere
  DeviceRemovable    0x00
  PortPwrCtrlMask    0xff
 Hub Port Status:
   Port 1: 0000.0100 power
   Port 2: 0000.0100 power
   Port 3: 0000.0100 power
   Port 4: 0000.0100 power
Device Status:     0x0001
  Self Powered

Bus 005 Device 002: ID 04a9:2206 Canon, Inc. CanoScan N650U/N656U
Device Descriptor:
  bLength                18
  bDescriptorType         1
  bcdUSB               1.00
  bDeviceClass            0 
  bDeviceSubClass         0 
  bDeviceProtocol         0 
  bMaxPacketSize0         8
  idVendor           0x04a9 Canon, Inc.
  idProduct          0x2206 CanoScan N650U/N656U
  bcdDevice            1.00
  iManufacturer          64 Canon
  iProduct               77 CanoScan
  iSerial                 0 
  bNumConfigurations      1
  Configuration Descriptor:
    bLength                 9
    bDescriptorType         2
    wTotalLength       0x0027
    bNumInterfaces          1
    bConfigurationValue     1
    iConfiguration          0 
    bmAttributes         0x80
      (Bus Powered)
    MaxPower              500mA
    Interface Descriptor:
      bLength                 9
      bDescriptorType         4
      bInterfaceNumber        0
      bAlternateSetting       0
      bNumEndpoints           3
      bInterfaceClass       255 Vendor Specific Class
      bInterfaceSubClass      0 
      bInterfaceProtocol    255 
      iInterface              0 
      Endpoint Descriptor:
        bLength                 7
        bDescriptorType         5
        bEndpointAddress     0x81  EP 1 IN
        bmAttributes            3
          Transfer Type            Interrupt
          Synch Type               None
          Usage Type               Data
        wMaxPacketSize     0x0001  1x 1 bytes
        bInterval              16
can't get debug descriptor: Resource temporarily unavailable
can't get debug descriptor: Resource temporarily unavailable
      Endpoint Descriptor:
        bLength                 7
        bDescriptorType         5
        bEndpointAddress     0x82  EP 2 IN
        bmAttributes            2
          Transfer Type            Bulk
          Synch Type               None
          Usage Type               Data
        wMaxPacketSize     0x0040  1x 64 bytes
        bInterval               0
      Endpoint Descriptor:
        bLength                 7
        bDescriptorType         5
        bEndpointAddress     0x03  EP 3 OUT
        bmAttributes            2
          Transfer Type            Bulk
          Synch Type               None
          Usage Type               Data
        wMaxPacketSize     0x0040  1x 64 bytes
        bInterval               0
Device Status:     0x0000
  (Bus Powered)

Bus 005 Device 001: ID 1d6b:0001 Linux Foundation 1.1 root hub
Device Descriptor:
  bLength                18
  bDescriptorType         1
  bcdUSB               1.10
  bDeviceClass            9 Hub
  bDeviceSubClass         0 
  bDeviceProtocol         0 Full speed (or root) hub
  bMaxPacketSize0        64
  idVendor           0x1d6b Linux Foundation
  idProduct          0x0001 1.1 root hub
  bcdDevice            5.03
  iManufacturer           3 Linux 5.3.8-200.fc30.x86_64 uhci_hcd
  iProduct                2 UHCI Host Controller
  iSerial                 1 0000:05:00.1
  bNumConfigurations      1
  Configuration Descriptor:
    bLength                 9
    bDescriptorType         2
    wTotalLength       0x0019
    bNumInterfaces          1
    bConfigurationValue     1
    iConfiguration          0 
    bmAttributes         0xe0
      Self Powered
      Remote Wakeup
    MaxPower                0mA
    Interface Descriptor:
      bLength                 9
      bDescriptorType         4
      bInterfaceNumber        0
      bAlternateSetting       0
      bNumEndpoints           1
      bInterfaceClass         9 Hub
      bInterfaceSubClass      0 
      bInterfaceProtocol      0 Full speed (or root) hub
      iInterface              0 
      Endpoint Descriptor:
        bLength                 7
        bDescriptorType         5
        bEndpointAddress     0x81  EP 1 IN
        bmAttributes            3
          Transfer Type            Interrupt
          Synch Type               None
          Usage Type               Data
        wMaxPacketSize     0x0002  1x 2 bytes
        bInterval             255
Hub Descriptor:
  bLength               9
  bDescriptorType      41
  nNbrPorts             2
  wHubCharacteristic 0x000a
    No power switching (usb 1.0)
    Per-port overcurrent protection
  bPwrOn2PwrGood        1 * 2 milli seconds
  bHubContrCurrent      0 milli Ampere
  DeviceRemovable    0x00
  PortPwrCtrlMask    0xff
 Hub Port Status:
   Port 1: 0000.0100 power
   Port 2: 0000.0103 power enable connect
Device Status:     0x0001
  Self Powered

Bus 004 Device 001: ID 1d6b:0001 Linux Foundation 1.1 root hub
Device Descriptor:
  bLength                18
  bDescriptorType         1
  bcdUSB               1.10
  bDeviceClass            9 Hub
  bDeviceSubClass         0 
  bDeviceProtocol         0 Full speed (or root) hub
  bMaxPacketSize0        64
  idVendor           0x1d6b Linux Foundation
  idProduct          0x0001 1.1 root hub
  bcdDevice            5.03
  iManufacturer           3 Linux 5.3.8-200.fc30.x86_64 uhci_hcd
  iProduct                2 UHCI Host Controller
  iSerial                 1 0000:05:00.0
  bNumConfigurations      1
  Configuration Descriptor:
    bLength                 9
    bDescriptorType         2
    wTotalLength       0x0019
    bNumInterfaces          1
    bConfigurationValue     1
    iConfiguration          0 
    bmAttributes         0xe0
      Self Powered
      Remote Wakeup
    MaxPower                0mA
    Interface Descriptor:
      bLength                 9
      bDescriptorType         4
      bInterfaceNumber        0
      bAlternateSetting       0
      bNumEndpoints           1
      bInterfaceClass         9 Hub
      bInterfaceSubClass      0 
      bInterfaceProtocol      0 Full speed (or root) hub
can't get debug descriptor: Resource temporarily unavailable
can't get debug descriptor: Resource temporarily unavailable
      iInterface              0
      Endpoint Descriptor:
        bLength                 7
        bDescriptorType         5
        bEndpointAddress     0x81  EP 1 IN
        bmAttributes            3
          Transfer Type            Interrupt
          Synch Type               None
          Usage Type               Data
        wMaxPacketSize     0x0002  1x 2 bytes
        bInterval             255
Hub Descriptor:
  bLength               9
  bDescriptorType      41
  nNbrPorts             2
  wHubCharacteristic 0x000a
    No power switching (usb 1.0)
    Per-port overcurrent protection
  bPwrOn2PwrGood        1 * 2 milli seconds
  bHubContrCurrent      0 milli Ampere
  DeviceRemovable    0x00
  PortPwrCtrlMask    0xff
 Hub Port Status:
   Port 1: 0000.0100 power
   Port 2: 0000.0100 power
Device Status:     0x0001
  Self Powered

Bus 001 Device 002: ID 8087:8008 Intel Corp. 
Device Descriptor:
  bLength                18
  bDescriptorType         1
  bcdUSB               2.00
  bDeviceClass            9 Hub
  bDeviceSubClass         0 
  bDeviceProtocol         1 Single TT
  bMaxPacketSize0        64
  idVendor           0x8087 Intel Corp.
  idProduct          0x8008 
  bcdDevice            0.04
  iManufacturer           0 
  iProduct                0 
  iSerial                 0 
  bNumConfigurations      1
  Configuration Descriptor:
    bLength                 9
    bDescriptorType         2
    wTotalLength       0x0019
    bNumInterfaces          1
    bConfigurationValue     1
    iConfiguration          0 
    bmAttributes         0xe0
      Self Powered
      Remote Wakeup
    MaxPower                0mA
    Interface Descriptor:
      bLength                 9
      bDescriptorType         4
      bInterfaceNumber        0
      bAlternateSetting       0
      bNumEndpoints           1
      bInterfaceClass         9 Hub
      bInterfaceSubClass      0 
      bInterfaceProtocol      0 Full speed (or root) hub
      iInterface              0 
      Endpoint Descriptor:
        bLength                 7
        bDescriptorType         5
        bEndpointAddress     0x81  EP 1 IN
        bmAttributes            3
          Transfer Type            Interrupt
          Synch Type               None
          Usage Type               Data
        wMaxPacketSize     0x0001  1x 1 bytes
        bInterval              12
Hub Descriptor:
  bLength               9
  bDescriptorType      41
  nNbrPorts             6
  wHubCharacteristic 0x0009
    Per-port power switching
    Per-port overcurrent protection
    TT think time 8 FS bits
  bPwrOn2PwrGood        0 * 2 milli seconds
  bHubContrCurrent      0 milli Ampere
  DeviceRemovable    0x00
  PortPwrCtrlMask    0xff
 Hub Port Status:
   Port 1: 0000.0100 power
   Port 2: 0000.0100 power
   Port 3: 0000.0100 power
   Port 4: 0000.0100 power
   Port 5: 0000.0100 power
   Port 6: 0000.0100 power
Device Qualifier (for other device speed):
  bLength                10
  bDescriptorType         6
  bcdUSB               2.00
  bDeviceClass            9 Hub
  bDeviceSubClass         0 
  bDeviceProtocol         0 Full speed (or root) hub
  bMaxPacketSize0        64
  bNumConfigurations      1
Device Status:     0x0001
  Self Powered

Bus 001 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
Device Descriptor:
  bLength                18
  bDescriptorType         1
  bcdUSB               2.00
  bDeviceClass            9 Hub
  bDeviceSubClass         0 
  bDeviceProtocol         0 Full speed (or root) hub
  bMaxPacketSize0        64
  idVendor           0x1d6b Linux Foundation
  idProduct          0x0002 2.0 root hub
  bcdDevice            5.03
  iManufacturer           3 Linux 5.3.8-200.fc30.x86_64 ehci_hcd
  iProduct                2 EHCI Host Controller
  iSerial                 1 0000:00:1a.0
  bNumConfigurations      1
  Configuration Descriptor:
    bLength                 9
    bDescriptorType         2
    wTotalLength       0x0019
    bNumInterfaces          1
    bConfigurationValue     1
    iConfiguration          0 
    bmAttributes         0xe0
      Self Powered
can't get device qualifier: Resource temporarily unavailable
can't get debug descriptor: Resource temporarily unavailable
      Remote Wakeup
    MaxPower                0mA
    Interface Descriptor:
      bLength                 9
      bDescriptorType         4
      bInterfaceNumber        0
      bAlternateSetting       0
      bNumEndpoints           1
      bInterfaceClass         9 Hub
      bInterfaceSubClass      0 
      bInterfaceProtocol      0 Full speed (or root) hub
      iInterface              0 
      Endpoint Descriptor:
        bLength                 7
        bDescriptorType         5
        bEndpointAddress     0x81  EP 1 IN
        bmAttributes            3
          Transfer Type            Interrupt
          Synch Type               None
          Usage Type               Data
        wMaxPacketSize     0x0004  1x 4 bytes
        bInterval              12
Hub Descriptor:
  bLength               9
  bDescriptorType      41
  nNbrPorts             2
  wHubCharacteristic 0x000a
    No power switching (usb 1.0)
    Per-port overcurrent protection
  bPwrOn2PwrGood       10 * 2 milli seconds
  bHubContrCurrent      0 milli Ampere
  DeviceRemovable    0x02
  PortPwrCtrlMask    0xff
 Hub Port Status:
   Port 1: 0000.0507 highspeed power suspend enable connect
   Port 2: 0000.0100 power
Device Status:     0x0001
  Self Powered

Bus 007 Device 001: ID 1d6b:0003 Linux Foundation 3.0 root hub
Device Descriptor:
  bLength                18
  bDescriptorType         1
  bcdUSB               3.00
  bDeviceClass            9 Hub
  bDeviceSubClass         0 
  bDeviceProtocol         3 
  bMaxPacketSize0         9
  idVendor           0x1d6b Linux Foundation
  idProduct          0x0003 3.0 root hub
  bcdDevice            5.03
  iManufacturer           3 Linux 5.3.8-200.fc30.x86_64 xhci-hcd
  iProduct                2 xHCI Host Controller
  iSerial                 1 0000:00:14.0
  bNumConfigurations      1
  Configuration Descriptor:
    bLength                 9
    bDescriptorType         2
    wTotalLength       0x001f
    bNumInterfaces          1
    bConfigurationValue     1
    iConfiguration          0 
    bmAttributes         0xe0
      Self Powered
      Remote Wakeup
    MaxPower                0mA
    Interface Descriptor:
      bLength                 9
      bDescriptorType         4
      bInterfaceNumber        0
      bAlternateSetting       0
      bNumEndpoints           1
      bInterfaceClass         9 Hub
      bInterfaceSubClass      0 
      bInterfaceProtocol      0 Full speed (or root) hub
      iInterface              0 
      Endpoint Descriptor:
        bLength                 7
        bDescriptorType         5
        bEndpointAddress     0x81  EP 1 IN
        bmAttributes            3
          Transfer Type            Interrupt
          Synch Type               None
          Usage Type               Data
        wMaxPacketSize     0x0004  1x 4 bytes
        bInterval              12
        bMaxBurst               0
Hub Descriptor:
  bLength              12
  bDescriptorType      42
  nNbrPorts             6
  wHubCharacteristic 0x000a
    No power switching (usb 1.0)
    Per-port overcurrent protection
  bPwrOn2PwrGood       10 * 2 milli seconds
  bHubContrCurrent      0 milli Ampere
  bHubDecLat          0.0 micro seconds
  wHubDelay             0 nano seconds
  DeviceRemovable    0x00
 Hub Port Status:
   Port 1: 0000.02a0 5Gbps power Rx.Detect
   Port 2: 0000.02a0 5Gbps power Rx.Detect
   Port 3: 0000.02a0 5Gbps power Rx.Detect
   Port 4: 0000.02a0 5Gbps power Rx.Detect
   Port 5: 0000.02a0 5Gbps power Rx.Detect
   Port 6: 0000.02a0 5Gbps power Rx.Detect
Binary Object Store Descriptor:
  bLength                 5
  bDescriptorType        15
  wTotalLength       0x000f
  bNumDeviceCaps          1
  SuperSpeed USB Device Capability:
    bLength                10
    bDescriptorType        16
    bDevCapabilityType      3
    bmAttributes         0x02
      Latency Tolerance Messages (LTM) Supported
    wSpeedsSupported   0x0008
      Device can operate at SuperSpeed (5Gbps)
    bFunctionalitySupport   3
      Lowest fully-functional device speed is SuperSpeed (5Gbps)
can't get debug descriptor: Resource temporarily unavailable
can't get debug descriptor: Resource temporarily unavailable
cannot read device status, Resource temporarily unavailable (11)
can't get debug descriptor: Resource temporarily unavailable
    bU1DevExitLat          10 micro seconds
    bU2DevExitLat         512 micro seconds
Device Status:     0x0001
  Self Powered

Bus 006 Device 007: ID 0aec:3050 Neodio Technologies Corp. ND3050 8-in-1 Card Reader
Device Descriptor:
  bLength                18
  bDescriptorType         1
  bcdUSB               1.10
  bDeviceClass            0 
  bDeviceSubClass         0 
  bDeviceProtocol         0 
  bMaxPacketSize0        16
  idVendor           0x0aec Neodio Technologies Corp.
  idProduct          0x3050 ND3050 8-in-1 Card Reader
  bcdDevice            1.00
  iManufacturer           1 (error)
  iProduct                2 (error)
  iSerial                 3 (error)
  bNumConfigurations      1
  Configuration Descriptor:
    bLength                 9
    bDescriptorType         2
    wTotalLength       0x0020
    bNumInterfaces          1
    bConfigurationValue     1
    iConfiguration          0 
    bmAttributes         0x80
      (Bus Powered)
    MaxPower              100mA
    Interface Descriptor:
      bLength                 9
      bDescriptorType         4
      bInterfaceNumber        0
      bAlternateSetting       0
      bNumEndpoints           2
      bInterfaceClass         8 Mass Storage
      bInterfaceSubClass      6 SCSI
      bInterfaceProtocol     80 Bulk-Only
      iInterface              0 
      Endpoint Descriptor:
        bLength                 7
        bDescriptorType         5
        bEndpointAddress     0x01  EP 1 OUT
        bmAttributes            2
          Transfer Type            Bulk
          Synch Type               None
          Usage Type               Data
        wMaxPacketSize     0x0040  1x 64 bytes
        bInterval               0
      Endpoint Descriptor:
        bLength                 7
        bDescriptorType         5
        bEndpointAddress     0x82  EP 2 IN
        bmAttributes            2
          Transfer Type            Bulk
          Synch Type               None
          Usage Type               Data
        wMaxPacketSize     0x0040  1x 64 bytes
        bInterval               0

Bus 006 Device 005: ID 051d:0002 American Power Conversion Uninterruptible Power Supply
Device Descriptor:
  bLength                18
  bDescriptorType         1
  bcdUSB               1.10
  bDeviceClass            0 
  bDeviceSubClass         0 
  bDeviceProtocol         0 
  bMaxPacketSize0         8
  idVendor           0x051d American Power Conversion
  idProduct          0x0002 Uninterruptible Power Supply
  bcdDevice            1.06
  iManufacturer           3 American Power Conversion
  iProduct                1 Back-UPS XS 700U   FW:924.Z3 .I USB FW:Z3 
  iSerial                 2 3B1828X60578  
  bNumConfigurations      1
  Configuration Descriptor:
    bLength                 9
    bDescriptorType         2
    wTotalLength       0x0022
    bNumInterfaces          1
    bConfigurationValue     1
    iConfiguration          0 
    bmAttributes         0xe0
      Self Powered
      Remote Wakeup
    MaxPower               24mA
    Interface Descriptor:
      bLength                 9
      bDescriptorType         4
      bInterfaceNumber        0
      bAlternateSetting       0
      bNumEndpoints           1
      bInterfaceClass         3 Human Interface Device
      bInterfaceSubClass      0 
      bInterfaceProtocol      0 
      iInterface              0 
        HID Device Descriptor:
          bLength                 9
          bDescriptorType        33
          bcdHID               1.10
          bCountryCode           33 US
          bNumDescriptors         1
          bDescriptorType        34 Report
          wDescriptorLength    1029
         Report Descriptors: 
           ** UNAVAILABLE **
      Endpoint Descriptor:
        bLength                 7
        bDescriptorType         5
        bEndpointAddress     0x81  EP 1 IN
        bmAttributes            3
          Transfer Type            Interrupt
          Synch Type               None
          Usage Type               Data
        wMaxPacketSize     0x0006  1x 6 bytes
        bInterval              10
can't get device qualifier: Resource temporarily unavailable
can't get debug descriptor: Resource temporarily unavailable
cannot read device status, Resource temporarily unavailable (11)
Device Status:     0x0000
  (Bus Powered)

Bus 006 Device 003: ID 04e8:330e Samsung Electronics Co., Ltd ML-2950 Series
Device Descriptor:
  bLength                18
  bDescriptorType         1
  bcdUSB               2.00
  bDeviceClass            0 
  bDeviceSubClass         0 
  bDeviceProtocol         0 
  bMaxPacketSize0        64
  idVendor           0x04e8 Samsung Electronics Co., Ltd
  idProduct          0x330e 
  bcdDevice            1.00
  iManufacturer           1 (error)
  iProduct                2 (error)
  iSerial                 3 (error)
  bNumConfigurations      1
  Configuration Descriptor:
    bLength                 9
    bDescriptorType         2
    wTotalLength       0x0020
    bNumInterfaces          1
    bConfigurationValue     1
    iConfiguration          0 
    bmAttributes         0xc0
      Self Powered
    MaxPower                2mA
    Interface Descriptor:
      bLength                 9
      bDescriptorType         4
      bInterfaceNumber        0
      bAlternateSetting       0
      bNumEndpoints           2
      bInterfaceClass         7 Printer
      bInterfaceSubClass      1 Printer
      bInterfaceProtocol      2 Bidirectional
      iInterface              0 
      Endpoint Descriptor:
        bLength                 7
        bDescriptorType         5
        bEndpointAddress     0x02  EP 2 OUT
        bmAttributes            2
          Transfer Type            Bulk
          Synch Type               None
          Usage Type               Data
        wMaxPacketSize     0x0200  1x 512 bytes
        bInterval              10
      Endpoint Descriptor:
        bLength                 7
        bDescriptorType         5
        bEndpointAddress     0x81  EP 1 IN
        bmAttributes            2
          Transfer Type            Bulk
          Synch Type               None
          Usage Type               Data
        wMaxPacketSize     0x0200  1x 512 bytes
        bInterval              10

Bus 006 Device 004: ID 046d:c52f Logitech, Inc. Unifying Receiver
Device Descriptor:
  bLength                18
  bDescriptorType         1
  bcdUSB               2.00
  bDeviceClass            0 
  bDeviceSubClass         0 
  bDeviceProtocol         0 
  bMaxPacketSize0         8
  idVendor           0x046d Logitech, Inc.
  idProduct          0xc52f Unifying Receiver
  bcdDevice           22.00
  iManufacturer           1 Logitech
  iProduct                2 USB Receiver
  iSerial                 0 
  bNumConfigurations      1
  Configuration Descriptor:
    bLength                 9
    bDescriptorType         2
    wTotalLength       0x003b
    bNumInterfaces          2
    bConfigurationValue     1
    iConfiguration          4 RQR22.00_B0005
    bmAttributes         0xa0
      (Bus Powered)
      Remote Wakeup
    MaxPower               98mA
    Interface Descriptor:
      bLength                 9
      bDescriptorType         4
      bInterfaceNumber        0
      bAlternateSetting       0
      bNumEndpoints           1
      bInterfaceClass         3 Human Interface Device
      bInterfaceSubClass      1 Boot Interface Subclass
      bInterfaceProtocol      2 Mouse
      iInterface              0 
        HID Device Descriptor:
          bLength                 9
          bDescriptorType        33
          bcdHID               1.11
          bCountryCode            0 Not supported
          bNumDescriptors         1
          bDescriptorType        34 Report
          wDescriptorLength      67
         Report Descriptors: 
           ** UNAVAILABLE **
      Endpoint Descriptor:
        bLength                 7
        bDescriptorType         5
        bEndpointAddress     0x81  EP 1 IN
        bmAttributes            3
          Transfer Type            Interrupt
          Synch Type               None
          Usage Type               Data
        wMaxPacketSize     0x0008  1x 8 bytes
        bInterval               2
    Interface Descriptor:
      bLength                 9
      bDescriptorType         4
      bInterfaceNumber        1
      bAlternateSetting       0
can't get device qualifier: Resource temporarily unavailable
can't get debug descriptor: Resource temporarily unavailable
can't get debug descriptor: Resource temporarily unavailable
      bNumEndpoints           1
      bInterfaceClass         3 Human Interface Device
      bInterfaceSubClass      0 
      bInterfaceProtocol      0 
      iInterface              0 
        HID Device Descriptor:
          bLength                 9
          bDescriptorType        33
          bcdHID               1.11
          bCountryCode            0 Not supported
          bNumDescriptors         1
          bDescriptorType        34 Report
          wDescriptorLength      79
         Report Descriptors: 
           ** UNAVAILABLE **
      Endpoint Descriptor:
        bLength                 7
        bDescriptorType         5
        bEndpointAddress     0x82  EP 2 IN
        bmAttributes            3
          Transfer Type            Interrupt
          Synch Type               None
          Usage Type               Data
        wMaxPacketSize     0x0014  1x 20 bytes
        bInterval               2
Device Status:     0x0000
  (Bus Powered)

Bus 006 Device 008: ID 0951:1666 Kingston Technology DataTraveler 100 G3/G4/SE9 G2
Device Descriptor:
  bLength                18
  bDescriptorType         1
  bcdUSB               2.10
  bDeviceClass            0 
  bDeviceSubClass         0 
  bDeviceProtocol         0 
  bMaxPacketSize0        64
  idVendor           0x0951 Kingston Technology
  idProduct          0x1666 DataTraveler 100 G3/G4/SE9 G2
  bcdDevice            0.01
  iManufacturer           1 Kingston
  iProduct                2 DataTraveler 3.0
  iSerial                 3 60A44C4139D4FF70899506DC
  bNumConfigurations      1
  Configuration Descriptor:
    bLength                 9
    bDescriptorType         2
    wTotalLength       0x0020
    bNumInterfaces          1
    bConfigurationValue     1
    iConfiguration          0 
    bmAttributes         0x80
      (Bus Powered)
    MaxPower              500mA
    Interface Descriptor:
      bLength                 9
      bDescriptorType         4
      bInterfaceNumber        0
      bAlternateSetting       0
      bNumEndpoints           2
      bInterfaceClass         8 Mass Storage
      bInterfaceSubClass      6 SCSI
      bInterfaceProtocol     80 Bulk-Only
      iInterface              0 
      Endpoint Descriptor:
        bLength                 7
        bDescriptorType         5
        bEndpointAddress     0x81  EP 1 IN
        bmAttributes            2
          Transfer Type            Bulk
          Synch Type               None
          Usage Type               Data
        wMaxPacketSize     0x0200  1x 512 bytes
        bInterval             255
      Endpoint Descriptor:
        bLength                 7
        bDescriptorType         5
        bEndpointAddress     0x02  EP 2 OUT
        bmAttributes            2
          Transfer Type            Bulk
          Synch Type               None
          Usage Type               Data
        wMaxPacketSize     0x0200  1x 512 bytes
        bInterval             255
Binary Object Store Descriptor:
  bLength                 5
  bDescriptorType        15
  wTotalLength       0x0016
  bNumDeviceCaps          2
  USB 2.0 Extension Device Capability:
    bLength                 7
    bDescriptorType        16
    bDevCapabilityType      2
    bmAttributes   0x00000006
      BESL Link Power Management (LPM) Supported
  SuperSpeed USB Device Capability:
    bLength                10
    bDescriptorType        16
    bDevCapabilityType      3
    bmAttributes         0x00
    wSpeedsSupported   0x000e
      Device can operate at Full Speed (12Mbps)
      Device can operate at High Speed (480Mbps)
      Device can operate at SuperSpeed (5Gbps)
    bFunctionalitySupport   2
      Lowest fully-functional device speed is High Speed (480Mbps)
    bU1DevExitLat          10 micro seconds
    bU2DevExitLat        2047 micro seconds
Device Status:     0x0000
  (Bus Powered)

Bus 006 Device 006: ID 045b:0209 Hitachi, Ltd 
Device Descriptor:
  bLength                18
  bDescriptorType         1
  bcdUSB               2.00
  bDeviceClass            9 Hub
  bDeviceSubClass         0 
  bDeviceProtocol         1 Single TT
can't get debug descriptor: Resource temporarily unavailable
  bMaxPacketSize0        64
  idVendor           0x045b Hitachi, Ltd
  idProduct          0x0209 
  bcdDevice            1.00
  iManufacturer           0 
  iProduct                0 
  iSerial                 0 
  bNumConfigurations      1
  Configuration Descriptor:
    bLength                 9
    bDescriptorType         2
    wTotalLength       0x0019
    bNumInterfaces          1
    bConfigurationValue     1
    iConfiguration          0 
    bmAttributes         0xe0
      Self Powered
      Remote Wakeup
    MaxPower                0mA
    Interface Descriptor:
      bLength                 9
      bDescriptorType         4
      bInterfaceNumber        0
      bAlternateSetting       0
      bNumEndpoints           1
      bInterfaceClass         9 Hub
      bInterfaceSubClass      0 
      bInterfaceProtocol      0 Full speed (or root) hub
      iInterface              0 
      Endpoint Descriptor:
        bLength                 7
        bDescriptorType         5
        bEndpointAddress     0x81  EP 1 IN
        bmAttributes            3
          Transfer Type            Interrupt
          Synch Type               None
          Usage Type               Data
        wMaxPacketSize     0x0001  1x 1 bytes
        bInterval              12
Hub Descriptor:
  bLength               9
  bDescriptorType      41
  nNbrPorts             4
  wHubCharacteristic 0x0029
    Per-port power switching
    Per-port overcurrent protection
    TT think time 16 FS bits
  bPwrOn2PwrGood       50 * 2 milli seconds
  bHubContrCurrent    100 milli Ampere
  DeviceRemovable    0x00
  PortPwrCtrlMask    0xff
 Hub Port Status:
   Port 1: 0000.0503 highspeed power enable connect
   Port 2: 0000.0100 power
   Port 3: 0000.0100 power
   Port 4: 0000.0100 power
Device Qualifier (for other device speed):
  bLength                10
  bDescriptorType         6
  bcdUSB               2.00
  bDeviceClass            9 Hub
  bDeviceSubClass         0 
  bDeviceProtocol         0 Full speed (or root) hub
  bMaxPacketSize0        64
  bNumConfigurations      1
Device Status:     0x0001
  Self Powered

Bus 006 Device 002: ID 045b:0209 Hitachi, Ltd 
Device Descriptor:
  bLength                18
  bDescriptorType         1
  bcdUSB               2.00
  bDeviceClass            9 Hub
  bDeviceSubClass         0 
  bDeviceProtocol         1 Single TT
  bMaxPacketSize0        64
  idVendor           0x045b Hitachi, Ltd
  idProduct          0x0209 
  bcdDevice            1.00
  iManufacturer           0 
  iProduct                0 
  iSerial                 0 
  bNumConfigurations      1
  Configuration Descriptor:
    bLength                 9
    bDescriptorType         2
    wTotalLength       0x0019
    bNumInterfaces          1
    bConfigurationValue     1
    iConfiguration          0 
    bmAttributes         0xe0
      Self Powered
      Remote Wakeup
    MaxPower                0mA
    Interface Descriptor:
      bLength                 9
      bDescriptorType         4
      bInterfaceNumber        0
      bAlternateSetting       0
      bNumEndpoints           1
      bInterfaceClass         9 Hub
      bInterfaceSubClass      0 
      bInterfaceProtocol      0 Full speed (or root) hub
      iInterface              0 
      Endpoint Descriptor:
        bLength                 7
        bDescriptorType         5
        bEndpointAddress     0x81  EP 1 IN
        bmAttributes            3
          Transfer Type            Interrupt
          Synch Type               None
          Usage Type               Data
        wMaxPacketSize     0x0001  1x 1 bytes
        bInterval              12
Hub Descriptor:
  bLength               9
  bDescriptorType      41
  nNbrPorts             4
  wHubCharacteristic 0x0029
    Per-port power switching
    Per-port overcurrent protection
    TT think time 16 FS bits
  bPwrOn2PwrGood       50 * 2 milli seconds
  bHubContrCurrent    100 milli Ampere
  DeviceRemovable    0x00
  PortPwrCtrlMask    0xff
 Hub Port Status:
   Port 1: 0000.0503 highspeed power enable connect
   Port 2: 0000.0100 power
can't get debug descriptor: Resource temporarily unavailable
can't get device qualifier: Resource temporarily unavailable
can't get debug descriptor: Resource temporarily unavailable
   Port 3: 0000.0100 power
   Port 4: 0000.0103 power enable connect
Device Qualifier (for other device speed):
  bLength                10
  bDescriptorType         6
  bcdUSB               2.00
  bDeviceClass            9 Hub
  bDeviceSubClass         0 
  bDeviceProtocol         0 Full speed (or root) hub
  bMaxPacketSize0        64
  bNumConfigurations      1
Device Status:     0x0001
  Self Powered

Bus 006 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
Device Descriptor:
  bLength                18
  bDescriptorType         1
  bcdUSB               2.00
  bDeviceClass            9 Hub
  bDeviceSubClass         0 
  bDeviceProtocol         1 Single TT
  bMaxPacketSize0        64
  idVendor           0x1d6b Linux Foundation
  idProduct          0x0002 2.0 root hub
  bcdDevice            5.03
  iManufacturer           3 Linux 5.3.8-200.fc30.x86_64 xhci-hcd
  iProduct                2 xHCI Host Controller
  iSerial                 1 0000:00:14.0
  bNumConfigurations      1
  Configuration Descriptor:
    bLength                 9
    bDescriptorType         2
    wTotalLength       0x0019
    bNumInterfaces          1
    bConfigurationValue     1
    iConfiguration          0 
    bmAttributes         0xe0
      Self Powered
      Remote Wakeup
    MaxPower                0mA
    Interface Descriptor:
      bLength                 9
      bDescriptorType         4
      bInterfaceNumber        0
      bAlternateSetting       0
      bNumEndpoints           1
      bInterfaceClass         9 Hub
      bInterfaceSubClass      0 
      bInterfaceProtocol      0 Full speed (or root) hub
      iInterface              0 
      Endpoint Descriptor:
        bLength                 7
        bDescriptorType         5
        bEndpointAddress     0x81  EP 1 IN
        bmAttributes            3
          Transfer Type            Interrupt
          Synch Type               None
          Usage Type               Data
        wMaxPacketSize     0x0004  1x 4 bytes
        bInterval              12
Hub Descriptor:
  bLength              11
  bDescriptorType      41
  nNbrPorts            12
  wHubCharacteristic 0x000a
    No power switching (usb 1.0)
    Per-port overcurrent protection
    TT think time 8 FS bits
  bPwrOn2PwrGood       10 * 2 milli seconds
  bHubContrCurrent      0 milli Ampere
  DeviceRemovable    0x00 0x00
  PortPwrCtrlMask    0xff 0xff
 Hub Port Status:
   Port 1: 0000.0100 power
   Port 2: 0000.0100 power
   Port 3: 0000.0100 power
   Port 4: 0000.0503 highspeed power enable connect
   Port 5: 0000.0503 highspeed power enable connect
   Port 6: 0000.0303 lowspeed power enable connect
   Port 7: 0000.0100 power
   Port 8: 0000.0103 power enable connect
   Port 9: 0000.0100 power
   Port 10: 0000.0100 power
   Port 11: 0000.0100 power
   Port 12: 0000.0100 power
Device Status:     0x0001
  Self Powered


* Re: Slow I/O on USB media after commit f664a3cc17b7d0a2bc3b3ab96181e1029b0ec0e6
  2019-11-23 15:44                                                       ` Andrea Vai
@ 2019-11-25  3:54                                                         ` Ming Lei
  2019-11-25 10:11                                                           ` Andrea Vai
  0 siblings, 1 reply; 102+ messages in thread
From: Ming Lei @ 2019-11-25  3:54 UTC (permalink / raw)
  To: Andrea Vai
  Cc: Damien Le Moal, Alan Stern, Jens Axboe, Johannes Thumshirn,
	USB list, SCSI development list, Himanshu Madhani,
	Hannes Reinecke, Omar Sandoval, Martin K. Petersen, Greg KH,
	Hans Holmberg, Kernel development list

[-- Attachment #1: Type: text/plain, Size: 400 bytes --]

On Sat, Nov 23, 2019 at 04:44:55PM +0100, Andrea Vai wrote:
> On Sat, 23/11/2019 at 15.28 +0800, Ming Lei wrote:
> > 
> > Please post the log of 'lsusb -v', and I will try to make a patch
> > for addressing the issue.
> 
> attached,

Please apply the attached patch, then rebuild, install, and reboot into
the new kernel.

This time, please don't switch the I/O scheduler.

Thanks,
Ming

[-- Attachment #2: usb.patch --]
[-- Type: text/plain, Size: 3616 bytes --]

diff --git a/block/blk-mq.c b/block/blk-mq.c
index 5c9adcaa27ac..eecb46020bfb 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -1436,7 +1436,13 @@ static void __blk_mq_delay_run_hw_queue(struct blk_mq_hw_ctx *hctx, bool async,
 	if (unlikely(blk_mq_hctx_stopped(hctx)))
 		return;
 
-	if (!async && !(hctx->flags & BLK_MQ_F_BLOCKING)) {
+	/*
+	 * Some single-queue devices may need to dispatch IO in order
+	 * which was guaranteed for the legacy queue via the big queue
+	 * lock. Now we rely on single hctx->run_work for that.
+	 */
+	if (!async && !(hctx->flags & (BLK_MQ_F_BLOCKING |
+					BLK_MQ_F_STRICT_DISPATCH_ORDER))) {
 		int cpu = get_cpu();
 		if (cpumask_test_cpu(cpu, hctx->cpumask)) {
 			__blk_mq_run_hw_queue(hctx);
@@ -3042,6 +3048,10 @@ int blk_mq_alloc_tag_set(struct blk_mq_tag_set *set)
 	if (!set->ops->get_budget ^ !set->ops->put_budget)
 		return -EINVAL;
 
+	if (set->queue_depth > 1 && (set->flags &
+				BLK_MQ_F_STRICT_DISPATCH_ORDER))
+		return -EINVAL;
+
 	if (set->queue_depth > BLK_MQ_MAX_DEPTH) {
 		pr_info("blk-mq: reduced tag depth to %u\n",
 			BLK_MQ_MAX_DEPTH);
diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
index d3d237a09a78..563188844143 100644
--- a/drivers/scsi/scsi_lib.c
+++ b/drivers/scsi/scsi_lib.c
@@ -1939,6 +1939,9 @@ int scsi_mq_setup_tags(struct Scsi_Host *shost)
 	shost->tag_set.flags = BLK_MQ_F_SHOULD_MERGE;
 	shost->tag_set.flags |=
 		BLK_ALLOC_POLICY_TO_MQ_FLAG(shost->hostt->tag_alloc_policy);
+	if (shost->hostt->strict_dispatch_order)
+		shost->tag_set.flags |= BLK_MQ_F_STRICT_DISPATCH_ORDER;
+
 	shost->tag_set.driver_data = shost;
 
 	return blk_mq_alloc_tag_set(&shost->tag_set);
diff --git a/drivers/usb/storage/scsiglue.c b/drivers/usb/storage/scsiglue.c
index 6737fab94959..77795edad8e8 100644
--- a/drivers/usb/storage/scsiglue.c
+++ b/drivers/usb/storage/scsiglue.c
@@ -661,6 +661,18 @@ static const struct scsi_host_template usb_stor_host_template = {
 	/* we do our own delay after a device or bus reset */
 	.skip_settle_delay =		1,
 
+
+	/*
+	 * Some USB storage, such as Kingston Technology DataTraveler 100
+	 * G3/G4/SE9 G2(ID 0951:1666), requires IO dispatched in the
+	 * sequential order, otherwise IO performance may drop drastically.
+	 *
+	 * can_queue is always 1, so we set .strict_dispatch_order for
+	 * USB mass storage HBA. Another reason is that other devices of
+	 * this kind may exist too.
+	 */
+	.strict_dispatch_order =	1,
+
 	/* sysfs device attributes */
 	.sdev_attrs =			sysfs_device_attr_list,
 
diff --git a/include/linux/blk-mq.h b/include/linux/blk-mq.h
index dc03e059fdff..844539690a27 100644
--- a/include/linux/blk-mq.h
+++ b/include/linux/blk-mq.h
@@ -388,6 +388,7 @@ struct blk_mq_ops {
 enum {
 	BLK_MQ_F_SHOULD_MERGE	= 1 << 0,
 	BLK_MQ_F_TAG_SHARED	= 1 << 1,
+	BLK_MQ_F_STRICT_DISPATCH_ORDER	= 1 << 2,
 	BLK_MQ_F_BLOCKING	= 1 << 5,
 	BLK_MQ_F_NO_SCHED	= 1 << 6,
 	BLK_MQ_F_ALLOC_POLICY_START_BIT = 8,
diff --git a/include/scsi/scsi_host.h b/include/scsi/scsi_host.h
index d4452d0ea3c7..f932d6fa1a4c 100644
--- a/include/scsi/scsi_host.h
+++ b/include/scsi/scsi_host.h
@@ -442,6 +442,13 @@ struct scsi_host_template {
 	/* True if the low-level driver supports blk-mq only */
 	unsigned force_blk_mq:1;
 
+	/*
+	 * True if the low-level driver needs IO to be dispatched in
+	 * the order provided by legacy IO path. The flag is only
+	 * valid for single queue device.
+	 */
+	unsigned strict_dispatch_order:1;
+
 	/*
 	 * Countdown for host blocking with no commands outstanding.
 	 */


* Re: Slow I/O on USB media after commit f664a3cc17b7d0a2bc3b3ab96181e1029b0ec0e6
  2019-11-25  3:54                                                         ` Ming Lei
@ 2019-11-25 10:11                                                           ` Andrea Vai
  2019-11-25 10:29                                                             ` Ming Lei
  0 siblings, 1 reply; 102+ messages in thread
From: Andrea Vai @ 2019-11-25 10:11 UTC (permalink / raw)
  To: Ming Lei
  Cc: Damien Le Moal, Alan Stern, Jens Axboe, Johannes Thumshirn,
	USB list, SCSI development list, Himanshu Madhani,
	Hannes Reinecke, Omar Sandoval, Martin K. Petersen, Greg KH,
	Hans Holmberg, Kernel development list

On Mon, 25/11/2019 at 11.54 +0800, Ming Lei wrote:
> On Sat, Nov 23, 2019 at 04:44:55PM +0100, Andrea Vai wrote:
> > On Sat, 23/11/2019 at 15.28 +0800, Ming Lei wrote:
> > > 
> > > Please post the log of 'lsusb -v', and I will try to make a patch
> > > for addressing the issue.
> > 
> > attached,
> 
> Please apply the attached patch, and re-build & install & reboot
> kernel.
> 
> This time, please don't switch io scheduler.

# patch -p1 < usb.patch outputs:

(Stripping trailing CRs from patch; use --binary to disable.)
patching file block/blk-mq.c
Hunk #1 succeeded at 1465 (offset 29 lines).
Hunk #2 succeeded at 3061 (offset 13 lines).
(Stripping trailing CRs from patch; use --binary to disable.)
patching file drivers/scsi/scsi_lib.c
Hunk #1 succeeded at 1902 (offset -37 lines).
(Stripping trailing CRs from patch; use --binary to disable.)
patching file drivers/usb/storage/scsiglue.c
Hunk #1 succeeded at 651 (offset -10 lines).
(Stripping trailing CRs from patch; use --binary to disable.)
patching file include/linux/blk-mq.h
Hunk #1 succeeded at 226 (offset -162 lines).
(Stripping trailing CRs from patch; use --binary to disable.)
patching file include/scsi/scsi_host.h
patch unexpectedly ends in middle of line
patch unexpectedly ends in middle of line

Just to be sure before I go on: is this correct? It looks like an error,
but I don't know if it is important.

Thanks,
Andrea



* Re: Slow I/O on USB media after commit f664a3cc17b7d0a2bc3b3ab96181e1029b0ec0e6
  2019-11-25 10:11                                                           ` Andrea Vai
@ 2019-11-25 10:29                                                             ` Ming Lei
  2019-11-25 14:58                                                               ` Andrea Vai
  0 siblings, 1 reply; 102+ messages in thread
From: Ming Lei @ 2019-11-25 10:29 UTC (permalink / raw)
  To: Andrea Vai
  Cc: Damien Le Moal, Alan Stern, Jens Axboe, Johannes Thumshirn,
	USB list, SCSI development list, Himanshu Madhani,
	Hannes Reinecke, Omar Sandoval, Martin K. Petersen, Greg KH,
	Hans Holmberg, Kernel development list

On Mon, Nov 25, 2019 at 11:11:00AM +0100, Andrea Vai wrote:
> On Mon, 25/11/2019 at 11.54 +0800, Ming Lei wrote:
> > On Sat, Nov 23, 2019 at 04:44:55PM +0100, Andrea Vai wrote:
> > > On Sat, 23/11/2019 at 15.28 +0800, Ming Lei wrote:
> > > > 
> > > > Please post the log of 'lsusb -v', and I will try to make a patch
> > > > for addressing the issue.
> > > 
> > > attached,
> > 
> > Please apply the attached patch, and re-build & install & reboot
> > kernel.
> > 
> > This time, please don't switch io scheduler.
> 
> # patch -p1 < usb.patch outputs:
> 
> (Stripping trailing CRs from patch; use --binary to disable.)
> patching file block/blk-mq.c
> Hunk #1 succeeded at 1465 (offset 29 lines).
> Hunk #2 succeeded at 3061 (offset 13 lines).
> (Stripping trailing CRs from patch; use --binary to disable.)
> patching file drivers/scsi/scsi_lib.c
> Hunk #1 succeeded at 1902 (offset -37 lines).
> (Stripping trailing CRs from patch; use --binary to disable.)
> patching file drivers/usb/storage/scsiglue.c
> Hunk #1 succeeded at 651 (offset -10 lines).
> (Stripping trailing CRs from patch; use --binary to disable.)
> patching file include/linux/blk-mq.h
> Hunk #1 succeeded at 226 (offset -162 lines).
> (Stripping trailing CRs from patch; use --binary to disable.)
> patching file include/scsi/scsi_host.h
> patch unexpectedly ends in middle of line
> patch unexpectedly ends in middle of line
> 
> Just to be sure I have to go on, is this correct? Sounds like an error
> but I don't know if it is important.

It looks like there was a small conflict, but patch has fixed it up, so
the result is correct; please go on with your test.

Thanks,
Ming



* Re: Slow I/O on USB media after commit f664a3cc17b7d0a2bc3b3ab96181e1029b0ec0e6
  2019-11-25 10:29                                                             ` Ming Lei
@ 2019-11-25 14:58                                                               ` Andrea Vai
  2019-11-25 15:15                                                                 ` Ming Lei
  0 siblings, 1 reply; 102+ messages in thread
From: Andrea Vai @ 2019-11-25 14:58 UTC (permalink / raw)
  To: Ming Lei
  Cc: Damien Le Moal, Alan Stern, Jens Axboe, Johannes Thumshirn,
	USB list, SCSI development list, Himanshu Madhani,
	Hannes Reinecke, Omar Sandoval, Martin K. Petersen, Greg KH,
	Hans Holmberg, Kernel development list

On Mon, 25/11/2019 at 18.29 +0800, Ming Lei wrote:
> On Mon, Nov 25, 2019 at 11:11:00AM +0100, Andrea Vai wrote:
> > On Mon, 25/11/2019 at 11.54 +0800, Ming Lei wrote:
> > > On Sat, Nov 23, 2019 at 04:44:55PM +0100, Andrea Vai wrote:
> > > > On Sat, 23/11/2019 at 15.28 +0800, Ming Lei wrote:
> > > > > 
> > > > > Please post the log of 'lsusb -v', and I will try to make a patch
> > > > > for addressing the issue.
> > > > 
> > > > attached,
> > > 
> > > Please apply the attached patch, and re-build & install & reboot
> > > kernel.
> > > 
> > > This time, please don't switch io scheduler.
> > 
> > # patch -p1 < usb.patch outputs:
> > 
> > (Stripping trailing CRs from patch; use --binary to disable.)
> > patching file block/blk-mq.c
> > Hunk #1 succeeded at 1465 (offset 29 lines).
> > Hunk #2 succeeded at 3061 (offset 13 lines).
> > (Stripping trailing CRs from patch; use --binary to disable.)
> > patching file drivers/scsi/scsi_lib.c
> > Hunk #1 succeeded at 1902 (offset -37 lines).
> > (Stripping trailing CRs from patch; use --binary to disable.)
> > patching file drivers/usb/storage/scsiglue.c
> > Hunk #1 succeeded at 651 (offset -10 lines).
> > (Stripping trailing CRs from patch; use --binary to disable.)
> > patching file include/linux/blk-mq.h
> > Hunk #1 succeeded at 226 (offset -162 lines).
> > (Stripping trailing CRs from patch; use --binary to disable.)
> > patching file include/scsi/scsi_host.h
> > patch unexpectedly ends in middle of line
> > patch unexpectedly ends in middle of line
> > 
> > Just to be sure I have to go on, is this correct? Sounds like an error
> > but I don't know if it is important.
> 
> Looks there is small conflict, however it has been fixed by patch, so
> it is correct, please go on your test.

Done, it still fails (2000 seconds or more to copy 1GB) :-(

cat /sys/block/sdf/queue/scheduler outputs:
[mq-deadline] none
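
In this sysfs file the bracketed name is the active scheduler, so
mq-deadline is in use here. A minimal sketch for extracting it from such
a line; the helper name active_scheduler is hypothetical, not an existing
tool:

```shell
#!/bin/sh
# Hypothetical helper: print the active I/O scheduler, given the contents
# of /sys/block/<disk>/queue/scheduler (the active one is in brackets).
active_scheduler() {
    printf '%s\n' "$1" | sed -n 's/.*\[\([^]]*\)\].*/\1/p'
}

# Example with the line reported above.
active_scheduler "[mq-deadline] none"
```

On a live system it could be called as
active_scheduler "$(cat /sys/block/sdf/queue/scheduler)".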

What to try next?

Thanks,
Andrea



* Re: Slow I/O on USB media after commit f664a3cc17b7d0a2bc3b3ab96181e1029b0ec0e6
  2019-11-25 14:58                                                               ` Andrea Vai
@ 2019-11-25 15:15                                                                 ` Ming Lei
  2019-11-25 18:51                                                                   ` Andrea Vai
  0 siblings, 1 reply; 102+ messages in thread
From: Ming Lei @ 2019-11-25 15:15 UTC (permalink / raw)
  To: Andrea Vai
  Cc: Damien Le Moal, Alan Stern, Jens Axboe, Johannes Thumshirn,
	USB list, SCSI development list, Himanshu Madhani,
	Hannes Reinecke, Omar Sandoval, Martin K. Petersen, Greg KH,
	Hans Holmberg, Kernel development list

On Mon, Nov 25, 2019 at 03:58:34PM +0100, Andrea Vai wrote:
> On Mon, 25/11/2019 at 18.29 +0800, Ming Lei wrote:
> > On Mon, Nov 25, 2019 at 11:11:00AM +0100, Andrea Vai wrote:
> > > On Mon, 25/11/2019 at 11.54 +0800, Ming Lei wrote:
> > > > On Sat, Nov 23, 2019 at 04:44:55PM +0100, Andrea Vai wrote:
> > > > > On Sat, 23/11/2019 at 15.28 +0800, Ming Lei wrote:
> > > > > > 
> > > > > > Please post the log of 'lsusb -v', and I will try to make a patch
> > > > > > for addressing the issue.
> > > > > 
> > > > > attached,
> > > > 
> > > > Please apply the attached patch, and re-build & install & reboot
> > > > kernel.
> > > > 
> > > > This time, please don't switch io scheduler.
> > > 
> > > # patch -p1 < usb.patch outputs:
> > > 
> > > (Stripping trailing CRs from patch; use --binary to disable.)
> > > patching file block/blk-mq.c
> > > Hunk #1 succeeded at 1465 (offset 29 lines).
> > > Hunk #2 succeeded at 3061 (offset 13 lines).
> > > (Stripping trailing CRs from patch; use --binary to disable.)
> > > patching file drivers/scsi/scsi_lib.c
> > > Hunk #1 succeeded at 1902 (offset -37 lines).
> > > (Stripping trailing CRs from patch; use --binary to disable.)
> > > patching file drivers/usb/storage/scsiglue.c
> > > Hunk #1 succeeded at 651 (offset -10 lines).
> > > (Stripping trailing CRs from patch; use --binary to disable.)
> > > patching file include/linux/blk-mq.h
> > > Hunk #1 succeeded at 226 (offset -162 lines).
> > > (Stripping trailing CRs from patch; use --binary to disable.)
> > > patching file include/scsi/scsi_host.h
> > > patch unexpectedly ends in middle of line
> > > patch unexpectedly ends in middle of line
> > > 
> > > Just to be sure I have to go on, is this correct? Sounds like an error
> > > but I don't know if it is important.
> > 
> > Looks there is small conflict, however it has been fixed by patch, so
> > it is correct, please go on your test.
> 
> Done, it still fails (2000 seconds or more to copy 1GB) :-(
> 
> cat /sys/block/sdf/queue/scheduler outputs:
> [mq-deadline] none
> 
> What to try next?

1) cat /sys/kernel/debug/block/$DISK/hctx0/flags

Note: replace $DISK with the disk name of your USB drive; for example, if it
is /dev/sdb, use sdb.

2) echo 128 > /sys/block/$DISK/queue/nr_requests, then run your 1GB copy
test again.


Thanks, 
Ming



* Re: Slow I/O on USB media after commit f664a3cc17b7d0a2bc3b3ab96181e1029b0ec0e6
  2019-11-25 15:15                                                                 ` Ming Lei
@ 2019-11-25 18:51                                                                   ` Andrea Vai
  2019-11-26  2:32                                                                     ` Ming Lei
  0 siblings, 1 reply; 102+ messages in thread
From: Andrea Vai @ 2019-11-25 18:51 UTC (permalink / raw)
  To: Ming Lei
  Cc: Damien Le Moal, Alan Stern, Jens Axboe, Johannes Thumshirn,
	USB list, SCSI development list, Himanshu Madhani,
	Hannes Reinecke, Omar Sandoval, Martin K. Petersen, Greg KH,
	Hans Holmberg, Kernel development list

On Mon, 25/11/2019 at 23.15 +0800, Ming Lei wrote:
> On Mon, Nov 25, 2019 at 03:58:34PM +0100, Andrea Vai wrote:
> 
> [...]
> 
> > What to try next?
> 
> 1) cat /sys/kernel/debug/block/$DISK/hctx0/flags
result:

alloc_policy=FIFO SHOULD_MERGE|2
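
The trailing "2" is presumably a bit index: as far as I know, the blk-mq
debugfs code prints a set flag bit that has no name in its table as the
bit number, and the patch defines BLK_MQ_F_STRICT_DISPATCH_ORDER as
1 << 2. Under that assumption, this output suggests the new flag is set
on the hctx. A small sketch of the check, with a hypothetical helper
name:

```shell
#!/bin/sh
# Hypothetical helper: succeed if the flags line from
# /sys/kernel/debug/block/<disk>/hctx0/flags lists bit index 2.
# Assumption: unnamed set bits are printed as their bit number.
has_strict_order_bit() {
    flags=${1##* }            # last field, e.g. "SHOULD_MERGE|2"
    case "|$flags|" in
        *"|2|"*) return 0 ;;
        *)       return 1 ;;
    esac
}

# Example with the line reported above.
if has_strict_order_bit "alloc_policy=FIFO SHOULD_MERGE|2"; then
    echo "bit 2 is set"
fi
```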

> 
> 
> 2) echo 128 > /sys/block/$DISK/queue/nr_requests and run your copy 1GB
> test again.

Done, and it still fails. What to try next?

Thanks,
Andrea



* Re: Slow I/O on USB media after commit f664a3cc17b7d0a2bc3b3ab96181e1029b0ec0e6
  2019-11-25 18:51                                                                   ` Andrea Vai
@ 2019-11-26  2:32                                                                     ` Ming Lei
  2019-11-26  7:46                                                                       ` Andrea Vai
  0 siblings, 1 reply; 102+ messages in thread
From: Ming Lei @ 2019-11-26  2:32 UTC (permalink / raw)
  To: Andrea Vai
  Cc: Damien Le Moal, Alan Stern, Jens Axboe, Johannes Thumshirn,
	USB list, SCSI development list, Himanshu Madhani,
	Hannes Reinecke, Omar Sandoval, Martin K. Petersen, Greg KH,
	Hans Holmberg, Kernel development list

On Mon, Nov 25, 2019 at 07:51:33PM +0100, Andrea Vai wrote:
> On Mon, 25/11/2019 at 23.15 +0800, Ming Lei wrote:
> > On Mon, Nov 25, 2019 at 03:58:34PM +0100, Andrea Vai wrote:
> > 
> > [...]
> > 
> > > What to try next?
> > 
> > 1) cat /sys/kernel/debug/block/$DISK/hctx0/flags
> result:
> 
> alloc_policy=FIFO SHOULD_MERGE|2
> 
> > 
> > 
> > 2) echo 128 > /sys/block/$DISK/queue/nr_requests and run your copy 1GB
> > test again.
> 
> done, and still fails. What to try next?

I just ran a 256M cp test to a USB storage device on the patched kernel,
and the WRITE data IO really is issued in ascending order. The filesystem is
ext4, mounted without '-o sync'. From the previous discussion, that looks like
exactly your test setup. The order can be observed via the following script:

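#!/bin/sh
# Arguments: the major and minor device numbers of the USB disk.
MAJ=$1
MIN=$2
# The kernel's internal dev_t keeps the major number above 20 minor bits.
MAJ=$(( MAJ << 20 ))
DEV=$(( MAJ | MIN ))
# Print rwbs, start sector and sector count of each request issued to DEV.
/usr/share/bcc/tools/trace -t -C \
  't:block:block_rq_issue (args->dev == '$DEV') "%s %d %d", args->rwbs, args->sector, args->nr_sector'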

$MAJ & $MIN can be retrieved via lsblk for your USB storage disk.
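
A minimal sketch of the device-number arithmetic used by the script
above, starting from a MAJ:MIN pair in the form lsblk prints. The helper
name dev_from_majmin and the pair 8:80 are only illustrative:

```shell
#!/bin/sh
# Hypothetical helper: convert "MAJ:MIN" into the number the tracepoint
# filter compares against, i.e. (major << 20) | minor.
dev_from_majmin() {
    pair=$(printf '%s' "$1" | tr -d ' ')   # lsblk may pad with spaces
    maj=${pair%%:*}
    min=${pair#*:}
    echo $(( (maj << 20) | min ))
}

# Example with an illustrative pair.
dev_from_majmin "8:80"
```

With lsblk, the pair for a whole disk can be read with something like
lsblk -dno MAJ:MIN /dev/sdX, where sdX stands for the USB disk.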

So I think we need to check if the patch is applied correctly first.

If your kernel tree is managed via git, please post 'git diff'.
Otherwise, tell us your kernel version, and I will send you a patch
backported to that version.

In the meantime, you can collect the IO order log via the above script as you
did last time, then send us the log.
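
Once such a log exists, a rough way to see whether writes were issued in
ascending order is to count how often a write starts at a lower sector
than the previous one. This is only a sketch: it assumes each log line
ends with the three fields from the format string above (rwbs, sector,
nr_sector), and the helper name is hypothetical:

```shell
#!/bin/sh
# Hypothetical helper: read trace lines on stdin and print how many write
# requests start at a lower sector than the write before them.
count_backward_writes() {
    awk 'NF >= 3 && $(NF-2) ~ /^W/ {
             sector = $(NF-1) + 0
             if (seen && sector < prev) backward++
             prev = sector; seen = 1
         }
         END { print backward + 0 }'
}

# Example with made-up lines: the third write jumps backwards.
printf '%s\n' 'W 100 8' 'W 108 8' 'W 50 8' 'W 58 8' | count_backward_writes
```

A count near zero would match the ascending order described above; a
large count would point to out-of-order dispatch.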

Thanks,
Ming



* Re: Slow I/O on USB media after commit f664a3cc17b7d0a2bc3b3ab96181e1029b0ec0e6
  2019-11-26  2:32                                                                     ` Ming Lei
@ 2019-11-26  7:46                                                                       ` Andrea Vai
  2019-11-26  9:15                                                                         ` Ming Lei
  2019-11-27  0:21                                                                         ` Finn Thain
  0 siblings, 2 replies; 102+ messages in thread
From: Andrea Vai @ 2019-11-26  7:46 UTC (permalink / raw)
  To: Ming Lei
  Cc: Damien Le Moal, Alan Stern, Jens Axboe, Johannes Thumshirn,
	USB list, SCSI development list, Himanshu Madhani,
	Hannes Reinecke, Omar Sandoval, Martin K. Petersen, Greg KH,
	Hans Holmberg, Kernel development list

[-- Attachment #1: Type: text/plain, Size: 2944 bytes --]

On Tue, 26/11/2019 at 10.32 +0800, Ming Lei wrote:
> On Mon, Nov 25, 2019 at 07:51:33PM +0100, Andrea Vai wrote:
> > On Mon, 25/11/2019 at 23.15 +0800, Ming Lei wrote:
> > > On Mon, Nov 25, 2019 at 03:58:34PM +0100, Andrea Vai wrote:
> > > 
> > > [...]
> > > 
> > > > What to try next?
> > > 
> > > 1) cat /sys/kernel/debug/block/$DISK/hctx0/flags
> > result:
> > 
> > alloc_policy=FIFO SHOULD_MERGE|2
> > 
> > > 
> > > 
> > > 2) echo 128 > /sys/block/$DISK/queue/nr_requests and run your copy 1GB
> > > test again.
> > 
> > done, and still fails. What to try next?
> 
> I just run 256M cp test

I would like to point out that 256MB is a file size that usually doesn't
trigger the issue (I don't know if it matters, sorry).

Another piece of information is a strange behavior I noticed: yesterday
I ran the test twice (as usual with a 1GB file); the runs took 2370s
and 1786s, and a third run was in progress when I stopped it. Then I
started another set of 100 trials and let them run overnight: the first
10 trials took around 1000s, then the times gradually decreased to
~300s, and finally settled around 200s, with some trials below 70-80s.
That is to say, times are extremely variable, and for the first time I
noticed a sort of "performance increase" over time.
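
To put these times in perspective, they can be converted to average
throughput. A minimal sketch, assuming the 1GB file is 1024 MB; the
helper name throughput_mbs is hypothetical:

```shell
#!/bin/sh
# Hypothetical helper: average throughput in MB/s for copying size_mb
# megabytes in secs seconds.
throughput_mbs() {
    awk -v mb="$1" -v s="$2" 'BEGIN { printf "%.2f\n", mb / s }'
}

# Copy times mentioned above, for an assumed 1024 MB file.
for secs in 2370 1786 1000 300 200 70; do
    printf '%5ss -> %s MB/s\n' "$secs" "$(throughput_mbs 1024 "$secs")"
done
```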

>  to one USB storage device on patched kernel,
> and WRITE data IO is really in ascending order. The filesystem is
> ext4,
> and mount without '-o sync'. From previous discussion, looks that is
> exactly your test setting. The order can be observed via the
> following script:
> 
> #!/bin/sh
> MAJ=$1
> MIN=$2
> MAJ=$(( $MAJ << 20 ))
> DEV=$(( $MAJ | $MIN ))
> /usr/share/bcc/tools/trace -t -C \
>   't:block:block_rq_issue (args->dev == '$DEV') "%s %d %d", args->rwbs, args->sector, args->nr_sector'
> 
> $MAJ & $MIN can be retrieved via lsblk for your USB storage disk.
> 
> So I think we need to check if the patch is applied correctly first.
> 
> If your kernel tree is managed via git,
yes it is,

>  please post 'git diff'.
attached. Is it correctly patched? thanks.


> Otherwise, share us your kernel version,
By the way, it is 5.4.0+.

>  and I will send you one
> backported patch on the kernel version.
> 
> Meantime, you can collect IO order log via the above script as you
> did last
> time, then send us the log.

OK, I will try. Is it enough to run it for a short time (say, a few
seconds) during the copy, or should I start it before the copy begins
(or before the mount?) and stop it after the copy ends? Please note
that the latter would involve a large amount of time (and data, I
suppose) because, as I said, I have to use a large file to be sure the
problem triggers; but we can try to better understand and tune this.
If it helps, an ods file with the complete statistics is available at
[1] (see the "prove_nov19" sheet).

Thanks,
Andrea

[1]: http://fisica.unipv.it/transfer/kernelstats.zip

[-- Attachment #2: git_diff.txt --]
[-- Type: text/plain, Size: 6424 bytes --]

# git diff
diff --git a/block/blk-mq.c b/block/blk-mq.c
index ec791156e9cc..92d60a5e1d15 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -1465,7 +1465,13 @@ static void __blk_mq_delay_run_hw_queue(struct blk_mq_hw_ctx *hctx, bool async,
        if (unlikely(blk_mq_hctx_stopped(hctx)))
                return;
 
-       if (!async && !(hctx->flags & BLK_MQ_F_BLOCKING)) {
+       /*
+        * Some single-queue devices may need to dispatch IO in order
+        * which was guaranteed for the legacy queue via the big queue
+        * lock. Now we rely on single hctx->run_work for that.
+        */
+       if (!async && !(hctx->flags & (BLK_MQ_F_BLOCKING |
+                                       BLK_MQ_F_STRICT_DISPATCH_ORDER))) {
                int cpu = get_cpu();
                if (cpumask_test_cpu(cpu, hctx->cpumask)) {
                        __blk_mq_run_hw_queue(hctx);
@@ -3055,6 +3061,10 @@ int blk_mq_alloc_tag_set(struct blk_mq_tag_set *set)
        if (!set->ops->get_budget ^ !set->ops->put_budget)
                return -EINVAL;
 
+       if (set->queue_depth > 1 && (set->flags &
+                               BLK_MQ_F_STRICT_DISPATCH_ORDER))
+               return -EINVAL;
+
        if (set->queue_depth > BLK_MQ_MAX_DEPTH) {
                pr_info("blk-mq: reduced tag depth to %u\n",
                        BLK_MQ_MAX_DEPTH);
diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
index 91c007d26c1e..f013630275c9 100644
--- a/drivers/scsi/scsi_lib.c
+++ b/drivers/scsi/scsi_lib.c
@@ -1902,6 +1902,9 @@ int scsi_mq_setup_tags(struct Scsi_Host *shost)
        shost->tag_set.flags = BLK_MQ_F_SHOULD_MERGE;
        shost->tag_set.flags |=
                BLK_ALLOC_POLICY_TO_MQ_FLAG(shost->hostt->tag_alloc_policy);
+       if (shost->hostt->strict_dispatch_order)
+               shost->tag_set.flags |= BLK_MQ_F_STRICT_DISPATCH_ORDER;
+
        shost->tag_set.driver_data = shost;
 
        return blk_mq_alloc_tag_set(&shost->tag_set);
diff --git a/drivers/usb/storage/scsiglue.c b/drivers/usb/storage/scsiglue.c
index 54a3c8195c96..df1674d7f0fc 100644
--- a/drivers/usb/storage/scsiglue.c
+++ b/drivers/usb/storage/scsiglue.c
@@ -651,6 +651,18 @@ static const struct scsi_host_template usb_stor_host_template = {
        /* we do our own delay after a device or bus reset */
        .skip_settle_delay =            1,
diff --git a/block/blk-mq.c b/block/blk-mq.c
index ec791156e9cc..92d60a5e1d15 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -1465,7 +1465,13 @@ static void __blk_mq_delay_run_hw_queue(struct blk_mq_hw_ctx *hctx, bool async,
        if (unlikely(blk_mq_hctx_stopped(hctx)))
                return;
 
-       if (!async && !(hctx->flags & BLK_MQ_F_BLOCKING)) {
+       /*
+        * Some single-queue devices may need to dispatch IO in order
+        * which was guaranteed for the legacy queue via the big queue
+        * lock. Now we rely on single hctx->run_work for that.
+        */
+       if (!async && !(hctx->flags & (BLK_MQ_F_BLOCKING |
+                                       BLK_MQ_F_STRICT_DISPATCH_ORDER))) {
                int cpu = get_cpu();
                if (cpumask_test_cpu(cpu, hctx->cpumask)) {
                        __blk_mq_run_hw_queue(hctx);
@@ -3055,6 +3061,10 @@ int blk_mq_alloc_tag_set(struct blk_mq_tag_set *set)
        if (!set->ops->get_budget ^ !set->ops->put_budget)
                return -EINVAL;
 
+       if (set->queue_depth > 1 && (set->flags &
+                               BLK_MQ_F_STRICT_DISPATCH_ORDER))
+               return -EINVAL;
+
        if (set->queue_depth > BLK_MQ_MAX_DEPTH) {
                pr_info("blk-mq: reduced tag depth to %u\n",
                        BLK_MQ_MAX_DEPTH);
diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
index 91c007d26c1e..f013630275c9 100644
--- a/drivers/scsi/scsi_lib.c
+++ b/drivers/scsi/scsi_lib.c
@@ -1902,6 +1902,9 @@ int scsi_mq_setup_tags(struct Scsi_Host *shost)
        shost->tag_set.flags = BLK_MQ_F_SHOULD_MERGE;
        shost->tag_set.flags |=
                BLK_ALLOC_POLICY_TO_MQ_FLAG(shost->hostt->tag_alloc_policy);
+       if (shost->hostt->strict_dispatch_order)
+               shost->tag_set.flags |= BLK_MQ_F_STRICT_DISPATCH_ORDER;
+
        shost->tag_set.driver_data = shost;
 
        return blk_mq_alloc_tag_set(&shost->tag_set);
diff --git a/drivers/usb/storage/scsiglue.c b/drivers/usb/storage/scsiglue.c
index 54a3c8195c96..df1674d7f0fc 100644
--- a/drivers/usb/storage/scsiglue.c
+++ b/drivers/usb/storage/scsiglue.c
@@ -651,6 +651,18 @@ static const struct scsi_host_template usb_stor_host_template = {
        /* we do our own delay after a device or bus reset */
        .skip_settle_delay =            1,
 
+
+       /*
+        * Some USB storage, such as Kingston Technology DataTraveler 100
+        * G3/G4/SE9 G2(ID 0951:1666), requires IO dispatched in the
+        * sequential order, otherwise IO performance may drop drastically.
+        *
+        * can_queue is always 1, so we set .strict_dispatch_order for
+        * USB mass storage HBA. Another reason is that there can be such
+        * kind of devices too.
+        */
+       .strict_dispatch_order =        1,
+
        /* sysfs device attributes */
        .sdev_attrs =                   sysfs_device_attr_list,
 
diff --git a/include/linux/blk-mq.h b/include/linux/blk-mq.h
index 0bf056de5cc3..89b1c28da36a 100644
--- a/include/linux/blk-mq.h
+++ b/include/linux/blk-mq.h
@@ -226,6 +226,7 @@ struct blk_mq_ops {
 enum {
        BLK_MQ_F_SHOULD_MERGE   = 1 << 0,
        BLK_MQ_F_TAG_SHARED     = 1 << 1,
+       BLK_MQ_F_STRICT_DISPATCH_ORDER  = 1 << 2,
        BLK_MQ_F_BLOCKING       = 1 << 5,
        BLK_MQ_F_NO_SCHED       = 1 << 6,
        BLK_MQ_F_ALLOC_POLICY_START_BIT = 8,
diff --git a/include/scsi/scsi_host.h b/include/scsi/scsi_host.h
index 31e0d6ca1eba..dbcbc9ef6993 100644
--- a/include/scsi/scsi_host.h
+++ b/include/scsi/scsi_host.h
@@ -442,6 +442,13 @@ struct scsi_host_template {
        /* True if the low-level driver supports blk-mq only */
        unsigned force_blk_mq:1;
 
+       /*
+        * True if the low-level driver needs IO to be dispatched in
+        * the order provided by legacy IO path. The flag is only
+        * valid for single queue device.
+        */
+       unsigned strict_dispatch_order:1;
+
        /*
         * Countdown for host blocking with no commands outstanding.
         */



* Re: Slow I/O on USB media after commit f664a3cc17b7d0a2bc3b3ab96181e1029b0ec0e6
  2019-11-26  7:46                                                                       ` Andrea Vai
@ 2019-11-26  9:15                                                                         ` Ming Lei
  2019-11-26 10:24                                                                           ` Ming Lei
  2019-11-26 11:14                                                                           ` Andrea Vai
  2019-11-27  0:21                                                                         ` Finn Thain
  1 sibling, 2 replies; 102+ messages in thread
From: Ming Lei @ 2019-11-26  9:15 UTC (permalink / raw)
  To: Andrea Vai
  Cc: Damien Le Moal, Alan Stern, Jens Axboe, Johannes Thumshirn,
	USB list, SCSI development list, Himanshu Madhani,
	Hannes Reinecke, Omar Sandoval, Martin K. Petersen, Greg KH,
	Hans Holmberg, Kernel development list

On Tue, Nov 26, 2019 at 08:46:07AM +0100, Andrea Vai wrote:
> On Tue, 26/11/2019 at 10.32 +0800, Ming Lei wrote:
> > On Mon, Nov 25, 2019 at 07:51:33PM +0100, Andrea Vai wrote:
> > > On Mon, 25/11/2019 at 23.15 +0800, Ming Lei wrote:
> > > > On Mon, Nov 25, 2019 at 03:58:34PM +0100, Andrea Vai wrote:
> > > > 
> > > > [...]
> > > > 
> > > > > What to try next?
> > > > 
> > > > 1) cat /sys/kernel/debug/block/$DISK/hctx0/flags
> > > result:
> > > 
> > > alloc_policy=FIFO SHOULD_MERGE|2
> > > 
> > > > 
> > > > 
> > > > 2) echo 128 > /sys/block/$DISK/queue/nr_requests and run your copy
> > > > 1GB test again.
> > > 
> > > done, and still fails. What to try next?
> > 
> > I just run 256M cp test
> 
> I would like to point out that 256MB is a filesize that usually don't
> trigger the issue (don't know if it matters, sorry).

OK.

I tested 256M because an IO timeout is often triggered with qemu-ehci,
which is a long-standing issue. When the disk is set up via xhci-qemu,
the max request size increases from 120KB to 1MB, and the IO pattern
changes too. When the disk is connected via uhci-qemu, the transfer is
too slow (1MB/s) because the max endpoint size is too small.

However, I just waited 16min and collected the complete IO log of a
1GB copy with the disk connected over uhci-qemu, and the sectors of
the data IOs are still in order.
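
One way to check such an ordering claim against a saved trace log is
sketched below. It assumes the log was produced by the trace script
quoted in this thread, so that every event line ends with the three
fields of its format string (rwbs, sector, nr_sector). The function
name count_backward_writes is hypothetical, and header or other
non-event lines are skipped on a best-effort basis.

```shell
#!/bin/sh
# Count WRITE requests whose start sector is lower than that of the
# previous WRITE. Assumption: every event line ends with
# "<rwbs> <sector> <nr_sector>", matching the format string "%s %d %d".
count_backward_writes() {
    awk '
        # Keep only lines whose last two fields are numeric and whose
        # rwbs field (third from last) contains W.
        NF >= 3 && $(NF-1) ~ /^[0-9]+$/ && $NF ~ /^[0-9]+$/ && $(NF-2) ~ /W/ {
            sector = $(NF-1) + 0
            if (seen && sector < prev) backward++
            prev = sector
            seen = 1
        }
        END { print backward + 0 }
    ' "$1"
}
```

Running "count_backward_writes trace.log" (file name illustrative)
prints 0 when every WRITE starts at or above the previous one. A
non-zero count is not necessarily a dispatch-order problem, since
filesystem metadata or journal writes may legitimately target lower
sectors.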

> 
> Another info I would provide is about another strange behavior I
> noticed: yesterday I ran the test two times (as usual with 1GB
> filesize) and took 2370s, 1786s, and a third test was going on when I
> stopped it. Then I started another set of 100 trials and let them run
> tonight, and the first 10 trials were around 1000s, then gradually
> decreased to ~300s, and finally settled around 200s with some trials
> below 70-80s. This to say, times are extremely variable and for the
> first time I noticed a sort of "performance increase" with time.

The 'cp' test uses buffered IO. Can you reproduce it every time by
running the copy right after a fresh mount of the USB disk?

> 
> >  to one USB storage device on patched kernel,
> > and WRITE data IO is really in ascending order. The filesystem is
> > ext4,
> > and mount without '-o sync'. From previous discussion, looks that is
> > exactly your test setting. The order can be observed via the
> > following script:
> > 
> > #!/bin/sh
> > MAJ=$1
> > MIN=$2
> > MAJ=$(( $MAJ << 20 ))
> > DEV=$(( $MAJ | $MIN ))
> > /usr/share/bcc/tools/trace -t -C \
> >   't:block:block_rq_issue (args->dev == '$DEV') "%s %d %d", args->rwbs, args->sector, args->nr_sector'
> > 
> > $MAJ & $MIN can be retrieved via lsblk for your USB storage disk.
> > 
> > So I think we need to check if the patch is applied correctly first.
> > 
> > If your kernel tree is managed via git,
> yes it is,
> 
> >  please post 'git diff'.
> attached. Is it correctly patched? thanks.

Yeah, it should be correct, except that the change to
__blk_mq_delay_run_hw_queue() is duplicated.

> 
> 
> > Otherwise, share us your kernel version,
> btw, is 5.4.0+
> 
> >  and I will send you one
> > backported patch on the kernel version.
> > 
> > Meantime, you can collect IO order log via the above script as you
> > did last
> > time, then send us the log.
> 
> ok, will try; is it just required to run it for a short period of time
> (say, some seconds) during the copy, or should I run it before the
> beginning (or before the mount?), and terminate it after the end of
> the copy? (Please note that in the latter case a large amount of time
> (and data, I suppose) would be involved, because, as said, to be sure
> the problem triggers I have to use a large file... but we can try to
> better understand and tune this. If it can help, you can get an ods
> file with the complete statistic at [1] (look at the "prove_nov19"
> sheet)).

The data won't be very big: each line covers 120KB, so ~10K lines
are enough to cover a 1GB transfer. A ~300KB compressed file should
then hold the whole trace.
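
The line count can be sanity-checked with a little arithmetic. This is
only a sketch: it assumes one trace line per request and that every
request is exactly 120KB, and the function name trace_lines is
hypothetical.

```shell
#!/bin/sh
# Estimate how many trace lines a transfer produces, assuming one line
# per request and a fixed request size (both values are illustrative).
trace_lines() {
    total_kb=$1     # transfer size in KiB
    req_kb=$2       # size of each request in KiB
    # Round up so that a final partial request is counted too.
    echo $(( (total_kb + req_kb - 1) / req_kb ))
}

trace_lines $(( 1024 * 1024 )) 120    # prints 8739
```

Under these assumptions a 1GB copy yields roughly 8.7K lines, which is
in line with the ~10K estimate above.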


Thanks, 
Ming



* Re: Slow I/O on USB media after commit f664a3cc17b7d0a2bc3b3ab96181e1029b0ec0e6
  2019-11-26  9:15                                                                         ` Ming Lei
@ 2019-11-26 10:24                                                                           ` Ming Lei
  2019-11-26 11:14                                                                           ` Andrea Vai
  1 sibling, 0 replies; 102+ messages in thread
From: Ming Lei @ 2019-11-26 10:24 UTC (permalink / raw)
  To: Andrea Vai
  Cc: Damien Le Moal, Alan Stern, Jens Axboe, Johannes Thumshirn,
	USB list, SCSI development list, Himanshu Madhani,
	Hannes Reinecke, Omar Sandoval, Martin K. Petersen, Greg KH,
	Hans Holmberg, Kernel development list

On Tue, Nov 26, 2019 at 05:15:33PM +0800, Ming Lei wrote:
> On Tue, Nov 26, 2019 at 08:46:07AM +0100, Andrea Vai wrote:
> > On Tue, 26/11/2019 at 10.32 +0800, Ming Lei wrote:
> > > On Mon, Nov 25, 2019 at 07:51:33PM +0100, Andrea Vai wrote:
> > > > On Mon, 25/11/2019 at 23.15 +0800, Ming Lei wrote:
> > > > > On Mon, Nov 25, 2019 at 03:58:34PM +0100, Andrea Vai wrote:
> > > > > 
> > > > > [...]
> > > > > 
> > > > > > What to try next?
> > > > > 
> > > > > 1) cat /sys/kernel/debug/block/$DISK/hctx0/flags
> > > > result:
> > > > 
> > > > alloc_policy=FIFO SHOULD_MERGE|2
> > > > 
> > > > > 
> > > > > 
> > > > > 2) echo 128 > /sys/block/$DISK/queue/nr_requests and run your copy
> > > > > 1GB test again.
> > > > 
> > > > done, and still fails. What to try next?
> > > 
> > > I just run 256M cp test
> > 
> > I would like to point out that 256MB is a filesize that usually don't
> > trigger the issue (don't know if it matters, sorry).
> 
> OK.
> 
> I tested 256M because IO timeout is often triggered in case of
> qemu-ehci, and it is a long-term issue. When setting up the disk
> via xhci-qemu, the max request size is increased to 1MB from 120KB,
> and IO pattern changed too. When the disk is connected via uhci-qemu,
> the transfer is too slow(1MB/s) because max endpoint size is too small.
> 
> However, I just waited 16min and collected all the 1GB IO log by
> connecting disk over uhci-qemu, but the sector of each data IO
> is still in order.
> 
> > 
> > Another info I would provide is about another strange behavior I
> > noticed: yesterday I ran the test two times (as usual with 1GB
> > filesize) and took 2370s, 1786s, and a third test was going on when I
> > stopped it. Then I started another set of 100 trials and let them run
> > tonight, and the first 10 trials were around 1000s, then gradually
> > decreased to ~300s, and finally settled around 200s with some trials
> > below 70-80s. This to say, times are extremely variable and for the
> > first time I noticed a sort of "performance increase" with time.
> 
> The 'cp' test is buffered IO, can you reproduce it every time by
> running copy just after fresh mount on the USB disk?
> 
> > 
> > >  to one USB storage device on patched kernel,
> > > and WRITE data IO is really in ascending order. The filesystem is
> > > ext4,
> > > and mount without '-o sync'. From previous discussion, looks that is
> > > exactly your test setting. The order can be observed via the
> > > following script:
> > > 
> > > #!/bin/sh
> > > MAJ=$1
> > > MIN=$2
> > > MAJ=$(( $MAJ << 20 ))
> > > DEV=$(( $MAJ | $MIN ))
> > > /usr/share/bcc/tools/trace -t -C \
> > >   't:block:block_rq_issue (args->dev == '$DEV') "%s %d %d", args->rwbs, args->sector, args->nr_sector'
> > > 
> > > $MAJ & $MIN can be retrieved via lsblk for your USB storage disk.
> > > 
> > > So I think we need to check if the patch is applied correctly first.
> > > 
> > > If your kernel tree is managed via git,
> > yes it is,
> > 
> > >  please post 'git diff'.
> > attached. Is it correctly patched? thanks.
> 
> Yeah, it should be correct except for the change on __blk_mq_delay_run_hw_queue()
> is duplicated.
> 
> > 
> > 
> > > Otherwise, share us your kernel version,
> > btw, is 5.4.0+
> > 
> > >  and I will send you one
> > > backported patch on the kernel version.
> > > 
> > > Meantime, you can collect IO order log via the above script as you
> > > did last
> > > time, then send us the log.
> > 
> > ok, will try; is it just required to run it for a short period of time
> > (say, some seconds) during the copy, or should I run it before the
> > beginning (or before the mount?), and terminate it after the end of
> > the copy? (Please note that in the latter case a large amount of time
> > (and data, I suppose) would be involved, because, as said, to be sure
> > the problem triggers I have to use a large file... but we can try to
> > better understand and tune this. If it can help, you can get an ods
> > file with the complete statistic at [1] (look at the "prove_nov19"
> > sheet)).
> 
> The data won't be very big, each line covers 120KB, and ~10K line
> is enough for cover 1GB transfer. Then ~300KB compressed file should
> hold all the trace.

Also use the following trace script this time:

#!/bin/sh

MAJ=$1
MIN=$2
MAJ=$(( $MAJ << 20 ))
DEV=$(( $MAJ | $MIN ))

/usr/share/bcc/tools/trace -t -C \
    't:block:block_rq_issue (args->dev == '$DEV') "%s %d %d", args->rwbs, args->sector, args->nr_sector' \
    't:block:block_rq_insert (args->dev == '$DEV') "%s %d %d", args->rwbs, args->sector, args->nr_sector'
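
For reference, the DEV value that the script compares against args->dev
is the major number shifted left by 20 bits, OR-ed with the minor
number. The sketch below shows that computation on its own. The
function name blk_dev is hypothetical and the 8:80 pair is only an
example of a MAJ:MIN value as printed by lsblk.

```shell
#!/bin/sh
# Combine major and minor numbers the same way the trace script does:
# the major number occupies the bits above the low 20, the minor the
# low 20 bits.
blk_dev() {
    echo $(( ($1 << 20) | $2 ))
}

blk_dev 8 80    # prints 8388688
```

Comparing this number with the dev value shown in the trace output is
a quick way to confirm that the filter matches the intended disk.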


Thanks,
Ming



* Re: Slow I/O on USB media after commit f664a3cc17b7d0a2bc3b3ab96181e1029b0ec0e6
  2019-11-26  9:15                                                                         ` Ming Lei
  2019-11-26 10:24                                                                           ` Ming Lei
@ 2019-11-26 11:14                                                                           ` Andrea Vai
  2019-11-27  2:05                                                                             ` Ming Lei
  1 sibling, 1 reply; 102+ messages in thread
From: Andrea Vai @ 2019-11-26 11:14 UTC (permalink / raw)
  To: Ming Lei
  Cc: Damien Le Moal, Alan Stern, Jens Axboe, Johannes Thumshirn,
	USB list, SCSI development list, Himanshu Madhani,
	Hannes Reinecke, Omar Sandoval, Martin K. Petersen, Greg KH,
	Hans Holmberg, Kernel development list

[-- Attachment #1: Type: text/plain, Size: 3585 bytes --]

On Tue, 26/11/2019 at 17.15 +0800, Ming Lei wrote:
> On Tue, Nov 26, 2019 at 08:46:07AM +0100, Andrea Vai wrote:
> > On Tue, 26/11/2019 at 10.32 +0800, Ming Lei wrote:
> > > On Mon, Nov 25, 2019 at 07:51:33PM +0100, Andrea Vai wrote:
> > > > On Mon, 25/11/2019 at 23.15 +0800, Ming Lei wrote:
> > > > > On Mon, Nov 25, 2019 at 03:58:34PM +0100, Andrea Vai wrote:
> > > > > 
> > > > > [...]
> > > > > 
> > > > > > What to try next?
> > > > > 
> > > > > 1) cat /sys/kernel/debug/block/$DISK/hctx0/flags
> > > > result:
> > > > 
> > > > alloc_policy=FIFO SHOULD_MERGE|2
> > > > 
> > > > > 
> > > > > 
> > > > > 2) echo 128 > /sys/block/$DISK/queue/nr_requests and run your copy
> > > > > 1GB test again.
> > > > 
> > > > done, and still fails. What to try next?
> > > 
> > > I just run 256M cp test
> > 
> > I would like to point out that 256MB is a filesize that usually
> don't
> > trigger the issue (don't know if it matters, sorry).
> 
> OK.
> 
> I tested 256M because IO timeout is often triggered in case of
> qemu-ehci, and it is a long-term issue. When setting up the disk
> via xhci-qemu, the max request size is increased to 1MB from 120KB,
> and IO pattern changed too. When the disk is connected via uhci-
> qemu,
> the transfer is too slow(1MB/s) because max endpoint size is too
> small.
> 
> However, I just waited 16min and collected all the 1GB IO log by
> connecting disk over uhci-qemu, but the sector of each data IO
> is still in order.
> 
> > 
> > Another info I would provide is about another strange behavior I
> > noticed: yesterday I ran the test two times (as usual with 1GB
> > filesize) and took 2370s, 1786s, and a third test was going on
> when I
> > stopped it. Then I started another set of 100 trials and let them
> run
> > tonight, and the first 10 trials were around 1000s, then gradually
> > decreased to ~300s, and finally settled around 200s with some
> trials
> > below 70-80s. This to say, times are extremely variable and for
> the
> > first time I noticed a sort of "performance increase" with time.
> 
> The 'cp' test is buffered IO, can you reproduce it every time by
> running copy just after fresh mount on the USB disk?

Yes, every time my test script (attached) mounts, copies, and unmounts
(but I don't unplug and replug the pendrive each time). Is this enough?

> 
> > 
> > >  to one USB storage device on patched kernel,
> > > and WRITE data IO is really in ascending order. The filesystem
> is
> > > ext4,
> > > and mount without '-o sync'. From previous discussion, looks
> that is
> > > exactly your test setting. The order can be observed via the
> > > following script:
> > > 
> > > #!/bin/sh
> > > MAJ=$1
> > > MIN=$2
> > > MAJ=$(( $MAJ << 20 ))
> > > DEV=$(( $MAJ | $MIN ))
> > > /usr/share/bcc/tools/trace -t -C \
> > >   't:block:block_rq_issue (args->dev == '$DEV') "%s %d %d", args->rwbs, args->sector, args->nr_sector'
> > > 
> > > $MAJ & $MIN can be retrieved via lsblk for your USB storage
> disk.

OK, so I tried:

# lsblk /dev/sdf
NAME   MAJ:MIN RM  SIZE RO TYPE MOUNTPOINT
sdf      8:80   1 28,8G  0 disk 
└─sdf1   8:81   1 28,8G  0 part 

so I ran your script (the second one, which you sent me in the next
email message) with:

./test_ming 8 80

but it fails to run (terminal output is in attached errors.txt).
What am I doing wrong?

It's still not clear to me whether I need to start the trace script
first and then the test, or the other way round (or whether it doesn't
matter). The above errors occurred in the former case (actually, I
didn't even start the test).

Thanks,
Andrea

[-- Attachment #2: errors.txt --]
[-- Type: text/plain, Size: 60490 bytes --]

In file included from /virtual/main.c:2:
In file included from /lib/modules/5.4.0+/build/include/linux/ptrace.h:6:
In file included from /lib/modules/5.4.0+/build/include/linux/sched.h:14:
In file included from /lib/modules/5.4.0+/build/include/linux/pid.h:5:
In file included from /lib/modules/5.4.0+/build/include/linux/rculist.h:11:
In file included from /lib/modules/5.4.0+/build/include/linux/rcupdate.h:26:
In file included from /lib/modules/5.4.0+/build/include/linux/irqflags.h:16:
In file included from /lib/modules/5.4.0+/build/arch/x86/include/asm/irqflags.h:9:
In file included from /lib/modules/5.4.0+/build/arch/x86/include/asm/nospec-branch.h:314:
/lib/modules/5.4.0+/build/arch/x86/include/asm/segment.h:254:2: error: expected '(' after 'asm'
        alternative_io ("lsl %[seg],%[p]",
        ^
/lib/modules/5.4.0+/build/arch/x86/include/asm/alternative.h:240:2: note: expanded from macro 'alternative_io'
        asm_inline volatile (ALTERNATIVE(oldinstr, newinstr, feature)   \
        ^
/lib/modules/5.4.0+/build/include/linux/compiler_types.h:210:24: note: expanded from macro 'asm_inline'
#define asm_inline asm __inline
                       ^
In file included from /virtual/main.c:2:
In file included from /lib/modules/5.4.0+/build/include/linux/ptrace.h:6:
In file included from /lib/modules/5.4.0+/build/include/linux/sched.h:14:
In file included from /lib/modules/5.4.0+/build/include/linux/pid.h:5:
In file included from /lib/modules/5.4.0+/build/include/linux/rculist.h:11:
In file included from /lib/modules/5.4.0+/build/include/linux/rcupdate.h:27:
In file included from /lib/modules/5.4.0+/build/include/linux/preempt.h:78:
In file included from /lib/modules/5.4.0+/build/arch/x86/include/asm/preempt.h:7:
In file included from /lib/modules/5.4.0+/build/include/linux/thread_info.h:38:
In file included from /lib/modules/5.4.0+/build/arch/x86/include/asm/thread_info.h:12:
In file included from /lib/modules/5.4.0+/build/arch/x86/include/asm/page.h:12:
/lib/modules/5.4.0+/build/arch/x86/include/asm/page_64.h:49:2: error: expected '(' after 'asm'
        alternative_call_2(clear_page_orig,
        ^
/lib/modules/5.4.0+/build/arch/x86/include/asm/alternative.h:256:2: note: expanded from macro 'alternative_call_2'
        asm_inline volatile (ALTERNATIVE_2("call %P[old]", "call %P[new1]", feature1,\
        ^
/lib/modules/5.4.0+/build/include/linux/compiler_types.h:210:24: note: expanded from macro 'asm_inline'
#define asm_inline asm __inline
                       ^
In file included from /virtual/main.c:2:
In file included from /lib/modules/5.4.0+/build/include/linux/ptrace.h:6:
In file included from /lib/modules/5.4.0+/build/include/linux/sched.h:14:
In file included from /lib/modules/5.4.0+/build/include/linux/pid.h:5:
In file included from /lib/modules/5.4.0+/build/include/linux/rculist.h:11:
In file included from /lib/modules/5.4.0+/build/include/linux/rcupdate.h:27:
In file included from /lib/modules/5.4.0+/build/include/linux/preempt.h:78:
In file included from /lib/modules/5.4.0+/build/arch/x86/include/asm/preempt.h:7:
In file included from /lib/modules/5.4.0+/build/include/linux/thread_info.h:38:
In file included from /lib/modules/5.4.0+/build/arch/x86/include/asm/thread_info.h:53:
In file included from /lib/modules/5.4.0+/build/arch/x86/include/asm/cpufeature.h:5:
In file included from /lib/modules/5.4.0+/build/arch/x86/include/asm/processor.h:24:
/lib/modules/5.4.0+/build/arch/x86/include/asm/special_insns.h:205:2: error: expected '(' after 'asm'
        alternative_io(".byte " __stringify(NOP_DS_PREFIX) "; clflush %P0",
        ^
/lib/modules/5.4.0+/build/arch/x86/include/asm/alternative.h:240:2: note: expanded from macro 'alternative_io'
        asm_inline volatile (ALTERNATIVE(oldinstr, newinstr, feature)   \
        ^
/lib/modules/5.4.0+/build/include/linux/compiler_types.h:210:24: note: expanded from macro 'asm_inline'
#define asm_inline asm __inline
                       ^
In file included from /virtual/main.c:2:
In file included from /lib/modules/5.4.0+/build/include/linux/ptrace.h:6:
In file included from /lib/modules/5.4.0+/build/include/linux/sched.h:14:
In file included from /lib/modules/5.4.0+/build/include/linux/pid.h:5:
In file included from /lib/modules/5.4.0+/build/include/linux/rculist.h:11:
In file included from /lib/modules/5.4.0+/build/include/linux/rcupdate.h:27:
In file included from /lib/modules/5.4.0+/build/include/linux/preempt.h:78:
In file included from /lib/modules/5.4.0+/build/arch/x86/include/asm/preempt.h:7:
In file included from /lib/modules/5.4.0+/build/include/linux/thread_info.h:38:
In file included from /lib/modules/5.4.0+/build/arch/x86/include/asm/thread_info.h:53:
In file included from /lib/modules/5.4.0+/build/arch/x86/include/asm/cpufeature.h:5:
/lib/modules/5.4.0+/build/arch/x86/include/asm/processor.h:795:2: error: expected '(' after 'asm'
        alternative_input(BASE_PREFETCH, "prefetchnta %P1",
        ^
/lib/modules/5.4.0+/build/arch/x86/include/asm/alternative.h:221:2: note: expanded from macro 'alternative_input'
        asm_inline volatile (ALTERNATIVE(oldinstr, newinstr, feature)   \
        ^
/lib/modules/5.4.0+/build/include/linux/compiler_types.h:210:24: note: expanded from macro 'asm_inline'
#define asm_inline asm __inline
                       ^
In file included from /virtual/main.c:2:
In file included from /lib/modules/5.4.0+/build/include/linux/ptrace.h:6:
In file included from /lib/modules/5.4.0+/build/include/linux/sched.h:14:
In file included from /lib/modules/5.4.0+/build/include/linux/pid.h:5:
In file included from /lib/modules/5.4.0+/build/include/linux/rculist.h:11:
In file included from /lib/modules/5.4.0+/build/include/linux/rcupdate.h:27:
In file included from /lib/modules/5.4.0+/build/include/linux/preempt.h:78:
In file included from /lib/modules/5.4.0+/build/arch/x86/include/asm/preempt.h:7:
In file included from /lib/modules/5.4.0+/build/include/linux/thread_info.h:38:
In file included from /lib/modules/5.4.0+/build/arch/x86/include/asm/thread_info.h:53:
In file included from /lib/modules/5.4.0+/build/arch/x86/include/asm/cpufeature.h:5:
/lib/modules/5.4.0+/build/arch/x86/include/asm/processor.h:807:2: error: expected '(' after 'asm'
        alternative_input(BASE_PREFETCH, "prefetchw %P1",
        ^
/lib/modules/5.4.0+/build/arch/x86/include/asm/alternative.h:221:2: note: expanded from macro 'alternative_input'
        asm_inline volatile (ALTERNATIVE(oldinstr, newinstr, feature)   \
        ^
/lib/modules/5.4.0+/build/include/linux/compiler_types.h:210:24: note: expanded from macro 'asm_inline'
#define asm_inline asm __inline
                       ^
In file included from /virtual/main.c:2:
In file included from /lib/modules/5.4.0+/build/include/linux/ptrace.h:6:
In file included from /lib/modules/5.4.0+/build/include/linux/sched.h:14:
In file included from /lib/modules/5.4.0+/build/include/linux/pid.h:5:
In file included from /lib/modules/5.4.0+/build/include/linux/rculist.h:11:
In file included from /lib/modules/5.4.0+/build/include/linux/rcupdate.h:27:
In file included from /lib/modules/5.4.0+/build/include/linux/preempt.h:78:
In file included from /lib/modules/5.4.0+/build/arch/x86/include/asm/preempt.h:7:
/lib/modules/5.4.0+/build/include/linux/thread_info.h:134:2: error: expected '(' after 'asm'
        WARN(1, "Buffer overflow detected (%d < %lu)!\n", size, count);
        ^
/lib/modules/5.4.0+/build/include/asm-generic/bug.h:124:3: note: expanded from macro 'WARN'
                __WARN_printf(TAINT_WARN, format);                      \
                ^
/lib/modules/5.4.0+/build/include/asm-generic/bug.h:93:3: note: expanded from macro '__WARN_printf'
                __WARN_FLAGS(BUGFLAG_NO_CUT_HERE | BUGFLAG_TAINT(taint));\
                ^
/lib/modules/5.4.0+/build/arch/x86/include/asm/bug.h:79:2: note: expanded from macro '__WARN_FLAGS'
        _BUG_FLAGS(ASM_UD2, BUGFLAG_WARNING|(flags));           \
        ^
/lib/modules/5.4.0+/build/arch/x86/include/asm/bug.h:35:2: note: expanded from macro '_BUG_FLAGS'
        asm_inline volatile("1:\t" ins "\n"                             \
        ^
/lib/modules/5.4.0+/build/include/linux/compiler_types.h:210:24: note: expanded from macro 'asm_inline'
#define asm_inline asm __inline
                       ^
In file included from /virtual/main.c:2:
In file included from /lib/modules/5.4.0+/build/include/linux/ptrace.h:6:
In file included from /lib/modules/5.4.0+/build/include/linux/sched.h:14:
In file included from /lib/modules/5.4.0+/build/include/linux/pid.h:5:
In file included from /lib/modules/5.4.0+/build/include/linux/rculist.h:11:
/lib/modules/5.4.0+/build/include/linux/rcupdate.h:893:2: error: expected '(' after 'asm'
        WARN_ON_ONCE(func != (rcu_callback_t)~0L);
        ^
/lib/modules/5.4.0+/build/include/asm-generic/bug.h:98:3: note: expanded from macro 'WARN_ON_ONCE'
                __WARN_FLAGS(BUGFLAG_ONCE |                     \
                ^
/lib/modules/5.4.0+/build/arch/x86/include/asm/bug.h:79:2: note: expanded from macro '__WARN_FLAGS'
        _BUG_FLAGS(ASM_UD2, BUGFLAG_WARNING|(flags));           \
        ^
/lib/modules/5.4.0+/build/arch/x86/include/asm/bug.h:35:2: note: expanded from macro '_BUG_FLAGS'
        asm_inline volatile("1:\t" ins "\n"                             \
        ^
/lib/modules/5.4.0+/build/include/linux/compiler_types.h:210:24: note: expanded from macro 'asm_inline'
#define asm_inline asm __inline
                       ^
In file included from /virtual/main.c:2:
In file included from /lib/modules/5.4.0+/build/include/linux/ptrace.h:6:
In file included from /lib/modules/5.4.0+/build/include/linux/sched.h:15:
In file included from /lib/modules/5.4.0+/build/include/linux/sem.h:5:
In file included from /lib/modules/5.4.0+/build/include/uapi/linux/sem.h:5:
In file included from /lib/modules/5.4.0+/build/include/linux/ipc.h:7:
In file included from /lib/modules/5.4.0+/build/include/linux/rhashtable-types.h:15:
In file included from /lib/modules/5.4.0+/build/include/linux/workqueue.h:9:
In file included from /lib/modules/5.4.0+/build/include/linux/timer.h:6:
/lib/modules/5.4.0+/build/include/linux/ktime.h:171:2: error: expected '(' after 'asm'
        WARN_ON(div < 0);
        ^
/lib/modules/5.4.0+/build/include/asm-generic/bug.h:115:3: note: expanded from macro 'WARN_ON'
                __WARN();                                               \
                ^
/lib/modules/5.4.0+/build/include/asm-generic/bug.h:90:19: note: expanded from macro '__WARN'
#define __WARN()                __WARN_FLAGS(BUGFLAG_TAINT(TAINT_WARN))
                                ^
/lib/modules/5.4.0+/build/arch/x86/include/asm/bug.h:79:2: note: expanded from macro '__WARN_FLAGS'
        _BUG_FLAGS(ASM_UD2, BUGFLAG_WARNING|(flags));           \
        ^
/lib/modules/5.4.0+/build/arch/x86/include/asm/bug.h:35:2: note: expanded from macro '_BUG_FLAGS'
        asm_inline volatile("1:\t" ins "\n"                             \
        ^
/lib/modules/5.4.0+/build/include/linux/compiler_types.h:210:24: note: expanded from macro 'asm_inline'
#define asm_inline asm __inline
                       ^
In file included from /virtual/main.c:2:
In file included from /lib/modules/5.4.0+/build/include/linux/ptrace.h:6:
In file included from /lib/modules/5.4.0+/build/include/linux/sched.h:20:
In file included from /lib/modules/5.4.0+/build/include/linux/hrtimer.h:19:
In file included from /lib/modules/5.4.0+/build/include/linux/percpu.h:7:
In file included from /lib/modules/5.4.0+/build/include/linux/smp.h:68:
In file included from /lib/modules/5.4.0+/build/arch/x86/include/asm/smp.h:13:
In file included from /lib/modules/5.4.0+/build/arch/x86/include/asm/apic.h:11:
In file included from /lib/modules/5.4.0+/build/arch/x86/include/asm/fixmap.h:190:
In file included from /lib/modules/5.4.0+/build/include/asm-generic/fixmap.h:19:
In file included from /lib/modules/5.4.0+/build/include/linux/mm_types.h:14:
In file included from /lib/modules/5.4.0+/build/include/linux/uprobes.h:49:
In file included from /lib/modules/5.4.0+/build/arch/x86/include/asm/uprobes.h:13:
In file included from /lib/modules/5.4.0+/build/include/linux/notifier.h:16:
/lib/modules/5.4.0+/build/include/linux/srcu.h:179:2: error: expected '(' after 'asm'
        WARN_ON_ONCE(idx & ~0x1);
        ^
/lib/modules/5.4.0+/build/include/asm-generic/bug.h:98:3: note: expanded from macro 'WARN_ON_ONCE'
                __WARN_FLAGS(BUGFLAG_ONCE |                     \
                ^
/lib/modules/5.4.0+/build/arch/x86/include/asm/bug.h:79:2: note: expanded from macro '__WARN_FLAGS'
        _BUG_FLAGS(ASM_UD2, BUGFLAG_WARNING|(flags));           \
        ^
/lib/modules/5.4.0+/build/arch/x86/include/asm/bug.h:35:2: note: expanded from macro '_BUG_FLAGS'
        asm_inline volatile("1:\t" ins "\n"                             \
        ^
/lib/modules/5.4.0+/build/include/linux/compiler_types.h:210:24: note: expanded from macro 'asm_inline'
#define asm_inline asm __inline
                       ^
In file included from /virtual/main.c:2:
In file included from /lib/modules/5.4.0+/build/include/linux/ptrace.h:6:
In file included from /lib/modules/5.4.0+/build/include/linux/sched.h:20:
In file included from /lib/modules/5.4.0+/build/include/linux/hrtimer.h:19:
In file included from /lib/modules/5.4.0+/build/include/linux/percpu.h:7:
In file included from /lib/modules/5.4.0+/build/include/linux/smp.h:68:
In file included from /lib/modules/5.4.0+/build/arch/x86/include/asm/smp.h:13:
In file included from /lib/modules/5.4.0+/build/arch/x86/include/asm/apic.h:11:
In file included from /lib/modules/5.4.0+/build/arch/x86/include/asm/fixmap.h:190:
/lib/modules/5.4.0+/build/include/asm-generic/fixmap.h:38:2: error: expected '(' after 'asm'
        BUG_ON(vaddr >= FIXADDR_TOP || vaddr < FIXADDR_START);
        ^
/lib/modules/5.4.0+/build/include/asm-generic/bug.h:62:57: note: expanded from macro 'BUG_ON'
#define BUG_ON(condition) do { if (unlikely(condition)) BUG(); } while (0)
                                                        ^
/lib/modules/5.4.0+/build/arch/x86/include/asm/bug.h:73:2: note: expanded from macro 'BUG'
        _BUG_FLAGS(ASM_UD2, 0);                                 \
        ^
/lib/modules/5.4.0+/build/arch/x86/include/asm/bug.h:35:2: note: expanded from macro '_BUG_FLAGS'
        asm_inline volatile("1:\t" ins "\n"                             \
        ^
/lib/modules/5.4.0+/build/include/linux/compiler_types.h:210:24: note: expanded from macro 'asm_inline'
#define asm_inline asm __inline
                       ^
In file included from /virtual/main.c:2:
In file included from /lib/modules/5.4.0+/build/include/linux/ptrace.h:6:
In file included from /lib/modules/5.4.0+/build/include/linux/sched.h:20:
In file included from /lib/modules/5.4.0+/build/include/linux/hrtimer.h:19:
In file included from /lib/modules/5.4.0+/build/include/linux/percpu.h:7:
In file included from /lib/modules/5.4.0+/build/include/linux/smp.h:68:
In file included from /lib/modules/5.4.0+/build/arch/x86/include/asm/smp.h:13:
/lib/modules/5.4.0+/build/arch/x86/include/asm/apic.h:107:2: error: expected '(' after 'asm'
        alternative_io("movl %0, %P1", "xchgl %0, %P1", X86_BUG_11AP,
        ^
/lib/modules/5.4.0+/build/arch/x86/include/asm/alternative.h:240:2: note: expanded from macro 'alternative_io'
        asm_inline volatile (ALTERNATIVE(oldinstr, newinstr, feature)   \
        ^
/lib/modules/5.4.0+/build/include/linux/compiler_types.h:210:24: note: expanded from macro 'asm_inline'
#define asm_inline asm __inline
                       ^
In file included from /virtual/main.c:2:
In file included from /lib/modules/5.4.0+/build/include/linux/ptrace.h:7:
In file included from /lib/modules/5.4.0+/build/include/linux/sched/signal.h:6:
/lib/modules/5.4.0+/build/include/linux/signal.h:87:11: warning: array index 3 is past the end of the array (which contains 1 element) [-Warray-bounds]
                return (set->sig[3] | set->sig[2] |
                        ^        ~
/lib/modules/5.4.0+/build/arch/x86/include/asm/signal.h:24:2: note: array 'sig' declared here
        unsigned long sig[_NSIG_WORDS];
        ^
In file included from /virtual/main.c:2:
In file included from /lib/modules/5.4.0+/build/include/linux/ptrace.h:7:
In file included from /lib/modules/5.4.0+/build/include/linux/sched/signal.h:6:
/lib/modules/5.4.0+/build/include/linux/signal.h:87:25: warning: array index 2 is past the end of the array (which contains 1 element) [-Warray-bounds]
                return (set->sig[3] | set->sig[2] |
                                      ^        ~
/lib/modules/5.4.0+/build/arch/x86/include/asm/signal.h:24:2: note: array 'sig' declared here
        unsigned long sig[_NSIG_WORDS];
        ^
In file included from /virtual/main.c:2:
In file included from /lib/modules/5.4.0+/build/include/linux/ptrace.h:7:
In file included from /lib/modules/5.4.0+/build/include/linux/sched/signal.h:6:
/lib/modules/5.4.0+/build/include/linux/signal.h:88:4: warning: array index 1 is past the end of the array (which contains 1 element) [-Warray-bounds]
                        set->sig[1] | set->sig[0]) == 0;
                        ^        ~
/lib/modules/5.4.0+/build/arch/x86/include/asm/signal.h:24:2: note: array 'sig' declared here
        unsigned long sig[_NSIG_WORDS];
        ^
In file included from /virtual/main.c:2:
In file included from /lib/modules/5.4.0+/build/include/linux/ptrace.h:7:
In file included from /lib/modules/5.4.0+/build/include/linux/sched/signal.h:6:
/lib/modules/5.4.0+/build/include/linux/signal.h:90:11: warning: array index 1 is past the end of the array (which contains 1 element) [-Warray-bounds]
                return (set->sig[1] | set->sig[0]) == 0;
                        ^        ~
/lib/modules/5.4.0+/build/arch/x86/include/asm/signal.h:24:2: note: array 'sig' declared here
        unsigned long sig[_NSIG_WORDS];
        ^
In file included from /virtual/main.c:2:
In file included from /lib/modules/5.4.0+/build/include/linux/ptrace.h:7:
In file included from /lib/modules/5.4.0+/build/include/linux/sched/signal.h:6:
/lib/modules/5.4.0+/build/include/linux/signal.h:103:11: warning: array index 3 is past the end of the array (which contains 1 element) [-Warray-bounds]
                return  (set1->sig[3] == set2->sig[3]) &&
                         ^         ~
/lib/modules/5.4.0+/build/arch/x86/include/asm/signal.h:24:2: note: array 'sig' declared here
        unsigned long sig[_NSIG_WORDS];
        ^
In file included from /virtual/main.c:2:
In file included from /lib/modules/5.4.0+/build/include/linux/ptrace.h:7:
In file included from /lib/modules/5.4.0+/build/include/linux/sched/signal.h:6:
/lib/modules/5.4.0+/build/include/linux/signal.h:103:27: warning: array index 3 is past the end of the array (which contains 1 element) [-Warray-bounds]
                return  (set1->sig[3] == set2->sig[3]) &&
                                         ^         ~
/lib/modules/5.4.0+/build/arch/x86/include/asm/signal.h:24:2: note: array 'sig' declared here
        unsigned long sig[_NSIG_WORDS];
        ^
In file included from /virtual/main.c:2:
In file included from /lib/modules/5.4.0+/build/include/linux/ptrace.h:7:
In file included from /lib/modules/5.4.0+/build/include/linux/sched/signal.h:6:
/lib/modules/5.4.0+/build/include/linux/signal.h:104:5: warning: array index 2 is past the end of the array (which contains 1 element) [-Warray-bounds]
                        (set1->sig[2] == set2->sig[2]) &&
                         ^         ~
/lib/modules/5.4.0+/build/arch/x86/include/asm/signal.h:24:2: note: array 'sig' declared here
        unsigned long sig[_NSIG_WORDS];
        ^
In file included from /virtual/main.c:2:
In file included from /lib/modules/5.4.0+/build/include/linux/ptrace.h:7:
In file included from /lib/modules/5.4.0+/build/include/linux/sched/signal.h:6:
/lib/modules/5.4.0+/build/include/linux/signal.h:104:21: warning: array index 2 is past the end of the array (which contains 1 element) [-Warray-bounds]
                        (set1->sig[2] == set2->sig[2]) &&
                                         ^         ~
/lib/modules/5.4.0+/build/arch/x86/include/asm/signal.h:24:2: note: array 'sig' declared here
        unsigned long sig[_NSIG_WORDS];
        ^
In file included from /virtual/main.c:2:
In file included from /lib/modules/5.4.0+/build/include/linux/ptrace.h:7:
In file included from /lib/modules/5.4.0+/build/include/linux/sched/signal.h:6:
/lib/modules/5.4.0+/build/include/linux/signal.h:105:5: warning: array index 1 is past the end of the array (which contains 1 element) [-Warray-bounds]
                        (set1->sig[1] == set2->sig[1]) &&
                         ^         ~
/lib/modules/5.4.0+/build/arch/x86/include/asm/signal.h:24:2: note: array 'sig' declared here
        unsigned long sig[_NSIG_WORDS];
        ^
In file included from /virtual/main.c:2:
In file included from /lib/modules/5.4.0+/build/include/linux/ptrace.h:7:
In file included from /lib/modules/5.4.0+/build/include/linux/sched/signal.h:6:
/lib/modules/5.4.0+/build/include/linux/signal.h:105:21: warning: array index 1 is past the end of the array (which contains 1 element) [-Warray-bounds]
                        (set1->sig[1] == set2->sig[1]) &&
                                         ^         ~
/lib/modules/5.4.0+/build/arch/x86/include/asm/signal.h:24:2: note: array 'sig' declared here
        unsigned long sig[_NSIG_WORDS];
        ^
In file included from /virtual/main.c:2:
In file included from /lib/modules/5.4.0+/build/include/linux/ptrace.h:7:
In file included from /lib/modules/5.4.0+/build/include/linux/sched/signal.h:6:
/lib/modules/5.4.0+/build/include/linux/signal.h:108:11: warning: array index 1 is past the end of the array (which contains 1 element) [-Warray-bounds]
                return  (set1->sig[1] == set2->sig[1]) &&
                         ^         ~
/lib/modules/5.4.0+/build/arch/x86/include/asm/signal.h:24:2: note: array 'sig' declared here
        unsigned long sig[_NSIG_WORDS];
        ^
In file included from /virtual/main.c:2:
In file included from /lib/modules/5.4.0+/build/include/linux/ptrace.h:7:
In file included from /lib/modules/5.4.0+/build/include/linux/sched/signal.h:6:
/lib/modules/5.4.0+/build/include/linux/signal.h:108:27: warning: array index 1 is past the end of the array (which contains 1 element) [-Warray-bounds]
                return  (set1->sig[1] == set2->sig[1]) &&
                                         ^         ~
/lib/modules/5.4.0+/build/arch/x86/include/asm/signal.h:24:2: note: array 'sig' declared here
        unsigned long sig[_NSIG_WORDS];
        ^
In file included from /virtual/main.c:2:
In file included from /lib/modules/5.4.0+/build/include/linux/ptrace.h:7:
In file included from /lib/modules/5.4.0+/build/include/linux/sched/signal.h:6:
/lib/modules/5.4.0+/build/include/linux/signal.h:147:1: warning: array index 3 is past the end of the array (which contains 1 element) [-Warray-bounds]
_SIG_SET_BINOP(sigorsets, _sig_or)
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/lib/modules/5.4.0+/build/include/linux/signal.h:128:8: note: expanded from macro '_SIG_SET_BINOP'
                a3 = a->sig[3]; a2 = a->sig[2];                         \
                     ^      ~
/lib/modules/5.4.0+/build/arch/x86/include/asm/signal.h:24:2: note: array 'sig' declared here
        unsigned long sig[_NSIG_WORDS];
        ^
In file included from /virtual/main.c:2:
In file included from /lib/modules/5.4.0+/build/include/linux/ptrace.h:7:
In file included from /lib/modules/5.4.0+/build/include/linux/sched/signal.h:6:
/lib/modules/5.4.0+/build/include/linux/signal.h:147:1: warning: array index 2 is past the end of the array (which contains 1 element) [-Warray-bounds]
_SIG_SET_BINOP(sigorsets, _sig_or)
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/lib/modules/5.4.0+/build/include/linux/signal.h:128:24: note: expanded from macro '_SIG_SET_BINOP'
                a3 = a->sig[3]; a2 = a->sig[2];                         \
                                     ^      ~
/lib/modules/5.4.0+/build/arch/x86/include/asm/signal.h:24:2: note: array 'sig' declared here
        unsigned long sig[_NSIG_WORDS];
        ^
In file included from /virtual/main.c:2:
In file included from /lib/modules/5.4.0+/build/include/linux/ptrace.h:7:
In file included from /lib/modules/5.4.0+/build/include/linux/sched/signal.h:6:
/lib/modules/5.4.0+/build/include/linux/signal.h:147:1: warning: array index 3 is past the end of the array (which contains 1 element) [-Warray-bounds]
_SIG_SET_BINOP(sigorsets, _sig_or)
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/lib/modules/5.4.0+/build/include/linux/signal.h:129:8: note: expanded from macro '_SIG_SET_BINOP'
                b3 = b->sig[3]; b2 = b->sig[2];                         \
                     ^      ~
/lib/modules/5.4.0+/build/arch/x86/include/asm/signal.h:24:2: note: array 'sig' declared here
        unsigned long sig[_NSIG_WORDS];
        ^
In file included from /virtual/main.c:2:
In file included from /lib/modules/5.4.0+/build/include/linux/ptrace.h:7:
In file included from /lib/modules/5.4.0+/build/include/linux/sched/signal.h:6:
/lib/modules/5.4.0+/build/include/linux/signal.h:147:1: warning: array index 2 is past the end of the array (which contains 1 element) [-Warray-bounds]
_SIG_SET_BINOP(sigorsets, _sig_or)
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/lib/modules/5.4.0+/build/include/linux/signal.h:129:24: note: expanded from macro '_SIG_SET_BINOP'
                b3 = b->sig[3]; b2 = b->sig[2];                         \
                                     ^      ~
/lib/modules/5.4.0+/build/arch/x86/include/asm/signal.h:24:2: note: array 'sig' declared here
        unsigned long sig[_NSIG_WORDS];
        ^
In file included from /virtual/main.c:2:
In file included from /lib/modules/5.4.0+/build/include/linux/ptrace.h:7:
In file included from /lib/modules/5.4.0+/build/include/linux/sched/signal.h:6:
/lib/modules/5.4.0+/build/include/linux/signal.h:147:1: warning: array index 3 is past the end of the array (which contains 1 element) [-Warray-bounds]
_SIG_SET_BINOP(sigorsets, _sig_or)
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/lib/modules/5.4.0+/build/include/linux/signal.h:130:3: note: expanded from macro '_SIG_SET_BINOP'
                r->sig[3] = op(a3, b3);                                 \
                ^      ~
/lib/modules/5.4.0+/build/arch/x86/include/asm/signal.h:24:2: note: array 'sig' declared here
        unsigned long sig[_NSIG_WORDS];
        ^
In file included from /virtual/main.c:2:
In file included from /lib/modules/5.4.0+/build/include/linux/ptrace.h:7:
In file included from /lib/modules/5.4.0+/build/include/linux/sched/signal.h:6:
/lib/modules/5.4.0+/build/include/linux/signal.h:147:1: warning: array index 2 is past the end of the array (which contains 1 element) [-Warray-bounds]
_SIG_SET_BINOP(sigorsets, _sig_or)
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/lib/modules/5.4.0+/build/include/linux/signal.h:131:3: note: expanded from macro '_SIG_SET_BINOP'
                r->sig[2] = op(a2, b2);                                 \
                ^      ~
/lib/modules/5.4.0+/build/arch/x86/include/asm/signal.h:24:2: note: array 'sig' declared here
        unsigned long sig[_NSIG_WORDS];
        ^
In file included from /virtual/main.c:2:
In file included from /lib/modules/5.4.0+/build/include/linux/ptrace.h:7:
In file included from /lib/modules/5.4.0+/build/include/linux/sched/signal.h:6:
/lib/modules/5.4.0+/build/include/linux/signal.h:147:1: warning: array index 1 is past the end of the array (which contains 1 element) [-Warray-bounds]
_SIG_SET_BINOP(sigorsets, _sig_or)
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/lib/modules/5.4.0+/build/include/linux/signal.h:134:8: note: expanded from macro '_SIG_SET_BINOP'
                a1 = a->sig[1]; b1 = b->sig[1];                         \
                     ^      ~
/lib/modules/5.4.0+/build/arch/x86/include/asm/signal.h:24:2: note: array 'sig' declared here
        unsigned long sig[_NSIG_WORDS];
        ^
In file included from /virtual/main.c:2:
In file included from /lib/modules/5.4.0+/build/include/linux/ptrace.h:7:
In file included from /lib/modules/5.4.0+/build/include/linux/sched/signal.h:6:
/lib/modules/5.4.0+/build/include/linux/signal.h:147:1: warning: array index 1 is past the end of the array (which contains 1 element) [-Warray-bounds]
_SIG_SET_BINOP(sigorsets, _sig_or)
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/lib/modules/5.4.0+/build/include/linux/signal.h:134:24: note: expanded from macro '_SIG_SET_BINOP'
                a1 = a->sig[1]; b1 = b->sig[1];                         \
                                     ^      ~
/lib/modules/5.4.0+/build/arch/x86/include/asm/signal.h:24:2: note: array 'sig' declared here
        unsigned long sig[_NSIG_WORDS];
        ^
In file included from /virtual/main.c:2:
In file included from /lib/modules/5.4.0+/build/include/linux/ptrace.h:7:
In file included from /lib/modules/5.4.0+/build/include/linux/sched/signal.h:6:
/lib/modules/5.4.0+/build/include/linux/signal.h:147:1: warning: array index 1 is past the end of the array (which contains 1 element) [-Warray-bounds]
_SIG_SET_BINOP(sigorsets, _sig_or)
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/lib/modules/5.4.0+/build/include/linux/signal.h:135:3: note: expanded from macro '_SIG_SET_BINOP'
                r->sig[1] = op(a1, b1);                                 \
                ^      ~
/lib/modules/5.4.0+/build/arch/x86/include/asm/signal.h:24:2: note: array 'sig' declared here
        unsigned long sig[_NSIG_WORDS];
        ^
In file included from /virtual/main.c:2:
In file included from /lib/modules/5.4.0+/build/include/linux/ptrace.h:7:
In file included from /lib/modules/5.4.0+/build/include/linux/sched/signal.h:6:
/lib/modules/5.4.0+/build/include/linux/signal.h:150:1: warning: array index 3 is past the end of the array (which contains 1 element) [-Warray-bounds]
_SIG_SET_BINOP(sigandsets, _sig_and)
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/lib/modules/5.4.0+/build/include/linux/signal.h:128:8: note: expanded from macro '_SIG_SET_BINOP'
                a3 = a->sig[3]; a2 = a->sig[2];                         \
                     ^      ~
/lib/modules/5.4.0+/build/arch/x86/include/asm/signal.h:24:2: note: array 'sig' declared here
        unsigned long sig[_NSIG_WORDS];
        ^
In file included from /virtual/main.c:2:
In file included from /lib/modules/5.4.0+/build/include/linux/ptrace.h:7:
In file included from /lib/modules/5.4.0+/build/include/linux/sched/signal.h:6:
/lib/modules/5.4.0+/build/include/linux/signal.h:150:1: warning: array index 2 is past the end of the array (which contains 1 element) [-Warray-bounds]
_SIG_SET_BINOP(sigandsets, _sig_and)
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/lib/modules/5.4.0+/build/include/linux/signal.h:128:24: note: expanded from macro '_SIG_SET_BINOP'
                a3 = a->sig[3]; a2 = a->sig[2];                         \
                                     ^      ~
/lib/modules/5.4.0+/build/arch/x86/include/asm/signal.h:24:2: note: array 'sig' declared here
        unsigned long sig[_NSIG_WORDS];
        ^
In file included from /virtual/main.c:2:
In file included from /lib/modules/5.4.0+/build/include/linux/ptrace.h:7:
In file included from /lib/modules/5.4.0+/build/include/linux/sched/signal.h:6:
/lib/modules/5.4.0+/build/include/linux/signal.h:150:1: warning: array index 3 is past the end of the array (which contains 1 element) [-Warray-bounds]
_SIG_SET_BINOP(sigandsets, _sig_and)
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/lib/modules/5.4.0+/build/include/linux/signal.h:129:8: note: expanded from macro '_SIG_SET_BINOP'
                b3 = b->sig[3]; b2 = b->sig[2];                         \
                     ^      ~
/lib/modules/5.4.0+/build/arch/x86/include/asm/signal.h:24:2: note: array 'sig' declared here
        unsigned long sig[_NSIG_WORDS];
        ^
In file included from /virtual/main.c:2:
In file included from /lib/modules/5.4.0+/build/include/linux/ptrace.h:7:
In file included from /lib/modules/5.4.0+/build/include/linux/sched/signal.h:6:
/lib/modules/5.4.0+/build/include/linux/signal.h:150:1: warning: array index 2 is past the end of the array (which contains 1 element) [-Warray-bounds]
_SIG_SET_BINOP(sigandsets, _sig_and)
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/lib/modules/5.4.0+/build/include/linux/signal.h:129:24: note: expanded from macro '_SIG_SET_BINOP'
                b3 = b->sig[3]; b2 = b->sig[2];                         \
                                     ^      ~
/lib/modules/5.4.0+/build/arch/x86/include/asm/signal.h:24:2: note: array 'sig' declared here
        unsigned long sig[_NSIG_WORDS];
        ^
In file included from /virtual/main.c:2:
In file included from /lib/modules/5.4.0+/build/include/linux/ptrace.h:7:
In file included from /lib/modules/5.4.0+/build/include/linux/sched/signal.h:6:
/lib/modules/5.4.0+/build/include/linux/signal.h:150:1: warning: array index 3 is past the end of the array (which contains 1 element) [-Warray-bounds]
_SIG_SET_BINOP(sigandsets, _sig_and)
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/lib/modules/5.4.0+/build/include/linux/signal.h:130:3: note: expanded from macro '_SIG_SET_BINOP'
                r->sig[3] = op(a3, b3);                                 \
                ^      ~
/lib/modules/5.4.0+/build/arch/x86/include/asm/signal.h:24:2: note: array 'sig' declared here
        unsigned long sig[_NSIG_WORDS];
        ^
In file included from /virtual/main.c:2:
In file included from /lib/modules/5.4.0+/build/include/linux/ptrace.h:7:
In file included from /lib/modules/5.4.0+/build/include/linux/sched/signal.h:6:
/lib/modules/5.4.0+/build/include/linux/signal.h:150:1: warning: array index 2 is past the end of the array (which contains 1 element) [-Warray-bounds]
_SIG_SET_BINOP(sigandsets, _sig_and)
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/lib/modules/5.4.0+/build/include/linux/signal.h:131:3: note: expanded from macro '_SIG_SET_BINOP'
                r->sig[2] = op(a2, b2);                                 \
                ^      ~
/lib/modules/5.4.0+/build/arch/x86/include/asm/signal.h:24:2: note: array 'sig' declared here
        unsigned long sig[_NSIG_WORDS];
        ^
In file included from /virtual/main.c:2:
In file included from /lib/modules/5.4.0+/build/include/linux/ptrace.h:7:
In file included from /lib/modules/5.4.0+/build/include/linux/sched/signal.h:6:
/lib/modules/5.4.0+/build/include/linux/signal.h:150:1: warning: array index 1 is past the end of the array (which contains 1 element) [-Warray-bounds]
_SIG_SET_BINOP(sigandsets, _sig_and)
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/lib/modules/5.4.0+/build/include/linux/signal.h:134:8: note: expanded from macro '_SIG_SET_BINOP'
                a1 = a->sig[1]; b1 = b->sig[1];                         \
                     ^      ~
/lib/modules/5.4.0+/build/arch/x86/include/asm/signal.h:24:2: note: array 'sig' declared here
        unsigned long sig[_NSIG_WORDS];
        ^
In file included from /virtual/main.c:2:
In file included from /lib/modules/5.4.0+/build/include/linux/ptrace.h:7:
In file included from /lib/modules/5.4.0+/build/include/linux/sched/signal.h:6:
/lib/modules/5.4.0+/build/include/linux/signal.h:150:1: warning: array index 1 is past the end of the array (which contains 1 element) [-Warray-bounds]
_SIG_SET_BINOP(sigandsets, _sig_and)
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/lib/modules/5.4.0+/build/include/linux/signal.h:134:24: note: expanded from macro '_SIG_SET_BINOP'
                a1 = a->sig[1]; b1 = b->sig[1];                         \
                                     ^      ~
/lib/modules/5.4.0+/build/arch/x86/include/asm/signal.h:24:2: note: array 'sig' declared here
        unsigned long sig[_NSIG_WORDS];
        ^
In file included from /virtual/main.c:2:
In file included from /lib/modules/5.4.0+/build/include/linux/ptrace.h:7:
In file included from /lib/modules/5.4.0+/build/include/linux/sched/signal.h:6:
/lib/modules/5.4.0+/build/include/linux/signal.h:150:1: warning: array index 1 is past the end of the array (which contains 1 element) [-Warray-bounds]
_SIG_SET_BINOP(sigandsets, _sig_and)
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/lib/modules/5.4.0+/build/include/linux/signal.h:135:3: note: expanded from macro '_SIG_SET_BINOP'
                r->sig[1] = op(a1, b1);                                 \
                ^      ~
/lib/modules/5.4.0+/build/arch/x86/include/asm/signal.h:24:2: note: array 'sig' declared here
        unsigned long sig[_NSIG_WORDS];
        ^
In file included from /virtual/main.c:2:
In file included from /lib/modules/5.4.0+/build/include/linux/ptrace.h:7:
In file included from /lib/modules/5.4.0+/build/include/linux/sched/signal.h:6:
/lib/modules/5.4.0+/build/include/linux/signal.h:153:1: warning: array index 3 is past the end of the array (which contains 1 element) [-Warray-bounds]
_SIG_SET_BINOP(sigandnsets, _sig_andn)
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/lib/modules/5.4.0+/build/include/linux/signal.h:128:8: note: expanded from macro '_SIG_SET_BINOP'
                a3 = a->sig[3]; a2 = a->sig[2];                         \
                     ^      ~
/lib/modules/5.4.0+/build/arch/x86/include/asm/signal.h:24:2: note: array 'sig' declared here
        unsigned long sig[_NSIG_WORDS];
        ^
In file included from /virtual/main.c:2:
In file included from /lib/modules/5.4.0+/build/include/linux/ptrace.h:7:
In file included from /lib/modules/5.4.0+/build/include/linux/sched/signal.h:6:
/lib/modules/5.4.0+/build/include/linux/signal.h:153:1: warning: array index 2 is past the end of the array (which contains 1 element) [-Warray-bounds]
_SIG_SET_BINOP(sigandnsets, _sig_andn)
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/lib/modules/5.4.0+/build/include/linux/signal.h:128:24: note: expanded from macro '_SIG_SET_BINOP'
                a3 = a->sig[3]; a2 = a->sig[2];                         \
                                     ^      ~
/lib/modules/5.4.0+/build/arch/x86/include/asm/signal.h:24:2: note: array 'sig' declared here
        unsigned long sig[_NSIG_WORDS];
        ^
In file included from /virtual/main.c:2:
In file included from /lib/modules/5.4.0+/build/include/linux/ptrace.h:7:
In file included from /lib/modules/5.4.0+/build/include/linux/sched/signal.h:6:
/lib/modules/5.4.0+/build/include/linux/signal.h:153:1: warning: array index 3 is past the end of the array (which contains 1 element) [-Warray-bounds]
_SIG_SET_BINOP(sigandnsets, _sig_andn)
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/lib/modules/5.4.0+/build/include/linux/signal.h:129:8: note: expanded from macro '_SIG_SET_BINOP'
                b3 = b->sig[3]; b2 = b->sig[2];                         \
                     ^      ~
/lib/modules/5.4.0+/build/arch/x86/include/asm/signal.h:24:2: note: array 'sig' declared here
        unsigned long sig[_NSIG_WORDS];
        ^
In file included from /virtual/main.c:2:
In file included from /lib/modules/5.4.0+/build/include/linux/ptrace.h:7:
In file included from /lib/modules/5.4.0+/build/include/linux/sched/signal.h:6:
/lib/modules/5.4.0+/build/include/linux/signal.h:153:1: warning: array index 2 is past the end of the array (which contains 1 element) [-Warray-bounds]
_SIG_SET_BINOP(sigandnsets, _sig_andn)
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/lib/modules/5.4.0+/build/include/linux/signal.h:129:24: note: expanded from macro '_SIG_SET_BINOP'
                b3 = b->sig[3]; b2 = b->sig[2];                         \
                                     ^      ~
/lib/modules/5.4.0+/build/arch/x86/include/asm/signal.h:24:2: note: array 'sig' declared here
        unsigned long sig[_NSIG_WORDS];
        ^
In file included from /virtual/main.c:2:
In file included from /lib/modules/5.4.0+/build/include/linux/ptrace.h:7:
In file included from /lib/modules/5.4.0+/build/include/linux/sched/signal.h:6:
/lib/modules/5.4.0+/build/include/linux/signal.h:153:1: warning: array index 3 is past the end of the array (which contains 1 element) [-Warray-bounds]
_SIG_SET_BINOP(sigandnsets, _sig_andn)
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/lib/modules/5.4.0+/build/include/linux/signal.h:130:3: note: expanded from macro '_SIG_SET_BINOP'
                r->sig[3] = op(a3, b3);                                 \
                ^      ~
/lib/modules/5.4.0+/build/arch/x86/include/asm/signal.h:24:2: note: array 'sig' declared here
        unsigned long sig[_NSIG_WORDS];
        ^
In file included from /virtual/main.c:2:
In file included from /lib/modules/5.4.0+/build/include/linux/ptrace.h:7:
In file included from /lib/modules/5.4.0+/build/include/linux/sched/signal.h:6:
/lib/modules/5.4.0+/build/include/linux/signal.h:153:1: warning: array index 2 is past the end of the array (which contains 1 element) [-Warray-bounds]
_SIG_SET_BINOP(sigandnsets, _sig_andn)
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/lib/modules/5.4.0+/build/include/linux/signal.h:131:3: note: expanded from macro '_SIG_SET_BINOP'
                r->sig[2] = op(a2, b2);                                 \
                ^      ~
/lib/modules/5.4.0+/build/arch/x86/include/asm/signal.h:24:2: note: array 'sig' declared here
        unsigned long sig[_NSIG_WORDS];
        ^
In file included from /virtual/main.c:2:
In file included from /lib/modules/5.4.0+/build/include/linux/ptrace.h:7:
In file included from /lib/modules/5.4.0+/build/include/linux/sched/signal.h:6:
/lib/modules/5.4.0+/build/include/linux/signal.h:153:1: warning: array index 1 is past the end of the array (which contains 1 element) [-Warray-bounds]
_SIG_SET_BINOP(sigandnsets, _sig_andn)
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/lib/modules/5.4.0+/build/include/linux/signal.h:134:8: note: expanded from macro '_SIG_SET_BINOP'
                a1 = a->sig[1]; b1 = b->sig[1];                         \
                     ^      ~
/lib/modules/5.4.0+/build/arch/x86/include/asm/signal.h:24:2: note: array 'sig' declared here
        unsigned long sig[_NSIG_WORDS];
        ^
In file included from /virtual/main.c:2:
In file included from /lib/modules/5.4.0+/build/include/linux/ptrace.h:7:
In file included from /lib/modules/5.4.0+/build/include/linux/sched/signal.h:6:
/lib/modules/5.4.0+/build/include/linux/signal.h:153:1: warning: array index 1 is past the end of the array (which contains 1 element) [-Warray-bounds]
_SIG_SET_BINOP(sigandnsets, _sig_andn)
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/lib/modules/5.4.0+/build/include/linux/signal.h:134:24: note: expanded from macro '_SIG_SET_BINOP'
                a1 = a->sig[1]; b1 = b->sig[1];                         \
                                     ^      ~
/lib/modules/5.4.0+/build/arch/x86/include/asm/signal.h:24:2: note: array 'sig' declared here
        unsigned long sig[_NSIG_WORDS];
        ^
In file included from /virtual/main.c:2:
In file included from /lib/modules/5.4.0+/build/include/linux/ptrace.h:7:
In file included from /lib/modules/5.4.0+/build/include/linux/sched/signal.h:6:
/lib/modules/5.4.0+/build/include/linux/signal.h:153:1: warning: array index 1 is past the end of the array (which contains 1 element) [-Warray-bounds]
_SIG_SET_BINOP(sigandnsets, _sig_andn)
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/lib/modules/5.4.0+/build/include/linux/signal.h:135:3: note: expanded from macro '_SIG_SET_BINOP'
                r->sig[1] = op(a1, b1);                                 \
                ^      ~
/lib/modules/5.4.0+/build/arch/x86/include/asm/signal.h:24:2: note: array 'sig' declared here
        unsigned long sig[_NSIG_WORDS];
        ^
In file included from /virtual/main.c:2:
In file included from /lib/modules/5.4.0+/build/include/linux/ptrace.h:7:
In file included from /lib/modules/5.4.0+/build/include/linux/sched/signal.h:6:
/lib/modules/5.4.0+/build/include/linux/signal.h:177:1: warning: array index 3 is past the end of the array (which contains 1 element) [-Warray-bounds]
_SIG_SET_OP(signotset, _sig_not)
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/lib/modules/5.4.0+/build/include/linux/signal.h:164:27: note: expanded from macro '_SIG_SET_OP'
        case 4: set->sig[3] = op(set->sig[3]);                          \
                                 ^        ~
/lib/modules/5.4.0+/build/include/linux/signal.h:176:24: note: expanded from macro '_sig_not'
#define _sig_not(x)     (~(x))
                           ^
/lib/modules/5.4.0+/build/arch/x86/include/asm/signal.h:24:2: note: array 'sig' declared here
        unsigned long sig[_NSIG_WORDS];
        ^
In file included from /virtual/main.c:2:
In file included from /lib/modules/5.4.0+/build/include/linux/ptrace.h:7:
In file included from /lib/modules/5.4.0+/build/include/linux/sched/signal.h:6:
/lib/modules/5.4.0+/build/include/linux/signal.h:177:1: warning: array index 3 is past the end of the array (which contains 1 element) [-Warray-bounds]
_SIG_SET_OP(signotset, _sig_not)
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/lib/modules/5.4.0+/build/include/linux/signal.h:164:10: note: expanded from macro '_SIG_SET_OP'
        case 4: set->sig[3] = op(set->sig[3]);                          \
                ^        ~
/lib/modules/5.4.0+/build/arch/x86/include/asm/signal.h:24:2: note: array 'sig' declared here
        unsigned long sig[_NSIG_WORDS];
        ^
In file included from /virtual/main.c:2:
In file included from /lib/modules/5.4.0+/build/include/linux/ptrace.h:7:
In file included from /lib/modules/5.4.0+/build/include/linux/sched/signal.h:6:
/lib/modules/5.4.0+/build/include/linux/signal.h:177:1: warning: array index 2 is past the end of the array (which contains 1 element) [-Warray-bounds]
_SIG_SET_OP(signotset, _sig_not)
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/lib/modules/5.4.0+/build/include/linux/signal.h:165:20: note: expanded from macro '_SIG_SET_OP'
                set->sig[2] = op(set->sig[2]);                          \
                                 ^        ~
/lib/modules/5.4.0+/build/include/linux/signal.h:176:24: note: expanded from macro '_sig_not'
#define _sig_not(x)     (~(x))
                           ^
/lib/modules/5.4.0+/build/arch/x86/include/asm/signal.h:24:2: note: array 'sig' declared here
        unsigned long sig[_NSIG_WORDS];
        ^
In file included from /virtual/main.c:2:
In file included from /lib/modules/5.4.0+/build/include/linux/ptrace.h:7:
In file included from /lib/modules/5.4.0+/build/include/linux/sched/signal.h:6:
/lib/modules/5.4.0+/build/include/linux/signal.h:177:1: warning: array index 2 is past the end of the array (which contains 1 element) [-Warray-bounds]
_SIG_SET_OP(signotset, _sig_not)
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/lib/modules/5.4.0+/build/include/linux/signal.h:165:3: note: expanded from macro '_SIG_SET_OP'
                set->sig[2] = op(set->sig[2]);                          \
                ^        ~
/lib/modules/5.4.0+/build/arch/x86/include/asm/signal.h:24:2: note: array 'sig' declared here
        unsigned long sig[_NSIG_WORDS];
        ^
In file included from /virtual/main.c:2:
In file included from /lib/modules/5.4.0+/build/include/linux/ptrace.h:7:
In file included from /lib/modules/5.4.0+/build/include/linux/sched/signal.h:6:
/lib/modules/5.4.0+/build/include/linux/signal.h:177:1: warning: array index 1 is past the end of the array (which contains 1 element) [-Warray-bounds]
_SIG_SET_OP(signotset, _sig_not)
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/lib/modules/5.4.0+/build/include/linux/signal.h:167:27: note: expanded from macro '_SIG_SET_OP'
        case 2: set->sig[1] = op(set->sig[1]);                          \
                                 ^        ~
/lib/modules/5.4.0+/build/include/linux/signal.h:176:24: note: expanded from macro '_sig_not'
#define _sig_not(x)     (~(x))
                           ^
/lib/modules/5.4.0+/build/arch/x86/include/asm/signal.h:24:2: note: array 'sig' declared here
        unsigned long sig[_NSIG_WORDS];
        ^
In file included from /virtual/main.c:2:
In file included from /lib/modules/5.4.0+/build/include/linux/ptrace.h:7:
In file included from /lib/modules/5.4.0+/build/include/linux/sched/signal.h:6:
/lib/modules/5.4.0+/build/include/linux/signal.h:177:1: warning: array index 1 is past the end of the array (which contains 1 element) [-Warray-bounds]
_SIG_SET_OP(signotset, _sig_not)
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/lib/modules/5.4.0+/build/include/linux/signal.h:167:10: note: expanded from macro '_SIG_SET_OP'
        case 2: set->sig[1] = op(set->sig[1]);                          \
                ^        ~
/lib/modules/5.4.0+/build/arch/x86/include/asm/signal.h:24:2: note: array 'sig' declared here
        unsigned long sig[_NSIG_WORDS];
        ^
In file included from /virtual/main.c:2:
In file included from /lib/modules/5.4.0+/build/include/linux/ptrace.h:7:
In file included from /lib/modules/5.4.0+/build/include/linux/sched/signal.h:6:
/lib/modules/5.4.0+/build/include/linux/signal.h:188:10: warning: array index 1 is past the end of the array (which contains 1 element) [-Warray-bounds]
        case 2: set->sig[1] = 0;
                ^        ~
/lib/modules/5.4.0+/build/arch/x86/include/asm/signal.h:24:2: note: array 'sig' declared here
        unsigned long sig[_NSIG_WORDS];
        ^
In file included from /virtual/main.c:2:
In file included from /lib/modules/5.4.0+/build/include/linux/ptrace.h:7:
In file included from /lib/modules/5.4.0+/build/include/linux/sched/signal.h:6:
/lib/modules/5.4.0+/build/include/linux/signal.h:201:10: warning: array index 1 is past the end of the array (which contains 1 element) [-Warray-bounds]
        case 2: set->sig[1] = -1;
                ^        ~
/lib/modules/5.4.0+/build/arch/x86/include/asm/signal.h:24:2: note: array 'sig' declared here
        unsigned long sig[_NSIG_WORDS];
        ^
In file included from /virtual/main.c:2:
In file included from /lib/modules/5.4.0+/build/include/linux/ptrace.h:7:
In file included from /lib/modules/5.4.0+/build/include/linux/sched/signal.h:6:
/lib/modules/5.4.0+/build/include/linux/signal.h:232:10: warning: array index 1 is past the end of the array (which contains 1 element) [-Warray-bounds]
        case 2: set->sig[1] = 0;
                ^        ~
/lib/modules/5.4.0+/build/arch/x86/include/asm/signal.h:24:2: note: array 'sig' declared here
        unsigned long sig[_NSIG_WORDS];
        ^
In file included from /virtual/main.c:2:
In file included from /lib/modules/5.4.0+/build/include/linux/ptrace.h:7:
In file included from /lib/modules/5.4.0+/build/include/linux/sched/signal.h:6:
/lib/modules/5.4.0+/build/include/linux/signal.h:244:10: warning: array index 1 is past the end of the array (which contains 1 element) [-Warray-bounds]
        case 2: set->sig[1] = -1;
                ^        ~
/lib/modules/5.4.0+/build/arch/x86/include/asm/signal.h:24:2: note: array 'sig' declared here
        unsigned long sig[_NSIG_WORDS];
        ^
In file included from /virtual/main.c:2:
In file included from /lib/modules/5.4.0+/build/include/linux/ptrace.h:7:
In file included from /lib/modules/5.4.0+/build/include/linux/sched/signal.h:9:
In file included from /lib/modules/5.4.0+/build/include/linux/sched/task.h:11:
In file included from /lib/modules/5.4.0+/build/include/linux/uaccess.h:11:
In file included from /lib/modules/5.4.0+/build/arch/x86/include/asm/uaccess.h:12:
/lib/modules/5.4.0+/build/arch/x86/include/asm/smap.h:47:2: error: expected '(' after 'asm'
        alternative("", __ASM_CLAC, X86_FEATURE_SMAP);
        ^
/lib/modules/5.4.0+/build/arch/x86/include/asm/alternative.h:204:2: note: expanded from macro 'alternative'
        asm_inline volatile (ALTERNATIVE(oldinstr, newinstr, feature) : : : "memory")
        ^
/lib/modules/5.4.0+/build/include/linux/compiler_types.h:210:24: note: expanded from macro 'asm_inline'
#define asm_inline asm __inline
                       ^
In file included from /virtual/main.c:2:
In file included from /lib/modules/5.4.0+/build/include/linux/ptrace.h:7:
In file included from /lib/modules/5.4.0+/build/include/linux/sched/signal.h:9:
In file included from /lib/modules/5.4.0+/build/include/linux/sched/task.h:11:
In file included from /lib/modules/5.4.0+/build/include/linux/uaccess.h:11:
In file included from /lib/modules/5.4.0+/build/arch/x86/include/asm/uaccess.h:12:
/lib/modules/5.4.0+/build/arch/x86/include/asm/smap.h:53:2: error: expected '(' after 'asm'
        alternative("", __ASM_STAC, X86_FEATURE_SMAP);
        ^
/lib/modules/5.4.0+/build/arch/x86/include/asm/alternative.h:204:2: note: expanded from macro 'alternative'
        asm_inline volatile (ALTERNATIVE(oldinstr, newinstr, feature) : : : "memory")
        ^
/lib/modules/5.4.0+/build/include/linux/compiler_types.h:210:24: note: expanded from macro 'asm_inline'
#define asm_inline asm __inline
                       ^
In file included from /virtual/main.c:2:
In file included from /lib/modules/5.4.0+/build/include/linux/ptrace.h:7:
In file included from /lib/modules/5.4.0+/build/include/linux/sched/signal.h:9:
In file included from /lib/modules/5.4.0+/build/include/linux/sched/task.h:11:
In file included from /lib/modules/5.4.0+/build/include/linux/uaccess.h:11:
In file included from /lib/modules/5.4.0+/build/arch/x86/include/asm/uaccess.h:694:
/lib/modules/5.4.0+/build/arch/x86/include/asm/uaccess_64.h:37:2: error: expected '(' after 'asm'
        alternative_call_2(copy_user_generic_unrolled,
        ^
/lib/modules/5.4.0+/build/arch/x86/include/asm/alternative.h:256:2: note: expanded from macro 'alternative_call_2'
        asm_inline volatile (ALTERNATIVE_2("call %P[old]", "call %P[new1]", feature1,\
        ^
/lib/modules/5.4.0+/build/include/linux/compiler_types.h:210:24: note: expanded from macro 'asm_inline'
#define asm_inline asm __inline
                       ^
In file included from /virtual/main.c:2:
In file included from /lib/modules/5.4.0+/build/include/linux/ptrace.h:7:
In file included from /lib/modules/5.4.0+/build/include/linux/sched/signal.h:9:
In file included from /lib/modules/5.4.0+/build/include/linux/sched/task.h:11:
In file included from /lib/modules/5.4.0+/build/include/linux/uaccess.h:11:
In file included from /lib/modules/5.4.0+/build/arch/x86/include/asm/uaccess.h:694:
/lib/modules/5.4.0+/build/arch/x86/include/asm/uaccess_64.h:74:3: error: expected '(' after 'asm'
                __uaccess_begin_nospec();
                ^
/lib/modules/5.4.0+/build/arch/x86/include/asm/uaccess.h:125:2: note: expanded from macro '__uaccess_begin_nospec'
        barrier_nospec();               \
        ^
/lib/modules/5.4.0+/build/arch/x86/include/asm/barrier.h:52:26: note: expanded from macro 'barrier_nospec'
#define barrier_nospec() alternative("", "lfence", X86_FEATURE_LFENCE_RDTSC)
                         ^
/lib/modules/5.4.0+/build/arch/x86/include/asm/alternative.h:204:2: note: expanded from macro 'alternative'
        asm_inline volatile (ALTERNATIVE(oldinstr, newinstr, feature) : : : "memory")
        ^
/lib/modules/5.4.0+/build/include/linux/compiler_types.h:210:24: note: expanded from macro 'asm_inline'
#define asm_inline asm __inline
                       ^
In file included from /virtual/main.c:2:
In file included from /lib/modules/5.4.0+/build/include/linux/ptrace.h:7:
In file included from /lib/modules/5.4.0+/build/include/linux/sched/signal.h:9:
In file included from /lib/modules/5.4.0+/build/include/linux/sched/task.h:11:
In file included from /lib/modules/5.4.0+/build/include/linux/uaccess.h:11:
In file included from /lib/modules/5.4.0+/build/arch/x86/include/asm/uaccess.h:694:
/lib/modules/5.4.0+/build/arch/x86/include/asm/uaccess_64.h:80:3: error: expected '(' after 'asm'
                __uaccess_begin_nospec();
                ^
/lib/modules/5.4.0+/build/arch/x86/include/asm/uaccess.h:125:2: note: expanded from macro '__uaccess_begin_nospec'
        barrier_nospec();               \
        ^
/lib/modules/5.4.0+/build/arch/x86/include/asm/barrier.h:52:26: note: expanded from macro 'barrier_nospec'
#define barrier_nospec() alternative("", "lfence", X86_FEATURE_LFENCE_RDTSC)
                         ^
/lib/modules/5.4.0+/build/arch/x86/include/asm/alternative.h:204:2: note: expanded from macro 'alternative'
        asm_inline volatile (ALTERNATIVE(oldinstr, newinstr, feature) : : : "memory")
        ^
/lib/modules/5.4.0+/build/include/linux/compiler_types.h:210:24: note: expanded from macro 'asm_inline'
#define asm_inline asm __inline
                       ^
In file included from /virtual/main.c:2:
In file included from /lib/modules/5.4.0+/build/include/linux/ptrace.h:7:
In file included from /lib/modules/5.4.0+/build/include/linux/sched/signal.h:9:
In file included from /lib/modules/5.4.0+/build/include/linux/sched/task.h:11:
In file included from /lib/modules/5.4.0+/build/include/linux/uaccess.h:11:
In file included from /lib/modules/5.4.0+/build/arch/x86/include/asm/uaccess.h:694:
/lib/modules/5.4.0+/build/arch/x86/include/asm/uaccess_64.h:86:3: error: expected '(' after 'asm'
                __uaccess_begin_nospec();
                ^
/lib/modules/5.4.0+/build/arch/x86/include/asm/uaccess.h:125:2: note: expanded from macro '__uaccess_begin_nospec'
        barrier_nospec();               \
        ^
/lib/modules/5.4.0+/build/arch/x86/include/asm/barrier.h:52:26: note: expanded from macro 'barrier_nospec'
#define barrier_nospec() alternative("", "lfence", X86_FEATURE_LFENCE_RDTSC)
                         ^
/lib/modules/5.4.0+/build/arch/x86/include/asm/alternative.h:204:2: note: expanded from macro 'alternative'
        asm_inline volatile (ALTERNATIVE(oldinstr, newinstr, feature) : : : "memory")
        ^
/lib/modules/5.4.0+/build/include/linux/compiler_types.h:210:24: note: expanded from macro 'asm_inline'
#define asm_inline asm __inline
                       ^
In file included from /virtual/main.c:2:
In file included from /lib/modules/5.4.0+/build/include/linux/ptrace.h:7:
In file included from /lib/modules/5.4.0+/build/include/linux/sched/signal.h:9:
In file included from /lib/modules/5.4.0+/build/include/linux/sched/task.h:11:
In file included from /lib/modules/5.4.0+/build/include/linux/uaccess.h:11:
In file included from /lib/modules/5.4.0+/build/arch/x86/include/asm/uaccess.h:694:
/lib/modules/5.4.0+/build/arch/x86/include/asm/uaccess_64.h:92:3: error: expected '(' after 'asm'
                __uaccess_begin_nospec();
                ^
/lib/modules/5.4.0+/build/arch/x86/include/asm/uaccess.h:125:2: note: expanded from macro '__uaccess_begin_nospec'
        barrier_nospec();               \
        ^
/lib/modules/5.4.0+/build/arch/x86/include/asm/barrier.h:52:26: note: expanded from macro 'barrier_nospec'
#define barrier_nospec() alternative("", "lfence", X86_FEATURE_LFENCE_RDTSC)
                         ^
/lib/modules/5.4.0+/build/arch/x86/include/asm/alternative.h:204:2: note: expanded from macro 'alternative'
        asm_inline volatile (ALTERNATIVE(oldinstr, newinstr, feature) : : : "memory")
        ^
/lib/modules/5.4.0+/build/include/linux/compiler_types.h:210:24: note: expanded from macro 'asm_inline'
#define asm_inline asm __inline
                       ^
In file included from /virtual/main.c:2:
In file included from /lib/modules/5.4.0+/build/include/linux/ptrace.h:7:
In file included from /lib/modules/5.4.0+/build/include/linux/sched/signal.h:9:
In file included from /lib/modules/5.4.0+/build/include/linux/sched/task.h:11:
In file included from /lib/modules/5.4.0+/build/include/linux/uaccess.h:11:
In file included from /lib/modules/5.4.0+/build/arch/x86/include/asm/uaccess.h:694:
/lib/modules/5.4.0+/build/arch/x86/include/asm/uaccess_64.h:98:3: error: expected '(' after 'asm'
                __uaccess_begin_nospec();
                ^
/lib/modules/5.4.0+/build/arch/x86/include/asm/uaccess.h:125:2: note: expanded from macro '__uaccess_begin_nospec'
        barrier_nospec();               \
        ^
/lib/modules/5.4.0+/build/arch/x86/include/asm/barrier.h:52:26: note: expanded from macro 'barrier_nospec'
#define barrier_nospec() alternative("", "lfence", X86_FEATURE_LFENCE_RDTSC)
                         ^
/lib/modules/5.4.0+/build/arch/x86/include/asm/alternative.h:204:2: note: expanded from macro 'alternative'
        asm_inline volatile (ALTERNATIVE(oldinstr, newinstr, feature) : : : "memory")
        ^
/lib/modules/5.4.0+/build/include/linux/compiler_types.h:210:24: note: expanded from macro 'asm_inline'
#define asm_inline asm __inline
                       ^
fatal error: too many errors emitted, stopping now [-ferror-limit=]
49 warnings and 20 errors generated.
Failed to compile BPF text

[-- Attachment #3: test --]
[-- Type: application/x-shellscript, Size: 1080 bytes --]

^ permalink raw reply	[flat|nested] 102+ messages in thread

* Re: Slow I/O on USB media after commit f664a3cc17b7d0a2bc3b3ab96181e1029b0ec0e6
  2019-11-26  7:46                                                                       ` Andrea Vai
  2019-11-26  9:15                                                                         ` Ming Lei
@ 2019-11-27  0:21                                                                         ` Finn Thain
  2019-11-27  8:14                                                                           ` AW: " Schmid, Carsten
  2019-11-28 17:10                                                                           ` Andrea Vai
  1 sibling, 2 replies; 102+ messages in thread
From: Finn Thain @ 2019-11-27  0:21 UTC (permalink / raw)
  To: Andrea Vai
  Cc: Ming Lei, Damien Le Moal, Alan Stern, Jens Axboe,
	Johannes Thumshirn, USB list, SCSI development list,
	Himanshu Madhani, Hannes Reinecke, Omar Sandoval,
	Martin K. Petersen, Greg KH, Hans Holmberg,
	Kernel development list

On Tue, 26 Nov 2019, Andrea Vai wrote:

> Then I started another set of 100 trials and let them run tonight, and 
> the first 10 trials were around 1000s, then gradually decreased to 
> ~300s, and finally settled around 200s with some trials below 70-80s. 
> This to say, times are extremely variable and for the first time I 
> noticed a sort of "performance increase" with time.
> 

The sheer volume of testing (probably some terabytes by now) would 
exercise the wear leveling algorithm in the FTL.

This in itself seems unlikely to improve performance significantly. But if 
the flash memory came from a bad batch, perhaps it would have that effect.

To find out, someone may need to source another (genuine) Kingston 
DataTraveler device.

-- 

^ permalink raw reply	[flat|nested] 102+ messages in thread

* Re: Slow I/O on USB media after commit f664a3cc17b7d0a2bc3b3ab96181e1029b0ec0e6
  2019-11-26 11:14                                                                           ` Andrea Vai
@ 2019-11-27  2:05                                                                             ` Ming Lei
  2019-11-27  9:39                                                                               ` Andrea Vai
  0 siblings, 1 reply; 102+ messages in thread
From: Ming Lei @ 2019-11-27  2:05 UTC (permalink / raw)
  To: Andrea Vai
  Cc: Damien Le Moal, Alan Stern, Jens Axboe, Johannes Thumshirn,
	USB list, SCSI development list, Himanshu Madhani,
	Hannes Reinecke, Omar Sandoval, Martin K. Petersen, Greg KH,
	Hans Holmberg, Kernel development list

On Tue, Nov 26, 2019 at 12:14:19PM +0100, Andrea Vai wrote:
> On Tue, 26/11/2019 at 17.15 +0800, Ming Lei wrote:
> > On Tue, Nov 26, 2019 at 08:46:07AM +0100, Andrea Vai wrote:
> > > On Tue, 26/11/2019 at 10.32 +0800, Ming Lei wrote:
> > > > On Mon, Nov 25, 2019 at 07:51:33PM +0100, Andrea Vai wrote:
> > > > > On Mon, 25/11/2019 at 23.15 +0800, Ming Lei wrote:
> > > > > > On Mon, Nov 25, 2019 at 03:58:34PM +0100, Andrea Vai wrote:
> > > > > > 
> > > > > > [...]
> > > > > > 
> > > > > > > What to try next?
> > > > > > 
> > > > > > 1) cat /sys/kernel/debug/block/$DISK/hctx0/flags
> > > > > result:
> > > > > 
> > > > > alloc_policy=FIFO SHOULD_MERGE|2
> > > > > 
> > > > > > 
> > > > > > 
> > > > > > 2) echo 128 > /sys/block/$DISK/queue/nr_requests and run your
> > > > > > copy 1GB test again.
> > > > > 
> > > > > done, and still fails. What to try next?
> > > > 
> > > > I just run 256M cp test
> > > 
> > > I would like to point out that 256MB is a filesize that usually
> > > doesn't trigger the issue (don't know if it matters, sorry).
> > 
> > OK.
> > 
> > I tested 256M because an IO timeout is often triggered in the case of
> > qemu-ehci, and that is a long-standing issue. When setting up the disk
> > via xhci-qemu, the max request size increases from 120KB to 1MB,
> > and the IO pattern changes too. When the disk is connected via
> > uhci-qemu, the transfer is too slow (1MB/s) because the max endpoint
> > size is too small.
> > 
> > However, I just waited 16min and collected all the 1GB IO log by
> > connecting disk over uhci-qemu, but the sector of each data IO
> > is still in order.
> > 
> > > 
> > > Another info I would provide is about another strange behavior I
> > > noticed: yesterday I ran the test two times (as usual with 1GB
> > > filesize) and took 2370s, 1786s, and a third test was going on
> > > when I stopped it. Then I started another set of 100 trials and let
> > > them run tonight, and the first 10 trials were around 1000s, then
> > > gradually decreased to ~300s, and finally settled around 200s with
> > > some trials below 70-80s. This to say, times are extremely variable
> > > and for the first time I noticed a sort of "performance increase"
> > > with time.
> > 
> > The 'cp' test is buffered IO, can you reproduce it every time by
> > running copy just after fresh mount on the USB disk?
> 
> yes, every time my test script (attached) mounts, copy, unmount (but I
> don't unplug and replug the pendrive each time). Is this enough?
> 
> > 
> > > 
> > > >  to one USB storage device on patched kernel,
> > > > and WRITE data IO is really in ascending order. The filesystem is
> > > > ext4, and mount without '-o sync'. From previous discussion, looks
> > > > that is exactly your test setting. The order can be observed via the
> > > > following script:
> > > > #!/bin/sh
> > > > MAJ=$1
> > > > MIN=$2
> > > > MAJ=$(( $MAJ << 20 ))
> > > > DEV=$(( $MAJ | $MIN ))
> > > > /usr/share/bcc/tools/trace -t -C \
> > > >   't:block:block_rq_issue (args->dev == '$DEV') "%s %d %d", args->rwbs, args->sector, args->nr_sector'
> > > > 
> > > > $MAJ & $MIN can be retrieved via lsblk for your USB storage disk.
> 
> ok, so I try:
> 
> # lsblk /dev/sdf
> NAME   MAJ:MIN RM  SIZE RO TYPE MOUNTPOINT
> sdf      8:80   1 28,8G  0 disk 
> └─sdf1   8:81   1 28,8G  0 part 
> 
> so I ran your script (the second one, which you sent me in the next
> email message) with:
> 
> ./test_ming 8 80
> 
> but it fails to run (terminal output is in attached errors.txt).
> What am I doing wrong?
> 
> It's still not clear to me if I need to start the trace script and
> then the test, or the opposite (or doesn't matter). The above errors
> are in the former case (I didn't even start the test, actually)
> 
> Thanks,
> Andrea
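
For reference, the filter value that the script builds from the two
arguments can be checked by hand. The sketch below only mirrors the
script's own arithmetic (major shifted left by 20 bits, OR-ed with the
minor); the function names are made up for illustration and are not part
of bcc or of the kernel.

```python
# Mirrors the shell arithmetic in the script quoted above:
#   MAJ=$(( $MAJ << 20 )); DEV=$(( $MAJ | $MIN ))
MINOR_BITS = 20  # taken from the "<< 20" in the script


def trace_dev(major, minor):
    """Combine major:minor the same way the trace script does (hypothetical helper)."""
    return (major << MINOR_BITS) | minor


def split_trace_dev(dev):
    """Inverse of trace_dev(): recover (major, minor) from the combined value."""
    return dev >> MINOR_BITS, dev & ((1 << MINOR_BITS) - 1)


# lsblk reported 8:80 for sdf, so "./test_ming 8 80" filters on this value:
dev = trace_dev(8, 80)
print(dev)
print(split_trace_dev(dev))
```

Note that 8:80 is the whole disk (sdf), while the partition sdf1 is 8:81
and would give a different value.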

> In file included from /virtual/main.c:2:
> In file included from /lib/modules/5.4.0+/build/include/linux/ptrace.h:6:
> In file included from /lib/modules/5.4.0+/build/include/linux/sched.h:14:
> In file included from /lib/modules/5.4.0+/build/include/linux/pid.h:5:
> In file included from /lib/modules/5.4.0+/build/include/linux/rculist.h:11:
> In file included from /lib/modules/5.4.0+/build/include/linux/rcupdate.h:26:
> In file included from /lib/modules/5.4.0+/build/include/linux/irqflags.h:16:
> In file included from /lib/modules/5.4.0+/build/arch/x86/include/asm/irqflags.h:9:
> In file included from /lib/modules/5.4.0+/build/arch/x86/include/asm/nospec-branch.h:314:
> /lib/modules/5.4.0+/build/arch/x86/include/asm/segment.h:254:2: error: expected '(' after 'asm'
>         alternative_io ("lsl %[seg],%[p]",
>         ^
> /lib/modules/5.4.0+/build/arch/x86/include/asm/alternative.h:240:2: note: expanded from macro 'alternative_io'
>         asm_inline volatile (ALTERNATIVE(oldinstr, newinstr, feature)   \
>         ^
> /lib/modules/5.4.0+/build/include/linux/compiler_types.h:210:24: note: expanded from macro 'asm_inline'
> #define asm_inline asm __inline
>                        ^
> In file included from /virtual/main.c:2:
> In file included from /lib/modules/5.4.0+/build/include/linux/ptrace.h:6:
> In file included from /lib/modules/5.4.0+/build/include/linux/sched.h:14:
> In file included from /lib/modules/5.4.0+/build/include/linux/pid.h:5:
> In file included from /lib/modules/5.4.0+/build/include/linux/rculist.h:11:
> In file included from /lib/modules/5.4.0+/build/include/linux/rcupdate.h:27:
> In file included from /lib/modules/5.4.0+/build/include/linux/preempt.h:78:
> In file included from /lib/modules/5.4.0+/build/arch/x86/include/asm/preempt.h:7:
> In file included from /lib/modules/5.4.0+/build/include/linux/thread_info.h:38:
> In file included from /lib/modules/5.4.0+/build/arch/x86/include/asm/thread_info.h:12:
> In file included from /lib/modules/5.4.0+/build/arch/x86/include/asm/page.h:12:
> /lib/modules/5.4.0+/build/arch/x86/include/asm/page_64.h:49:2: error: expected '(' after 'asm'
>         alternative_call_2(clear_page_orig,
>         ^
> /lib/modules/5.4.0+/build/arch/x86/include/asm/alternative.h:256:2: note: expanded from macro 'alternative_call_2'
>         asm_inline volatile (ALTERNATIVE_2("call %P[old]", "call %P[new1]", feature1,\
>         ^
> /lib/modules/5.4.0+/build/include/linux/compiler_types.h:210:24: note: expanded from macro 'asm_inline'
> #define asm_inline asm __inline


It can be worked around via the following change:

/lib/modules/5.4.0+/build/include/generated/autoconf.h:

//#define CONFIG_CC_HAS_ASM_INLINE 1
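
The errors above all point at the 'asm_inline' macro expanding to
'asm __inline', which the compiler used by the trace tool rejects. With
the define commented out, the headers should fall back to a plain 'asm'
(this is my reading of compiler_types.h, please double check on your
tree). If editing the file by hand is inconvenient, the same change can
be sketched in a few lines of Python. The helper name and the sample
header text are made up for illustration, and since autoconf.h is a
generated file the edit will be lost when the kernel configuration is
regenerated.

```python
import re


def comment_out_define(header_text, macro):
    """Prefix '#define <macro> ...' lines with '//' and leave every other line alone.

    Lines that are already commented out do not match, so running this
    twice gives the same result as running it once.
    """
    pattern = re.compile(
        r"^([ \t]*#[ \t]*define[ \t]+" + re.escape(macro) + r"\b.*)$",
        re.MULTILINE,
    )
    return pattern.sub(r"//\1", header_text)


# Made-up stand-in for a few lines of include/generated/autoconf.h.
sample = (
    "#define CONFIG_EXAMPLE_BEFORE 1\n"
    "#define CONFIG_CC_HAS_ASM_INLINE 1\n"
    "#define CONFIG_EXAMPLE_AFTER 1\n"
)

patched = comment_out_define(sample, "CONFIG_CC_HAS_ASM_INLINE")
print(patched)
```

To apply it for real, read the autoconf.h path given above, pass its
contents through the function and write the result back (keeping a
backup copy first is probably wise).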


Thanks,
Ming


^ permalink raw reply	[flat|nested] 102+ messages in thread

* AW: Slow I/O on USB media after commit f664a3cc17b7d0a2bc3b3ab96181e1029b0ec0e6
  2019-11-27  0:21                                                                         ` Finn Thain
@ 2019-11-27  8:14                                                                           ` Schmid, Carsten
  2019-11-27 21:49                                                                             ` Finn Thain
  2019-11-28  7:46                                                                             ` Andrea Vai
  2019-11-28 17:10                                                                           ` Andrea Vai
  1 sibling, 2 replies; 102+ messages in thread
From: Schmid, Carsten @ 2019-11-27  8:14 UTC (permalink / raw)
  To: Finn Thain, Andrea Vai
  Cc: Ming Lei, Damien Le Moal, Alan Stern, Jens Axboe,
	Johannes Thumshirn, USB list, SCSI development list,
	Himanshu Madhani, Hannes Reinecke, Omar Sandoval,
	Martin K. Petersen, Greg KH, Hans Holmberg,
	Kernel development list

> 
> > Then I started another set of 100 trials and let them run tonight, and
> > the first 10 trials were around 1000s, then gradually decreased to
> > ~300s, and finally settled around 200s with some trials below 70-80s.
> > This to say, times are extremely variable and for the first time I
> > noticed a sort of "performance increase" with time.
> >
> 
> The sheer volume of testing (probably some terabytes by now) would
> exercise the wear leveling algorithm in the FTL.
> 
But with the "old kernel" the copy operation is still "fast", as far as I understood.
If the FTL (e.g. wear leveling) were slowing things down, we would also see that
with the old kernel, right?

Andrea, can you confirm that the same device used with the old fast
kernel is still fast today?

BR
Carsten

^ permalink raw reply	[flat|nested] 102+ messages in thread

* Re: Slow I/O on USB media after commit f664a3cc17b7d0a2bc3b3ab96181e1029b0ec0e6
  2019-11-27  2:05                                                                             ` Ming Lei
@ 2019-11-27  9:39                                                                               ` Andrea Vai
  2019-11-27 13:08                                                                                 ` Ming Lei
  0 siblings, 1 reply; 102+ messages in thread
From: Andrea Vai @ 2019-11-27  9:39 UTC (permalink / raw)
  To: Ming Lei
  Cc: Damien Le Moal, Alan Stern, Jens Axboe, Johannes Thumshirn,
	USB list, SCSI development list, Himanshu Madhani,
	Hannes Reinecke, Omar Sandoval, Martin K. Petersen, Greg KH,
	Hans Holmberg, Kernel development list

[-- Attachment #1: Type: text/plain, Size: 503 bytes --]

On Wed, 27/11/2019 at 10.05 +0800, Ming Lei wrote:
> 
> 
> It can be worked around via the following change:
> 
> /lib/modules/5.4.0+/build/include/generated/autoconf.h:
> 
> //#define CONFIG_CC_HAS_ASM_INLINE 1

Thanks, it worked, trace attached. Produced by: start the trace script
(with the pendrive already plugged), wait some seconds, run the test
(1 trial, 1 GB), wait for the test to finish, stop the trace.

The copy took 2659 seconds, roughly as already seen before.

Thanks,
Andrea

[-- Attachment #2: log_ming.zip --]
[-- Type: application/zip, Size: 111658 bytes --]

^ permalink raw reply	[flat|nested] 102+ messages in thread

* Re: Slow I/O on USB media after commit f664a3cc17b7d0a2bc3b3ab96181e1029b0ec0e6
  2019-11-27  9:39                                                                               ` Andrea Vai
@ 2019-11-27 13:08                                                                                 ` Ming Lei
  2019-11-27 15:01                                                                                   ` Andrea Vai
  0 siblings, 1 reply; 102+ messages in thread
From: Ming Lei @ 2019-11-27 13:08 UTC (permalink / raw)
  To: Andrea Vai
  Cc: Damien Le Moal, Alan Stern, Jens Axboe, Johannes Thumshirn,
	USB list, SCSI development list, Himanshu Madhani,
	Hannes Reinecke, Omar Sandoval, Martin K. Petersen, Greg KH,
	Hans Holmberg, Kernel development list

On Wed, Nov 27, 2019 at 10:39:40AM +0100, Andrea Vai wrote:
> Il giorno mer, 27/11/2019 alle 10.05 +0800, Ming Lei ha scritto:
> > 
> > 
> > It can be workaround via the following change:
> > 
> > /lib/modules/5.4.0+/build/include/generated/autoconf.h:
> > 
> > //#define CONFIG_CC_HAS_ASM_INLINE 1
> 
> Thanks, it worked, trace attached. Produced by: start the trace script
> (with the pendrive already plugged), wait some seconds, run the test
> (1 trial, 1 GB), wait for the test to finish, stop the trace.
> 
> The copy took 2659 seconds, roughly as already seen before.

Thanks for collecting the log.

From the log, some of the write IOs are issued out of order; the first
such request is the one at sector 378880.

16.41240 2   266     266     kworker/2:1H    block_rq_issue   b'W' 370656 240
16.41961 3   485     485     kworker/3:1H    block_rq_issue   b'W' 378880 240
16.73729 2   266     266     kworker/2:1H    block_rq_issue   b'W' 370896 240
17.71161 2   266     266     kworker/2:1H    block_rq_issue   b'W' 379120 240
18.02344 2   266     266     kworker/2:1H    block_rq_issue   b'W' 371136 240
18.94314 3   485     485     kworker/3:1H    block_rq_issue   b'W' 379360 240
19.25624 2   266     266     kworker/2:1H    block_rq_issue   b'W' 371376 240

IO latency increases a lot starting from the first out-of-order request
(the USB storage HBA has a queue depth of one, so a request can be
issued only after the previously issued request has completed).

The reason is that two kinds of tasks insert requests to the device:
one is the 'cp' process, the other is kworker/u8:*.  The reordering
happens when the two tasks interleave.
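
To make the reordering in the excerpt above easier to spot, here is a
minimal sketch that flags every issued write which does not start where
the previous one ended. It assumes the bcc trace output format shown
above; the names parse_issues and non_sequential are made up for this
illustration and are not part of bcc.

```python
import re

# Lines copied from the trace excerpt above.
SAMPLE = """\
16.41240 2   266     266     kworker/2:1H    block_rq_issue   b'W' 370656 240
16.41961 3   485     485     kworker/3:1H    block_rq_issue   b'W' 378880 240
16.73729 2   266     266     kworker/2:1H    block_rq_issue   b'W' 370896 240
17.71161 2   266     266     kworker/2:1H    block_rq_issue   b'W' 379120 240
18.02344 2   266     266     kworker/2:1H    block_rq_issue   b'W' 371136 240
18.94314 3   485     485     kworker/3:1H    block_rq_issue   b'W' 379360 240
19.25624 2   266     266     kworker/2:1H    block_rq_issue   b'W' 371376 240
"""

# timestamp ... block_rq_issue b'<rwbs>' <sector> <nr_sector>
LINE_RE = re.compile(
    r"^(?P<ts>\d+\.\d+)\s+.*\bblock_rq_issue\s+"
    r"b'(?P<rwbs>\w+)'\s+(?P<sector>\d+)\s+(?P<nr>\d+)$"
)


def parse_issues(text):
    """Return (timestamp, sector, nr_sector) for each issued write."""
    issues = []
    for line in text.splitlines():
        m = LINE_RE.match(line.strip())
        if m and "W" in m.group("rwbs"):
            issues.append(
                (float(m.group("ts")), int(m.group("sector")), int(m.group("nr")))
            )
    return issues


def non_sequential(issues):
    """Return the requests that do not start where the previous one ended."""
    breaks = []
    for prev, cur in zip(issues, issues[1:]):
        if cur[1] != prev[1] + prev[2]:
            breaks.append(cur)
    return breaks


if __name__ == "__main__":
    issues = parse_issues(SAMPLE)
    for ts, sector, nr in non_sequential(issues):
        print(f"{ts:.5f}: write at sector {sector} breaks the sequence")
```

Running the same functions over a full trace file would give a rough
count of how often the device sees a non-sequential write.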

In this situation, I believe that the old legacy IO path may not
guarantee the order either. In blk_queue_bio(), after get_request()
allocates a request, the queue lock is released, and the request is
actually inserted and issued from blk_flush_plug_list() under the
'if (plug)' branch. If requests come from two tasks, they are
inserted/issued from two plug lists, and no order can be guaranteed.
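
As a toy illustration of the point above (not kernel code), the sketch
below models two tasks that each hold a sequential per-task list and
drains the lists alternately. The strict alternation, the starting
sectors, and the names make_plug_list and interleaved_flush are all
assumptions chosen for the example; in the kernel the interleaving
depends on timing.

```python
from itertools import zip_longest


def make_plug_list(start_sector, nr_requests, nr_sectors=240):
    """Build a per-task list of sequential request start sectors."""
    return [start_sector + i * nr_sectors for i in range(nr_requests)]


def interleaved_flush(plug_a, plug_b):
    """Dispatch order when two lists are drained alternately (toy model)."""
    order = []
    for a, b in zip_longest(plug_a, plug_b):
        if a is not None:
            order.append(a)
        if b is not None:
            order.append(b)
    return order


def is_sequential(sectors, nr_sectors=240):
    """True if every request starts where the previous one ended."""
    return all(b == a + nr_sectors for a, b in zip(sectors, sectors[1:]))


if __name__ == "__main__":
    task_a = make_plug_list(370656, 4)  # hypothetical first task
    task_b = make_plug_list(378880, 3)  # hypothetical second task
    order = interleaved_flush(task_a, task_b)
    print("task A sequential:", is_sequential(task_a))
    print("task B sequential:", is_sequential(task_b))
    print("dispatch order sequential:", is_sequential(order))
    print(order)
```

Each list is sequential on its own, yet the merged dispatch order is
not, which is the pattern visible in the block_rq_issue excerpt.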

In my test, apart from a few requests at the beginning, all requests
are inserted via the kworker thread (I guess it is the writeback wq),
which is why I can't observe the issue.

As Schmid suggested, you may run the same test on the old kernel with
the legacy IO path, and see if the performance is still good.

Also, could you share the following info about your machine, so that
I can build my VM guest with the same settings to reproduce your
situation (requests inserted from two types of threads)?

- lscpu
- free -h
- lsblk -d $USB_DISK
- exact commands for mounting the disk and running the copy operation

Thanks,
Ming



* Re: Slow I/O on USB media after commit f664a3cc17b7d0a2bc3b3ab96181e1029b0ec0e6
  2019-11-27 13:08                                                                                 ` Ming Lei
@ 2019-11-27 15:01                                                                                   ` Andrea Vai
  0 siblings, 0 replies; 102+ messages in thread
From: Andrea Vai @ 2019-11-27 15:01 UTC (permalink / raw)
  To: Ming Lei
  Cc: Damien Le Moal, Alan Stern, Jens Axboe, Johannes Thumshirn,
	USB list, SCSI development list, Himanshu Madhani,
	Hannes Reinecke, Omar Sandoval, Martin K. Petersen, Greg KH,
	Hans Holmberg, Kernel development list

[-- Attachment #1: Type: text/plain, Size: 3496 bytes --]

On Wed, 27/11/2019 at 21.08 +0800, Ming Lei wrote:
> On Wed, Nov 27, 2019 at 10:39:40AM +0100, Andrea Vai wrote:
> > Il giorno mer, 27/11/2019 alle 10.05 +0800, Ming Lei ha scritto:
> > > 
> > > 
> > > It can be workaround via the following change:
> > > 
> > > /lib/modules/5.4.0+/build/include/generated/autoconf.h:
> > > 
> > > //#define CONFIG_CC_HAS_ASM_INLINE 1
> > 
> > Thanks, it worked, trace attached. Produced by: start the trace
> script
> > (with the pendrive already plugged), wait some seconds, run the
> test
> > (1 trial, 1 GB), wait for the test to finish, stop the trace.
> > 
> > The copy took 2659 seconds, roughly as already seen before.
> 
> Thanks for collecting the log.
> 
> From the log, some of write IOs are out-of-order, such as, the 1st
> one
> is 378880.
> 
> 16.41240 2   266     266     kworker/2:1H    block_rq_issue   b'W'
> 370656 240
> 16.41961 3   485     485     kworker/3:1H    block_rq_issue   b'W'
> 378880 240
> 16.73729 2   266     266     kworker/2:1H    block_rq_issue   b'W'
> 370896 240
> 17.71161 2   266     266     kworker/2:1H    block_rq_issue   b'W'
> 379120 240
> 18.02344 2   266     266     kworker/2:1H    block_rq_issue   b'W'
> 371136 240
> 18.94314 3   485     485     kworker/3:1H    block_rq_issue   b'W'
> 379360 240
> 19.25624 2   266     266     kworker/2:1H    block_rq_issue   b'W'
> 371376 240
> 
> IO latency is increased a lot since the 1st out-of-order request(usb
> storage HBA is single queue depth, one request can be issued only
> if 
> the previous issued request is completed).
> 
> The reason is that there are two kind of tasks which inserts rq to
> device.
> One is the 'cp' process, the other is kworker/u8:*.  The out-of-
> order
> happens during the two task's interleaving.
> 
> Under such situation, I believe that the old legacy IO path may not
> guarantee the order too. In blk_queue_bio(), after get_request()
> allocates one request, the queue lock is released.  And request is
> actually inserted & issued from blk_flush_plug_list() under the
> branch of 'if (plug)'. If requests are from two tasks, then request
> is inserted/issued from two plug list, and no order can be
> guaranteed.
> 
> In my test, except for several requests from the beginning, all
> other
> requests are inserted via the kworker thread(guess it is writeback
> wq),
> that is why I can't observe the issue in my test.
> 
> As Schmid suggested, you may run the same test on old kernel with
> legacy io path, and see if the performance is still good.
> 
> Also, could you share the following info about your machine? So that
> I can build my VM guest in this setting for reproducing your
> situation
> (requests are inserted from two types of threads).
> 
> - lscpu
attached,

> - free -h
              total        used        free      shared  buff/cache   available
Mem:           23Gi       4,2Gi        11Gi       448Mi       7,0Gi        18Gi
Swap:         3,7Gi          0B       3,7Gi

> - lsblk -d $USB_DISK

NAME MAJ:MIN RM  SIZE RO TYPE MOUNTPOINT
sdg    8:96   1 28,8G  0 disk 


> - exact commands for mount the disk, and running the copy operation

I attached the whole script earlier in this thread; I am attaching it
again to this message and copying the relevant lines here:

  mount UUID=$uuid /mnt/pendrive 2>&1 |tee -a $logfile
  SECONDS=0
  cp $testfile /mnt/pendrive 2>&1 |tee -a $logfile
  umount /mnt/pendrive 2>&1 |tee -a $logfile

Meanwhile, I am continuing with the further tests as suggested.

Thanks,
Andrea

[-- Attachment #2: lscpu.txt --]
[-- Type: text/plain, Size: 1371 bytes --]

Architecture:        x86_64
CPU op-mode(s):      32-bit, 64-bit
Byte Order:          Little Endian
Address sizes:       39 bits physical, 48 bits virtual
CPU(s):              4
On-line CPU(s) list: 0-3
Thread(s) per core:  1
Core(s) per socket:  4
Socket(s):           1
NUMA node(s):        1
Vendor ID:           GenuineIntel
CPU family:          6
Model:               60
Model name:          Intel(R) Core(TM) i5-4430 CPU @ 3.00GHz
Stepping:            3
CPU MHz:             1674.727
CPU max MHz:         3200,0000
CPU min MHz:         800,0000
BogoMIPS:            5986.16
Virtualization:      VT-x
L1d cache:           32K
L1i cache:           32K
L2 cache:            256K
L3 cache:            6144K
NUMA node0 CPU(s):   0-3
Flags:               fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc cpuid aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx est tm2 ssse3 sdbg fma cx16 xtpr pdcm pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm cpuid_fault epb invpcid_single pti ssbd ibrs ibpb stibp tpr_shadow vnmi flexpriority ept vpid ept_ad fsgsbase tsc_adjust bmi1 avx2 smep bmi2 erms invpcid xsaveopt dtherm ida arat pln pts md_clear flush_l1d

[-- Attachment #3: test --]
[-- Type: application/x-shellscript, Size: 1137 bytes --]


* Re: AW: Slow I/O on USB media after commit f664a3cc17b7d0a2bc3b3ab96181e1029b0ec0e6
  2019-11-27  8:14                                                                           ` AW: " Schmid, Carsten
@ 2019-11-27 21:49                                                                             ` Finn Thain
  2019-11-28  7:46                                                                             ` Andrea Vai
  1 sibling, 0 replies; 102+ messages in thread
From: Finn Thain @ 2019-11-27 21:49 UTC (permalink / raw)
  To: Schmid, Carsten
  Cc: Andrea Vai, Ming Lei, Damien Le Moal, Alan Stern, Jens Axboe,
	Johannes Thumshirn, USB list, SCSI development list,
	Himanshu Madhani, Hannes Reinecke, Omar Sandoval,
	Martin K. Petersen, Greg KH, Hans Holmberg,
	Kernel development list

On Wed, 27 Nov 2019, Schmid, Carsten wrote:

> > 
> > The sheer volume of testing (probably some terabytes by now) would 
> > exercise the wear leveling algorithm in the FTL.
> > 
> But with "old kernel" the copy operation still is "fast", as far as i 
> understood. If FTL (e.g. wear leveling) would slow down, we would see 
> that also in the old kernel, right?
> 
> Andrea, can you confirm that the same device used with the old fast 
> kernel is still fast today?

You seem to be saying we should optimize the kernel for a pathological 
use-case merely because it used to be fast before the blk-mq conversion. 
That makes no sense to me. I suppose you have information that I don't.

I assume that your employer (and the other corporations involved in this) 
have plenty of regression test results from a variety of flash hardware to 
show that the regression is real and the device is not pathological.

I'm not privy to any of that information so I will shut up and leave you 
guys to it.

-- 

> > This in itself seems unlikely to improve performance significantly. 
> > But if the flash memory came from a bad batch, perhaps it would have 
> > that effect.
> > 
> > To find out, someone may need to source another (genuine) Kingston 
> > DataTraveller device.
> > 


* Re: AW: Slow I/O on USB media after commit f664a3cc17b7d0a2bc3b3ab96181e1029b0ec0e6
  2019-11-27  8:14                                                                           ` AW: " Schmid, Carsten
  2019-11-27 21:49                                                                             ` Finn Thain
@ 2019-11-28  7:46                                                                             ` Andrea Vai
  2019-11-28  8:12                                                                               ` AW: " Schmid, Carsten
  2019-11-28  9:17                                                                               ` Ming Lei
  1 sibling, 2 replies; 102+ messages in thread
From: Andrea Vai @ 2019-11-28  7:46 UTC (permalink / raw)
  To: Schmid, Carsten, Finn Thain
  Cc: Ming Lei, Damien Le Moal, Alan Stern, Jens Axboe,
	Johannes Thumshirn, USB list, SCSI development list,
	Himanshu Madhani, Hannes Reinecke, Omar Sandoval,
	Martin K. Petersen, Greg KH, Hans Holmberg,
	Kernel development list

On Wed, 27/11/2019 at 08.14 +0000, Schmid, Carsten wrote:
> > 
> > > Then I started another set of 100 trials and let them run
> tonight, and
> > > the first 10 trials were around 1000s, then gradually decreased
> to
> > > ~300s, and finally settled around 200s with some trials below
> 70-80s.
> > > This to say, times are extremely variable and for the first time
> I
> > > noticed a sort of "performance increase" with time.
> > >
> > 
> > The sheer volume of testing (probably some terabytes by now) would
> > exercise the wear leveling algorithm in the FTL.
> > 
> But with "old kernel" the copy operation still is "fast", as far as
> i understood.
> If FTL (e.g. wear leveling) would slow down, we would see that also
> in
> the old kernel, right?
> 
> Andrea, can you confirm that the same device used with the old fast
> kernel is still fast today?

Yes, it is still fast. I just ran a 100-trial test and got an average
of 70 seconds with a standard deviation of 6 seconds, in line with the
past values for the same kernel.

Thanks,
Andrea



* AW: AW: Slow I/O on USB media after commit f664a3cc17b7d0a2bc3b3ab96181e1029b0ec0e6
  2019-11-28  7:46                                                                             ` Andrea Vai
@ 2019-11-28  8:12                                                                               ` Schmid, Carsten
  2019-11-28 11:40                                                                                 ` Andrea Vai
  2019-11-28 17:39                                                                                 ` Alan Stern
  2019-11-28  9:17                                                                               ` Ming Lei
  1 sibling, 2 replies; 102+ messages in thread
From: Schmid, Carsten @ 2019-11-28  8:12 UTC (permalink / raw)
  To: Andrea Vai, Finn Thain
  Cc: Ming Lei, Damien Le Moal, Alan Stern, Jens Axboe,
	Johannes Thumshirn, USB list, SCSI development list,
	Himanshu Madhani, Hannes Reinecke, Omar Sandoval,
	Martin K. Petersen, Greg KH, Hans Holmberg,
	Kernel development list

> > > The sheer volume of testing (probably some terabytes by now) would
> > > exercise the wear leveling algorithm in the FTL.
> > >
> > But with "old kernel" the copy operation still is "fast", as far as
> > i understood.
> > If FTL (e.g. wear leveling) would slow down, we would see that also
> > in
> > the old kernel, right?
> >
> > Andrea, can you confirm that the same device used with the old fast
> > kernel is still fast today?
> 
> Yes, it is still fast. Just ran a 100 trials test and got an average
> of 70 seconds with standard deviation = 6 seconds, aligned with the
> past values of the same kernel.
> 
> Thanks,
> Andrea
I have been involved in several benchmarks of flash devices in the past,
and what we see here is definitely not a device issue related to wear
leveling.

I wanted to prevent all of you from going in the wrong direction; that
is why I wanted Andrea to confirm that it's not a matter of the flash
device.

There are so many factors involved in benchmarking flash devices.
But I have never before seen a slowdown by a factor of 10-30 like the
one Andrea observes.

I assume the only thing that you change between the benchmarks
is the kernel (and the modules, of course), right, Andrea?
Then we can rule out cache settings, which can massively impact
benchmarks.

The only thing that makes sense from my POV is:
- collect traces with the kernel before the mentioned commit (fast)
- apply the patch in question
- collect traces again (slow)
- compare the traces

Then we should be able to see the difference(s).
Unfortunately I'm not an expert on the SCSI and USB kernel code
involved here. Otherwise I would try to understand what happens and
give you some hints.

BR
Carsten


* Re: AW: Slow I/O on USB media after commit f664a3cc17b7d0a2bc3b3ab96181e1029b0ec0e6
  2019-11-28  7:46                                                                             ` Andrea Vai
  2019-11-28  8:12                                                                               ` AW: " Schmid, Carsten
@ 2019-11-28  9:17                                                                               ` Ming Lei
  2019-11-28 17:34                                                                                 ` Andrea Vai
  1 sibling, 1 reply; 102+ messages in thread
From: Ming Lei @ 2019-11-28  9:17 UTC (permalink / raw)
  To: Andrea Vai
  Cc: Schmid, Carsten, Finn Thain, Damien Le Moal, Alan Stern,
	Jens Axboe, Johannes Thumshirn, USB list, SCSI development list,
	Himanshu Madhani, Hannes Reinecke, Omar Sandoval,
	Martin K. Petersen, Greg KH, Hans Holmberg,
	Kernel development list

On Thu, Nov 28, 2019 at 08:46:57AM +0100, Andrea Vai wrote:
> Il giorno mer, 27/11/2019 alle 08.14 +0000, Schmid, Carsten ha
> scritto:
> > > 
> > > > Then I started another set of 100 trials and let them run
> > tonight, and
> > > > the first 10 trials were around 1000s, then gradually decreased
> > to
> > > > ~300s, and finally settled around 200s with some trials below
> > 70-80s.
> > > > This to say, times are extremely variable and for the first time
> > I
> > > > noticed a sort of "performance increase" with time.
> > > >
> > > 
> > > The sheer volume of testing (probably some terabytes by now) would
> > > exercise the wear leveling algorithm in the FTL.
> > > 
> > But with "old kernel" the copy operation still is "fast", as far as
> > i understood.
> > If FTL (e.g. wear leveling) would slow down, we would see that also
> > in
> > the old kernel, right?
> > 
> > Andrea, can you confirm that the same device used with the old fast
> > kernel is still fast today?
> 
> Yes, it is still fast. Just ran a 100 trials test and got an average
> of 70 seconds with standard deviation = 6 seconds, aligned with the
> past values of the same kernel.

Then can you collect a trace on the old kernel using the previous script?

#!/bin/sh

MAJ=$1
MIN=$2
MAJ=$(( $MAJ << 20 ))
DEV=$(( $MAJ | $MIN ))

/usr/share/bcc/tools/trace -t -C \
    't:block:block_rq_issue (args->dev == '$DEV') "%s %d %d", args->rwbs, args->sector, args->nr_sector' \
    't:block:block_rq_insert (args->dev == '$DEV') "%s %d %d", args->rwbs, args->sector, args->nr_sector'

Both trace points, as well as bcc, should be available on the old kernel.
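
For reference, the DEV value the script compares against is the
kernel-internal device number, built as (major << 20) | minor. The
sketch below reproduces that arithmetic for the sdg device (8:96)
reported by lsblk earlier in the thread; the helper names mkdev and
split_dev are only illustrative.

```python
MINORBITS = 20                       # same shift the script applies to MAJ
MINORMASK = (1 << MINORBITS) - 1


def mkdev(major, minor):
    """Combine major and minor the way the trace script computes DEV."""
    return (major << MINORBITS) | minor


def split_dev(dev):
    """Recover (major, minor) from a combined device number."""
    return dev >> MINORBITS, dev & MINORMASK


if __name__ == "__main__":
    dev = mkdev(8, 96)               # sdg is 8:96 in the lsblk output
    print(dev)                       # 8388704
    print(split_dev(dev))            # (8, 96)
```

So for that drive the script would be invoked with the arguments 8 and
96, and the tracepoint filter becomes args->dev == 8388704.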

Thanks,
Ming



* Re: AW: AW: Slow I/O on USB media after commit f664a3cc17b7d0a2bc3b3ab96181e1029b0ec0e6
  2019-11-28  8:12                                                                               ` AW: " Schmid, Carsten
@ 2019-11-28 11:40                                                                                 ` Andrea Vai
  2019-11-28 17:39                                                                                 ` Alan Stern
  1 sibling, 0 replies; 102+ messages in thread
From: Andrea Vai @ 2019-11-28 11:40 UTC (permalink / raw)
  To: Schmid, Carsten, Finn Thain
  Cc: Ming Lei, Damien Le Moal, Alan Stern, Jens Axboe,
	Johannes Thumshirn, USB list, SCSI development list,
	Himanshu Madhani, Hannes Reinecke, Omar Sandoval,
	Martin K. Petersen, Greg KH, Hans Holmberg,
	Kernel development list

On Thu, 28/11/2019 at 08.12 +0000, Schmid, Carsten wrote:
> 
> [...]
> 
> I assume the only thing that you change between the benchmarks
> is the kernel (and the modules, of course), right, Andrea?
> 

It's my production machine, so apart from the changes involved in a
"normal use of a PC" I can say that there are no changes I am aware of
(apart from the kernel, and other changes you told me to do, such as
changing the IO scheduler, etc)... but please remember I am not an
expert, so feel free to ask me what other kind of changes I can tell
you about.

Thanks,
Andrea



* Re: Slow I/O on USB media after commit f664a3cc17b7d0a2bc3b3ab96181e1029b0ec0e6
  2019-11-27  0:21                                                                         ` Finn Thain
  2019-11-27  8:14                                                                           ` AW: " Schmid, Carsten
@ 2019-11-28 17:10                                                                           ` Andrea Vai
  1 sibling, 0 replies; 102+ messages in thread
From: Andrea Vai @ 2019-11-28 17:10 UTC (permalink / raw)
  To: Finn Thain
  Cc: Ming Lei, Damien Le Moal, Alan Stern, Jens Axboe,
	Johannes Thumshirn, USB list, SCSI development list,
	Himanshu Madhani, Hannes Reinecke, Omar Sandoval,
	Martin K. Petersen, Greg KH, Hans Holmberg,
	Kernel development list

On Wed, 27/11/2019 at 11.21 +1100, Finn Thain wrote:
> On Tue, 26 Nov 2019, Andrea Vai wrote:
> 
> > Then I started another set of 100 trials and let them run tonight,
> and 
> > the first 10 trials were around 1000s, then gradually decreased
> to 
> > ~300s, and finally settled around 200s with some trials below 70-
> 80s. 
> > This to say, times are extremely variable and for the first time
> I 
> > noticed a sort of "performance increase" with time.
> > 
> 
> The sheer volume of testing (probably some terabytes by now) would 
> exercise the wear leveling algorithm in the FTL.
> 
> This in itself seems unlikely to improve performance significantly.
> But if 
> the flash memory came from a bad batch, perhaps it would have that
> effect.
> 
> To find out, someone may need to source another (genuine) Kingston 
> DataTraveller device.

I own another device (let's refer to it as "black odd"), identical to
the "slow" one (call it "black even"), and used it as well for the
tests, especially at the beginning of this story, because I suspected
the problem could be related to a faulty pen drive. At some point
I realized that the tests I performed didn't show any difference
between the two flash drives, so since then I have kept using just the
"black even". They were bought together, so of course both of them
probably belong to the same "maybe-bad batch".

But I have another Kingston DataTraveler ("White"), externally
slightly different from the other two (it's white instead of black,
and labeled G4 instead of G3), though lsusb shows the same IDs:
0951:1666. It was purchased some months after the other two
(well, actually, it may be the result of an RMA exchange).

I have just run one test on this White one, with the new (patched)
kernel, and it took an average of 200 seconds (st.dev=46s), which is
not "good", but less "bad" than the real "bad" case of the "black"
ones (>1000 seconds).

I have also tried the "White" one with the old fast kernel, and the
behavior is almost the same as with the new kernel, though a little
bit better (mean=173; st.dev.=11).

Feel free to let me know if I should run other tests,

thanks,
Andrea



* Re: AW: Slow I/O on USB media after commit f664a3cc17b7d0a2bc3b3ab96181e1029b0ec0e6
  2019-11-28  9:17                                                                               ` Ming Lei
@ 2019-11-28 17:34                                                                                 ` Andrea Vai
  2019-11-29  0:57                                                                                   ` Ming Lei
  0 siblings, 1 reply; 102+ messages in thread
From: Andrea Vai @ 2019-11-28 17:34 UTC (permalink / raw)
  To: Ming Lei
  Cc: Schmid, Carsten, Finn Thain, Damien Le Moal, Alan Stern,
	Jens Axboe, Johannes Thumshirn, USB list, SCSI development list,
	Himanshu Madhani, Hannes Reinecke, Omar Sandoval,
	Martin K. Petersen, Greg KH, Hans Holmberg,
	Kernel development list

[-- Attachment #1: Type: text/plain, Size: 2094 bytes --]

On Thu, 28/11/2019 at 17.17 +0800, Ming Lei wrote:
> On Thu, Nov 28, 2019 at 08:46:57AM +0100, Andrea Vai wrote:
> > Il giorno mer, 27/11/2019 alle 08.14 +0000, Schmid, Carsten ha
> > scritto:
> > > > 
> > > > > Then I started another set of 100 trials and let them run
> > > tonight, and
> > > > > the first 10 trials were around 1000s, then gradually
> decreased
> > > to
> > > > > ~300s, and finally settled around 200s with some trials
> below
> > > 70-80s.
> > > > > This to say, times are extremely variable and for the first
> time
> > > I
> > > > > noticed a sort of "performance increase" with time.
> > > > >
> > > > 
> > > > The sheer volume of testing (probably some terabytes by now)
> would
> > > > exercise the wear leveling algorithm in the FTL.
> > > > 
> > > But with "old kernel" the copy operation still is "fast", as far
> as
> > > i understood.
> > > If FTL (e.g. wear leveling) would slow down, we would see that
> also
> > > in
> > > the old kernel, right?
> > > 
> > > Andrea, can you confirm that the same device used with the old
> fast
> > > kernel is still fast today?
> > 
> > Yes, it is still fast. Just ran a 100 trials test and got an
> average
> > of 70 seconds with standard deviation = 6 seconds, aligned with
> the
> > past values of the same kernel.
> 
> Then can you collect trace on the old kernel via the previous
> script?
> 
> #!/bin/sh
> 
> MAJ=$1
> MIN=$2
> MAJ=$(( $MAJ << 20 ))
> DEV=$(( $MAJ | $MIN ))
> 
> /usr/share/bcc/tools/trace -t -C \
>     't:block:block_rq_issue (args->dev == '$DEV') "%s %d %d", args-
> >rwbs, args->sector, args->nr_sector' \
>     't:block:block_rq_insert (args->dev == '$DEV') "%s %d %d", args-
> >rwbs, args->sector, args->nr_sector'
> 
> Both the two trace points and bcc should be available on the old
> kernel.
> 

The trace is attached. I produced it as follows: started the trace
script (with the pen drive already plugged in), waited a few seconds,
ran the test (1 trial, 1 GB), waited for the test to finish, and
stopped the trace.

The copy took 73 seconds, roughly as already seen before with the fast
old kernel.
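
To put the two timings side by side, the arithmetic can be sketched as
follows. It uses the 2659 s measured on the new kernel and the 73 s
measured here, and it assumes that "1 GB" means 1024 MiB, since the
exact file size is not stated in the thread.

```python
SIZE_MIB = 1024        # assumption: the "1 GB" test file is 1024 MiB
SLOW_SECONDS = 2659    # copy time reported on the new (slow) kernel
FAST_SECONDS = 73      # copy time reported on the old (fast) kernel


def throughput_mib_s(size_mib, seconds):
    """Average throughput of a copy in MiB per second."""
    return size_mib / seconds


def slowdown(slow_seconds, fast_seconds):
    """How many times longer the slow run took."""
    return slow_seconds / fast_seconds


if __name__ == "__main__":
    print(f"old kernel: {throughput_mib_s(SIZE_MIB, FAST_SECONDS):.2f} MiB/s")
    print(f"new kernel: {throughput_mib_s(SIZE_MIB, SLOW_SECONDS):.2f} MiB/s")
    print(f"slowdown:   {slowdown(SLOW_SECONDS, FAST_SECONDS):.1f}x")
```

Under that assumption the old kernel averages about 14 MiB/s and the
new one about 0.4 MiB/s, a slowdown of roughly 36 times for this pair
of runs.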

Thanks,
Andrea

[-- Attachment #2: log_ming_20191128_182751.zip --]
[-- Type: application/zip, Size: 118068 bytes --]


* Re: AW: AW: Slow I/O on USB media after commit f664a3cc17b7d0a2bc3b3ab96181e1029b0ec0e6
  2019-11-28  8:12                                                                               ` AW: " Schmid, Carsten
  2019-11-28 11:40                                                                                 ` Andrea Vai
@ 2019-11-28 17:39                                                                                 ` Alan Stern
  1 sibling, 0 replies; 102+ messages in thread
From: Alan Stern @ 2019-11-28 17:39 UTC (permalink / raw)
  To: Schmid, Carsten
  Cc: Andrea Vai, Finn Thain, Ming Lei, Damien Le Moal, Jens Axboe,
	Johannes Thumshirn, USB list, SCSI development list,
	Himanshu Madhani, Hannes Reinecke, Omar Sandoval,
	Martin K. Petersen, Greg KH, Hans Holmberg,
	Kernel development list

On Thu, 28 Nov 2019, Schmid, Carsten wrote:

> I have been involved in several benchmarkings of flash devices in the past.
> So what we see here is definitely not a device issue regarding wear leveling.
> 
> I wanted to prevent all of you going into the wrong direction, that's why
> i wanted Andrea to confirm that it's not a matter of the flash device.
> 
> There are so much items involved into benchmarking flash devices.
> But Andrea's observations with factors of 10-30 times slow down
> i have never seen before.
> 
> I assume the only thing that you change between the benchmarks
> is the kernel (and the modules, of course), right, Andrea?
> Then we can rule out cache settings which massively can impact
> benchmarks.
> 
> The only thing that makes sense from my POV is:
> - collect traces with the kernel before mentioned commit (fast)
> - apply patch in doubt
> - again collect traces (slow)
> - compare the traces
> 
> Then we should be able to see the difference(s).

We have already done this.  I forget whether the traces are in the
email history available in the archives or whether they are stored 
somewhere else.

In any case, my analysis of the traces is in the archives.  It seemed 
very clear that the only difference which mattered was the ordering of 
the write commands (sequential vs. non-sequential).  This was obviously 
something which the commit in question would affect, and it also seemed 
likely to cause the device to slow down considerably.

Alan Stern

> Unfortunately i'm not an expert on the SCSI and USB kernel stuff
> involved here. Else i would try to understand what happens and
> give you some hints.
> 
> BR
> Carsten



* Re: AW: Slow I/O on USB media after commit f664a3cc17b7d0a2bc3b3ab96181e1029b0ec0e6
  2019-11-28 17:34                                                                                 ` Andrea Vai
@ 2019-11-29  0:57                                                                                   ` Ming Lei
  2019-11-29  2:35                                                                                     ` Ming Lei
  2019-11-29 11:44                                                                                     ` AW: Slow I/O on USB media after commit f664a3cc17b7d0a2bc3b3ab96181e1029b0ec0e6 Bernd Schubert
  0 siblings, 2 replies; 102+ messages in thread
From: Ming Lei @ 2019-11-29  0:57 UTC (permalink / raw)
  To: Andrea Vai
  Cc: Schmid, Carsten, Finn Thain, Damien Le Moal, Alan Stern,
	Jens Axboe, Johannes Thumshirn, USB list, SCSI development list,
	Himanshu Madhani, Hannes Reinecke, Omar Sandoval,
	Martin K. Petersen, Greg KH, Hans Holmberg,
	Kernel development list

On Thu, Nov 28, 2019 at 06:34:32PM +0100, Andrea Vai wrote:
> Il giorno gio, 28/11/2019 alle 17.17 +0800, Ming Lei ha scritto:
> > On Thu, Nov 28, 2019 at 08:46:57AM +0100, Andrea Vai wrote:
> > > Il giorno mer, 27/11/2019 alle 08.14 +0000, Schmid, Carsten ha
> > > scritto:
> > > > > 
> > > > > > Then I started another set of 100 trials and let them run
> > > > tonight, and
> > > > > > the first 10 trials were around 1000s, then gradually
> > decreased
> > > > to
> > > > > > ~300s, and finally settled around 200s with some trials
> > below
> > > > 70-80s.
> > > > > > This to say, times are extremely variable and for the first
> > time
> > > > I
> > > > > > noticed a sort of "performance increase" with time.
> > > > > >
> > > > > 
> > > > > The sheer volume of testing (probably some terabytes by now)
> > would
> > > > > exercise the wear leveling algorithm in the FTL.
> > > > > 
> > > > But with "old kernel" the copy operation still is "fast", as far
> > as
> > > > i understood.
> > > > If FTL (e.g. wear leveling) would slow down, we would see that
> > also
> > > > in
> > > > the old kernel, right?
> > > > 
> > > > Andrea, can you confirm that the same device used with the old
> > fast
> > > > kernel is still fast today?
> > > 
> > > Yes, it is still fast. Just ran a 100 trials test and got an
> > average
> > > of 70 seconds with standard deviation = 6 seconds, aligned with
> > the
> > > past values of the same kernel.
> > 
> > Then can you collect trace on the old kernel via the previous
> > script?
> > 
> > #!/bin/sh
> > 
> > MAJ=$1
> > MIN=$2
> > MAJ=$(( $MAJ << 20 ))
> > DEV=$(( $MAJ | $MIN ))
> > 
> > /usr/share/bcc/tools/trace -t -C \
> >     't:block:block_rq_issue (args->dev == '$DEV') "%s %d %d", args-
> > >rwbs, args->sector, args->nr_sector' \
> >     't:block:block_rq_insert (args->dev == '$DEV') "%s %d %d", args-
> > >rwbs, args->sector, args->nr_sector'
> > 
> > Both the two trace points and bcc should be available on the old
> > kernel.
> > 
> 
> Trace attached. Produced by: start the trace script
> (with the pendrive already plugged), wait some seconds, run the test
> (1 trial, 1 GB), wait for the test to finish, stop the trace.
> 
> The copy took 73 seconds, roughly as already seen before with the fast
> old kernel.

This trace shows a good write IO order because the writeback IOs are
queued to the block layer serially from the 'cp' task and the
writeback wq.

However, the writeback IO order has changed in the current Linus tree
because the IOs are queued to the block layer concurrently from the
'cp' task and the writeback wq. It might be related to blk-mq killing
queue_congestion.

I guess the performance effect is not limited to this specific USB
drive but could also affect all HDDs.

However, I still can't reproduce it in my VM even though I built it
with settings similar to those of Andrea's test machine. Maybe the
emulated disk is much faster than Andrea's.

Andrea, can you collect the following log when running the test
on the current new (bad) kernel?

	/usr/share/bcc/tools/stackcount  -K blk_mq_make_request

Thanks,
Ming


^ permalink raw reply	[flat|nested] 102+ messages in thread

* Re: AW: Slow I/O on USB media after commit f664a3cc17b7d0a2bc3b3ab96181e1029b0ec0e6
  2019-11-29  0:57                                                                                   ` Ming Lei
@ 2019-11-29  2:35                                                                                     ` Ming Lei
  2019-11-29 14:41                                                                                       ` Andrea Vai
  2019-11-29 11:44                                                                                     ` AW: Slow I/O on USB media after commit f664a3cc17b7d0a2bc3b3ab96181e1029b0ec0e6 Bernd Schubert
  1 sibling, 1 reply; 102+ messages in thread
From: Ming Lei @ 2019-11-29  2:35 UTC (permalink / raw)
  To: Andrea Vai
  Cc: Schmid, Carsten, Finn Thain, Damien Le Moal, Alan Stern,
	Jens Axboe, Johannes Thumshirn, USB list, SCSI development list,
	Himanshu Madhani, Hannes Reinecke, Omar Sandoval,
	Martin K. Petersen, Greg KH, Hans Holmberg,
	Kernel development list

On Fri, Nov 29, 2019 at 08:57:34AM +0800, Ming Lei wrote:
> On Thu, Nov 28, 2019 at 06:34:32PM +0100, Andrea Vai wrote:
> > Il giorno gio, 28/11/2019 alle 17.17 +0800, Ming Lei ha scritto:
> > > On Thu, Nov 28, 2019 at 08:46:57AM +0100, Andrea Vai wrote:
> > > > Il giorno mer, 27/11/2019 alle 08.14 +0000, Schmid, Carsten ha
> > > > scritto:
> > > > > > 
> > > > > > > Then I started another set of 100 trials and let them run
> > > > > tonight, and
> > > > > > > the first 10 trials were around 1000s, then gradually
> > > decreased
> > > > > to
> > > > > > > ~300s, and finally settled around 200s with some trials
> > > below
> > > > > 70-80s.
> > > > > > > This to say, times are extremely variable and for the first
> > > time
> > > > > I
> > > > > > > noticed a sort of "performance increase" with time.
> > > > > > >
> > > > > > 
> > > > > > The sheer volume of testing (probably some terabytes by now)
> > > would
> > > > > > exercise the wear leveling algorithm in the FTL.
> > > > > > 
> > > > > But with "old kernel" the copy operation still is "fast", as far
> > > as
> > > > > i understood.
> > > > > If FTL (e.g. wear leveling) would slow down, we would see that
> > > also
> > > > > in
> > > > > the old kernel, right?
> > > > > 
> > > > > Andrea, can you confirm that the same device used with the old
> > > fast
> > > > > kernel is still fast today?
> > > > 
> > > > Yes, it is still fast. Just ran a 100 trials test and got an
> > > average
> > > > of 70 seconds with standard deviation = 6 seconds, aligned with
> > > the
> > > > past values of the same kernel.
> > > 
> > > Then can you collect trace on the old kernel via the previous
> > > script?
> > > 
> > > #!/bin/sh
> > > 
> > > MAJ=$1
> > > MIN=$2
> > > MAJ=$(( $MAJ << 20 ))
> > > DEV=$(( $MAJ | $MIN ))
> > > 
> > > /usr/share/bcc/tools/trace -t -C \
> > >     't:block:block_rq_issue (args->dev == '$DEV') "%s %d %d", args-
> > > >rwbs, args->sector, args->nr_sector' \
> > >     't:block:block_rq_insert (args->dev == '$DEV') "%s %d %d", args-
> > > >rwbs, args->sector, args->nr_sector'
> > > 
> > > Both the two trace points and bcc should be available on the old
> > > kernel.
> > > 
> > 
> > Trace attached. Produced by: start the trace script
> > (with the pendrive already plugged), wait some seconds, run the test
> > (1 trial, 1 GB), wait for the test to finish, stop the trace.
> > 
> > The copy took 73 seconds, roughly as already seen before with the fast
> > old kernel.
> 
> This trace shows a good write IO order because the writeback IOs are
> queued to block layer serially from the 'cp' task and writeback wq.
> 
> However, writeback IO order is changed in current linus tree because
> the IOs are queued to block layer concurrently from the 'cp' task
> and writeback wq. It might be related with killing queue_congestion
> by blk-mq.
> 
> The performance effect could be not only on this specific USB drive,
> but also on all HDD., I guess.
> 
> However, I still can't reproduce it in my VM even though I built it
> with similar setting of Andrea's test machine. Maybe the emulated disk
> is too fast than Andrea's.
> 
> Andrea, can you collect the following log when running the test
> on current new(bad) kernel?
> 
> 	/usr/share/bcc/tools/stackcount  -K blk_mq_make_request

Instead, please run the following trace, given insert may be
called from other paths, such as flush plug:

	/usr/share/bcc/tools/stackcount -K t:block:block_rq_insert

If you are using python3, the following failure may be triggered:

	"cannot use a bytes pattern on a string-like object"

If so, apply the following fix to /usr/lib/python3.7/site-packages/bcc/__init__.py:

diff --git a/src/python/bcc/__init__.py b/src/python/bcc/__init__.py
index 6f114de8..bff5f282 100644
--- a/src/python/bcc/__init__.py
+++ b/src/python/bcc/__init__.py
@@ -769,7 +769,7 @@ class BPF(object):
                 evt_dir = os.path.join(cat_dir, event)
                 if os.path.isdir(evt_dir):
                     tp = ("%s:%s" % (category, event))
-                    if re.match(tp_re, tp):
+                    if re.match(tp_re.decode(), tp):
                         results.append(tp)
         return results
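
The failure above is the generic Python 3 bytes/str mismatch in the re module, and it can be seen in isolation. This is only a minimal sketch: the pattern below is an illustrative value, not the one bcc builds internally.

```python
import re

# The pattern is bytes while the subject is str, as in the failing bcc call.
tp_re = b"block:block_rq_.*"      # illustrative pattern, not taken from bcc
tp = "block:block_rq_insert"

try:
    re.match(tp_re, tp)
    mismatch_error = None
except TypeError as exc:
    mismatch_error = str(exc)

# Decoding the pattern makes both operands str, which is what the fix does.
fixed = re.match(tp_re.decode(), tp)

print(mismatch_error)
print(fixed is not None)
```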

Thanks,
Ming


^ permalink raw reply related	[flat|nested] 102+ messages in thread

* Re: AW: Slow I/O on USB media after commit f664a3cc17b7d0a2bc3b3ab96181e1029b0ec0e6
  2019-11-29  0:57                                                                                   ` Ming Lei
  2019-11-29  2:35                                                                                     ` Ming Lei
@ 2019-11-29 11:44                                                                                     ` Bernd Schubert
  2019-12-02  7:01                                                                                       ` Andrea Vai
  1 sibling, 1 reply; 102+ messages in thread
From: Bernd Schubert @ 2019-11-29 11:44 UTC (permalink / raw)
  To: Ming Lei, Andrea Vai
  Cc: Schmid, Carsten, Finn Thain, Damien Le Moal, Alan Stern,
	Jens Axboe, Johannes Thumshirn, USB list, SCSI development list,
	Himanshu Madhani, Hannes Reinecke, Omar Sandoval,
	Martin K. Petersen, Greg KH, Hans Holmberg,
	Kernel development list

>> Trace attached. Produced by: start the trace script
>> (with the pendrive already plugged), wait some seconds, run the test
>> (1 trial, 1 GB), wait for the test to finish, stop the trace.
>>
>> The copy took 73 seconds, roughly as already seen before with the fast
>> old kernel.
> 
> This trace shows a good write IO order because the writeback IOs are
> queued to block layer serially from the 'cp' task and writeback wq.
> 
> However, writeback IO order is changed in current linus tree because
> the IOs are queued to block layer concurrently from the 'cp' task
> and writeback wq. It might be related with killing queue_congestion
> by blk-mq.

What about using direct-io to ensure order is guaranteed? Pity that 'cp'
doesn't seem to have an option for it. But dd should do the trick.
Andrea, can you replace cp with a dd command (on the slow kernel)?

dd if=<path-to-src-file> of=<path-to-copy-on-flash-device> bs=1M oflag=direct

 - Bernd

^ permalink raw reply	[flat|nested] 102+ messages in thread

* Re: AW: Slow I/O on USB media after commit f664a3cc17b7d0a2bc3b3ab96181e1029b0ec0e6
  2019-11-29  2:35                                                                                     ` Ming Lei
@ 2019-11-29 14:41                                                                                       ` Andrea Vai
  2019-12-03  2:23                                                                                         ` Ming Lei
  0 siblings, 1 reply; 102+ messages in thread
From: Andrea Vai @ 2019-11-29 14:41 UTC (permalink / raw)
  To: Ming Lei
  Cc: Schmid, Carsten, Finn Thain, Damien Le Moal, Alan Stern,
	Jens Axboe, Johannes Thumshirn, USB list, SCSI development list,
	Himanshu Madhani, Hannes Reinecke, Omar Sandoval,
	Martin K. Petersen, Greg KH, Hans Holmberg,
	Kernel development list

[-- Attachment #1: Type: text/plain, Size: 740 bytes --]

Il giorno ven, 29/11/2019 alle 10.35 +0800, Ming Lei ha scritto:
> On Fri, Nov 29, 2019 at 08:57:34AM +0800, Ming Lei wrote:
> 
> > [...]
> 
> > Andrea, can you collect the following log when running the test
> > on current new(bad) kernel?
> > 
> > 	/usr/share/bcc/tools/stackcount  -K blk_mq_make_request
> 
> Instead, please run the following trace, given insert may be
> called from other paths, such as flush plug:
> 
> 	/usr/share/bcc/tools/stackcount -K t:block:block_rq_insert

Attached, for new (patched) bad kernel.

Produced by: start the trace script (with the pendrive already
plugged), wait some seconds, run the test (1 trial, 1 GB), wait for
the test to finish, stop the trace.

The copy took ~1700 seconds.

Thanks,
Andrea

[-- Attachment #2: log_ming_20191129_150609.zip --]
[-- Type: application/zip, Size: 2996 bytes --]

^ permalink raw reply	[flat|nested] 102+ messages in thread

* Re: AW: Slow I/O on USB media after commit f664a3cc17b7d0a2bc3b3ab96181e1029b0ec0e6
  2019-11-29 11:44                                                                                     ` AW: Slow I/O on USB media after commit f664a3cc17b7d0a2bc3b3ab96181e1029b0ec0e6 Bernd Schubert
@ 2019-12-02  7:01                                                                                       ` Andrea Vai
  0 siblings, 0 replies; 102+ messages in thread
From: Andrea Vai @ 2019-12-02  7:01 UTC (permalink / raw)
  To: Bernd Schubert
  Cc: Ming Lei, Schmid, Carsten, Finn Thain, Damien Le Moal,
	Alan Stern, Jens Axboe, Johannes Thumshirn, USB list,
	SCSI development list, Himanshu Madhani, Hannes Reinecke,
	Omar Sandoval, Martin K. Petersen, Greg KH, Hans Holmberg,
	Kernel development list

On 29/11/19 12:44:53, Bernd Schubert wrote:
> >> Trace attached. Produced by: start the trace script
> >> (with the pendrive already plugged), wait some seconds, run the test
> >> (1 trial, 1 GB), wait for the test to finish, stop the trace.
> >>
> >> The copy took 73 seconds, roughly as already seen before with the fast
> >> old kernel.
> > 
> > This trace shows a good write IO order because the writeback IOs are
> > queued to block layer serially from the 'cp' task and writeback wq.
> > 
> > However, writeback IO order is changed in current linus tree because
> > the IOs are queued to block layer concurrently from the 'cp' task
> > and writeback wq. It might be related with killing queue_congestion
> > by blk-mq.
> 
> What about using direct-io to ensure order is guaranteed? Pity that 'cp'
> doesn't seem to have an option for it. But dd should do the trick.
> Andrea, can you replace cp with a dd command (on the slow kernel)?
> 
> dd if=<path-to-src-file> of=<path-to-copy-on-flash-device> bs=1M
> oflag=direct

On the "new bad patched" kernel, this command takes 68 seconds to
complete (mean over 100 trials, with a narrow standard deviation), so
it is perfectly aligned with the cp command on the old fast good kernel.

Thanks, and bye
Andrea

^ permalink raw reply	[flat|nested] 102+ messages in thread

* Re: AW: Slow I/O on USB media after commit f664a3cc17b7d0a2bc3b3ab96181e1029b0ec0e6
  2019-11-29 14:41                                                                                       ` Andrea Vai
@ 2019-12-03  2:23                                                                                         ` Ming Lei
  2019-12-10  7:35                                                                                           ` Andrea Vai
  0 siblings, 1 reply; 102+ messages in thread
From: Ming Lei @ 2019-12-03  2:23 UTC (permalink / raw)
  To: Andrea Vai
  Cc: Schmid, Carsten, Finn Thain, Damien Le Moal, Alan Stern,
	Jens Axboe, Johannes Thumshirn, USB list, SCSI development list,
	Himanshu Madhani, Hannes Reinecke, Omar Sandoval,
	Martin K. Petersen, Greg KH, Hans Holmberg,
	Kernel development list, linux-ext4, linux-fsdevel,
	Theodore Ts'o

On Fri, Nov 29, 2019 at 03:41:01PM +0100, Andrea Vai wrote:
> Il giorno ven, 29/11/2019 alle 10.35 +0800, Ming Lei ha scritto:
> > On Fri, Nov 29, 2019 at 08:57:34AM +0800, Ming Lei wrote:
> > 
> > > [...]
> > 
> > > Andrea, can you collect the following log when running the test
> > > on current new(bad) kernel?
> > > 
> > > 	/usr/share/bcc/tools/stackcount  -K blk_mq_make_request
> > 
> > Instead, please run the following trace, given insert may be
> > called from other paths, such as flush plug:
> > 
> > 	/usr/share/bcc/tools/stackcount -K t:block:block_rq_insert
> 
> Attached, for new (patched) bad kernel.
> 
> Produced by: start the trace script (with the pendrive already
> plugged), wait some seconds, run the test (1 trial, 1 GB), wait for
> the test to finish, stop the trace.
> 
> The copy took ~1700 seconds.

See the two paths [1][2] that insert requests: path [1] is triggered
4358 times, and path [2] is triggered 5763 times.

Path [2] is expected behaviour. I am not sure path [1] is correct, given
that ext4_release_file() is supposed to be called when the inode is
released. That would mean the file is closed 4358 times while copying a
1GB file to USB storage.

Cc filesystem list.


[1] insert requests when returning to user mode from syscall

  b'blk_mq_sched_request_inserted'
  b'blk_mq_sched_request_inserted'
  b'dd_insert_requests'
  b'blk_mq_sched_insert_requests'
  b'blk_mq_flush_plug_list'
  b'blk_flush_plug_list'
  b'io_schedule_prepare'
  b'io_schedule'
  b'rq_qos_wait'
  b'wbt_wait'
  b'__rq_qos_throttle'
  b'blk_mq_make_request'
  b'generic_make_request'
  b'submit_bio'
  b'ext4_io_submit'
  b'ext4_writepages'
  b'do_writepages'
  b'__filemap_fdatawrite_range'
  b'ext4_release_file'
  b'__fput'
  b'task_work_run'
  b'exit_to_usermode_loop'
  b'do_syscall_64'
  b'entry_SYSCALL_64_after_hwframe'
    4358

[2] insert requests from writeback wq context

  b'blk_mq_sched_request_inserted'
  b'blk_mq_sched_request_inserted'
  b'dd_insert_requests'
  b'blk_mq_sched_insert_requests'
  b'blk_mq_flush_plug_list'
  b'blk_flush_plug_list'
  b'io_schedule_prepare'
  b'io_schedule'
  b'rq_qos_wait'
  b'wbt_wait'
  b'__rq_qos_throttle'
  b'blk_mq_make_request'
  b'generic_make_request'
  b'submit_bio'
  b'ext4_io_submit'
  b'ext4_bio_write_page'
  b'mpage_submit_page'
  b'mpage_process_page_bufs'
  b'mpage_prepare_extent_to_map'
  b'ext4_writepages'
  b'do_writepages'
  b'__writeback_single_inode'
  b'writeback_sb_inodes'
  b'__writeback_inodes_wb'
  b'wb_writeback'
  b'wb_workfn'
  b'process_one_work'
  b'worker_thread'
  b'kthread'
  b'ret_from_fork'
    5763
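
For anyone who wants to split a stackcount log by origin the same way, here is a rough Python sketch. It assumes the log has the shape shown above (lines of b'symbol' followed by a bare count); the function name is hypothetical and the sample stacks are abbreviated to a few frames each.

```python
import re


def parse_stackcount(text):
    """Return a list of (frames, count) tuples from stackcount-style output."""
    stacks, frames = [], []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        m = re.fullmatch(r"b'(.+)'", line)
        if m:
            # One stack frame, printed as a Python bytes literal.
            frames.append(m.group(1))
        elif line.isdigit() and frames:
            # A bare number closes the current stack and gives its hit count.
            stacks.append((frames, int(line)))
            frames = []
    return stacks


# Abbreviated versions of the two stacks above (most frames omitted).
sample = """
  b'blk_mq_sched_request_inserted'
  b'ext4_release_file'
  b'__fput'
    4358

  b'blk_mq_sched_request_inserted'
  b'wb_workfn'
  b'kthread'
    5763
"""

stacks = parse_stackcount(sample)
total = sum(count for _, count in stacks)
for frames, count in stacks:
    origin = "close path" if "ext4_release_file" in frames else "writeback wq"
    print(f"{origin}: {count} ({100 * count / total:.1f}%)")
```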

Thanks,
Ming


^ permalink raw reply	[flat|nested] 102+ messages in thread

* Re: AW: Slow I/O on USB media after commit f664a3cc17b7d0a2bc3b3ab96181e1029b0ec0e6
  2019-12-03  2:23                                                                                         ` Ming Lei
@ 2019-12-10  7:35                                                                                           ` Andrea Vai
  2019-12-10  8:05                                                                                             ` Ming Lei
  0 siblings, 1 reply; 102+ messages in thread
From: Andrea Vai @ 2019-12-10  7:35 UTC (permalink / raw)
  To: Ming Lei
  Cc: Schmid, Carsten, Finn Thain, Damien Le Moal, Alan Stern,
	Jens Axboe, Johannes Thumshirn, USB list, SCSI development list,
	Himanshu Madhani, Hannes Reinecke, Omar Sandoval,
	Martin K. Petersen, Greg KH, Hans Holmberg,
	Kernel development list, linux-ext4, linux-fsdevel,
	Theodore Ts'o

Il giorno mar, 03/12/2019 alle 10.23 +0800, Ming Lei ha scritto:
> On Fri, Nov 29, 2019 at 03:41:01PM +0100, Andrea Vai wrote:
> > Il giorno ven, 29/11/2019 alle 10.35 +0800, Ming Lei ha scritto:
> > > On Fri, Nov 29, 2019 at 08:57:34AM +0800, Ming Lei wrote:
> > > 
> > > > [...]
> > > 
> > > > Andrea, can you collect the following log when running the
> test
> > > > on current new(bad) kernel?
> > > > 
> > > > 	/usr/share/bcc/tools/stackcount  -K
> blk_mq_make_request
> > > 
> > > Instead, please run the following trace, given insert may be
> > > called from other paths, such as flush plug:
> > > 
> > > 	/usr/share/bcc/tools/stackcount -K t:block:block_rq_insert
> > 
> > Attached, for new (patched) bad kernel.
> > 
> > Produced by: start the trace script (with the pendrive already
> > plugged), wait some seconds, run the test (1 trial, 1 GB), wait
> for
> > the test to finish, stop the trace.
> > 
> > The copy took ~1700 seconds.
> 
> See the two path[1][2] of inserting request, and path[1] is
> triggered
> 4358 times, and the path[2] is triggered 5763 times.
> 
> The path[2] is expected behaviour. Not sure path [1] is correct,
> given
> ext4_release_file() is supposed to be called when this inode is
> released. That means the file is closed 4358 times during 1GB file
> copying to usb storage.
> 
> Cc filesystem list.
> 
> 
> [1] insert requests when returning to user mode from syscall
> 
>   b'blk_mq_sched_request_inserted'
>   b'blk_mq_sched_request_inserted'
>   b'dd_insert_requests'
>   b'blk_mq_sched_insert_requests'
>   b'blk_mq_flush_plug_list'
>   b'blk_flush_plug_list'
>   b'io_schedule_prepare'
>   b'io_schedule'
>   b'rq_qos_wait'
>   b'wbt_wait'
>   b'__rq_qos_throttle'
>   b'blk_mq_make_request'
>   b'generic_make_request'
>   b'submit_bio'
>   b'ext4_io_submit'
>   b'ext4_writepages'
>   b'do_writepages'
>   b'__filemap_fdatawrite_range'
>   b'ext4_release_file'
>   b'__fput'
>   b'task_work_run'
>   b'exit_to_usermode_loop'
>   b'do_syscall_64'
>   b'entry_SYSCALL_64_after_hwframe'
>     4358
> 
> [2] insert requests from writeback wq context
> 
>   b'blk_mq_sched_request_inserted'
>   b'blk_mq_sched_request_inserted'
>   b'dd_insert_requests'
>   b'blk_mq_sched_insert_requests'
>   b'blk_mq_flush_plug_list'
>   b'blk_flush_plug_list'
>   b'io_schedule_prepare'
>   b'io_schedule'
>   b'rq_qos_wait'
>   b'wbt_wait'
>   b'__rq_qos_throttle'
>   b'blk_mq_make_request'
>   b'generic_make_request'
>   b'submit_bio'
>   b'ext4_io_submit'
>   b'ext4_bio_write_page'
>   b'mpage_submit_page'
>   b'mpage_process_page_bufs'
>   b'mpage_prepare_extent_to_map'
>   b'ext4_writepages'
>   b'do_writepages'
>   b'__writeback_single_inode'
>   b'writeback_sb_inodes'
>   b'__writeback_inodes_wb'
>   b'wb_writeback'
>   b'wb_workfn'
>   b'process_one_work'
>   b'worker_thread'
>   b'kthread'
>   b'ret_from_fork'
>     5763
> 
> Thanks,
> Ming
> 

Is there any update on this? Sorry if I am making noise, but I would
like to help improve the kernel (or fix it) if I can.
Otherwise, please let me know how to consider this case.

Thanks, and bye
Andrea


^ permalink raw reply	[flat|nested] 102+ messages in thread

* Re: AW: Slow I/O on USB media after commit f664a3cc17b7d0a2bc3b3ab96181e1029b0ec0e6
  2019-12-10  7:35                                                                                           ` Andrea Vai
@ 2019-12-10  8:05                                                                                             ` Ming Lei
  2019-12-11  2:41                                                                                               ` Theodore Y. Ts'o
  0 siblings, 1 reply; 102+ messages in thread
From: Ming Lei @ 2019-12-10  8:05 UTC (permalink / raw)
  To: Andrea Vai
  Cc: Schmid, Carsten, Finn Thain, Damien Le Moal, Alan Stern,
	Jens Axboe, Johannes Thumshirn, USB list, SCSI development list,
	Himanshu Madhani, Hannes Reinecke, Omar Sandoval,
	Martin K. Petersen, Greg KH, Hans Holmberg,
	Kernel development list, linux-ext4, linux-fsdevel,
	Theodore Ts'o

On Tue, Dec 10, 2019 at 08:35:43AM +0100, Andrea Vai wrote:
> Il giorno mar, 03/12/2019 alle 10.23 +0800, Ming Lei ha scritto:
> > On Fri, Nov 29, 2019 at 03:41:01PM +0100, Andrea Vai wrote:
> > > Il giorno ven, 29/11/2019 alle 10.35 +0800, Ming Lei ha scritto:
> > > > On Fri, Nov 29, 2019 at 08:57:34AM +0800, Ming Lei wrote:
> > > > 
> > > > > [...]
> > > > 
> > > > > Andrea, can you collect the following log when running the
> > test
> > > > > on current new(bad) kernel?
> > > > > 
> > > > > 	/usr/share/bcc/tools/stackcount  -K
> > blk_mq_make_request
> > > > 
> > > > Instead, please run the following trace, given insert may be
> > > > called from other paths, such as flush plug:
> > > > 
> > > > 	/usr/share/bcc/tools/stackcount -K t:block:block_rq_insert
> > > 
> > > Attached, for new (patched) bad kernel.
> > > 
> > > Produced by: start the trace script (with the pendrive already
> > > plugged), wait some seconds, run the test (1 trial, 1 GB), wait
> > for
> > > the test to finish, stop the trace.
> > > 
> > > The copy took ~1700 seconds.
> > 
> > See the two path[1][2] of inserting request, and path[1] is
> > triggered
> > 4358 times, and the path[2] is triggered 5763 times.
> > 
> > The path[2] is expected behaviour. Not sure path [1] is correct,
> > given
> > ext4_release_file() is supposed to be called when this inode is
> > released. That means the file is closed 4358 times during 1GB file
> > copying to usb storage.
> > 
> > Cc filesystem list.
> > 
> > 
> > [1] insert requests when returning to user mode from syscall
> > 
> >   b'blk_mq_sched_request_inserted'
> >   b'blk_mq_sched_request_inserted'
> >   b'dd_insert_requests'
> >   b'blk_mq_sched_insert_requests'
> >   b'blk_mq_flush_plug_list'
> >   b'blk_flush_plug_list'
> >   b'io_schedule_prepare'
> >   b'io_schedule'
> >   b'rq_qos_wait'
> >   b'wbt_wait'
> >   b'__rq_qos_throttle'
> >   b'blk_mq_make_request'
> >   b'generic_make_request'
> >   b'submit_bio'
> >   b'ext4_io_submit'
> >   b'ext4_writepages'
> >   b'do_writepages'
> >   b'__filemap_fdatawrite_range'
> >   b'ext4_release_file'
> >   b'__fput'
> >   b'task_work_run'
> >   b'exit_to_usermode_loop'
> >   b'do_syscall_64'
> >   b'entry_SYSCALL_64_after_hwframe'
> >     4358
> > 
> > [2] insert requests from writeback wq context
> > 
> >   b'blk_mq_sched_request_inserted'
> >   b'blk_mq_sched_request_inserted'
> >   b'dd_insert_requests'
> >   b'blk_mq_sched_insert_requests'
> >   b'blk_mq_flush_plug_list'
> >   b'blk_flush_plug_list'
> >   b'io_schedule_prepare'
> >   b'io_schedule'
> >   b'rq_qos_wait'
> >   b'wbt_wait'
> >   b'__rq_qos_throttle'
> >   b'blk_mq_make_request'
> >   b'generic_make_request'
> >   b'submit_bio'
> >   b'ext4_io_submit'
> >   b'ext4_bio_write_page'
> >   b'mpage_submit_page'
> >   b'mpage_process_page_bufs'
> >   b'mpage_prepare_extent_to_map'
> >   b'ext4_writepages'
> >   b'do_writepages'
> >   b'__writeback_single_inode'
> >   b'writeback_sb_inodes'
> >   b'__writeback_inodes_wb'
> >   b'wb_writeback'
> >   b'wb_workfn'
> >   b'process_one_work'
> >   b'worker_thread'
> >   b'kthread'
> >   b'ret_from_fork'
> >     5763
> > 
> > Thanks,
> > Ming
> > 
> 
> Is there any update on this? Sorry if I am making noise, but I would
> like to help to improve the kernel (or fix it) if I can help.
> Otherwise, please let me know how to consider this case,

IMO, the extra write path from exit_to_usermode_loop() isn't expected.
That is probably why the write IO order changes, which in turn makes
performance drop on your USB storage.

We need our fs/ext4 experts to take a look.

Or can you reproduce the issue on xfs or btrfs?

Thanks,
Ming


^ permalink raw reply	[flat|nested] 102+ messages in thread

* Re: AW: Slow I/O on USB media after commit f664a3cc17b7d0a2bc3b3ab96181e1029b0ec0e6
  2019-12-10  8:05                                                                                             ` Ming Lei
@ 2019-12-11  2:41                                                                                               ` Theodore Y. Ts'o
  2019-12-11  4:00                                                                                                 ` Ming Lei
  0 siblings, 1 reply; 102+ messages in thread
From: Theodore Y. Ts'o @ 2019-12-11  2:41 UTC (permalink / raw)
  To: Ming Lei
  Cc: Andrea Vai, Schmid, Carsten, Finn Thain, Damien Le Moal,
	Alan Stern, Jens Axboe, Johannes Thumshirn, USB list,
	SCSI development list, Himanshu Madhani, Hannes Reinecke,
	Omar Sandoval, Martin K. Petersen, Greg KH, Hans Holmberg,
	Kernel development list, linux-ext4, linux-fsdevel

On Tue, Dec 10, 2019 at 04:05:50PM +0800, Ming Lei wrote:
> > > The path[2] is expected behaviour. Not sure path [1] is correct,
> > > given
> > > ext4_release_file() is supposed to be called when this inode is
> > > released. That means the file is closed 4358 times during 1GB file
> > > copying to usb storage.
> > > 
> > > [1] insert requests when returning to user mode from syscall
> > > 
> > >   b'blk_mq_sched_request_inserted'
> > >   b'blk_mq_sched_request_inserted'
> > >   b'dd_insert_requests'
> > >   b'blk_mq_sched_insert_requests'
> > >   b'blk_mq_flush_plug_list'
> > >   b'blk_flush_plug_list'
> > >   b'io_schedule_prepare'
> > >   b'io_schedule'
> > >   b'rq_qos_wait'
> > >   b'wbt_wait'
> > >   b'__rq_qos_throttle'
> > >   b'blk_mq_make_request'
> > >   b'generic_make_request'
> > >   b'submit_bio'
> > >   b'ext4_io_submit'
> > >   b'ext4_writepages'
> > >   b'do_writepages'
> > >   b'__filemap_fdatawrite_range'
> > >   b'ext4_release_file'
> > >   b'__fput'
> > >   b'task_work_run'
> > >   b'exit_to_usermode_loop'
> > >   b'do_syscall_64'
> > >   b'entry_SYSCALL_64_after_hwframe'
> > >     4358

I'm guessing that your workload is repeatedly truncating a file (or
calling open with O_TRUNC) and then writing data to it.  When you do
this, then when the file is closed, we assume that since you were
replacing the previous contents of a file with new contents, you
would be unhappy if the file contents were replaced by a zero-length
file after a crash.  That's because ten years ago there were a *huge*
number of crappy applications that would replace a file by reading it
into memory, truncating it, and then writing out the new contents of
the file.  This could be a high score file for a game, or a KDE or
GNOME state file, etc.

So if someone does open, truncate, write, close, we still immediately
write out the data on the close, assuming that the programmer really
wanted open, truncate, write, fsync, close, but was too careless to
actually do the right thing.
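
For concreteness, the explicit open, truncate, write, fsync, close sequence looks roughly like this from userspace. It is a minimal Python sketch; replace_contents is a hypothetical helper name, and error handling is reduced to closing the descriptor.

```python
import os


def replace_contents(path, data):
    """Overwrite path with data and flush it to storage before closing."""
    # O_TRUNC performs the truncate step as part of open().
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    try:
        view = memoryview(data)
        while view:
            # os.write() may write fewer bytes than requested; loop until done.
            written = os.write(fd, view)
            view = view[written:]
        # The explicit fsync is the step that careless applications skip.
        os.fsync(fd)
    finally:
        os.close(fd)
```

With the fsync in place the application no longer depends on the file system flushing at close time.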

Some workaround[1] like this is done by all of the major file systems,
and was the agreed-upon fallout from the "O_PONIES"[2] controversy.
This was discussed and agreed to at the 2009 LSF/MM workshop[3].  (See
the "rename, fsync, and ponies" section.)

[1] https://bugs.launchpad.net/ubuntu/+source/linux/+bug/317781/comments/45
[2] https://blahg.josefsipek.net/?p=364
[3] https://lwn.net/Articles/327601/

So if you're seeing a call to filemap_fdatawrite_range as the result
of a fput, that's why.

In any case, this behavior has been around for a decade, and it
appears to be incidental to your performance difficulties with your
USB thumbdrive and block-mq.

						- Ted

^ permalink raw reply	[flat|nested] 102+ messages in thread

* Re: AW: Slow I/O on USB media after commit f664a3cc17b7d0a2bc3b3ab96181e1029b0ec0e6
  2019-12-11  2:41                                                                                               ` Theodore Y. Ts'o
@ 2019-12-11  4:00                                                                                                 ` Ming Lei
  2019-12-11 16:07                                                                                                   ` Theodore Y. Ts'o
  0 siblings, 1 reply; 102+ messages in thread
From: Ming Lei @ 2019-12-11  4:00 UTC (permalink / raw)
  To: Theodore Y. Ts'o
  Cc: Andrea Vai, Schmid, Carsten, Finn Thain, Damien Le Moal,
	Alan Stern, Jens Axboe, Johannes Thumshirn, USB list,
	SCSI development list, Himanshu Madhani, Hannes Reinecke,
	Omar Sandoval, Martin K. Petersen, Greg KH, Hans Holmberg,
	Kernel development list, linux-ext4, linux-fsdevel

On Tue, Dec 10, 2019 at 09:41:37PM -0500, Theodore Y. Ts'o wrote:
> On Tue, Dec 10, 2019 at 04:05:50PM +0800, Ming Lei wrote:
> > > > The path[2] is expected behaviour. Not sure path [1] is correct,
> > > > given
> > > > ext4_release_file() is supposed to be called when this inode is
> > > > released. That means the file is closed 4358 times during 1GB file
> > > > copying to usb storage.
> > > > 
> > > > [1] insert requests when returning to user mode from syscall
> > > > 
> > > >   b'blk_mq_sched_request_inserted'
> > > >   b'blk_mq_sched_request_inserted'
> > > >   b'dd_insert_requests'
> > > >   b'blk_mq_sched_insert_requests'
> > > >   b'blk_mq_flush_plug_list'
> > > >   b'blk_flush_plug_list'
> > > >   b'io_schedule_prepare'
> > > >   b'io_schedule'
> > > >   b'rq_qos_wait'
> > > >   b'wbt_wait'
> > > >   b'__rq_qos_throttle'
> > > >   b'blk_mq_make_request'
> > > >   b'generic_make_request'
> > > >   b'submit_bio'
> > > >   b'ext4_io_submit'
> > > >   b'ext4_writepages'
> > > >   b'do_writepages'
> > > >   b'__filemap_fdatawrite_range'
> > > >   b'ext4_release_file'
> > > >   b'__fput'
> > > >   b'task_work_run'
> > > >   b'exit_to_usermode_loop'
> > > >   b'do_syscall_64'
> > > >   b'entry_SYSCALL_64_after_hwframe'
> > > >     4358
> 
> I'm guessing that your workload is repeatedly truncating a file (or
> calling open with O_TRUNC) and then writing data to it.  When you do
> this, then when the file is closed, we assume that since you were
> replacing the previous contents of a file with new contents, that you
> would be unhappy if the file contents was replaced by a zero length
> file after a crash.  That's because ten years, ago there were a *huge*
> number of crappy applications that would replace a file by reading it
> into memory, truncating it, and then write out the new contents of the
> file.  This could be a high score file for a game, or a KDE or GNOME
> state file, etc.
> 
> So if someone does open, truncate, write, close, we still immediately
> writing out the data on the close, assuming that the programmer really
> wanted open, truncate, write, fsync, close, but was too careless to
> actually do the right thing.
> 
> Some workaround[1] like this is done by all of the major file systems,
> and was fallout the agreement from the "O_PONIES"[2] controversy.
> This was discussed and agreed to at the 2009 LSF/MM workshop.  (See
> the "rename, fsync, and ponies" section.)
> 
> [1] https://bugs.launchpad.net/ubuntu/+source/linux/+bug/317781/comments/45
> [2] https://blahg.josefsipek.net/?p=364
> [3] https://lwn.net/Articles/327601/
> 
> So if you're seeing a call to filemap_fdatawrite_range as the result
> of a fput, that's why.
> 
> In any case, this behavior has been around for a decade, and it
> appears to be incidental to your performance difficulties with your
> USB thumbdrive and block-mq.

I couldn't reproduce the issue in my test environment. The following
are Andrea's test commands[1]:

  mount UUID=$uuid /mnt/pendrive 2>&1 |tee -a $logfile
  SECONDS=0
  cp $testfile /mnt/pendrive 2>&1 |tee -a $logfile
  umount /mnt/pendrive 2>&1 |tee -a $logfile

The 'cp' command is supposed to open/close the file just once; however,
ext4_release_file() and the page writeout are observed to run 4358 times
when executing the above 'cp' test.


[1] https://marc.info/?l=linux-kernel&m=157486689806734&w=2


Thanks,
Ming


^ permalink raw reply	[flat|nested] 102+ messages in thread

* Re: AW: Slow I/O on USB media after commit f664a3cc17b7d0a2bc3b3ab96181e1029b0ec0e6
  2019-12-11  4:00                                                                                                 ` Ming Lei
@ 2019-12-11 16:07                                                                                                   ` Theodore Y. Ts'o
  2019-12-11 21:33                                                                                                     ` Ming Lei
  0 siblings, 1 reply; 102+ messages in thread
From: Theodore Y. Ts'o @ 2019-12-11 16:07 UTC (permalink / raw)
  To: Ming Lei
  Cc: Andrea Vai, Schmid, Carsten, Finn Thain, Damien Le Moal,
	Alan Stern, Jens Axboe, Johannes Thumshirn, USB list,
	SCSI development list, Himanshu Madhani, Hannes Reinecke,
	Omar Sandoval, Martin K. Petersen, Greg KH, Hans Holmberg,
	Kernel development list, linux-ext4, linux-fsdevel

On Wed, Dec 11, 2019 at 12:00:58PM +0800, Ming Lei wrote:
> I didn't reproduce the issue in my test environment. For reference,
> these are Andrea's test commands[1]:
> 
>   mount UUID=$uuid /mnt/pendrive 2>&1 |tee -a $logfile
>   SECONDS=0
>   cp $testfile /mnt/pendrive 2>&1 |tee -a $logfile
>   umount /mnt/pendrive 2>&1 |tee -a $logfile
> 
> The 'cp' command is supposed to open/close the file just once;
> however, ext4_release_file() and the resulting page writes are
> observed to run 4358 times when executing the above 'cp' test.

Why are we sure the ext4_release_file() / __fput() is coming from the
cp command, as opposed to something else that might be running on the
system under test?  __fput() is called by the kernel when the last
reference to a struct file is released.  (Specifically, if you have a
fd which is dup'ed, it's only when the last fd corresponding to the
struct file is closed, and the struct file is about to be released,
that the file system's f_op->release function gets called.)
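
The dup'ed-fd point above can be seen from user space with a minimal
shell sketch.  The function name demo_dup and the temporary file are
made up for this illustration; it only shows that the open file stays
usable until the last descriptor is closed, not what the kernel does
internally:

```shell
#!/bin/sh
# Illustration only: a duplicated descriptor keeps the open file alive.
demo_dup() {
    tmp=$(mktemp) || return 1
    exec 3>"$tmp"            # open the file on fd 3
    exec 4>&3                # duplicate fd 3 as fd 4 (same open file)
    printf 'first\n' >&3
    exec 3>&-                # close fd 3; fd 4 still refers to the file
    printf 'second\n' >&4    # still works, and continues at the shared offset
    exec 4>&-                # last descriptor closed; only now can the
                             # file be released
    cat "$tmp"
    rm -f "$tmp"
}

demo_dup
```

Closing fd 3 alone does not end the file's life; the second write
lands right after the first because both descriptors share one file
offset.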

So the first question I'd ask is whether there is anything else going
on in the system, and whether the writes are happening to the USB thumb
drive, or to some other storage device.  And if there is something
else which is writing to the pendrive, maybe that's why no one else
has been able to reproduce the OP's complaint....

                                        - Ted

^ permalink raw reply	[flat|nested] 102+ messages in thread

* Re: AW: Slow I/O on USB media after commit f664a3cc17b7d0a2bc3b3ab96181e1029b0ec0e6
  2019-12-11 16:07                                                                                                   ` Theodore Y. Ts'o
@ 2019-12-11 21:33                                                                                                     ` Ming Lei
  2019-12-12  7:34                                                                                                       ` Andrea Vai
  2019-12-18  8:25                                                                                                       ` Andrea Vai
  0 siblings, 2 replies; 102+ messages in thread
From: Ming Lei @ 2019-12-11 21:33 UTC (permalink / raw)
  To: Theodore Y. Ts'o
  Cc: Andrea Vai, Schmid, Carsten, Finn Thain, Damien Le Moal,
	Alan Stern, Jens Axboe, Johannes Thumshirn, USB list,
	SCSI development list, Himanshu Madhani, Hannes Reinecke,
	Omar Sandoval, Martin K. Petersen, Greg KH, Hans Holmberg,
	Kernel development list, linux-ext4, linux-fsdevel

On Wed, Dec 11, 2019 at 11:07:45AM -0500, Theodore Y. Ts'o wrote:
> On Wed, Dec 11, 2019 at 12:00:58PM +0800, Ming Lei wrote:
> > I didn't reproduce the issue in my test environment. For reference,
> > these are Andrea's test commands[1]:
> > 
> >   mount UUID=$uuid /mnt/pendrive 2>&1 |tee -a $logfile
> >   SECONDS=0
> >   cp $testfile /mnt/pendrive 2>&1 |tee -a $logfile
> >   umount /mnt/pendrive 2>&1 |tee -a $logfile
> > 
> > The 'cp' command is supposed to open/close the file just once;
> > however, ext4_release_file() and the resulting page writes are
> > observed to run 4358 times when executing the above 'cp' test.
> 
> Why are we sure the ext4_release_file() / __fput() is coming from the
> cp command, as opposed to something else that might be running on the
> system under test?  __fput() is called by the kernel when the last

Please see the log:

https://lore.kernel.org/linux-scsi/3af3666920e7d46f8f0c6d88612f143ffabc743c.camel@unipv.it/2-log_ming.zip

It was collected by the following script:

#!/bin/sh
MAJ=$1
MIN=$2
MAJ=$(( $MAJ << 20 ))
DEV=$(( $MAJ | $MIN ))

/usr/share/bcc/tools/trace -t -C \
    't:block:block_rq_issue (args->dev == '$DEV') "%s %d %d", args->rwbs, args->sector, args->nr_sector' \
    't:block:block_rq_insert (args->dev == '$DEV') "%s %d %d", args->rwbs, args->sector, args->nr_sector'

$MAJ:$MIN points to the USB storage disk.
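
For clarity, the device number used in the filter above is just
(major << 20) | minor.  A small sketch of that arithmetic follows; the
function names are made up for illustration, and 8:80 is only an
example pair (what a disk such as sdf would typically report in
/sys/block/sdf/dev), not a value taken from the logs:

```shell
#!/bin/sh
# Combine major and minor the same way the tracing script does:
# the minor number occupies the low 20 bits, the major sits above them.
dev_number() {
    echo $(( ($1 << 20) | $2 ))
}

# Hypothetical helper: accept a "major:minor" string, e.g. the contents
# of /sys/block/<disk>/dev, and print the combined value.
dev_number_from_pair() {
    pair=$1
    dev_number "${pair%%:*}" "${pair##*:}"
}

dev_number_from_pair 8:80
```

The resulting number is what gets compared against args->dev in the
two tracepoint filters.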

The IO trace above shows two write paths: one from cp, the other
from the writeback wq.

The stackcount trace[1] is consistent with the IO trace log, since it
also shows only two IO paths; that is why I concluded that the write
done via ext4_release_file() is from 'cp'.

[1] https://lore.kernel.org/linux-scsi/320b315b9c87543d4fb919ecbdf841596c8fbcea.camel@unipv.it/2-log_ming_20191129_150609.zip

> reference to a struct file is released.  (Specifically, if you have a
> fd which is dup'ed, it's only when the last fd corresponding to the
> struct file is closed, and the struct file is about to be released,
> that the file system's f_op->release function gets called.)
> 
> So the first question I'd ask is whether there is anything else going
> on in the system, and whether the writes are happening to the USB thumb
> drive, or to some other storage device.  And if there is something
> else which is writing to the pendrive, maybe that's why no one else
> has been able to reproduce the OP's complaint....

OK, we can ask Andrea to confirm that via the following trace, which
will add pid/comm info in the stack trace:

/usr/share/bcc/tools/stackcount  blk_mq_sched_request_inserted

Andrea, could you collect the above log again while running the
new/bad kernel, to confirm whether the write done by
ext4_release_file() comes from the 'cp' process?

Thanks,
Ming


^ permalink raw reply	[flat|nested] 102+ messages in thread

* Re: AW: Slow I/O on USB media after commit f664a3cc17b7d0a2bc3b3ab96181e1029b0ec0e6
  2019-12-11 21:33                                                                                                     ` Ming Lei
@ 2019-12-12  7:34                                                                                                       ` Andrea Vai
  2019-12-18  8:25                                                                                                       ` Andrea Vai
  1 sibling, 0 replies; 102+ messages in thread
From: Andrea Vai @ 2019-12-12  7:34 UTC (permalink / raw)
  To: Ming Lei, Theodore Y. Ts'o
  Cc: Schmid, Carsten, Finn Thain, Damien Le Moal, Alan Stern,
	Jens Axboe, Johannes Thumshirn, USB list, SCSI development list,
	Himanshu Madhani, Hannes Reinecke, Omar Sandoval,
	Martin K. Petersen, Greg KH, Hans Holmberg,
	Kernel development list, linux-ext4, linux-fsdevel

On Thu, 12/12/2019 at 05.33 +0800, Ming Lei wrote:
> On Wed, Dec 11, 2019 at 11:07:45AM -0500, Theodore Y. Ts'o wrote:
> > On Wed, Dec 11, 2019 at 12:00:58PM +0800, Ming Lei wrote:
> > > I didn't reproduce the issue in my test environment. For reference,
> > > these are Andrea's test commands[1]:
> > > 
> > >   mount UUID=$uuid /mnt/pendrive 2>&1 |tee -a $logfile
> > >   SECONDS=0
> > >   cp $testfile /mnt/pendrive 2>&1 |tee -a $logfile
> > >   umount /mnt/pendrive 2>&1 |tee -a $logfile
> > > 
> > > The 'cp' command is supposed to open/close the file just once;
> > > however, ext4_release_file() and the resulting page writes are
> > > observed to run 4358 times when executing the above 'cp' test.
> > 
> > Why are we sure the ext4_release_file() / __fput() is coming from the
> > cp command, as opposed to something else that might be running on the
> > system under test?  __fput() is called by the kernel when the last
> 
> Please see the log:
> 
> https://lore.kernel.org/linux-scsi/3af3666920e7d46f8f0c6d88612f143ffabc743c.camel@unipv.it/2-log_ming.zip
> 
> It was collected by the following script:
> 
> #!/bin/sh
> MAJ=$1
> MIN=$2
> MAJ=$(( $MAJ << 20 ))
> DEV=$(( $MAJ | $MIN ))
> 
> /usr/share/bcc/tools/trace -t -C \
>     't:block:block_rq_issue (args->dev == '$DEV') "%s %d %d", args->rwbs, args->sector, args->nr_sector' \
>     't:block:block_rq_insert (args->dev == '$DEV') "%s %d %d", args->rwbs, args->sector, args->nr_sector'
> 
> $MAJ:$MIN points to the USB storage disk.
> 
> The IO trace above shows two write paths: one from cp, the other
> from the writeback wq.
> 
> The stackcount trace[1] is consistent with the IO trace log, since it
> also shows only two IO paths; that is why I concluded that the write
> done via ext4_release_file() is from 'cp'.
> 
> [1] https://lore.kernel.org/linux-scsi/320b315b9c87543d4fb919ecbdf841596c8fbcea.camel@unipv.it/2-log_ming_20191129_150609.zip
> 
> > reference to a struct file is released.  (Specifically, if you have a
> > fd which is dup'ed, it's only when the last fd corresponding to the
> > struct file is closed, and the struct file is about to be released,
> > that the file system's f_op->release function gets called.)
> > 
> > So the first question I'd ask is whether there is anything else going
> > on in the system, and whether the writes are happening to the USB thumb
> > drive, or to some other storage device.  And if there is something
> > else which is writing to the pendrive, maybe that's why no one else
> > has been able to reproduce the OP's complaint....
> 
> OK, we can ask Andrea to confirm that via the following trace, which
> will add pid/comm info in the stack trace:
> 
> /usr/share/bcc/tools/stackcount  blk_mq_sched_request_inserted
> 
> Andrea, could you collect the above log again while running the
> new/bad kernel, to confirm whether the write done by
> ext4_release_file() comes from the 'cp' process?

Yes, I will try to do it as soon as possible and let you know.
I will also try xfs or btrfs, as you suggested in another message.

Thanks, and bye
Andrea


^ permalink raw reply	[flat|nested] 102+ messages in thread

* Re: AW: Slow I/O on USB media after commit f664a3cc17b7d0a2bc3b3ab96181e1029b0ec0e6
  2019-12-11 21:33                                                                                                     ` Ming Lei
  2019-12-12  7:34                                                                                                       ` Andrea Vai
@ 2019-12-18  8:25                                                                                                       ` Andrea Vai
  2019-12-18  9:48                                                                                                         ` Ming Lei
  1 sibling, 1 reply; 102+ messages in thread
From: Andrea Vai @ 2019-12-18  8:25 UTC (permalink / raw)
  To: Ming Lei, Theodore Y. Ts'o
  Cc: Schmid, Carsten, Finn Thain, Damien Le Moal, Alan Stern,
	Jens Axboe, Johannes Thumshirn, USB list, SCSI development list,
	Himanshu Madhani, Hannes Reinecke, Omar Sandoval,
	Martin K. Petersen, Greg KH, Hans Holmberg,
	Kernel development list, linux-ext4, linux-fsdevel

[-- Attachment #1: Type: text/plain, Size: 3636 bytes --]

On Thu, 12/12/2019 at 05.33 +0800, Ming Lei wrote:
> On Wed, Dec 11, 2019 at 11:07:45AM -0500, Theodore Y. Ts'o wrote:
> > On Wed, Dec 11, 2019 at 12:00:58PM +0800, Ming Lei wrote:
> > > I didn't reproduce the issue in my test environment. For reference,
> > > these are Andrea's test commands[1]:
> > > 
> > >   mount UUID=$uuid /mnt/pendrive 2>&1 |tee -a $logfile
> > >   SECONDS=0
> > >   cp $testfile /mnt/pendrive 2>&1 |tee -a $logfile
> > >   umount /mnt/pendrive 2>&1 |tee -a $logfile
> > > 
> > > The 'cp' command is supposed to open/close the file just once;
> > > however, ext4_release_file() and the resulting page writes are
> > > observed to run 4358 times when executing the above 'cp' test.
> > 
> > Why are we sure the ext4_release_file() / __fput() is coming from the
> > cp command, as opposed to something else that might be running on the
> > system under test?  __fput() is called by the kernel when the last
> 
> Please see the log:
> 
> https://lore.kernel.org/linux-scsi/3af3666920e7d46f8f0c6d88612f143ffabc743c.camel@unipv.it/2-log_ming.zip
> 
> It was collected by the following script:
> 
> #!/bin/sh
> MAJ=$1
> MIN=$2
> MAJ=$(( $MAJ << 20 ))
> DEV=$(( $MAJ | $MIN ))
> 
> /usr/share/bcc/tools/trace -t -C \
>     't:block:block_rq_issue (args->dev == '$DEV') "%s %d %d", args->rwbs, args->sector, args->nr_sector' \
>     't:block:block_rq_insert (args->dev == '$DEV') "%s %d %d", args->rwbs, args->sector, args->nr_sector'
> 
> $MAJ:$MIN points to the USB storage disk.
> 
> The IO trace above shows two write paths: one from cp, the other
> from the writeback wq.
> 
> The stackcount trace[1] is consistent with the IO trace log, since it
> also shows only two IO paths; that is why I concluded that the write
> done via ext4_release_file() is from 'cp'.
> 
> [1] https://lore.kernel.org/linux-scsi/320b315b9c87543d4fb919ecbdf841596c8fbcea.camel@unipv.it/2-log_ming_20191129_150609.zip
> 
> > reference to a struct file is released.  (Specifically, if you have a
> > fd which is dup'ed, it's only when the last fd corresponding to the
> > struct file is closed, and the struct file is about to be released,
> > that the file system's f_op->release function gets called.)
> > 
> > So the first question I'd ask is whether there is anything else going
> > on in the system, and whether the writes are happening to the USB thumb
> > drive, or to some other storage device.  And if there is something
> > else which is writing to the pendrive, maybe that's why no one else
> > has been able to reproduce the OP's complaint....
> 
> OK, we can ask Andrea to confirm that via the following trace, which
> will add pid/comm info in the stack trace:
> 
> /usr/share/bcc/tools/stackcount blk_mq_sched_request_inserted
> 
> Andrea, could you collect the above log again while running the
> new/bad kernel, to confirm whether the write done by
> ext4_release_file() comes from the 'cp' process?

You can find the stackcount log attached. It was produced as follows:

- /usr/share/bcc/tools/stackcount blk_mq_sched_request_inserted > trace.log
- wait some seconds
- run the test (1 copy trial), wait for the test to finish, wait some seconds
- stop the trace (ctrl+C)

The test took 1994 seconds to complete.

I also ran the usual test with btrfs and xfs. Btrfs behavior looks
"good". xfs is sometimes better, sometimes worse, I would say. I
don't know if it matters, but I have also attached the results of the
two tests (100 trials each). Basically, btrfs is always between 68 and
89 seconds, alternating between a slower and a faster trial (a cycle
with a period of 2 trials). xfs is almost always very good (63-65s),
but sometimes "bad" (>300s).
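
In case it helps with reading the attached results, here is a rough
sketch of how the per-trial times could be summarized.  It assumes the
line format used in the attachments ("... ci ho messo N secondi!");
the function name summarize_trials and the 120-second threshold for a
"slow" trial are arbitrary choices made for this example:

```shell
#!/bin/sh
# Read a test log on stdin and print count, min, max, mean and the
# number of trials slower than the given threshold (default 120 s).
summarize_trials() {
    threshold=${1:-120}
    awk -v thr="$threshold" '
        /ci ho messo [0-9]+ secondi/ {
            # The elapsed time is the field right after "messo".
            for (i = 1; i < NF; i++)
                if ($i == "messo") s = $(i + 1) + 0
            n++
            sum += s
            if (n == 1 || s < min) min = s
            if (n == 1 || s > max) max = s
            if (s > thr) slow++
        }
        END {
            if (n == 0) { print "no trials found"; exit 1 }
            printf "trials=%d min=%d max=%d mean=%.1f slow=%d\n",
                   n, min, max, sum / n, slow
        }'
}

# Example usage (file name as in the attachment):
#   summarize_trials 120 < test_xfs_20191217.txt
```

A single number such as the mean hides the bimodal xfs behavior, which
is why the sketch also counts the trials above the threshold.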

Thanks,
Andrea

[-- Attachment #2: test_btrfs_20191217.txt --]
[-- Type: text/plain, Size: 11367 bytes --]

*** test btrfs *** -> test_btrfs_20191217.txt

Starting 100 tries with:
Linux angus.unipv.it 5.4.0+ #1 SMP Mon Nov 25 11:31:34 CET 2019 x86_64 x86_64 x86_64 GNU/Linux
-rw-r--r--. 1 root root 1,0G 25 nov 13.29 /NoBackup/testfile
/dev/sda1: LABEL="Fedora30" UUID="a7ca2491-c807-4b10-b33f-ef425699148d" TYPE="ext4" PARTUUID="8b16fbdd-01"
/dev/sda2: LABEL="Swap_4GB" UUID="ba020b1e-4cdc-4f94-b92c-bdc11613388d" TYPE="swap" PARTUUID="8b16fbdd-02"
/dev/sdf1: LABEL="BAK_ANDVAI" UUID="6ddfec28-3d9a-4676-a726-927fd3fb21e7" UUID_SUB="581c69ab-6758-4662-999a-b6dfe6ee5e69" TYPE="btrfs" PARTUUID="09066b88-01"
/dev/sdg1: LABEL="BAK_ANDVAI" UUID="df777e33-8890-4cee-a718-42233f4cafae" TYPE="ext4" PARTUUID="75265898-01"
cat /sys/block/sdf/queue/scheduler --> [mq-deadline] none
Inizio: mar 17 dic 2019, 13:31:00, CET...fine: mar 17 dic 2019, 13:32:26, CET --> ci ho messo 86 secondi!
Inizio: mar 17 dic 2019, 13:32:26, CET...fine: mar 17 dic 2019, 13:33:36, CET --> ci ho messo 70 secondi!
Inizio: mar 17 dic 2019, 13:33:36, CET...fine: mar 17 dic 2019, 13:35:02, CET --> ci ho messo 86 secondi!
Inizio: mar 17 dic 2019, 13:35:02, CET...fine: mar 17 dic 2019, 13:36:14, CET --> ci ho messo 72 secondi!
Inizio: mar 17 dic 2019, 13:36:14, CET...fine: mar 17 dic 2019, 13:37:40, CET --> ci ho messo 86 secondi!
Inizio: mar 17 dic 2019, 13:37:40, CET...fine: mar 17 dic 2019, 13:38:51, CET --> ci ho messo 71 secondi!
Inizio: mar 17 dic 2019, 13:38:51, CET...fine: mar 17 dic 2019, 13:40:18, CET --> ci ho messo 87 secondi!
Inizio: mar 17 dic 2019, 13:40:18, CET...fine: mar 17 dic 2019, 13:41:29, CET --> ci ho messo 71 secondi!
Inizio: mar 17 dic 2019, 13:41:29, CET...fine: mar 17 dic 2019, 13:42:58, CET --> ci ho messo 89 secondi!
Inizio: mar 17 dic 2019, 13:42:58, CET...fine: mar 17 dic 2019, 13:44:11, CET --> ci ho messo 73 secondi!
Inizio: mar 17 dic 2019, 13:44:11, CET...fine: mar 17 dic 2019, 13:45:40, CET --> ci ho messo 89 secondi!
Inizio: mar 17 dic 2019, 13:45:40, CET...fine: mar 17 dic 2019, 13:46:49, CET --> ci ho messo 69 secondi!
Inizio: mar 17 dic 2019, 13:46:49, CET...fine: mar 17 dic 2019, 13:48:16, CET --> ci ho messo 87 secondi!
Inizio: mar 17 dic 2019, 13:48:16, CET...fine: mar 17 dic 2019, 13:49:27, CET --> ci ho messo 71 secondi!
Inizio: mar 17 dic 2019, 13:49:27, CET...fine: mar 17 dic 2019, 13:50:53, CET --> ci ho messo 86 secondi!
Inizio: mar 17 dic 2019, 13:50:53, CET...fine: mar 17 dic 2019, 13:52:04, CET --> ci ho messo 71 secondi!
Inizio: mar 17 dic 2019, 13:52:04, CET...fine: mar 17 dic 2019, 13:53:30, CET --> ci ho messo 86 secondi!
Inizio: mar 17 dic 2019, 13:53:30, CET...fine: mar 17 dic 2019, 13:54:41, CET --> ci ho messo 71 secondi!
Inizio: mar 17 dic 2019, 13:54:41, CET...fine: mar 17 dic 2019, 13:56:07, CET --> ci ho messo 86 secondi!
Inizio: mar 17 dic 2019, 13:56:07, CET...fine: mar 17 dic 2019, 13:57:18, CET --> ci ho messo 71 secondi!
Inizio: mar 17 dic 2019, 13:57:18, CET...fine: mar 17 dic 2019, 13:58:46, CET --> ci ho messo 88 secondi!
Inizio: mar 17 dic 2019, 13:58:46, CET...fine: mar 17 dic 2019, 13:59:57, CET --> ci ho messo 71 secondi!
Inizio: mar 17 dic 2019, 13:59:57, CET...fine: mar 17 dic 2019, 14:01:23, CET --> ci ho messo 86 secondi!
Inizio: mar 17 dic 2019, 14:01:23, CET...fine: mar 17 dic 2019, 14:02:34, CET --> ci ho messo 71 secondi!
Inizio: mar 17 dic 2019, 14:02:34, CET...fine: mar 17 dic 2019, 14:04:00, CET --> ci ho messo 86 secondi!
Inizio: mar 17 dic 2019, 14:04:00, CET...fine: mar 17 dic 2019, 14:05:11, CET --> ci ho messo 71 secondi!
Inizio: mar 17 dic 2019, 14:05:11, CET...fine: mar 17 dic 2019, 14:06:38, CET --> ci ho messo 87 secondi!
Inizio: mar 17 dic 2019, 14:06:38, CET...fine: mar 17 dic 2019, 14:07:49, CET --> ci ho messo 71 secondi!
Inizio: mar 17 dic 2019, 14:07:49, CET...fine: mar 17 dic 2019, 14:09:15, CET --> ci ho messo 86 secondi!
Inizio: mar 17 dic 2019, 14:09:15, CET...fine: mar 17 dic 2019, 14:10:26, CET --> ci ho messo 71 secondi!
Inizio: mar 17 dic 2019, 14:10:26, CET...fine: mar 17 dic 2019, 14:11:52, CET --> ci ho messo 86 secondi!
Inizio: mar 17 dic 2019, 14:11:52, CET...fine: mar 17 dic 2019, 14:13:03, CET --> ci ho messo 71 secondi!
Inizio: mar 17 dic 2019, 14:13:03, CET...fine: mar 17 dic 2019, 14:14:29, CET --> ci ho messo 86 secondi!
Inizio: mar 17 dic 2019, 14:14:29, CET...fine: mar 17 dic 2019, 14:15:41, CET --> ci ho messo 72 secondi!
Inizio: mar 17 dic 2019, 14:15:41, CET...fine: mar 17 dic 2019, 14:17:07, CET --> ci ho messo 86 secondi!
Inizio: mar 17 dic 2019, 14:17:07, CET...fine: mar 17 dic 2019, 14:18:18, CET --> ci ho messo 71 secondi!
Inizio: mar 17 dic 2019, 14:18:18, CET...fine: mar 17 dic 2019, 14:19:44, CET --> ci ho messo 86 secondi!
Inizio: mar 17 dic 2019, 14:19:44, CET...fine: mar 17 dic 2019, 14:20:55, CET --> ci ho messo 71 secondi!
Inizio: mar 17 dic 2019, 14:20:55, CET...fine: mar 17 dic 2019, 14:22:21, CET --> ci ho messo 86 secondi!
Inizio: mar 17 dic 2019, 14:22:21, CET...fine: mar 17 dic 2019, 14:23:33, CET --> ci ho messo 72 secondi!
Inizio: mar 17 dic 2019, 14:23:33, CET...fine: mar 17 dic 2019, 14:24:59, CET --> ci ho messo 86 secondi!
Inizio: mar 17 dic 2019, 14:24:59, CET...fine: mar 17 dic 2019, 14:26:10, CET --> ci ho messo 71 secondi!
Inizio: mar 17 dic 2019, 14:26:10, CET...fine: mar 17 dic 2019, 14:27:36, CET --> ci ho messo 86 secondi!
Inizio: mar 17 dic 2019, 14:27:36, CET...fine: mar 17 dic 2019, 14:28:46, CET --> ci ho messo 70 secondi!
Inizio: mar 17 dic 2019, 14:28:46, CET...fine: mar 17 dic 2019, 14:30:12, CET --> ci ho messo 85 secondi!
Inizio: mar 17 dic 2019, 14:30:12, CET...fine: mar 17 dic 2019, 14:31:23, CET --> ci ho messo 71 secondi!
Inizio: mar 17 dic 2019, 14:31:23, CET...fine: mar 17 dic 2019, 14:32:49, CET --> ci ho messo 86 secondi!
Inizio: mar 17 dic 2019, 14:32:49, CET...fine: mar 17 dic 2019, 14:34:01, CET --> ci ho messo 72 secondi!
Inizio: mar 17 dic 2019, 14:34:01, CET...fine: mar 17 dic 2019, 14:35:27, CET --> ci ho messo 86 secondi!
Inizio: mar 17 dic 2019, 14:35:27, CET...fine: mar 17 dic 2019, 14:36:38, CET --> ci ho messo 71 secondi!
Inizio: mar 17 dic 2019, 14:36:38, CET...fine: mar 17 dic 2019, 14:38:04, CET --> ci ho messo 86 secondi!
Inizio: mar 17 dic 2019, 14:38:04, CET...fine: mar 17 dic 2019, 14:39:15, CET --> ci ho messo 71 secondi!
Inizio: mar 17 dic 2019, 14:39:15, CET...fine: mar 17 dic 2019, 14:40:41, CET --> ci ho messo 86 secondi!
Inizio: mar 17 dic 2019, 14:40:41, CET...fine: mar 17 dic 2019, 14:41:52, CET --> ci ho messo 71 secondi!
Inizio: mar 17 dic 2019, 14:41:52, CET...fine: mar 17 dic 2019, 14:43:18, CET --> ci ho messo 86 secondi!
Inizio: mar 17 dic 2019, 14:43:18, CET...fine: mar 17 dic 2019, 14:44:29, CET --> ci ho messo 71 secondi!
Inizio: mar 17 dic 2019, 14:44:29, CET...fine: mar 17 dic 2019, 14:45:55, CET --> ci ho messo 86 secondi!
Inizio: mar 17 dic 2019, 14:45:55, CET...fine: mar 17 dic 2019, 14:47:06, CET --> ci ho messo 71 secondi!
Inizio: mar 17 dic 2019, 14:47:06, CET...fine: mar 17 dic 2019, 14:48:32, CET --> ci ho messo 86 secondi!
Inizio: mar 17 dic 2019, 14:48:32, CET...fine: mar 17 dic 2019, 14:49:43, CET --> ci ho messo 71 secondi!
Inizio: mar 17 dic 2019, 14:49:43, CET...fine: mar 17 dic 2019, 14:51:10, CET --> ci ho messo 87 secondi!
Inizio: mar 17 dic 2019, 14:51:10, CET...fine: mar 17 dic 2019, 14:52:23, CET --> ci ho messo 73 secondi!
Inizio: mar 17 dic 2019, 14:52:23, CET...fine: mar 17 dic 2019, 14:53:49, CET --> ci ho messo 86 secondi!
Inizio: mar 17 dic 2019, 14:53:49, CET...fine: mar 17 dic 2019, 14:55:00, CET --> ci ho messo 71 secondi!
Inizio: mar 17 dic 2019, 14:55:00, CET...fine: mar 17 dic 2019, 14:56:26, CET --> ci ho messo 86 secondi!
Inizio: mar 17 dic 2019, 14:56:26, CET...fine: mar 17 dic 2019, 14:57:37, CET --> ci ho messo 71 secondi!
Inizio: mar 17 dic 2019, 14:57:37, CET...fine: mar 17 dic 2019, 14:59:03, CET --> ci ho messo 86 secondi!
Inizio: mar 17 dic 2019, 14:59:03, CET...fine: mar 17 dic 2019, 15:00:14, CET --> ci ho messo 71 secondi!
Inizio: mar 17 dic 2019, 15:00:14, CET...fine: mar 17 dic 2019, 15:01:40, CET --> ci ho messo 86 secondi!
Inizio: mar 17 dic 2019, 15:01:40, CET...fine: mar 17 dic 2019, 15:02:51, CET --> ci ho messo 71 secondi!
Inizio: mar 17 dic 2019, 15:02:51, CET...fine: mar 17 dic 2019, 15:04:17, CET --> ci ho messo 86 secondi!
Inizio: mar 17 dic 2019, 15:04:17, CET...fine: mar 17 dic 2019, 15:05:28, CET --> ci ho messo 71 secondi!
Inizio: mar 17 dic 2019, 15:05:28, CET...fine: mar 17 dic 2019, 15:06:53, CET --> ci ho messo 85 secondi!
Inizio: mar 17 dic 2019, 15:06:53, CET...fine: mar 17 dic 2019, 15:08:04, CET --> ci ho messo 70 secondi!
Inizio: mar 17 dic 2019, 15:08:04, CET...fine: mar 17 dic 2019, 15:09:30, CET --> ci ho messo 86 secondi!
Inizio: mar 17 dic 2019, 15:09:30, CET...fine: mar 17 dic 2019, 15:10:41, CET --> ci ho messo 71 secondi!
Inizio: mar 17 dic 2019, 15:10:41, CET...fine: mar 17 dic 2019, 15:12:07, CET --> ci ho messo 86 secondi!
Inizio: mar 17 dic 2019, 15:12:07, CET...fine: mar 17 dic 2019, 15:13:18, CET --> ci ho messo 71 secondi!
Inizio: mar 17 dic 2019, 15:13:18, CET...fine: mar 17 dic 2019, 15:14:44, CET --> ci ho messo 86 secondi!
Inizio: mar 17 dic 2019, 15:14:44, CET...fine: mar 17 dic 2019, 15:15:55, CET --> ci ho messo 71 secondi!
Inizio: mar 17 dic 2019, 15:15:55, CET...fine: mar 17 dic 2019, 15:17:21, CET --> ci ho messo 86 secondi!
Inizio: mar 17 dic 2019, 15:17:21, CET...fine: mar 17 dic 2019, 15:18:32, CET --> ci ho messo 71 secondi!
Inizio: mar 17 dic 2019, 15:18:32, CET...fine: mar 17 dic 2019, 15:19:58, CET --> ci ho messo 86 secondi!
Inizio: mar 17 dic 2019, 15:19:58, CET...fine: mar 17 dic 2019, 15:21:08, CET --> ci ho messo 70 secondi!
Inizio: mar 17 dic 2019, 15:21:08, CET...fine: mar 17 dic 2019, 15:22:34, CET --> ci ho messo 86 secondi!
Inizio: mar 17 dic 2019, 15:22:34, CET...fine: mar 17 dic 2019, 15:23:45, CET --> ci ho messo 71 secondi!
Inizio: mar 17 dic 2019, 15:23:45, CET...fine: mar 17 dic 2019, 15:25:11, CET --> ci ho messo 86 secondi!
Inizio: mar 17 dic 2019, 15:25:11, CET...fine: mar 17 dic 2019, 15:26:22, CET --> ci ho messo 71 secondi!
Inizio: mar 17 dic 2019, 15:26:22, CET...fine: mar 17 dic 2019, 15:27:48, CET --> ci ho messo 86 secondi!
Inizio: mar 17 dic 2019, 15:27:48, CET...fine: mar 17 dic 2019, 15:28:59, CET --> ci ho messo 71 secondi!
Inizio: mar 17 dic 2019, 15:28:59, CET...fine: mar 17 dic 2019, 15:30:25, CET --> ci ho messo 86 secondi!
Inizio: mar 17 dic 2019, 15:30:25, CET...fine: mar 17 dic 2019, 15:31:35, CET --> ci ho messo 70 secondi!
Inizio: mar 17 dic 2019, 15:31:35, CET...fine: mar 17 dic 2019, 15:33:03, CET --> ci ho messo 87 secondi!
Inizio: mar 17 dic 2019, 15:33:03, CET...fine: mar 17 dic 2019, 15:34:14, CET --> ci ho messo 71 secondi!
Inizio: mar 17 dic 2019, 15:34:14, CET...fine: mar 17 dic 2019, 15:35:40, CET --> ci ho messo 86 secondi!
Inizio: mar 17 dic 2019, 15:35:40, CET...fine: mar 17 dic 2019, 15:36:51, CET --> ci ho messo 71 secondi!
Inizio: mar 17 dic 2019, 15:36:51, CET...fine: mar 17 dic 2019, 15:38:17, CET --> ci ho messo 86 secondi!
Inizio: mar 17 dic 2019, 15:38:17, CET...fine: mar 17 dic 2019, 15:39:28, CET --> ci ho messo 71 secondi!
Inizio: mar 17 dic 2019, 15:39:28, CET...fine: mar 17 dic 2019, 15:40:54, CET --> ci ho messo 86 secondi!
Inizio: mar 17 dic 2019, 15:40:54, CET...fine: mar 17 dic 2019, 15:42:05, CET --> ci ho messo 71 secondi!

[-- Attachment #3: test_xfs_20191217.txt --]
[-- Type: text/plain, Size: 11206 bytes --]

*** TEST XFS: *** -> test_xfs_20191217.txt

Starting 100 tries with:
Linux angus.unipv.it 5.4.0+ #1 SMP Mon Nov 25 11:31:34 CET 2019 x86_64 x86_64 x86_64 GNU/Linux
-rw-r--r--. 1 root root 1,0G 25 nov 13.29 /NoBackup/testfile
/dev/sda1: LABEL="Fedora30" UUID="a7ca2491-c807-4b10-b33f-ef425699148d" TYPE="ext4" PARTUUID="8b16fbdd-01"
/dev/sda2: LABEL="Swap_4GB" UUID="ba020b1e-4cdc-4f94-b92c-bdc11613388d" TYPE="swap" PARTUUID="8b16fbdd-02"
/dev/sdf1: UUID="eb5a4791-5b26-44b6-871e-efd464a3adc5" TYPE="xfs" PARTUUID="09066b88-01"
cat /sys/block/sdf/queue/scheduler --> [mq-deadline] none
Inizio: mar 17 dic 2019, 23:58:22, CET...fine: mar 17 dic 2019, 23:59:28, CET --> ci ho messo 64 secondi!
Inizio: mar 17 dic 2019, 23:59:28, CET...fine: mer 18 dic 2019, 00:00:33, CET --> ci ho messo 63 secondi!
Inizio: mer 18 dic 2019, 00:00:33, CET...fine: mer 18 dic 2019, 00:01:39, CET --> ci ho messo 64 secondi!
Inizio: mer 18 dic 2019, 00:01:39, CET...fine: mer 18 dic 2019, 00:06:35, CET --> ci ho messo 294 secondi!
Inizio: mer 18 dic 2019, 00:06:35, CET...fine: mer 18 dic 2019, 00:07:41, CET --> ci ho messo 64 secondi!
Inizio: mer 18 dic 2019, 00:07:41, CET...fine: mer 18 dic 2019, 00:08:46, CET --> ci ho messo 63 secondi!
Inizio: mer 18 dic 2019, 00:08:46, CET...fine: mer 18 dic 2019, 00:09:52, CET --> ci ho messo 64 secondi!
Inizio: mer 18 dic 2019, 00:09:52, CET...fine: mer 18 dic 2019, 00:10:57, CET --> ci ho messo 63 secondi!
Inizio: mer 18 dic 2019, 00:10:57, CET...fine: mer 18 dic 2019, 00:12:03, CET --> ci ho messo 65 secondi!
Inizio: mer 18 dic 2019, 00:12:03, CET...fine: mer 18 dic 2019, 00:13:08, CET --> ci ho messo 63 secondi!
Inizio: mer 18 dic 2019, 00:13:08, CET...fine: mer 18 dic 2019, 00:14:14, CET --> ci ho messo 65 secondi!
Inizio: mer 18 dic 2019, 00:14:14, CET...fine: mer 18 dic 2019, 00:15:19, CET --> ci ho messo 64 secondi!
Inizio: mer 18 dic 2019, 00:15:19, CET...fine: mer 18 dic 2019, 00:21:39, CET --> ci ho messo 379 secondi!
Inizio: mer 18 dic 2019, 00:21:39, CET...fine: mer 18 dic 2019, 00:22:44, CET --> ci ho messo 64 secondi!
Inizio: mer 18 dic 2019, 00:22:44, CET...fine: mer 18 dic 2019, 00:23:50, CET --> ci ho messo 65 secondi!
Inizio: mer 18 dic 2019, 00:23:50, CET...fine: mer 18 dic 2019, 00:29:16, CET --> ci ho messo 325 secondi!
Inizio: mer 18 dic 2019, 00:29:16, CET...fine: mer 18 dic 2019, 00:30:22, CET --> ci ho messo 64 secondi!
Inizio: mer 18 dic 2019, 00:30:22, CET...fine: mer 18 dic 2019, 00:34:50, CET --> ci ho messo 266 secondi!
Inizio: mer 18 dic 2019, 00:34:50, CET...fine: mer 18 dic 2019, 00:35:56, CET --> ci ho messo 65 secondi!
Inizio: mer 18 dic 2019, 00:35:56, CET...fine: mer 18 dic 2019, 00:37:01, CET --> ci ho messo 64 secondi!
Inizio: mer 18 dic 2019, 00:37:01, CET...fine: mer 18 dic 2019, 00:43:39, CET --> ci ho messo 397 secondi!
Inizio: mer 18 dic 2019, 00:43:39, CET...fine: mer 18 dic 2019, 00:48:31, CET --> ci ho messo 291 secondi!
Inizio: mer 18 dic 2019, 00:48:31, CET...fine: mer 18 dic 2019, 00:49:37, CET --> ci ho messo 65 secondi!
Inizio: mer 18 dic 2019, 00:49:37, CET...fine: mer 18 dic 2019, 00:50:42, CET --> ci ho messo 64 secondi!
Inizio: mer 18 dic 2019, 00:50:42, CET...fine: mer 18 dic 2019, 00:55:39, CET --> ci ho messo 296 secondi!
Inizio: mer 18 dic 2019, 00:55:39, CET...fine: mer 18 dic 2019, 00:56:44, CET --> ci ho messo 64 secondi!
Inizio: mer 18 dic 2019, 00:56:44, CET...fine: mer 18 dic 2019, 00:57:50, CET --> ci ho messo 65 secondi!
Inizio: mer 18 dic 2019, 00:57:50, CET...fine: mer 18 dic 2019, 00:58:54, CET --> ci ho messo 63 secondi!
Inizio: mer 18 dic 2019, 00:58:54, CET...fine: mer 18 dic 2019, 01:00:01, CET --> ci ho messo 65 secondi!
Inizio: mer 18 dic 2019, 01:00:01, CET...fine: mer 18 dic 2019, 01:01:05, CET --> ci ho messo 63 secondi!
Inizio: mer 18 dic 2019, 01:01:05, CET...fine: mer 18 dic 2019, 01:02:11, CET --> ci ho messo 64 secondi!
Inizio: mer 18 dic 2019, 01:02:11, CET...fine: mer 18 dic 2019, 01:03:16, CET --> ci ho messo 63 secondi!
Start: Wed 18 Dec 2019, 01:03:16, CET...end: Wed 18 Dec 2019, 01:04:22, CET --> it took me 64 seconds!
Start: Wed 18 Dec 2019, 01:04:22, CET...end: Wed 18 Dec 2019, 01:05:27, CET --> it took me 63 seconds!
Start: Wed 18 Dec 2019, 01:05:27, CET...end: Wed 18 Dec 2019, 01:11:38, CET --> it took me 369 seconds!
Start: Wed 18 Dec 2019, 01:11:38, CET...end: Wed 18 Dec 2019, 01:12:43, CET --> it took me 63 seconds!
Start: Wed 18 Dec 2019, 01:12:43, CET...end: Wed 18 Dec 2019, 01:18:20, CET --> it took me 336 seconds!
Start: Wed 18 Dec 2019, 01:18:20, CET...end: Wed 18 Dec 2019, 01:19:25, CET --> it took me 64 seconds!
Start: Wed 18 Dec 2019, 01:19:25, CET...end: Wed 18 Dec 2019, 01:21:01, CET --> it took me 95 seconds!
Start: Wed 18 Dec 2019, 01:21:01, CET...end: Wed 18 Dec 2019, 01:22:06, CET --> it took me 64 seconds!
Start: Wed 18 Dec 2019, 01:22:06, CET...end: Wed 18 Dec 2019, 01:23:12, CET --> it took me 65 seconds!
Start: Wed 18 Dec 2019, 01:23:12, CET...end: Wed 18 Dec 2019, 01:29:43, CET --> it took me 390 seconds!
Start: Wed 18 Dec 2019, 01:29:43, CET...end: Wed 18 Dec 2019, 01:30:49, CET --> it took me 64 seconds!
Start: Wed 18 Dec 2019, 01:30:49, CET...end: Wed 18 Dec 2019, 01:35:36, CET --> it took me 285 seconds!
Start: Wed 18 Dec 2019, 01:35:36, CET...end: Wed 18 Dec 2019, 01:36:44, CET --> it took me 66 seconds!
Start: Wed 18 Dec 2019, 01:36:44, CET...end: Wed 18 Dec 2019, 01:37:48, CET --> it took me 63 seconds!
Start: Wed 18 Dec 2019, 01:37:48, CET...end: Wed 18 Dec 2019, 01:38:55, CET --> it took me 65 seconds!
Start: Wed 18 Dec 2019, 01:38:55, CET...end: Wed 18 Dec 2019, 01:44:04, CET --> it took me 308 seconds!
Start: Wed 18 Dec 2019, 01:44:04, CET...end: Wed 18 Dec 2019, 01:45:10, CET --> it took me 65 seconds!
Start: Wed 18 Dec 2019, 01:45:10, CET...end: Wed 18 Dec 2019, 01:49:42, CET --> it took me 270 seconds!
Start: Wed 18 Dec 2019, 01:49:42, CET...end: Wed 18 Dec 2019, 01:50:48, CET --> it took me 64 seconds!
Start: Wed 18 Dec 2019, 01:50:48, CET...end: Wed 18 Dec 2019, 01:51:53, CET --> it took me 63 seconds!
Start: Wed 18 Dec 2019, 01:51:53, CET...end: Wed 18 Dec 2019, 01:58:18, CET --> it took me 383 seconds!
Start: Wed 18 Dec 2019, 01:58:18, CET...end: Wed 18 Dec 2019, 01:59:23, CET --> it took me 63 seconds!
Start: Wed 18 Dec 2019, 01:59:23, CET...end: Wed 18 Dec 2019, 02:00:29, CET --> it took me 65 seconds!
Start: Wed 18 Dec 2019, 02:00:29, CET...end: Wed 18 Dec 2019, 02:01:34, CET --> it took me 63 seconds!
Start: Wed 18 Dec 2019, 02:01:34, CET...end: Wed 18 Dec 2019, 02:02:40, CET --> it took me 64 seconds!
Start: Wed 18 Dec 2019, 02:02:40, CET...end: Wed 18 Dec 2019, 02:03:45, CET --> it took me 63 seconds!
Start: Wed 18 Dec 2019, 02:03:45, CET...end: Wed 18 Dec 2019, 02:04:51, CET --> it took me 64 seconds!
Start: Wed 18 Dec 2019, 02:04:51, CET...end: Wed 18 Dec 2019, 02:05:56, CET --> it took me 63 seconds!
Start: Wed 18 Dec 2019, 02:05:56, CET...end: Wed 18 Dec 2019, 02:07:02, CET --> it took me 64 seconds!
Start: Wed 18 Dec 2019, 02:07:02, CET...end: Wed 18 Dec 2019, 02:08:07, CET --> it took me 63 seconds!
Start: Wed 18 Dec 2019, 02:08:07, CET...end: Wed 18 Dec 2019, 02:09:14, CET --> it took me 65 seconds!
Start: Wed 18 Dec 2019, 02:09:14, CET...end: Wed 18 Dec 2019, 02:13:44, CET --> it took me 269 seconds!
Start: Wed 18 Dec 2019, 02:13:44, CET...end: Wed 18 Dec 2019, 02:14:51, CET --> it took me 65 seconds!
Start: Wed 18 Dec 2019, 02:14:51, CET...end: Wed 18 Dec 2019, 02:15:56, CET --> it took me 64 seconds!
Start: Wed 18 Dec 2019, 02:15:56, CET...end: Wed 18 Dec 2019, 02:17:02, CET --> it took me 65 seconds!
Start: Wed 18 Dec 2019, 02:17:02, CET...end: Wed 18 Dec 2019, 02:18:07, CET --> it took me 64 seconds!
Start: Wed 18 Dec 2019, 02:18:07, CET...end: Wed 18 Dec 2019, 02:19:13, CET --> it took me 65 seconds!
Start: Wed 18 Dec 2019, 02:19:13, CET...end: Wed 18 Dec 2019, 02:20:18, CET --> it took me 64 seconds!
Start: Wed 18 Dec 2019, 02:20:18, CET...end: Wed 18 Dec 2019, 02:21:24, CET --> it took me 65 seconds!
Start: Wed 18 Dec 2019, 02:21:24, CET...end: Wed 18 Dec 2019, 02:22:29, CET --> it took me 64 seconds!
Start: Wed 18 Dec 2019, 02:22:29, CET...end: Wed 18 Dec 2019, 02:23:35, CET --> it took me 65 seconds!
Start: Wed 18 Dec 2019, 02:23:35, CET...end: Wed 18 Dec 2019, 02:24:40, CET --> it took me 64 seconds!
Start: Wed 18 Dec 2019, 02:24:40, CET...end: Wed 18 Dec 2019, 02:25:46, CET --> it took me 65 seconds!
Start: Wed 18 Dec 2019, 02:25:46, CET...end: Wed 18 Dec 2019, 02:26:51, CET --> it took me 64 seconds!
Start: Wed 18 Dec 2019, 02:26:51, CET...end: Wed 18 Dec 2019, 02:27:57, CET --> it took me 65 seconds!
Start: Wed 18 Dec 2019, 02:27:57, CET...end: Wed 18 Dec 2019, 02:29:02, CET --> it took me 64 seconds!
Start: Wed 18 Dec 2019, 02:29:02, CET...end: Wed 18 Dec 2019, 02:30:08, CET --> it took me 65 seconds!
Start: Wed 18 Dec 2019, 02:30:08, CET...end: Wed 18 Dec 2019, 02:31:13, CET --> it took me 63 seconds!
Start: Wed 18 Dec 2019, 02:31:13, CET...end: Wed 18 Dec 2019, 02:32:19, CET --> it took me 64 seconds!
Start: Wed 18 Dec 2019, 02:32:19, CET...end: Wed 18 Dec 2019, 02:40:06, CET --> it took me 465 seconds!
Start: Wed 18 Dec 2019, 02:40:06, CET...end: Wed 18 Dec 2019, 02:41:12, CET --> it took me 64 seconds!
Start: Wed 18 Dec 2019, 02:41:12, CET...end: Wed 18 Dec 2019, 02:42:17, CET --> it took me 63 seconds!
Start: Wed 18 Dec 2019, 02:42:17, CET...end: Wed 18 Dec 2019, 02:43:23, CET --> it took me 64 seconds!
Start: Wed 18 Dec 2019, 02:43:23, CET...end: Wed 18 Dec 2019, 02:44:28, CET --> it took me 63 seconds!
Start: Wed 18 Dec 2019, 02:44:28, CET...end: Wed 18 Dec 2019, 02:45:34, CET --> it took me 64 seconds!
Start: Wed 18 Dec 2019, 02:45:34, CET...end: Wed 18 Dec 2019, 02:46:39, CET --> it took me 63 seconds!
Start: Wed 18 Dec 2019, 02:46:39, CET...end: Wed 18 Dec 2019, 02:47:45, CET --> it took me 64 seconds!
Start: Wed 18 Dec 2019, 02:47:45, CET...end: Wed 18 Dec 2019, 02:48:50, CET --> it took me 63 seconds!
Start: Wed 18 Dec 2019, 02:48:50, CET...end: Wed 18 Dec 2019, 02:54:26, CET --> it took me 334 seconds!
Start: Wed 18 Dec 2019, 02:54:26, CET...end: Wed 18 Dec 2019, 02:55:31, CET --> it took me 63 seconds!
Start: Wed 18 Dec 2019, 02:55:31, CET...end: Wed 18 Dec 2019, 02:57:11, CET --> it took me 98 seconds!
Start: Wed 18 Dec 2019, 02:57:11, CET...end: Wed 18 Dec 2019, 02:58:16, CET --> it took me 63 seconds!
Start: Wed 18 Dec 2019, 02:58:16, CET...end: Wed 18 Dec 2019, 02:59:22, CET --> it took me 64 seconds!
Start: Wed 18 Dec 2019, 02:59:22, CET...end: Wed 18 Dec 2019, 03:00:27, CET --> it took me 63 seconds!
Start: Wed 18 Dec 2019, 03:00:27, CET...end: Wed 18 Dec 2019, 03:05:55, CET --> it took me 326 seconds!
Start: Wed 18 Dec 2019, 03:05:55, CET...end: Wed 18 Dec 2019, 03:11:49, CET --> it took me 352 seconds!
Start: Wed 18 Dec 2019, 03:11:49, CET...end: Wed 18 Dec 2019, 03:12:56, CET --> it took me 66 seconds!
Start: Wed 18 Dec 2019, 03:12:56, CET...end: Wed 18 Dec 2019, 03:14:01, CET --> it took me 63 seconds!


[-- Attachment #4: trace_20191217.zip --]
[-- Type: application/zip, Size: 14385 bytes --]


* Re: AW: Slow I/O on USB media after commit f664a3cc17b7d0a2bc3b3ab96181e1029b0ec0e6
  2019-12-18  8:25                                                                                                       ` Andrea Vai
@ 2019-12-18  9:48                                                                                                         ` Ming Lei
       [not found]                                                                                                           ` <b1b6a0e9d690ecd9432025acd2db4ac09f834040.camel@unipv.it>
  0 siblings, 1 reply; 102+ messages in thread
From: Ming Lei @ 2019-12-18  9:48 UTC (permalink / raw)
  To: Andrea Vai
  Cc: Theodore Y. Ts'o, Schmid, Carsten, Finn Thain,
	Damien Le Moal, Alan Stern, Jens Axboe, Johannes Thumshirn,
	USB list, SCSI development list, Himanshu Madhani,
	Hannes Reinecke, Omar Sandoval, Martin K. Petersen, Greg KH,
	Hans Holmberg, Kernel development list, linux-ext4,
	linux-fsdevel

On Wed, Dec 18, 2019 at 09:25:02AM +0100, Andrea Vai wrote:
> On Thu, 12/12/2019 at 05.33 +0800, Ming Lei wrote:
> > On Wed, Dec 11, 2019 at 11:07:45AM -0500, Theodore Y. Ts'o wrote:
> > > On Wed, Dec 11, 2019 at 12:00:58PM +0800, Ming Lei wrote:
> > > > I couldn't reproduce the issue in my test environment when following
> > > > Andrea's test commands[1]:
> > > > 
> > > >   mount UUID=$uuid /mnt/pendrive 2>&1 |tee -a $logfile
> > > >   SECONDS=0
> > > >   cp $testfile /mnt/pendrive 2>&1 |tee -a $logfile
> > > >   umount /mnt/pendrive 2>&1 |tee -a $logfile
> > > > 
> > > > The 'cp' command is supposed to open/close the file just once; however,
> > > > ext4_release_file() & write pages are observed to run 4358 times
> > > > when executing the above 'cp' test.
> > > 
> > > Why are we sure the ext4_release_file() / _fput() is coming from the
> > > cp command, as opposed to something else that might be running on the
> > > system under test?  _fput() is called by the kernel when the last
> > 
> > Please see the log:
> > 
> > https://lore.kernel.org/linux-scsi/3af3666920e7d46f8f0c6d88612f143ffabc743c.camel@unipv.it/2-log_ming.zip
> > 
> > Which is collected by:
> > 
> > #!/bin/sh
> > MAJ=$1
> > MIN=$2
> > MAJ=$(( $MAJ << 20 ))
> > DEV=$(( $MAJ | $MIN ))
> > 
> > /usr/share/bcc/tools/trace -t -C \
> >     't:block:block_rq_issue (args->dev == '$DEV') "%s %d %d", args->rwbs, args->sector, args->nr_sector' \
> >     't:block:block_rq_insert (args->dev == '$DEV') "%s %d %d", args->rwbs, args->sector, args->nr_sector'
> > 
> > $MAJ:$MIN points to the USB storage disk.
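
For reference, the device number computed by the script above can be
sketched as follows.  This assumes the kernel-internal dev_t encoding
(minor in the low 20 bits, major in the bits above), which is what the
script's "<< 20" relies on.  The helper names dev_number and
dev_number_for are hypothetical, and reading /sys/block/<name>/dev is
only one way to obtain the major:minor pair.

```shell
#!/bin/sh
# Hypothetical helpers, not part of the original script.

# Combine major and minor into the kernel-internal dev_t value
# (minor occupies the low 20 bits, major the bits above).
dev_number() {
    echo $(( ($1 << 20) | $2 ))
}

# Look up "major:minor" for a block device name such as sdf
# from sysfs and print the combined value.
dev_number_for() {
    IFS=: read -r maj min < "/sys/block/$1/dev" || return 1
    dev_number "$maj" "$min"
}
```

For example, "dev_number 8 80" prints 8388688, the value the trace
filter would compare args->dev against for a disk at 8:80.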
> > 
> > From the above IO trace, there are two write paths: one is from cp,
> > the other is from the writeback wq.
> > 
> > The stackcount trace[1] is consistent with the IO trace log since it
> > only shows two IO paths, which is why I concluded that the write done
> > via ext4_release_file() is from 'cp'.
> > 
> > [1] 
> > https://lore.kernel.org/linux-scsi/320b315b9c87543d4fb919ecbdf841596c8fbcea.camel@unipv.it/2-log_ming_20191129_150609.zip
> > 
> > > reference to a struct file is released.  (Specifically, if you have a
> > > fd which is dup'ed, it's only when the last fd corresponding to the
> > > struct file is closed, and the struct file is about to be released,
> > > does the file system's f_ops->release function get called.)
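
The userspace-visible half of that dup behaviour can be sketched in
shell.  This is only an illustration: the function name dup_demo and
the choice of fds 3 and 4 are arbitrary, and the ->release call itself
is not observable from here.  What it shows is that closing one of two
dup'ed descriptors leaves the underlying open file usable, with a
shared file offset.

```shell
#!/bin/sh
# Hypothetical demo: two descriptors referring to one open file.
dup_demo() {
    f=$1
    exec 3>"$f"         # open the file once on fd 3
    exec 4>&3           # dup fd 3 onto fd 4 (same open file)
    echo "first" >&3
    exec 3>&-           # close the original descriptor
    echo "second" >&4   # the dup still works and continues at the same offset
    exec 4>&-           # last descriptor closed; the open file can now go away
}
```

After "dup_demo /tmp/somefile" the file holds both lines in order,
which would not be the case if closing fd 3 had torn down the open
file.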
> > > 
> > > So the first question I'd ask is whether there is anything else going
> > > on the system, and whether the writes are happening to the USB thumb
> > > drive, or to some other storage device.  And if there is something
> > > else which is writing to the pendrive, maybe that's why no one else
> > > has been able to reproduce the OP's complaint....
> > 
> > OK, we can ask Andrea to confirm that via the following trace, which
> > will add pid/comm info in the stack trace:
> > 
> > /usr/share/bcc/tools/stackcount blk_mq_sched_request_inserted
> > 
> > Andrea, could you collect the above log again when running the new/bad
> > kernel, to confirm whether the write done by ext4_release_file() is
> > from the 'cp' process?
> 
> You can find the stackcount log attached. It has been produced by:
> 
> - /usr/share/bcc/tools/stackcount blk_mq_sched_request_inserted > trace.log
> - wait some seconds
> - run the test (1 copy trial), wait for the test to finish, wait some seconds
> - stop the trace (ctrl+C)

Thanks for collecting the log.  It looks like your 'stackcount' output
doesn't include comm/pid info; there seems to be a difference between
your bcc and my bcc in Fedora 30.

Could you collect the above log again via the following command?

/usr/share/bcc/tools/stackcount -P -K t:block:block_rq_insert

which will show the comm/pid info.

Sorry for not noticing the bcc difference.

> 
> The test took 1994 seconds to complete.
> 
> I also tried the usual test with btrfs and xfs. Btrfs behavior looks
> "good". xfs seems sometimes better, sometimes worse, I would say. I
> don't know if it matters, anyway you can also find the results of the
> two tests (100 trials each). Basically, btrfs is always between 68 and
> 89 seconds, with a cyclicity (?) with "period=2 trials". xfs looks
> almost always very good (63-65s), but sometimes "bad" (>300s).
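
The ranges quoted above can be checked mechanically.  A minimal sketch,
assuming each trial line ends with "--> ... N seconds!" (or the Italian
"N secondi!") as in the attached logs; the function name
summarize_durations and the default 300 s threshold for a "bad" trial
are assumptions made for illustration.

```shell
#!/bin/sh
# Hypothetical helper: summarize per-trial durations from a test log.
# Reads the log on stdin; $1 is the "slow" threshold in seconds.
summarize_durations() {
    threshold=${1:-300}
    awk -v thr="$threshold" '
        {
            seen = 0
            for (i = 1; i <= NF; i++) {
                if ($i == "-->") { seen = 1; continue }
                # first all-digit field after "-->" is the duration
                if (seen && $i ~ /^[0-9]+$/) {
                    n++; d = $i + 0
                    if (n == 1 || d < min) min = d
                    if (n == 1 || d > max) max = d
                    if (d > thr) slow++
                    break
                }
            }
        }
        END {
            if (n == 0) { print "trials=0"; exit }
            printf "trials=%d min=%d max=%d slow(>%ds)=%d\n", n, min, max, thr, slow
        }'
}
```

Usage would be something like
"summarize_durations 300 < test_xfs_20191217.txt".  Lines without a
number after "-->" (such as the scheduler line) are ignored.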

If you are interested in digging into this one, the following trace
should be helpful:

https://lore.kernel.org/linux-scsi/f38db337cf26390f7c7488a0bc2076633737775b.camel@unipv.it/T/#m5aa008626e07913172ad40e1eb8e5f2ffd560fc6


Thanks,
Ming

> 
> Thanks,
> Andrea




-- 
Ming


^ permalink raw reply	[flat|nested] 102+ messages in thread

* Re: AW: Slow I/O on USB media after commit f664a3cc17b7d0a2bc3b3ab96181e1029b0ec0e6
       [not found]                                                                                                           ` <b1b6a0e9d690ecd9432025acd2db4ac09f834040.camel@unipv.it>
@ 2019-12-23 13:08                                                                                                             ` Ming Lei
  2019-12-23 14:02                                                                                                               ` Andrea Vai
  2019-12-23 16:26                                                                                                               ` Theodore Y. Ts'o
  0 siblings, 2 replies; 102+ messages in thread
From: Ming Lei @ 2019-12-23 13:08 UTC (permalink / raw)
  To: Andrea Vai, Theodore Y. Ts'o
  Cc: Schmid, Carsten, Finn Thain, Damien Le Moal, Alan Stern,
	Jens Axboe, Johannes Thumshirn, USB list, SCSI development list,
	Himanshu Madhani, Hannes Reinecke, Omar Sandoval,
	Martin K. Petersen, Greg KH, Hans Holmberg,
	Kernel development list, linux-ext4, linux-fsdevel

On Mon, Dec 23, 2019 at 12:22:45PM +0100, Andrea Vai wrote:
> On Wed, 18/12/2019 at 17.48 +0800, Ming Lei wrote:
> > On Wed, Dec 18, 2019 at 09:25:02AM +0100, Andrea Vai wrote:
> > > On Thu, 12/12/2019 at 05.33 +0800, Ming Lei wrote:
> > > > On Wed, Dec 11, 2019 at 11:07:45AM -0500, Theodore Y. Ts'o
> > wrote:
> > > > > On Wed, Dec 11, 2019 at 12:00:58PM +0800, Ming Lei wrote:
> > > > > > I didn't reproduce the issue in my test environment, and
> > follows
> > > > > > Andrea's test commands[1]:
> > > > > > 
> > > > > >   mount UUID=$uuid /mnt/pendrive 2>&1 |tee -a $logfile
> > > > > >   SECONDS=0
> > > > > >   cp $testfile /mnt/pendrive 2>&1 |tee -a $logfile
> > > > > >   umount /mnt/pendrive 2>&1 |tee -a $logfile
> > > > > > 
> > > > > > The 'cp' command supposes to open/close the file just once,
> > > > however
> > > > > > ext4_release_file() & write pages is observed to run for
> > 4358
> > > > times
> > > > > > when executing the above 'cp' test.
> > > > > 
> > > > > Why are we sure the ext4_release_file() / _fput() is coming
> > from
> > > > the
> > > > > cp command, as opposed to something else that might be running
> > on
> > > > the
> > > > > system under test?  _fput() is called by the kernel when the
> > last
> > > > 
> > > > Please see the log:
> > > > 
> > > > 
> > https://lore.kernel.org/linux-scsi/3af3666920e7d46f8f0c6d88612f143ffabc743c.camel@unipv.it/2-log_ming.zip
> > > > 
> > > > Which is collected by:
> > > > 
> > > > #!/bin/sh
> > > > MAJ=$1
> > > > MIN=$2
> > > > MAJ=$(( $MAJ << 20 ))
> > > > DEV=$(( $MAJ | $MIN ))
> > > > 
> > > > /usr/share/bcc/tools/trace -t -C \
> > > >     't:block:block_rq_issue (args->dev == '$DEV') "%s %d %d",
> > args-
> > > > >rwbs, args->sector, args->nr_sector' \
> > > >     't:block:block_rq_insert (args->dev == '$DEV') "%s %d %d",
> > args-
> > > > >rwbs, args->sector, args->nr_sector'
> > > > 
> > > > $MAJ:$MIN points to the USB storage disk.
> > > > 
> > > > From the above IO trace, there are two write paths, one is from
> > cp,
> > > > another is from writeback wq.
> > > > 
> > > > The stackcount trace[1] is consistent with the IO trace log
> > since it
> > > > only shows two IO paths, that is why I concluded that the write
> > done
> > > > via
> > > > ext4_release_file() is from 'cp'.
> > > > 
> > > > [1] 
> > > > 
> > https://lore.kernel.org/linux-scsi/320b315b9c87543d4fb919ecbdf841596c8fbcea.camel@unipv.it/2-log_ming_20191129_150609.zip
> > > > 
> > > > > reference to a struct file is released.  (Specifically, if you
> > > > have a
> > > > > fd which is dup'ed, it's only when the last fd corresponding
> > to
> > > > the
> > > > > struct file is closed, and the struct file is about to be
> > > > released,
> > > > > does the file system's f_ops->release function get called.)
> > > > > 
> > > > > So the first question I'd ask is whether there is anything
> > else
> > > > going
> > > > > on the system, and whether the writes are happening to the USB
> > > > thumb
> > > > > drive, or to some other storage device.  And if there is
> > something
> > > > > else which is writing to the pendrive, maybe that's why no one
> > > > else
> > > > > has been able to reproduce the OP's complaint....
> > > > 
> > > > OK, we can ask Andrea to confirm that via the following trace,
> > which
> > > > will add pid/comm info in the stack trace:
> > > > 
> > > > /usr/share/bcc/tools/stackcount blk_mq_sched_request_inserted
> > > > 
> > > > Andrew, could you collect the above log again when running
> > new/bad
> > > > kernel for confirming if the write done by ext4_release_file()
> > is
> > > > from
> > > > the 'cp' process?
> > > 
> > > You can find the stackcount log attached. It has been produced by:
> > > 
> > > - /usr/share/bcc/tools/stackcount blk_mq_sched_request_inserted >
> > trace.log
> > > - wait some seconds
> > > - run the test (1 copy trial), wait for the test to finish, wait
> > some seconds
> > > - stop the trace (ctrl+C)
> > 
> > Thanks for collecting the log, looks your 'stackcount' doesn't
> > include
> > comm/pid info, seems there is difference between your bcc and
> > my bcc in fedora 30.
> > 
> > Could you collect above log again via the following command?
> > 
> > /usr/share/bcc/tools/stackcount -P -K t:block:block_rq_insert
> > 
> > which will show the comm/pid info.
> 
> ok, attached (trace_20191219.txt), the test (1 trial) took 3684
> seconds.

From the above trace:

  b'blk_mq_sched_request_inserted'
  b'blk_mq_sched_request_inserted'
  b'dd_insert_requests'
  b'blk_mq_sched_insert_requests'
  b'blk_mq_flush_plug_list'
  b'blk_flush_plug_list'
  b'io_schedule_prepare'
  b'io_schedule'
  b'rq_qos_wait'
  b'wbt_wait'
  b'__rq_qos_throttle'
  b'blk_mq_make_request'
  b'generic_make_request'
  b'submit_bio'
  b'ext4_io_submit'
  b'ext4_writepages'
  b'do_writepages'
  b'__filemap_fdatawrite_range'
  b'ext4_release_file'
  b'__fput'
  b'task_work_run'
  b'exit_to_usermode_loop'
  b'do_syscall_64'
  b'entry_SYSCALL_64_after_hwframe'
    b'cp' [19863]
    4400

So this write is clearly from the 'cp' process, and it should be an
ext4 fs issue.

Ted, can you take a look at this issue?

> 
> > > I also tried the usual test with btrfs and xfs. Btrfs behavior
> > looks
> > > "good". xfs seems sometimes better, sometimes worse, I would say.
> > I
> > > don't know if it matters, anyway you can also find the results of
> > the
> > > two tests (100 trials each). Basically, btrfs is always between 68
> > and
> > > 89 seconds, with a cyclicity (?) with "period=2 trials". xfs looks
> > > almost always very good (63-65s), but sometimes "bad" (>300s).
> > 
> > If you are interested in digging into this one, the following trace
> > should be helpful:
> > 
> > https://lore.kernel.org/linux-scsi/f38db337cf26390f7c7488a0bc2076633737775b.camel@unipv.it/T/#m5aa008626e07913172ad40e1eb8e5f2ffd560fc6
> > 
> 
> Attached:
> - trace_xfs_20191223.txt (7 trials, then aborted while doing the 8th),
> times to complete:
> 64s
> 63s
> 64s
> 833s
> 1105s
> 63s
> 64s

Oops, it looks like we have to collect the IO insert trace on xfs with the
following bcc script, to confirm whether xfs has an issue similar to the one
on ext4. Could you run it again on xfs? And please post only the trace taken
during a slow 'cp'.


#!/bin/sh

MAJ=$1
MIN=$2
MAJ=$(( $MAJ << 20 ))
DEV=$(( $MAJ | $MIN ))

/usr/share/bcc/tools/trace -t -C \
    't:block:block_rq_issue (args->dev == '$DEV') "%s %d %d", args->rwbs, args->sector, args->nr_sector' \
    't:block:block_rq_insert (args->dev == '$DEV') "%s %d %d", args->rwbs, args->sector, args->nr_sector'
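
For reference, the DEV value above is the kernel-internal device number:
the major number sits in the bits above the low 20, the minor number in the
low 20 bits. A minimal sketch of the same arithmetic as a helper; the name
kernel_dev and the 8:16 example pair are only illustrative assumptions, and
the real pair for the pendrive can be read from the MAJ:MIN column of lsblk:

```shell
#!/bin/sh
# kernel_dev MAJOR MINOR
# Print the number that the script above compares against args->dev:
# major shifted left by 20 bits, OR-ed with the minor.
kernel_dev() {
    echo $(( ($1 << 20) | $2 ))
}

# Example with the hypothetical pair 8:16:
kernel_dev 8 16    # prints 8388624
```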


Thanks,
Ming


^ permalink raw reply	[flat|nested] 102+ messages in thread

* Re: AW: Slow I/O on USB media after commit f664a3cc17b7d0a2bc3b3ab96181e1029b0ec0e6
  2019-12-23 13:08                                                                                                             ` Ming Lei
@ 2019-12-23 14:02                                                                                                               ` Andrea Vai
  2019-12-24  1:32                                                                                                                 ` Ming Lei
  2019-12-23 16:26                                                                                                               ` Theodore Y. Ts'o
  1 sibling, 1 reply; 102+ messages in thread
From: Andrea Vai @ 2019-12-23 14:02 UTC (permalink / raw)
  To: Ming Lei, Theodore Y. Ts'o
  Cc: Schmid, Carsten, Finn Thain, Damien Le Moal, Alan Stern,
	Jens Axboe, Johannes Thumshirn, USB list, SCSI development list,
	Himanshu Madhani, Hannes Reinecke, Omar Sandoval,
	Martin K. Petersen, Greg KH, Hans Holmberg,
	Kernel development list, linux-ext4, linux-fsdevel

[-- Attachment #1: Type: text/plain, Size: 7172 bytes --]

On Mon, 23/12/2019 at 21.08 +0800, Ming Lei wrote:
> On Mon, Dec 23, 2019 at 12:22:45PM +0100, Andrea Vai wrote:
> > On Wed, 18/12/2019 at 17.48 +0800, Ming Lei wrote:
> > > On Wed, Dec 18, 2019 at 09:25:02AM +0100, Andrea Vai wrote:
> > > > On Thu, 12/12/2019 at 05.33 +0800, Ming Lei wrote:
> > > > > On Wed, Dec 11, 2019 at 11:07:45AM -0500, Theodore Y. Ts'o
> > > wrote:
> > > > > > On Wed, Dec 11, 2019 at 12:00:58PM +0800, Ming Lei wrote:
> > > > > > > I didn't reproduce the issue in my test environment, and
> > > follows
> > > > > > > Andrea's test commands[1]:
> > > > > > > 
> > > > > > >   mount UUID=$uuid /mnt/pendrive 2>&1 |tee -a $logfile
> > > > > > >   SECONDS=0
> > > > > > >   cp $testfile /mnt/pendrive 2>&1 |tee -a $logfile
> > > > > > >   umount /mnt/pendrive 2>&1 |tee -a $logfile
> > > > > > > 
> > > > > > > The 'cp' command supposes to open/close the file just
> once,
> > > > > however
> > > > > > > ext4_release_file() & write pages is observed to run for
> > > 4358
> > > > > times
> > > > > > > when executing the above 'cp' test.
> > > > > > 
> > > > > > Why are we sure the ext4_release_file() / _fput() is
> coming
> > > from
> > > > > the
> > > > > > cp command, as opposed to something else that might be
> running
> > > on
> > > > > the
> > > > > > system under test?  _fput() is called by the kernel when
> the
> > > last
> > > > > 
> > > > > Please see the log:
> > > > > 
> > > > > 
> > > 
> https://lore.kernel.org/linux-scsi/3af3666920e7d46f8f0c6d88612f143ffabc743c.camel@unipv.it/2-log_ming.zip
> > > > > 
> > > > > Which is collected by:
> > > > > 
> > > > > #!/bin/sh
> > > > > MAJ=$1
> > > > > MIN=$2
> > > > > MAJ=$(( $MAJ << 20 ))
> > > > > DEV=$(( $MAJ | $MIN ))
> > > > > 
> > > > > /usr/share/bcc/tools/trace -t -C \
> > > > >     't:block:block_rq_issue (args->dev == '$DEV') "%s %d
> %d",
> > > args-
> > > > > >rwbs, args->sector, args->nr_sector' \
> > > > >     't:block:block_rq_insert (args->dev == '$DEV') "%s %d
> %d",
> > > args-
> > > > > >rwbs, args->sector, args->nr_sector'
> > > > > 
> > > > > $MAJ:$MIN points to the USB storage disk.
> > > > > 
> > > > > From the above IO trace, there are two write paths, one is
> from
> > > cp,
> > > > > another is from writeback wq.
> > > > > 
> > > > > The stackcount trace[1] is consistent with the IO trace log
> > > since it
> > > > > only shows two IO paths, that is why I concluded that the
> write
> > > done
> > > > > via
> > > > > ext4_release_file() is from 'cp'.
> > > > > 
> > > > > [1] 
> > > > > 
> > > 
> https://lore.kernel.org/linux-scsi/320b315b9c87543d4fb919ecbdf841596c8fbcea.camel@unipv.it/2-log_ming_20191129_150609.zip
> > > > > 
> > > > > > reference to a struct file is released.  (Specifically, if
> you
> > > > > have a
> > > > > > fd which is dup'ed, it's only when the last fd
> corresponding
> > > to
> > > > > the
> > > > > > struct file is closed, and the struct file is about to be
> > > > > released,
> > > > > > does the file system's f_ops->release function get
> called.)
> > > > > > 
> > > > > > So the first question I'd ask is whether there is anything
> > > else
> > > > > going
> > > > > > on the system, and whether the writes are happening to the
> USB
> > > > > thumb
> > > > > > drive, or to some other storage device.  And if there is
> > > something
> > > > > > else which is writing to the pendrive, maybe that's why no
> one
> > > > > else
> > > > > > has been able to reproduce the OP's complaint....
> > > > > 
> > > > > OK, we can ask Andrea to confirm that via the following
> trace,
> > > which
> > > > > will add pid/comm info in the stack trace:
> > > > > 
> > > > > /usr/share/bcc/tools/stackcount
> blk_mq_sched_request_inserted
> > > > > 
> > > > > Andrew, could you collect the above log again when running
> > > new/bad
> > > > > kernel for confirming if the write done by
> ext4_release_file()
> > > is
> > > > > from
> > > > > the 'cp' process?
> > > > 
> > > > You can find the stackcount log attached. It has been produced
> by:
> > > > 
> > > > - /usr/share/bcc/tools/stackcount
> blk_mq_sched_request_inserted >
> > > trace.log
> > > > - wait some seconds
> > > > - run the test (1 copy trial), wait for the test to finish,
> wait
> > > some seconds
> > > > - stop the trace (ctrl+C)
> > > 
> > > Thanks for collecting the log, looks your 'stackcount' doesn't
> > > include
> > > comm/pid info, seems there is difference between your bcc and
> > > my bcc in fedora 30.
> > > 
> > > Could you collect above log again via the following command?
> > > 
> > > /usr/share/bcc/tools/stackcount -P -K t:block:block_rq_insert
> > > 
> > > which will show the comm/pid info.
> > 
> > ok, attached (trace_20191219.txt), the test (1 trial) took 3684
> > seconds.
> 
> From the above trace:
> 
>   b'blk_mq_sched_request_inserted'
>   b'blk_mq_sched_request_inserted'
>   b'dd_insert_requests'
>   b'blk_mq_sched_insert_requests'
>   b'blk_mq_flush_plug_list'
>   b'blk_flush_plug_list'
>   b'io_schedule_prepare'
>   b'io_schedule'
>   b'rq_qos_wait'
>   b'wbt_wait'
>   b'__rq_qos_throttle'
>   b'blk_mq_make_request'
>   b'generic_make_request'
>   b'submit_bio'
>   b'ext4_io_submit'
>   b'ext4_writepages'
>   b'do_writepages'
>   b'__filemap_fdatawrite_range'
>   b'ext4_release_file'
>   b'__fput'
>   b'task_work_run'
>   b'exit_to_usermode_loop'
>   b'do_syscall_64'
>   b'entry_SYSCALL_64_after_hwframe'
>     b'cp' [19863]
>     4400
> 
> So this write is clearly from 'cp' process, and it should be one
> ext4 fs issue.
> 
> Ted, can you take a look at this issue?
> 
> > 
> > > > I also tried the usual test with btrfs and xfs. Btrfs behavior
> > > looks
> > > > "good". xfs seems sometimes better, sometimes worse, I would
> say.
> > > I
> > > > don't know if it matters, anyway you can also find the results
> of
> > > the
> > > > two tests (100 trials each). Basically, btrfs is always
> between 68
> > > and
> > > > 89 seconds, with a cyclicity (?) with "period=2 trials". xfs
> looks
> > > > almost always very good (63-65s), but sometimes "bad" (>300s).
> > > 
> > > If you are interested in digging into this one, the following
> trace
> > > should be helpful:
> > > 
> > > 
> https://lore.kernel.org/linux-scsi/f38db337cf26390f7c7488a0bc2076633737775b.camel@unipv.it/T/#m5aa008626e07913172ad40e1eb8e5f2ffd560fc6
> > > 
> > 
> > Attached:
> > - trace_xfs_20191223.txt (7 trials, then aborted while doing the
> 8th),
> > times to complete:
> > 64s
> > 63s
> > 64s
> > 833s
> > 1105s
> > 63s
> > 64s
> 
> oops, looks we have to collect io insert trace with the following
> bcc script
> on xfs for confirming if there is similar issue with ext4, could you
> run
> it again on xfs? And only post the trace done in case of slow 'cp'.
> 
> 
> #!/bin/sh
> 
> MAJ=$1
> MIN=$2
> MAJ=$(( $MAJ << 20 ))
> DEV=$(( $MAJ | $MIN ))
> 
> /usr/share/bcc/tools/trace -t -C \
>     't:block:block_rq_issue (args->dev == '$DEV') "%s %d %d", args-
> >rwbs, args->sector, args->nr_sector' \
>     't:block:block_rq_insert (args->dev == '$DEV') "%s %d %d", args-
> >rwbs, args->sector, args->nr_sector'
> 
> 
here it is (1 trial, 313 seconds to finish)

Thanks,
Andrea

[-- Attachment #2: trace_20191223_xfs_new.zip --]
[-- Type: application/zip, Size: 129317 bytes --]

^ permalink raw reply	[flat|nested] 102+ messages in thread

* Re: AW: Slow I/O on USB media after commit f664a3cc17b7d0a2bc3b3ab96181e1029b0ec0e6
  2019-12-23 13:08                                                                                                             ` Ming Lei
  2019-12-23 14:02                                                                                                               ` Andrea Vai
@ 2019-12-23 16:26                                                                                                               ` Theodore Y. Ts'o
  2019-12-23 16:29                                                                                                                 ` Andrea Vai
  1 sibling, 1 reply; 102+ messages in thread
From: Theodore Y. Ts'o @ 2019-12-23 16:26 UTC (permalink / raw)
  To: Ming Lei
  Cc: Andrea Vai, Schmid, Carsten, Finn Thain, Damien Le Moal,
	Alan Stern, Jens Axboe, Johannes Thumshirn, USB list,
	SCSI development list, Himanshu Madhani, Hannes Reinecke,
	Omar Sandoval, Martin K. Petersen, Greg KH, Hans Holmberg,
	Kernel development list, linux-ext4, linux-fsdevel

On Mon, Dec 23, 2019 at 09:08:28PM +0800, Ming Lei wrote:
> 
> From the above trace:
> 
>   b'blk_mq_sched_request_inserted'
>   b'blk_mq_sched_request_inserted'
>   b'dd_insert_requests'
>   b'blk_mq_sched_insert_requests'
>   b'blk_mq_flush_plug_list'
>   b'blk_flush_plug_list'
>   b'io_schedule_prepare'
>   b'io_schedule'
>   b'rq_qos_wait'
>   b'wbt_wait'
>   b'__rq_qos_throttle'
>   b'blk_mq_make_request'
>   b'generic_make_request'
>   b'submit_bio'
>   b'ext4_io_submit'
>   b'ext4_writepages'
>   b'do_writepages'
>   b'__filemap_fdatawrite_range'
>   b'ext4_release_file'
>   b'__fput'
>   b'task_work_run'
>   b'exit_to_usermode_loop'
>   b'do_syscall_64'
>   b'entry_SYSCALL_64_after_hwframe'
>     b'cp' [19863]
>     4400
> 
> So this write is clearly from 'cp' process, and it should be one
> ext4 fs issue.

We need a system call trace of the cp process, to understand what
system call is resulting in fput (e.g., I assume it's close(2) but
let's be sure), and how often it's calling that system call.

What cp process is it?  Is it from shellutils?  Is it from busybox?

     		   	      	   		- Ted

^ permalink raw reply	[flat|nested] 102+ messages in thread

* Re: AW: Slow I/O on USB media after commit f664a3cc17b7d0a2bc3b3ab96181e1029b0ec0e6
  2019-12-23 16:26                                                                                                               ` Theodore Y. Ts'o
@ 2019-12-23 16:29                                                                                                                 ` Andrea Vai
  2019-12-23 17:22                                                                                                                   ` Theodore Y. Ts'o
  0 siblings, 1 reply; 102+ messages in thread
From: Andrea Vai @ 2019-12-23 16:29 UTC (permalink / raw)
  To: Theodore Y. Ts'o, Ming Lei
  Cc: Schmid, Carsten, Finn Thain, Damien Le Moal, Alan Stern,
	Jens Axboe, Johannes Thumshirn, USB list, SCSI development list,
	Himanshu Madhani, Hannes Reinecke, Omar Sandoval,
	Martin K. Petersen, Greg KH, Hans Holmberg,
	Kernel development list, linux-ext4, linux-fsdevel

On Mon, 23/12/2019 at 11.26 -0500, Theodore Y. Ts'o wrote:
> On Mon, Dec 23, 2019 at 09:08:28PM +0800, Ming Lei wrote:
> > 
> > From the above trace:
> > 
> >   b'blk_mq_sched_request_inserted'
> >   b'blk_mq_sched_request_inserted'
> >   b'dd_insert_requests'
> >   b'blk_mq_sched_insert_requests'
> >   b'blk_mq_flush_plug_list'
> >   b'blk_flush_plug_list'
> >   b'io_schedule_prepare'
> >   b'io_schedule'
> >   b'rq_qos_wait'
> >   b'wbt_wait'
> >   b'__rq_qos_throttle'
> >   b'blk_mq_make_request'
> >   b'generic_make_request'
> >   b'submit_bio'
> >   b'ext4_io_submit'
> >   b'ext4_writepages'
> >   b'do_writepages'
> >   b'__filemap_fdatawrite_range'
> >   b'ext4_release_file'
> >   b'__fput'
> >   b'task_work_run'
> >   b'exit_to_usermode_loop'
> >   b'do_syscall_64'
> >   b'entry_SYSCALL_64_after_hwframe'
> >     b'cp' [19863]
> >     4400
> > 
> > So this write is clearly from 'cp' process, and it should be one
> > ext4 fs issue.
> 
> We need a system call trace of the cp process, to understand what
> system call is resulting in fput, (eg., I assume it's close(2) but
> let's be sure), and often it's calling that system call.
> 
> What cp process is it?  Is it from shellutils?  Is it from busybox?
> 
>      		   	      	   		- Ted

I run the cp command from a bash script, or from a bash shell. I don't
know if this answers your question; if not, feel free to tell me how
to find the answer you need.

Thanks,
Andrea


^ permalink raw reply	[flat|nested] 102+ messages in thread

* Re: AW: Slow I/O on USB media after commit f664a3cc17b7d0a2bc3b3ab96181e1029b0ec0e6
  2019-12-23 16:29                                                                                                                 ` Andrea Vai
@ 2019-12-23 17:22                                                                                                                   ` Theodore Y. Ts'o
  2019-12-23 18:45                                                                                                                     ` Andrea Vai
  0 siblings, 1 reply; 102+ messages in thread
From: Theodore Y. Ts'o @ 2019-12-23 17:22 UTC (permalink / raw)
  To: Andrea Vai
  Cc: Ming Lei, Schmid, Carsten, Finn Thain, Damien Le Moal,
	Alan Stern, Jens Axboe, Johannes Thumshirn, USB list,
	SCSI development list, Himanshu Madhani, Hannes Reinecke,
	Omar Sandoval, Martin K. Petersen, Greg KH, Hans Holmberg,
	Kernel development list, linux-ext4, linux-fsdevel

On Mon, Dec 23, 2019 at 05:29:27PM +0100, Andrea Vai wrote:
> I run the cp command from a bash script, or from a bash shell. I don't
> know if this answer your question, otherwise feel free to tell me a
> way to find the answer to give you.

What distro are you using, and/or what package is the cp command
coming from, and what is the package name and version?

Also, can you remind me what the bash script is and how many files you are copying?

Can you change the script so that the cp command is prefixed by:

"strace -tTf -o /tmp/st "

e.g.,

	strace -tTf -o /tmp/st cp <args>

And then send me the /tmp/st file.  This will significantly change the
time, so don't do this for measuring performance.  I just want to see
what the /bin/cp command is *doing*.
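
Once /tmp/st exists, a quick per-syscall count shows how often close(2)
or anything else is being issued. This is only a rough sketch: it assumes
the usual "PID TIME syscall(args) = result <duration>" line layout that
the options above produce, and count_syscalls is a made-up helper name:

```shell
#!/bin/sh
# count_syscalls FILE
# Print "count name" for every syscall found in an strace log written
# with "strace -tTf -o FILE", most frequent first.
# Assumes the syscall is the third whitespace-separated field (after
# the PID and the timestamp); signal, exit and "resumed" lines are
# skipped because their third field does not look like "name(".
count_syscalls() {
    awk '$3 ~ /^[a-z_0-9]+\(/ {
             name = $3
             sub(/\(.*/, "", name)   # keep only the text before "("
             count[name]++
         }
         END { for (n in count) print count[n], n }' "$1" |
        sort -k1,1nr -k2,2
}
```

For example, "count_syscalls /tmp/st | head" lists the most frequent
calls first.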

      	      	     	     	      - Ted

^ permalink raw reply	[flat|nested] 102+ messages in thread

* Re: AW: Slow I/O on USB media after commit f664a3cc17b7d0a2bc3b3ab96181e1029b0ec0e6
  2019-12-23 17:22                                                                                                                   ` Theodore Y. Ts'o
@ 2019-12-23 18:45                                                                                                                     ` Andrea Vai
  2019-12-23 19:53                                                                                                                       ` Theodore Y. Ts'o
  0 siblings, 1 reply; 102+ messages in thread
From: Andrea Vai @ 2019-12-23 18:45 UTC (permalink / raw)
  To: Theodore Y. Ts'o
  Cc: Ming Lei, Schmid, Carsten, Finn Thain, Damien Le Moal,
	Alan Stern, Jens Axboe, Johannes Thumshirn, USB list,
	SCSI development list, Himanshu Madhani, Hannes Reinecke,
	Omar Sandoval, Martin K. Petersen, Greg KH, Hans Holmberg,
	Kernel development list, linux-ext4, linux-fsdevel

[-- Attachment #1: Type: text/plain, Size: 1537 bytes --]

On Mon, 23/12/2019 at 12.22 -0500, Theodore Y. Ts'o wrote:
> On Mon, Dec 23, 2019 at 05:29:27PM +0100, Andrea Vai wrote:
> > I run the cp command from a bash script, or from a bash shell. I
> don't
> > know if this answer your question, otherwise feel free to tell me
> a
> > way to find the answer to give you.
> 
> What distro are you using, and/or what package is the cp command
> coming from, and what is the package name and version?

Fedora 30

$ rpm -qf `which cp`
coreutils-8.31-6.fc30.x86_64

> 
> Also, can you remind me what the bash script is and how many files
> you are copying?

basically, it's:

  mount UUID=$uuid /mnt/pendrive
  SECONDS=0
  cp $testfile /mnt/pendrive
  umount /mnt/pendrive
  tempo=$SECONDS

and it copies one file only. Anyway, you can find the whole script
attached.


> 
> Can you change the script so that the cp command is prefixed by:
> 
> "strace -tTf -o /tmp/st "
> 
> e.g.,
> 
> 	strace -tTf -o /tmp/st cp <args>
> 
> And then send me
btw, please tell me whether "me" means only you, or whether I should
cc: all the recipients, as usual

>  the /tmp/st file.  This will significantly change the
> time, so don't do this for measuring performance.  I just want to
> see
> what the /bin/cp command is *doing*.

I will do it, but I have a doubt. Since the problem doesn't happen
every time, is it useful to give you a trace of a "fast" run? And, if
it's not, I think I should measure performance with the strace command
prefix in place, to identify a "slow" run to report to you. Does that
make sense?

Thanks,
Andrea

[-- Attachment #2: test --]
[-- Type: application/x-shellscript, Size: 1403 bytes --]

^ permalink raw reply	[flat|nested] 102+ messages in thread

* Re: AW: Slow I/O on USB media after commit f664a3cc17b7d0a2bc3b3ab96181e1029b0ec0e6
  2019-12-23 18:45                                                                                                                     ` Andrea Vai
@ 2019-12-23 19:53                                                                                                                       ` Theodore Y. Ts'o
  2019-12-24  1:27                                                                                                                         ` Ming Lei
  0 siblings, 1 reply; 102+ messages in thread
From: Theodore Y. Ts'o @ 2019-12-23 19:53 UTC (permalink / raw)
  To: Andrea Vai
  Cc: Ming Lei, Schmid, Carsten, Finn Thain, Damien Le Moal,
	Alan Stern, Jens Axboe, Johannes Thumshirn, USB list,
	SCSI development list, Himanshu Madhani, Hannes Reinecke,
	Omar Sandoval, Martin K. Petersen, Greg KH, Hans Holmberg,
	Kernel development list, linux-ext4, linux-fsdevel

On Mon, Dec 23, 2019 at 07:45:57PM +0100, Andrea Vai wrote:
> basically, it's:
> 
>   mount UUID=$uuid /mnt/pendrive
>   SECONDS=0
>   cp $testfile /mnt/pendrive
>   umount /mnt/pendrive
>   tempo=$SECONDS
> 
> and it copies one file only. Anyway, you can find the whole script
> attached.

OK, so whether we are doing the writeback at the end of cp, or when
you do the umount, it's probably not going to make any difference.  We
can get rid of the stack trace in question by changing the script to
be basically:

mount UUID=$uuid /mnt/pendrive
SECONDS=0
rm -f /mnt/pendrive/$testfile
cp $testfile /mnt/pendrive
umount /mnt/pendrive
tempo=$SECONDS

I predict if you do that, you'll see that all of the time is spent in
the umount, when we are trying to write back the file.
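
To check that prediction, each step can be timed on its own. A minimal
sketch, reusing $uuid and $testfile from the script above; the helper
name "timed" is hypothetical, and the whole-second resolution of
"date +%s" is assumed to be good enough here:

```shell
#!/bin/sh
# timed LABEL COMMAND [ARGS...]
# Run COMMAND, report on stderr how many whole seconds it took, and
# return COMMAND's exit status.
timed() {
    label=$1
    shift
    start=$(date +%s)
    status=0
    "$@" || status=$?
    end=$(date +%s)
    echo "$label: $(( end - start )) s" >&2
    return $status
}

# Intended use (not executed here, since it needs the pendrive):
#   mount UUID=$uuid /mnt/pendrive
#   rm -f /mnt/pendrive/$testfile
#   timed cp cp $testfile /mnt/pendrive
#   timed umount umount /mnt/pendrive
```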

I really don't think that this is a file system problem at all.  It's
just that USB I/O is slow, for whatever reason.  We'll see a stack
trace in the writeback code waiting for the I/O to be completed, but
that doesn't mean that the root cause is in the writeback code or in
the file system which is triggering the writeback.

I suspect the next step is to use blktrace, to see what kind of I/O is
being sent to the USB drive, and how long it takes for the I/O to
complete.  You might also try to capture the output of "iostat -x 1"
while the script is running, and see what the difference might be
between a kernel version that has the problem and one that doesn't,
and see if that gives us a clue.
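
One possible way to capture the iostat output for exactly the duration of
a test run is sketched below. The helper name log_while and the SAMPLER
variable are assumptions made for illustration; only "iostat -x 1" itself
is the command suggested above:

```shell
#!/bin/sh
# log_while LOGFILE COMMAND [ARGS...]
# Start the sampler named in $SAMPLER (default: "iostat -x 1") in the
# background, appending its output to LOGFILE, run COMMAND in the
# foreground, then stop the sampler and return COMMAND's exit status.
log_while() {
    logfile=$1
    shift
    ${SAMPLER:-iostat -x 1} >>"$logfile" 2>&1 &
    sampler_pid=$!
    status=0
    "$@" || status=$?
    kill "$sampler_pid" 2>/dev/null || true
    wait "$sampler_pid" 2>/dev/null || true
    return $status
}

# Intended use, with ./test standing in for the test script:
#   log_while /tmp/iostat-good.log ./test    # on the good kernel
#   log_while /tmp/iostat-bad.log  ./test    # on the bad kernel
```

The two log files can then be compared side by side.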

> > And then send me
> btw, please tell me if "me" means only you or I cc: all the
> recipients, as usual

Well, I don't think we know what the root cause is.  Ming is focusing
on that stack trace, but I think it's a red herring.....  And if it's
not a file system problem, then other people will be best suited to
debug the issue.

   	      	     	   	      	    - Ted

^ permalink raw reply	[flat|nested] 102+ messages in thread

* Re: AW: Slow I/O on USB media after commit f664a3cc17b7d0a2bc3b3ab96181e1029b0ec0e6
  2019-12-23 19:53                                                                                                                       ` Theodore Y. Ts'o
@ 2019-12-24  1:27                                                                                                                         ` Ming Lei
  2019-12-24  6:49                                                                                                                           ` Andrea Vai
                                                                                                                                             ` (2 more replies)
  0 siblings, 3 replies; 102+ messages in thread
From: Ming Lei @ 2019-12-24  1:27 UTC (permalink / raw)
  To: Theodore Y. Ts'o
  Cc: Andrea Vai, Schmid, Carsten, Finn Thain, Damien Le Moal,
	Alan Stern, Jens Axboe, Johannes Thumshirn, USB list,
	SCSI development list, Himanshu Madhani, Hannes Reinecke,
	Omar Sandoval, Martin K. Petersen, Greg KH, Hans Holmberg,
	Kernel development list, linux-ext4, linux-fsdevel

Hi Ted,

On Mon, Dec 23, 2019 at 02:53:01PM -0500, Theodore Y. Ts'o wrote:
> On Mon, Dec 23, 2019 at 07:45:57PM +0100, Andrea Vai wrote:
> > basically, it's:
> > 
> >   mount UUID=$uuid /mnt/pendrive
> >   SECONDS=0
> >   cp $testfile /mnt/pendrive
> >   umount /mnt/pendrive
> >   tempo=$SECONDS
> > 
> > and it copies one file only. Anyway, you can find the whole script
> > attached.
> 
> OK, so whether we are doing the writeback at the end of cp, or when
> you do the umount, it's probably not going to make any difference.  We
> can get rid of the stack trace in question by changing the script to
> be basically:
> 
> mount UUID=$uuid /mnt/pendrive
> SECONDS=0
> rm -f /mnt/pendrive/$testfile
> cp $testfile /mnt/pendrive
> umount /mnt/pendrive
> tempo=$SECONDS
> 
> I predict if you do that, you'll see that all of the time is spent in
> the umount, when we are trying to write back the file.
> 
> I really don't think then this is a file system problem at all.  It's
> just that USB I/O is slow, for whatever reason.  We'll see a stack
> trace in the writeback code waiting for the I/O to be completed, but
> that doesn't mean that the root cause is in the writeback code or in
> the file system which is triggering the writeback.

Wrt. the slow write on this USB storage: it is caused by two writeback
paths. One is the writeback wq; the other is from ext4_release_file(),
which is triggered from exit_to_usermode_loop().

When the two write paths run concurrently, the sequential write order
is broken, and write performance drops considerably on this particular
USB storage.
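
As a rough way to quantify this in a trace, the sketch below counts how
many requests do not start at the sector where the previous one ended.
It assumes the trace has already been reduced to one "sector nr_sector"
pair per line, in insertion order; the function name count_seeks and
that input format are illustrative assumptions, not something used
earlier in this thread.

```shell
#!/bin/sh
# count_seeks: read "sector nr_sector" pairs on stdin (one request per
# line, in insertion order) and print how many requests do not start at
# the sector where the previous request ended.
# NOTE: hypothetical helper; the input format is an assumption.
count_seeks() {
    awk '
        NR > 1 && $1 != expected { seeks++ }   # gap or jump backwards
        { expected = $1 + $2 }                 # where the next one should start
        END { print seeks + 0 }                # "+ 0" prints 0 when there were none
    '
}
```

A purely sequential stream gives 0, while two interleaved sequential
streams give a count close to the number of requests.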

If Fedora 30's 'cp' is implemented correctly, ext4_release_file() must
be running on return from the read() or write() syscalls. IMO, it isn't
expected behavior for ext4_release_file() to run thousands of times when
'cp' is run just once; see the comment above ext4_release_file():

	/*
	 * Called when an inode is released. Note that this is different
	 * from ext4_file_open: open gets called at every open, but release
	 * gets called only when /all/ the files are closed.
	 */
	static int ext4_release_file(struct inode *inode, struct file *filp)

> 
> I suspect the next step is use a blktrace, to see what kind of I/O is
> being sent to the USB drive, and how long it takes for the I/O to
> complete.  You might also try to capture the output of "iostat -x 1"
> while the script is running, and see what the difference might be
> between a kernel version that has the problem and one that doesn't,
> and see if that gives us a clue.

That isn't necessary, given we have concluded that the bad write
performance is caused by broken write order.

> 
> > > And then send me
> > btw, please tell me if "me" means only you or I cc: all the
> > recipients, as usual
> 
> Well, I don't think we know what the root cause is.  Ming is focusing
> on that stack trace, but I think it's a red herring.....  And if it's
> not a file system problem, then other people will be best suited to
> debug the issue.

So far, the evidence points to the extra writeback path from
exit_to_usermode_loop(). If it does not come from the close() syscall,
the issue should be related to the file reference count. If it does come
from close(), the issue might be in the implementation of 'cp'.

Andrea, please collect the following log or the strace log requested by
Ted, so that we can confirm whether the extra writeback comes from the
close() or the read()/write() syscalls:

#!/bin/sh
# pass PID of 'cp' to this script
PID=$1
/usr/share/bcc/tools/trace -p "$PID" -t -C \
    't:block:block_rq_insert "%s %d %d", args->rwbs, args->sector, args->nr_sector' \
    't:syscalls:sys_exit_close ' \
    't:syscalls:sys_exit_read ' \
    't:syscalls:sys_exit_write '
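
If the trace should also be limited to the USB disk, the earlier scripts
in this thread filter on args->dev, computed as (major << 20) | minor.
A minimal sketch of that calculation follows; the helper name
kdev_from_pair is hypothetical, and the "major:minor" string is assumed
to come from a file such as /sys/block/<disk>/dev.

```shell
#!/bin/sh
# kdev_from_pair MAJOR:MINOR
# Print the device number in the form the earlier scripts compare
# against args->dev, i.e. (major << 20) | minor.
# NOTE: hypothetical helper, shown only to illustrate the arithmetic.
kdev_from_pair() {
    maj=${1%%:*}    # text before the colon
    min=${1##*:}    # text after the colon
    echo $(( ($maj << 20) | $min ))
}
```

For a disk reported as 8:16, kdev_from_pair 8:16 prints 8388624, which
is the value the '(args->dev == ...)' filter would be given.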


Thanks,
Ming


^ permalink raw reply	[flat|nested] 102+ messages in thread

* Re: AW: Slow I/O on USB media after commit f664a3cc17b7d0a2bc3b3ab96181e1029b0ec0e6
  2019-12-23 14:02                                                                                                               ` Andrea Vai
@ 2019-12-24  1:32                                                                                                                 ` Ming Lei
  2019-12-24  8:04                                                                                                                   ` Andrea Vai
  0 siblings, 1 reply; 102+ messages in thread
From: Ming Lei @ 2019-12-24  1:32 UTC (permalink / raw)
  To: Andrea Vai
  Cc: Theodore Y. Ts'o, Schmid, Carsten, Finn Thain,
	Damien Le Moal, Alan Stern, Jens Axboe, Johannes Thumshirn,
	USB list, SCSI development list, Himanshu Madhani,
	Hannes Reinecke, Omar Sandoval, Martin K. Petersen, Greg KH,
	Hans Holmberg, Kernel development list, linux-ext4,
	linux-fsdevel

On Mon, Dec 23, 2019 at 03:02:35PM +0100, Andrea Vai wrote:
> On Mon, 23/12/2019 at 21.08 +0800, Ming Lei wrote:
> > On Mon, Dec 23, 2019 at 12:22:45PM +0100, Andrea Vai wrote:
> > > On Wed, 18/12/2019 at 17.48 +0800, Ming Lei wrote:
> > > > On Wed, Dec 18, 2019 at 09:25:02AM +0100, Andrea Vai wrote:
> > > > > On Thu, 12/12/2019 at 05.33 +0800, Ming Lei wrote:
> > > > > > On Wed, Dec 11, 2019 at 11:07:45AM -0500, Theodore Y. Ts'o
> > > > wrote:
> > > > > > > On Wed, Dec 11, 2019 at 12:00:58PM +0800, Ming Lei wrote:
> > > > > > > > I didn't reproduce the issue in my test environment, and
> > > > follows
> > > > > > > > Andrea's test commands[1]:
> > > > > > > > 
> > > > > > > >   mount UUID=$uuid /mnt/pendrive 2>&1 |tee -a $logfile
> > > > > > > >   SECONDS=0
> > > > > > > >   cp $testfile /mnt/pendrive 2>&1 |tee -a $logfile
> > > > > > > >   umount /mnt/pendrive 2>&1 |tee -a $logfile
> > > > > > > > 
> > > > > > > > The 'cp' command supposes to open/close the file just
> > once,
> > > > > > however
> > > > > > > > ext4_release_file() & write pages is observed to run for
> > > > 4358
> > > > > > times
> > > > > > > > when executing the above 'cp' test.
> > > > > > > 
> > > > > > > Why are we sure the ext4_release_file() / _fput() is
> > coming
> > > > from
> > > > > > the
> > > > > > > cp command, as opposed to something else that might be
> > running
> > > > on
> > > > > > the
> > > > > > > system under test?  _fput() is called by the kernel when
> > the
> > > > last
> > > > > > 
> > > > > > Please see the log:
> > > > > > 
> > > > > > 
> > > > 
> > https://lore.kernel.org/linux-scsi/3af3666920e7d46f8f0c6d88612f143ffabc743c.camel@unipv.it/2-log_ming.zip
> > > > > > 
> > > > > > Which is collected by:
> > > > > > 
> > > > > > #!/bin/sh
> > > > > > MAJ=$1
> > > > > > MIN=$2
> > > > > > MAJ=$(( $MAJ << 20 ))
> > > > > > DEV=$(( $MAJ | $MIN ))
> > > > > > 
> > > > > > /usr/share/bcc/tools/trace -t -C \
> > > > > >     't:block:block_rq_issue (args->dev == '$DEV') "%s %d
> > %d",
> > > > args-
> > > > > > >rwbs, args->sector, args->nr_sector' \
> > > > > >     't:block:block_rq_insert (args->dev == '$DEV') "%s %d
> > %d",
> > > > args-
> > > > > > >rwbs, args->sector, args->nr_sector'
> > > > > > 
> > > > > > $MAJ:$MIN points to the USB storage disk.
> > > > > > 
> > > > > > From the above IO trace, there are two write paths, one is
> > from
> > > > cp,
> > > > > > another is from writeback wq.
> > > > > > 
> > > > > > The stackcount trace[1] is consistent with the IO trace log
> > > > since it
> > > > > > only shows two IO paths, that is why I concluded that the
> > write
> > > > done
> > > > > > via
> > > > > > ext4_release_file() is from 'cp'.
> > > > > > 
> > > > > > [1] 
> > > > > > 
> > > > 
> > https://lore.kernel.org/linux-scsi/320b315b9c87543d4fb919ecbdf841596c8fbcea.camel@unipv.it/2-log_ming_20191129_150609.zip
> > > > > > 
> > > > > > > reference to a struct file is released.  (Specifically, if
> > you
> > > > > > have a
> > > > > > > fd which is dup'ed, it's only when the last fd
> > corresponding
> > > > to
> > > > > > the
> > > > > > > struct file is closed, and the struct file is about to be
> > > > > > released,
> > > > > > > does the file system's f_ops->release function get
> > called.)
> > > > > > > 
> > > > > > > So the first question I'd ask is whether there is anything
> > > > else
> > > > > > going
> > > > > > > on the system, and whether the writes are happening to the
> > USB
> > > > > > thumb
> > > > > > > drive, or to some other storage device.  And if there is
> > > > something
> > > > > > > else which is writing to the pendrive, maybe that's why no
> > one
> > > > > > else
> > > > > > > has been able to reproduce the OP's complaint....
> > > > > > 
> > > > > > OK, we can ask Andrea to confirm that via the following
> > trace,
> > > > which
> > > > > > will add pid/comm info in the stack trace:
> > > > > > 
> > > > > > /usr/share/bcc/tools/stackcount
> > blk_mq_sched_request_inserted
> > > > > > 
> > > > > > Andrew, could you collect the above log again when running
> > > > new/bad
> > > > > > kernel for confirming if the write done by
> > ext4_release_file()
> > > > is
> > > > > > from
> > > > > > the 'cp' process?
> > > > > 
> > > > > You can find the stackcount log attached. It has been produced
> > by:
> > > > > 
> > > > > - /usr/share/bcc/tools/stackcount
> > blk_mq_sched_request_inserted >
> > > > trace.log
> > > > > - wait some seconds
> > > > > - run the test (1 copy trial), wait for the test to finish,
> > wait
> > > > some seconds
> > > > > - stop the trace (ctrl+C)
> > > > 
> > > > Thanks for collecting the log, looks your 'stackcount' doesn't
> > > > include
> > > > comm/pid info, seems there is difference between your bcc and
> > > > my bcc in fedora 30.
> > > > 
> > > > Could you collect above log again via the following command?
> > > > 
> > > > /usr/share/bcc/tools/stackcount -P -K t:block:block_rq_insert
> > > > 
> > > > which will show the comm/pid info.
> > > 
> > > ok, attached (trace_20191219.txt), the test (1 trial) took 3684
> > > seconds.
> > 
> > From the above trace:
> > 
> >   b'blk_mq_sched_request_inserted'
> >   b'blk_mq_sched_request_inserted'
> >   b'dd_insert_requests'
> >   b'blk_mq_sched_insert_requests'
> >   b'blk_mq_flush_plug_list'
> >   b'blk_flush_plug_list'
> >   b'io_schedule_prepare'
> >   b'io_schedule'
> >   b'rq_qos_wait'
> >   b'wbt_wait'
> >   b'__rq_qos_throttle'
> >   b'blk_mq_make_request'
> >   b'generic_make_request'
> >   b'submit_bio'
> >   b'ext4_io_submit'
> >   b'ext4_writepages'
> >   b'do_writepages'
> >   b'__filemap_fdatawrite_range'
> >   b'ext4_release_file'
> >   b'__fput'
> >   b'task_work_run'
> >   b'exit_to_usermode_loop'
> >   b'do_syscall_64'
> >   b'entry_SYSCALL_64_after_hwframe'
> >     b'cp' [19863]
> >     4400
> > 
> > So this write is clearly from 'cp' process, and it should be one
> > ext4 fs issue.
> > 
> > Ted, can you take a look at this issue?
> > 
> > > 
> > > > > I also tried the usual test with btrfs and xfs. Btrfs behavior
> > > > looks
> > > > > "good". xfs seems sometimes better, sometimes worse, I would
> > say.
> > > > I
> > > > > don't know if it matters, anyway you can also find the results
> > of
> > > > the
> > > > > two tests (100 trials each). Basically, btrfs is always
> > between 68
> > > > and
> > > > > 89 seconds, with a cyclicity (?) with "period=2 trials". xfs
> > looks
> > > > > almost always very good (63-65s), but sometimes "bad" (>300s).
> > > > 
> > > > If you are interested in digging into this one, the following
> > trace
> > > > should be helpful:
> > > > 
> > > > 
> > https://lore.kernel.org/linux-scsi/f38db337cf26390f7c7488a0bc2076633737775b.camel@unipv.it/T/#m5aa008626e07913172ad40e1eb8e5f2ffd560fc6
> > > > 
> > > 
> > > Attached:
> > > - trace_xfs_20191223.txt (7 trials, then aborted while doing the
> > 8th),
> > > times to complete:
> > > 64s
> > > 63s
> > > 64s
> > > 833s
> > > 1105s
> > > 63s
> > > 64s
> > 
> > oops, looks we have to collect io insert trace with the following
> > bcc script
> > on xfs for confirming if there is similar issue with ext4, could you
> > run
> > it again on xfs? And only post the trace done in case of slow 'cp'.
> > 
> > 
> > #!/bin/sh
> > 
> > MAJ=$1
> > MIN=$2
> > MAJ=$(( $MAJ << 20 ))
> > DEV=$(( $MAJ | $MIN ))
> > 
> > /usr/share/bcc/tools/trace -t -C \
> >     't:block:block_rq_issue (args->dev == '$DEV') "%s %d %d", args-
> > >rwbs, args->sector, args->nr_sector' \
> >     't:block:block_rq_insert (args->dev == '$DEV') "%s %d %d", args-
> > >rwbs, args->sector, args->nr_sector'
> > 
> > 
> here it is (1 trial, 313 seconds to finish)

The above log shows an issue similar to the ext4 one, since there is
another writeback IO path from the 'cp' process. The following trace can
show whether it is the same as the ext4 issue:

/usr/share/bcc/tools/stackcount -P -K t:block:block_rq_insert
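
To compare the write paths at a glance, the counts in the stackcount
output can be summed per process. The sketch below assumes the -P output
format seen in the excerpt quoted earlier (a line such as b'cp' [19863]
followed by a line holding the count); the function name
per_process_totals is an assumption.

```shell
#!/bin/sh
# per_process_totals: read `stackcount -P` output on stdin and print one
# line per process: the process label as printed by stackcount, then the
# sum of the counts of all stacks attributed to it.
# NOTE: hypothetical helper; relies on the output layout described above.
per_process_totals() {
    awk '
        # Process label line: ends in "[<pid>]".
        /\[[0-9]+\][ \t]*$/ {
            label = $0
            sub(/^[ \t]+/, "", label)
            sub(/[ \t]+$/, "", label)
            next
        }
        # Count line: digits only, directly after a label line.
        /^[ \t]*[0-9]+[ \t]*$/ && label != "" {
            total[label] += $1
            label = ""
            next
        }
        # Any other line (a stack frame or a blank line) resets the state.
        { label = "" }
        END { for (l in total) print l, total[l] }
    ' | sort
}
```

Saving the stackcount output to a file and running
per_process_totals < file then gives one total per process, which makes
it easy to see how many requests were inserted by 'cp' itself.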


Thanks,
Ming


^ permalink raw reply	[flat|nested] 102+ messages in thread

* Re: AW: Slow I/O on USB media after commit f664a3cc17b7d0a2bc3b3ab96181e1029b0ec0e6
  2019-12-24  1:27                                                                                                                         ` Ming Lei
@ 2019-12-24  6:49                                                                                                                           ` Andrea Vai
  2019-12-24  8:51                                                                                                                           ` Andrea Vai
  2019-12-25  5:17                                                                                                                           ` Theodore Y. Ts'o
  2 siblings, 0 replies; 102+ messages in thread
From: Andrea Vai @ 2019-12-24  6:49 UTC (permalink / raw)
  To: Ming Lei, Theodore Y. Ts'o
  Cc: Schmid, Carsten, Finn Thain, Damien Le Moal, Alan Stern,
	Jens Axboe, Johannes Thumshirn, USB list, SCSI development list,
	Himanshu Madhani, Hannes Reinecke, Omar Sandoval,
	Martin K. Petersen, Greg KH, Hans Holmberg,
	Kernel development list, linux-ext4, linux-fsdevel

On Tue, 24/12/2019 at 09.27 +0800, Ming Lei wrote:
> Hi Ted,
> 
> On Mon, Dec 23, 2019 at 02:53:01PM -0500, Theodore Y. Ts'o wrote:
> > On Mon, Dec 23, 2019 at 07:45:57PM +0100, Andrea Vai wrote:
> > > basically, it's:
> > > 
> > >   mount UUID=$uuid /mnt/pendrive
> > >   SECONDS=0
> > >   cp $testfile /mnt/pendrive
> > >   umount /mnt/pendrive
> > >   tempo=$SECONDS
> > > 
> > > and it copies one file only. Anyway, you can find the whole
> script
> > > attached.
> > 
> > OK, so whether we are doing the writeback at the end of cp, or
> when
> > you do the umount, it's probably not going to make any
> difference.  We
> > can get rid of the stack trace in question by changing the script
> to
> > be basically:
> > 
> > mount UUID=$uuid /mnt/pendrive
> > SECONDS=0
> > rm -f /mnt/pendrive/$testfile
> > cp $testfile /mnt/pendrive
> > umount /mnt/pendrive
> > tempo=$SECONDS
> > 
> > I predict if you do that, you'll see that all of the time is spent
> in
> > the umount, when we are trying to write back the file.
> > 
> > I really don't think then this is a file system problem at
> all.  It's
> > just that USB I/O is slow, for whatever reason.  We'll see a stack
> > trace in the writeback code waiting for the I/O to be completed,
> but
> > that doesn't mean that the root cause is in the writeback code or
> in
> > the file system which is triggering the writeback.
> 
> Wrt. the slow write on this usb storage, it is caused by two
> writeback
> path, one is the writeback wq, another is from ext4_release_file()
> which
> is triggered from exit_to_usermode_loop().
> 
> When the two write path is run concurrently, the sequential write
> order
> is broken, then write performance drops much on this particular usb
> storage.
> 
> The ext4_release_file() should be run from read() or write() syscall
> if
> Fedora 30's 'cp' is implemented correctly. IMO, it isn't expected
> behavior
> for ext4_release_file() to be run thousands of times when just
> running 'cp' once, see comment of ext4_release_file():
> 
> 	/*
> 	 * Called when an inode is released. Note that this is
> different
> 	 * from ext4_file_open: open gets called at every open, but
> release
> 	 * gets called only when /all/ the files are closed.
> 	 */
> 	static int ext4_release_file(struct inode *inode, struct file
> *filp)
> 
> > 
> > I suspect the next step is use a blktrace, to see what kind of I/O
> is
> > being sent to the USB drive, and how long it takes for the I/O to
> > complete.  You might also try to capture the output of "iostat -x
> 1"
> > while the script is running, and see what the difference might be
> > between a kernel version that has the problem and one that
> doesn't,
> > and see if that gives us a clue.
> 
> That isn't necessary, given we have concluded that the bad write
> performance is caused by broken write order.
> 
> > 
> > > > And then send me
> > > btw, please tell me if "me" means only you or I cc: all the
> > > recipients, as usual
> > 
> > Well, I don't think we know what the root cause is.  Ming is
> focusing
> > on that stack trace, but I think it's a red herring.....  And if
> it's
> > not a file system problem, then other people will be best suited
> to
> > debug the issue.
> 
> So far, the reason points to the extra writeback path from
> exit_to_usermode_loop().
> If it is not from close() syscall, the issue should be related with
> file reference
> count. If it is from close() syscall, the issue might be in 'cp''s
> implementation.
> 
> Andrea, please collect the following log or the strace log requested
> by Ted, then
> we can confirm if the extra writeback is from close() or
> read/write() syscall:
> 
> # pass PID of 'cp' to this script
> #!/bin/sh
> PID=$1
> /usr/share/bcc/tools/trace -P $PID  -t -C \
>     't:block:block_rq_insert "%s %d %d", args->rwbs, args->sector,
> args->nr_sector' \
>     't:syscalls:sys_exit_close ' \
>     't:syscalls:sys_exit_read ' \
>     't:syscalls:sys_exit_write '

Sorry, I am a bit confused: should I run it on ext4 or xfs, or doesn't
it matter? What if I get it on a "fast" run? Should I throw it away and
try again until I get a slow one, or doesn't that matter either?

Thanks,
Andrea


^ permalink raw reply	[flat|nested] 102+ messages in thread

* Re: AW: Slow I/O on USB media after commit f664a3cc17b7d0a2bc3b3ab96181e1029b0ec0e6
  2019-12-24  1:32                                                                                                                 ` Ming Lei
@ 2019-12-24  8:04                                                                                                                   ` Andrea Vai
  2019-12-24  8:47                                                                                                                     ` Ming Lei
  0 siblings, 1 reply; 102+ messages in thread
From: Andrea Vai @ 2019-12-24  8:04 UTC (permalink / raw)
  To: Ming Lei
  Cc: Theodore Y. Ts'o, Schmid, Carsten, Finn Thain,
	Damien Le Moal, Alan Stern, Jens Axboe, Johannes Thumshirn,
	USB list, SCSI development list, Himanshu Madhani,
	Hannes Reinecke, Omar Sandoval, Martin K. Petersen, Greg KH,
	Hans Holmberg, Kernel development list, linux-ext4,
	linux-fsdevel

On Tue, 24/12/2019 at 09.32 +0800, Ming Lei wrote:
> On Mon, Dec 23, 2019 at 03:02:35PM +0100, Andrea Vai wrote:
> > On Mon, 23/12/2019 at 21.08 +0800, Ming Lei wrote:
> > > On Mon, Dec 23, 2019 at 12:22:45PM +0100, Andrea Vai wrote:
> > > > On Wed, 18/12/2019 at 17.48 +0800, Ming Lei wrote:
> > > > > On Wed, Dec 18, 2019 at 09:25:02AM +0100, Andrea Vai wrote:
> > > > > > On Thu, 12/12/2019 at 05.33 +0800, Ming Lei wrote:
> > > > > > > On Wed, Dec 11, 2019 at 11:07:45AM -0500, Theodore Y.
> Ts'o
> > > > > wrote:
> > > > > > > > On Wed, Dec 11, 2019 at 12:00:58PM +0800, Ming Lei
> wrote:
> > > > > > > > > I didn't reproduce the issue in my test environment,
> and
> > > > > follows
> > > > > > > > > Andrea's test commands[1]:
> > > > > > > > > 
> > > > > > > > >   mount UUID=$uuid /mnt/pendrive 2>&1 |tee -a
> $logfile
> > > > > > > > >   SECONDS=0
> > > > > > > > >   cp $testfile /mnt/pendrive 2>&1 |tee -a $logfile
> > > > > > > > >   umount /mnt/pendrive 2>&1 |tee -a $logfile
> > > > > > > > > 
> > > > > > > > > The 'cp' command supposes to open/close the file
> just
> > > once,
> > > > > > > however
> > > > > > > > > ext4_release_file() & write pages is observed to run
> for
> > > > > 4358
> > > > > > > times
> > > > > > > > > when executing the above 'cp' test.
> > > > > > > > 
> > > > > > > > Why are we sure the ext4_release_file() / _fput() is
> > > coming
> > > > > from
> > > > > > > the
> > > > > > > > cp command, as opposed to something else that might be
> > > running
> > > > > on
> > > > > > > the
> > > > > > > > system under test?  _fput() is called by the kernel
> when
> > > the
> > > > > last
> > > > > > > 
> > > > > > > Please see the log:
> > > > > > > 
> > > > > > > 
> > > > > 
> > > 
> https://lore.kernel.org/linux-scsi/3af3666920e7d46f8f0c6d88612f143ffabc743c.camel@unipv.it/2-log_ming.zip
> > > > > > > 
> > > > > > > Which is collected by:
> > > > > > > 
> > > > > > > #!/bin/sh
> > > > > > > MAJ=$1
> > > > > > > MIN=$2
> > > > > > > MAJ=$(( $MAJ << 20 ))
> > > > > > > DEV=$(( $MAJ | $MIN ))
> > > > > > > 
> > > > > > > /usr/share/bcc/tools/trace -t -C \
> > > > > > >     't:block:block_rq_issue (args->dev == '$DEV') "%s %d
> > > %d",
> > > > > args-
> > > > > > > >rwbs, args->sector, args->nr_sector' \
> > > > > > >     't:block:block_rq_insert (args->dev == '$DEV') "%s
> %d
> > > %d",
> > > > > args-
> > > > > > > >rwbs, args->sector, args->nr_sector'
> > > > > > > 
> > > > > > > $MAJ:$MIN points to the USB storage disk.
> > > > > > > 
> > > > > > > From the above IO trace, there are two write paths, one
> is
> > > from
> > > > > cp,
> > > > > > > another is from writeback wq.
> > > > > > > 
> > > > > > > The stackcount trace[1] is consistent with the IO trace
> log
> > > > > since it
> > > > > > > only shows two IO paths, that is why I concluded that
> the
> > > write
> > > > > done
> > > > > > > via
> > > > > > > ext4_release_file() is from 'cp'.
> > > > > > > 
> > > > > > > [1] 
> > > > > > > 
> > > > > 
> > > 
> https://lore.kernel.org/linux-scsi/320b315b9c87543d4fb919ecbdf841596c8fbcea.camel@unipv.it/2-log_ming_20191129_150609.zip
> > > > > > > 
> > > > > > > > reference to a struct file is
> released.  (Specifically, if
> > > you
> > > > > > > have a
> > > > > > > > fd which is dup'ed, it's only when the last fd
> > > corresponding
> > > > > to
> > > > > > > the
> > > > > > > > struct file is closed, and the struct file is about to
> be
> > > > > > > released,
> > > > > > > > does the file system's f_ops->release function get
> > > called.)
> > > > > > > > 
> > > > > > > > So the first question I'd ask is whether there is
> anything
> > > > > else
> > > > > > > going
> > > > > > > > on the system, and whether the writes are happening to
> the
> > > USB
> > > > > > > thumb
> > > > > > > > drive, or to some other storage device.  And if there
> is
> > > > > something
> > > > > > > > else which is writing to the pendrive, maybe that's
> why no
> > > one
> > > > > > > else
> > > > > > > > has been able to reproduce the OP's complaint....
> > > > > > > 
> > > > > > > OK, we can ask Andrea to confirm that via the following
> > > trace,
> > > > > which
> > > > > > > will add pid/comm info in the stack trace:
> > > > > > > 
> > > > > > > /usr/share/bcc/tools/stackcount
> > > blk_mq_sched_request_inserted
> > > > > > > 
> > > > > > > Andrew, could you collect the above log again when
> running
> > > > > new/bad
> > > > > > > kernel for confirming if the write done by
> > > ext4_release_file()
> > > > > is
> > > > > > > from
> > > > > > > the 'cp' process?
> > > > > > 
> > > > > > You can find the stackcount log attached. It has been
> produced
> > > by:
> > > > > > 
> > > > > > - /usr/share/bcc/tools/stackcount
> > > blk_mq_sched_request_inserted >
> > > > > trace.log
> > > > > > - wait some seconds
> > > > > > - run the test (1 copy trial), wait for the test to
> finish,
> > > wait
> > > > > some seconds
> > > > > > - stop the trace (ctrl+C)
> > > > > 
> > > > > Thanks for collecting the log, looks your 'stackcount'
> doesn't
> > > > > include
> > > > > comm/pid info, seems there is difference between your bcc
> and
> > > > > my bcc in fedora 30.
> > > > > 
> > > > > Could you collect above log again via the following command?
> > > > > 
> > > > > /usr/share/bcc/tools/stackcount -P -K
> t:block:block_rq_insert
> > > > > 
> > > > > which will show the comm/pid info.
> > > > 
> > > > ok, attached (trace_20191219.txt), the test (1 trial) took
> 3684
> > > > seconds.
> > > 
> > > From the above trace:
> > > 
> > >   b'blk_mq_sched_request_inserted'
> > >   b'blk_mq_sched_request_inserted'
> > >   b'dd_insert_requests'
> > >   b'blk_mq_sched_insert_requests'
> > >   b'blk_mq_flush_plug_list'
> > >   b'blk_flush_plug_list'
> > >   b'io_schedule_prepare'
> > >   b'io_schedule'
> > >   b'rq_qos_wait'
> > >   b'wbt_wait'
> > >   b'__rq_qos_throttle'
> > >   b'blk_mq_make_request'
> > >   b'generic_make_request'
> > >   b'submit_bio'
> > >   b'ext4_io_submit'
> > >   b'ext4_writepages'
> > >   b'do_writepages'
> > >   b'__filemap_fdatawrite_range'
> > >   b'ext4_release_file'
> > >   b'__fput'
> > >   b'task_work_run'
> > >   b'exit_to_usermode_loop'
> > >   b'do_syscall_64'
> > >   b'entry_SYSCALL_64_after_hwframe'
> > >     b'cp' [19863]
> > >     4400
> > > 
> > > So this write is clearly from 'cp' process, and it should be one
> > > ext4 fs issue.
> > > 
> > > Ted, can you take a look at this issue?
> > > 
> > > > 
> > > > > > I also tried the usual test with btrfs and xfs. Btrfs
> behavior
> > > > > looks
> > > > > > "good". xfs seems sometimes better, sometimes worse, I
> would
> > > say.
> > > > > I
> > > > > > don't know if it matters, anyway you can also find the
> results
> > > of
> > > > > the
> > > > > > two tests (100 trials each). Basically, btrfs is always
> > > between 68
> > > > > and
> > > > > > 89 seconds, with a cyclicity (?) with "period=2 trials".
> xfs
> > > looks
> > > > > > almost always very good (63-65s), but sometimes "bad"
> (>300s).
> > > > > 
> > > > > If you are interested in digging into this one, the
> following
> > > trace
> > > > > should be helpful:
> > > > > 
> > > > > 
> > > 
> https://lore.kernel.org/linux-scsi/f38db337cf26390f7c7488a0bc2076633737775b.camel@unipv.it/T/#m5aa008626e07913172ad40e1eb8e5f2ffd560fc6
> > > > > 
> > > > 
> > > > Attached:
> > > > - trace_xfs_20191223.txt (7 trials, then aborted while doing
> the
> > > 8th),
> > > > times to complete:
> > > > 64s
> > > > 63s
> > > > 64s
> > > > 833s
> > > > 1105s
> > > > 63s
> > > > 64s
> > > 
> > > oops, looks we have to collect io insert trace with the
> following
> > > bcc script
> > > on xfs for confirming if there is similar issue with ext4, could
> you
> > > run
> > > it again on xfs? And only post the trace done in case of slow
> 'cp'.
> > > 
> > > 
> > > #!/bin/sh
> > > 
> > > MAJ=$1
> > > MIN=$2
> > > MAJ=$(( $MAJ << 20 ))
> > > DEV=$(( $MAJ | $MIN ))
> > > 
> > > /usr/share/bcc/tools/trace -t -C \
> > >     't:block:block_rq_issue (args->dev == '$DEV') "%s %d %d",
> args-
> > > >rwbs, args->sector, args->nr_sector' \
> > >     't:block:block_rq_insert (args->dev == '$DEV') "%s %d %d",
> args-
> > > >rwbs, args->sector, args->nr_sector'
> > > 
> > > 
> > here it is (1 trial, 313 seconds to finish)
> 
> The above log shows similar issue with ext4 since there is another
> writeback IO path from 'cp' process. And the following trace can
> show if
> it is same with ext4's issue:
> 
> /usr/share/bcc/tools/stackcount -P -K t:block:block_rq_insert

Sorry, here too, please tell me which conditions I should use to run
the test (ext4 or xfs? slow run, or is that not important?)

Thanks,
Andrea


^ permalink raw reply	[flat|nested] 102+ messages in thread

* Re: AW: Slow I/O on USB media after commit f664a3cc17b7d0a2bc3b3ab96181e1029b0ec0e6
  2019-12-24  8:04                                                                                                                   ` Andrea Vai
@ 2019-12-24  8:47                                                                                                                     ` Ming Lei
  0 siblings, 0 replies; 102+ messages in thread
From: Ming Lei @ 2019-12-24  8:47 UTC (permalink / raw)
  To: Andrea Vai
  Cc: Theodore Y. Ts'o, Schmid, Carsten, Finn Thain,
	Damien Le Moal, Alan Stern, Jens Axboe, Johannes Thumshirn,
	USB list, SCSI development list, Himanshu Madhani,
	Hannes Reinecke, Omar Sandoval, Martin K. Petersen, Greg KH,
	Hans Holmberg, Kernel development list, linux-ext4,
	linux-fsdevel

On Tue, Dec 24, 2019 at 09:04:10AM +0100, Andrea Vai wrote:
> On Tue, 24/12/2019 at 09.32 +0800, Ming Lei wrote:
> > On Mon, Dec 23, 2019 at 03:02:35PM +0100, Andrea Vai wrote:
> > > On Mon, 23/12/2019 at 21.08 +0800, Ming Lei wrote:
> > > > On Mon, Dec 23, 2019 at 12:22:45PM +0100, Andrea Vai wrote:
> > > > > On Wed, 18/12/2019 at 17.48 +0800, Ming Lei wrote:
> > > > > > On Wed, Dec 18, 2019 at 09:25:02AM +0100, Andrea Vai wrote:
> > > > > > > On Thu, 12/12/2019 at 05.33 +0800, Ming Lei wrote:
> > > > > > > > On Wed, Dec 11, 2019 at 11:07:45AM -0500, Theodore Y.
> > Ts'o
> > > > > > wrote:
> > > > > > > > > On Wed, Dec 11, 2019 at 12:00:58PM +0800, Ming Lei
> > wrote:
> > > > > > > > > > I didn't reproduce the issue in my test environment,
> > and
> > > > > > follows
> > > > > > > > > > Andrea's test commands[1]:
> > > > > > > > > > 
> > > > > > > > > >   mount UUID=$uuid /mnt/pendrive 2>&1 |tee -a
> > $logfile
> > > > > > > > > >   SECONDS=0
> > > > > > > > > >   cp $testfile /mnt/pendrive 2>&1 |tee -a $logfile
> > > > > > > > > >   umount /mnt/pendrive 2>&1 |tee -a $logfile
> > > > > > > > > > 
> > > > > > > > > > The 'cp' command supposes to open/close the file
> > just
> > > > once,
> > > > > > > > however
> > > > > > > > > > ext4_release_file() & write pages is observed to run
> > for
> > > > > > 4358
> > > > > > > > times
> > > > > > > > > > when executing the above 'cp' test.
> > > > > > > > > 
> > > > > > > > > Why are we sure the ext4_release_file() / _fput() is
> > > > coming
> > > > > > from
> > > > > > > > the
> > > > > > > > > cp command, as opposed to something else that might be
> > > > running
> > > > > > on
> > > > > > > > the
> > > > > > > > > system under test?  _fput() is called by the kernel
> > when
> > > > the
> > > > > > last
> > > > > > > > 
> > > > > > > > Please see the log:
> > > > > > > > 
> > > > > > > > 
> > > > > > 
> > > > 
> > https://lore.kernel.org/linux-scsi/3af3666920e7d46f8f0c6d88612f143ffabc743c.camel@unipv.it/2-log_ming.zip
> > > > > > > > 
> > > > > > > > Which is collected by:
> > > > > > > > 
> > > > > > > > #!/bin/sh
> > > > > > > > MAJ=$1
> > > > > > > > MIN=$2
> > > > > > > > MAJ=$(( $MAJ << 20 ))
> > > > > > > > DEV=$(( $MAJ | $MIN ))
> > > > > > > > 
> > > > > > > > /usr/share/bcc/tools/trace -t -C \
> > > > > > > >     't:block:block_rq_issue (args->dev == '$DEV') "%s %d
> > > > %d",
> > > > > > args-
> > > > > > > > >rwbs, args->sector, args->nr_sector' \
> > > > > > > >     't:block:block_rq_insert (args->dev == '$DEV') "%s
> > %d
> > > > %d",
> > > > > > args-
> > > > > > > > >rwbs, args->sector, args->nr_sector'
> > > > > > > > 
> > > > > > > > $MAJ:$MIN points to the USB storage disk.
> > > > > > > > 
> > > > > > > > From the above IO trace, there are two write paths, one
> > is
> > > > from
> > > > > > cp,
> > > > > > > > another is from writeback wq.
> > > > > > > > 
> > > > > > > > The stackcount trace[1] is consistent with the IO trace
> > log
> > > > > > since it
> > > > > > > > only shows two IO paths, that is why I concluded that
> > the
> > > > write
> > > > > > done
> > > > > > > > via
> > > > > > > > ext4_release_file() is from 'cp'.
> > > > > > > > 
> > > > > > > > [1] 
> > > > > > > > 
> > > > > > 
> > > > 
> > https://lore.kernel.org/linux-scsi/320b315b9c87543d4fb919ecbdf841596c8fbcea.camel@unipv.it/2-log_ming_20191129_150609.zip
> > > > > > > > 
> > > > > > > > > reference to a struct file is
> > released.  (Specifically, if
> > > > you
> > > > > > > > have a
> > > > > > > > > fd which is dup'ed, it's only when the last fd
> > > > corresponding
> > > > > > to
> > > > > > > > the
> > > > > > > > > struct file is closed, and the struct file is about to
> > be
> > > > > > > > released,
> > > > > > > > > does the file system's f_ops->release function get
> > > > called.)
> > > > > > > > > 
> > > > > > > > > So the first question I'd ask is whether there is
> > anything
> > > > > > else
> > > > > > > > going
> > > > > > > > > on the system, and whether the writes are happening to
> > the
> > > > USB
> > > > > > > > thumb
> > > > > > > > > drive, or to some other storage device.  And if there
> > is
> > > > > > something
> > > > > > > > > else which is writing to the pendrive, maybe that's
> > why no
> > > > one
> > > > > > > > else
> > > > > > > > > has been able to reproduce the OP's complaint....
> > > > > > > > 
> > > > > > > > OK, we can ask Andrea to confirm that via the following
> > > > trace,
> > > > > > which
> > > > > > > > will add pid/comm info in the stack trace:
> > > > > > > > 
> > > > > > > > /usr/share/bcc/tools/stackcount blk_mq_sched_request_inserted
> > > > > > > > 
> > > > > > > > Andrea, could you collect the above log again when
> > running
> > > > > > new/bad
> > > > > > > > kernel for confirming if the write done by
> > > > ext4_release_file()
> > > > > > is
> > > > > > > > from
> > > > > > > > the 'cp' process?
> > > > > > > 
> > > > > > > You can find the stackcount log attached. It has been
> > produced
> > > > by:
> > > > > > > 
> > > > > > > - /usr/share/bcc/tools/stackcount
> > > > blk_mq_sched_request_inserted >
> > > > > > trace.log
> > > > > > > - wait some seconds
> > > > > > > - run the test (1 copy trial), wait for the test to
> > finish,
> > > > wait
> > > > > > some seconds
> > > > > > > - stop the trace (ctrl+C)
> > > > > > 
> > > > > > Thanks for collecting the log, looks your 'stackcount'
> > doesn't
> > > > > > include
> > > > > > comm/pid info, seems there is difference between your bcc
> > and
> > > > > > my bcc in fedora 30.
> > > > > > 
> > > > > > Could you collect above log again via the following command?
> > > > > > 
> > > > > > > /usr/share/bcc/tools/stackcount -P -K t:block:block_rq_insert
> > > > > > 
> > > > > > which will show the comm/pid info.
> > > > > 
> > > > > ok, attached (trace_20191219.txt), the test (1 trial) took
> > 3684
> > > > > seconds.
> > > > 
> > > > From the above trace:
> > > > 
> > > >   b'blk_mq_sched_request_inserted'
> > > >   b'blk_mq_sched_request_inserted'
> > > >   b'dd_insert_requests'
> > > >   b'blk_mq_sched_insert_requests'
> > > >   b'blk_mq_flush_plug_list'
> > > >   b'blk_flush_plug_list'
> > > >   b'io_schedule_prepare'
> > > >   b'io_schedule'
> > > >   b'rq_qos_wait'
> > > >   b'wbt_wait'
> > > >   b'__rq_qos_throttle'
> > > >   b'blk_mq_make_request'
> > > >   b'generic_make_request'
> > > >   b'submit_bio'
> > > >   b'ext4_io_submit'
> > > >   b'ext4_writepages'
> > > >   b'do_writepages'
> > > >   b'__filemap_fdatawrite_range'
> > > >   b'ext4_release_file'
> > > >   b'__fput'
> > > >   b'task_work_run'
> > > >   b'exit_to_usermode_loop'
> > > >   b'do_syscall_64'
> > > >   b'entry_SYSCALL_64_after_hwframe'
> > > >     b'cp' [19863]
> > > >     4400
> > > > 
> > > > So this write is clearly from 'cp' process, and it should be one
> > > > ext4 fs issue.
> > > > 
> > > > Ted, can you take a look at this issue?
> > > > 
> > > > > 
> > > > > > > I also tried the usual test with btrfs and xfs. Btrfs
> > behavior
> > > > > > looks
> > > > > > > "good". xfs seems sometimes better, sometimes worse, I
> > would
> > > > say.
> > > > > > I
> > > > > > > don't know if it matters, anyway you can also find the
> > results
> > > > of
> > > > > > the
> > > > > > > two tests (100 trials each). Basically, btrfs is always
> > > > between 68
> > > > > > and
> > > > > > > 89 seconds, with a cyclicity (?) with "period=2 trials".
> > xfs
> > > > looks
> > > > > > > almost always very good (63-65s), but sometimes "bad"
> > (>300s).
> > > > > > 
> > > > > > If you are interested in digging into this one, the
> > following
> > > > trace
> > > > > > should be helpful:
> > > > > > 
> > > > > > 
> > > > 
> > https://lore.kernel.org/linux-scsi/f38db337cf26390f7c7488a0bc2076633737775b.camel@unipv.it/T/#m5aa008626e07913172ad40e1eb8e5f2ffd560fc6
> > > > > > 
> > > > > 
> > > > > Attached:
> > > > > - trace_xfs_20191223.txt (7 trials, then aborted while doing
> > the
> > > > 8th),
> > > > > times to complete:
> > > > > 64s
> > > > > 63s
> > > > > 64s
> > > > > 833s
> > > > > 1105s
> > > > > 63s
> > > > > 64s
> > > > 
> > > > oops, looks we have to collect io insert trace with the
> > following
> > > > bcc script
> > > > on xfs for confirming if there is similar issue with ext4, could
> > you
> > > > run
> > > > it again on xfs? And only post the trace done in case of slow
> > 'cp'.
> > > > 
> > > > 
> > > > #!/bin/sh
> > > > 
> > > > MAJ=$1
> > > > MIN=$2
> > > > MAJ=$(( $MAJ << 20 ))
> > > > DEV=$(( $MAJ | $MIN ))
> > > > 
> > > > /usr/share/bcc/tools/trace -t -C \
> > > >     't:block:block_rq_issue (args->dev == '$DEV') "%s %d %d", args->rwbs, args->sector, args->nr_sector' \
> > > >     't:block:block_rq_insert (args->dev == '$DEV') "%s %d %d", args->rwbs, args->sector, args->nr_sector'
> > > > 
> > > > 
> > > here it is (1 trial, 313 seconds to finish)
> > 
> > The above log shows similar issue with ext4 since there is another
> > writeback IO path from 'cp' process. And the following trace can
> > show if
> > it is same with ext4's issue:
> > 
> > /usr/share/bcc/tools/stackcount -P -K t:block:block_rq_insert
> 
> sorry, also here please tell me which conditions should I use to run
> the test (ext4 or xfs? slow run or not important?)

Maybe not needed.

After thinking about the issue further, it looks highly related to the
removal of ioc_batching and BDI congestion in blk-mq.

When there are two writeback paths, the original (legacy) block layer
can mark the 'cp' process, which writes pages during close(), as
'batching'; page writes from the writeback wq context are then blocked.
That is, there is effectively a single writeback IO path even though two
contexts are writing pages, so write order can be maintained.  See the
following comment in the original __get_request():

	/*
	 * The queue will fill after this allocation, so set
	 * it as full, and mark this process as "batching".
	 * This process will be allowed to complete a batch of
	 * requests, others will be blocked.
	 */

This behavior can be seen in the IO trace taken on the old kernel with
the legacy block IO path:

https://lore.kernel.org/linux-scsi/f82fd5129e3dcacae703a689be60b20a7fedadf6.camel@unipv.it/2-log_ming_20191128_182751.zip

IMO, we need to figure out a solution in blk-mq to fix this issue,
since HDD performance might be hurt in this situation.
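
When comparing the old and new kernels, it may help to record which I/O
scheduler the disk is actually using. The sketch below only parses the
active (bracketed) name out of a sysfs scheduler string; the function
name is made up for illustration, and sdX is a placeholder for the USB
disk.

```shell
#!/bin/sh
# Print the active I/O scheduler from a sysfs "scheduler" string, i.e. the
# name enclosed in square brackets, e.g. "[mq-deadline] kyber bfq none".
# Prints nothing if no bracketed name is present.
active_scheduler() {
    printf '%s\n' "$1" | sed -n 's/.*\[\([^]]*\)].*/\1/p'
}

# Hypothetical usage on a live system (sdX is a placeholder):
#   active_scheduler "$(cat /sys/block/sdX/queue/scheduler)"
```

This says nothing about ordering by itself; it is only a way to note the
configuration next to each trace.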

Thanks,
Ming



* Re: AW: Slow I/O on USB media after commit f664a3cc17b7d0a2bc3b3ab96181e1029b0ec0e6
  2019-12-24  1:27                                                                                                                         ` Ming Lei
  2019-12-24  6:49                                                                                                                           ` Andrea Vai
@ 2019-12-24  8:51                                                                                                                           ` Andrea Vai
  2019-12-24  9:35                                                                                                                             ` Ming Lei
  2019-12-25  5:17                                                                                                                           ` Theodore Y. Ts'o
  2 siblings, 1 reply; 102+ messages in thread
From: Andrea Vai @ 2019-12-24  8:51 UTC (permalink / raw)
  To: Ming Lei, Theodore Y. Ts'o
  Cc: Schmid, Carsten, Finn Thain, Damien Le Moal, Alan Stern,
	Jens Axboe, Johannes Thumshirn, USB list, SCSI development list,
	Himanshu Madhani, Hannes Reinecke, Omar Sandoval,
	Martin K. Petersen, Greg KH, Hans Holmberg,
	Kernel development list, linux-ext4, linux-fsdevel

[-- Attachment #1: Type: text/plain, Size: 4324 bytes --]

Il giorno mar, 24/12/2019 alle 09.27 +0800, Ming Lei ha scritto:
> Hi Ted,
> 
> On Mon, Dec 23, 2019 at 02:53:01PM -0500, Theodore Y. Ts'o wrote:
> > On Mon, Dec 23, 2019 at 07:45:57PM +0100, Andrea Vai wrote:
> > > basically, it's:
> > > 
> > >   mount UUID=$uuid /mnt/pendrive
> > >   SECONDS=0
> > >   cp $testfile /mnt/pendrive
> > >   umount /mnt/pendrive
> > >   tempo=$SECONDS
> > > 
> > > and it copies one file only. Anyway, you can find the whole
> script
> > > attached.
> > 
> > OK, so whether we are doing the writeback at the end of cp, or
> when
> > you do the umount, it's probably not going to make any
> difference.  We
> > can get rid of the stack trace in question by changing the script
> to
> > be basically:
> > 
> > mount UUID=$uuid /mnt/pendrive
> > SECONDS=0
> > rm -f /mnt/pendrive/$testfile
> > cp $testfile /mnt/pendrive
> > umount /mnt/pendrive
> > tempo=$SECONDS
> > 
> > I predict if you do that, you'll see that all of the time is spent
> in
> > the umount, when we are trying to write back the file.
> > 
> > I really don't think then this is a file system problem at
> all.  It's
> > just that USB I/O is slow, for whatever reason.  We'll see a stack
> > trace in the writeback code waiting for the I/O to be completed,
> but
> > that doesn't mean that the root cause is in the writeback code or
> in
> > the file system which is triggering the writeback.
> 
> Wrt. the slow write on this usb storage, it is caused by two
> writeback
> path, one is the writeback wq, another is from ext4_release_file()
> which
> is triggered from exit_to_usermode_loop().
> 
> When the two write path is run concurrently, the sequential write
> order
> is broken, then write performance drops much on this particular usb
> storage.
> 
> The ext4_release_file() should be run from read() or write() syscall
> if
> Fedora 30's 'cp' is implemented correctly. IMO, it isn't expected
> behavior
> for ext4_release_file() to be run thousands of times when just
> running 'cp' once, see comment of ext4_release_file():
> 
> 	/*
> 	 * Called when an inode is released. Note that this is
> different
> 	 * from ext4_file_open: open gets called at every open, but
> release
> 	 * gets called only when /all/ the files are closed.
> 	 */
> 	static int ext4_release_file(struct inode *inode, struct file
> *filp)
> 
> > 
> > I suspect the next step is use a blktrace, to see what kind of I/O
> is
> > being sent to the USB drive, and how long it takes for the I/O to
> > complete.  You might also try to capture the output of "iostat -x
> 1"
> > while the script is running, and see what the difference might be
> > between a kernel version that has the problem and one that
> doesn't,
> > and see if that gives us a clue.
> 
> That isn't necessary, given we have concluded that the bad write
> performance is caused by broken write order.
> 
> > 
> > > > And then send me
> > > btw, please tell me if "me" means only you or I cc: all the
> > > recipients, as usual
> > 
> > Well, I don't think we know what the root cause is.  Ming is
> focusing
> > on that stack trace, but I think it's a red herring.....  And if
> it's
> > not a file system problem, then other people will be best suited
> to
> > debug the issue.
> 
> So far, the reason points to the extra writeback path from
> exit_to_usermode_loop().
> If it is not from close() syscall, the issue should be related with
> file reference
> count. If it is from close() syscall, the issue might be in 'cp''s
> implementation.
> 
> Andrea, please collect the following log or the strace log requested
> by Ted, then
> we can confirm if the extra writeback is from close() or
> read/write() syscall:
> 
> # pass PID of 'cp' to this script
> #!/bin/sh
> PID=$1
> /usr/share/bcc/tools/trace -P $PID  -t -C \
>     't:block:block_rq_insert "%s %d %d", args->rwbs, args->sector,
> args->nr_sector' \
>     't:syscalls:sys_exit_close ' \
>     't:syscalls:sys_exit_read ' \
>     't:syscalls:sys_exit_write '

Meanwhile, I tried to run the test and obtained an error (...usage:
trace [-h] [-b BUFFER_PAGES] [-p PID]...), so I assumed the "-P" should
be "-p", corrected it, and obtained the attached log with ext4 and a
slow copy (2482 seconds) by doing:

- start the test
- look at the cp pid
- run the trace
- wait for the test to finish
- stop the trace.
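
The corrected invocation can be sketched roughly as follows, assuming
the installed bcc trace tool takes the PID via lowercase -p as the usage
message above indicates. The trace_cp name, the TRACE variable and the
optional wrapper argument are illustrative only, not part of bcc.

```shell
#!/bin/sh
# Location of the bcc trace tool; override TRACE if it lives elsewhere.
TRACE=${TRACE:-/usr/share/bcc/tools/trace}

# trace_cp PID [WRAPPER...]
# Trace block request inserts and read/write/close syscall exits for one PID.
# An optional wrapper command is placed in front of the tracer: "echo"
# prints the command line instead of running it, "sudo" runs it as root.
trace_cp() {
    pid=$1
    shift
    "$@" "$TRACE" -p "$pid" -t -C \
        't:block:block_rq_insert "%s %d %d", args->rwbs, args->sector, args->nr_sector' \
        't:syscalls:sys_exit_close' \
        't:syscalls:sys_exit_read' \
        't:syscalls:sys_exit_write'
}

# Dry run:  trace_cp 1234 echo
# Real run: trace_cp "$(pgrep -n -x cp)" sudo
```

The pgrep call picks the newest process named exactly cp, which matches
the "look at the cp pid" step only if no other cp is running.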

Thanks,
Andrea

[-- Attachment #2: 20191224_test_ming.zip --]
[-- Type: application/zip, Size: 20830 bytes --]


* Re: AW: Slow I/O on USB media after commit f664a3cc17b7d0a2bc3b3ab96181e1029b0ec0e6
  2019-12-24  8:51                                                                                                                           ` Andrea Vai
@ 2019-12-24  9:35                                                                                                                             ` Ming Lei
  0 siblings, 0 replies; 102+ messages in thread
From: Ming Lei @ 2019-12-24  9:35 UTC (permalink / raw)
  To: Andrea Vai
  Cc: Theodore Y. Ts'o, Schmid, Carsten, Finn Thain,
	Damien Le Moal, Alan Stern, Jens Axboe, Johannes Thumshirn,
	USB list, SCSI development list, Himanshu Madhani,
	Hannes Reinecke, Omar Sandoval, Martin K. Petersen, Greg KH,
	Hans Holmberg, Kernel development list, linux-ext4,
	linux-fsdevel

On Tue, Dec 24, 2019 at 09:51:16AM +0100, Andrea Vai wrote:
> Il giorno mar, 24/12/2019 alle 09.27 +0800, Ming Lei ha scritto:
> > Hi Ted,
> > 
> > On Mon, Dec 23, 2019 at 02:53:01PM -0500, Theodore Y. Ts'o wrote:
> > > On Mon, Dec 23, 2019 at 07:45:57PM +0100, Andrea Vai wrote:
> > > > basically, it's:
> > > > 
> > > >   mount UUID=$uuid /mnt/pendrive
> > > >   SECONDS=0
> > > >   cp $testfile /mnt/pendrive
> > > >   umount /mnt/pendrive
> > > >   tempo=$SECONDS
> > > > 
> > > > and it copies one file only. Anyway, you can find the whole
> > script
> > > > attached.
> > > 
> > > OK, so whether we are doing the writeback at the end of cp, or
> > when
> > > you do the umount, it's probably not going to make any
> > difference.  We
> > > can get rid of the stack trace in question by changing the script
> > to
> > > be basically:
> > > 
> > > mount UUID=$uuid /mnt/pendrive
> > > SECONDS=0
> > > rm -f /mnt/pendrive/$testfile
> > > cp $testfile /mnt/pendrive
> > > umount /mnt/pendrive
> > > tempo=$SECONDS
> > > 
> > > I predict if you do that, you'll see that all of the time is spent
> > in
> > > the umount, when we are trying to write back the file.
> > > 
> > > I really don't think then this is a file system problem at
> > all.  It's
> > > just that USB I/O is slow, for whatever reason.  We'll see a stack
> > > trace in the writeback code waiting for the I/O to be completed,
> > but
> > > that doesn't mean that the root cause is in the writeback code or
> > in
> > > the file system which is triggering the writeback.
> > 
> > Wrt. the slow write on this usb storage, it is caused by two
> > writeback
> > path, one is the writeback wq, another is from ext4_release_file()
> > which
> > is triggered from exit_to_usermode_loop().
> > 
> > When the two write path is run concurrently, the sequential write
> > order
> > is broken, then write performance drops much on this particular usb
> > storage.
> > 
> > The ext4_release_file() should be run from read() or write() syscall
> > if
> > Fedora 30's 'cp' is implemented correctly. IMO, it isn't expected
> > behavior
> > for ext4_release_file() to be run thousands of times when just
> > running 'cp' once, see comment of ext4_release_file():
> > 
> > 	/*
> > 	 * Called when an inode is released. Note that this is
> > different
> > 	 * from ext4_file_open: open gets called at every open, but
> > release
> > 	 * gets called only when /all/ the files are closed.
> > 	 */
> > 	static int ext4_release_file(struct inode *inode, struct file
> > *filp)
> > 
> > > 
> > > I suspect the next step is use a blktrace, to see what kind of I/O
> > is
> > > being sent to the USB drive, and how long it takes for the I/O to
> > > complete.  You might also try to capture the output of "iostat -x
> > 1"
> > > while the script is running, and see what the difference might be
> > > between a kernel version that has the problem and one that
> > doesn't,
> > > and see if that gives us a clue.
> > 
> > That isn't necessary, given we have concluded that the bad write
> > performance is caused by broken write order.
> > 
> > > 
> > > > > And then send me
> > > > btw, please tell me if "me" means only you or I cc: all the
> > > > recipients, as usual
> > > 
> > > Well, I don't think we know what the root cause is.  Ming is
> > focusing
> > > on that stack trace, but I think it's a red herring.....  And if
> > it's
> > > not a file system problem, then other people will be best suited
> > to
> > > debug the issue.
> > 
> > So far, the reason points to the extra writeback path from
> > exit_to_usermode_loop().
> > If it is not from close() syscall, the issue should be related with
> > file reference
> > count. If it is from close() syscall, the issue might be in 'cp''s
> > implementation.
> > 
> > Andrea, please collect the following log or the strace log requested
> > by Ted, then
> > we can confirm if the extra writeback is from close() or
> > read/write() syscall:
> > 
> > # pass PID of 'cp' to this script
> > #!/bin/sh
> > PID=$1
> > /usr/share/bcc/tools/trace -P $PID  -t -C \
> >     't:block:block_rq_insert "%s %d %d", args->rwbs, args->sector,
> > args->nr_sector' \
> >     't:syscalls:sys_exit_close ' \
> >     't:syscalls:sys_exit_read ' \
> >     't:syscalls:sys_exit_write '
> 
> Meanwhile, I tried to run the test and obtained an error (...usage:
> trace [-h] [-b BUFFER_PAGES] [-p PID]...), so assumed the "-P" should
> be "-p", corrected and obtained the attached log with ext4 and a slow
> copy (2482 seconds) by doing:
> 
> - start the test
> - look at the cp pid
> - run the trace
> - wait for the test to finish
> - stop the trace.

The log shows that all IO submission comes from the close() syscall, so
the fs code is fine. I gave the reason for this issue in my last email:

https://lore.kernel.org/linux-scsi/e3dc2a3e0221c0a0beb91172ba2bff1f6acc0cb7.camel@unipv.it/T/#m845caca2969da5676516c35dc0c3528a79beb886

Thanks, 
Ming



* Re: AW: Slow I/O on USB media after commit f664a3cc17b7d0a2bc3b3ab96181e1029b0ec0e6
  2019-12-24  1:27                                                                                                                         ` Ming Lei
  2019-12-24  6:49                                                                                                                           ` Andrea Vai
  2019-12-24  8:51                                                                                                                           ` Andrea Vai
@ 2019-12-25  5:17                                                                                                                           ` Theodore Y. Ts'o
  2019-12-26  2:27                                                                                                                             ` Ming Lei
  2 siblings, 1 reply; 102+ messages in thread
From: Theodore Y. Ts'o @ 2019-12-25  5:17 UTC (permalink / raw)
  To: Ming Lei
  Cc: Andrea Vai, Schmid, Carsten, Finn Thain, Damien Le Moal,
	Alan Stern, Jens Axboe, Johannes Thumshirn, USB list,
	SCSI development list, Himanshu Madhani, Hannes Reinecke,
	Omar Sandoval, Martin K. Petersen, Greg KH, Hans Holmberg,
	Kernel development list, linux-ext4, linux-fsdevel

On Tue, Dec 24, 2019 at 09:27:07AM +0800, Ming Lei wrote:
> The ext4_release_file() should be run from read() or write() syscall if
> Fedora 30's 'cp' is implemented correctly. IMO, it isn't expected behavior
> for ext4_release_file() to be run thousands of times when just
> running 'cp' once, see comment of ext4_release_file():

What's your evidence of that?  As opposed to the writeback taking a
long time, leading to the *one* call of ext4_release_file taking a
long time?  If it's a big file, we might very well be calling
ext4_writepages multiple times, from a single call to
__filemap_fdatawrite_range().

You confused me mightily with that assertion, and that caused me to make
assumptions that cp was doing something crazy.  But I'm quite convinced
now that this is almost certainly not what is happening.

> > I suspect the next step is use a blktrace, to see what kind of I/O is
> > being sent to the USB drive, and how long it takes for the I/O to
> > complete.  You might also try to capture the output of "iostat -x 1"
> > while the script is running, and see what the difference might be
> > between a kernel version that has the problem and one that doesn't,
> > and see if that gives us a clue.
> 
> That isn't necessary, given we have concluded that the bad write
> performance is caused by broken write order.

I didn't see any evidence of that from what I had in my inbox, so I
went back to the mailing list archives to figure out what you were
talking about.  Part of the problem is this has been a very
long-spanning thread, and I had deleted from my inbox all of the parts
relating to the MQ scheduler since that was clearly Not My Problem.  :-)

So, summarizing most of the thread.  The problem started when we
removed the legacy I/O scheduler, since we are now only using the MQ
scheduler.  What the kernel is sending is long writes (240 sectors),
but it is being sent as an interleaved stream of two sequential
writes.  This particular pendrive can't handle this workload, because
it has a very simplistic Flash Translation Layer.  Now, this is not
*broken*, from a storage perspective; it's just that it's more than
the simple little brain of this particular pen drive can handle.
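
One rough way to quantify "interleaved" from an I/O trace is to count
how many requests start exactly where the previous one ended. This is a
minimal sketch that assumes the trace has already been reduced to
"sector nr_sector" pairs, one request per line in issue order; pulling
those two columns out of a bcc log is left out because the column
layout depends on the bcc version. The function name is illustrative.

```shell
#!/bin/sh
# Read "sector nr_sector" pairs (one request per line, in issue order) and
# print "<sequential> <total>". A request counts as sequential when it
# starts exactly at the sector where the previous request ended.
count_sequential() {
    awk 'NR > 1 && $1 == prev_end { seq++ }
         { prev_end = $1 + $2; total++ }
         END { printf "%d %d\n", seq, total }'
}

# A single stream of 240-sector writes is fully sequential:
#   printf "0 240\n240 240\n480 240\n" | count_sequential
# Two such streams interleaved one-for-one have no sequential pairs:
#   printf "0 240\n100000 240\n240 240\n100240 240\n" | count_sequential
```

The first example prints "2 3" and the second prints "0 4", even though
each of the two interleaved streams is perfectly sequential on its own.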

Previously, with a single queue, and especially since the queue depth
supported by this pen drive is 1, the elevator algorithm would sort
the I/O requests so that it would be mostly sequential, and this
wouldn't be much of a problem.  However, once the legacy I/O stack was
removed, the MQ stack is designed so that we don't have to take a
global lock in order to submit an I/O request.  That also means that
we can't do a full elevator sort since that would require locking all
of the queues.

This is not a problem, since HDD's generally have a 16 deep queue, and
SSD's have a super-deep queue depth since they get their speed via
parallel writes to different flash chips.  Unfortunately, it *is* a
problem for super primitive USB sticks.

> So far, the reason points to the extra writeback path from exit_to_usermode_loop().
> If it is not from close() syscall, the issue should be related with file reference
> count. If it is from close() syscall, the issue might be in 'cp''s
> implementation.

Oh, it's probably from the close system call; and it's *only* from a
single close system call.  Because there is the auto delayed
allocation resolution to protect against buggy userspace, under
certain circumstances, as I explained earlier, we force a full
writeout on a close for a file descriptor which was opened with an
O_TRUNC.  This is by *design*, since we are trying to protect against
buggy userspace (application programmers vastly outnumber file system
programmers, and far too many of them want O_PONY).  This is Working
As Intended.

You can disable it by deleting the test file before the cp:

    rm -f /mnt/pendrive/$testfile

Or you can disable the protection against stupid userspace by using
the noauto_da_alloc mount option.  (But then if you have a buggy game
program which writes the top-ten score file by using open(2) w/
O_TRUNC, and then said program closes the OpenGL library, and the
proprietary 3rd party binary-only video driver wedges the X server
requiring a hard reset to recover, and the top-ten score file becomes
a zero-length file, don't come crying to me...  Or if a graphical text
editor forgets to use fsync(2) before saving a source file you spent
hours working on, and then the system crashes at exactly the wrong
moment and your source file becomes zero-length, again, don't come
crying to me.  Blame the stupid application programmer which wrote
your text editor who decided to skip the fsync(2), or who decided that
copying the ACL's and xattrs was Too Hard(tm), and so opening the file
with O_TRUNC and rewriting the file in place was easier for the
application programmer.)
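
For contrast, the pattern that avoids truncating in place can be
sketched in shell as: write a temporary file in the same directory,
flush it, then rename it over the target. This is only an illustration
under assumptions: safe_replace is a made-up name, and "sync FILE"
flushing just that file relies on a reasonably recent GNU coreutils
(older versions ignore the argument and sync everything).

```shell
#!/bin/sh
# safe_replace TARGET
# Replace TARGET with the data read from stdin without ever truncating
# TARGET in place. The temporary file lives next to TARGET so that the
# final mv is a rename within one file system.
safe_replace() {
    target=$1
    tmp="$target.tmp.$$"
    cat > "$tmp" || { rm -f "$tmp"; return 1; }
    # Flush the new data before it becomes visible under the final name.
    sync "$tmp" || { rm -f "$tmp"; return 1; }
    mv -f "$tmp" "$target"
}

# Usage: generate_scores | safe_replace /path/to/top-ten
```

Note that this sketch does not carry over ACLs or xattrs from the old
file, which is exactly the extra work mentioned above.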

In any case, I think this is all working as intended.  The MQ I/O
stack is optimized for modern HDD and SSD's, and especially SSD's.
And the file system assumes that parallel sequential writes,
especially if they are large, are really not a big deal, since that's
what NCQ or massive parallelism of pretty much all SSD's want.
(Again, ignoring the legacy of crappy flash drives.)

You can argue with storage stack folks about whether we need to have a
super-dumb mode, using a global lock and a global elevator scheduler,
for slow, super-crappy flash if you want.  I'm going to stay out of
that argument.

					- Ted


* Re: AW: Slow I/O on USB media after commit f664a3cc17b7d0a2bc3b3ab96181e1029b0ec0e6
  2019-12-25  5:17                                                                                                                           ` Theodore Y. Ts'o
@ 2019-12-26  2:27                                                                                                                             ` Ming Lei
  2019-12-26  3:30                                                                                                                               ` Theodore Y. Ts'o
  0 siblings, 1 reply; 102+ messages in thread
From: Ming Lei @ 2019-12-26  2:27 UTC (permalink / raw)
  To: Theodore Y. Ts'o
  Cc: Andrea Vai, Schmid, Carsten, Finn Thain, Damien Le Moal,
	Alan Stern, Jens Axboe, Johannes Thumshirn, USB list,
	SCSI development list, Himanshu Madhani, Hannes Reinecke,
	Omar Sandoval, Martin K. Petersen, Greg KH, Hans Holmberg,
	Kernel development list, linux-ext4, linux-fsdevel

On Wed, Dec 25, 2019 at 12:17:22AM -0500, Theodore Y. Ts'o wrote:
> On Tue, Dec 24, 2019 at 09:27:07AM +0800, Ming Lei wrote:
> > The ext4_release_file() should be run from read() or write() syscall if
> > Fedora 30's 'cp' is implemented correctly. IMO, it isn't expected behavior
> > for ext4_release_file() to be run thousands of times when just
> > running 'cp' once, see comment of ext4_release_file():
> 
> What's your evidence of that?  As opposed to the writeback taking a
> long time, leading to the *one* call of ext4_release_file taking a
> long time?  If it's a big file, we might very well be calling
> ext4_writepages multiple times, from a single call to
> __filemap_fdatawrite_range().
> 
> You confused me mightily with that assertion, and that caused me to make
> assumptions that cp was doing something crazy.  But I'm quite convinced
> now that this is almost certainly not what is happening.
> 
> > > I suspect the next step is use a blktrace, to see what kind of I/O is
> > > being sent to the USB drive, and how long it takes for the I/O to
> > > complete.  You might also try to capture the output of "iostat -x 1"
> > > while the script is running, and see what the difference might be
> > > between a kernel version that has the problem and one that doesn't,
> > > and see if that gives us a clue.
> > 
> > That isn't necessary, given we have concluded that the bad write
> > performance is caused by broken write order.
> 
> I didn't see any evidence of that from what I had in my inbox, so I
> went back to the mailing list archives to figure out what you were
> talking about.  Part of the problem is this has been a very
> long-spanning thread, and I had deleted from my inbox all of the parts
> relating to the MQ scheduler since that was clearly Not My Problem.  :-)
> 
> So, summarizing the most of the thread.  The problem started when we
> removed the legacy I/O scheduler, since we are now only using the MQ
> scheduler.  What the kernel is sending is long writes (240 sectors),
> but it is being sent as an interleaved stream of two sequential
> writes.  This particular pendrive can't handle this workload, because
> it has a very simplistic Flash Translation Layer.  Now, this is not
> *broken*, from a storage perspective; it's just that it's more than
> the simple little brain of this particular pen drive can handle.
> 
> Previously, with a single queue, and specially since the queue depth
> supported by this pen drive is 1, the elevator algorithm would sort
> the I/O requests so that it would be mostly sequential, and this
> wouldn't be much of a problem.  However, once the legacy I/O stack was
> removed, the MQ stack is designed so that we don't have to take a
> global lock in order to submit an I/O request.  That also means that
> we can't do a full elevator sort since that would require locking all
> of the queues.
> 
> This is not a problem, since HDD's generally have a 16 deep queue, and
> SSD's have a super-deep queue depth since they get their speed via
> parallel writes to different flash chips.  Unfortunately, it *is* a
> problem for super primitive USB sticks.
> 
> > So far, the reason points to the extra writeback path from exit_to_usermode_loop().
> > If it is not from close() syscall, the issue should be related with file reference
> > count. If it is from close() syscall, the issue might be in 'cp''s
> > implementation.
> 
> Oh, it's probably from the close system call; and it's *only* from a
> single close system call.  Because there is the auto delayed

Right. It looks like I misinterpreted the stackcount log; the IOs are
submitted from a single close syscall.

> allocation resolution to protect against buggy userspace, under
> certain circumstances, as I explained earlier, we force a full
> writeout on a close for a file descriptor which was opened with an
> O_TRUNC.  This is by *design*, since we are trying to protect against
> buggy userspace (application programmers vastly outnumber file system
> programmers, and far too many of them want O_PONY).  This is Working
> As Intended.
> 
> You can disable it by deleting the test file before the cp:
> 
>     rm -f /mnt/pendrive/$testfile
> 
> Or you can disable the protection against stupid userspace by using
> the noauto_da_alloc mount option.  (But then if you have a buggy game
> program which writes the top-ten score file by using open(2) w/
> O_TRUNC, and then said program closes the OpenGL library, and the
> proprietary 3rd party binary-only video driver wedges the X server
> requiring a hard reset to recover, and the top-ten score file becomes
> a zero-length file, don't come crying to me...  Or if a graphical text
> editor forgets to use fsync(2) before saving a source file you spent
> hours working on, and then the system crashes at exactly the wrong
> moment and your source file becomes zero-length, again, don't come
> crying to me.  Blame the stupid application programmer which wrote
> your text editor who decided to skip the fsync(2), or who decided that
> copying the ACL's and xattrs was Too Hard(tm), and so opening the file
> with O_TRUNC and rewriting the file in place was easier for the
> application programmer.)
> 
> In any case, I think this is all working all as intended.  The MQ I/O
> stack is optimized for modern HDD and SSD's, and especially SSD's.
> And the file system assumes that parallel sequential writes,
> especially if they are large, is really not a big deal, since that's
> what NCQ or massive parallelism of pretty much all SSD's want.
> (Again, ignoring the legacy of crappy flash drives.
> 
> You can argue with storage stack folks about whether we need to have
> super-dumb mode for slow, crappy flash which uses a global lock and a
> global elevator scheduler for super-crappy flash if you want.  I'm
> going to stay out of that argument.

As I mentioned in the following link:

https://lore.kernel.org/linux-scsi/20191224084721.GA27248@ming.t460p/

The reason is that ioc_batching and BDI congestion were removed in blk-mq.

So once the queue is congested, multiple sequential writes can proceed
concurrently. Before ioc_batching and BDI congestion were removed, writes
from multiple processes were effectively serialized, so IOs were
dispatched to the drive in strict sequential order.

This can't be an issue for SSDs.

Maybe we need to be careful with HDDs, since the request count in the
scheduler queue is double the in-flight request count, and in theory NCQ
should only cover the 32 in-flight requests. I will find a SATA HDD and
see if a performance drop can be observed in a similar 'cp' test.


Thanks,
Ming



* Re: AW: Slow I/O on USB media after commit f664a3cc17b7d0a2bc3b3ab96181e1029b0ec0e6
  2019-12-26  2:27                                                                                                                             ` Ming Lei
@ 2019-12-26  3:30                                                                                                                               ` Theodore Y. Ts'o
  2019-12-26  8:37                                                                                                                                 ` Ming Lei
       [not found]                                                                                                                                 ` <20200101074310.10904-1-hdanton@sina.com>
  0 siblings, 2 replies; 102+ messages in thread
From: Theodore Y. Ts'o @ 2019-12-26  3:30 UTC (permalink / raw)
  To: Ming Lei
  Cc: Andrea Vai, Schmid, Carsten, Finn Thain, Damien Le Moal,
	Alan Stern, Jens Axboe, Johannes Thumshirn, USB list,
	SCSI development list, Himanshu Madhani, Hannes Reinecke,
	Omar Sandoval, Martin K. Petersen, Greg KH, Hans Holmberg,
	Kernel development list, linux-ext4, linux-fsdevel

On Thu, Dec 26, 2019 at 10:27:02AM +0800, Ming Lei wrote:
> Maybe we need to be careful with HDDs, since the request count in the scheduler
> queue is double the in-flight request count, and in theory NCQ should only
> cover the 32 in-flight requests. I will find a SATA HDD and see whether a
> performance drop can be observed in a similar 'cp' test.

Please try to measure it, but I'd be really surprised if it's
significant with modern HDDs.  That's because they typically have
a queue depth of 16, and a max_sectors_kb of 32767 (i.e., just under
32 MiB).  Short seeks are typically 1-2 ms, with full-stroke seeks
8-10 ms.  Typical sequential write speeds on a 7200 RPM drive are
125-150 MiB/s.  So suppose every other request sent to the HDD is from
the other request stream.  The disk will choose the 8 requests from its
queue that are contiguous, and so it will be writing around 256 MiB,
which will take about 2 seconds.  If it then needs to spend between 1 and
10 ms seeking to another location on the disk, before it writes the
next 256 MiB, the worst case overhead of that seek is 10ms / 2s, or
0.5%.  That may very well be within your measurements' error bars.
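
The estimate above can be checked with a few lines of arithmetic. The
sketch below is only an illustration in Python: seek_overhead() is a
hypothetical helper, and its inputs are the figures quoted in the
paragraph (queue depth 16, about 32 MiB per request, 125-150 MiB/s,
10 ms worst-case seek).

```python
# Back-of-the-envelope check of the seek-overhead estimate.
# seek_overhead() is a hypothetical helper, not part of any kernel or tool;
# all inputs are the figures quoted in the message.

def seek_overhead(queue_depth, request_mib, write_mib_s, seek_ms):
    """Return (MiB per contiguous run, seconds per run, seek/write time ratio)."""
    # Every other queued request belongs to the other stream, so only half
    # of the drive's queue forms one contiguous run.
    run_mib = (queue_depth // 2) * request_mib
    write_s = run_mib / write_mib_s
    ratio = (seek_ms / 1000.0) / write_s
    return run_mib, write_s, ratio


for speed in (125, 150):
    run_mib, write_s, ratio = seek_overhead(16, 32, speed, 10)
    print(f"{speed} MiB/s: {run_mib} MiB per run, {write_s:.2f} s to write, "
          f"worst-case seek overhead {ratio:.2%}")
```

With these inputs a contiguous run is 256 MiB, which takes roughly
1.7-2.0 s to write, so a single 10 ms seek per run costs about half a
percent.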

And of course, note that in real life, we are very *often* writing to
multiple files in parallel, for example, during a "make -j16" while
building the kernel.  Writing a single large file is certainly
something people do (but even there people who are burning a 4G DVD
rip are often browsing the web while they are waiting for it to
complete, and the browser will be writing cache files, etc.).  So
whether we should be stressing over this specific workload is going to
be quite debatable.

						- Ted


* Re: AW: Slow I/O on USB media after commit f664a3cc17b7d0a2bc3b3ab96181e1029b0ec0e6
  2019-12-26  3:30                                                                                                                               ` Theodore Y. Ts'o
@ 2019-12-26  8:37                                                                                                                                 ` Ming Lei
  2020-01-07  7:51                                                                                                                                   ` Andrea Vai
       [not found]                                                                                                                                 ` <20200101074310.10904-1-hdanton@sina.com>
  1 sibling, 1 reply; 102+ messages in thread
From: Ming Lei @ 2019-12-26  8:37 UTC (permalink / raw)
  To: Theodore Y. Ts'o
  Cc: Andrea Vai, Schmid, Carsten, Finn Thain, Damien Le Moal,
	Alan Stern, Jens Axboe, Johannes Thumshirn, USB list,
	SCSI development list, Himanshu Madhani, Hannes Reinecke,
	Omar Sandoval, Martin K. Petersen, Greg KH, Hans Holmberg,
	Kernel development list, linux-ext4, linux-fsdevel

On Wed, Dec 25, 2019 at 10:30:57PM -0500, Theodore Y. Ts'o wrote:
> On Thu, Dec 26, 2019 at 10:27:02AM +0800, Ming Lei wrote:
> > Maybe we need to be careful with HDDs, since the request count in the scheduler
> > queue is double the in-flight request count, and in theory NCQ should only
> > cover the 32 in-flight requests. I will find a SATA HDD and see whether a
> > performance drop can be observed in a similar 'cp' test.
> 
> Please try to measure it, but I'd be really surprised if it's
> significant with modern HDDs.

I just found a machine with AHCI SATA and ran the following xfs
overwrite test:

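#!/bin/bash
# Usage: pass the directory on the filesystem under test as $1.
DIR=$1
# Drop the page cache, dentries and inodes so the run starts cold.
echo 3 > /proc/sys/vm/drop_caches
fio --readwrite=write --filesize=5g --overwrite=1 --filename="$DIR/fiofile" \
        --runtime=60s --time_based --ioengine=psync --direct=0 --bs=4k \
        --iodepth=128 --numjobs=2 --group_reporting=1 --name=overwrite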

The FS is xfs, and the disk is LVM over AHCI SATA with NCQ (depth 32). The
machine was picked from RH Beaker, and this is the only disk in the box.

#lsblk
NAME                            MAJ:MIN RM   SIZE RO TYPE MOUNTPOINT
sda                               8:0    0 931.5G  0 disk 
├─sda1                            8:1    0     1G  0 part /boot
└─sda2                            8:2    0 930.5G  0 part 
  ├─rhel_hpe--ml10gen9--01-root 253:0    0    50G  0 lvm  /
  ├─rhel_hpe--ml10gen9--01-swap 253:1    0   3.9G  0 lvm  [SWAP]
  └─rhel_hpe--ml10gen9--01-home 253:2    0 876.6G  0 lvm  /home


kernel: 3a7ea2c483a53fc ("scsi: provide mq_ops->busy() hook"), which is
the commit immediately before f664a3cc17b7 ("scsi: kill off the legacy IO path").

            |scsi_mod.use_blk_mq=N |scsi_mod.use_blk_mq=Y |
-----------------------------------------------------------
throughput: |244MB/s               |169MB/s               |
-----------------------------------------------------------

A similar result (184MB/s) can be observed on the v5.4 kernel with the same
test steps.
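
To put the table and the v5.4 figure in relative terms, here is a small
Python sketch. relative_drop() is a hypothetical helper. Comparing the
v5.4 number against the legacy-path baseline is an assumption, since
the two were measured on different kernels.

```python
# Relative throughput drop versus the legacy-path baseline.
# relative_drop() is a hypothetical helper; the MB/s values are the
# measurements reported in the message.

def relative_drop(baseline_mb_s, measured_mb_s):
    """Fraction of the baseline throughput that was lost."""
    return (baseline_mb_s - measured_mb_s) / baseline_mb_s


LEGACY_MB_S = 244  # scsi_mod.use_blk_mq=N

for label, mq_mb_s in (("use_blk_mq=Y, same commit", 169), ("v5.4", 184)):
    drop = relative_drop(LEGACY_MB_S, mq_mb_s)
    print(f"{label}: {mq_mb_s} MB/s, {drop:.1%} below {LEGACY_MB_S} MB/s")
```

That works out to a drop of roughly 31% with blk-mq on the same commit,
and roughly 25% for v5.4 against the same baseline.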


> That's because they typically have
> a queue depth of 16, and a max_sectors_kb of 32767 (i.e., just under
> 32 MiB).  Short seeks are typically 1-2 ms, with full-stroke seeks
> 8-10 ms.  Typical sequential write speeds on a 7200 RPM drive are
> 125-150 MiB/s.  So suppose every other request sent to the HDD is from
> the other request stream.  The disk will choose the 8 requests from its
> queue that are contiguous, and so it will be writing around 256 MiB,
> which will take about 2 seconds.  If it then needs to spend between 1 and
> 10 ms seeking to another location on the disk, before it writes the
> next 256 MiB, the worst case overhead of that seek is 10ms / 2s, or
> 0.5%.  That may very well be within your measurements' error bars.

It looks like you assume that the disk seeks only once per 256MB or so
written. That assumption may not hold, given that all the data can be in the
page cache before writeback starts. When two tasks submit IOs concurrently,
the IOs from each task are sequential, and NCQ may reorder the current batch
submitted from the two streams. However, a disk seek may still be needed
for each subsequent batch handled by NCQ.
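
One way to see why the number of seeks per batch matters is a toy model
in which the drive writes one contiguous batch and then pays one seek
before the next. The Python sketch below is only that model, not a
measurement: effective_throughput() is a hypothetical helper, and the
batch sizes, the 150 MiB/s streaming rate and the 10 ms seek are assumed
values.

```python
# Toy model: one seek after every contiguous batch.
# effective_throughput() is a hypothetical helper; batch sizes, streaming
# rate and seek time are assumptions chosen for illustration only.

def effective_throughput(batch_mib, stream_mib_s, seek_ms):
    """MiB/s achieved when every batch of `batch_mib` is followed by one seek."""
    transfer_s = batch_mib / stream_mib_s
    return batch_mib / (transfer_s + seek_ms / 1000.0)


for batch_mib in (256, 16, 1, 0.125):
    rate = effective_throughput(batch_mib, 150, 10)
    print(f"batch {batch_mib:>7} MiB -> {rate:6.1f} MiB/s")
```

In this model a seek per 256 MiB is negligible, while a seek per 1 MiB
batch cuts the rate to about 60 MiB/s, so the outcome depends entirely
on how much contiguous data each batch really contains.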

> And of course, note that in real life, we are very *often* writing to
> multiple files in parallel, for example, during a "make -j16" while
> building the kernel.  Writing a single large file is certainly
> something people do (but even there people who are burning a 4G DVD
> rip are often browsing the web while they are waiting for it to
> complete, and the browser will be writing cache files, etc.).  So
> whether we should be stressing over this specific workload is going to
> be quite debatable.

Thanks, 
Ming



* Re: slow IO on USB media
       [not found]                                                                                                                                 ` <20200101074310.10904-1-hdanton@sina.com>
@ 2020-01-01 13:53                                                                                                                                   ` Ming Lei
  0 siblings, 0 replies; 102+ messages in thread
From: Ming Lei @ 2020-01-01 13:53 UTC (permalink / raw)
  To: Hillf Danton
  Cc: Theodore Y. Ts'o, Andrea Vai, Schmid, Carsten, Finn Thain,
	Damien Le Moal, Alan Stern, Jens Axboe, Johannes Thumshirn,
	USB list, SCSI development list, Himanshu Madhani,
	Hannes Reinecke, Omar Sandoval, Martin K. Petersen, Greg KH,
	Hans Holmberg, Kernel development list, linux-ext4,
	linux-fsdevel

On Wed, Jan 01, 2020 at 03:43:10PM +0800, Hillf Danton wrote:
> 
> On Thu, 26 Dec 2019 16:37:06 +0800 Ming Lei wrote:
> > On Wed, Dec 25, 2019 at 10:30:57PM -0500, Theodore Y. Ts'o wrote:
> > > On Thu, Dec 26, 2019 at 10:27:02AM +0800, Ming Lei wrote:
> > > > Maybe we need to be careful with HDDs, since the request count in the scheduler
> > > > queue is double the in-flight request count, and in theory NCQ should only
> > > > cover the 32 in-flight requests. I will find a SATA HDD and see whether a
> > > > performance drop can be observed in a similar 'cp' test.
> > >
> > > Please try to measure it, but I'd be really surprised if it's
> > > significant with modern HDDs.
> > 
> > I just found a machine with AHCI SATA and ran the following xfs
> > overwrite test:
> > 
> > #!/bin/bash
> > DIR=$1
> > echo 3 > /proc/sys/vm/drop_caches
> > fio --readwrite=write --filesize=5g --overwrite=1 --filename="$DIR/fiofile" \
> >         --runtime=60s --time_based --ioengine=psync --direct=0 --bs=4k \
> > 		--iodepth=128 --numjobs=2 --group_reporting=1 --name=overwrite
> > 
> > The FS is xfs, and the disk is LVM over AHCI SATA with NCQ (depth 32). The
> > machine was picked from RH Beaker, and this is the only disk in the box.
> > 
> > #lsblk
> > NAME                            MAJ:MIN RM   SIZE RO TYPE MOUNTPOINT
> > sda                               8:0    0 931.5G  0 disk
> > ├─sda1                            8:1    0     1G  0 part /boot
> > └─sda2                            8:2    0 930.5G  0 part
> >   ├─rhel_hpe--ml10gen9--01-root 253:0    0    50G  0 lvm  /
> >   ├─rhel_hpe--ml10gen9--01-swap 253:1    0   3.9G  0 lvm  [SWAP]
> >   └─rhel_hpe--ml10gen9--01-home 253:2    0 876.6G  0 lvm  /home
> > 
> > 
> > kernel: 3a7ea2c483a53fc ("scsi: provide mq_ops->busy() hook"), which is
> > the commit immediately before f664a3cc17b7 ("scsi: kill off the legacy IO path").
> > 
> >             |scsi_mod.use_blk_mq=N |scsi_mod.use_blk_mq=Y |
> > -----------------------------------------------------------
> > throughput: |244MB/s               |169MB/s               |
> > -----------------------------------------------------------
> > 
> > A similar result (184MB/s) can be observed on the v5.4 kernel with the same
> > test steps.
> 
> 
> The simple diff makes direct issue of requests also take pending
> requests into account, and takes the normal enqueue-and-dequeue path if any
> pending requests exist.
> 
> Then it sorts requests regardless of the number of hardware queues, in a
> bid to keep requests as sequential as they are. Let's see if the
> added sorting cost makes any sense.
> 
> --->8---
> 
> --- a/block/blk-mq-sched.c
> +++ b/block/blk-mq-sched.c
> @@ -410,6 +410,11 @@ run:
>  		blk_mq_run_hw_queue(hctx, async);
>  }
>  
> +static inline bool blk_mq_sched_hctx_has_pending_rq(struct blk_mq_hw_ctx *hctx)
> +{
> +	return sbitmap_any_bit_set(&hctx->ctx_map);
> +}
> +
>  void blk_mq_sched_insert_requests(struct blk_mq_hw_ctx *hctx,
>  				  struct blk_mq_ctx *ctx,
>  				  struct list_head *list, bool run_queue_async)
> @@ -433,7 +438,8 @@ void blk_mq_sched_insert_requests(struct
>  		 * busy in case of 'none' scheduler, and this way may save
>  		 * us one extra enqueue & dequeue to sw queue.
>  		 */
> -		if (!hctx->dispatch_busy && !e && !run_queue_async) {
> +		if (!hctx->dispatch_busy && !e && !run_queue_async &&
> +		    !blk_mq_sched_hctx_has_pending_rq(hctx)) {
>  			blk_mq_try_issue_list_directly(hctx, list);
>  			if (list_empty(list))
>  				goto out;
> --- a/block/blk-mq.c
> +++ b/block/blk-mq.c
> @@ -1692,7 +1692,7 @@ void blk_mq_flush_plug_list(struct blk_p
>  
>  	list_splice_init(&plug->mq_list, &list);
>  
> -	if (plug->rq_count > 2 && plug->multiple_queues)
> +	if (plug->rq_count > 1)
>  		list_sort(NULL, &list, plug_rq_cmp);
>  
>  	plug->rq_count = 0;

I think you may have misunderstood the cause: the issue is related to
neither MQ nor plugging.

AHCI/SATA is a single-queue device, and for an HDD, IO throughput is very
sensitive to IO order in the case of sequential IO.

The legacy IO path supports ioc batching and BDI queue congestion. When
there is more than one writeback IO path, only one IO submission path may
be active while the others are blocked by ioc batching, so writeback IO is
still dispatched to the disk in strict IO order.

But ioc batching and BDI queue congestion were removed in the conversion to
blk-mq.
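
The effect of batching on dispatch order can be illustrated with a toy
simulation. This Python sketch is not the kernel's algorithm: the
`batch` parameter, the LBA ranges and both helper functions are
hypothetical, and the model ignores any reordering done by the drive
itself. It only counts how often the next dispatched request is not
contiguous with the previous one.

```python
# Toy model of dispatch order for two sequential writers.
# dispatch_order() and count_seeks() are hypothetical helpers; `batch` stands
# in for how many requests one submitter may issue before the other gets a
# turn. Drive-side (NCQ) reordering is deliberately not modelled.

def dispatch_order(streams, batch):
    """Round-robin over the streams, taking `batch` requests per turn."""
    queues = [list(s) for s in streams]
    order = []
    while any(queues):
        for q in queues:
            order.extend(q[:batch])
            del q[:batch]
    return order


def count_seeks(order):
    """Number of dispatches whose LBA does not directly follow the previous one."""
    return sum(1 for prev, cur in zip(order, order[1:]) if cur != prev + 1)


writer_a = range(0, 64)        # sequential LBAs 0..63
writer_b = range(1000, 1064)   # sequential LBAs 1000..1063

for batch in (1, 32, 64):
    seeks = count_seeks(dispatch_order([writer_a, writer_b], batch))
    print(f"batch={batch:2d}: {seeks} non-contiguous transitions")
```

With fully interleaved submission (batch=1) almost every dispatch is
non-contiguous, while letting one submitter run a whole batch at a time
reduces the discontinuities to a handful.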

Please see the following IO trace, taken with the legacy IO request path:

https://lore.kernel.org/linux-scsi/f82fd5129e3dcacae703a689be60b20a7fedadf6.camel@unipv.it/2-log_ming_20191128_182751.zip


Thanks,
Ming



* Re: AW: Slow I/O on USB media after commit f664a3cc17b7d0a2bc3b3ab96181e1029b0ec0e6
  2019-12-26  8:37                                                                                                                                 ` Ming Lei
@ 2020-01-07  7:51                                                                                                                                   ` Andrea Vai
  0 siblings, 0 replies; 102+ messages in thread
From: Andrea Vai @ 2020-01-07  7:51 UTC (permalink / raw)
  To: Ming Lei, Theodore Y. Ts'o
  Cc: Schmid, Carsten, Finn Thain, Damien Le Moal, Alan Stern,
	Jens Axboe, Johannes Thumshirn, USB list, SCSI development list,
	Himanshu Madhani, Hannes Reinecke, Omar Sandoval,
	Martin K. Petersen, Greg KH, Hans Holmberg,
	Kernel development list, linux-ext4, linux-fsdevel

On Thu, 26/12/2019 at 16.37 +0800, Ming Lei wrote:
> On Wed, Dec 25, 2019 at 10:30:57PM -0500, Theodore Y. Ts'o wrote:
> > On Thu, Dec 26, 2019 at 10:27:02AM +0800, Ming Lei wrote:
> > > Maybe we need to be careful with HDDs, since the request count in the scheduler
> > > queue is double the in-flight request count, and in theory NCQ should only
> > > cover the 32 in-flight requests. I will find a SATA HDD and see whether a
> > > performance drop can be observed in a similar 'cp' test.
> > 
> > Please try to measure it, but I'd be really surprised if it's
> > significant with modern HDDs.
> 
> I just found a machine with AHCI SATA and ran the following xfs
> overwrite test:
> 
> #!/bin/bash
> DIR=$1
> echo 3 > /proc/sys/vm/drop_caches
> fio --readwrite=write --filesize=5g --overwrite=1 --filename="$DIR/fiofile" \
>         --runtime=60s --time_based --ioengine=psync --direct=0 --bs=4k \
> 		--iodepth=128 --numjobs=2 --group_reporting=1 --name=overwrite
> 
> The FS is xfs, and the disk is LVM over AHCI SATA with NCQ (depth 32). The
> machine was picked from RH Beaker, and this is the only disk in the box.
> 
> #lsblk
> NAME                            MAJ:MIN RM   SIZE RO TYPE MOUNTPOINT
> sda                               8:0    0 931.5G  0 disk 
> ├─sda1                            8:1    0     1G  0 part /boot
> └─sda2                            8:2    0 930.5G  0 part 
>   ├─rhel_hpe--ml10gen9--01-root 253:0    0    50G  0 lvm  /
>   ├─rhel_hpe--ml10gen9--01-swap 253:1    0   3.9G  0 lvm  [SWAP]
>   └─rhel_hpe--ml10gen9--01-home 253:2    0 876.6G  0 lvm  /home
> 
> 
> kernel: 3a7ea2c483a53fc ("scsi: provide mq_ops->busy() hook"), which is
> the commit immediately before f664a3cc17b7 ("scsi: kill off the legacy IO path").
> 
>             |scsi_mod.use_blk_mq=N |scsi_mod.use_blk_mq=Y |
> -----------------------------------------------------------
> throughput: |244MB/s               |169MB/s               |
> -----------------------------------------------------------
> 
> A similar result (184MB/s) can be observed on the v5.4 kernel with the same
> test steps.
> 
> 
> > That's because they typically have
> > a queue depth of 16, and a max_sectors_kb of 32767 (i.e., just under
> > 32 MiB).  Short seeks are typically 1-2 ms, with full-stroke seeks
> > 8-10 ms.  Typical sequential write speeds on a 7200 RPM drive are
> > 125-150 MiB/s.  So suppose every other request sent to the HDD is from
> > the other request stream.  The disk will choose the 8 requests from its
> > queue that are contiguous, and so it will be writing around 256 MiB,
> > which will take about 2 seconds.  If it then needs to spend between 1 and
> > 10 ms seeking to another location on the disk, before it writes the
> > next 256 MiB, the worst case overhead of that seek is 10ms / 2s, or
> > 0.5%.  That may very well be within your measurements' error bars.
> 
> It looks like you assume that the disk seeks only once per 256MB or so
> written. That assumption may not hold, given that all the data can be in the
> page cache before writeback starts. When two tasks submit IOs concurrently,
> the IOs from each task are sequential, and NCQ may reorder the current batch
> submitted from the two streams. However, a disk seek may still be needed
> for each subsequent batch handled by NCQ.
> 
> > And of course, note that in real life, we are very *often* writing to
> > multiple files in parallel, for example, during a "make -j16" while
> > building the kernel.  Writing a single large file is certainly
> > something people do (but even there people who are burning a 4G DVD
> > rip are often browsing the web while they are waiting for it to
> > complete, and the browser will be writing cache files, etc.).  So
> > whether we should be stressing over this specific workload is going to
> > be quite debatable.
> 

Hi,
  is there any update on this? Sorry if I am making noise, but I would
like to help improve (or fix) the kernel if I can.
Otherwise, please let me know how I should treat this case.

Thanks, and bye
Andrea



end of thread, other threads:[~2020-01-07  7:51 UTC | newest]

Thread overview: 102+ messages
     [not found] <307581a490b610c3025ee80f79a465a89d68ed19.camel@unipv.it>
2019-08-20 17:13 ` Slow I/O on USB media after commit f664a3cc17b7d0a2bc3b3ab96181e1029b0ec0e6 Alan Stern
2019-08-23 10:39   ` Andrea Vai
2019-08-23 20:42     ` Alan Stern
2019-08-26  6:09       ` Andrea Vai
2019-08-26 16:33         ` Alan Stern
2019-09-18 15:25           ` Andrea Vai
2019-09-18 16:30             ` Alan Stern
2019-09-19  7:33               ` Andrea Vai
2019-09-19 17:54                 ` Alan Stern
2019-09-20  7:25                   ` Andrea Vai
2019-09-20  7:44                     ` Greg KH
2019-09-19  8:26               ` Damien Le Moal
2019-09-19  8:55                 ` Ming Lei
2019-09-19  9:09                   ` Damien Le Moal
2019-09-19  9:21                     ` Ming Lei
2019-09-19 14:01                 ` Alan Stern
2019-09-19 14:14                   ` Damien Le Moal
2019-09-20  7:03                     ` Andrea Vai
2019-09-25 19:30                       ` Alan Stern
2019-09-25 19:36                         ` Jens Axboe
2019-09-27 15:47                           ` Andrea Vai
2019-11-04 16:00                             ` Andrea Vai
2019-11-04 18:20                               ` Alan Stern
2019-11-05 11:48                                 ` Andrea Vai
2019-11-05 18:31                                   ` Alan Stern
2019-11-05 23:29                                     ` Jens Axboe
2019-11-06 16:03                                       ` Alan Stern
2019-11-06 22:13                                         ` Damien Le Moal
2019-11-07  7:04                                           ` Andrea Vai
2019-11-07  7:54                                             ` Damien Le Moal
2019-11-07 18:59                                               ` Andrea Vai
2019-11-08  8:42                                                 ` Damien Le Moal
2019-11-08 14:33                                                   ` Jens Axboe
2019-11-11 10:46                                                     ` Andrea Vai
2019-11-09 10:09                                                   ` Ming Lei
2019-11-09 22:28                                                 ` Ming Lei
2019-11-11 10:50                                                   ` Andrea Vai
2019-11-11 11:05                                                     ` Ming Lei
2019-11-11 11:13                                                       ` Andrea Vai
2019-11-22 19:16                                                   ` Andrea Vai
2019-11-23  7:28                                                     ` Ming Lei
2019-11-23 15:44                                                       ` Andrea Vai
2019-11-25  3:54                                                         ` Ming Lei
2019-11-25 10:11                                                           ` Andrea Vai
2019-11-25 10:29                                                             ` Ming Lei
2019-11-25 14:58                                                               ` Andrea Vai
2019-11-25 15:15                                                                 ` Ming Lei
2019-11-25 18:51                                                                   ` Andrea Vai
2019-11-26  2:32                                                                     ` Ming Lei
2019-11-26  7:46                                                                       ` Andrea Vai
2019-11-26  9:15                                                                         ` Ming Lei
2019-11-26 10:24                                                                           ` Ming Lei
2019-11-26 11:14                                                                           ` Andrea Vai
2019-11-27  2:05                                                                             ` Ming Lei
2019-11-27  9:39                                                                               ` Andrea Vai
2019-11-27 13:08                                                                                 ` Ming Lei
2019-11-27 15:01                                                                                   ` Andrea Vai
2019-11-27  0:21                                                                         ` Finn Thain
2019-11-27  8:14                                                                           ` AW: " Schmid, Carsten
2019-11-27 21:49                                                                             ` Finn Thain
2019-11-28  7:46                                                                             ` Andrea Vai
2019-11-28  8:12                                                                               ` AW: " Schmid, Carsten
2019-11-28 11:40                                                                                 ` Andrea Vai
2019-11-28 17:39                                                                                 ` Alan Stern
2019-11-28  9:17                                                                               ` Ming Lei
2019-11-28 17:34                                                                                 ` Andrea Vai
2019-11-29  0:57                                                                                   ` Ming Lei
2019-11-29  2:35                                                                                     ` Ming Lei
2019-11-29 14:41                                                                                       ` Andrea Vai
2019-12-03  2:23                                                                                         ` Ming Lei
2019-12-10  7:35                                                                                           ` Andrea Vai
2019-12-10  8:05                                                                                             ` Ming Lei
2019-12-11  2:41                                                                                               ` Theodore Y. Ts'o
2019-12-11  4:00                                                                                                 ` Ming Lei
2019-12-11 16:07                                                                                                   ` Theodore Y. Ts'o
2019-12-11 21:33                                                                                                     ` Ming Lei
2019-12-12  7:34                                                                                                       ` Andrea Vai
2019-12-18  8:25                                                                                                       ` Andrea Vai
2019-12-18  9:48                                                                                                         ` Ming Lei
     [not found]                                                                                                           ` <b1b6a0e9d690ecd9432025acd2db4ac09f834040.camel@unipv.it>
2019-12-23 13:08                                                                                                             ` Ming Lei
2019-12-23 14:02                                                                                                               ` Andrea Vai
2019-12-24  1:32                                                                                                                 ` Ming Lei
2019-12-24  8:04                                                                                                                   ` Andrea Vai
2019-12-24  8:47                                                                                                                     ` Ming Lei
2019-12-23 16:26                                                                                                               ` Theodore Y. Ts'o
2019-12-23 16:29                                                                                                                 ` Andrea Vai
2019-12-23 17:22                                                                                                                   ` Theodore Y. Ts'o
2019-12-23 18:45                                                                                                                     ` Andrea Vai
2019-12-23 19:53                                                                                                                       ` Theodore Y. Ts'o
2019-12-24  1:27                                                                                                                         ` Ming Lei
2019-12-24  6:49                                                                                                                           ` Andrea Vai
2019-12-24  8:51                                                                                                                           ` Andrea Vai
2019-12-24  9:35                                                                                                                             ` Ming Lei
2019-12-25  5:17                                                                                                                           ` Theodore Y. Ts'o
2019-12-26  2:27                                                                                                                             ` Ming Lei
2019-12-26  3:30                                                                                                                               ` Theodore Y. Ts'o
2019-12-26  8:37                                                                                                                                 ` Ming Lei
2020-01-07  7:51                                                                                                                                   ` Andrea Vai
     [not found]                                                                                                                                 ` <20200101074310.10904-1-hdanton@sina.com>
2020-01-01 13:53                                                                                                                                   ` slow IO on USB media Ming Lei
2019-11-29 11:44                                                                                     ` AW: Slow I/O on USB media after commit f664a3cc17b7d0a2bc3b3ab96181e1029b0ec0e6 Bernd Schubert
2019-12-02  7:01                                                                                       ` Andrea Vai
2019-11-28 17:10                                                                           ` Andrea Vai
