Linux-ext4 Archive on lore.kernel.org
 help / color / Atom feed
* Re: AW: Slow I/O on USB media after commit f664a3cc17b7d0a2bc3b3ab96181e1029b0ec0e6
       [not found]                 ` <320b315b9c87543d4fb919ecbdf841596c8fbcea.camel@unipv.it>
@ 2019-12-03  2:23                   ` Ming Lei
  2019-12-10  7:35                     ` Andrea Vai
  0 siblings, 1 reply; 29+ messages in thread
From: Ming Lei @ 2019-12-03  2:23 UTC (permalink / raw)
  To: Andrea Vai
  Cc: Schmid, Carsten, Finn Thain, Damien Le Moal, Alan Stern,
	Jens Axboe, Johannes Thumshirn, USB list, SCSI development list,
	Himanshu Madhani, Hannes Reinecke, Omar Sandoval,
	Martin K. Petersen, Greg KH, Hans Holmberg,
	Kernel development list, linux-ext4, linux-fsdevel,
	Theodore Ts'o

On Fri, Nov 29, 2019 at 03:41:01PM +0100, Andrea Vai wrote:
> Il giorno ven, 29/11/2019 alle 10.35 +0800, Ming Lei ha scritto:
> > On Fri, Nov 29, 2019 at 08:57:34AM +0800, Ming Lei wrote:
> > 
> > > [...]
> > 
> > > Andrea, can you collect the following log when running the test
> > > on current new(bad) kernel?
> > > 
> > > 	/usr/share/bcc/tools/stackcount  -K blk_mq_make_request
> > 
> > Instead, please run the following trace, given insert may be
> > called from other paths, such as flush plug:
> > 
> > 	/usr/share/bcc/tools/stackcount -K t:block:block_rq_insert
> 
> Attached, for new (patched) bad kernel.
> 
> Produced by: start the trace script (with the pendrive already
> plugged), wait some seconds, run the test (1 trial, 1 GB), wait for
> the test to finish, stop the trace.
> 
> The copy took ~1700 seconds.

See the two path[1][2] of inserting request, and path[1] is triggered
4358 times, and the path[2] is triggered 5763 times.

The path[2] is expected behaviour. Not sure path [1] is correct, given
ext4_release_file() is supposed to be called when this inode is
released. That means the file is closed 4358 times during 1GB file
copying to usb storage.

Cc filesystem list.


[1] insert requests when returning to user mode from syscall

  b'blk_mq_sched_request_inserted'
  b'blk_mq_sched_request_inserted'
  b'dd_insert_requests'
  b'blk_mq_sched_insert_requests'
  b'blk_mq_flush_plug_list'
  b'blk_flush_plug_list'
  b'io_schedule_prepare'
  b'io_schedule'
  b'rq_qos_wait'
  b'wbt_wait'
  b'__rq_qos_throttle'
  b'blk_mq_make_request'
  b'generic_make_request'
  b'submit_bio'
  b'ext4_io_submit'
  b'ext4_writepages'
  b'do_writepages'
  b'__filemap_fdatawrite_range'
  b'ext4_release_file'
  b'__fput'
  b'task_work_run'
  b'exit_to_usermode_loop'
  b'do_syscall_64'
  b'entry_SYSCALL_64_after_hwframe'
    4358

[2] insert requests from writeback wq context

  b'blk_mq_sched_request_inserted'
  b'blk_mq_sched_request_inserted'
  b'dd_insert_requests'
  b'blk_mq_sched_insert_requests'
  b'blk_mq_flush_plug_list'
  b'blk_flush_plug_list'
  b'io_schedule_prepare'
  b'io_schedule'
  b'rq_qos_wait'
  b'wbt_wait'
  b'__rq_qos_throttle'
  b'blk_mq_make_request'
  b'generic_make_request'
  b'submit_bio'
  b'ext4_io_submit'
  b'ext4_bio_write_page'
  b'mpage_submit_page'
  b'mpage_process_page_bufs'
  b'mpage_prepare_extent_to_map'
  b'ext4_writepages'
  b'do_writepages'
  b'__writeback_single_inode'
  b'writeback_sb_inodes'
  b'__writeback_inodes_wb'
  b'wb_writeback'
  b'wb_workfn'
  b'process_one_work'
  b'worker_thread'
  b'kthread'
  b'ret_from_fork'
    5763

Thanks,
Ming


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: AW: Slow I/O on USB media after commit f664a3cc17b7d0a2bc3b3ab96181e1029b0ec0e6
  2019-12-03  2:23                   ` AW: Slow I/O on USB media after commit f664a3cc17b7d0a2bc3b3ab96181e1029b0ec0e6 Ming Lei
@ 2019-12-10  7:35                     ` Andrea Vai
  2019-12-10  8:05                       ` Ming Lei
  0 siblings, 1 reply; 29+ messages in thread
From: Andrea Vai @ 2019-12-10  7:35 UTC (permalink / raw)
  To: Ming Lei
  Cc: Schmid, Carsten, Finn Thain, Damien Le Moal, Alan Stern,
	Jens Axboe, Johannes Thumshirn, USB list, SCSI development list,
	Himanshu Madhani, Hannes Reinecke, Omar Sandoval,
	Martin K. Petersen, Greg KH, Hans Holmberg,
	Kernel development list, linux-ext4, linux-fsdevel,
	Theodore Ts'o

Il giorno mar, 03/12/2019 alle 10.23 +0800, Ming Lei ha scritto:
> On Fri, Nov 29, 2019 at 03:41:01PM +0100, Andrea Vai wrote:
> > Il giorno ven, 29/11/2019 alle 10.35 +0800, Ming Lei ha scritto:
> > > On Fri, Nov 29, 2019 at 08:57:34AM +0800, Ming Lei wrote:
> > > 
> > > > [...]
> > > 
> > > > Andrea, can you collect the following log when running the
> test
> > > > on current new(bad) kernel?
> > > > 
> > > > 	/usr/share/bcc/tools/stackcount  -K
> blk_mq_make_request
> > > 
> > > Instead, please run the following trace, given insert may be
> > > called from other paths, such as flush plug:
> > > 
> > > 	/usr/share/bcc/tools/stackcount -K t:block:block_rq_insert
> > 
> > Attached, for new (patched) bad kernel.
> > 
> > Produced by: start the trace script (with the pendrive already
> > plugged), wait some seconds, run the test (1 trial, 1 GB), wait
> for
> > the test to finish, stop the trace.
> > 
> > The copy took ~1700 seconds.
> 
> See the two path[1][2] of inserting request, and path[1] is
> triggered
> 4358 times, and the path[2] is triggered 5763 times.
> 
> The path[2] is expected behaviour. Not sure path [1] is correct,
> given
> ext4_release_file() is supposed to be called when this inode is
> released. That means the file is closed 4358 times during 1GB file
> copying to usb storage.
> 
> Cc filesystem list.
> 
> 
> [1] insert requests when returning to user mode from syscall
> 
>   b'blk_mq_sched_request_inserted'
>   b'blk_mq_sched_request_inserted'
>   b'dd_insert_requests'
>   b'blk_mq_sched_insert_requests'
>   b'blk_mq_flush_plug_list'
>   b'blk_flush_plug_list'
>   b'io_schedule_prepare'
>   b'io_schedule'
>   b'rq_qos_wait'
>   b'wbt_wait'
>   b'__rq_qos_throttle'
>   b'blk_mq_make_request'
>   b'generic_make_request'
>   b'submit_bio'
>   b'ext4_io_submit'
>   b'ext4_writepages'
>   b'do_writepages'
>   b'__filemap_fdatawrite_range'
>   b'ext4_release_file'
>   b'__fput'
>   b'task_work_run'
>   b'exit_to_usermode_loop'
>   b'do_syscall_64'
>   b'entry_SYSCALL_64_after_hwframe'
>     4358
> 
> [2] insert requests from writeback wq context
> 
>   b'blk_mq_sched_request_inserted'
>   b'blk_mq_sched_request_inserted'
>   b'dd_insert_requests'
>   b'blk_mq_sched_insert_requests'
>   b'blk_mq_flush_plug_list'
>   b'blk_flush_plug_list'
>   b'io_schedule_prepare'
>   b'io_schedule'
>   b'rq_qos_wait'
>   b'wbt_wait'
>   b'__rq_qos_throttle'
>   b'blk_mq_make_request'
>   b'generic_make_request'
>   b'submit_bio'
>   b'ext4_io_submit'
>   b'ext4_bio_write_page'
>   b'mpage_submit_page'
>   b'mpage_process_page_bufs'
>   b'mpage_prepare_extent_to_map'
>   b'ext4_writepages'
>   b'do_writepages'
>   b'__writeback_single_inode'
>   b'writeback_sb_inodes'
>   b'__writeback_inodes_wb'
>   b'wb_writeback'
>   b'wb_workfn'
>   b'process_one_work'
>   b'worker_thread'
>   b'kthread'
>   b'ret_from_fork'
>     5763
> 
> Thanks,
> Ming
> 

Is there any update on this? Sorry if I am making noise, but I would
like to help to improve the kernel (or fix it) if I can help.
Otherwise, please let me know how to consider this case,

Thanks, and bye
Andrea


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: AW: Slow I/O on USB media after commit f664a3cc17b7d0a2bc3b3ab96181e1029b0ec0e6
  2019-12-10  7:35                     ` Andrea Vai
@ 2019-12-10  8:05                       ` Ming Lei
  2019-12-11  2:41                         ` Theodore Y. Ts'o
  0 siblings, 1 reply; 29+ messages in thread
From: Ming Lei @ 2019-12-10  8:05 UTC (permalink / raw)
  To: Andrea Vai
  Cc: Schmid, Carsten, Finn Thain, Damien Le Moal, Alan Stern,
	Jens Axboe, Johannes Thumshirn, USB list, SCSI development list,
	Himanshu Madhani, Hannes Reinecke, Omar Sandoval,
	Martin K. Petersen, Greg KH, Hans Holmberg,
	Kernel development list, linux-ext4, linux-fsdevel,
	Theodore Ts'o

On Tue, Dec 10, 2019 at 08:35:43AM +0100, Andrea Vai wrote:
> Il giorno mar, 03/12/2019 alle 10.23 +0800, Ming Lei ha scritto:
> > On Fri, Nov 29, 2019 at 03:41:01PM +0100, Andrea Vai wrote:
> > > Il giorno ven, 29/11/2019 alle 10.35 +0800, Ming Lei ha scritto:
> > > > On Fri, Nov 29, 2019 at 08:57:34AM +0800, Ming Lei wrote:
> > > > 
> > > > > [...]
> > > > 
> > > > > Andrea, can you collect the following log when running the
> > test
> > > > > on current new(bad) kernel?
> > > > > 
> > > > > 	/usr/share/bcc/tools/stackcount  -K
> > blk_mq_make_request
> > > > 
> > > > Instead, please run the following trace, given insert may be
> > > > called from other paths, such as flush plug:
> > > > 
> > > > 	/usr/share/bcc/tools/stackcount -K t:block:block_rq_insert
> > > 
> > > Attached, for new (patched) bad kernel.
> > > 
> > > Produced by: start the trace script (with the pendrive already
> > > plugged), wait some seconds, run the test (1 trial, 1 GB), wait
> > for
> > > the test to finish, stop the trace.
> > > 
> > > The copy took ~1700 seconds.
> > 
> > See the two path[1][2] of inserting request, and path[1] is
> > triggered
> > 4358 times, and the path[2] is triggered 5763 times.
> > 
> > The path[2] is expected behaviour. Not sure path [1] is correct,
> > given
> > ext4_release_file() is supposed to be called when this inode is
> > released. That means the file is closed 4358 times during 1GB file
> > copying to usb storage.
> > 
> > Cc filesystem list.
> > 
> > 
> > [1] insert requests when returning to user mode from syscall
> > 
> >   b'blk_mq_sched_request_inserted'
> >   b'blk_mq_sched_request_inserted'
> >   b'dd_insert_requests'
> >   b'blk_mq_sched_insert_requests'
> >   b'blk_mq_flush_plug_list'
> >   b'blk_flush_plug_list'
> >   b'io_schedule_prepare'
> >   b'io_schedule'
> >   b'rq_qos_wait'
> >   b'wbt_wait'
> >   b'__rq_qos_throttle'
> >   b'blk_mq_make_request'
> >   b'generic_make_request'
> >   b'submit_bio'
> >   b'ext4_io_submit'
> >   b'ext4_writepages'
> >   b'do_writepages'
> >   b'__filemap_fdatawrite_range'
> >   b'ext4_release_file'
> >   b'__fput'
> >   b'task_work_run'
> >   b'exit_to_usermode_loop'
> >   b'do_syscall_64'
> >   b'entry_SYSCALL_64_after_hwframe'
> >     4358
> > 
> > [2] insert requests from writeback wq context
> > 
> >   b'blk_mq_sched_request_inserted'
> >   b'blk_mq_sched_request_inserted'
> >   b'dd_insert_requests'
> >   b'blk_mq_sched_insert_requests'
> >   b'blk_mq_flush_plug_list'
> >   b'blk_flush_plug_list'
> >   b'io_schedule_prepare'
> >   b'io_schedule'
> >   b'rq_qos_wait'
> >   b'wbt_wait'
> >   b'__rq_qos_throttle'
> >   b'blk_mq_make_request'
> >   b'generic_make_request'
> >   b'submit_bio'
> >   b'ext4_io_submit'
> >   b'ext4_bio_write_page'
> >   b'mpage_submit_page'
> >   b'mpage_process_page_bufs'
> >   b'mpage_prepare_extent_to_map'
> >   b'ext4_writepages'
> >   b'do_writepages'
> >   b'__writeback_single_inode'
> >   b'writeback_sb_inodes'
> >   b'__writeback_inodes_wb'
> >   b'wb_writeback'
> >   b'wb_workfn'
> >   b'process_one_work'
> >   b'worker_thread'
> >   b'kthread'
> >   b'ret_from_fork'
> >     5763
> > 
> > Thanks,
> > Ming
> > 
> 
> Is there any update on this? Sorry if I am making noise, but I would
> like to help to improve the kernel (or fix it) if I can help.
> Otherwise, please let me know how to consider this case,

IMO, the extra write path from exit_to_usermode_loop() isn't expected,
that should be the reason why write IO order is changed, then performance
drops on your USB storage.

We need our fs/ext4 experts to take a look.

Or can you reproduce the issue on xfs or btrfs?

Thanks,
Ming


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: AW: Slow I/O on USB media after commit f664a3cc17b7d0a2bc3b3ab96181e1029b0ec0e6
  2019-12-10  8:05                       ` Ming Lei
@ 2019-12-11  2:41                         ` Theodore Y. Ts'o
  2019-12-11  4:00                           ` Ming Lei
  0 siblings, 1 reply; 29+ messages in thread
From: Theodore Y. Ts'o @ 2019-12-11  2:41 UTC (permalink / raw)
  To: Ming Lei
  Cc: Andrea Vai, Schmid, Carsten, Finn Thain, Damien Le Moal,
	Alan Stern, Jens Axboe, Johannes Thumshirn, USB list,
	SCSI development list, Himanshu Madhani, Hannes Reinecke,
	Omar Sandoval, Martin K. Petersen, Greg KH, Hans Holmberg,
	Kernel development list, linux-ext4, linux-fsdevel

On Tue, Dec 10, 2019 at 04:05:50PM +0800, Ming Lei wrote:
> > > The path[2] is expected behaviour. Not sure path [1] is correct,
> > > given
> > > ext4_release_file() is supposed to be called when this inode is
> > > released. That means the file is closed 4358 times during 1GB file
> > > copying to usb storage.
> > > 
> > > [1] insert requests when returning to user mode from syscall
> > > 
> > >   b'blk_mq_sched_request_inserted'
> > >   b'blk_mq_sched_request_inserted'
> > >   b'dd_insert_requests'
> > >   b'blk_mq_sched_insert_requests'
> > >   b'blk_mq_flush_plug_list'
> > >   b'blk_flush_plug_list'
> > >   b'io_schedule_prepare'
> > >   b'io_schedule'
> > >   b'rq_qos_wait'
> > >   b'wbt_wait'
> > >   b'__rq_qos_throttle'
> > >   b'blk_mq_make_request'
> > >   b'generic_make_request'
> > >   b'submit_bio'
> > >   b'ext4_io_submit'
> > >   b'ext4_writepages'
> > >   b'do_writepages'
> > >   b'__filemap_fdatawrite_range'
> > >   b'ext4_release_file'
> > >   b'__fput'
> > >   b'task_work_run'
> > >   b'exit_to_usermode_loop'
> > >   b'do_syscall_64'
> > >   b'entry_SYSCALL_64_after_hwframe'
> > >     4358

I'm guessing that your workload is repeatedly truncating a file (or
calling open with O_TRUNC) and then writing data to it.  When you do
this, then when the file is closed, we assume that since you were
replacing the previous contents of a file with new contents, that you
would be unhappy if the file contents was replaced by a zero length
file after a crash.  That's because ten years, ago there were a *huge*
number of crappy applications that would replace a file by reading it
into memory, truncating it, and then write out the new contents of the
file.  This could be a high score file for a game, or a KDE or GNOME
state file, etc.

So if someone does open, truncate, write, close, we still immediately
writing out the data on the close, assuming that the programmer really
wanted open, truncate, write, fsync, close, but was too careless to
actually do the right thing.

Some workaround[1] like this is done by all of the major file systems,
and was fallout the agreement from the "O_PONIES"[2] controversy.
This was discussed and agreed to at the 2009 LSF/MM workshop.  (See
the "rename, fsync, and ponies" section.)

[1] https://bugs.launchpad.net/ubuntu/+source/linux/+bug/317781/comments/45
[2] https://blahg.josefsipek.net/?p=364
[3] https://lwn.net/Articles/327601/

So if you're seeing a call to filemap_fdatawrite_range as the result
of a fput, that's why.

In any case, this behavior has been around for a decade, and it
appears to be incidental to your performance difficulties with your
USB thumbdrive and block-mq.

						- Ted

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: AW: Slow I/O on USB media after commit f664a3cc17b7d0a2bc3b3ab96181e1029b0ec0e6
  2019-12-11  2:41                         ` Theodore Y. Ts'o
@ 2019-12-11  4:00                           ` Ming Lei
  2019-12-11 16:07                             ` Theodore Y. Ts'o
  0 siblings, 1 reply; 29+ messages in thread
From: Ming Lei @ 2019-12-11  4:00 UTC (permalink / raw)
  To: Theodore Y. Ts'o
  Cc: Andrea Vai, Schmid, Carsten, Finn Thain, Damien Le Moal,
	Alan Stern, Jens Axboe, Johannes Thumshirn, USB list,
	SCSI development list, Himanshu Madhani, Hannes Reinecke,
	Omar Sandoval, Martin K. Petersen, Greg KH, Hans Holmberg,
	Kernel development list, linux-ext4, linux-fsdevel

On Tue, Dec 10, 2019 at 09:41:37PM -0500, Theodore Y. Ts'o wrote:
> On Tue, Dec 10, 2019 at 04:05:50PM +0800, Ming Lei wrote:
> > > > The path[2] is expected behaviour. Not sure path [1] is correct,
> > > > given
> > > > ext4_release_file() is supposed to be called when this inode is
> > > > released. That means the file is closed 4358 times during 1GB file
> > > > copying to usb storage.
> > > > 
> > > > [1] insert requests when returning to user mode from syscall
> > > > 
> > > >   b'blk_mq_sched_request_inserted'
> > > >   b'blk_mq_sched_request_inserted'
> > > >   b'dd_insert_requests'
> > > >   b'blk_mq_sched_insert_requests'
> > > >   b'blk_mq_flush_plug_list'
> > > >   b'blk_flush_plug_list'
> > > >   b'io_schedule_prepare'
> > > >   b'io_schedule'
> > > >   b'rq_qos_wait'
> > > >   b'wbt_wait'
> > > >   b'__rq_qos_throttle'
> > > >   b'blk_mq_make_request'
> > > >   b'generic_make_request'
> > > >   b'submit_bio'
> > > >   b'ext4_io_submit'
> > > >   b'ext4_writepages'
> > > >   b'do_writepages'
> > > >   b'__filemap_fdatawrite_range'
> > > >   b'ext4_release_file'
> > > >   b'__fput'
> > > >   b'task_work_run'
> > > >   b'exit_to_usermode_loop'
> > > >   b'do_syscall_64'
> > > >   b'entry_SYSCALL_64_after_hwframe'
> > > >     4358
> 
> I'm guessing that your workload is repeatedly truncating a file (or
> calling open with O_TRUNC) and then writing data to it.  When you do
> this, then when the file is closed, we assume that since you were
> replacing the previous contents of a file with new contents, that you
> would be unhappy if the file contents was replaced by a zero length
> file after a crash.  That's because ten years, ago there were a *huge*
> number of crappy applications that would replace a file by reading it
> into memory, truncating it, and then write out the new contents of the
> file.  This could be a high score file for a game, or a KDE or GNOME
> state file, etc.
> 
> So if someone does open, truncate, write, close, we still immediately
> writing out the data on the close, assuming that the programmer really
> wanted open, truncate, write, fsync, close, but was too careless to
> actually do the right thing.
> 
> Some workaround[1] like this is done by all of the major file systems,
> and was fallout the agreement from the "O_PONIES"[2] controversy.
> This was discussed and agreed to at the 2009 LSF/MM workshop.  (See
> the "rename, fsync, and ponies" section.)
> 
> [1] https://bugs.launchpad.net/ubuntu/+source/linux/+bug/317781/comments/45
> [2] https://blahg.josefsipek.net/?p=364
> [3] https://lwn.net/Articles/327601/
> 
> So if you're seeing a call to filemap_fdatawrite_range as the result
> of a fput, that's why.
> 
> In any case, this behavior has been around for a decade, and it
> appears to be incidental to your performance difficulties with your
> USB thumbdrive and block-mq.

I didn't reproduce the issue in my test environment, and follows
Andrea's test commands[1]:

  mount UUID=$uuid /mnt/pendrive 2>&1 |tee -a $logfile
  SECONDS=0
  cp $testfile /mnt/pendrive 2>&1 |tee -a $logfile
  umount /mnt/pendrive 2>&1 |tee -a $logfile

The 'cp' command supposes to open/close the file just once, however
ext4_release_file() & write pages is observed to run for 4358 times
when executing the above 'cp' test.


[1] https://marc.info/?l=linux-kernel&m=157486689806734&w=2


Thanks,
Ming


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: AW: Slow I/O on USB media after commit f664a3cc17b7d0a2bc3b3ab96181e1029b0ec0e6
  2019-12-11  4:00                           ` Ming Lei
@ 2019-12-11 16:07                             ` Theodore Y. Ts'o
  2019-12-11 21:33                               ` Ming Lei
  0 siblings, 1 reply; 29+ messages in thread
From: Theodore Y. Ts'o @ 2019-12-11 16:07 UTC (permalink / raw)
  To: Ming Lei
  Cc: Andrea Vai, Schmid, Carsten, Finn Thain, Damien Le Moal,
	Alan Stern, Jens Axboe, Johannes Thumshirn, USB list,
	SCSI development list, Himanshu Madhani, Hannes Reinecke,
	Omar Sandoval, Martin K. Petersen, Greg KH, Hans Holmberg,
	Kernel development list, linux-ext4, linux-fsdevel

On Wed, Dec 11, 2019 at 12:00:58PM +0800, Ming Lei wrote:
> I didn't reproduce the issue in my test environment, and follows
> Andrea's test commands[1]:
> 
>   mount UUID=$uuid /mnt/pendrive 2>&1 |tee -a $logfile
>   SECONDS=0
>   cp $testfile /mnt/pendrive 2>&1 |tee -a $logfile
>   umount /mnt/pendrive 2>&1 |tee -a $logfile
> 
> The 'cp' command supposes to open/close the file just once, however
> ext4_release_file() & write pages is observed to run for 4358 times
> when executing the above 'cp' test.

Why are we sure the ext4_release_file() / _fput() is coming from the
cp command, as opposed to something else that might be running on the
system under test?  _fput() is called by the kernel when the last
reference to a struct file is released.  (Specifically, if you have a
fd which is dup'ed, it's only when the last fd corresponding to the
struct file is closed, and the struct file is about to be released,
does the file system's f_ops->release function get called.)

So the first question I'd ask is whether there is anything else going
on the system, and whether the writes are happening to the USB thumb
drive, or to some other storage device.  And if there is something
else which is writing to the pendrive, maybe that's why no one else
has been able to reproduce the OP's complaint....

    	      	 	    	     - Ted

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: AW: Slow I/O on USB media after commit f664a3cc17b7d0a2bc3b3ab96181e1029b0ec0e6
  2019-12-11 16:07                             ` Theodore Y. Ts'o
@ 2019-12-11 21:33                               ` Ming Lei
  2019-12-12  7:34                                 ` Andrea Vai
  2019-12-18  8:25                                 ` Andrea Vai
  0 siblings, 2 replies; 29+ messages in thread
From: Ming Lei @ 2019-12-11 21:33 UTC (permalink / raw)
  To: Theodore Y. Ts'o
  Cc: Andrea Vai, Schmid, Carsten, Finn Thain, Damien Le Moal,
	Alan Stern, Jens Axboe, Johannes Thumshirn, USB list,
	SCSI development list, Himanshu Madhani, Hannes Reinecke,
	Omar Sandoval, Martin K. Petersen, Greg KH, Hans Holmberg,
	Kernel development list, linux-ext4, linux-fsdevel

On Wed, Dec 11, 2019 at 11:07:45AM -0500, Theodore Y. Ts'o wrote:
> On Wed, Dec 11, 2019 at 12:00:58PM +0800, Ming Lei wrote:
> > I didn't reproduce the issue in my test environment, and follows
> > Andrea's test commands[1]:
> > 
> >   mount UUID=$uuid /mnt/pendrive 2>&1 |tee -a $logfile
> >   SECONDS=0
> >   cp $testfile /mnt/pendrive 2>&1 |tee -a $logfile
> >   umount /mnt/pendrive 2>&1 |tee -a $logfile
> > 
> > The 'cp' command supposes to open/close the file just once, however
> > ext4_release_file() & write pages is observed to run for 4358 times
> > when executing the above 'cp' test.
> 
> Why are we sure the ext4_release_file() / _fput() is coming from the
> cp command, as opposed to something else that might be running on the
> system under test?  _fput() is called by the kernel when the last

Please see the log:

https://lore.kernel.org/linux-scsi/3af3666920e7d46f8f0c6d88612f143ffabc743c.camel@unipv.it/2-log_ming.zip

Which is collected by:

#!/bin/sh
MAJ=$1
MIN=$2
MAJ=$(( $MAJ << 20 ))
DEV=$(( $MAJ | $MIN ))

/usr/share/bcc/tools/trace -t -C \
    't:block:block_rq_issue (args->dev == '$DEV') "%s %d %d", args->rwbs, args->sector, args->nr_sector' \
    't:block:block_rq_insert (args->dev == '$DEV') "%s %d %d", args->rwbs, args->sector, args->nr_sector'

$MAJ:$MIN points to the USB storage disk.

From the above IO trace, there are two write paths, one is from cp,
another is from writeback wq.

The stackcount trace[1] is consistent with the IO trace log since it
only shows two IO paths, that is why I concluded that the write done via
ext4_release_file() is from 'cp'.

[1] https://lore.kernel.org/linux-scsi/320b315b9c87543d4fb919ecbdf841596c8fbcea.camel@unipv.it/2-log_ming_20191129_150609.zip

> reference to a struct file is released.  (Specifically, if you have a
> fd which is dup'ed, it's only when the last fd corresponding to the
> struct file is closed, and the struct file is about to be released,
> does the file system's f_ops->release function get called.)
> 
> So the first question I'd ask is whether there is anything else going
> on the system, and whether the writes are happening to the USB thumb
> drive, or to some other storage device.  And if there is something
> else which is writing to the pendrive, maybe that's why no one else
> has been able to reproduce the OP's complaint....

OK, we can ask Andrea to confirm that via the following trace, which
will add pid/comm info in the stack trace:

/usr/share/bcc/tools/stackcount  blk_mq_sched_request_inserted

Andrew, could you collect the above log again when running new/bad
kernel for confirming if the write done by ext4_release_file() is from
the 'cp' process?

Thanks,
Ming


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: AW: Slow I/O on USB media after commit f664a3cc17b7d0a2bc3b3ab96181e1029b0ec0e6
  2019-12-11 21:33                               ` Ming Lei
@ 2019-12-12  7:34                                 ` Andrea Vai
  2019-12-18  8:25                                 ` Andrea Vai
  1 sibling, 0 replies; 29+ messages in thread
From: Andrea Vai @ 2019-12-12  7:34 UTC (permalink / raw)
  To: Ming Lei, Theodore Y. Ts'o
  Cc: Schmid, Carsten, Finn Thain, Damien Le Moal, Alan Stern,
	Jens Axboe, Johannes Thumshirn, USB list, SCSI development list,
	Himanshu Madhani, Hannes Reinecke, Omar Sandoval,
	Martin K. Petersen, Greg KH, Hans Holmberg,
	Kernel development list, linux-ext4, linux-fsdevel

Il giorno gio, 12/12/2019 alle 05.33 +0800, Ming Lei ha scritto:
> On Wed, Dec 11, 2019 at 11:07:45AM -0500, Theodore Y. Ts'o wrote:
> > On Wed, Dec 11, 2019 at 12:00:58PM +0800, Ming Lei wrote:
> > > I didn't reproduce the issue in my test environment, and follows
> > > Andrea's test commands[1]:
> > > 
> > >   mount UUID=$uuid /mnt/pendrive 2>&1 |tee -a $logfile
> > >   SECONDS=0
> > >   cp $testfile /mnt/pendrive 2>&1 |tee -a $logfile
> > >   umount /mnt/pendrive 2>&1 |tee -a $logfile
> > > 
> > > The 'cp' command supposes to open/close the file just once,
> however
> > > ext4_release_file() & write pages is observed to run for 4358
> times
> > > when executing the above 'cp' test.
> > 
> > Why are we sure the ext4_release_file() / _fput() is coming from
> the
> > cp command, as opposed to something else that might be running on
> the
> > system under test?  _fput() is called by the kernel when the last
> 
> Please see the log:
> 
> https://lore.kernel.org/linux-scsi/3af3666920e7d46f8f0c6d88612f143ffabc743c.camel@unipv.it/2-log_ming.zip
> 
> Which is collected by:
> 
> #!/bin/sh
> MAJ=$1
> MIN=$2
> MAJ=$(( $MAJ << 20 ))
> DEV=$(( $MAJ | $MIN ))
> 
> /usr/share/bcc/tools/trace -t -C \
>     't:block:block_rq_issue (args->dev == '$DEV') "%s %d %d", args-
> >rwbs, args->sector, args->nr_sector' \
>     't:block:block_rq_insert (args->dev == '$DEV') "%s %d %d", args-
> >rwbs, args->sector, args->nr_sector'
> 
> $MAJ:$MIN points to the USB storage disk.
> 
> From the above IO trace, there are two write paths, one is from cp,
> another is from writeback wq.
> 
> The stackcount trace[1] is consistent with the IO trace log since it
> only shows two IO paths, that is why I concluded that the write done
> via
> ext4_release_file() is from 'cp'.
> 
> [1] 
> https://lore.kernel.org/linux-scsi/320b315b9c87543d4fb919ecbdf841596c8fbcea.camel@unipv.it/2-log_ming_20191129_150609.zip
> 
> > reference to a struct file is released.  (Specifically, if you
> have a
> > fd which is dup'ed, it's only when the last fd corresponding to
> the
> > struct file is closed, and the struct file is about to be
> released,
> > does the file system's f_ops->release function get called.)
> > 
> > So the first question I'd ask is whether there is anything else
> going
> > on the system, and whether the writes are happening to the USB
> thumb
> > drive, or to some other storage device.  And if there is something
> > else which is writing to the pendrive, maybe that's why no one
> else
> > has been able to reproduce the OP's complaint....
> 
> OK, we can ask Andrea to confirm that via the following trace, which
> will add pid/comm info in the stack trace:
> 
> /usr/share/bcc/tools/stackcount  blk_mq_sched_request_inserted
> 
> Andrea, could you collect the above log again when running new/bad
> kernel for confirming if the write done by ext4_release_file() is
> from
> the 'cp' process?

Yes, I will try to do it as soon as possible and let you know.
I will also try xfs or btrfs, as you suggested in another message.

Thanks, and bye
Andrea


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: AW: Slow I/O on USB media after commit f664a3cc17b7d0a2bc3b3ab96181e1029b0ec0e6
  2019-12-11 21:33                               ` Ming Lei
  2019-12-12  7:34                                 ` Andrea Vai
@ 2019-12-18  8:25                                 ` Andrea Vai
  2019-12-18  9:48                                   ` Ming Lei
  1 sibling, 1 reply; 29+ messages in thread
From: Andrea Vai @ 2019-12-18  8:25 UTC (permalink / raw)
  To: Ming Lei, Theodore Y. Ts'o
  Cc: Schmid, Carsten, Finn Thain, Damien Le Moal, Alan Stern,
	Jens Axboe, Johannes Thumshirn, USB list, SCSI development list,
	Himanshu Madhani, Hannes Reinecke, Omar Sandoval,
	Martin K. Petersen, Greg KH, Hans Holmberg,
	Kernel development list, linux-ext4, linux-fsdevel

[-- Attachment #1: Type: text/plain, Size: 3636 bytes --]

Il giorno gio, 12/12/2019 alle 05.33 +0800, Ming Lei ha scritto:
> On Wed, Dec 11, 2019 at 11:07:45AM -0500, Theodore Y. Ts'o wrote:
> > On Wed, Dec 11, 2019 at 12:00:58PM +0800, Ming Lei wrote:
> > > I didn't reproduce the issue in my test environment, and follows
> > > Andrea's test commands[1]:
> > > 
> > >   mount UUID=$uuid /mnt/pendrive 2>&1 |tee -a $logfile
> > >   SECONDS=0
> > >   cp $testfile /mnt/pendrive 2>&1 |tee -a $logfile
> > >   umount /mnt/pendrive 2>&1 |tee -a $logfile
> > > 
> > > The 'cp' command supposes to open/close the file just once,
> however
> > > ext4_release_file() & write pages is observed to run for 4358
> times
> > > when executing the above 'cp' test.
> > 
> > Why are we sure the ext4_release_file() / _fput() is coming from
> the
> > cp command, as opposed to something else that might be running on
> the
> > system under test?  _fput() is called by the kernel when the last
> 
> Please see the log:
> 
> https://lore.kernel.org/linux-scsi/3af3666920e7d46f8f0c6d88612f143ffabc743c.camel@unipv.it/2-log_ming.zip
> 
> Which is collected by:
> 
> #!/bin/sh
> MAJ=$1
> MIN=$2
> MAJ=$(( $MAJ << 20 ))
> DEV=$(( $MAJ | $MIN ))
> 
> /usr/share/bcc/tools/trace -t -C \
>     't:block:block_rq_issue (args->dev == '$DEV') "%s %d %d", args-
> >rwbs, args->sector, args->nr_sector' \
>     't:block:block_rq_insert (args->dev == '$DEV') "%s %d %d", args-
> >rwbs, args->sector, args->nr_sector'
> 
> $MAJ:$MIN points to the USB storage disk.
> 
> From the above IO trace, there are two write paths, one is from cp,
> another is from writeback wq.
> 
> The stackcount trace[1] is consistent with the IO trace log since it
> only shows two IO paths, that is why I concluded that the write done
> via
> ext4_release_file() is from 'cp'.
> 
> [1] 
> https://lore.kernel.org/linux-scsi/320b315b9c87543d4fb919ecbdf841596c8fbcea.camel@unipv.it/2-log_ming_20191129_150609.zip
> 
> > reference to a struct file is released.  (Specifically, if you
> have a
> > fd which is dup'ed, it's only when the last fd corresponding to
> the
> > struct file is closed, and the struct file is about to be
> released,
> > does the file system's f_ops->release function get called.)
> > 
> > So the first question I'd ask is whether there is anything else
> going
> > on the system, and whether the writes are happening to the USB
> thumb
> > drive, or to some other storage device.  And if there is something
> > else which is writing to the pendrive, maybe that's why no one
> else
> > has been able to reproduce the OP's complaint....
> 
> OK, we can ask Andrea to confirm that via the following trace, which
> will add pid/comm info in the stack trace:
> 
> /usr/share/bcc/tools/stackcount blk_mq_sched_request_inserted
> 
> Andrew, could you collect the above log again when running new/bad
> kernel for confirming if the write done by ext4_release_file() is
> from
> the 'cp' process?

You can find the stackcount log attached. It has been produced by:

- /usr/share/bcc/tools/stackcount blk_mq_sched_request_inserted > trace.log
- wait some seconds
- run the test (1 copy trial), wait for the test to finish, wait some seconds
- stop the trace (ctrl+C)

The test took 1994 seconds to complete.

I also tried the usual test with btrfs and xfs. Btrfs behavior looks
"good". xfs seems sometimes better, sometimes worse, I would say. I
don't know if it matters, anyway you can also find the results of the
two tests (100 trials each). Basically, btrfs is always between 68 and
89 seconds, with a cyclicity (?) with "period=2 trials". xfs looks
almost always very good (63-65s), but sometimes "bad" (>300s).

Thanks,
Andrea

[-- Attachment #2: test_btrfs_20191217.txt --]
[-- Type: text/plain, Size: 11367 bytes --]

*** test btrfs *** -> test_btrfs_20191217.txt

Starting 100 tries with:
Linux angus.unipv.it 5.4.0+ #1 SMP Mon Nov 25 11:31:34 CET 2019 x86_64 x86_64 x86_64 GNU/Linux
-rw-r--r--. 1 root root 1,0G 25 nov 13.29 /NoBackup/testfile
/dev/sda1: LABEL="Fedora30" UUID="a7ca2491-c807-4b10-b33f-ef425699148d" TYPE="ext4" PARTUUID="8b16fbdd-01"
/dev/sda2: LABEL="Swap_4GB" UUID="ba020b1e-4cdc-4f94-b92c-bdc11613388d" TYPE="swap" PARTUUID="8b16fbdd-02"
/dev/sdf1: LABEL="BAK_ANDVAI" UUID="6ddfec28-3d9a-4676-a726-927fd3fb21e7" UUID_SUB="581c69ab-6758-4662-999a-b6dfe6ee5e69" TYPE="btrfs" PARTUUID="09066b88-01"
/dev/sdg1: LABEL="BAK_ANDVAI" UUID="df777e33-8890-4cee-a718-42233f4cafae" TYPE="ext4" PARTUUID="75265898-01"
cat /sys/block/sdf/queue/scheduler --> [mq-deadline] none
Inizio: mar 17 dic 2019, 13:31:00, CET...fine: mar 17 dic 2019, 13:32:26, CET --> ci ho messo 86 secondi!
Inizio: mar 17 dic 2019, 13:32:26, CET...fine: mar 17 dic 2019, 13:33:36, CET --> ci ho messo 70 secondi!
Inizio: mar 17 dic 2019, 13:33:36, CET...fine: mar 17 dic 2019, 13:35:02, CET --> ci ho messo 86 secondi!
Inizio: mar 17 dic 2019, 13:35:02, CET...fine: mar 17 dic 2019, 13:36:14, CET --> ci ho messo 72 secondi!
Inizio: mar 17 dic 2019, 13:36:14, CET...fine: mar 17 dic 2019, 13:37:40, CET --> ci ho messo 86 secondi!
Inizio: mar 17 dic 2019, 13:37:40, CET...fine: mar 17 dic 2019, 13:38:51, CET --> ci ho messo 71 secondi!
Inizio: mar 17 dic 2019, 13:38:51, CET...fine: mar 17 dic 2019, 13:40:18, CET --> ci ho messo 87 secondi!
Inizio: mar 17 dic 2019, 13:40:18, CET...fine: mar 17 dic 2019, 13:41:29, CET --> ci ho messo 71 secondi!
Inizio: mar 17 dic 2019, 13:41:29, CET...fine: mar 17 dic 2019, 13:42:58, CET --> ci ho messo 89 secondi!
Inizio: mar 17 dic 2019, 13:42:58, CET...fine: mar 17 dic 2019, 13:44:11, CET --> ci ho messo 73 secondi!
Inizio: mar 17 dic 2019, 13:44:11, CET...fine: mar 17 dic 2019, 13:45:40, CET --> ci ho messo 89 secondi!
Inizio: mar 17 dic 2019, 13:45:40, CET...fine: mar 17 dic 2019, 13:46:49, CET --> ci ho messo 69 secondi!
Inizio: mar 17 dic 2019, 13:46:49, CET...fine: mar 17 dic 2019, 13:48:16, CET --> ci ho messo 87 secondi!
Inizio: mar 17 dic 2019, 13:48:16, CET...fine: mar 17 dic 2019, 13:49:27, CET --> ci ho messo 71 secondi!
Inizio: mar 17 dic 2019, 13:49:27, CET...fine: mar 17 dic 2019, 13:50:53, CET --> ci ho messo 86 secondi!
Inizio: mar 17 dic 2019, 13:50:53, CET...fine: mar 17 dic 2019, 13:52:04, CET --> ci ho messo 71 secondi!
Inizio: mar 17 dic 2019, 13:52:04, CET...fine: mar 17 dic 2019, 13:53:30, CET --> ci ho messo 86 secondi!
Inizio: mar 17 dic 2019, 13:53:30, CET...fine: mar 17 dic 2019, 13:54:41, CET --> ci ho messo 71 secondi!
Inizio: mar 17 dic 2019, 13:54:41, CET...fine: mar 17 dic 2019, 13:56:07, CET --> ci ho messo 86 secondi!
Inizio: mar 17 dic 2019, 13:56:07, CET...fine: mar 17 dic 2019, 13:57:18, CET --> ci ho messo 71 secondi!
Inizio: mar 17 dic 2019, 13:57:18, CET...fine: mar 17 dic 2019, 13:58:46, CET --> ci ho messo 88 secondi!
Inizio: mar 17 dic 2019, 13:58:46, CET...fine: mar 17 dic 2019, 13:59:57, CET --> ci ho messo 71 secondi!
Inizio: mar 17 dic 2019, 13:59:57, CET...fine: mar 17 dic 2019, 14:01:23, CET --> ci ho messo 86 secondi!
Inizio: mar 17 dic 2019, 14:01:23, CET...fine: mar 17 dic 2019, 14:02:34, CET --> ci ho messo 71 secondi!
Inizio: mar 17 dic 2019, 14:02:34, CET...fine: mar 17 dic 2019, 14:04:00, CET --> ci ho messo 86 secondi!
Inizio: mar 17 dic 2019, 14:04:00, CET...fine: mar 17 dic 2019, 14:05:11, CET --> ci ho messo 71 secondi!
Inizio: mar 17 dic 2019, 14:05:11, CET...fine: mar 17 dic 2019, 14:06:38, CET --> ci ho messo 87 secondi!
Inizio: mar 17 dic 2019, 14:06:38, CET...fine: mar 17 dic 2019, 14:07:49, CET --> ci ho messo 71 secondi!
Inizio: mar 17 dic 2019, 14:07:49, CET...fine: mar 17 dic 2019, 14:09:15, CET --> ci ho messo 86 secondi!
Inizio: mar 17 dic 2019, 14:09:15, CET...fine: mar 17 dic 2019, 14:10:26, CET --> ci ho messo 71 secondi!
Inizio: mar 17 dic 2019, 14:10:26, CET...fine: mar 17 dic 2019, 14:11:52, CET --> ci ho messo 86 secondi!
Inizio: mar 17 dic 2019, 14:11:52, CET...fine: mar 17 dic 2019, 14:13:03, CET --> ci ho messo 71 secondi!
Inizio: mar 17 dic 2019, 14:13:03, CET...fine: mar 17 dic 2019, 14:14:29, CET --> ci ho messo 86 secondi!
Inizio: mar 17 dic 2019, 14:14:29, CET...fine: mar 17 dic 2019, 14:15:41, CET --> ci ho messo 72 secondi!
Inizio: mar 17 dic 2019, 14:15:41, CET...fine: mar 17 dic 2019, 14:17:07, CET --> ci ho messo 86 secondi!
Inizio: mar 17 dic 2019, 14:17:07, CET...fine: mar 17 dic 2019, 14:18:18, CET --> ci ho messo 71 secondi!
Inizio: mar 17 dic 2019, 14:18:18, CET...fine: mar 17 dic 2019, 14:19:44, CET --> ci ho messo 86 secondi!
Inizio: mar 17 dic 2019, 14:19:44, CET...fine: mar 17 dic 2019, 14:20:55, CET --> ci ho messo 71 secondi!
Inizio: mar 17 dic 2019, 14:20:55, CET...fine: mar 17 dic 2019, 14:22:21, CET --> ci ho messo 86 secondi!
Inizio: mar 17 dic 2019, 14:22:21, CET...fine: mar 17 dic 2019, 14:23:33, CET --> ci ho messo 72 secondi!
Inizio: mar 17 dic 2019, 14:23:33, CET...fine: mar 17 dic 2019, 14:24:59, CET --> ci ho messo 86 secondi!
Inizio: mar 17 dic 2019, 14:24:59, CET...fine: mar 17 dic 2019, 14:26:10, CET --> ci ho messo 71 secondi!
Inizio: mar 17 dic 2019, 14:26:10, CET...fine: mar 17 dic 2019, 14:27:36, CET --> ci ho messo 86 secondi!
Inizio: mar 17 dic 2019, 14:27:36, CET...fine: mar 17 dic 2019, 14:28:46, CET --> ci ho messo 70 secondi!
Inizio: mar 17 dic 2019, 14:28:46, CET...fine: mar 17 dic 2019, 14:30:12, CET --> ci ho messo 85 secondi!
Inizio: mar 17 dic 2019, 14:30:12, CET...fine: mar 17 dic 2019, 14:31:23, CET --> ci ho messo 71 secondi!
Inizio: mar 17 dic 2019, 14:31:23, CET...fine: mar 17 dic 2019, 14:32:49, CET --> ci ho messo 86 secondi!
Inizio: mar 17 dic 2019, 14:32:49, CET...fine: mar 17 dic 2019, 14:34:01, CET --> ci ho messo 72 secondi!
Inizio: mar 17 dic 2019, 14:34:01, CET...fine: mar 17 dic 2019, 14:35:27, CET --> ci ho messo 86 secondi!
Inizio: mar 17 dic 2019, 14:35:27, CET...fine: mar 17 dic 2019, 14:36:38, CET --> ci ho messo 71 secondi!
Inizio: mar 17 dic 2019, 14:36:38, CET...fine: mar 17 dic 2019, 14:38:04, CET --> ci ho messo 86 secondi!
Inizio: mar 17 dic 2019, 14:38:04, CET...fine: mar 17 dic 2019, 14:39:15, CET --> ci ho messo 71 secondi!
Inizio: mar 17 dic 2019, 14:39:15, CET...fine: mar 17 dic 2019, 14:40:41, CET --> ci ho messo 86 secondi!
Inizio: mar 17 dic 2019, 14:40:41, CET...fine: mar 17 dic 2019, 14:41:52, CET --> ci ho messo 71 secondi!
Inizio: mar 17 dic 2019, 14:41:52, CET...fine: mar 17 dic 2019, 14:43:18, CET --> ci ho messo 86 secondi!
Inizio: mar 17 dic 2019, 14:43:18, CET...fine: mar 17 dic 2019, 14:44:29, CET --> ci ho messo 71 secondi!
Inizio: mar 17 dic 2019, 14:44:29, CET...fine: mar 17 dic 2019, 14:45:55, CET --> ci ho messo 86 secondi!
Inizio: mar 17 dic 2019, 14:45:55, CET...fine: mar 17 dic 2019, 14:47:06, CET --> ci ho messo 71 secondi!
Inizio: mar 17 dic 2019, 14:47:06, CET...fine: mar 17 dic 2019, 14:48:32, CET --> ci ho messo 86 secondi!
Inizio: mar 17 dic 2019, 14:48:32, CET...fine: mar 17 dic 2019, 14:49:43, CET --> ci ho messo 71 secondi!
Inizio: mar 17 dic 2019, 14:49:43, CET...fine: mar 17 dic 2019, 14:51:10, CET --> ci ho messo 87 secondi!
Inizio: mar 17 dic 2019, 14:51:10, CET...fine: mar 17 dic 2019, 14:52:23, CET --> ci ho messo 73 secondi!
Inizio: mar 17 dic 2019, 14:52:23, CET...fine: mar 17 dic 2019, 14:53:49, CET --> ci ho messo 86 secondi!
Inizio: mar 17 dic 2019, 14:53:49, CET...fine: mar 17 dic 2019, 14:55:00, CET --> ci ho messo 71 secondi!
Inizio: mar 17 dic 2019, 14:55:00, CET...fine: mar 17 dic 2019, 14:56:26, CET --> ci ho messo 86 secondi!
Inizio: mar 17 dic 2019, 14:56:26, CET...fine: mar 17 dic 2019, 14:57:37, CET --> ci ho messo 71 secondi!
Inizio: mar 17 dic 2019, 14:57:37, CET...fine: mar 17 dic 2019, 14:59:03, CET --> ci ho messo 86 secondi!
Inizio: mar 17 dic 2019, 14:59:03, CET...fine: mar 17 dic 2019, 15:00:14, CET --> ci ho messo 71 secondi!
Inizio: mar 17 dic 2019, 15:00:14, CET...fine: mar 17 dic 2019, 15:01:40, CET --> ci ho messo 86 secondi!
Inizio: mar 17 dic 2019, 15:01:40, CET...fine: mar 17 dic 2019, 15:02:51, CET --> ci ho messo 71 secondi!
Inizio: mar 17 dic 2019, 15:02:51, CET...fine: mar 17 dic 2019, 15:04:17, CET --> ci ho messo 86 secondi!
Inizio: mar 17 dic 2019, 15:04:17, CET...fine: mar 17 dic 2019, 15:05:28, CET --> ci ho messo 71 secondi!
Inizio: mar 17 dic 2019, 15:05:28, CET...fine: mar 17 dic 2019, 15:06:53, CET --> ci ho messo 85 secondi!
Inizio: mar 17 dic 2019, 15:06:53, CET...fine: mar 17 dic 2019, 15:08:04, CET --> ci ho messo 70 secondi!
Inizio: mar 17 dic 2019, 15:08:04, CET...fine: mar 17 dic 2019, 15:09:30, CET --> ci ho messo 86 secondi!
Inizio: mar 17 dic 2019, 15:09:30, CET...fine: mar 17 dic 2019, 15:10:41, CET --> ci ho messo 71 secondi!
Inizio: mar 17 dic 2019, 15:10:41, CET...fine: mar 17 dic 2019, 15:12:07, CET --> ci ho messo 86 secondi!
Inizio: mar 17 dic 2019, 15:12:07, CET...fine: mar 17 dic 2019, 15:13:18, CET --> ci ho messo 71 secondi!
Inizio: mar 17 dic 2019, 15:13:18, CET...fine: mar 17 dic 2019, 15:14:44, CET --> ci ho messo 86 secondi!
Inizio: mar 17 dic 2019, 15:14:44, CET...fine: mar 17 dic 2019, 15:15:55, CET --> ci ho messo 71 secondi!
Inizio: mar 17 dic 2019, 15:15:55, CET...fine: mar 17 dic 2019, 15:17:21, CET --> ci ho messo 86 secondi!
Inizio: mar 17 dic 2019, 15:17:21, CET...fine: mar 17 dic 2019, 15:18:32, CET --> ci ho messo 71 secondi!
Inizio: mar 17 dic 2019, 15:18:32, CET...fine: mar 17 dic 2019, 15:19:58, CET --> ci ho messo 86 secondi!
Inizio: mar 17 dic 2019, 15:19:58, CET...fine: mar 17 dic 2019, 15:21:08, CET --> ci ho messo 70 secondi!
Inizio: mar 17 dic 2019, 15:21:08, CET...fine: mar 17 dic 2019, 15:22:34, CET --> ci ho messo 86 secondi!
Inizio: mar 17 dic 2019, 15:22:34, CET...fine: mar 17 dic 2019, 15:23:45, CET --> ci ho messo 71 secondi!
Inizio: mar 17 dic 2019, 15:23:45, CET...fine: mar 17 dic 2019, 15:25:11, CET --> ci ho messo 86 secondi!
Inizio: mar 17 dic 2019, 15:25:11, CET...fine: mar 17 dic 2019, 15:26:22, CET --> ci ho messo 71 secondi!
Inizio: mar 17 dic 2019, 15:26:22, CET...fine: mar 17 dic 2019, 15:27:48, CET --> ci ho messo 86 secondi!
Inizio: mar 17 dic 2019, 15:27:48, CET...fine: mar 17 dic 2019, 15:28:59, CET --> ci ho messo 71 secondi!
Inizio: mar 17 dic 2019, 15:28:59, CET...fine: mar 17 dic 2019, 15:30:25, CET --> ci ho messo 86 secondi!
Inizio: mar 17 dic 2019, 15:30:25, CET...fine: mar 17 dic 2019, 15:31:35, CET --> ci ho messo 70 secondi!
Inizio: mar 17 dic 2019, 15:31:35, CET...fine: mar 17 dic 2019, 15:33:03, CET --> ci ho messo 87 secondi!
Inizio: mar 17 dic 2019, 15:33:03, CET...fine: mar 17 dic 2019, 15:34:14, CET --> ci ho messo 71 secondi!
Inizio: mar 17 dic 2019, 15:34:14, CET...fine: mar 17 dic 2019, 15:35:40, CET --> ci ho messo 86 secondi!
Inizio: mar 17 dic 2019, 15:35:40, CET...fine: mar 17 dic 2019, 15:36:51, CET --> ci ho messo 71 secondi!
Inizio: mar 17 dic 2019, 15:36:51, CET...fine: mar 17 dic 2019, 15:38:17, CET --> ci ho messo 86 secondi!
Inizio: mar 17 dic 2019, 15:38:17, CET...fine: mar 17 dic 2019, 15:39:28, CET --> ci ho messo 71 secondi!
Inizio: mar 17 dic 2019, 15:39:28, CET...fine: mar 17 dic 2019, 15:40:54, CET --> ci ho messo 86 secondi!
Inizio: mar 17 dic 2019, 15:40:54, CET...fine: mar 17 dic 2019, 15:42:05, CET --> ci ho messo 71 secondi!

[-- Attachment #3: test_xfs_20191217.txt --]
[-- Type: text/plain, Size: 11206 bytes --]

*** TEST XFS: *** -> test_xfs_20191217.txt

Starting 100 tries with:
Linux angus.unipv.it 5.4.0+ #1 SMP Mon Nov 25 11:31:34 CET 2019 x86_64 x86_64 x86_64 GNU/Linux
-rw-r--r--. 1 root root 1,0G 25 nov 13.29 /NoBackup/testfile
/dev/sda1: LABEL="Fedora30" UUID="a7ca2491-c807-4b10-b33f-ef425699148d" TYPE="ext4" PARTUUID="8b16fbdd-01"
/dev/sda2: LABEL="Swap_4GB" UUID="ba020b1e-4cdc-4f94-b92c-bdc11613388d" TYPE="swap" PARTUUID="8b16fbdd-02"
/dev/sdf1: UUID="eb5a4791-5b26-44b6-871e-efd464a3adc5" TYPE="xfs" PARTUUID="09066b88-01"
cat /sys/block/sdf/queue/scheduler --> [mq-deadline] none
Inizio: mar 17 dic 2019, 23:58:22, CET...fine: mar 17 dic 2019, 23:59:28, CET --> ci ho messo 64 secondi!
Inizio: mar 17 dic 2019, 23:59:28, CET...fine: mer 18 dic 2019, 00:00:33, CET --> ci ho messo 63 secondi!
Inizio: mer 18 dic 2019, 00:00:33, CET...fine: mer 18 dic 2019, 00:01:39, CET --> ci ho messo 64 secondi!
Inizio: mer 18 dic 2019, 00:01:39, CET...fine: mer 18 dic 2019, 00:06:35, CET --> ci ho messo 294 secondi!
Inizio: mer 18 dic 2019, 00:06:35, CET...fine: mer 18 dic 2019, 00:07:41, CET --> ci ho messo 64 secondi!
Inizio: mer 18 dic 2019, 00:07:41, CET...fine: mer 18 dic 2019, 00:08:46, CET --> ci ho messo 63 secondi!
Inizio: mer 18 dic 2019, 00:08:46, CET...fine: mer 18 dic 2019, 00:09:52, CET --> ci ho messo 64 secondi!
Inizio: mer 18 dic 2019, 00:09:52, CET...fine: mer 18 dic 2019, 00:10:57, CET --> ci ho messo 63 secondi!
Inizio: mer 18 dic 2019, 00:10:57, CET...fine: mer 18 dic 2019, 00:12:03, CET --> ci ho messo 65 secondi!
Inizio: mer 18 dic 2019, 00:12:03, CET...fine: mer 18 dic 2019, 00:13:08, CET --> ci ho messo 63 secondi!
Inizio: mer 18 dic 2019, 00:13:08, CET...fine: mer 18 dic 2019, 00:14:14, CET --> ci ho messo 65 secondi!
Inizio: mer 18 dic 2019, 00:14:14, CET...fine: mer 18 dic 2019, 00:15:19, CET --> ci ho messo 64 secondi!
Inizio: mer 18 dic 2019, 00:15:19, CET...fine: mer 18 dic 2019, 00:21:39, CET --> ci ho messo 379 secondi!
Inizio: mer 18 dic 2019, 00:21:39, CET...fine: mer 18 dic 2019, 00:22:44, CET --> ci ho messo 64 secondi!
Inizio: mer 18 dic 2019, 00:22:44, CET...fine: mer 18 dic 2019, 00:23:50, CET --> ci ho messo 65 secondi!
Inizio: mer 18 dic 2019, 00:23:50, CET...fine: mer 18 dic 2019, 00:29:16, CET --> ci ho messo 325 secondi!
Inizio: mer 18 dic 2019, 00:29:16, CET...fine: mer 18 dic 2019, 00:30:22, CET --> ci ho messo 64 secondi!
Inizio: mer 18 dic 2019, 00:30:22, CET...fine: mer 18 dic 2019, 00:34:50, CET --> ci ho messo 266 secondi!
Inizio: mer 18 dic 2019, 00:34:50, CET...fine: mer 18 dic 2019, 00:35:56, CET --> ci ho messo 65 secondi!
Inizio: mer 18 dic 2019, 00:35:56, CET...fine: mer 18 dic 2019, 00:37:01, CET --> ci ho messo 64 secondi!
Inizio: mer 18 dic 2019, 00:37:01, CET...fine: mer 18 dic 2019, 00:43:39, CET --> ci ho messo 397 secondi!
Inizio: mer 18 dic 2019, 00:43:39, CET...fine: mer 18 dic 2019, 00:48:31, CET --> ci ho messo 291 secondi!
Inizio: mer 18 dic 2019, 00:48:31, CET...fine: mer 18 dic 2019, 00:49:37, CET --> ci ho messo 65 secondi!
Inizio: mer 18 dic 2019, 00:49:37, CET...fine: mer 18 dic 2019, 00:50:42, CET --> ci ho messo 64 secondi!
Inizio: mer 18 dic 2019, 00:50:42, CET...fine: mer 18 dic 2019, 00:55:39, CET --> ci ho messo 296 secondi!
Inizio: mer 18 dic 2019, 00:55:39, CET...fine: mer 18 dic 2019, 00:56:44, CET --> ci ho messo 64 secondi!
Inizio: mer 18 dic 2019, 00:56:44, CET...fine: mer 18 dic 2019, 00:57:50, CET --> ci ho messo 65 secondi!
Inizio: mer 18 dic 2019, 00:57:50, CET...fine: mer 18 dic 2019, 00:58:54, CET --> ci ho messo 63 secondi!
Inizio: mer 18 dic 2019, 00:58:54, CET...fine: mer 18 dic 2019, 01:00:01, CET --> ci ho messo 65 secondi!
Inizio: mer 18 dic 2019, 01:00:01, CET...fine: mer 18 dic 2019, 01:01:05, CET --> ci ho messo 63 secondi!
Inizio: mer 18 dic 2019, 01:01:05, CET...fine: mer 18 dic 2019, 01:02:11, CET --> ci ho messo 64 secondi!
Inizio: mer 18 dic 2019, 01:02:11, CET...fine: mer 18 dic 2019, 01:03:16, CET --> ci ho messo 63 secondi!
Inizio: mer 18 dic 2019, 01:03:16, CET...fine: mer 18 dic 2019, 01:04:22, CET --> ci ho messo 64 secondi!
Inizio: mer 18 dic 2019, 01:04:22, CET...fine: mer 18 dic 2019, 01:05:27, CET --> ci ho messo 63 secondi!
Inizio: mer 18 dic 2019, 01:05:27, CET...fine: mer 18 dic 2019, 01:11:38, CET --> ci ho messo 369 secondi!
Inizio: mer 18 dic 2019, 01:11:38, CET...fine: mer 18 dic 2019, 01:12:43, CET --> ci ho messo 63 secondi!
Inizio: mer 18 dic 2019, 01:12:43, CET...fine: mer 18 dic 2019, 01:18:20, CET --> ci ho messo 336 secondi!
Inizio: mer 18 dic 2019, 01:18:20, CET...fine: mer 18 dic 2019, 01:19:25, CET --> ci ho messo 64 secondi!
Inizio: mer 18 dic 2019, 01:19:25, CET...fine: mer 18 dic 2019, 01:21:01, CET --> ci ho messo 95 secondi!
Inizio: mer 18 dic 2019, 01:21:01, CET...fine: mer 18 dic 2019, 01:22:06, CET --> ci ho messo 64 secondi!
Inizio: mer 18 dic 2019, 01:22:06, CET...fine: mer 18 dic 2019, 01:23:12, CET --> ci ho messo 65 secondi!
Inizio: mer 18 dic 2019, 01:23:12, CET...fine: mer 18 dic 2019, 01:29:43, CET --> ci ho messo 390 secondi!
Inizio: mer 18 dic 2019, 01:29:43, CET...fine: mer 18 dic 2019, 01:30:49, CET --> ci ho messo 64 secondi!
Inizio: mer 18 dic 2019, 01:30:49, CET...fine: mer 18 dic 2019, 01:35:36, CET --> ci ho messo 285 secondi!
Inizio: mer 18 dic 2019, 01:35:36, CET...fine: mer 18 dic 2019, 01:36:44, CET --> ci ho messo 66 secondi!
Inizio: mer 18 dic 2019, 01:36:44, CET...fine: mer 18 dic 2019, 01:37:48, CET --> ci ho messo 63 secondi!
Inizio: mer 18 dic 2019, 01:37:48, CET...fine: mer 18 dic 2019, 01:38:55, CET --> ci ho messo 65 secondi!
Inizio: mer 18 dic 2019, 01:38:55, CET...fine: mer 18 dic 2019, 01:44:04, CET --> ci ho messo 308 secondi!
Inizio: mer 18 dic 2019, 01:44:04, CET...fine: mer 18 dic 2019, 01:45:10, CET --> ci ho messo 65 secondi!
Inizio: mer 18 dic 2019, 01:45:10, CET...fine: mer 18 dic 2019, 01:49:42, CET --> ci ho messo 270 secondi!
Inizio: mer 18 dic 2019, 01:49:42, CET...fine: mer 18 dic 2019, 01:50:48, CET --> ci ho messo 64 secondi!
Inizio: mer 18 dic 2019, 01:50:48, CET...fine: mer 18 dic 2019, 01:51:53, CET --> ci ho messo 63 secondi!
Inizio: mer 18 dic 2019, 01:51:53, CET...fine: mer 18 dic 2019, 01:58:18, CET --> ci ho messo 383 secondi!
Inizio: mer 18 dic 2019, 01:58:18, CET...fine: mer 18 dic 2019, 01:59:23, CET --> ci ho messo 63 secondi!
Inizio: mer 18 dic 2019, 01:59:23, CET...fine: mer 18 dic 2019, 02:00:29, CET --> ci ho messo 65 secondi!
Inizio: mer 18 dic 2019, 02:00:29, CET...fine: mer 18 dic 2019, 02:01:34, CET --> ci ho messo 63 secondi!
Inizio: mer 18 dic 2019, 02:01:34, CET...fine: mer 18 dic 2019, 02:02:40, CET --> ci ho messo 64 secondi!
Inizio: mer 18 dic 2019, 02:02:40, CET...fine: mer 18 dic 2019, 02:03:45, CET --> ci ho messo 63 secondi!
Inizio: mer 18 dic 2019, 02:03:45, CET...fine: mer 18 dic 2019, 02:04:51, CET --> ci ho messo 64 secondi!
Inizio: mer 18 dic 2019, 02:04:51, CET...fine: mer 18 dic 2019, 02:05:56, CET --> ci ho messo 63 secondi!
Inizio: mer 18 dic 2019, 02:05:56, CET...fine: mer 18 dic 2019, 02:07:02, CET --> ci ho messo 64 secondi!
Inizio: mer 18 dic 2019, 02:07:02, CET...fine: mer 18 dic 2019, 02:08:07, CET --> ci ho messo 63 secondi!
Inizio: mer 18 dic 2019, 02:08:07, CET...fine: mer 18 dic 2019, 02:09:14, CET --> ci ho messo 65 secondi!
Inizio: mer 18 dic 2019, 02:09:14, CET...fine: mer 18 dic 2019, 02:13:44, CET --> ci ho messo 269 secondi!
Inizio: mer 18 dic 2019, 02:13:44, CET...fine: mer 18 dic 2019, 02:14:51, CET --> ci ho messo 65 secondi!
Inizio: mer 18 dic 2019, 02:14:51, CET...fine: mer 18 dic 2019, 02:15:56, CET --> ci ho messo 64 secondi!
Inizio: mer 18 dic 2019, 02:15:56, CET...fine: mer 18 dic 2019, 02:17:02, CET --> ci ho messo 65 secondi!
Inizio: mer 18 dic 2019, 02:17:02, CET...fine: mer 18 dic 2019, 02:18:07, CET --> ci ho messo 64 secondi!
Inizio: mer 18 dic 2019, 02:18:07, CET...fine: mer 18 dic 2019, 02:19:13, CET --> ci ho messo 65 secondi!
Inizio: mer 18 dic 2019, 02:19:13, CET...fine: mer 18 dic 2019, 02:20:18, CET --> ci ho messo 64 secondi!
Inizio: mer 18 dic 2019, 02:20:18, CET...fine: mer 18 dic 2019, 02:21:24, CET --> ci ho messo 65 secondi!
Inizio: mer 18 dic 2019, 02:21:24, CET...fine: mer 18 dic 2019, 02:22:29, CET --> ci ho messo 64 secondi!
Inizio: mer 18 dic 2019, 02:22:29, CET...fine: mer 18 dic 2019, 02:23:35, CET --> ci ho messo 65 secondi!
Inizio: mer 18 dic 2019, 02:23:35, CET...fine: mer 18 dic 2019, 02:24:40, CET --> ci ho messo 64 secondi!
Inizio: mer 18 dic 2019, 02:24:40, CET...fine: mer 18 dic 2019, 02:25:46, CET --> ci ho messo 65 secondi!
Inizio: mer 18 dic 2019, 02:25:46, CET...fine: mer 18 dic 2019, 02:26:51, CET --> ci ho messo 64 secondi!
Inizio: mer 18 dic 2019, 02:26:51, CET...fine: mer 18 dic 2019, 02:27:57, CET --> ci ho messo 65 secondi!
Inizio: mer 18 dic 2019, 02:27:57, CET...fine: mer 18 dic 2019, 02:29:02, CET --> ci ho messo 64 secondi!
Inizio: mer 18 dic 2019, 02:29:02, CET...fine: mer 18 dic 2019, 02:30:08, CET --> ci ho messo 65 secondi!
Inizio: mer 18 dic 2019, 02:30:08, CET...fine: mer 18 dic 2019, 02:31:13, CET --> ci ho messo 63 secondi!
Inizio: mer 18 dic 2019, 02:31:13, CET...fine: mer 18 dic 2019, 02:32:19, CET --> ci ho messo 64 secondi!
Inizio: mer 18 dic 2019, 02:32:19, CET...fine: mer 18 dic 2019, 02:40:06, CET --> ci ho messo 465 secondi!
Inizio: mer 18 dic 2019, 02:40:06, CET...fine: mer 18 dic 2019, 02:41:12, CET --> ci ho messo 64 secondi!
Inizio: mer 18 dic 2019, 02:41:12, CET...fine: mer 18 dic 2019, 02:42:17, CET --> ci ho messo 63 secondi!
Inizio: mer 18 dic 2019, 02:42:17, CET...fine: mer 18 dic 2019, 02:43:23, CET --> ci ho messo 64 secondi!
Inizio: mer 18 dic 2019, 02:43:23, CET...fine: mer 18 dic 2019, 02:44:28, CET --> ci ho messo 63 secondi!
Inizio: mer 18 dic 2019, 02:44:28, CET...fine: mer 18 dic 2019, 02:45:34, CET --> ci ho messo 64 secondi!
Inizio: mer 18 dic 2019, 02:45:34, CET...fine: mer 18 dic 2019, 02:46:39, CET --> ci ho messo 63 secondi!
Inizio: mer 18 dic 2019, 02:46:39, CET...fine: mer 18 dic 2019, 02:47:45, CET --> ci ho messo 64 secondi!
Inizio: mer 18 dic 2019, 02:47:45, CET...fine: mer 18 dic 2019, 02:48:50, CET --> ci ho messo 63 secondi!
Inizio: mer 18 dic 2019, 02:48:50, CET...fine: mer 18 dic 2019, 02:54:26, CET --> ci ho messo 334 secondi!
Inizio: mer 18 dic 2019, 02:54:26, CET...fine: mer 18 dic 2019, 02:55:31, CET --> ci ho messo 63 secondi!
Inizio: mer 18 dic 2019, 02:55:31, CET...fine: mer 18 dic 2019, 02:57:11, CET --> ci ho messo 98 secondi!
Inizio: mer 18 dic 2019, 02:57:11, CET...fine: mer 18 dic 2019, 02:58:16, CET --> ci ho messo 63 secondi!
Inizio: mer 18 dic 2019, 02:58:16, CET...fine: mer 18 dic 2019, 02:59:22, CET --> ci ho messo 64 secondi!
Inizio: mer 18 dic 2019, 02:59:22, CET...fine: mer 18 dic 2019, 03:00:27, CET --> ci ho messo 63 secondi!
Inizio: mer 18 dic 2019, 03:00:27, CET...fine: mer 18 dic 2019, 03:05:55, CET --> ci ho messo 326 secondi!
Inizio: mer 18 dic 2019, 03:05:55, CET...fine: mer 18 dic 2019, 03:11:49, CET --> ci ho messo 352 secondi!
Inizio: mer 18 dic 2019, 03:11:49, CET...fine: mer 18 dic 2019, 03:12:56, CET --> ci ho messo 66 secondi!
Inizio: mer 18 dic 2019, 03:12:56, CET...fine: mer 18 dic 2019, 03:14:01, CET --> ci ho messo 63 secondi!


[-- Attachment #4: trace_20191217.zip --]
[-- Type: application/zip, Size: 14385 bytes --]

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: AW: Slow I/O on USB media after commit f664a3cc17b7d0a2bc3b3ab96181e1029b0ec0e6
  2019-12-18  8:25                                 ` Andrea Vai
@ 2019-12-18  9:48                                   ` Ming Lei
       [not found]                                     ` <b1b6a0e9d690ecd9432025acd2db4ac09f834040.camel@unipv.it>
  0 siblings, 1 reply; 29+ messages in thread
From: Ming Lei @ 2019-12-18  9:48 UTC (permalink / raw)
  To: Andrea Vai
  Cc: Theodore Y. Ts'o, Schmid, Carsten, Finn Thain,
	Damien Le Moal, Alan Stern, Jens Axboe, Johannes Thumshirn,
	USB list, SCSI development list, Himanshu Madhani,
	Hannes Reinecke, Omar Sandoval, Martin K. Petersen, Greg KH,
	Hans Holmberg, Kernel development list, linux-ext4,
	linux-fsdevel

On Wed, Dec 18, 2019 at 09:25:02AM +0100, Andrea Vai wrote:
> Il giorno gio, 12/12/2019 alle 05.33 +0800, Ming Lei ha scritto:
> > On Wed, Dec 11, 2019 at 11:07:45AM -0500, Theodore Y. Ts'o wrote:
> > > On Wed, Dec 11, 2019 at 12:00:58PM +0800, Ming Lei wrote:
> > > > I didn't reproduce the issue in my test environment, and follows
> > > > Andrea's test commands[1]:
> > > > 
> > > >   mount UUID=$uuid /mnt/pendrive 2>&1 |tee -a $logfile
> > > >   SECONDS=0
> > > >   cp $testfile /mnt/pendrive 2>&1 |tee -a $logfile
> > > >   umount /mnt/pendrive 2>&1 |tee -a $logfile
> > > > 
> > > > The 'cp' command supposes to open/close the file just once,
> > however
> > > > ext4_release_file() & write pages is observed to run for 4358
> > times
> > > > when executing the above 'cp' test.
> > > 
> > > Why are we sure the ext4_release_file() / _fput() is coming from
> > the
> > > cp command, as opposed to something else that might be running on
> > the
> > > system under test?  _fput() is called by the kernel when the last
> > 
> > Please see the log:
> > 
> > https://lore.kernel.org/linux-scsi/3af3666920e7d46f8f0c6d88612f143ffabc743c.camel@unipv.it/2-log_ming.zip
> > 
> > Which is collected by:
> > 
> > #!/bin/sh
> > MAJ=$1
> > MIN=$2
> > MAJ=$(( $MAJ << 20 ))
> > DEV=$(( $MAJ | $MIN ))
> > 
> > /usr/share/bcc/tools/trace -t -C \
> >     't:block:block_rq_issue (args->dev == '$DEV') "%s %d %d", args-
> > >rwbs, args->sector, args->nr_sector' \
> >     't:block:block_rq_insert (args->dev == '$DEV') "%s %d %d", args-
> > >rwbs, args->sector, args->nr_sector'
> > 
> > $MAJ:$MIN points to the USB storage disk.
> > 
> > From the above IO trace, there are two write paths, one is from cp,
> > another is from writeback wq.
> > 
> > The stackcount trace[1] is consistent with the IO trace log since it
> > only shows two IO paths, that is why I concluded that the write done
> > via
> > ext4_release_file() is from 'cp'.
> > 
> > [1] 
> > https://lore.kernel.org/linux-scsi/320b315b9c87543d4fb919ecbdf841596c8fbcea.camel@unipv.it/2-log_ming_20191129_150609.zip
> > 
> > > reference to a struct file is released.  (Specifically, if you
> > have a
> > > fd which is dup'ed, it's only when the last fd corresponding to
> > the
> > > struct file is closed, and the struct file is about to be
> > released,
> > > does the file system's f_ops->release function get called.)
> > > 
> > > So the first question I'd ask is whether there is anything else
> > going
> > > on the system, and whether the writes are happening to the USB
> > thumb
> > > drive, or to some other storage device.  And if there is something
> > > else which is writing to the pendrive, maybe that's why no one
> > else
> > > has been able to reproduce the OP's complaint....
> > 
> > OK, we can ask Andrea to confirm that via the following trace, which
> > will add pid/comm info in the stack trace:
> > 
> > /usr/share/bcc/tools/stackcount blk_mq_sched_request_inserted
> > 
> > Andrew, could you collect the above log again when running new/bad
> > kernel for confirming if the write done by ext4_release_file() is
> > from
> > the 'cp' process?
> 
> You can find the stackcount log attached. It has been produced by:
> 
> - /usr/share/bcc/tools/stackcount blk_mq_sched_request_inserted > trace.log
> - wait some seconds
> - run the test (1 copy trial), wait for the test to finish, wait some seconds
> - stop the trace (ctrl+C)

Thanks for collecting the log, looks your 'stackcount' doesn't include
comm/pid info, seems there is difference between your bcc and
my bcc in fedora 30.

Could you collect above log again via the following command?

/usr/share/bcc/tools/stackcount -P -K t:block:block_rq_insert

which will show the comm/pid info.

Sorry for not seeing the bcc difference.

> 
> The test took 1994 seconds to complete.
> 
> I also tried the usual test with btrfs and xfs. Btrfs behavior looks
> "good". xfs seems sometimes better, sometimes worse, I would say. I
> don't know if it matters, anyway you can also find the results of the
> two tests (100 trials each). Basically, btrfs is always between 68 and
> 89 seconds, with a cyclicity (?) with "period=2 trials". xfs looks
> almost always very good (63-65s), but sometimes "bad" (>300s).

If you are interested in digging into this one, the following trace
should be helpful:

https://lore.kernel.org/linux-scsi/f38db337cf26390f7c7488a0bc2076633737775b.camel@unipv.it/T/#m5aa008626e07913172ad40e1eb8e5f2ffd560fc6


Thanks,
Ming

> 
> Thanks,
> Andrea

> *** test btrfs *** -> test_btrfs_20191217.txt
> 
> Starting 100 tries with:
> Linux angus.unipv.it 5.4.0+ #1 SMP Mon Nov 25 11:31:34 CET 2019 x86_64 x86_64 x86_64 GNU/Linux
> -rw-r--r--. 1 root root 1,0G 25 nov 13.29 /NoBackup/testfile
> /dev/sda1: LABEL="Fedora30" UUID="a7ca2491-c807-4b10-b33f-ef425699148d" TYPE="ext4" PARTUUID="8b16fbdd-01"
> /dev/sda2: LABEL="Swap_4GB" UUID="ba020b1e-4cdc-4f94-b92c-bdc11613388d" TYPE="swap" PARTUUID="8b16fbdd-02"
> /dev/sdf1: LABEL="BAK_ANDVAI" UUID="6ddfec28-3d9a-4676-a726-927fd3fb21e7" UUID_SUB="581c69ab-6758-4662-999a-b6dfe6ee5e69" TYPE="btrfs" PARTUUID="09066b88-01"
> /dev/sdg1: LABEL="BAK_ANDVAI" UUID="df777e33-8890-4cee-a718-42233f4cafae" TYPE="ext4" PARTUUID="75265898-01"
> cat /sys/block/sdf/queue/scheduler --> [mq-deadline] none
> Inizio: mar 17 dic 2019, 13:31:00, CET...fine: mar 17 dic 2019, 13:32:26, CET --> ci ho messo 86 secondi!
> Inizio: mar 17 dic 2019, 13:32:26, CET...fine: mar 17 dic 2019, 13:33:36, CET --> ci ho messo 70 secondi!
> Inizio: mar 17 dic 2019, 13:33:36, CET...fine: mar 17 dic 2019, 13:35:02, CET --> ci ho messo 86 secondi!
> Inizio: mar 17 dic 2019, 13:35:02, CET...fine: mar 17 dic 2019, 13:36:14, CET --> ci ho messo 72 secondi!
> Inizio: mar 17 dic 2019, 13:36:14, CET...fine: mar 17 dic 2019, 13:37:40, CET --> ci ho messo 86 secondi!
> Inizio: mar 17 dic 2019, 13:37:40, CET...fine: mar 17 dic 2019, 13:38:51, CET --> ci ho messo 71 secondi!
> Inizio: mar 17 dic 2019, 13:38:51, CET...fine: mar 17 dic 2019, 13:40:18, CET --> ci ho messo 87 secondi!
> Inizio: mar 17 dic 2019, 13:40:18, CET...fine: mar 17 dic 2019, 13:41:29, CET --> ci ho messo 71 secondi!
> Inizio: mar 17 dic 2019, 13:41:29, CET...fine: mar 17 dic 2019, 13:42:58, CET --> ci ho messo 89 secondi!
> Inizio: mar 17 dic 2019, 13:42:58, CET...fine: mar 17 dic 2019, 13:44:11, CET --> ci ho messo 73 secondi!
> Inizio: mar 17 dic 2019, 13:44:11, CET...fine: mar 17 dic 2019, 13:45:40, CET --> ci ho messo 89 secondi!
> Inizio: mar 17 dic 2019, 13:45:40, CET...fine: mar 17 dic 2019, 13:46:49, CET --> ci ho messo 69 secondi!
> Inizio: mar 17 dic 2019, 13:46:49, CET...fine: mar 17 dic 2019, 13:48:16, CET --> ci ho messo 87 secondi!
> Inizio: mar 17 dic 2019, 13:48:16, CET...fine: mar 17 dic 2019, 13:49:27, CET --> ci ho messo 71 secondi!
> Inizio: mar 17 dic 2019, 13:49:27, CET...fine: mar 17 dic 2019, 13:50:53, CET --> ci ho messo 86 secondi!
> Inizio: mar 17 dic 2019, 13:50:53, CET...fine: mar 17 dic 2019, 13:52:04, CET --> ci ho messo 71 secondi!
> Inizio: mar 17 dic 2019, 13:52:04, CET...fine: mar 17 dic 2019, 13:53:30, CET --> ci ho messo 86 secondi!
> Inizio: mar 17 dic 2019, 13:53:30, CET...fine: mar 17 dic 2019, 13:54:41, CET --> ci ho messo 71 secondi!
> Inizio: mar 17 dic 2019, 13:54:41, CET...fine: mar 17 dic 2019, 13:56:07, CET --> ci ho messo 86 secondi!
> Inizio: mar 17 dic 2019, 13:56:07, CET...fine: mar 17 dic 2019, 13:57:18, CET --> ci ho messo 71 secondi!
> Inizio: mar 17 dic 2019, 13:57:18, CET...fine: mar 17 dic 2019, 13:58:46, CET --> ci ho messo 88 secondi!
> Inizio: mar 17 dic 2019, 13:58:46, CET...fine: mar 17 dic 2019, 13:59:57, CET --> ci ho messo 71 secondi!
> Inizio: mar 17 dic 2019, 13:59:57, CET...fine: mar 17 dic 2019, 14:01:23, CET --> ci ho messo 86 secondi!
> Inizio: mar 17 dic 2019, 14:01:23, CET...fine: mar 17 dic 2019, 14:02:34, CET --> ci ho messo 71 secondi!
> Inizio: mar 17 dic 2019, 14:02:34, CET...fine: mar 17 dic 2019, 14:04:00, CET --> ci ho messo 86 secondi!
> Inizio: mar 17 dic 2019, 14:04:00, CET...fine: mar 17 dic 2019, 14:05:11, CET --> ci ho messo 71 secondi!
> Inizio: mar 17 dic 2019, 14:05:11, CET...fine: mar 17 dic 2019, 14:06:38, CET --> ci ho messo 87 secondi!
> Inizio: mar 17 dic 2019, 14:06:38, CET...fine: mar 17 dic 2019, 14:07:49, CET --> ci ho messo 71 secondi!
> Inizio: mar 17 dic 2019, 14:07:49, CET...fine: mar 17 dic 2019, 14:09:15, CET --> ci ho messo 86 secondi!
> Inizio: mar 17 dic 2019, 14:09:15, CET...fine: mar 17 dic 2019, 14:10:26, CET --> ci ho messo 71 secondi!
> Inizio: mar 17 dic 2019, 14:10:26, CET...fine: mar 17 dic 2019, 14:11:52, CET --> ci ho messo 86 secondi!
> Inizio: mar 17 dic 2019, 14:11:52, CET...fine: mar 17 dic 2019, 14:13:03, CET --> ci ho messo 71 secondi!
> Inizio: mar 17 dic 2019, 14:13:03, CET...fine: mar 17 dic 2019, 14:14:29, CET --> ci ho messo 86 secondi!
> Inizio: mar 17 dic 2019, 14:14:29, CET...fine: mar 17 dic 2019, 14:15:41, CET --> ci ho messo 72 secondi!
> Inizio: mar 17 dic 2019, 14:15:41, CET...fine: mar 17 dic 2019, 14:17:07, CET --> ci ho messo 86 secondi!
> Inizio: mar 17 dic 2019, 14:17:07, CET...fine: mar 17 dic 2019, 14:18:18, CET --> ci ho messo 71 secondi!
> Inizio: mar 17 dic 2019, 14:18:18, CET...fine: mar 17 dic 2019, 14:19:44, CET --> ci ho messo 86 secondi!
> Inizio: mar 17 dic 2019, 14:19:44, CET...fine: mar 17 dic 2019, 14:20:55, CET --> ci ho messo 71 secondi!
> Inizio: mar 17 dic 2019, 14:20:55, CET...fine: mar 17 dic 2019, 14:22:21, CET --> ci ho messo 86 secondi!
> Inizio: mar 17 dic 2019, 14:22:21, CET...fine: mar 17 dic 2019, 14:23:33, CET --> ci ho messo 72 secondi!
> Inizio: mar 17 dic 2019, 14:23:33, CET...fine: mar 17 dic 2019, 14:24:59, CET --> ci ho messo 86 secondi!
> Inizio: mar 17 dic 2019, 14:24:59, CET...fine: mar 17 dic 2019, 14:26:10, CET --> ci ho messo 71 secondi!
> Inizio: mar 17 dic 2019, 14:26:10, CET...fine: mar 17 dic 2019, 14:27:36, CET --> ci ho messo 86 secondi!
> Inizio: mar 17 dic 2019, 14:27:36, CET...fine: mar 17 dic 2019, 14:28:46, CET --> ci ho messo 70 secondi!
> Inizio: mar 17 dic 2019, 14:28:46, CET...fine: mar 17 dic 2019, 14:30:12, CET --> ci ho messo 85 secondi!
> Inizio: mar 17 dic 2019, 14:30:12, CET...fine: mar 17 dic 2019, 14:31:23, CET --> ci ho messo 71 secondi!
> Inizio: mar 17 dic 2019, 14:31:23, CET...fine: mar 17 dic 2019, 14:32:49, CET --> ci ho messo 86 secondi!
> Inizio: mar 17 dic 2019, 14:32:49, CET...fine: mar 17 dic 2019, 14:34:01, CET --> ci ho messo 72 secondi!
> Inizio: mar 17 dic 2019, 14:34:01, CET...fine: mar 17 dic 2019, 14:35:27, CET --> ci ho messo 86 secondi!
> Inizio: mar 17 dic 2019, 14:35:27, CET...fine: mar 17 dic 2019, 14:36:38, CET --> ci ho messo 71 secondi!
> Inizio: mar 17 dic 2019, 14:36:38, CET...fine: mar 17 dic 2019, 14:38:04, CET --> ci ho messo 86 secondi!
> Inizio: mar 17 dic 2019, 14:38:04, CET...fine: mar 17 dic 2019, 14:39:15, CET --> ci ho messo 71 secondi!
> Inizio: mar 17 dic 2019, 14:39:15, CET...fine: mar 17 dic 2019, 14:40:41, CET --> ci ho messo 86 secondi!
> Inizio: mar 17 dic 2019, 14:40:41, CET...fine: mar 17 dic 2019, 14:41:52, CET --> ci ho messo 71 secondi!
> Inizio: mar 17 dic 2019, 14:41:52, CET...fine: mar 17 dic 2019, 14:43:18, CET --> ci ho messo 86 secondi!
> Inizio: mar 17 dic 2019, 14:43:18, CET...fine: mar 17 dic 2019, 14:44:29, CET --> ci ho messo 71 secondi!
> Inizio: mar 17 dic 2019, 14:44:29, CET...fine: mar 17 dic 2019, 14:45:55, CET --> ci ho messo 86 secondi!
> Inizio: mar 17 dic 2019, 14:45:55, CET...fine: mar 17 dic 2019, 14:47:06, CET --> ci ho messo 71 secondi!
> Inizio: mar 17 dic 2019, 14:47:06, CET...fine: mar 17 dic 2019, 14:48:32, CET --> ci ho messo 86 secondi!
> Inizio: mar 17 dic 2019, 14:48:32, CET...fine: mar 17 dic 2019, 14:49:43, CET --> ci ho messo 71 secondi!
> Inizio: mar 17 dic 2019, 14:49:43, CET...fine: mar 17 dic 2019, 14:51:10, CET --> ci ho messo 87 secondi!
> Inizio: mar 17 dic 2019, 14:51:10, CET...fine: mar 17 dic 2019, 14:52:23, CET --> ci ho messo 73 secondi!
> Inizio: mar 17 dic 2019, 14:52:23, CET...fine: mar 17 dic 2019, 14:53:49, CET --> ci ho messo 86 secondi!
> Inizio: mar 17 dic 2019, 14:53:49, CET...fine: mar 17 dic 2019, 14:55:00, CET --> ci ho messo 71 secondi!
> Inizio: mar 17 dic 2019, 14:55:00, CET...fine: mar 17 dic 2019, 14:56:26, CET --> ci ho messo 86 secondi!
> Inizio: mar 17 dic 2019, 14:56:26, CET...fine: mar 17 dic 2019, 14:57:37, CET --> ci ho messo 71 secondi!
> Inizio: mar 17 dic 2019, 14:57:37, CET...fine: mar 17 dic 2019, 14:59:03, CET --> ci ho messo 86 secondi!
> Inizio: mar 17 dic 2019, 14:59:03, CET...fine: mar 17 dic 2019, 15:00:14, CET --> ci ho messo 71 secondi!
> Inizio: mar 17 dic 2019, 15:00:14, CET...fine: mar 17 dic 2019, 15:01:40, CET --> ci ho messo 86 secondi!
> Inizio: mar 17 dic 2019, 15:01:40, CET...fine: mar 17 dic 2019, 15:02:51, CET --> ci ho messo 71 secondi!
> Inizio: mar 17 dic 2019, 15:02:51, CET...fine: mar 17 dic 2019, 15:04:17, CET --> ci ho messo 86 secondi!
> Inizio: mar 17 dic 2019, 15:04:17, CET...fine: mar 17 dic 2019, 15:05:28, CET --> ci ho messo 71 secondi!
> Inizio: mar 17 dic 2019, 15:05:28, CET...fine: mar 17 dic 2019, 15:06:53, CET --> ci ho messo 85 secondi!
> Inizio: mar 17 dic 2019, 15:06:53, CET...fine: mar 17 dic 2019, 15:08:04, CET --> ci ho messo 70 secondi!
> Inizio: mar 17 dic 2019, 15:08:04, CET...fine: mar 17 dic 2019, 15:09:30, CET --> ci ho messo 86 secondi!
> Inizio: mar 17 dic 2019, 15:09:30, CET...fine: mar 17 dic 2019, 15:10:41, CET --> ci ho messo 71 secondi!
> Inizio: mar 17 dic 2019, 15:10:41, CET...fine: mar 17 dic 2019, 15:12:07, CET --> ci ho messo 86 secondi!
> Inizio: mar 17 dic 2019, 15:12:07, CET...fine: mar 17 dic 2019, 15:13:18, CET --> ci ho messo 71 secondi!
> Inizio: mar 17 dic 2019, 15:13:18, CET...fine: mar 17 dic 2019, 15:14:44, CET --> ci ho messo 86 secondi!
> Inizio: mar 17 dic 2019, 15:14:44, CET...fine: mar 17 dic 2019, 15:15:55, CET --> ci ho messo 71 secondi!
> Inizio: mar 17 dic 2019, 15:15:55, CET...fine: mar 17 dic 2019, 15:17:21, CET --> ci ho messo 86 secondi!
> Inizio: mar 17 dic 2019, 15:17:21, CET...fine: mar 17 dic 2019, 15:18:32, CET --> ci ho messo 71 secondi!
> Inizio: mar 17 dic 2019, 15:18:32, CET...fine: mar 17 dic 2019, 15:19:58, CET --> ci ho messo 86 secondi!
> Inizio: mar 17 dic 2019, 15:19:58, CET...fine: mar 17 dic 2019, 15:21:08, CET --> ci ho messo 70 secondi!
> Inizio: mar 17 dic 2019, 15:21:08, CET...fine: mar 17 dic 2019, 15:22:34, CET --> ci ho messo 86 secondi!
> Inizio: mar 17 dic 2019, 15:22:34, CET...fine: mar 17 dic 2019, 15:23:45, CET --> ci ho messo 71 secondi!
> Inizio: mar 17 dic 2019, 15:23:45, CET...fine: mar 17 dic 2019, 15:25:11, CET --> ci ho messo 86 secondi!
> Inizio: mar 17 dic 2019, 15:25:11, CET...fine: mar 17 dic 2019, 15:26:22, CET --> ci ho messo 71 secondi!
> Inizio: mar 17 dic 2019, 15:26:22, CET...fine: mar 17 dic 2019, 15:27:48, CET --> ci ho messo 86 secondi!
> Inizio: mar 17 dic 2019, 15:27:48, CET...fine: mar 17 dic 2019, 15:28:59, CET --> ci ho messo 71 secondi!
> Inizio: mar 17 dic 2019, 15:28:59, CET...fine: mar 17 dic 2019, 15:30:25, CET --> ci ho messo 86 secondi!
> Inizio: mar 17 dic 2019, 15:30:25, CET...fine: mar 17 dic 2019, 15:31:35, CET --> ci ho messo 70 secondi!
> Inizio: mar 17 dic 2019, 15:31:35, CET...fine: mar 17 dic 2019, 15:33:03, CET --> ci ho messo 87 secondi!
> Inizio: mar 17 dic 2019, 15:33:03, CET...fine: mar 17 dic 2019, 15:34:14, CET --> ci ho messo 71 secondi!
> Inizio: mar 17 dic 2019, 15:34:14, CET...fine: mar 17 dic 2019, 15:35:40, CET --> ci ho messo 86 secondi!
> Inizio: mar 17 dic 2019, 15:35:40, CET...fine: mar 17 dic 2019, 15:36:51, CET --> ci ho messo 71 secondi!
> Inizio: mar 17 dic 2019, 15:36:51, CET...fine: mar 17 dic 2019, 15:38:17, CET --> ci ho messo 86 secondi!
> Inizio: mar 17 dic 2019, 15:38:17, CET...fine: mar 17 dic 2019, 15:39:28, CET --> ci ho messo 71 secondi!
> Inizio: mar 17 dic 2019, 15:39:28, CET...fine: mar 17 dic 2019, 15:40:54, CET --> ci ho messo 86 secondi!
> Inizio: mar 17 dic 2019, 15:40:54, CET...fine: mar 17 dic 2019, 15:42:05, CET --> ci ho messo 71 secondi!

> *** TEST XFS: *** -> test_xfs_20191217.txt
> 
> Starting 100 tries with:
> Linux angus.unipv.it 5.4.0+ #1 SMP Mon Nov 25 11:31:34 CET 2019 x86_64 x86_64 x86_64 GNU/Linux
> -rw-r--r--. 1 root root 1,0G 25 nov 13.29 /NoBackup/testfile
> /dev/sda1: LABEL="Fedora30" UUID="a7ca2491-c807-4b10-b33f-ef425699148d" TYPE="ext4" PARTUUID="8b16fbdd-01"
> /dev/sda2: LABEL="Swap_4GB" UUID="ba020b1e-4cdc-4f94-b92c-bdc11613388d" TYPE="swap" PARTUUID="8b16fbdd-02"
> /dev/sdf1: UUID="eb5a4791-5b26-44b6-871e-efd464a3adc5" TYPE="xfs" PARTUUID="09066b88-01"
> cat /sys/block/sdf/queue/scheduler --> [mq-deadline] none
> Inizio: mar 17 dic 2019, 23:58:22, CET...fine: mar 17 dic 2019, 23:59:28, CET --> ci ho messo 64 secondi!
> Inizio: mar 17 dic 2019, 23:59:28, CET...fine: mer 18 dic 2019, 00:00:33, CET --> ci ho messo 63 secondi!
> Inizio: mer 18 dic 2019, 00:00:33, CET...fine: mer 18 dic 2019, 00:01:39, CET --> ci ho messo 64 secondi!
> Inizio: mer 18 dic 2019, 00:01:39, CET...fine: mer 18 dic 2019, 00:06:35, CET --> ci ho messo 294 secondi!
> Inizio: mer 18 dic 2019, 00:06:35, CET...fine: mer 18 dic 2019, 00:07:41, CET --> ci ho messo 64 secondi!
> Inizio: mer 18 dic 2019, 00:07:41, CET...fine: mer 18 dic 2019, 00:08:46, CET --> ci ho messo 63 secondi!
> Inizio: mer 18 dic 2019, 00:08:46, CET...fine: mer 18 dic 2019, 00:09:52, CET --> ci ho messo 64 secondi!
> Inizio: mer 18 dic 2019, 00:09:52, CET...fine: mer 18 dic 2019, 00:10:57, CET --> ci ho messo 63 secondi!
> Inizio: mer 18 dic 2019, 00:10:57, CET...fine: mer 18 dic 2019, 00:12:03, CET --> ci ho messo 65 secondi!
> Inizio: mer 18 dic 2019, 00:12:03, CET...fine: mer 18 dic 2019, 00:13:08, CET --> ci ho messo 63 secondi!
> Inizio: mer 18 dic 2019, 00:13:08, CET...fine: mer 18 dic 2019, 00:14:14, CET --> ci ho messo 65 secondi!
> Inizio: mer 18 dic 2019, 00:14:14, CET...fine: mer 18 dic 2019, 00:15:19, CET --> ci ho messo 64 secondi!
> Inizio: mer 18 dic 2019, 00:15:19, CET...fine: mer 18 dic 2019, 00:21:39, CET --> ci ho messo 379 secondi!
> Inizio: mer 18 dic 2019, 00:21:39, CET...fine: mer 18 dic 2019, 00:22:44, CET --> ci ho messo 64 secondi!
> Inizio: mer 18 dic 2019, 00:22:44, CET...fine: mer 18 dic 2019, 00:23:50, CET --> ci ho messo 65 secondi!
> Inizio: mer 18 dic 2019, 00:23:50, CET...fine: mer 18 dic 2019, 00:29:16, CET --> ci ho messo 325 secondi!
> Inizio: mer 18 dic 2019, 00:29:16, CET...fine: mer 18 dic 2019, 00:30:22, CET --> ci ho messo 64 secondi!
> Inizio: mer 18 dic 2019, 00:30:22, CET...fine: mer 18 dic 2019, 00:34:50, CET --> ci ho messo 266 secondi!
> Inizio: mer 18 dic 2019, 00:34:50, CET...fine: mer 18 dic 2019, 00:35:56, CET --> ci ho messo 65 secondi!
> Inizio: mer 18 dic 2019, 00:35:56, CET...fine: mer 18 dic 2019, 00:37:01, CET --> ci ho messo 64 secondi!
> Inizio: mer 18 dic 2019, 00:37:01, CET...fine: mer 18 dic 2019, 00:43:39, CET --> ci ho messo 397 secondi!
> Inizio: mer 18 dic 2019, 00:43:39, CET...fine: mer 18 dic 2019, 00:48:31, CET --> ci ho messo 291 secondi!
> Inizio: mer 18 dic 2019, 00:48:31, CET...fine: mer 18 dic 2019, 00:49:37, CET --> ci ho messo 65 secondi!
> Inizio: mer 18 dic 2019, 00:49:37, CET...fine: mer 18 dic 2019, 00:50:42, CET --> ci ho messo 64 secondi!
> Inizio: mer 18 dic 2019, 00:50:42, CET...fine: mer 18 dic 2019, 00:55:39, CET --> ci ho messo 296 secondi!
> Inizio: mer 18 dic 2019, 00:55:39, CET...fine: mer 18 dic 2019, 00:56:44, CET --> ci ho messo 64 secondi!
> Inizio: mer 18 dic 2019, 00:56:44, CET...fine: mer 18 dic 2019, 00:57:50, CET --> ci ho messo 65 secondi!
> Inizio: mer 18 dic 2019, 00:57:50, CET...fine: mer 18 dic 2019, 00:58:54, CET --> ci ho messo 63 secondi!
> Inizio: mer 18 dic 2019, 00:58:54, CET...fine: mer 18 dic 2019, 01:00:01, CET --> ci ho messo 65 secondi!
> Inizio: mer 18 dic 2019, 01:00:01, CET...fine: mer 18 dic 2019, 01:01:05, CET --> ci ho messo 63 secondi!
> Inizio: mer 18 dic 2019, 01:01:05, CET...fine: mer 18 dic 2019, 01:02:11, CET --> ci ho messo 64 secondi!
> Inizio: mer 18 dic 2019, 01:02:11, CET...fine: mer 18 dic 2019, 01:03:16, CET --> ci ho messo 63 secondi!
> Inizio: mer 18 dic 2019, 01:03:16, CET...fine: mer 18 dic 2019, 01:04:22, CET --> ci ho messo 64 secondi!
> Inizio: mer 18 dic 2019, 01:04:22, CET...fine: mer 18 dic 2019, 01:05:27, CET --> ci ho messo 63 secondi!
> Inizio: mer 18 dic 2019, 01:05:27, CET...fine: mer 18 dic 2019, 01:11:38, CET --> ci ho messo 369 secondi!
> Inizio: mer 18 dic 2019, 01:11:38, CET...fine: mer 18 dic 2019, 01:12:43, CET --> ci ho messo 63 secondi!
> Inizio: mer 18 dic 2019, 01:12:43, CET...fine: mer 18 dic 2019, 01:18:20, CET --> ci ho messo 336 secondi!
> Inizio: mer 18 dic 2019, 01:18:20, CET...fine: mer 18 dic 2019, 01:19:25, CET --> ci ho messo 64 secondi!
> Inizio: mer 18 dic 2019, 01:19:25, CET...fine: mer 18 dic 2019, 01:21:01, CET --> ci ho messo 95 secondi!
> Inizio: mer 18 dic 2019, 01:21:01, CET...fine: mer 18 dic 2019, 01:22:06, CET --> ci ho messo 64 secondi!
> Inizio: mer 18 dic 2019, 01:22:06, CET...fine: mer 18 dic 2019, 01:23:12, CET --> ci ho messo 65 secondi!
> Inizio: mer 18 dic 2019, 01:23:12, CET...fine: mer 18 dic 2019, 01:29:43, CET --> ci ho messo 390 secondi!
> Inizio: mer 18 dic 2019, 01:29:43, CET...fine: mer 18 dic 2019, 01:30:49, CET --> ci ho messo 64 secondi!
> Inizio: mer 18 dic 2019, 01:30:49, CET...fine: mer 18 dic 2019, 01:35:36, CET --> ci ho messo 285 secondi!
> Inizio: mer 18 dic 2019, 01:35:36, CET...fine: mer 18 dic 2019, 01:36:44, CET --> ci ho messo 66 secondi!
> Inizio: mer 18 dic 2019, 01:36:44, CET...fine: mer 18 dic 2019, 01:37:48, CET --> ci ho messo 63 secondi!
> Inizio: mer 18 dic 2019, 01:37:48, CET...fine: mer 18 dic 2019, 01:38:55, CET --> ci ho messo 65 secondi!
> Inizio: mer 18 dic 2019, 01:38:55, CET...fine: mer 18 dic 2019, 01:44:04, CET --> ci ho messo 308 secondi!
> Inizio: mer 18 dic 2019, 01:44:04, CET...fine: mer 18 dic 2019, 01:45:10, CET --> ci ho messo 65 secondi!
> Inizio: mer 18 dic 2019, 01:45:10, CET...fine: mer 18 dic 2019, 01:49:42, CET --> ci ho messo 270 secondi!
> Inizio: mer 18 dic 2019, 01:49:42, CET...fine: mer 18 dic 2019, 01:50:48, CET --> ci ho messo 64 secondi!
> Inizio: mer 18 dic 2019, 01:50:48, CET...fine: mer 18 dic 2019, 01:51:53, CET --> ci ho messo 63 secondi!
> Inizio: mer 18 dic 2019, 01:51:53, CET...fine: mer 18 dic 2019, 01:58:18, CET --> ci ho messo 383 secondi!
> Inizio: mer 18 dic 2019, 01:58:18, CET...fine: mer 18 dic 2019, 01:59:23, CET --> ci ho messo 63 secondi!
> Inizio: mer 18 dic 2019, 01:59:23, CET...fine: mer 18 dic 2019, 02:00:29, CET --> ci ho messo 65 secondi!
> Inizio: mer 18 dic 2019, 02:00:29, CET...fine: mer 18 dic 2019, 02:01:34, CET --> ci ho messo 63 secondi!
> Inizio: mer 18 dic 2019, 02:01:34, CET...fine: mer 18 dic 2019, 02:02:40, CET --> ci ho messo 64 secondi!
> Inizio: mer 18 dic 2019, 02:02:40, CET...fine: mer 18 dic 2019, 02:03:45, CET --> ci ho messo 63 secondi!
> Inizio: mer 18 dic 2019, 02:03:45, CET...fine: mer 18 dic 2019, 02:04:51, CET --> ci ho messo 64 secondi!
> Inizio: mer 18 dic 2019, 02:04:51, CET...fine: mer 18 dic 2019, 02:05:56, CET --> ci ho messo 63 secondi!
> Inizio: mer 18 dic 2019, 02:05:56, CET...fine: mer 18 dic 2019, 02:07:02, CET --> ci ho messo 64 secondi!
> Inizio: mer 18 dic 2019, 02:07:02, CET...fine: mer 18 dic 2019, 02:08:07, CET --> ci ho messo 63 secondi!
> Inizio: mer 18 dic 2019, 02:08:07, CET...fine: mer 18 dic 2019, 02:09:14, CET --> ci ho messo 65 secondi!
> Inizio: mer 18 dic 2019, 02:09:14, CET...fine: mer 18 dic 2019, 02:13:44, CET --> ci ho messo 269 secondi!
> Inizio: mer 18 dic 2019, 02:13:44, CET...fine: mer 18 dic 2019, 02:14:51, CET --> ci ho messo 65 secondi!
> Inizio: mer 18 dic 2019, 02:14:51, CET...fine: mer 18 dic 2019, 02:15:56, CET --> ci ho messo 64 secondi!
> Inizio: mer 18 dic 2019, 02:15:56, CET...fine: mer 18 dic 2019, 02:17:02, CET --> ci ho messo 65 secondi!
> Inizio: mer 18 dic 2019, 02:17:02, CET...fine: mer 18 dic 2019, 02:18:07, CET --> ci ho messo 64 secondi!
> Inizio: mer 18 dic 2019, 02:18:07, CET...fine: mer 18 dic 2019, 02:19:13, CET --> ci ho messo 65 secondi!
> Inizio: mer 18 dic 2019, 02:19:13, CET...fine: mer 18 dic 2019, 02:20:18, CET --> ci ho messo 64 secondi!
> Inizio: mer 18 dic 2019, 02:20:18, CET...fine: mer 18 dic 2019, 02:21:24, CET --> ci ho messo 65 secondi!
> Inizio: mer 18 dic 2019, 02:21:24, CET...fine: mer 18 dic 2019, 02:22:29, CET --> ci ho messo 64 secondi!
> Inizio: mer 18 dic 2019, 02:22:29, CET...fine: mer 18 dic 2019, 02:23:35, CET --> ci ho messo 65 secondi!
> Inizio: mer 18 dic 2019, 02:23:35, CET...fine: mer 18 dic 2019, 02:24:40, CET --> ci ho messo 64 secondi!
> Inizio: mer 18 dic 2019, 02:24:40, CET...fine: mer 18 dic 2019, 02:25:46, CET --> ci ho messo 65 secondi!
> Inizio: mer 18 dic 2019, 02:25:46, CET...fine: mer 18 dic 2019, 02:26:51, CET --> ci ho messo 64 secondi!
> Inizio: mer 18 dic 2019, 02:26:51, CET...fine: mer 18 dic 2019, 02:27:57, CET --> ci ho messo 65 secondi!
> Inizio: mer 18 dic 2019, 02:27:57, CET...fine: mer 18 dic 2019, 02:29:02, CET --> ci ho messo 64 secondi!
> Inizio: mer 18 dic 2019, 02:29:02, CET...fine: mer 18 dic 2019, 02:30:08, CET --> ci ho messo 65 secondi!
> Inizio: mer 18 dic 2019, 02:30:08, CET...fine: mer 18 dic 2019, 02:31:13, CET --> ci ho messo 63 secondi!
> Inizio: mer 18 dic 2019, 02:31:13, CET...fine: mer 18 dic 2019, 02:32:19, CET --> ci ho messo 64 secondi!
> Inizio: mer 18 dic 2019, 02:32:19, CET...fine: mer 18 dic 2019, 02:40:06, CET --> ci ho messo 465 secondi!
> Inizio: mer 18 dic 2019, 02:40:06, CET...fine: mer 18 dic 2019, 02:41:12, CET --> ci ho messo 64 secondi!
> Inizio: mer 18 dic 2019, 02:41:12, CET...fine: mer 18 dic 2019, 02:42:17, CET --> ci ho messo 63 secondi!
> Inizio: mer 18 dic 2019, 02:42:17, CET...fine: mer 18 dic 2019, 02:43:23, CET --> ci ho messo 64 secondi!
> Inizio: mer 18 dic 2019, 02:43:23, CET...fine: mer 18 dic 2019, 02:44:28, CET --> ci ho messo 63 secondi!
> Inizio: mer 18 dic 2019, 02:44:28, CET...fine: mer 18 dic 2019, 02:45:34, CET --> ci ho messo 64 secondi!
> Inizio: mer 18 dic 2019, 02:45:34, CET...fine: mer 18 dic 2019, 02:46:39, CET --> ci ho messo 63 secondi!
> Inizio: mer 18 dic 2019, 02:46:39, CET...fine: mer 18 dic 2019, 02:47:45, CET --> ci ho messo 64 secondi!
> Inizio: mer 18 dic 2019, 02:47:45, CET...fine: mer 18 dic 2019, 02:48:50, CET --> ci ho messo 63 secondi!
> Inizio: mer 18 dic 2019, 02:48:50, CET...fine: mer 18 dic 2019, 02:54:26, CET --> ci ho messo 334 secondi!
> Inizio: mer 18 dic 2019, 02:54:26, CET...fine: mer 18 dic 2019, 02:55:31, CET --> ci ho messo 63 secondi!
> Inizio: mer 18 dic 2019, 02:55:31, CET...fine: mer 18 dic 2019, 02:57:11, CET --> ci ho messo 98 secondi!
> Inizio: mer 18 dic 2019, 02:57:11, CET...fine: mer 18 dic 2019, 02:58:16, CET --> ci ho messo 63 secondi!
> Inizio: mer 18 dic 2019, 02:58:16, CET...fine: mer 18 dic 2019, 02:59:22, CET --> ci ho messo 64 secondi!
> Inizio: mer 18 dic 2019, 02:59:22, CET...fine: mer 18 dic 2019, 03:00:27, CET --> ci ho messo 63 secondi!
> Inizio: mer 18 dic 2019, 03:00:27, CET...fine: mer 18 dic 2019, 03:05:55, CET --> ci ho messo 326 secondi!
> Inizio: mer 18 dic 2019, 03:05:55, CET...fine: mer 18 dic 2019, 03:11:49, CET --> ci ho messo 352 secondi!
> Inizio: mer 18 dic 2019, 03:11:49, CET...fine: mer 18 dic 2019, 03:12:56, CET --> ci ho messo 66 secondi!
> Inizio: mer 18 dic 2019, 03:12:56, CET...fine: mer 18 dic 2019, 03:14:01, CET --> ci ho messo 63 secondi!
> 



-- 
Ming


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: AW: Slow I/O on USB media after commit f664a3cc17b7d0a2bc3b3ab96181e1029b0ec0e6
       [not found]                                     ` <b1b6a0e9d690ecd9432025acd2db4ac09f834040.camel@unipv.it>
@ 2019-12-23 13:08                                       ` Ming Lei
  2019-12-23 16:26                                         ` Theodore Y. Ts'o
       [not found]                                         ` <fc6f73fc5f57ade8890b472d63c7f0bd559de538.camel@unipv.it>
  0 siblings, 2 replies; 29+ messages in thread
From: Ming Lei @ 2019-12-23 13:08 UTC (permalink / raw)
  To: Andrea Vai, Theodore Y. Ts'o
  Cc: Schmid, Carsten, Finn Thain, Damien Le Moal, Alan Stern,
	Jens Axboe, Johannes Thumshirn, USB list, SCSI development list,
	Himanshu Madhani, Hannes Reinecke, Omar Sandoval,
	Martin K. Petersen, Greg KH, Hans Holmberg,
	Kernel development list, linux-ext4, linux-fsdevel

On Mon, Dec 23, 2019 at 12:22:45PM +0100, Andrea Vai wrote:
> Il giorno mer, 18/12/2019 alle 17.48 +0800, Ming Lei ha scritto:
> > On Wed, Dec 18, 2019 at 09:25:02AM +0100, Andrea Vai wrote:
> > > Il giorno gio, 12/12/2019 alle 05.33 +0800, Ming Lei ha scritto:
> > > > On Wed, Dec 11, 2019 at 11:07:45AM -0500, Theodore Y. Ts'o
> > wrote:
> > > > > On Wed, Dec 11, 2019 at 12:00:58PM +0800, Ming Lei wrote:
> > > > > > I didn't reproduce the issue in my test environment, and
> > follows
> > > > > > Andrea's test commands[1]:
> > > > > > 
> > > > > >   mount UUID=$uuid /mnt/pendrive 2>&1 |tee -a $logfile
> > > > > >   SECONDS=0
> > > > > >   cp $testfile /mnt/pendrive 2>&1 |tee -a $logfile
> > > > > >   umount /mnt/pendrive 2>&1 |tee -a $logfile
> > > > > > 
> > > > > > The 'cp' command supposes to open/close the file just once,
> > > > however
> > > > > > ext4_release_file() & write pages is observed to run for
> > 4358
> > > > times
> > > > > > when executing the above 'cp' test.
> > > > > 
> > > > > Why are we sure the ext4_release_file() / _fput() is coming
> > from
> > > > the
> > > > > cp command, as opposed to something else that might be running
> > on
> > > > the
> > > > > system under test?  _fput() is called by the kernel when the
> > last
> > > > 
> > > > Please see the log:
> > > > 
> > > > 
> > https://lore.kernel.org/linux-scsi/3af3666920e7d46f8f0c6d88612f143ffabc743c.camel@unipv.it/2-log_ming.zip
> > > > 
> > > > Which is collected by:
> > > > 
> > > > #!/bin/sh
> > > > MAJ=$1
> > > > MIN=$2
> > > > MAJ=$(( $MAJ << 20 ))
> > > > DEV=$(( $MAJ | $MIN ))
> > > > 
> > > > /usr/share/bcc/tools/trace -t -C \
> > > >     't:block:block_rq_issue (args->dev == '$DEV') "%s %d %d",
> > args-
> > > > >rwbs, args->sector, args->nr_sector' \
> > > >     't:block:block_rq_insert (args->dev == '$DEV') "%s %d %d",
> > args-
> > > > >rwbs, args->sector, args->nr_sector'
> > > > 
> > > > $MAJ:$MIN points to the USB storage disk.
> > > > 
> > > > From the above IO trace, there are two write paths, one is from
> > cp,
> > > > another is from writeback wq.
> > > > 
> > > > The stackcount trace[1] is consistent with the IO trace log
> > since it
> > > > only shows two IO paths, that is why I concluded that the write
> > done
> > > > via
> > > > ext4_release_file() is from 'cp'.
> > > > 
> > > > [1] 
> > > > 
> > https://lore.kernel.org/linux-scsi/320b315b9c87543d4fb919ecbdf841596c8fbcea.camel@unipv.it/2-log_ming_20191129_150609.zip
> > > > 
> > > > > reference to a struct file is released.  (Specifically, if you
> > > > have a
> > > > > fd which is dup'ed, it's only when the last fd corresponding
> > to
> > > > the
> > > > > struct file is closed, and the struct file is about to be
> > > > released,
> > > > > does the file system's f_ops->release function get called.)
> > > > > 
> > > > > So the first question I'd ask is whether there is anything
> > else
> > > > going
> > > > > on the system, and whether the writes are happening to the USB
> > > > thumb
> > > > > drive, or to some other storage device.  And if there is
> > something
> > > > > else which is writing to the pendrive, maybe that's why no one
> > > > else
> > > > > has been able to reproduce the OP's complaint....
> > > > 
> > > > OK, we can ask Andrea to confirm that via the following trace,
> > which
> > > > will add pid/comm info in the stack trace:
> > > > 
> > > > /usr/share/bcc/tools/stackcount blk_mq_sched_request_inserted
> > > > 
> > > > Andrew, could you collect the above log again when running
> > new/bad
> > > > kernel for confirming if the write done by ext4_release_file()
> > is
> > > > from
> > > > the 'cp' process?
> > > 
> > > You can find the stackcount log attached. It has been produced by:
> > > 
> > > - /usr/share/bcc/tools/stackcount blk_mq_sched_request_inserted >
> > trace.log
> > > - wait some seconds
> > > - run the test (1 copy trial), wait for the test to finish, wait
> > some seconds
> > > - stop the trace (ctrl+C)
> > 
> > Thanks for collecting the log, looks your 'stackcount' doesn't
> > include
> > comm/pid info, seems there is difference between your bcc and
> > my bcc in fedora 30.
> > 
> > Could you collect above log again via the following command?
> > 
> > /usr/share/bcc/tools/stackcount -P -K t:block:block_rq_insert
> > 
> > which will show the comm/pid info.
> 
> ok, attached (trace_20191219.txt), the test (1 trial) took 3684
> seconds.

From the above trace:

  b'blk_mq_sched_request_inserted'
  b'blk_mq_sched_request_inserted'
  b'dd_insert_requests'
  b'blk_mq_sched_insert_requests'
  b'blk_mq_flush_plug_list'
  b'blk_flush_plug_list'
  b'io_schedule_prepare'
  b'io_schedule'
  b'rq_qos_wait'
  b'wbt_wait'
  b'__rq_qos_throttle'
  b'blk_mq_make_request'
  b'generic_make_request'
  b'submit_bio'
  b'ext4_io_submit'
  b'ext4_writepages'
  b'do_writepages'
  b'__filemap_fdatawrite_range'
  b'ext4_release_file'
  b'__fput'
  b'task_work_run'
  b'exit_to_usermode_loop'
  b'do_syscall_64'
  b'entry_SYSCALL_64_after_hwframe'
    b'cp' [19863]
    4400

So this write is clearly from 'cp' process, and it should be one
ext4 fs issue.

Ted, can you take a look at this issue?

> 
> > > I also tried the usual test with btrfs and xfs. Btrfs behavior
> > looks
> > > "good". xfs seems sometimes better, sometimes worse, I would say.
> > I
> > > don't know if it matters, anyway you can also find the results of
> > the
> > > two tests (100 trials each). Basically, btrfs is always between 68
> > and
> > > 89 seconds, with a cyclicity (?) with "period=2 trials". xfs looks
> > > almost always very good (63-65s), but sometimes "bad" (>300s).
> > 
> > If you are interested in digging into this one, the following trace
> > should be helpful:
> > 
> > https://lore.kernel.org/linux-scsi/f38db337cf26390f7c7488a0bc2076633737775b.camel@unipv.it/T/#m5aa008626e07913172ad40e1eb8e5f2ffd560fc6
> > 
> 
> Attached:
> - trace_xfs_20191223.txt (7 trials, then aborted while doing the 8th),
> times to complete:
> 64s
> 63s
> 64s
> 833s
> 1105s
> 63s
> 64s

oops, looks we have to collect io insert trace with the following bcc script
on xfs for confirming if there is similar issue with ext4, could you run
it again on xfs? And only post the trace done in case of slow 'cp'.


#!/bin/sh

MAJ=$1
MIN=$2
MAJ=$(( $MAJ << 20 ))
DEV=$(( $MAJ | $MIN ))

/usr/share/bcc/tools/trace -t -C \
    't:block:block_rq_issue (args->dev == '$DEV') "%s %d %d", args->rwbs, args->sector, args->nr_sector' \
    't:block:block_rq_insert (args->dev == '$DEV') "%s %d %d", args->rwbs, args->sector, args->nr_sector'


Thanks,
Ming


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: AW: Slow I/O on USB media after commit f664a3cc17b7d0a2bc3b3ab96181e1029b0ec0e6
  2019-12-23 13:08                                       ` Ming Lei
@ 2019-12-23 16:26                                         ` Theodore Y. Ts'o
  2019-12-23 16:29                                           ` Andrea Vai
       [not found]                                         ` <fc6f73fc5f57ade8890b472d63c7f0bd559de538.camel@unipv.it>
  1 sibling, 1 reply; 29+ messages in thread
From: Theodore Y. Ts'o @ 2019-12-23 16:26 UTC (permalink / raw)
  To: Ming Lei
  Cc: Andrea Vai, Schmid, Carsten, Finn Thain, Damien Le Moal,
	Alan Stern, Jens Axboe, Johannes Thumshirn, USB list,
	SCSI development list, Himanshu Madhani, Hannes Reinecke,
	Omar Sandoval, Martin K. Petersen, Greg KH, Hans Holmberg,
	Kernel development list, linux-ext4, linux-fsdevel

On Mon, Dec 23, 2019 at 09:08:28PM +0800, Ming Lei wrote:
> 
> From the above trace:
> 
>   b'blk_mq_sched_request_inserted'
>   b'blk_mq_sched_request_inserted'
>   b'dd_insert_requests'
>   b'blk_mq_sched_insert_requests'
>   b'blk_mq_flush_plug_list'
>   b'blk_flush_plug_list'
>   b'io_schedule_prepare'
>   b'io_schedule'
>   b'rq_qos_wait'
>   b'wbt_wait'
>   b'__rq_qos_throttle'
>   b'blk_mq_make_request'
>   b'generic_make_request'
>   b'submit_bio'
>   b'ext4_io_submit'
>   b'ext4_writepages'
>   b'do_writepages'
>   b'__filemap_fdatawrite_range'
>   b'ext4_release_file'
>   b'__fput'
>   b'task_work_run'
>   b'exit_to_usermode_loop'
>   b'do_syscall_64'
>   b'entry_SYSCALL_64_after_hwframe'
>     b'cp' [19863]
>     4400
> 
> So this write is clearly from 'cp' process, and it should be one
> ext4 fs issue.

We need a system call trace of the cp process, to understand what
system call is resulting in fput, (eg., I assume it's close(2) but
let's be sure), and often it's calling that system call.

What cp process is it?  Is it from shellutils?  Is it from busybox?

     		   	      	   		- Ted

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: AW: Slow I/O on USB media after commit f664a3cc17b7d0a2bc3b3ab96181e1029b0ec0e6
  2019-12-23 16:26                                         ` Theodore Y. Ts'o
@ 2019-12-23 16:29                                           ` Andrea Vai
  2019-12-23 17:22                                             ` Theodore Y. Ts'o
  0 siblings, 1 reply; 29+ messages in thread
From: Andrea Vai @ 2019-12-23 16:29 UTC (permalink / raw)
  To: Theodore Y. Ts'o, Ming Lei
  Cc: Schmid, Carsten, Finn Thain, Damien Le Moal, Alan Stern,
	Jens Axboe, Johannes Thumshirn, USB list, SCSI development list,
	Himanshu Madhani, Hannes Reinecke, Omar Sandoval,
	Martin K. Petersen, Greg KH, Hans Holmberg,
	Kernel development list, linux-ext4, linux-fsdevel

Il giorno lun, 23/12/2019 alle 11.26 -0500, Theodore Y. Ts'o ha
scritto:
> On Mon, Dec 23, 2019 at 09:08:28PM +0800, Ming Lei wrote:
> > 
> > From the above trace:
> > 
> >   b'blk_mq_sched_request_inserted'
> >   b'blk_mq_sched_request_inserted'
> >   b'dd_insert_requests'
> >   b'blk_mq_sched_insert_requests'
> >   b'blk_mq_flush_plug_list'
> >   b'blk_flush_plug_list'
> >   b'io_schedule_prepare'
> >   b'io_schedule'
> >   b'rq_qos_wait'
> >   b'wbt_wait'
> >   b'__rq_qos_throttle'
> >   b'blk_mq_make_request'
> >   b'generic_make_request'
> >   b'submit_bio'
> >   b'ext4_io_submit'
> >   b'ext4_writepages'
> >   b'do_writepages'
> >   b'__filemap_fdatawrite_range'
> >   b'ext4_release_file'
> >   b'__fput'
> >   b'task_work_run'
> >   b'exit_to_usermode_loop'
> >   b'do_syscall_64'
> >   b'entry_SYSCALL_64_after_hwframe'
> >     b'cp' [19863]
> >     4400
> > 
> > So this write is clearly from 'cp' process, and it should be one
> > ext4 fs issue.
> 
> We need a system call trace of the cp process, to understand what
> system call is resulting in fput, (eg., I assume it's close(2) but
> let's be sure), and often it's calling that system call.
> 
> What cp process is it?  Is it from shellutils?  Is it from busybox?
> 
>      		   	      	   		- Ted

I run the cp command from a bash script, or from a bash shell. I don't
know if this answer your question, otherwise feel free to tell me a
way to find the answer to give you.

Thanks,
Andrea


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: AW: Slow I/O on USB media after commit f664a3cc17b7d0a2bc3b3ab96181e1029b0ec0e6
  2019-12-23 16:29                                           ` Andrea Vai
@ 2019-12-23 17:22                                             ` Theodore Y. Ts'o
  2019-12-23 18:45                                               ` Andrea Vai
  0 siblings, 1 reply; 29+ messages in thread
From: Theodore Y. Ts'o @ 2019-12-23 17:22 UTC (permalink / raw)
  To: Andrea Vai
  Cc: Ming Lei, Schmid, Carsten, Finn Thain, Damien Le Moal,
	Alan Stern, Jens Axboe, Johannes Thumshirn, USB list,
	SCSI development list, Himanshu Madhani, Hannes Reinecke,
	Omar Sandoval, Martin K. Petersen, Greg KH, Hans Holmberg,
	Kernel development list, linux-ext4, linux-fsdevel

On Mon, Dec 23, 2019 at 05:29:27PM +0100, Andrea Vai wrote:
> I run the cp command from a bash script, or from a bash shell. I don't
> know if this answer your question, otherwise feel free to tell me a
> way to find the answer to give you.

What distro are you using, and/or what package is the cp command
coming from, and what is the package name and version?

Also, can you remind me what the bash script is and how many files you are copying?

Can you change the script so that the cp command is prefixed by:

"strace -tTf -o /tmp/st "

e.g.,

	strace -tTf -o /tmp/st cp <args>

And then send me the /tmp/st file.  This will significantly change the
time, so don't do this for measuring performance.  I just want to see
what the /bin/cp command is *doing*.

      	      	     	     	      - Ted

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: AW: Slow I/O on USB media after commit f664a3cc17b7d0a2bc3b3ab96181e1029b0ec0e6
  2019-12-23 17:22                                             ` Theodore Y. Ts'o
@ 2019-12-23 18:45                                               ` Andrea Vai
  2019-12-23 19:53                                                 ` Theodore Y. Ts'o
  0 siblings, 1 reply; 29+ messages in thread
From: Andrea Vai @ 2019-12-23 18:45 UTC (permalink / raw)
  To: Theodore Y. Ts'o
  Cc: Ming Lei, Schmid, Carsten, Finn Thain, Damien Le Moal,
	Alan Stern, Jens Axboe, Johannes Thumshirn, USB list,
	SCSI development list, Himanshu Madhani, Hannes Reinecke,
	Omar Sandoval, Martin K. Petersen, Greg KH, Hans Holmberg,
	Kernel development list, linux-ext4, linux-fsdevel

[-- Attachment #1: Type: text/plain, Size: 1537 bytes --]

Il giorno lun, 23/12/2019 alle 12.22 -0500, Theodore Y. Ts'o ha
scritto:
> On Mon, Dec 23, 2019 at 05:29:27PM +0100, Andrea Vai wrote:
> > I run the cp command from a bash script, or from a bash shell. I
> don't
> > know if this answer your question, otherwise feel free to tell me
> a
> > way to find the answer to give you.
> 
> What distro are you using, and/or what package is the cp command
> coming from, and what is the package name and version?

Fedora 30

$ rpm -qf `which cp`
coreutils-8.31-6.fc30.x86_64

> 
> Also, can you remind me what the bash script is and how many files
> you are copying?

basically, it's:

  mount UUID=$uuid /mnt/pendrive
  SECONDS=0
  cp $testfile /mnt/pendrive
  umount /mnt/pendrive
  tempo=$SECONDS

and it copies one file only. Anyway, you can find the whole script
attached.


> 
> Can you change the script so that the cp command is prefixed by:
> 
> "strace -tTf -o /tmp/st "
> 
> e.g.,
> 
> 	strace -tTf -o /tmp/st cp <args>
> 
> And then send me
btw, please tell me if "me" means only you or I cc: all the
recipients, as usual

>  the /tmp/st file.  This will significantly change the
> time, so don't do this for measuring performance.  I just want to
> see
> what the /bin/cp command is *doing*.

I will do it, but I have a doubt. Since the problem doesn't happen
every time, is it useful to give you a trace of a "fast" run? And, if
it's not, I think I should measure performance with the trace command
prefix, to identify a "slow" run to report you. Does it make sense?

Thanks,
Andrea

[-- Attachment #2: test --]
[-- Type: application/x-shellscript, Size: 1403 bytes --]

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: AW: Slow I/O on USB media after commit f664a3cc17b7d0a2bc3b3ab96181e1029b0ec0e6
  2019-12-23 18:45                                               ` Andrea Vai
@ 2019-12-23 19:53                                                 ` Theodore Y. Ts'o
  2019-12-24  1:27                                                   ` Ming Lei
  0 siblings, 1 reply; 29+ messages in thread
From: Theodore Y. Ts'o @ 2019-12-23 19:53 UTC (permalink / raw)
  To: Andrea Vai
  Cc: Ming Lei, Schmid, Carsten, Finn Thain, Damien Le Moal,
	Alan Stern, Jens Axboe, Johannes Thumshirn, USB list,
	SCSI development list, Himanshu Madhani, Hannes Reinecke,
	Omar Sandoval, Martin K. Petersen, Greg KH, Hans Holmberg,
	Kernel development list, linux-ext4, linux-fsdevel

On Mon, Dec 23, 2019 at 07:45:57PM +0100, Andrea Vai wrote:
> basically, it's:
> 
>   mount UUID=$uuid /mnt/pendrive
>   SECONDS=0
>   cp $testfile /mnt/pendrive
>   umount /mnt/pendrive
>   tempo=$SECONDS
> 
> and it copies one file only. Anyway, you can find the whole script
> attached.

OK, so whether we are doing the writeback at the end of cp, or when
you do the umount, it's probably not going to make any difference.  We
can get rid of the stack trace in question by changing the script to
be basically:

mount UUID=$uuid /mnt/pendrive
SECONDS=0
rm -f /mnt/pendrive/$testfile
cp $testfile /mnt/pendrive
umount /mnt/pendrive
tempo=$SECONDS

I predict if you do that, you'll see that all of the time is spent in
the umount, when we are trying to write back the file.

I really don't think then this is a file system problem at all.  It's
just that USB I/O is slow, for whatever reason.  We'll see a stack
trace in the writeback code waiting for the I/O to be completed, but
that doesn't mean that the root cause is in the writeback code or in
the file system which is triggering the writeback.

I suspect the next step is use a blktrace, to see what kind of I/O is
being sent to the USB drive, and how long it takes for the I/O to
complete.  You might also try to capture the output of "iostat -x 1"
while the script is running, and see what the difference might be
between a kernel version that has the problem and one that doesn't,
and see if that gives us a clue.

> > And then send me
> btw, please tell me if "me" means only you or I cc: all the
> recipients, as usual

Well, I don't think we know what the root cause is.  Ming is focusing
on that stack trace, but I think it's a red herring.....  And if it's
not a file system problem, then other people will be best suited to
debug the issue.

   	      	     	   	      	    - Ted

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: AW: Slow I/O on USB media after commit f664a3cc17b7d0a2bc3b3ab96181e1029b0ec0e6
  2019-12-23 19:53                                                 ` Theodore Y. Ts'o
@ 2019-12-24  1:27                                                   ` Ming Lei
  2019-12-24  6:49                                                     ` Andrea Vai
                                                                       ` (2 more replies)
  0 siblings, 3 replies; 29+ messages in thread
From: Ming Lei @ 2019-12-24  1:27 UTC (permalink / raw)
  To: Theodore Y. Ts'o
  Cc: Andrea Vai, Schmid, Carsten, Finn Thain, Damien Le Moal,
	Alan Stern, Jens Axboe, Johannes Thumshirn, USB list,
	SCSI development list, Himanshu Madhani, Hannes Reinecke,
	Omar Sandoval, Martin K. Petersen, Greg KH, Hans Holmberg,
	Kernel development list, linux-ext4, linux-fsdevel

Hi Ted,

On Mon, Dec 23, 2019 at 02:53:01PM -0500, Theodore Y. Ts'o wrote:
> On Mon, Dec 23, 2019 at 07:45:57PM +0100, Andrea Vai wrote:
> > basically, it's:
> > 
> >   mount UUID=$uuid /mnt/pendrive
> >   SECONDS=0
> >   cp $testfile /mnt/pendrive
> >   umount /mnt/pendrive
> >   tempo=$SECONDS
> > 
> > and it copies one file only. Anyway, you can find the whole script
> > attached.
> 
> OK, so whether we are doing the writeback at the end of cp, or when
> you do the umount, it's probably not going to make any difference.  We
> can get rid of the stack trace in question by changing the script to
> be basically:
> 
> mount UUID=$uuid /mnt/pendrive
> SECONDS=0
> rm -f /mnt/pendrive/$testfile
> cp $testfile /mnt/pendrive
> umount /mnt/pendrive
> tempo=$SECONDS
> 
> I predict if you do that, you'll see that all of the time is spent in
> the umount, when we are trying to write back the file.
> 
> I really don't think then this is a file system problem at all.  It's
> just that USB I/O is slow, for whatever reason.  We'll see a stack
> trace in the writeback code waiting for the I/O to be completed, but
> that doesn't mean that the root cause is in the writeback code or in
> the file system which is triggering the writeback.

Wrt. the slow write on this usb storage, it is caused by two writeback
path, one is the writeback wq, another is from ext4_release_file() which
is triggered from exit_to_usermode_loop().

When the two write path is run concurrently, the sequential write order
is broken, then write performance drops much on this particular usb
storage.

The ext4_release_file() should be run from read() or write() syscall if
Fedora 30's 'cp' is implemented correctly. IMO, it isn't expected behavior
for ext4_release_file() to be run thousands of times when just
running 'cp' once, see comment of ext4_release_file():

	/*
	 * Called when an inode is released. Note that this is different
	 * from ext4_file_open: open gets called at every open, but release
	 * gets called only when /all/ the files are closed.
	 */
	static int ext4_release_file(struct inode *inode, struct file *filp)

> 
> I suspect the next step is use a blktrace, to see what kind of I/O is
> being sent to the USB drive, and how long it takes for the I/O to
> complete.  You might also try to capture the output of "iostat -x 1"
> while the script is running, and see what the difference might be
> between a kernel version that has the problem and one that doesn't,
> and see if that gives us a clue.

That isn't necessary, given we have concluded that the bad write
performance is caused by broken write order.

> 
> > > And then send me
> > btw, please tell me if "me" means only you or I cc: all the
> > recipients, as usual
> 
> Well, I don't think we know what the root cause is.  Ming is focusing
> on that stack trace, but I think it's a red herring.....  And if it's
> not a file system problem, then other people will be best suited to
> debug the issue.

So far, the reason points to the extra writeback path from exit_to_usermode_loop().
If it is not from close() syscall, the issue should be related with file reference
count. If it is from close() syscall, the issue might be in 'cp''s
implementation.

Andrea, please collect the following log or the strace log requested by Ted, then
we can confirm if the extra writeback is from close() or read/write() syscall:

# pass PID of 'cp' to this script
#!/bin/sh
PID=$1
/usr/share/bcc/tools/trace -P $PID  -t -C \
    't:block:block_rq_insert "%s %d %d", args->rwbs, args->sector, args->nr_sector' \
    't:syscalls:sys_exit_close ' \
    't:syscalls:sys_exit_read ' \
    't:syscalls:sys_exit_write '


Thanks,
Ming


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: AW: Slow I/O on USB media after commit f664a3cc17b7d0a2bc3b3ab96181e1029b0ec0e6
       [not found]                                         ` <fc6f73fc5f57ade8890b472d63c7f0bd559de538.camel@unipv.it>
@ 2019-12-24  1:32                                           ` Ming Lei
  2019-12-24  8:04                                             ` Andrea Vai
  0 siblings, 1 reply; 29+ messages in thread
From: Ming Lei @ 2019-12-24  1:32 UTC (permalink / raw)
  To: Andrea Vai
  Cc: Theodore Y. Ts'o, Schmid, Carsten, Finn Thain,
	Damien Le Moal, Alan Stern, Jens Axboe, Johannes Thumshirn,
	USB list, SCSI development list, Himanshu Madhani,
	Hannes Reinecke, Omar Sandoval, Martin K. Petersen, Greg KH,
	Hans Holmberg, Kernel development list, linux-ext4,
	linux-fsdevel

On Mon, Dec 23, 2019 at 03:02:35PM +0100, Andrea Vai wrote:
> Il giorno lun, 23/12/2019 alle 21.08 +0800, Ming Lei ha scritto:
> > On Mon, Dec 23, 2019 at 12:22:45PM +0100, Andrea Vai wrote:
> > > Il giorno mer, 18/12/2019 alle 17.48 +0800, Ming Lei ha scritto:
> > > > On Wed, Dec 18, 2019 at 09:25:02AM +0100, Andrea Vai wrote:
> > > > > Il giorno gio, 12/12/2019 alle 05.33 +0800, Ming Lei ha
> > scritto:
> > > > > > On Wed, Dec 11, 2019 at 11:07:45AM -0500, Theodore Y. Ts'o
> > > > wrote:
> > > > > > > On Wed, Dec 11, 2019 at 12:00:58PM +0800, Ming Lei wrote:
> > > > > > > > I didn't reproduce the issue in my test environment, and
> > > > follows
> > > > > > > > Andrea's test commands[1]:
> > > > > > > > 
> > > > > > > >   mount UUID=$uuid /mnt/pendrive 2>&1 |tee -a $logfile
> > > > > > > >   SECONDS=0
> > > > > > > >   cp $testfile /mnt/pendrive 2>&1 |tee -a $logfile
> > > > > > > >   umount /mnt/pendrive 2>&1 |tee -a $logfile
> > > > > > > > 
> > > > > > > > The 'cp' command supposes to open/close the file just
> > once,
> > > > > > however
> > > > > > > > ext4_release_file() & write pages is observed to run for
> > > > 4358
> > > > > > times
> > > > > > > > when executing the above 'cp' test.
> > > > > > > 
> > > > > > > Why are we sure the ext4_release_file() / _fput() is
> > coming
> > > > from
> > > > > > the
> > > > > > > cp command, as opposed to something else that might be
> > running
> > > > on
> > > > > > the
> > > > > > > system under test?  _fput() is called by the kernel when
> > the
> > > > last
> > > > > > 
> > > > > > Please see the log:
> > > > > > 
> > > > > > 
> > > > 
> > https://lore.kernel.org/linux-scsi/3af3666920e7d46f8f0c6d88612f143ffabc743c.camel@unipv.it/2-log_ming.zip
> > > > > > 
> > > > > > Which is collected by:
> > > > > > 
> > > > > > #!/bin/sh
> > > > > > MAJ=$1
> > > > > > MIN=$2
> > > > > > MAJ=$(( $MAJ << 20 ))
> > > > > > DEV=$(( $MAJ | $MIN ))
> > > > > > 
> > > > > > /usr/share/bcc/tools/trace -t -C \
> > > > > >     't:block:block_rq_issue (args->dev == '$DEV') "%s %d
> > %d",
> > > > args-
> > > > > > >rwbs, args->sector, args->nr_sector' \
> > > > > >     't:block:block_rq_insert (args->dev == '$DEV') "%s %d
> > %d",
> > > > args-
> > > > > > >rwbs, args->sector, args->nr_sector'
> > > > > > 
> > > > > > $MAJ:$MIN points to the USB storage disk.
> > > > > > 
> > > > > > From the above IO trace, there are two write paths, one is
> > from
> > > > cp,
> > > > > > another is from writeback wq.
> > > > > > 
> > > > > > The stackcount trace[1] is consistent with the IO trace log
> > > > since it
> > > > > > only shows two IO paths, that is why I concluded that the
> > write
> > > > done
> > > > > > via
> > > > > > ext4_release_file() is from 'cp'.
> > > > > > 
> > > > > > [1] 
> > > > > > 
> > > > 
> > https://lore.kernel.org/linux-scsi/320b315b9c87543d4fb919ecbdf841596c8fbcea.camel@unipv.it/2-log_ming_20191129_150609.zip
> > > > > > 
> > > > > > > reference to a struct file is released.  (Specifically, if
> > you
> > > > > > have a
> > > > > > > fd which is dup'ed, it's only when the last fd
> > corresponding
> > > > to
> > > > > > the
> > > > > > > struct file is closed, and the struct file is about to be
> > > > > > released,
> > > > > > > does the file system's f_ops->release function get
> > called.)
> > > > > > > 
> > > > > > > So the first question I'd ask is whether there is anything
> > > > else
> > > > > > going
> > > > > > > on the system, and whether the writes are happening to the
> > USB
> > > > > > thumb
> > > > > > > drive, or to some other storage device.  And if there is
> > > > something
> > > > > > > else which is writing to the pendrive, maybe that's why no
> > one
> > > > > > else
> > > > > > > has been able to reproduce the OP's complaint....
> > > > > > 
> > > > > > OK, we can ask Andrea to confirm that via the following
> > trace,
> > > > which
> > > > > > will add pid/comm info in the stack trace:
> > > > > > 
> > > > > > /usr/share/bcc/tools/stackcount
> > blk_mq_sched_request_inserted
> > > > > > 
> > > > > > Andrew, could you collect the above log again when running
> > > > new/bad
> > > > > > kernel for confirming if the write done by
> > ext4_release_file()
> > > > is
> > > > > > from
> > > > > > the 'cp' process?
> > > > > 
> > > > > You can find the stackcount log attached. It has been produced
> > by:
> > > > > 
> > > > > - /usr/share/bcc/tools/stackcount
> > blk_mq_sched_request_inserted >
> > > > trace.log
> > > > > - wait some seconds
> > > > > - run the test (1 copy trial), wait for the test to finish,
> > wait
> > > > some seconds
> > > > > - stop the trace (ctrl+C)
> > > > 
> > > > Thanks for collecting the log, looks your 'stackcount' doesn't
> > > > include
> > > > comm/pid info, seems there is difference between your bcc and
> > > > my bcc in fedora 30.
> > > > 
> > > > Could you collect above log again via the following command?
> > > > 
> > > > /usr/share/bcc/tools/stackcount -P -K t:block:block_rq_insert
> > > > 
> > > > which will show the comm/pid info.
> > > 
> > > ok, attached (trace_20191219.txt), the test (1 trial) took 3684
> > > seconds.
> > 
> > From the above trace:
> > 
> >   b'blk_mq_sched_request_inserted'
> >   b'blk_mq_sched_request_inserted'
> >   b'dd_insert_requests'
> >   b'blk_mq_sched_insert_requests'
> >   b'blk_mq_flush_plug_list'
> >   b'blk_flush_plug_list'
> >   b'io_schedule_prepare'
> >   b'io_schedule'
> >   b'rq_qos_wait'
> >   b'wbt_wait'
> >   b'__rq_qos_throttle'
> >   b'blk_mq_make_request'
> >   b'generic_make_request'
> >   b'submit_bio'
> >   b'ext4_io_submit'
> >   b'ext4_writepages'
> >   b'do_writepages'
> >   b'__filemap_fdatawrite_range'
> >   b'ext4_release_file'
> >   b'__fput'
> >   b'task_work_run'
> >   b'exit_to_usermode_loop'
> >   b'do_syscall_64'
> >   b'entry_SYSCALL_64_after_hwframe'
> >     b'cp' [19863]
> >     4400
> > 
> > So this write is clearly from 'cp' process, and it should be one
> > ext4 fs issue.
> > 
> > Ted, can you take a look at this issue?
> > 
> > > 
> > > > > I also tried the usual test with btrfs and xfs. Btrfs behavior
> > > > looks
> > > > > "good". xfs seems sometimes better, sometimes worse, I would
> > say.
> > > > I
> > > > > don't know if it matters, anyway you can also find the results
> > of
> > > > the
> > > > > two tests (100 trials each). Basically, btrfs is always
> > between 68
> > > > and
> > > > > 89 seconds, with a cyclicity (?) with "period=2 trials". xfs
> > looks
> > > > > almost always very good (63-65s), but sometimes "bad" (>300s).
> > > > 
> > > > If you are interested in digging into this one, the following
> > trace
> > > > should be helpful:
> > > > 
> > > > 
> > https://lore.kernel.org/linux-scsi/f38db337cf26390f7c7488a0bc2076633737775b.camel@unipv.it/T/#m5aa008626e07913172ad40e1eb8e5f2ffd560fc6
> > > > 
> > > 
> > > Attached:
> > > - trace_xfs_20191223.txt (7 trials, then aborted while doing the
> > 8th),
> > > times to complete:
> > > 64s
> > > 63s
> > > 64s
> > > 833s
> > > 1105s
> > > 63s
> > > 64s
> > 
> > oops, looks we have to collect io insert trace with the following
> > bcc script
> > on xfs for confirming if there is similar issue with ext4, could you
> > run
> > it again on xfs? And only post the trace done in case of slow 'cp'.
> > 
> > 
> > #!/bin/sh
> > 
> > MAJ=$1
> > MIN=$2
> > MAJ=$(( $MAJ << 20 ))
> > DEV=$(( $MAJ | $MIN ))
> > 
> > /usr/share/bcc/tools/trace -t -C \
> >     't:block:block_rq_issue (args->dev == '$DEV') "%s %d %d", args-
> > >rwbs, args->sector, args->nr_sector' \
> >     't:block:block_rq_insert (args->dev == '$DEV') "%s %d %d", args-
> > >rwbs, args->sector, args->nr_sector'
> > 
> > 
> here it is (1 trial, 313 seconds to finish)

The above log shows similar issue with ext4 since there is another
writeback IO path from 'cp' process. And the following trace can show if
it is same with ext4's issue:

/usr/share/bcc/tools/stackcount -P -K t:block:block_rq_insert


Thanks,
Ming


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: AW: Slow I/O on USB media after commit f664a3cc17b7d0a2bc3b3ab96181e1029b0ec0e6
  2019-12-24  1:27                                                   ` Ming Lei
@ 2019-12-24  6:49                                                     ` Andrea Vai
  2019-12-24  8:51                                                     ` Andrea Vai
  2019-12-25  5:17                                                     ` Theodore Y. Ts'o
  2 siblings, 0 replies; 29+ messages in thread
From: Andrea Vai @ 2019-12-24  6:49 UTC (permalink / raw)
  To: Ming Lei, Theodore Y. Ts'o
  Cc: Schmid, Carsten, Finn Thain, Damien Le Moal, Alan Stern,
	Jens Axboe, Johannes Thumshirn, USB list, SCSI development list,
	Himanshu Madhani, Hannes Reinecke, Omar Sandoval,
	Martin K. Petersen, Greg KH, Hans Holmberg,
	Kernel development list, linux-ext4, linux-fsdevel

Il giorno mar, 24/12/2019 alle 09.27 +0800, Ming Lei ha scritto:
> Hi Ted,
> 
> On Mon, Dec 23, 2019 at 02:53:01PM -0500, Theodore Y. Ts'o wrote:
> > On Mon, Dec 23, 2019 at 07:45:57PM +0100, Andrea Vai wrote:
> > > basically, it's:
> > > 
> > >   mount UUID=$uuid /mnt/pendrive
> > >   SECONDS=0
> > >   cp $testfile /mnt/pendrive
> > >   umount /mnt/pendrive
> > >   tempo=$SECONDS
> > > 
> > > and it copies one file only. Anyway, you can find the whole
> script
> > > attached.
> > 
> > OK, so whether we are doing the writeback at the end of cp, or
> when
> > you do the umount, it's probably not going to make any
> difference.  We
> > can get rid of the stack trace in question by changing the script
> to
> > be basically:
> > 
> > mount UUID=$uuid /mnt/pendrive
> > SECONDS=0
> > rm -f /mnt/pendrive/$testfile
> > cp $testfile /mnt/pendrive
> > umount /mnt/pendrive
> > tempo=$SECONDS
> > 
> > I predict if you do that, you'll see that all of the time is spent
> in
> > the umount, when we are trying to write back the file.
> > 
> > I really don't think then this is a file system problem at
> all.  It's
> > just that USB I/O is slow, for whatever reason.  We'll see a stack
> > trace in the writeback code waiting for the I/O to be completed,
> but
> > that doesn't mean that the root cause is in the writeback code or
> in
> > the file system which is triggering the writeback.
> 
> Wrt. the slow write on this usb storage, it is caused by two
> writeback
> path, one is the writeback wq, another is from ext4_release_file()
> which
> is triggered from exit_to_usermode_loop().
> 
> When the two write path is run concurrently, the sequential write
> order
> is broken, then write performance drops much on this particular usb
> storage.
> 
> The ext4_release_file() should be run from read() or write() syscall
> if
> Fedora 30's 'cp' is implemented correctly. IMO, it isn't expected
> behavior
> for ext4_release_file() to be run thousands of times when just
> running 'cp' once, see comment of ext4_release_file():
> 
> 	/*
> 	 * Called when an inode is released. Note that this is
> different
> 	 * from ext4_file_open: open gets called at every open, but
> release
> 	 * gets called only when /all/ the files are closed.
> 	 */
> 	static int ext4_release_file(struct inode *inode, struct file
> *filp)
> 
> > 
> > I suspect the next step is use a blktrace, to see what kind of I/O
> is
> > being sent to the USB drive, and how long it takes for the I/O to
> > complete.  You might also try to capture the output of "iostat -x
> 1"
> > while the script is running, and see what the difference might be
> > between a kernel version that has the problem and one that
> doesn't,
> > and see if that gives us a clue.
> 
> That isn't necessary, given we have concluded that the bad write
> performance is caused by broken write order.
> 
> > 
> > > > And then send me
> > > btw, please tell me if "me" means only you or I cc: all the
> > > recipients, as usual
> > 
> > Well, I don't think we know what the root cause is.  Ming is
> focusing
> > on that stack trace, but I think it's a red herring.....  And if
> it's
> > not a file system problem, then other people will be best suited
> to
> > debug the issue.
> 
> So far, the reason points to the extra writeback path from
> exit_to_usermode_loop().
> If it is not from close() syscall, the issue should be related with
> file reference
> count. If it is from close() syscall, the issue might be in 'cp''s
> implementation.
> 
> Andrea, please collect the following log or the strace log requested
> by Ted, then
> we can confirm if the extra writeback is from close() or
> read/write() syscall:
> 
> # pass PID of 'cp' to this script
> #!/bin/sh
> PID=$1
> /usr/share/bcc/tools/trace -P $PID  -t -C \
>     't:block:block_rq_insert "%s %d %d", args->rwbs, args->sector,
> args->nr_sector' \
>     't:syscalls:sys_exit_close ' \
>     't:syscalls:sys_exit_read ' \
>     't:syscalls:sys_exit_write '

Sorry if I am a bit confused, should I run it on ext4 or xfs, or
doesn't matter? What if I get it on a "fast" run? Should I throw it
away and try again until I get a slow one, or it doesn't matter?

Thanks,
Andrea


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: AW: Slow I/O on USB media after commit f664a3cc17b7d0a2bc3b3ab96181e1029b0ec0e6
  2019-12-24  1:32                                           ` AW: Slow I/O on USB media after commit f664a3cc17b7d0a2bc3b3ab96181e1029b0ec0e6 Ming Lei
@ 2019-12-24  8:04                                             ` Andrea Vai
  2019-12-24  8:47                                               ` Ming Lei
  0 siblings, 1 reply; 29+ messages in thread
From: Andrea Vai @ 2019-12-24  8:04 UTC (permalink / raw)
  To: Ming Lei
  Cc: Theodore Y. Ts'o, Schmid, Carsten, Finn Thain,
	Damien Le Moal, Alan Stern, Jens Axboe, Johannes Thumshirn,
	USB list, SCSI development list, Himanshu Madhani,
	Hannes Reinecke, Omar Sandoval, Martin K. Petersen, Greg KH,
	Hans Holmberg, Kernel development list, linux-ext4,
	linux-fsdevel

Il giorno mar, 24/12/2019 alle 09.32 +0800, Ming Lei ha scritto:
> On Mon, Dec 23, 2019 at 03:02:35PM +0100, Andrea Vai wrote:
> > Il giorno lun, 23/12/2019 alle 21.08 +0800, Ming Lei ha scritto:
> > > On Mon, Dec 23, 2019 at 12:22:45PM +0100, Andrea Vai wrote:
> > > > Il giorno mer, 18/12/2019 alle 17.48 +0800, Ming Lei ha
> scritto:
> > > > > On Wed, Dec 18, 2019 at 09:25:02AM +0100, Andrea Vai wrote:
> > > > > > Il giorno gio, 12/12/2019 alle 05.33 +0800, Ming Lei ha
> > > scritto:
> > > > > > > On Wed, Dec 11, 2019 at 11:07:45AM -0500, Theodore Y.
> Ts'o
> > > > > wrote:
> > > > > > > > On Wed, Dec 11, 2019 at 12:00:58PM +0800, Ming Lei
> wrote:
> > > > > > > > > I didn't reproduce the issue in my test environment,
> and
> > > > > follows
> > > > > > > > > Andrea's test commands[1]:
> > > > > > > > > 
> > > > > > > > >   mount UUID=$uuid /mnt/pendrive 2>&1 |tee -a
> $logfile
> > > > > > > > >   SECONDS=0
> > > > > > > > >   cp $testfile /mnt/pendrive 2>&1 |tee -a $logfile
> > > > > > > > >   umount /mnt/pendrive 2>&1 |tee -a $logfile
> > > > > > > > > 
> > > > > > > > > The 'cp' command supposes to open/close the file
> just
> > > once,
> > > > > > > however
> > > > > > > > > ext4_release_file() & write pages is observed to run
> for
> > > > > 4358
> > > > > > > times
> > > > > > > > > when executing the above 'cp' test.
> > > > > > > > 
> > > > > > > > Why are we sure the ext4_release_file() / _fput() is
> > > coming
> > > > > from
> > > > > > > the
> > > > > > > > cp command, as opposed to something else that might be
> > > running
> > > > > on
> > > > > > > the
> > > > > > > > system under test?  _fput() is called by the kernel
> when
> > > the
> > > > > last
> > > > > > > 
> > > > > > > Please see the log:
> > > > > > > 
> > > > > > > 
> > > > > 
> > > 
> https://lore.kernel.org/linux-scsi/3af3666920e7d46f8f0c6d88612f143ffabc743c.camel@unipv.it/2-log_ming.zip
> > > > > > > 
> > > > > > > Which is collected by:
> > > > > > > 
> > > > > > > #!/bin/sh
> > > > > > > MAJ=$1
> > > > > > > MIN=$2
> > > > > > > MAJ=$(( $MAJ << 20 ))
> > > > > > > DEV=$(( $MAJ | $MIN ))
> > > > > > > 
> > > > > > > /usr/share/bcc/tools/trace -t -C \
> > > > > > >     't:block:block_rq_issue (args->dev == '$DEV') "%s %d
> > > %d",
> > > > > args-
> > > > > > > >rwbs, args->sector, args->nr_sector' \
> > > > > > >     't:block:block_rq_insert (args->dev == '$DEV') "%s
> %d
> > > %d",
> > > > > args-
> > > > > > > >rwbs, args->sector, args->nr_sector'
> > > > > > > 
> > > > > > > $MAJ:$MIN points to the USB storage disk.
> > > > > > > 
> > > > > > > From the above IO trace, there are two write paths, one
> is
> > > from
> > > > > cp,
> > > > > > > another is from writeback wq.
> > > > > > > 
> > > > > > > The stackcount trace[1] is consistent with the IO trace
> log
> > > > > since it
> > > > > > > only shows two IO paths, that is why I concluded that
> the
> > > write
> > > > > done
> > > > > > > via
> > > > > > > ext4_release_file() is from 'cp'.
> > > > > > > 
> > > > > > > [1] 
> > > > > > > 
> > > > > 
> > > 
> https://lore.kernel.org/linux-scsi/320b315b9c87543d4fb919ecbdf841596c8fbcea.camel@unipv.it/2-log_ming_20191129_150609.zip
> > > > > > > 
> > > > > > > > reference to a struct file is
> released.  (Specifically, if
> > > you
> > > > > > > have a
> > > > > > > > fd which is dup'ed, it's only when the last fd
> > > corresponding
> > > > > to
> > > > > > > the
> > > > > > > > struct file is closed, and the struct file is about to
> be
> > > > > > > released,
> > > > > > > > does the file system's f_ops->release function get
> > > called.)
> > > > > > > > 
> > > > > > > > So the first question I'd ask is whether there is
> anything
> > > > > else
> > > > > > > going
> > > > > > > > on the system, and whether the writes are happening to
> the
> > > USB
> > > > > > > thumb
> > > > > > > > drive, or to some other storage device.  And if there
> is
> > > > > something
> > > > > > > > else which is writing to the pendrive, maybe that's
> why no
> > > one
> > > > > > > else
> > > > > > > > has been able to reproduce the OP's complaint....
> > > > > > > 
> > > > > > > OK, we can ask Andrea to confirm that via the following
> > > trace,
> > > > > which
> > > > > > > will add pid/comm info in the stack trace:
> > > > > > > 
> > > > > > > /usr/share/bcc/tools/stackcount
> > > blk_mq_sched_request_inserted
> > > > > > > 
> > > > > > > Andrew, could you collect the above log again when
> running
> > > > > new/bad
> > > > > > > kernel for confirming if the write done by
> > > ext4_release_file()
> > > > > is
> > > > > > > from
> > > > > > > the 'cp' process?
> > > > > > 
> > > > > > You can find the stackcount log attached. It has been
> produced
> > > by:
> > > > > > 
> > > > > > - /usr/share/bcc/tools/stackcount
> > > blk_mq_sched_request_inserted >
> > > > > trace.log
> > > > > > - wait some seconds
> > > > > > - run the test (1 copy trial), wait for the test to
> finish,
> > > wait
> > > > > some seconds
> > > > > > - stop the trace (ctrl+C)
> > > > > 
> > > > > Thanks for collecting the log, looks your 'stackcount'
> doesn't
> > > > > include
> > > > > comm/pid info, seems there is difference between your bcc
> and
> > > > > my bcc in fedora 30.
> > > > > 
> > > > > Could you collect above log again via the following command?
> > > > > 
> > > > > /usr/share/bcc/tools/stackcount -P -K
> t:block:block_rq_insert
> > > > > 
> > > > > which will show the comm/pid info.
> > > > 
> > > > ok, attached (trace_20191219.txt), the test (1 trial) took
> 3684
> > > > seconds.
> > > 
> > > From the above trace:
> > > 
> > >   b'blk_mq_sched_request_inserted'
> > >   b'blk_mq_sched_request_inserted'
> > >   b'dd_insert_requests'
> > >   b'blk_mq_sched_insert_requests'
> > >   b'blk_mq_flush_plug_list'
> > >   b'blk_flush_plug_list'
> > >   b'io_schedule_prepare'
> > >   b'io_schedule'
> > >   b'rq_qos_wait'
> > >   b'wbt_wait'
> > >   b'__rq_qos_throttle'
> > >   b'blk_mq_make_request'
> > >   b'generic_make_request'
> > >   b'submit_bio'
> > >   b'ext4_io_submit'
> > >   b'ext4_writepages'
> > >   b'do_writepages'
> > >   b'__filemap_fdatawrite_range'
> > >   b'ext4_release_file'
> > >   b'__fput'
> > >   b'task_work_run'
> > >   b'exit_to_usermode_loop'
> > >   b'do_syscall_64'
> > >   b'entry_SYSCALL_64_after_hwframe'
> > >     b'cp' [19863]
> > >     4400
> > > 
> > > So this write is clearly from 'cp' process, and it should be one
> > > ext4 fs issue.
> > > 
> > > Ted, can you take a look at this issue?
> > > 
> > > > 
> > > > > > I also tried the usual test with btrfs and xfs. Btrfs
> behavior
> > > > > looks
> > > > > > "good". xfs seems sometimes better, sometimes worse, I
> would
> > > say.
> > > > > I
> > > > > > don't know if it matters, anyway you can also find the
> results
> > > of
> > > > > the
> > > > > > two tests (100 trials each). Basically, btrfs is always
> > > between 68
> > > > > and
> > > > > > 89 seconds, with a cyclicity (?) with "period=2 trials".
> xfs
> > > looks
> > > > > > almost always very good (63-65s), but sometimes "bad"
> (>300s).
> > > > > 
> > > > > If you are interested in digging into this one, the
> following
> > > trace
> > > > > should be helpful:
> > > > > 
> > > > > 
> > > 
> https://lore.kernel.org/linux-scsi/f38db337cf26390f7c7488a0bc2076633737775b.camel@unipv.it/T/#m5aa008626e07913172ad40e1eb8e5f2ffd560fc6
> > > > > 
> > > > 
> > > > Attached:
> > > > - trace_xfs_20191223.txt (7 trials, then aborted while doing
> the
> > > 8th),
> > > > times to complete:
> > > > 64s
> > > > 63s
> > > > 64s
> > > > 833s
> > > > 1105s
> > > > 63s
> > > > 64s
> > > 
> > > oops, looks we have to collect io insert trace with the
> following
> > > bcc script
> > > on xfs for confirming if there is similar issue with ext4, could
> you
> > > run
> > > it again on xfs? And only post the trace done in case of slow
> 'cp'.
> > > 
> > > 
> > > #!/bin/sh
> > > 
> > > MAJ=$1
> > > MIN=$2
> > > MAJ=$(( $MAJ << 20 ))
> > > DEV=$(( $MAJ | $MIN ))
> > > 
> > > /usr/share/bcc/tools/trace -t -C \
> > >     't:block:block_rq_issue (args->dev == '$DEV') "%s %d %d",
> args-
> > > >rwbs, args->sector, args->nr_sector' \
> > >     't:block:block_rq_insert (args->dev == '$DEV') "%s %d %d",
> args-
> > > >rwbs, args->sector, args->nr_sector'
> > > 
> > > 
> > here it is (1 trial, 313 seconds to finish)
> 
> The above log shows similar issue with ext4 since there is another
> writeback IO path from 'cp' process. And the following trace can
> show if
> it is same with ext4's issue:
> 
> /usr/share/bcc/tools/stackcount -P -K t:block:block_rq_insert

sorry, also here please tell me which conditions should I use to run
the test (ext4 or xfs? slow run or not important?)

Thanks,
Andrea


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: AW: Slow I/O on USB media after commit f664a3cc17b7d0a2bc3b3ab96181e1029b0ec0e6
  2019-12-24  8:04                                             ` Andrea Vai
@ 2019-12-24  8:47                                               ` Ming Lei
  0 siblings, 0 replies; 29+ messages in thread
From: Ming Lei @ 2019-12-24  8:47 UTC (permalink / raw)
  To: Andrea Vai
  Cc: Theodore Y. Ts'o, Schmid, Carsten, Finn Thain,
	Damien Le Moal, Alan Stern, Jens Axboe, Johannes Thumshirn,
	USB list, SCSI development list, Himanshu Madhani,
	Hannes Reinecke, Omar Sandoval, Martin K. Petersen, Greg KH,
	Hans Holmberg, Kernel development list, linux-ext4,
	linux-fsdevel

On Tue, Dec 24, 2019 at 09:04:10AM +0100, Andrea Vai wrote:
> Il giorno mar, 24/12/2019 alle 09.32 +0800, Ming Lei ha scritto:
> > On Mon, Dec 23, 2019 at 03:02:35PM +0100, Andrea Vai wrote:
> > > Il giorno lun, 23/12/2019 alle 21.08 +0800, Ming Lei ha scritto:
> > > > On Mon, Dec 23, 2019 at 12:22:45PM +0100, Andrea Vai wrote:
> > > > > Il giorno mer, 18/12/2019 alle 17.48 +0800, Ming Lei ha
> > scritto:
> > > > > > On Wed, Dec 18, 2019 at 09:25:02AM +0100, Andrea Vai wrote:
> > > > > > > Il giorno gio, 12/12/2019 alle 05.33 +0800, Ming Lei ha
> > > > scritto:
> > > > > > > > On Wed, Dec 11, 2019 at 11:07:45AM -0500, Theodore Y.
> > Ts'o
> > > > > > wrote:
> > > > > > > > > On Wed, Dec 11, 2019 at 12:00:58PM +0800, Ming Lei
> > wrote:
> > > > > > > > > > I didn't reproduce the issue in my test environment,
> > and
> > > > > > follows
> > > > > > > > > > Andrea's test commands[1]:
> > > > > > > > > > 
> > > > > > > > > >   mount UUID=$uuid /mnt/pendrive 2>&1 |tee -a
> > $logfile
> > > > > > > > > >   SECONDS=0
> > > > > > > > > >   cp $testfile /mnt/pendrive 2>&1 |tee -a $logfile
> > > > > > > > > >   umount /mnt/pendrive 2>&1 |tee -a $logfile
> > > > > > > > > > 
> > > > > > > > > > The 'cp' command supposes to open/close the file
> > just
> > > > once,
> > > > > > > > however
> > > > > > > > > > ext4_release_file() & write pages is observed to run
> > for
> > > > > > 4358
> > > > > > > > times
> > > > > > > > > > when executing the above 'cp' test.
> > > > > > > > > 
> > > > > > > > > Why are we sure the ext4_release_file() / _fput() is
> > > > coming
> > > > > > from
> > > > > > > > the
> > > > > > > > > cp command, as opposed to something else that might be
> > > > running
> > > > > > on
> > > > > > > > the
> > > > > > > > > system under test?  _fput() is called by the kernel
> > when
> > > > the
> > > > > > last
> > > > > > > > 
> > > > > > > > Please see the log:
> > > > > > > > 
> > > > > > > > 
> > > > > > 
> > > > 
> > https://lore.kernel.org/linux-scsi/3af3666920e7d46f8f0c6d88612f143ffabc743c.camel@unipv.it/2-log_ming.zip
> > > > > > > > 
> > > > > > > > Which is collected by:
> > > > > > > > 
> > > > > > > > #!/bin/sh
> > > > > > > > MAJ=$1
> > > > > > > > MIN=$2
> > > > > > > > MAJ=$(( $MAJ << 20 ))
> > > > > > > > DEV=$(( $MAJ | $MIN ))
> > > > > > > > 
> > > > > > > > /usr/share/bcc/tools/trace -t -C \
> > > > > > > >     't:block:block_rq_issue (args->dev == '$DEV') "%s %d
> > > > %d",
> > > > > > args-
> > > > > > > > >rwbs, args->sector, args->nr_sector' \
> > > > > > > >     't:block:block_rq_insert (args->dev == '$DEV') "%s
> > %d
> > > > %d",
> > > > > > args-
> > > > > > > > >rwbs, args->sector, args->nr_sector'
> > > > > > > > 
> > > > > > > > $MAJ:$MIN points to the USB storage disk.
> > > > > > > > 
> > > > > > > > From the above IO trace, there are two write paths, one
> > is
> > > > from
> > > > > > cp,
> > > > > > > > another is from writeback wq.
> > > > > > > > 
> > > > > > > > The stackcount trace[1] is consistent with the IO trace
> > log
> > > > > > since it
> > > > > > > > only shows two IO paths, that is why I concluded that
> > the
> > > > write
> > > > > > done
> > > > > > > > via
> > > > > > > > ext4_release_file() is from 'cp'.
> > > > > > > > 
> > > > > > > > [1] 
> > > > > > > > 
> > > > > > 
> > > > 
> > https://lore.kernel.org/linux-scsi/320b315b9c87543d4fb919ecbdf841596c8fbcea.camel@unipv.it/2-log_ming_20191129_150609.zip
> > > > > > > > 
> > > > > > > > > reference to a struct file is
> > released.  (Specifically, if
> > > > you
> > > > > > > > have a
> > > > > > > > > fd which is dup'ed, it's only when the last fd
> > > > corresponding
> > > > > > to
> > > > > > > > the
> > > > > > > > > struct file is closed, and the struct file is about to
> > be
> > > > > > > > released,
> > > > > > > > > does the file system's f_ops->release function get
> > > > called.)
> > > > > > > > > 
> > > > > > > > > So the first question I'd ask is whether there is
> > anything
> > > > > > else
> > > > > > > > going
> > > > > > > > > on the system, and whether the writes are happening to
> > the
> > > > USB
> > > > > > > > thumb
> > > > > > > > > drive, or to some other storage device.  And if there
> > is
> > > > > > something
> > > > > > > > > else which is writing to the pendrive, maybe that's
> > why no
> > > > one
> > > > > > > > else
> > > > > > > > > has been able to reproduce the OP's complaint....
> > > > > > > > 
> > > > > > > > OK, we can ask Andrea to confirm that via the following
> > > > trace,
> > > > > > which
> > > > > > > > will add pid/comm info in the stack trace:
> > > > > > > > 
> > > > > > > > /usr/share/bcc/tools/stackcount
> > > > blk_mq_sched_request_inserted
> > > > > > > > 
> > > > > > > > Andrew, could you collect the above log again when
> > running
> > > > > > new/bad
> > > > > > > > kernel for confirming if the write done by
> > > > ext4_release_file()
> > > > > > is
> > > > > > > > from
> > > > > > > > the 'cp' process?
> > > > > > > 
> > > > > > > You can find the stackcount log attached. It has been
> > produced
> > > > by:
> > > > > > > 
> > > > > > > - /usr/share/bcc/tools/stackcount
> > > > blk_mq_sched_request_inserted >
> > > > > > trace.log
> > > > > > > - wait some seconds
> > > > > > > - run the test (1 copy trial), wait for the test to
> > finish,
> > > > wait
> > > > > > some seconds
> > > > > > > - stop the trace (ctrl+C)
> > > > > > 
> > > > > > Thanks for collecting the log, looks your 'stackcount'
> > doesn't
> > > > > > include
> > > > > > comm/pid info, seems there is difference between your bcc
> > and
> > > > > > my bcc in fedora 30.
> > > > > > 
> > > > > > Could you collect above log again via the following command?
> > > > > > 
> > > > > > /usr/share/bcc/tools/stackcount -P -K
> > t:block:block_rq_insert
> > > > > > 
> > > > > > which will show the comm/pid info.
> > > > > 
> > > > > ok, attached (trace_20191219.txt), the test (1 trial) took
> > 3684
> > > > > seconds.
> > > > 
> > > > From the above trace:
> > > > 
> > > >   b'blk_mq_sched_request_inserted'
> > > >   b'blk_mq_sched_request_inserted'
> > > >   b'dd_insert_requests'
> > > >   b'blk_mq_sched_insert_requests'
> > > >   b'blk_mq_flush_plug_list'
> > > >   b'blk_flush_plug_list'
> > > >   b'io_schedule_prepare'
> > > >   b'io_schedule'
> > > >   b'rq_qos_wait'
> > > >   b'wbt_wait'
> > > >   b'__rq_qos_throttle'
> > > >   b'blk_mq_make_request'
> > > >   b'generic_make_request'
> > > >   b'submit_bio'
> > > >   b'ext4_io_submit'
> > > >   b'ext4_writepages'
> > > >   b'do_writepages'
> > > >   b'__filemap_fdatawrite_range'
> > > >   b'ext4_release_file'
> > > >   b'__fput'
> > > >   b'task_work_run'
> > > >   b'exit_to_usermode_loop'
> > > >   b'do_syscall_64'
> > > >   b'entry_SYSCALL_64_after_hwframe'
> > > >     b'cp' [19863]
> > > >     4400
> > > > 
> > > > So this write is clearly from 'cp' process, and it should be one
> > > > ext4 fs issue.
> > > > 
> > > > Ted, can you take a look at this issue?
> > > > 
> > > > > 
> > > > > > > I also tried the usual test with btrfs and xfs. Btrfs
> > behavior
> > > > > > looks
> > > > > > > "good". xfs seems sometimes better, sometimes worse, I
> > would
> > > > say.
> > > > > > I
> > > > > > > don't know if it matters, anyway you can also find the
> > results
> > > > of
> > > > > > the
> > > > > > > two tests (100 trials each). Basically, btrfs is always
> > > > between 68
> > > > > > and
> > > > > > > 89 seconds, with a cyclicity (?) with "period=2 trials".
> > xfs
> > > > looks
> > > > > > > almost always very good (63-65s), but sometimes "bad"
> > (>300s).
> > > > > > 
> > > > > > If you are interested in digging into this one, the
> > following
> > > > trace
> > > > > > should be helpful:
> > > > > > 
> > > > > > 
> > > > 
> > https://lore.kernel.org/linux-scsi/f38db337cf26390f7c7488a0bc2076633737775b.camel@unipv.it/T/#m5aa008626e07913172ad40e1eb8e5f2ffd560fc6
> > > > > > 
> > > > > 
> > > > > Attached:
> > > > > - trace_xfs_20191223.txt (7 trials, then aborted while doing
> > the
> > > > 8th),
> > > > > times to complete:
> > > > > 64s
> > > > > 63s
> > > > > 64s
> > > > > 833s
> > > > > 1105s
> > > > > 63s
> > > > > 64s
> > > > 
> > > > oops, looks we have to collect io insert trace with the
> > following
> > > > bcc script
> > > > on xfs for confirming if there is similar issue with ext4, could
> > you
> > > > run
> > > > it again on xfs? And only post the trace done in case of slow
> > 'cp'.
> > > > 
> > > > 
> > > > #!/bin/sh
> > > > 
> > > > MAJ=$1
> > > > MIN=$2
> > > > MAJ=$(( $MAJ << 20 ))
> > > > DEV=$(( $MAJ | $MIN ))
> > > > 
> > > > /usr/share/bcc/tools/trace -t -C \
> > > >     't:block:block_rq_issue (args->dev == '$DEV') "%s %d %d",
> > args-
> > > > >rwbs, args->sector, args->nr_sector' \
> > > >     't:block:block_rq_insert (args->dev == '$DEV') "%s %d %d",
> > args-
> > > > >rwbs, args->sector, args->nr_sector'
> > > > 
> > > > 
> > > here it is (1 trial, 313 seconds to finish)
> > 
> > The above log shows similar issue with ext4 since there is another
> > writeback IO path from 'cp' process. And the following trace can
> > show if
> > it is same with ext4's issue:
> > 
> > /usr/share/bcc/tools/stackcount -P -K t:block:block_rq_insert
> 
> sorry, also here please tell me which conditions should I use to run
> the test (ext4 or xfs? slow run or not important?)

Maybe not needed.

After thinking the issue further, looks it is highly related with
removing ioc_batching and BDI congestion by blk-mq.

When there are two writeback paths, the original block layer(legacy)
can set 'cp' process which writes pages during close() as 'batching',
then write pages from writeback wq context is blocked. That said there
is actually single writeback IO path even though two are writing pages,
so write order can be maintained, see the following comment in original
__get_request():

	/*
	 * The queue will fill after this allocation, so set
	 * it as full, and mark this process as "batching".
	 * This process will be allowed to complete a batch of
	 * requests, others will be blocked.
	 */

This behavior can be shown in the IO trace done in old kernel with
legacy block IO path:

https://lore.kernel.org/linux-scsi/f82fd5129e3dcacae703a689be60b20a7fedadf6.camel@unipv.it/2-log_ming_20191128_182751.zip

IMO, we need to figure out one solution in blk-mq to fix this issue
since HDD. performance might be hurt under this situation.

Thanks,
Ming


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: AW: Slow I/O on USB media after commit f664a3cc17b7d0a2bc3b3ab96181e1029b0ec0e6
  2019-12-24  1:27                                                   ` Ming Lei
  2019-12-24  6:49                                                     ` Andrea Vai
@ 2019-12-24  8:51                                                     ` Andrea Vai
  2019-12-24  9:35                                                       ` Ming Lei
  2019-12-25  5:17                                                     ` Theodore Y. Ts'o
  2 siblings, 1 reply; 29+ messages in thread
From: Andrea Vai @ 2019-12-24  8:51 UTC (permalink / raw)
  To: Ming Lei, Theodore Y. Ts'o
  Cc: Schmid, Carsten, Finn Thain, Damien Le Moal, Alan Stern,
	Jens Axboe, Johannes Thumshirn, USB list, SCSI development list,
	Himanshu Madhani, Hannes Reinecke, Omar Sandoval,
	Martin K. Petersen, Greg KH, Hans Holmberg,
	Kernel development list, linux-ext4, linux-fsdevel

[-- Attachment #1: Type: text/plain, Size: 4324 bytes --]

Il giorno mar, 24/12/2019 alle 09.27 +0800, Ming Lei ha scritto:
> Hi Ted,
> 
> On Mon, Dec 23, 2019 at 02:53:01PM -0500, Theodore Y. Ts'o wrote:
> > On Mon, Dec 23, 2019 at 07:45:57PM +0100, Andrea Vai wrote:
> > > basically, it's:
> > > 
> > >   mount UUID=$uuid /mnt/pendrive
> > >   SECONDS=0
> > >   cp $testfile /mnt/pendrive
> > >   umount /mnt/pendrive
> > >   tempo=$SECONDS
> > > 
> > > and it copies one file only. Anyway, you can find the whole
> script
> > > attached.
> > 
> > OK, so whether we are doing the writeback at the end of cp, or
> when
> > you do the umount, it's probably not going to make any
> difference.  We
> > can get rid of the stack trace in question by changing the script
> to
> > be basically:
> > 
> > mount UUID=$uuid /mnt/pendrive
> > SECONDS=0
> > rm -f /mnt/pendrive/$testfile
> > cp $testfile /mnt/pendrive
> > umount /mnt/pendrive
> > tempo=$SECONDS
> > 
> > I predict if you do that, you'll see that all of the time is spent
> in
> > the umount, when we are trying to write back the file.
> > 
> > I really don't think then this is a file system problem at
> all.  It's
> > just that USB I/O is slow, for whatever reason.  We'll see a stack
> > trace in the writeback code waiting for the I/O to be completed,
> but
> > that doesn't mean that the root cause is in the writeback code or
> in
> > the file system which is triggering the writeback.
> 
> Wrt. the slow write on this usb storage, it is caused by two
> writeback
> path, one is the writeback wq, another is from ext4_release_file()
> which
> is triggered from exit_to_usermode_loop().
> 
> When the two write path is run concurrently, the sequential write
> order
> is broken, then write performance drops much on this particular usb
> storage.
> 
> The ext4_release_file() should be run from read() or write() syscall
> if
> Fedora 30's 'cp' is implemented correctly. IMO, it isn't expected
> behavior
> for ext4_release_file() to be run thousands of times when just
> running 'cp' once, see comment of ext4_release_file():
> 
> 	/*
> 	 * Called when an inode is released. Note that this is
> different
> 	 * from ext4_file_open: open gets called at every open, but
> release
> 	 * gets called only when /all/ the files are closed.
> 	 */
> 	static int ext4_release_file(struct inode *inode, struct file
> *filp)
> 
> > 
> > I suspect the next step is use a blktrace, to see what kind of I/O
> is
> > being sent to the USB drive, and how long it takes for the I/O to
> > complete.  You might also try to capture the output of "iostat -x
> 1"
> > while the script is running, and see what the difference might be
> > between a kernel version that has the problem and one that
> doesn't,
> > and see if that gives us a clue.
> 
> That isn't necessary, given we have concluded that the bad write
> performance is caused by broken write order.
> 
> > 
> > > > And then send me
> > > btw, please tell me if "me" means only you or I cc: all the
> > > recipients, as usual
> > 
> > Well, I don't think we know what the root cause is.  Ming is
> focusing
> > on that stack trace, but I think it's a red herring.....  And if
> it's
> > not a file system problem, then other people will be best suited
> to
> > debug the issue.
> 
> So far, the reason points to the extra writeback path from
> exit_to_usermode_loop().
> If it is not from close() syscall, the issue should be related with
> file reference
> count. If it is from close() syscall, the issue might be in 'cp''s
> implementation.
> 
> Andrea, please collect the following log or the strace log requested
> by Ted, then
> we can confirm if the extra writeback is from close() or
> read/write() syscall:
> 
> # pass PID of 'cp' to this script
> #!/bin/sh
> PID=$1
> /usr/share/bcc/tools/trace -P $PID  -t -C \
>     't:block:block_rq_insert "%s %d %d", args->rwbs, args->sector,
> args->nr_sector' \
>     't:syscalls:sys_exit_close ' \
>     't:syscalls:sys_exit_read ' \
>     't:syscalls:sys_exit_write '

Meanwhile, I tried to run the test and obtained an error (...usage:
trace [-h] [-b BUFFER_PAGES] [-p PID]...), so assumed the "-P" should
be "-p", corrected and obtained the attached log with ext4 and a slow
copy (2482 seconds) by doing:

- start the test
- look at the cp pid
- run the trace
- wait for the test to finish
- stop the trace.

Thanks,
Andrea

[-- Attachment #2: 20191224_test_ming.zip --]
[-- Type: application/zip, Size: 20830 bytes --]

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: AW: Slow I/O on USB media after commit f664a3cc17b7d0a2bc3b3ab96181e1029b0ec0e6
  2019-12-24  8:51                                                     ` Andrea Vai
@ 2019-12-24  9:35                                                       ` Ming Lei
  0 siblings, 0 replies; 29+ messages in thread
From: Ming Lei @ 2019-12-24  9:35 UTC (permalink / raw)
  To: Andrea Vai
  Cc: Theodore Y. Ts'o, Schmid, Carsten, Finn Thain,
	Damien Le Moal, Alan Stern, Jens Axboe, Johannes Thumshirn,
	USB list, SCSI development list, Himanshu Madhani,
	Hannes Reinecke, Omar Sandoval, Martin K. Petersen, Greg KH,
	Hans Holmberg, Kernel development list, linux-ext4,
	linux-fsdevel

On Tue, Dec 24, 2019 at 09:51:16AM +0100, Andrea Vai wrote:
> Il giorno mar, 24/12/2019 alle 09.27 +0800, Ming Lei ha scritto:
> > Hi Ted,
> > 
> > On Mon, Dec 23, 2019 at 02:53:01PM -0500, Theodore Y. Ts'o wrote:
> > > On Mon, Dec 23, 2019 at 07:45:57PM +0100, Andrea Vai wrote:
> > > > basically, it's:
> > > > 
> > > >   mount UUID=$uuid /mnt/pendrive
> > > >   SECONDS=0
> > > >   cp $testfile /mnt/pendrive
> > > >   umount /mnt/pendrive
> > > >   tempo=$SECONDS
> > > > 
> > > > and it copies one file only. Anyway, you can find the whole
> > script
> > > > attached.
> > > 
> > > OK, so whether we are doing the writeback at the end of cp, or
> > when
> > > you do the umount, it's probably not going to make any
> > difference.  We
> > > can get rid of the stack trace in question by changing the script
> > to
> > > be basically:
> > > 
> > > mount UUID=$uuid /mnt/pendrive
> > > SECONDS=0
> > > rm -f /mnt/pendrive/$testfile
> > > cp $testfile /mnt/pendrive
> > > umount /mnt/pendrive
> > > tempo=$SECONDS
> > > 
> > > I predict if you do that, you'll see that all of the time is spent
> > in
> > > the umount, when we are trying to write back the file.
> > > 
> > > I really don't think then this is a file system problem at
> > all.  It's
> > > just that USB I/O is slow, for whatever reason.  We'll see a stack
> > > trace in the writeback code waiting for the I/O to be completed,
> > but
> > > that doesn't mean that the root cause is in the writeback code or
> > in
> > > the file system which is triggering the writeback.
> > 
> > Wrt. the slow write on this usb storage, it is caused by two
> > writeback
> > path, one is the writeback wq, another is from ext4_release_file()
> > which
> > is triggered from exit_to_usermode_loop().
> > 
> > When the two write path is run concurrently, the sequential write
> > order
> > is broken, then write performance drops much on this particular usb
> > storage.
> > 
> > The ext4_release_file() should be run from read() or write() syscall
> > if
> > Fedora 30's 'cp' is implemented correctly. IMO, it isn't expected
> > behavior
> > for ext4_release_file() to be run thousands of times when just
> > running 'cp' once, see comment of ext4_release_file():
> > 
> > 	/*
> > 	 * Called when an inode is released. Note that this is
> > different
> > 	 * from ext4_file_open: open gets called at every open, but
> > release
> > 	 * gets called only when /all/ the files are closed.
> > 	 */
> > 	static int ext4_release_file(struct inode *inode, struct file
> > *filp)
> > 
> > > 
> > > I suspect the next step is use a blktrace, to see what kind of I/O
> > is
> > > being sent to the USB drive, and how long it takes for the I/O to
> > > complete.  You might also try to capture the output of "iostat -x
> > 1"
> > > while the script is running, and see what the difference might be
> > > between a kernel version that has the problem and one that
> > doesn't,
> > > and see if that gives us a clue.
> > 
> > That isn't necessary, given we have concluded that the bad write
> > performance is caused by broken write order.
> > 
> > > 
> > > > > And then send me
> > > > btw, please tell me if "me" means only you or I cc: all the
> > > > recipients, as usual
> > > 
> > > Well, I don't think we know what the root cause is.  Ming is
> > focusing
> > > on that stack trace, but I think it's a red herring.....  And if
> > it's
> > > not a file system problem, then other people will be best suited
> > to
> > > debug the issue.
> > 
> > So far, the reason points to the extra writeback path from
> > exit_to_usermode_loop().
> > If it is not from close() syscall, the issue should be related with
> > file reference
> > count. If it is from close() syscall, the issue might be in 'cp''s
> > implementation.
> > 
> > Andrea, please collect the following log or the strace log requested
> > by Ted, then
> > we can confirm if the extra writeback is from close() or
> > read/write() syscall:
> > 
> > # pass PID of 'cp' to this script
> > #!/bin/sh
> > PID=$1
> > /usr/share/bcc/tools/trace -P $PID  -t -C \
> >     't:block:block_rq_insert "%s %d %d", args->rwbs, args->sector,
> > args->nr_sector' \
> >     't:syscalls:sys_exit_close ' \
> >     't:syscalls:sys_exit_read ' \
> >     't:syscalls:sys_exit_write '
> 
> Meanwhile, I tried to run the test and obtained an error (...usage:
> trace [-h] [-b BUFFER_PAGES] [-p PID]...), so assumed the "-P" should
> be "-p", corrected and obtained the attached log with ext4 and a slow
> copy (2482 seconds) by doing:
> 
> - start the test
> - look at the cp pid
> - run the trace
> - wait for the test to finish
> - stop the trace.

The log shows all io submission is from close() syscall, so fs code
is fine, and I have provided the reason of this issue in last email:

https://lore.kernel.org/linux-scsi/e3dc2a3e0221c0a0beb91172ba2bff1f6acc0cb7.camel@unipv.it/T/#m845caca2969da5676516c35dc0c3528a79beb886

Thanks, 
Ming


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: AW: Slow I/O on USB media after commit f664a3cc17b7d0a2bc3b3ab96181e1029b0ec0e6
  2019-12-24  1:27                                                   ` Ming Lei
  2019-12-24  6:49                                                     ` Andrea Vai
  2019-12-24  8:51                                                     ` Andrea Vai
@ 2019-12-25  5:17                                                     ` Theodore Y. Ts'o
  2019-12-26  2:27                                                       ` Ming Lei
  2 siblings, 1 reply; 29+ messages in thread
From: Theodore Y. Ts'o @ 2019-12-25  5:17 UTC (permalink / raw)
  To: Ming Lei
  Cc: Andrea Vai, Schmid, Carsten, Finn Thain, Damien Le Moal,
	Alan Stern, Jens Axboe, Johannes Thumshirn, USB list,
	SCSI development list, Himanshu Madhani, Hannes Reinecke,
	Omar Sandoval, Martin K. Petersen, Greg KH, Hans Holmberg,
	Kernel development list, linux-ext4, linux-fsdevel

On Tue, Dec 24, 2019 at 09:27:07AM +0800, Ming Lei wrote:
> The ext4_release_file() should be run from read() or write() syscall if
> Fedora 30's 'cp' is implemented correctly. IMO, it isn't expected behavior
> for ext4_release_file() to be run thousands of times when just
> running 'cp' once, see comment of ext4_release_file():

What's your evidence of that?  As opposed to the writeback taking a
long time, leading to the *one* call of ext4_release_file taking a
long time?  If it's a big file, we might very well be calliing
ext4_writepages multiple times, from a single call to
__filemap_fdatawrite_range().

You confused mightily from that assertion, and that caused me to make
assumptions that cp was doing something crazy.  But I'm quite conviced
now that this is almost certainly not what is happening.

> > I suspect the next step is use a blktrace, to see what kind of I/O is
> > being sent to the USB drive, and how long it takes for the I/O to
> > complete.  You might also try to capture the output of "iostat -x 1"
> > while the script is running, and see what the difference might be
> > between a kernel version that has the problem and one that doesn't,
> > and see if that gives us a clue.
> 
> That isn't necessary, given we have concluded that the bad write
> performance is caused by broken write order.

I didn't see any evidence of that from what I had in my inbox, so I
went back to the mailing list archives to figure out what you were
talking about.  Part of the problem is this has been a very
long-spanning thread, and I had deleted from my inbox all of the parts
relating to the MQ scheduler since that was clearly Not My Problem.  :-)

So, summarizing the most of the thread.  The problem started when we
removed the legacy I/O scheduler, since we are now only using the MQ
scheduler.  What the kernel is sending is long writes (240 sectors),
but it is being sent as an interleaved stream of two sequential
writes.  This particular pendrive can't handle this workload, because
it has a very simplistic Flash Translation Layer.  Now, this is not
*broken*, from a storage perspective; it's just that it's more than
the simple little brain of this particular pen drive can handle.

Previously, with a single queue, and specially since the queue depth
supported by this pen drive is 1, the elevator algorithm would sort
the I/O requests so that it would be mostly sequential, and this
wouldn't be much of a problem.  However, once the legacy I/O stack was
removed, the MQ stack is designed so that we don't have to take a
global lock in order to submit an I/O request.  That also means that
we can't do a full elevator sort since that would require locking all
of the queues.

This is not a problem, since HDD's generally have a 16 deep queue, and
SSD's have a super-deep queue depth since they get their speed via
parallel writes to different flash chips.  Unfortunately, it *is* a
problem for super primitive USB sticks.

> So far, the reason points to the extra writeback path from exit_to_usermode_loop().
> If it is not from close() syscall, the issue should be related with file reference
> count. If it is from close() syscall, the issue might be in 'cp''s
> implementation.

Oh, it's probably from the close system call; and it's *only* from a
single close system call.  Because there is the auto delayed
allocation resolution to protect against buggy userspace, under
certain circumstances, as I explained earlier, we force a full
writeout on a close for a file decsriptor which was opened with an
O_TRUNC.  This is by *design*, since we are trying to protect against
buggy userspace (application programmers vastly outnumber file system
programmers, and far too many of them want O_PONY).  This is Working
As Intended.

You can disable it by deleting the test file before the cp:

    rm -f /mnt/pendrive/$testfile

Or you can disable the protection against stupid userspace by using
the noauto_da_alloc mount option.  (But then if you have a buggy game
program which writes the top-ten score file by using open(2) w/
O_TRUNC, and then said program closes the OpenGL library, and the
proprietary 3rd party binary-only video driver wedges the X server
requiring a hard reset to recover, and the top-ten score file becomes
a zero-length file, don't come crying to me...  Or if a graphical text
editor forgets to use fsync(2) before saving a source file you spent
hours working on, and then the system crashes at exactly the wrong
moment and your source file becomes zero-length, against, don't come
crying to me.  Blame the stupid application programmer which wrote
your text editor who decided to skip the fsync(2), or who decided that
copying the ACL's and xattrs was Too Hard(tm), and so opening the file
with O_TRUNC and rewriting the file in place was easier for the
application programmer.)

In any case, I think this is all working all as intended.  The MQ I/O
stack is optimized for modern HDD and SSD's, and especially SSD's.
And the file system assumes that parallel sequential writes,
especially if they are large, is really not a big deal, since that's
what NCQ or massive parallelism of pretty much all SSD's want.
(Again, ignoring the legacy of crappy flash drives.

You can argue with storage stack folks about whether we need to have
super-dumb mode for slow, crappy flash which uses a global lock and a
global elevator scheduler for super-crappy flash if you want.  I'm
going to stay out of that argument.

					- Ted

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: AW: Slow I/O on USB media after commit f664a3cc17b7d0a2bc3b3ab96181e1029b0ec0e6
  2019-12-25  5:17                                                     ` Theodore Y. Ts'o
@ 2019-12-26  2:27                                                       ` Ming Lei
  2019-12-26  3:30                                                         ` Theodore Y. Ts'o
  0 siblings, 1 reply; 29+ messages in thread
From: Ming Lei @ 2019-12-26  2:27 UTC (permalink / raw)
  To: Theodore Y. Ts'o
  Cc: Andrea Vai, Schmid, Carsten, Finn Thain, Damien Le Moal,
	Alan Stern, Jens Axboe, Johannes Thumshirn, USB list,
	SCSI development list, Himanshu Madhani, Hannes Reinecke,
	Omar Sandoval, Martin K. Petersen, Greg KH, Hans Holmberg,
	Kernel development list, linux-ext4, linux-fsdevel

On Wed, Dec 25, 2019 at 12:17:22AM -0500, Theodore Y. Ts'o wrote:
> On Tue, Dec 24, 2019 at 09:27:07AM +0800, Ming Lei wrote:
> > The ext4_release_file() should be run from read() or write() syscall if
> > Fedora 30's 'cp' is implemented correctly. IMO, it isn't expected behavior
> > for ext4_release_file() to be run thousands of times when just
> > running 'cp' once, see comment of ext4_release_file():
> 
> What's your evidence of that?  As opposed to the writeback taking a
> long time, leading to the *one* call of ext4_release_file taking a
> long time?  If it's a big file, we might very well be calliing
> ext4_writepages multiple times, from a single call to
> __filemap_fdatawrite_range().
> 
> You confused mightily from that assertion, and that caused me to make
> assumptions that cp was doing something crazy.  But I'm quite conviced
> now that this is almost certainly not what is happening.
> 
> > > I suspect the next step is use a blktrace, to see what kind of I/O is
> > > being sent to the USB drive, and how long it takes for the I/O to
> > > complete.  You might also try to capture the output of "iostat -x 1"
> > > while the script is running, and see what the difference might be
> > > between a kernel version that has the problem and one that doesn't,
> > > and see if that gives us a clue.
> > 
> > That isn't necessary, given we have concluded that the bad write
> > performance is caused by broken write order.
> 
> I didn't see any evidence of that from what I had in my inbox, so I
> went back to the mailing list archives to figure out what you were
> talking about.  Part of the problem is this has been a very
> long-spanning thread, and I had deleted from my inbox all of the parts
> relating to the MQ scheduler since that was clearly Not My Problem.  :-)
> 
> So, summarizing the most of the thread.  The problem started when we
> removed the legacy I/O scheduler, since we are now only using the MQ
> scheduler.  What the kernel is sending is long writes (240 sectors),
> but it is being sent as an interleaved stream of two sequential
> writes.  This particular pendrive can't handle this workload, because
> it has a very simplistic Flash Translation Layer.  Now, this is not
> *broken*, from a storage perspective; it's just that it's more than
> the simple little brain of this particular pen drive can handle.
> 
> Previously, with a single queue, and specially since the queue depth
> supported by this pen drive is 1, the elevator algorithm would sort
> the I/O requests so that it would be mostly sequential, and this
> wouldn't be much of a problem.  However, once the legacy I/O stack was
> removed, the MQ stack is designed so that we don't have to take a
> global lock in order to submit an I/O request.  That also means that
> we can't do a full elevator sort since that would require locking all
> of the queues.
> 
> This is not a problem, since HDD's generally have a 16 deep queue, and
> SSD's have a super-deep queue depth since they get their speed via
> parallel writes to different flash chips.  Unfortunately, it *is* a
> problem for super primitive USB sticks.
> 
> > So far, the reason points to the extra writeback path from exit_to_usermode_loop().
> > If it is not from close() syscall, the issue should be related with file reference
> > count. If it is from close() syscall, the issue might be in 'cp''s
> > implementation.
> 
> Oh, it's probably from the close system call; and it's *only* from a
> single close system call.  Because there is the auto delayed

Right. Looks I mis-interpreted the stackcount log, IOs are submitted
from single close syscall.

> allocation resolution to protect against buggy userspace, under
> certain circumstances, as I explained earlier, we force a full
> writeout on a close for a file decsriptor which was opened with an
> O_TRUNC.  This is by *design*, since we are trying to protect against
> buggy userspace (application programmers vastly outnumber file system
> programmers, and far too many of them want O_PONY).  This is Working
> As Intended.
> 
> You can disable it by deleting the test file before the cp:
> 
>     rm -f /mnt/pendrive/$testfile
> 
> Or you can disable the protection against stupid userspace by using
> the noauto_da_alloc mount option.  (But then if you have a buggy game
> program which writes the top-ten score file by using open(2) w/
> O_TRUNC, and then said program closes the OpenGL library, and the
> proprietary 3rd party binary-only video driver wedges the X server
> requiring a hard reset to recover, and the top-ten score file becomes
> a zero-length file, don't come crying to me...  Or if a graphical text
> editor forgets to use fsync(2) before saving a source file you spent
> hours working on, and then the system crashes at exactly the wrong
> moment and your source file becomes zero-length, against, don't come
> crying to me.  Blame the stupid application programmer which wrote
> your text editor who decided to skip the fsync(2), or who decided that
> copying the ACL's and xattrs was Too Hard(tm), and so opening the file
> with O_TRUNC and rewriting the file in place was easier for the
> application programmer.)
> 
> In any case, I think this is all working all as intended.  The MQ I/O
> stack is optimized for modern HDD and SSD's, and especially SSD's.
> And the file system assumes that parallel sequential writes,
> especially if they are large, is really not a big deal, since that's
> what NCQ or massive parallelism of pretty much all SSD's want.
> (Again, ignoring the legacy of crappy flash drives.
> 
> You can argue with storage stack folks about whether we need to have
> super-dumb mode for slow, crappy flash which uses a global lock and a
> global elevator scheduler for super-crappy flash if you want.  I'm
> going to stay out of that argument.

As I mentioned in the following link:

https://lore.kernel.org/linux-scsi/20191224084721.GA27248@ming.t460p/

The reason is that ioc_batching and BDI congestion is removed by blk-mq.

Then after queue is congested, multiple sequential writes can be done
concurrently at the same time. Before ioc_batching and BDI congestion is
removed, writes are done serialized from multiple processes actually, so
IOs are dispatched to drive in strict sequential order.

This way can't be an issue for SSD.

Maybe we need to be careful for HDD., since the request count in scheduler
queue is double of in-flight request count, and in theory NCQ should only
cover all in-flight 32 requests. I will find a sata HDD., and see if
performance drop can be observed in the similar 'cp' test.


Thanks,
Ming


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: AW: Slow I/O on USB media after commit f664a3cc17b7d0a2bc3b3ab96181e1029b0ec0e6
  2019-12-26  2:27                                                       ` Ming Lei
@ 2019-12-26  3:30                                                         ` Theodore Y. Ts'o
  2019-12-26  8:37                                                           ` Ming Lei
       [not found]                                                           ` <20200101074310.10904-1-hdanton@sina.com>
  0 siblings, 2 replies; 29+ messages in thread
From: Theodore Y. Ts'o @ 2019-12-26  3:30 UTC (permalink / raw)
  To: Ming Lei
  Cc: Andrea Vai, Schmid, Carsten, Finn Thain, Damien Le Moal,
	Alan Stern, Jens Axboe, Johannes Thumshirn, USB list,
	SCSI development list, Himanshu Madhani, Hannes Reinecke,
	Omar Sandoval, Martin K. Petersen, Greg KH, Hans Holmberg,
	Kernel development list, linux-ext4, linux-fsdevel

On Thu, Dec 26, 2019 at 10:27:02AM +0800, Ming Lei wrote:
> Maybe we need to be careful for HDD., since the request count in scheduler
> queue is double of in-flight request count, and in theory NCQ should only
> cover all in-flight 32 requests. I will find a sata HDD., and see if
> performance drop can be observed in the similar 'cp' test.

Please try to measure it, but I'd be really surprised if it's
significant with with modern HDD's.  That because they typically have
a queue depth of 16, and a max_sectors_kb of 32767 (e.g., just under
32 MiB).  Sort seeks are typically 1-2 ms, with full stroke seeks
8-10ms.  Typical sequential write speeds on a 7200 RPM drive is
125-150 MiB/s.  So suppose every other request sent to the HDD is from
the other request stream.  The disk will chose the 8 requests from its
queue that are contiguous, and so it will be writing around 256 MiB,
which will take 2-3 seconds.  If it then needs to spend between 1 and
10 ms seeking to another location of the disk, before it writes the
next 256 MiB, the worst case overhead of that seek is 10ms / 2s, or
0.5%.  That may very well be within your measurements' error bars.

And of course, note that in real life, we are very *often* writing to
multiple files in parallel, for example, during a "make -j16" while
building the kernel.  Writing a single large file is certainly
something people do (but even there people who are burning a 4G DVD
rip are often browsing the web while they are waiting for it to
complete, and the browser will be writing cache files, etc.).  So
whether or not this is something where we should be stressing over
this specific workload is going to be quite debateable.

						- Ted

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: AW: Slow I/O on USB media after commit f664a3cc17b7d0a2bc3b3ab96181e1029b0ec0e6
  2019-12-26  3:30                                                         ` Theodore Y. Ts'o
@ 2019-12-26  8:37                                                           ` Ming Lei
  2020-01-07  7:51                                                             ` Andrea Vai
       [not found]                                                           ` <20200101074310.10904-1-hdanton@sina.com>
  1 sibling, 1 reply; 29+ messages in thread
From: Ming Lei @ 2019-12-26  8:37 UTC (permalink / raw)
  To: Theodore Y. Ts'o
  Cc: Andrea Vai, Schmid, Carsten, Finn Thain, Damien Le Moal,
	Alan Stern, Jens Axboe, Johannes Thumshirn, USB list,
	SCSI development list, Himanshu Madhani, Hannes Reinecke,
	Omar Sandoval, Martin K. Petersen, Greg KH, Hans Holmberg,
	Kernel development list, linux-ext4, linux-fsdevel

On Wed, Dec 25, 2019 at 10:30:57PM -0500, Theodore Y. Ts'o wrote:
> On Thu, Dec 26, 2019 at 10:27:02AM +0800, Ming Lei wrote:
> > Maybe we need to be careful for HDD., since the request count in scheduler
> > queue is double of in-flight request count, and in theory NCQ should only
> > cover all in-flight 32 requests. I will find a sata HDD., and see if
> > performance drop can be observed in the similar 'cp' test.
> 
> Please try to measure it, but I'd be really surprised if it's
> significant with with modern HDD's.

Just find one machine with AHCI SATA, and run the following xfs
overwrite test:

#!/bin/bash
DIR=$1
echo 3 > /proc/sys/vm/drop_caches
fio --readwrite=write --filesize=5g --overwrite=1 --filename=$DIR/fiofile \
        --runtime=60s --time_based --ioengine=psync --direct=0 --bs=4k
		--iodepth=128 --numjobs=2 --group_reporting=1 --name=overwrite

FS is xfs, and disk is LVM over AHCI SATA with NCQ(depth 32), because the
machine is picked up from RH beaker, and it is the only disk in the box.

#lsblk
NAME                            MAJ:MIN RM   SIZE RO TYPE MOUNTPOINT
sda                               8:0    0 931.5G  0 disk 
├─sda1                            8:1    0     1G  0 part /boot
└─sda2                            8:2    0 930.5G  0 part 
  ├─rhel_hpe--ml10gen9--01-root 253:0    0    50G  0 lvm  /
  ├─rhel_hpe--ml10gen9--01-swap 253:1    0   3.9G  0 lvm  [SWAP]
  └─rhel_hpe--ml10gen9--01-home 253:2    0 876.6G  0 lvm  /home


kernel: 3a7ea2c483a53fc("scsi: provide mq_ops->busy() hook") which is
the previous commit of f664a3cc17b7 ("scsi: kill off the legacy IO path").

            |scsi_mod.use_blk_mq=N |scsi_mod.use_blk_mq=Y |
-----------------------------------------------------------
throughput: |244MB/s               |169MB/s               |
-----------------------------------------------------------

Similar result can be observed on v5.4 kernel(184MB/s) with same test
steps.


> That because they typically have
> a queue depth of 16, and a max_sectors_kb of 32767 (e.g., just under
> 32 MiB).  Sort seeks are typically 1-2 ms, with full stroke seeks
> 8-10ms.  Typical sequential write speeds on a 7200 RPM drive is
> 125-150 MiB/s.  So suppose every other request sent to the HDD is from
> the other request stream.  The disk will chose the 8 requests from its
> queue that are contiguous, and so it will be writing around 256 MiB,
> which will take 2-3 seconds.  If it then needs to spend between 1 and
> 10 ms seeking to another location of the disk, before it writes the
> next 256 MiB, the worst case overhead of that seek is 10ms / 2s, or
> 0.5%.  That may very well be within your measurements' error bars.

Looks you assume that disk seeking just happens once when writing around
256MB. This assumption may not be true, given all data can be in page
cache before writing. So when two tasks are submitting IOs concurrently,
IOs from each single task is sequential, and NCQ may order the current batch
submitted from the two streams. However disk seeking may still be needed
for the next batch handled by NCQ.

> And of course, note that in real life, we are very *often* writing to
> multiple files in parallel, for example, during a "make -j16" while
> building the kernel.  Writing a single large file is certainly
> something people do (but even there people who are burning a 4G DVD
> rip are often browsing the web while they are waiting for it to
> complete, and the browser will be writing cache files, etc.).  So
> whether or not this is something where we should be stressing over
> this specific workload is going to be quite debateable.

Thanks, 
Ming


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: slow IO on USB media
       [not found]                                                           ` <20200101074310.10904-1-hdanton@sina.com>
@ 2020-01-01 13:53                                                             ` Ming Lei
  0 siblings, 0 replies; 29+ messages in thread
From: Ming Lei @ 2020-01-01 13:53 UTC (permalink / raw)
  To: Hillf Danton
  Cc: Theodore Y. Ts'o, Andrea Vai, Schmid, Carsten, Finn Thain,
	Damien Le Moal, Alan Stern, Jens Axboe, Johannes Thumshirn,
	USB list, SCSI development list, Himanshu Madhani,
	Hannes Reinecke, Omar Sandoval, Martin K. Petersen, Greg KH,
	Hans Holmberg, Kernel development list, linux-ext4,
	linux-fsdevel

On Wed, Jan 01, 2020 at 03:43:10PM +0800, Hillf Danton wrote:
> 
> On Thu, 26 Dec 2019 16:37:06 +0800 Ming Lei wrote:
> > On Wed, Dec 25, 2019 at 10:30:57PM -0500, Theodore Y. Ts'o wrote:
> > > On Thu, Dec 26, 2019 at 10:27:02AM +0800, Ming Lei wrote:
> > > > Maybe we need to be careful for HDD., since the request count in scheduler
> > > > queue is double of in-flight request count, and in theory NCQ should only
> > > > cover all in-flight 32 requests. I will find a sata HDD., and see if
> > > > performance drop can be observed in the similar 'cp' test.
> > >
> > > Please try to measure it, but I'd be really surprised if it's
> > > significant with with modern HDD's.
> > 
> > Just find one machine with AHCI SATA, and run the following xfs
> > overwrite test:
> > 
> > #!/bin/bash
> > DIR=$1
> > echo 3 > /proc/sys/vm/drop_caches
> > fio --readwrite=write --filesize=5g --overwrite=1 --filename=$DIR/fiofile \
> >         --runtime=60s --time_based --ioengine=psync --direct=0 --bs=4k
> > 		--iodepth=128 --numjobs=2 --group_reporting=1 --name=overwrite
> > 
> > FS is xfs, and disk is LVM over AHCI SATA with NCQ(depth 32), because the
> > machine is picked up from RH beaker, and it is the only disk in the box.
> > 
> > #lsblk
> > NAME                            MAJ:MIN RM   SIZE RO TYPE MOUNTPOINT
> > sda                               8:0    0 931.5G  0 disk
> > =E2=94=9C=E2=94=80sda1                            8:1    0     1G  0 part /boot
> > =E2=94=94=E2=94=80sda2                            8:2    0 930.5G  0 part
> >   =E2=94=9C=E2=94=80rhel_hpe--ml10gen9--01-root 253:0    0    50G  0 lvm  /
> >   =E2=94=9C=E2=94=80rhel_hpe--ml10gen9--01-swap 253:1    0   3.9G  0 lvm  [SWAP]
> >   =E2=94=94=E2=94=80rhel_hpe--ml10gen9--01-home 253:2    0 876.6G  0 lvm  /home
> > 
> > 
> > kernel: 3a7ea2c483a53fc("scsi: provide mq_ops->busy() hook") which is
> > the previous commit of f664a3cc17b7 ("scsi: kill off the legacy IO path").
> > 
> >             |scsi_mod.use_blk_mq=N |scsi_mod.use_blk_mq=Y |
> > -----------------------------------------------------------
> > throughput: |244MB/s               |169MB/s               |
> > -----------------------------------------------------------
> > 
> > Similar result can be observed on v5.4 kernel(184MB/s) with same test
> > steps.
> 
> 
> The simple diff makes direct issue of requests take pending requests
> also into account and goes the nornal enqueue-and-dequeue path if any
> pending requests exist.
> 
> Then it sorts requests regardless of the number of hard queues in a
> bid to make requests as sequencial as they are. Let's see if the
> added sorting cost can make any sense.
> 
> --->8---
> 
> --- a/block/blk-mq-sched.c
> +++ b/block/blk-mq-sched.c
> @@ -410,6 +410,11 @@ run:
>  		blk_mq_run_hw_queue(hctx, async);
>  }
>  
> +static inline bool blk_mq_sched_hctx_has_pending_rq(struct blk_mq_hw_ctx *hctx)
> +{
> +	return sbitmap_any_bit_set(&hctx->ctx_map);
> +}
> +
>  void blk_mq_sched_insert_requests(struct blk_mq_hw_ctx *hctx,
>  				  struct blk_mq_ctx *ctx,
>  				  struct list_head *list, bool run_queue_async)
> @@ -433,7 +438,8 @@ void blk_mq_sched_insert_requests(struct
>  		 * busy in case of 'none' scheduler, and this way may save
>  		 * us one extra enqueue & dequeue to sw queue.
>  		 */
> -		if (!hctx->dispatch_busy && !e && !run_queue_async) {
> +		if (!hctx->dispatch_busy && !e && !run_queue_async &&
> +		    !blk_mq_sched_hctx_has_pending_rq(hctx)) {
>  			blk_mq_try_issue_list_directly(hctx, list);
>  			if (list_empty(list))
>  				goto out;
> --- a/block/blk-mq.c
> +++ b/block/blk-mq.c
> @@ -1692,7 +1692,7 @@ void blk_mq_flush_plug_list(struct blk_p
>  
>  	list_splice_init(&plug->mq_list, &list);
>  
> -	if (plug->rq_count > 2 && plug->multiple_queues)
> +	if (plug->rq_count > 1)
>  		list_sort(NULL, &list, plug_rq_cmp);
>  
>  	plug->rq_count = 0;

I guess you may not understand the reason, and the issue is related
with neither MQ nor plug.

AHCI/SATA is single queue drive, and for HDD. IO throughput is very
sensitive with IO order in case of sequential IO.

Legacy IO path supports ioc batching and BDI queue congestion. When
there are more than one writeback IO paths, there may be only one
active IO submission path, meantime others are blocked attributed to
ioc batching, so writeback IO is still dispatched to disk in strict
IO order.

But ioc batching and BDI queue congestion is killed when converting to
blk-mq.

Please see the following IO trace with legacy IO request path:

https://lore.kernel.org/linux-scsi/f82fd5129e3dcacae703a689be60b20a7fedadf6.camel@unipv.it/2-log_ming_20191128_182751.zip


Thanks,
Ming


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: AW: Slow I/O on USB media after commit f664a3cc17b7d0a2bc3b3ab96181e1029b0ec0e6
  2019-12-26  8:37                                                           ` Ming Lei
@ 2020-01-07  7:51                                                             ` Andrea Vai
  0 siblings, 0 replies; 29+ messages in thread
From: Andrea Vai @ 2020-01-07  7:51 UTC (permalink / raw)
  To: Ming Lei, Theodore Y. Ts'o
  Cc: Schmid, Carsten, Finn Thain, Damien Le Moal, Alan Stern,
	Jens Axboe, Johannes Thumshirn, USB list, SCSI development list,
	Himanshu Madhani, Hannes Reinecke, Omar Sandoval,
	Martin K. Petersen, Greg KH, Hans Holmberg,
	Kernel development list, linux-ext4, linux-fsdevel

Il giorno gio, 26/12/2019 alle 16.37 +0800, Ming Lei ha scritto:
> On Wed, Dec 25, 2019 at 10:30:57PM -0500, Theodore Y. Ts'o wrote:
> > On Thu, Dec 26, 2019 at 10:27:02AM +0800, Ming Lei wrote:
> > > Maybe we need to be careful for HDD., since the request count in
> scheduler
> > > queue is double of in-flight request count, and in theory NCQ
> should only
> > > cover all in-flight 32 requests. I will find a sata HDD., and
> see if
> > > performance drop can be observed in the similar 'cp' test.
> > 
> > Please try to measure it, but I'd be really surprised if it's
> > significant with with modern HDD's.
> 
> Just find one machine with AHCI SATA, and run the following xfs
> overwrite test:
> 
> #!/bin/bash
> DIR=$1
> echo 3 > /proc/sys/vm/drop_caches
> fio --readwrite=write --filesize=5g --overwrite=1 --
> filename=$DIR/fiofile \
>         --runtime=60s --time_based --ioengine=psync --direct=0 --
> bs=4k
> 		--iodepth=128 --numjobs=2 --group_reporting=1 --
> name=overwrite
> 
> FS is xfs, and disk is LVM over AHCI SATA with NCQ(depth 32),
> because the
> machine is picked up from RH beaker, and it is the only disk in the
> box.
> 
> #lsblk
> NAME                            MAJ:MIN RM   SIZE RO TYPE MOUNTPOINT
> sda                               8:0    0 931.5G  0 disk 
> ├─sda1                            8:1    0     1G  0 part /boot
> └─sda2                            8:2    0 930.5G  0 part 
>   ├─rhel_hpe--ml10gen9--01-root 253:0    0    50G  0 lvm  /
>   ├─rhel_hpe--ml10gen9--01-swap 253:1    0   3.9G  0 lvm  [SWAP]
>   └─rhel_hpe--ml10gen9--01-home 253:2    0 876.6G  0 lvm  /home
> 
> 
> kernel: 3a7ea2c483a53fc("scsi: provide mq_ops->busy() hook") which
> is
> the previous commit of f664a3cc17b7 ("scsi: kill off the legacy IO
> path").
> 
>             |scsi_mod.use_blk_mq=N |scsi_mod.use_blk_mq=Y |
> -----------------------------------------------------------
> throughput: |244MB/s               |169MB/s               |
> -----------------------------------------------------------
> 
> Similar result can be observed on v5.4 kernel(184MB/s) with same
> test
> steps.
> 
> 
> > That because they typically have
> > a queue depth of 16, and a max_sectors_kb of 32767 (e.g., just
> under
> > 32 MiB).  Sort seeks are typically 1-2 ms, with full stroke seeks
> > 8-10ms.  Typical sequential write speeds on a 7200 RPM drive is
> > 125-150 MiB/s.  So suppose every other request sent to the HDD is
> from
> > the other request stream.  The disk will chose the 8 requests from
> its
> > queue that are contiguous, and so it will be writing around 256
> MiB,
> > which will take 2-3 seconds.  If it then needs to spend between 1
> and
> > 10 ms seeking to another location of the disk, before it writes
> the
> > next 256 MiB, the worst case overhead of that seek is 10ms / 2s,
> or
> > 0.5%.  That may very well be within your measurements' error bars.
> 
> Looks you assume that disk seeking just happens once when writing
> around
> 256MB. This assumption may not be true, given all data can be in
> page
> cache before writing. So when two tasks are submitting IOs
> concurrently,
> IOs from each single task is sequential, and NCQ may order the
> current batch
> submitted from the two streams. However disk seeking may still be
> needed
> for the next batch handled by NCQ.
> 
> > And of course, note that in real life, we are very *often* writing
> to
> > multiple files in parallel, for example, during a "make -j16"
> while
> > building the kernel.  Writing a single large file is certainly
> > something people do (but even there people who are burning a 4G
> DVD
> > rip are often browsing the web while they are waiting for it to
> > complete, and the browser will be writing cache files, etc.).  So
> > whether or not this is something where we should be stressing over
> > this specific workload is going to be quite debateable.
> 

Hi,
  is there any update on this? Sorry if I am making noise, but I would
like to help to improve the kernel (or fix it) if I can help.
Otherwise, please let me know how to consider this case,

Thanks, and bye
Andrea


^ permalink raw reply	[flat|nested] 29+ messages in thread

end of thread, back to index

Thread overview: 29+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
     [not found] <20191126023253.GA24501@ming.t460p>
     [not found] ` <0598fe2754bf0717d81f7e72d3e9b3230c608cc6.camel@unipv.it>
     [not found]   ` <alpine.LNX.2.21.1.1911271055200.8@nippy.intranet>
     [not found]     ` <cb6e84781c4542229a3f31572cef19ab@SVR-IES-MBX-03.mgc.mentorg.com>
     [not found]       ` <c1358b840b3a4971aa35a25d8495c2c8953403ea.camel@unipv.it>
     [not found]         ` <20191128091712.GD15549@ming.t460p>
     [not found]           ` <f82fd5129e3dcacae703a689be60b20a7fedadf6.camel@unipv.it>
     [not found]             ` <20191129005734.GB1829@ming.t460p>
     [not found]               ` <20191129023555.GA8620@ming.t460p>
     [not found]                 ` <320b315b9c87543d4fb919ecbdf841596c8fbcea.camel@unipv.it>
2019-12-03  2:23                   ` AW: Slow I/O on USB media after commit f664a3cc17b7d0a2bc3b3ab96181e1029b0ec0e6 Ming Lei
2019-12-10  7:35                     ` Andrea Vai
2019-12-10  8:05                       ` Ming Lei
2019-12-11  2:41                         ` Theodore Y. Ts'o
2019-12-11  4:00                           ` Ming Lei
2019-12-11 16:07                             ` Theodore Y. Ts'o
2019-12-11 21:33                               ` Ming Lei
2019-12-12  7:34                                 ` Andrea Vai
2019-12-18  8:25                                 ` Andrea Vai
2019-12-18  9:48                                   ` Ming Lei
     [not found]                                     ` <b1b6a0e9d690ecd9432025acd2db4ac09f834040.camel@unipv.it>
2019-12-23 13:08                                       ` Ming Lei
2019-12-23 16:26                                         ` Theodore Y. Ts'o
2019-12-23 16:29                                           ` Andrea Vai
2019-12-23 17:22                                             ` Theodore Y. Ts'o
2019-12-23 18:45                                               ` Andrea Vai
2019-12-23 19:53                                                 ` Theodore Y. Ts'o
2019-12-24  1:27                                                   ` Ming Lei
2019-12-24  6:49                                                     ` Andrea Vai
2019-12-24  8:51                                                     ` Andrea Vai
2019-12-24  9:35                                                       ` Ming Lei
2019-12-25  5:17                                                     ` Theodore Y. Ts'o
2019-12-26  2:27                                                       ` Ming Lei
2019-12-26  3:30                                                         ` Theodore Y. Ts'o
2019-12-26  8:37                                                           ` Ming Lei
2020-01-07  7:51                                                             ` Andrea Vai
     [not found]                                                           ` <20200101074310.10904-1-hdanton@sina.com>
2020-01-01 13:53                                                             ` slow IO on USB media Ming Lei
     [not found]                                         ` <fc6f73fc5f57ade8890b472d63c7f0bd559de538.camel@unipv.it>
2019-12-24  1:32                                           ` AW: Slow I/O on USB media after commit f664a3cc17b7d0a2bc3b3ab96181e1029b0ec0e6 Ming Lei
2019-12-24  8:04                                             ` Andrea Vai
2019-12-24  8:47                                               ` Ming Lei

Linux-ext4 Archive on lore.kernel.org

Archives are clonable:
	git clone --mirror https://lore.kernel.org/linux-ext4/0 linux-ext4/git/0.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 linux-ext4 linux-ext4/ https://lore.kernel.org/linux-ext4 \
		linux-ext4@vger.kernel.org
	public-inbox-index linux-ext4

Example config snippet for mirrors

Newsgroup available over NNTP:
	nntp://nntp.lore.kernel.org/org.kernel.vger.linux-ext4


AGPL code for this site: git clone https://public-inbox.org/public-inbox.git