Linux-ext4 Archive on lore.kernel.org
 help / color / Atom feed
* Re: AW: Slow I/O on USB media after commit f664a3cc17b7d0a2bc3b3ab96181e1029b0ec0e6
       [not found]                 ` <320b315b9c87543d4fb919ecbdf841596c8fbcea.camel@unipv.it>
@ 2019-12-03  2:23                   ` Ming Lei
  2019-12-10  7:35                     ` Andrea Vai
  0 siblings, 1 reply; 4+ messages in thread
From: Ming Lei @ 2019-12-03  2:23 UTC (permalink / raw)
  To: Andrea Vai
  Cc: Schmid, Carsten, Finn Thain, Damien Le Moal, Alan Stern,
	Jens Axboe, Johannes Thumshirn, USB list, SCSI development list,
	Himanshu Madhani, Hannes Reinecke, Omar Sandoval,
	Martin K. Petersen, Greg KH, Hans Holmberg,
	Kernel development list, linux-ext4, linux-fsdevel,
	Theodore Ts'o

On Fri, Nov 29, 2019 at 03:41:01PM +0100, Andrea Vai wrote:
> Il giorno ven, 29/11/2019 alle 10.35 +0800, Ming Lei ha scritto:
> > On Fri, Nov 29, 2019 at 08:57:34AM +0800, Ming Lei wrote:
> > 
> > > [...]
> > 
> > > Andrea, can you collect the following log when running the test
> > > on current new(bad) kernel?
> > > 
> > > 	/usr/share/bcc/tools/stackcount  -K blk_mq_make_request
> > 
> > Instead, please run the following trace, given insert may be
> > called from other paths, such as flush plug:
> > 
> > 	/usr/share/bcc/tools/stackcount -K t:block:block_rq_insert
> 
> Attached, for new (patched) bad kernel.
> 
> Produced by: start the trace script (with the pendrive already
> plugged), wait some seconds, run the test (1 trial, 1 GB), wait for
> the test to finish, stop the trace.
> 
> The copy took ~1700 seconds.

See the two path[1][2] of inserting request, and path[1] is triggered
4358 times, and the path[2] is triggered 5763 times.

The path[2] is expected behaviour. Not sure path [1] is correct, given
ext4_release_file() is supposed to be called when this inode is
released. That means the file is closed 4358 times during 1GB file
copying to usb storage.

Cc filesystem list.


[1] insert requests when returning to user mode from syscall

  b'blk_mq_sched_request_inserted'
  b'blk_mq_sched_request_inserted'
  b'dd_insert_requests'
  b'blk_mq_sched_insert_requests'
  b'blk_mq_flush_plug_list'
  b'blk_flush_plug_list'
  b'io_schedule_prepare'
  b'io_schedule'
  b'rq_qos_wait'
  b'wbt_wait'
  b'__rq_qos_throttle'
  b'blk_mq_make_request'
  b'generic_make_request'
  b'submit_bio'
  b'ext4_io_submit'
  b'ext4_writepages'
  b'do_writepages'
  b'__filemap_fdatawrite_range'
  b'ext4_release_file'
  b'__fput'
  b'task_work_run'
  b'exit_to_usermode_loop'
  b'do_syscall_64'
  b'entry_SYSCALL_64_after_hwframe'
    4358

[2] insert requests from writeback wq context

  b'blk_mq_sched_request_inserted'
  b'blk_mq_sched_request_inserted'
  b'dd_insert_requests'
  b'blk_mq_sched_insert_requests'
  b'blk_mq_flush_plug_list'
  b'blk_flush_plug_list'
  b'io_schedule_prepare'
  b'io_schedule'
  b'rq_qos_wait'
  b'wbt_wait'
  b'__rq_qos_throttle'
  b'blk_mq_make_request'
  b'generic_make_request'
  b'submit_bio'
  b'ext4_io_submit'
  b'ext4_bio_write_page'
  b'mpage_submit_page'
  b'mpage_process_page_bufs'
  b'mpage_prepare_extent_to_map'
  b'ext4_writepages'
  b'do_writepages'
  b'__writeback_single_inode'
  b'writeback_sb_inodes'
  b'__writeback_inodes_wb'
  b'wb_writeback'
  b'wb_workfn'
  b'process_one_work'
  b'worker_thread'
  b'kthread'
  b'ret_from_fork'
    5763

Thanks,
Ming


^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: AW: Slow I/O on USB media after commit f664a3cc17b7d0a2bc3b3ab96181e1029b0ec0e6
  2019-12-03  2:23                   ` AW: Slow I/O on USB media after commit f664a3cc17b7d0a2bc3b3ab96181e1029b0ec0e6 Ming Lei
@ 2019-12-10  7:35                     ` Andrea Vai
  2019-12-10  8:05                       ` Ming Lei
  0 siblings, 1 reply; 4+ messages in thread
From: Andrea Vai @ 2019-12-10  7:35 UTC (permalink / raw)
  To: Ming Lei
  Cc: Schmid, Carsten, Finn Thain, Damien Le Moal, Alan Stern,
	Jens Axboe, Johannes Thumshirn, USB list, SCSI development list,
	Himanshu Madhani, Hannes Reinecke, Omar Sandoval,
	Martin K. Petersen, Greg KH, Hans Holmberg,
	Kernel development list, linux-ext4, linux-fsdevel,
	Theodore Ts'o

Il giorno mar, 03/12/2019 alle 10.23 +0800, Ming Lei ha scritto:
> On Fri, Nov 29, 2019 at 03:41:01PM +0100, Andrea Vai wrote:
> > Il giorno ven, 29/11/2019 alle 10.35 +0800, Ming Lei ha scritto:
> > > On Fri, Nov 29, 2019 at 08:57:34AM +0800, Ming Lei wrote:
> > > 
> > > > [...]
> > > 
> > > > Andrea, can you collect the following log when running the
> test
> > > > on current new(bad) kernel?
> > > > 
> > > > 	/usr/share/bcc/tools/stackcount  -K
> blk_mq_make_request
> > > 
> > > Instead, please run the following trace, given insert may be
> > > called from other paths, such as flush plug:
> > > 
> > > 	/usr/share/bcc/tools/stackcount -K t:block:block_rq_insert
> > 
> > Attached, for new (patched) bad kernel.
> > 
> > Produced by: start the trace script (with the pendrive already
> > plugged), wait some seconds, run the test (1 trial, 1 GB), wait
> for
> > the test to finish, stop the trace.
> > 
> > The copy took ~1700 seconds.
> 
> See the two path[1][2] of inserting request, and path[1] is
> triggered
> 4358 times, and the path[2] is triggered 5763 times.
> 
> The path[2] is expected behaviour. Not sure path [1] is correct,
> given
> ext4_release_file() is supposed to be called when this inode is
> released. That means the file is closed 4358 times during 1GB file
> copying to usb storage.
> 
> Cc filesystem list.
> 
> 
> [1] insert requests when returning to user mode from syscall
> 
>   b'blk_mq_sched_request_inserted'
>   b'blk_mq_sched_request_inserted'
>   b'dd_insert_requests'
>   b'blk_mq_sched_insert_requests'
>   b'blk_mq_flush_plug_list'
>   b'blk_flush_plug_list'
>   b'io_schedule_prepare'
>   b'io_schedule'
>   b'rq_qos_wait'
>   b'wbt_wait'
>   b'__rq_qos_throttle'
>   b'blk_mq_make_request'
>   b'generic_make_request'
>   b'submit_bio'
>   b'ext4_io_submit'
>   b'ext4_writepages'
>   b'do_writepages'
>   b'__filemap_fdatawrite_range'
>   b'ext4_release_file'
>   b'__fput'
>   b'task_work_run'
>   b'exit_to_usermode_loop'
>   b'do_syscall_64'
>   b'entry_SYSCALL_64_after_hwframe'
>     4358
> 
> [2] insert requests from writeback wq context
> 
>   b'blk_mq_sched_request_inserted'
>   b'blk_mq_sched_request_inserted'
>   b'dd_insert_requests'
>   b'blk_mq_sched_insert_requests'
>   b'blk_mq_flush_plug_list'
>   b'blk_flush_plug_list'
>   b'io_schedule_prepare'
>   b'io_schedule'
>   b'rq_qos_wait'
>   b'wbt_wait'
>   b'__rq_qos_throttle'
>   b'blk_mq_make_request'
>   b'generic_make_request'
>   b'submit_bio'
>   b'ext4_io_submit'
>   b'ext4_bio_write_page'
>   b'mpage_submit_page'
>   b'mpage_process_page_bufs'
>   b'mpage_prepare_extent_to_map'
>   b'ext4_writepages'
>   b'do_writepages'
>   b'__writeback_single_inode'
>   b'writeback_sb_inodes'
>   b'__writeback_inodes_wb'
>   b'wb_writeback'
>   b'wb_workfn'
>   b'process_one_work'
>   b'worker_thread'
>   b'kthread'
>   b'ret_from_fork'
>     5763
> 
> Thanks,
> Ming
> 

Is there any update on this? Sorry if I am making noise, but I would
like to help to improve the kernel (or fix it) if I can help.
Otherwise, please let me know how to consider this case,

Thanks, and bye
Andrea


^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: AW: Slow I/O on USB media after commit f664a3cc17b7d0a2bc3b3ab96181e1029b0ec0e6
  2019-12-10  7:35                     ` Andrea Vai
@ 2019-12-10  8:05                       ` Ming Lei
  2019-12-11  2:41                         ` Theodore Y. Ts'o
  0 siblings, 1 reply; 4+ messages in thread
From: Ming Lei @ 2019-12-10  8:05 UTC (permalink / raw)
  To: Andrea Vai
  Cc: Schmid, Carsten, Finn Thain, Damien Le Moal, Alan Stern,
	Jens Axboe, Johannes Thumshirn, USB list, SCSI development list,
	Himanshu Madhani, Hannes Reinecke, Omar Sandoval,
	Martin K. Petersen, Greg KH, Hans Holmberg,
	Kernel development list, linux-ext4, linux-fsdevel,
	Theodore Ts'o

On Tue, Dec 10, 2019 at 08:35:43AM +0100, Andrea Vai wrote:
> Il giorno mar, 03/12/2019 alle 10.23 +0800, Ming Lei ha scritto:
> > On Fri, Nov 29, 2019 at 03:41:01PM +0100, Andrea Vai wrote:
> > > Il giorno ven, 29/11/2019 alle 10.35 +0800, Ming Lei ha scritto:
> > > > On Fri, Nov 29, 2019 at 08:57:34AM +0800, Ming Lei wrote:
> > > > 
> > > > > [...]
> > > > 
> > > > > Andrea, can you collect the following log when running the
> > test
> > > > > on current new(bad) kernel?
> > > > > 
> > > > > 	/usr/share/bcc/tools/stackcount  -K
> > blk_mq_make_request
> > > > 
> > > > Instead, please run the following trace, given insert may be
> > > > called from other paths, such as flush plug:
> > > > 
> > > > 	/usr/share/bcc/tools/stackcount -K t:block:block_rq_insert
> > > 
> > > Attached, for new (patched) bad kernel.
> > > 
> > > Produced by: start the trace script (with the pendrive already
> > > plugged), wait some seconds, run the test (1 trial, 1 GB), wait
> > for
> > > the test to finish, stop the trace.
> > > 
> > > The copy took ~1700 seconds.
> > 
> > See the two path[1][2] of inserting request, and path[1] is
> > triggered
> > 4358 times, and the path[2] is triggered 5763 times.
> > 
> > The path[2] is expected behaviour. Not sure path [1] is correct,
> > given
> > ext4_release_file() is supposed to be called when this inode is
> > released. That means the file is closed 4358 times during 1GB file
> > copying to usb storage.
> > 
> > Cc filesystem list.
> > 
> > 
> > [1] insert requests when returning to user mode from syscall
> > 
> >   b'blk_mq_sched_request_inserted'
> >   b'blk_mq_sched_request_inserted'
> >   b'dd_insert_requests'
> >   b'blk_mq_sched_insert_requests'
> >   b'blk_mq_flush_plug_list'
> >   b'blk_flush_plug_list'
> >   b'io_schedule_prepare'
> >   b'io_schedule'
> >   b'rq_qos_wait'
> >   b'wbt_wait'
> >   b'__rq_qos_throttle'
> >   b'blk_mq_make_request'
> >   b'generic_make_request'
> >   b'submit_bio'
> >   b'ext4_io_submit'
> >   b'ext4_writepages'
> >   b'do_writepages'
> >   b'__filemap_fdatawrite_range'
> >   b'ext4_release_file'
> >   b'__fput'
> >   b'task_work_run'
> >   b'exit_to_usermode_loop'
> >   b'do_syscall_64'
> >   b'entry_SYSCALL_64_after_hwframe'
> >     4358
> > 
> > [2] insert requests from writeback wq context
> > 
> >   b'blk_mq_sched_request_inserted'
> >   b'blk_mq_sched_request_inserted'
> >   b'dd_insert_requests'
> >   b'blk_mq_sched_insert_requests'
> >   b'blk_mq_flush_plug_list'
> >   b'blk_flush_plug_list'
> >   b'io_schedule_prepare'
> >   b'io_schedule'
> >   b'rq_qos_wait'
> >   b'wbt_wait'
> >   b'__rq_qos_throttle'
> >   b'blk_mq_make_request'
> >   b'generic_make_request'
> >   b'submit_bio'
> >   b'ext4_io_submit'
> >   b'ext4_bio_write_page'
> >   b'mpage_submit_page'
> >   b'mpage_process_page_bufs'
> >   b'mpage_prepare_extent_to_map'
> >   b'ext4_writepages'
> >   b'do_writepages'
> >   b'__writeback_single_inode'
> >   b'writeback_sb_inodes'
> >   b'__writeback_inodes_wb'
> >   b'wb_writeback'
> >   b'wb_workfn'
> >   b'process_one_work'
> >   b'worker_thread'
> >   b'kthread'
> >   b'ret_from_fork'
> >     5763
> > 
> > Thanks,
> > Ming
> > 
> 
> Is there any update on this? Sorry if I am making noise, but I would
> like to help to improve the kernel (or fix it) if I can help.
> Otherwise, please let me know how to consider this case,

IMO, the extra write path from exit_to_usermode_loop() isn't expected,
that should be the reason why write IO order is changed, then performance
drops on your USB storage.

We need our fs/ext4 experts to take a look.

Or can you reproduce the issue on xfs or btrfs?

Thanks,
Ming


^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: AW: Slow I/O on USB media after commit f664a3cc17b7d0a2bc3b3ab96181e1029b0ec0e6
  2019-12-10  8:05                       ` Ming Lei
@ 2019-12-11  2:41                         ` Theodore Y. Ts'o
  0 siblings, 0 replies; 4+ messages in thread
From: Theodore Y. Ts'o @ 2019-12-11  2:41 UTC (permalink / raw)
  To: Ming Lei
  Cc: Andrea Vai, Schmid, Carsten, Finn Thain, Damien Le Moal,
	Alan Stern, Jens Axboe, Johannes Thumshirn, USB list,
	SCSI development list, Himanshu Madhani, Hannes Reinecke,
	Omar Sandoval, Martin K. Petersen, Greg KH, Hans Holmberg,
	Kernel development list, linux-ext4, linux-fsdevel

On Tue, Dec 10, 2019 at 04:05:50PM +0800, Ming Lei wrote:
> > > The path[2] is expected behaviour. Not sure path [1] is correct,
> > > given
> > > ext4_release_file() is supposed to be called when this inode is
> > > released. That means the file is closed 4358 times during 1GB file
> > > copying to usb storage.
> > > 
> > > [1] insert requests when returning to user mode from syscall
> > > 
> > >   b'blk_mq_sched_request_inserted'
> > >   b'blk_mq_sched_request_inserted'
> > >   b'dd_insert_requests'
> > >   b'blk_mq_sched_insert_requests'
> > >   b'blk_mq_flush_plug_list'
> > >   b'blk_flush_plug_list'
> > >   b'io_schedule_prepare'
> > >   b'io_schedule'
> > >   b'rq_qos_wait'
> > >   b'wbt_wait'
> > >   b'__rq_qos_throttle'
> > >   b'blk_mq_make_request'
> > >   b'generic_make_request'
> > >   b'submit_bio'
> > >   b'ext4_io_submit'
> > >   b'ext4_writepages'
> > >   b'do_writepages'
> > >   b'__filemap_fdatawrite_range'
> > >   b'ext4_release_file'
> > >   b'__fput'
> > >   b'task_work_run'
> > >   b'exit_to_usermode_loop'
> > >   b'do_syscall_64'
> > >   b'entry_SYSCALL_64_after_hwframe'
> > >     4358

I'm guessing that your workload is repeatedly truncating a file (or
calling open with O_TRUNC) and then writing data to it.  When you do
this, then when the file is closed, we assume that since you were
replacing the previous contents of a file with new contents, that you
would be unhappy if the file contents was replaced by a zero length
file after a crash.  That's because ten years, ago there were a *huge*
number of crappy applications that would replace a file by reading it
into memory, truncating it, and then write out the new contents of the
file.  This could be a high score file for a game, or a KDE or GNOME
state file, etc.

So if someone does open, truncate, write, close, we still immediately
writing out the data on the close, assuming that the programmer really
wanted open, truncate, write, fsync, close, but was too careless to
actually do the right thing.

Some workaround[1] like this is done by all of the major file systems,
and was fallout the agreement from the "O_PONIES"[2] controversy.
This was discussed and agreed to at the 2009 LSF/MM workshop.  (See
the "rename, fsync, and ponies" section.)

[1] https://bugs.launchpad.net/ubuntu/+source/linux/+bug/317781/comments/45
[2] https://blahg.josefsipek.net/?p=364
[3] https://lwn.net/Articles/327601/

So if you're seeing a call to filemap_fdatawrite_range as the result
of a fput, that's why.

In any case, this behavior has been around for a decade, and it
appears to be incidental to your performance difficulties with your
USB thumbdrive and block-mq.

						- Ted

^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, back to index

Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
     [not found] <20191126023253.GA24501@ming.t460p>
     [not found] ` <0598fe2754bf0717d81f7e72d3e9b3230c608cc6.camel@unipv.it>
     [not found]   ` <alpine.LNX.2.21.1.1911271055200.8@nippy.intranet>
     [not found]     ` <cb6e84781c4542229a3f31572cef19ab@SVR-IES-MBX-03.mgc.mentorg.com>
     [not found]       ` <c1358b840b3a4971aa35a25d8495c2c8953403ea.camel@unipv.it>
     [not found]         ` <20191128091712.GD15549@ming.t460p>
     [not found]           ` <f82fd5129e3dcacae703a689be60b20a7fedadf6.camel@unipv.it>
     [not found]             ` <20191129005734.GB1829@ming.t460p>
     [not found]               ` <20191129023555.GA8620@ming.t460p>
     [not found]                 ` <320b315b9c87543d4fb919ecbdf841596c8fbcea.camel@unipv.it>
2019-12-03  2:23                   ` AW: Slow I/O on USB media after commit f664a3cc17b7d0a2bc3b3ab96181e1029b0ec0e6 Ming Lei
2019-12-10  7:35                     ` Andrea Vai
2019-12-10  8:05                       ` Ming Lei
2019-12-11  2:41                         ` Theodore Y. Ts'o

Linux-ext4 Archive on lore.kernel.org

Archives are clonable:
	git clone --mirror https://lore.kernel.org/linux-ext4/0 linux-ext4/git/0.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 linux-ext4 linux-ext4/ https://lore.kernel.org/linux-ext4 \
		linux-ext4@vger.kernel.org
	public-inbox-index linux-ext4

Example config snippet for mirrors

Newsgroup available over NNTP:
	nntp://nntp.lore.kernel.org/org.kernel.vger.linux-ext4


AGPL code for this site: git clone https://public-inbox.org/public-inbox.git