linux-usb.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Ming Lei <ming.lei@redhat.com>
To: Andrea Vai <andrea.vai@unipv.it>
Cc: "Theodore Y. Ts'o" <tytso@mit.edu>,
	"Schmid, Carsten" <Carsten_Schmid@mentor.com>,
	Finn Thain <fthain@telegraphics.com.au>,
	Damien Le Moal <Damien.LeMoal@wdc.com>,
	Alan Stern <stern@rowland.harvard.edu>,
	Jens Axboe <axboe@kernel.dk>,
	Johannes Thumshirn <jthumshirn@suse.de>,
	USB list <linux-usb@vger.kernel.org>,
	SCSI development list <linux-scsi@vger.kernel.org>,
	Himanshu Madhani <himanshu.madhani@cavium.com>,
	Hannes Reinecke <hare@suse.com>, Omar Sandoval <osandov@fb.com>,
	"Martin K. Petersen" <martin.petersen@oracle.com>,
	Greg KH <gregkh@linuxfoundation.org>,
	Hans Holmberg <Hans.Holmberg@wdc.com>,
	Kernel development list <linux-kernel@vger.kernel.org>,
	linux-ext4@vger.kernel.org, linux-fsdevel@vger.kernel.org
Subject: Re: AW: Slow I/O on USB media after commit f664a3cc17b7d0a2bc3b3ab96181e1029b0ec0e6
Date: Tue, 24 Dec 2019 16:47:21 +0800	[thread overview]
Message-ID: <20191224084721.GA27248@ming.t460p> (raw)
In-Reply-To: <28e0ca9257a834c04221d083e7024a0155744835.camel@unipv.it>

On Tue, Dec 24, 2019 at 09:04:10AM +0100, Andrea Vai wrote:
> Il giorno mar, 24/12/2019 alle 09.32 +0800, Ming Lei ha scritto:
> > On Mon, Dec 23, 2019 at 03:02:35PM +0100, Andrea Vai wrote:
> > > Il giorno lun, 23/12/2019 alle 21.08 +0800, Ming Lei ha scritto:
> > > > On Mon, Dec 23, 2019 at 12:22:45PM +0100, Andrea Vai wrote:
> > > > > Il giorno mer, 18/12/2019 alle 17.48 +0800, Ming Lei ha
> > scritto:
> > > > > > On Wed, Dec 18, 2019 at 09:25:02AM +0100, Andrea Vai wrote:
> > > > > > > Il giorno gio, 12/12/2019 alle 05.33 +0800, Ming Lei ha
> > > > scritto:
> > > > > > > > On Wed, Dec 11, 2019 at 11:07:45AM -0500, Theodore Y.
> > Ts'o
> > > > > > wrote:
> > > > > > > > > On Wed, Dec 11, 2019 at 12:00:58PM +0800, Ming Lei
> > wrote:
> > > > > > > > > > I didn't reproduce the issue in my test environment,
> > and
> > > > > > follows
> > > > > > > > > > Andrea's test commands[1]:
> > > > > > > > > > 
> > > > > > > > > >   mount UUID=$uuid /mnt/pendrive 2>&1 |tee -a
> > $logfile
> > > > > > > > > >   SECONDS=0
> > > > > > > > > >   cp $testfile /mnt/pendrive 2>&1 |tee -a $logfile
> > > > > > > > > >   umount /mnt/pendrive 2>&1 |tee -a $logfile
> > > > > > > > > > 
> > > > > > > > > > The 'cp' command supposes to open/close the file
> > just
> > > > once,
> > > > > > > > however
> > > > > > > > > > ext4_release_file() & write pages is observed to run
> > for
> > > > > > 4358
> > > > > > > > times
> > > > > > > > > > when executing the above 'cp' test.
> > > > > > > > > 
> > > > > > > > > Why are we sure the ext4_release_file() / _fput() is
> > > > coming
> > > > > > from
> > > > > > > > the
> > > > > > > > > cp command, as opposed to something else that might be
> > > > running
> > > > > > on
> > > > > > > > the
> > > > > > > > > system under test?  _fput() is called by the kernel
> > when
> > > > the
> > > > > > last
> > > > > > > > 
> > > > > > > > Please see the log:
> > > > > > > > 
> > > > > > > > 
> > > > > > 
> > > > 
> > https://lore.kernel.org/linux-scsi/3af3666920e7d46f8f0c6d88612f143ffabc743c.camel@unipv.it/2-log_ming.zip
> > > > > > > > 
> > > > > > > > Which is collected by:
> > > > > > > > 
> > > > > > > > #!/bin/sh
> > > > > > > > MAJ=$1
> > > > > > > > MIN=$2
> > > > > > > > MAJ=$(( $MAJ << 20 ))
> > > > > > > > DEV=$(( $MAJ | $MIN ))
> > > > > > > > 
> > > > > > > > /usr/share/bcc/tools/trace -t -C \
> > > > > > > >     't:block:block_rq_issue (args->dev == '$DEV') "%s %d
> > > > %d",
> > > > > > args-
> > > > > > > > >rwbs, args->sector, args->nr_sector' \
> > > > > > > >     't:block:block_rq_insert (args->dev == '$DEV') "%s
> > %d
> > > > %d",
> > > > > > args-
> > > > > > > > >rwbs, args->sector, args->nr_sector'
> > > > > > > > 
> > > > > > > > $MAJ:$MIN points to the USB storage disk.
> > > > > > > > 
> > > > > > > > From the above IO trace, there are two write paths, one
> > is
> > > > from
> > > > > > cp,
> > > > > > > > another is from writeback wq.
> > > > > > > > 
> > > > > > > > The stackcount trace[1] is consistent with the IO trace
> > log
> > > > > > since it
> > > > > > > > only shows two IO paths, that is why I concluded that
> > the
> > > > write
> > > > > > done
> > > > > > > > via
> > > > > > > > ext4_release_file() is from 'cp'.
> > > > > > > > 
> > > > > > > > [1] 
> > > > > > > > 
> > > > > > 
> > > > 
> > https://lore.kernel.org/linux-scsi/320b315b9c87543d4fb919ecbdf841596c8fbcea.camel@unipv.it/2-log_ming_20191129_150609.zip
> > > > > > > > 
> > > > > > > > > reference to a struct file is
> > released.  (Specifically, if
> > > > you
> > > > > > > > have a
> > > > > > > > > fd which is dup'ed, it's only when the last fd
> > > > corresponding
> > > > > > to
> > > > > > > > the
> > > > > > > > > struct file is closed, and the struct file is about to
> > be
> > > > > > > > released,
> > > > > > > > > does the file system's f_ops->release function get
> > > > called.)
> > > > > > > > > 
> > > > > > > > > So the first question I'd ask is whether there is
> > anything
> > > > > > else
> > > > > > > > going
> > > > > > > > > on the system, and whether the writes are happening to
> > the
> > > > USB
> > > > > > > > thumb
> > > > > > > > > drive, or to some other storage device.  And if there
> > is
> > > > > > something
> > > > > > > > > else which is writing to the pendrive, maybe that's
> > why no
> > > > one
> > > > > > > > else
> > > > > > > > > has been able to reproduce the OP's complaint....
> > > > > > > > 
> > > > > > > > OK, we can ask Andrea to confirm that via the following
> > > > trace,
> > > > > > which
> > > > > > > > will add pid/comm info in the stack trace:
> > > > > > > > 
> > > > > > > > /usr/share/bcc/tools/stackcount
> > > > blk_mq_sched_request_inserted
> > > > > > > > 
> > > > > > > > Andrew, could you collect the above log again when
> > running
> > > > > > new/bad
> > > > > > > > kernel for confirming if the write done by
> > > > ext4_release_file()
> > > > > > is
> > > > > > > > from
> > > > > > > > the 'cp' process?
> > > > > > > 
> > > > > > > You can find the stackcount log attached. It has been
> > produced
> > > > by:
> > > > > > > 
> > > > > > > - /usr/share/bcc/tools/stackcount
> > > > blk_mq_sched_request_inserted >
> > > > > > trace.log
> > > > > > > - wait some seconds
> > > > > > > - run the test (1 copy trial), wait for the test to
> > finish,
> > > > wait
> > > > > > some seconds
> > > > > > > - stop the trace (ctrl+C)
> > > > > > 
> > > > > > Thanks for collecting the log, looks your 'stackcount'
> > doesn't
> > > > > > include
> > > > > > comm/pid info, seems there is difference between your bcc
> > and
> > > > > > my bcc in fedora 30.
> > > > > > 
> > > > > > Could you collect above log again via the following command?
> > > > > > 
> > > > > > /usr/share/bcc/tools/stackcount -P -K
> > t:block:block_rq_insert
> > > > > > 
> > > > > > which will show the comm/pid info.
> > > > > 
> > > > > ok, attached (trace_20191219.txt), the test (1 trial) took
> > 3684
> > > > > seconds.
> > > > 
> > > > From the above trace:
> > > > 
> > > >   b'blk_mq_sched_request_inserted'
> > > >   b'blk_mq_sched_request_inserted'
> > > >   b'dd_insert_requests'
> > > >   b'blk_mq_sched_insert_requests'
> > > >   b'blk_mq_flush_plug_list'
> > > >   b'blk_flush_plug_list'
> > > >   b'io_schedule_prepare'
> > > >   b'io_schedule'
> > > >   b'rq_qos_wait'
> > > >   b'wbt_wait'
> > > >   b'__rq_qos_throttle'
> > > >   b'blk_mq_make_request'
> > > >   b'generic_make_request'
> > > >   b'submit_bio'
> > > >   b'ext4_io_submit'
> > > >   b'ext4_writepages'
> > > >   b'do_writepages'
> > > >   b'__filemap_fdatawrite_range'
> > > >   b'ext4_release_file'
> > > >   b'__fput'
> > > >   b'task_work_run'
> > > >   b'exit_to_usermode_loop'
> > > >   b'do_syscall_64'
> > > >   b'entry_SYSCALL_64_after_hwframe'
> > > >     b'cp' [19863]
> > > >     4400
> > > > 
> > > > So this write is clearly from 'cp' process, and it should be one
> > > > ext4 fs issue.
> > > > 
> > > > Ted, can you take a look at this issue?
> > > > 
> > > > > 
> > > > > > > I also tried the usual test with btrfs and xfs. Btrfs
> > behavior
> > > > > > looks
> > > > > > > "good". xfs seems sometimes better, sometimes worse, I
> > would
> > > > say.
> > > > > > I
> > > > > > > don't know if it matters, anyway you can also find the
> > results
> > > > of
> > > > > > the
> > > > > > > two tests (100 trials each). Basically, btrfs is always
> > > > between 68
> > > > > > and
> > > > > > > 89 seconds, with a cyclicity (?) with "period=2 trials".
> > xfs
> > > > looks
> > > > > > > almost always very good (63-65s), but sometimes "bad"
> > (>300s).
> > > > > > 
> > > > > > If you are interested in digging into this one, the
> > following
> > > > trace
> > > > > > should be helpful:
> > > > > > 
> > > > > > 
> > > > 
> > https://lore.kernel.org/linux-scsi/f38db337cf26390f7c7488a0bc2076633737775b.camel@unipv.it/T/#m5aa008626e07913172ad40e1eb8e5f2ffd560fc6
> > > > > > 
> > > > > 
> > > > > Attached:
> > > > > - trace_xfs_20191223.txt (7 trials, then aborted while doing
> > the
> > > > 8th),
> > > > > times to complete:
> > > > > 64s
> > > > > 63s
> > > > > 64s
> > > > > 833s
> > > > > 1105s
> > > > > 63s
> > > > > 64s
> > > > 
> > > > oops, looks we have to collect io insert trace with the
> > following
> > > > bcc script
> > > > on xfs for confirming if there is similar issue with ext4, could
> > you
> > > > run
> > > > it again on xfs? And only post the trace done in case of slow
> > 'cp'.
> > > > 
> > > > 
> > > > #!/bin/sh
> > > > 
> > > > MAJ=$1
> > > > MIN=$2
> > > > MAJ=$(( $MAJ << 20 ))
> > > > DEV=$(( $MAJ | $MIN ))
> > > > 
> > > > /usr/share/bcc/tools/trace -t -C \
> > > >     't:block:block_rq_issue (args->dev == '$DEV') "%s %d %d",
> > args-
> > > > >rwbs, args->sector, args->nr_sector' \
> > > >     't:block:block_rq_insert (args->dev == '$DEV') "%s %d %d",
> > args-
> > > > >rwbs, args->sector, args->nr_sector'
> > > > 
> > > > 
> > > here it is (1 trial, 313 seconds to finish)
> > 
> > The above log shows similar issue with ext4 since there is another
> > writeback IO path from 'cp' process. And the following trace can
> > show if
> > it is same with ext4's issue:
> > 
> > /usr/share/bcc/tools/stackcount -P -K t:block:block_rq_insert
> 
> sorry, also here please tell me which conditions should I use to run
> the test (ext4 or xfs? slow run or not important?)

Maybe not needed.

After thinking the issue further, looks it is highly related with
removing ioc_batching and BDI congestion by blk-mq.

When there are two writeback paths, the original block layer(legacy)
can set 'cp' process which writes pages during close() as 'batching',
then write pages from writeback wq context is blocked. That said there
is actually single writeback IO path even though two are writing pages,
so write order can be maintained, see the following comment in original
__get_request():

	/*
	 * The queue will fill after this allocation, so set
	 * it as full, and mark this process as "batching".
	 * This process will be allowed to complete a batch of
	 * requests, others will be blocked.
	 */

This behavior can be shown in the IO trace done in old kernel with
legacy block IO path:

https://lore.kernel.org/linux-scsi/f82fd5129e3dcacae703a689be60b20a7fedadf6.camel@unipv.it/2-log_ming_20191128_182751.zip

IMO, we need to figure out one solution in blk-mq to fix this issue
since HDD. performance might be hurt under this situation.

Thanks,
Ming


  reply	other threads:[~2019-12-24  8:47 UTC|newest]

Thread overview: 102+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
     [not found] <307581a490b610c3025ee80f79a465a89d68ed19.camel@unipv.it>
2019-08-20 17:13 ` Slow I/O on USB media after commit f664a3cc17b7d0a2bc3b3ab96181e1029b0ec0e6 Alan Stern
2019-08-23 10:39   ` Andrea Vai
2019-08-23 20:42     ` Alan Stern
2019-08-26  6:09       ` Andrea Vai
2019-08-26 16:33         ` Alan Stern
2019-09-18 15:25           ` Andrea Vai
2019-09-18 16:30             ` Alan Stern
2019-09-19  7:33               ` Andrea Vai
2019-09-19 17:54                 ` Alan Stern
2019-09-20  7:25                   ` Andrea Vai
2019-09-20  7:44                     ` Greg KH
2019-09-19  8:26               ` Damien Le Moal
2019-09-19  8:55                 ` Ming Lei
2019-09-19  9:09                   ` Damien Le Moal
2019-09-19  9:21                     ` Ming Lei
2019-09-19 14:01                 ` Alan Stern
2019-09-19 14:14                   ` Damien Le Moal
2019-09-20  7:03                     ` Andrea Vai
2019-09-25 19:30                       ` Alan Stern
2019-09-25 19:36                         ` Jens Axboe
2019-09-27 15:47                           ` Andrea Vai
2019-11-04 16:00                             ` Andrea Vai
2019-11-04 18:20                               ` Alan Stern
2019-11-05 11:48                                 ` Andrea Vai
2019-11-05 18:31                                   ` Alan Stern
2019-11-05 23:29                                     ` Jens Axboe
2019-11-06 16:03                                       ` Alan Stern
2019-11-06 22:13                                         ` Damien Le Moal
2019-11-07  7:04                                           ` Andrea Vai
2019-11-07  7:54                                             ` Damien Le Moal
2019-11-07 18:59                                               ` Andrea Vai
2019-11-08  8:42                                                 ` Damien Le Moal
2019-11-08 14:33                                                   ` Jens Axboe
2019-11-11 10:46                                                     ` Andrea Vai
2019-11-09 10:09                                                   ` Ming Lei
2019-11-09 22:28                                                 ` Ming Lei
2019-11-11 10:50                                                   ` Andrea Vai
2019-11-11 11:05                                                     ` Ming Lei
2019-11-11 11:13                                                       ` Andrea Vai
2019-11-22 19:16                                                   ` Andrea Vai
2019-11-23  7:28                                                     ` Ming Lei
2019-11-23 15:44                                                       ` Andrea Vai
2019-11-25  3:54                                                         ` Ming Lei
2019-11-25 10:11                                                           ` Andrea Vai
2019-11-25 10:29                                                             ` Ming Lei
2019-11-25 14:58                                                               ` Andrea Vai
2019-11-25 15:15                                                                 ` Ming Lei
2019-11-25 18:51                                                                   ` Andrea Vai
2019-11-26  2:32                                                                     ` Ming Lei
2019-11-26  7:46                                                                       ` Andrea Vai
2019-11-26  9:15                                                                         ` Ming Lei
2019-11-26 10:24                                                                           ` Ming Lei
2019-11-26 11:14                                                                           ` Andrea Vai
2019-11-27  2:05                                                                             ` Ming Lei
2019-11-27  9:39                                                                               ` Andrea Vai
2019-11-27 13:08                                                                                 ` Ming Lei
2019-11-27 15:01                                                                                   ` Andrea Vai
2019-11-27  0:21                                                                         ` Finn Thain
2019-11-27  8:14                                                                           ` AW: " Schmid, Carsten
2019-11-27 21:49                                                                             ` Finn Thain
2019-11-28  7:46                                                                             ` Andrea Vai
2019-11-28  8:12                                                                               ` AW: " Schmid, Carsten
2019-11-28 11:40                                                                                 ` Andrea Vai
2019-11-28 17:39                                                                                 ` Alan Stern
2019-11-28  9:17                                                                               ` Ming Lei
2019-11-28 17:34                                                                                 ` Andrea Vai
2019-11-29  0:57                                                                                   ` Ming Lei
2019-11-29  2:35                                                                                     ` Ming Lei
2019-11-29 14:41                                                                                       ` Andrea Vai
2019-12-03  2:23                                                                                         ` Ming Lei
2019-12-10  7:35                                                                                           ` Andrea Vai
2019-12-10  8:05                                                                                             ` Ming Lei
2019-12-11  2:41                                                                                               ` Theodore Y. Ts'o
2019-12-11  4:00                                                                                                 ` Ming Lei
2019-12-11 16:07                                                                                                   ` Theodore Y. Ts'o
2019-12-11 21:33                                                                                                     ` Ming Lei
2019-12-12  7:34                                                                                                       ` Andrea Vai
2019-12-18  8:25                                                                                                       ` Andrea Vai
2019-12-18  9:48                                                                                                         ` Ming Lei
     [not found]                                                                                                           ` <b1b6a0e9d690ecd9432025acd2db4ac09f834040.camel@unipv.it>
2019-12-23 13:08                                                                                                             ` Ming Lei
2019-12-23 14:02                                                                                                               ` Andrea Vai
2019-12-24  1:32                                                                                                                 ` Ming Lei
2019-12-24  8:04                                                                                                                   ` Andrea Vai
2019-12-24  8:47                                                                                                                     ` Ming Lei [this message]
2019-12-23 16:26                                                                                                               ` Theodore Y. Ts'o
2019-12-23 16:29                                                                                                                 ` Andrea Vai
2019-12-23 17:22                                                                                                                   ` Theodore Y. Ts'o
2019-12-23 18:45                                                                                                                     ` Andrea Vai
2019-12-23 19:53                                                                                                                       ` Theodore Y. Ts'o
2019-12-24  1:27                                                                                                                         ` Ming Lei
2019-12-24  6:49                                                                                                                           ` Andrea Vai
2019-12-24  8:51                                                                                                                           ` Andrea Vai
2019-12-24  9:35                                                                                                                             ` Ming Lei
2019-12-25  5:17                                                                                                                           ` Theodore Y. Ts'o
2019-12-26  2:27                                                                                                                             ` Ming Lei
2019-12-26  3:30                                                                                                                               ` Theodore Y. Ts'o
2019-12-26  8:37                                                                                                                                 ` Ming Lei
2020-01-07  7:51                                                                                                                                   ` Andrea Vai
     [not found]                                                                                                                                 ` <20200101074310.10904-1-hdanton@sina.com>
2020-01-01 13:53                                                                                                                                   ` slow IO on USB media Ming Lei
2019-11-29 11:44                                                                                     ` AW: Slow I/O on USB media after commit f664a3cc17b7d0a2bc3b3ab96181e1029b0ec0e6 Bernd Schubert
2019-12-02  7:01                                                                                       ` Andrea Vai
2019-11-28 17:10                                                                           ` Andrea Vai

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20191224084721.GA27248@ming.t460p \
    --to=ming.lei@redhat.com \
    --cc=Carsten_Schmid@mentor.com \
    --cc=Damien.LeMoal@wdc.com \
    --cc=Hans.Holmberg@wdc.com \
    --cc=andrea.vai@unipv.it \
    --cc=axboe@kernel.dk \
    --cc=fthain@telegraphics.com.au \
    --cc=gregkh@linuxfoundation.org \
    --cc=hare@suse.com \
    --cc=himanshu.madhani@cavium.com \
    --cc=jthumshirn@suse.de \
    --cc=linux-ext4@vger.kernel.org \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-scsi@vger.kernel.org \
    --cc=linux-usb@vger.kernel.org \
    --cc=martin.petersen@oracle.com \
    --cc=osandov@fb.com \
    --cc=stern@rowland.harvard.edu \
    --cc=tytso@mit.edu \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).