linux-xfs.vger.kernel.org archive mirror
From: Matthew Wilcox <willy@infradead.org>
To: "Darrick J. Wong" <darrick.wong@oracle.com>
Cc: Zhengyuan Liu <liuzhengyuang521@gmail.com>,
	linux-xfs@vger.kernel.org,
	Zhengyuan Liu <liuzhengyuan@kylinos.cn>,
	Christoph Hellwig <hch@infradead.org>,
	Dave Chinner <david@fromorbit.com>
Subject: Re: [Question] About XFS random buffer write performance
Date: Tue, 28 Jul 2020 16:47:53 +0100	[thread overview]
Message-ID: <20200728154753.GS23808@casper.infradead.org> (raw)
In-Reply-To: <20200728153453.GC3151642@magnolia>

On Tue, Jul 28, 2020 at 08:34:53AM -0700, Darrick J. Wong wrote:
> On Tue, Jul 28, 2020 at 07:34:39PM +0800, Zhengyuan Liu wrote:
> > Hi all,
> > 
> > When doing random buffer write testing I found the bandwidth on EXT4 is much
> > better than XFS under the same environment.
> > The test case, test result, and test environment are as follows:
> > Test case:
> > fio --ioengine=sync --rw=randwrite --iodepth=64 --size=4G --name=test
> > --filename=/mnt/testfile --bs=4k
> > Before doing fio, use dd (if=/dev/zero of=/mnt/testfile bs=1M
> > count=4096) to warm-up the file in the page cache.
> > 
> > Test result (bandwidth):
> >         ext4        xfs
> >       ~300MB/s    ~120MB/s
> > 
> > Test environment:
> >     Platform:  arm64
> >     Kernel:  v5.7
> >     PAGESIZE:  64K
> >     Memtotal:  16G
> >     Storage: sata ssd(Max bandwidth about 350MB/s)
> >     FS block size: 4K
> > 
> > The fio "Test result" shows that EXT4 has more than 2x the bandwidth of
> > XFS, but iostat shows XFS is also transferring about 300MB/s to the SSD.
> > So I suspect XFS is writing back many non-dirty blocks along with the
> > dirty pages. I read the core writeback code of both filesystems and
> > found that XFS writes back blocks which are uptodate (see
> > iomap_writepage_map()),
> 
> Ahhh, right, because iomap tracks uptodate separately for each block in
> the page, but only tracks dirty status for the whole page.  Hence if you
> dirty one byte in the 64k page, xfs will write all 64k even though we
> could get away writing 4k like ext4 does.
> 
> Hey Christoph & Matthew: If you're already thinking about changing
> struct iomap_page, should we add the ability to track per-block dirty
> state to reduce the write amplification that Zhengyuan is asking about?
> 
> I'm guessing that between willy's THP series, Dave's iomap chunks
> series, and whatever Christoph may or may not be writing, at least one
> of you might have already implemented this? :)

Well, this is good timing!  I was wondering whether something along
these lines was an important use-case.

I propose we do away with the 'uptodate' bit-array and replace it with a
'writeback' bit-array.  We set the page uptodate bit once the reads to
fill the page have completed, rather than checking the 'writeback' array.
In page_mkwrite, we fill the writeback bit-array on the grounds that we
have no way to track a block's non-dirtiness and we don't want to scan
each block at writeback time to see whether it has been written to.

I'll do this now before the THP series gets reposted.


Thread overview: 18+ messages
2020-07-28 11:34 [Question] About XFS random buffer write performance Zhengyuan Liu
2020-07-28 15:34 ` Darrick J. Wong
2020-07-28 15:47   ` Matthew Wilcox [this message]
2020-07-29  1:54     ` Dave Chinner
2020-07-29  2:12       ` Matthew Wilcox
2020-07-29  5:19         ` Dave Chinner
2020-07-29 18:50           ` Matthew Wilcox
2020-07-29 23:05             ` Dave Chinner
2020-07-30 13:50               ` Matthew Wilcox
2020-07-30 22:08                 ` Dave Chinner
2020-07-30 23:45                   ` Matthew Wilcox
2020-07-31  2:05                     ` Dave Chinner
2020-07-31  2:37                       ` Matthew Wilcox
2020-07-31 20:47                     ` Matthew Wilcox
2020-07-31 22:13                       ` Dave Chinner
2020-08-21  2:39                         ` Zhengyuan Liu
2020-07-31  6:55                   ` Christoph Hellwig
2020-07-29 13:02       ` Zhengyuan Liu
