All of lore.kernel.org
 help / color / mirror / Atom feed
From: Matthew Wilcox <willy@infradead.org>
To: Dave Chinner <david@fromorbit.com>
Cc: "Darrick J. Wong" <darrick.wong@oracle.com>,
	Zhengyuan Liu <liuzhengyuang521@gmail.com>,
	linux-xfs@vger.kernel.org,
	Zhengyuan Liu <liuzhengyuan@kylinos.cn>,
	Christoph Hellwig <hch@infradead.org>
Subject: Re: [Question] About XFS random buffer write performance
Date: Fri, 31 Jul 2020 00:45:17 +0100	[thread overview]
Message-ID: <20200730234517.GM23808@casper.infradead.org> (raw)
In-Reply-To: <20200730220857.GD2005@dread.disaster.area>

On Fri, Jul 31, 2020 at 08:08:57AM +1000, Dave Chinner wrote:
> On Thu, Jul 30, 2020 at 02:50:40PM +0100, Matthew Wilcox wrote:
> > On Thu, Jul 30, 2020 at 09:05:03AM +1000, Dave Chinner wrote:
> > > On Wed, Jul 29, 2020 at 07:50:35PM +0100, Matthew Wilcox wrote:
> > > > I had a bit of a misunderstanding.  Let's discard that proposal
> > > > and discuss what we want to optimise for, ignoring THPs.  We don't
> > > > need to track any per-block state, of course.  We could implement
> > > > __iomap_write_begin() by reading in the entire page (skipping the last
> > > > few blocks if they lie outside i_size, of course) and then marking the
> > > > entire page Uptodate.
> > > 
> > > __iomap_write_begin() already does read-around for sub-page writes.
> > > And, if necessary, it does zeroing of unwritten extents, newly
> > > allocated ranges and ranges beyond EOF and marks them uptodate
> > > appropriately.
> > 
> > But it doesn't read in the entire page, just the blocks in the page which
> > will be touched by the write.
> 
> Ah, you are right, I got my page/offset macros mixed up.
> 
> In which case, you just identified why the uptodate array is
> necessary and can't be removed. If we do a sub-page write() the page
> is not fully initialised, and so if we then mmap it readpage needs
> to know what part of the page requires initialisation to bring the
> page uptodate before it is exposed to userspace.

You snipped the part of my mail where I explained how we could handle
that without the uptodate array ;-(  Essentially, we do as you thought
it worked, we read the entire page (or at least the portion of it that
isn't going to be overwritten.  Once all the bytes have been transferred,
we can mark the page Uptodate.  We'll need to wait for the transfer to
happen if the write overlaps a block boundary, but we do that right now.

> But that also means the behaviour of the 4kB write on 64kB page size
> benchmark is unexplained, because that should only be marking the
> written pages of the page up to date, and so it should be behaving
> exactly like ext4 and only writing back individual uptodate chunks
> on the dirty page....

That benchmark started by zeroing the entire page cache, so all blocks
were marked Uptodate, so we wouldn't skip them on writeout.


  reply	other threads:[~2020-07-30 23:45 UTC|newest]

Thread overview: 18+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-07-28 11:34 [Question] About XFS random buffer write performance Zhengyuan Liu
2020-07-28 15:34 ` Darrick J. Wong
2020-07-28 15:47   ` Matthew Wilcox
2020-07-29  1:54     ` Dave Chinner
2020-07-29  2:12       ` Matthew Wilcox
2020-07-29  5:19         ` Dave Chinner
2020-07-29 18:50           ` Matthew Wilcox
2020-07-29 23:05             ` Dave Chinner
2020-07-30 13:50               ` Matthew Wilcox
2020-07-30 22:08                 ` Dave Chinner
2020-07-30 23:45                   ` Matthew Wilcox [this message]
2020-07-31  2:05                     ` Dave Chinner
2020-07-31  2:37                       ` Matthew Wilcox
2020-07-31 20:47                     ` Matthew Wilcox
2020-07-31 22:13                       ` Dave Chinner
2020-08-21  2:39                         ` Zhengyuan Liu
2020-07-31  6:55                   ` Christoph Hellwig
2020-07-29 13:02       ` Zhengyuan Liu

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20200730234517.GM23808@casper.infradead.org \
    --to=willy@infradead.org \
    --cc=darrick.wong@oracle.com \
    --cc=david@fromorbit.com \
    --cc=hch@infradead.org \
    --cc=linux-xfs@vger.kernel.org \
    --cc=liuzhengyuan@kylinos.cn \
    --cc=liuzhengyuang521@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.