linux-fsdevel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Dave Chinner <david@fromorbit.com>
To: "Darrick J. Wong" <darrick.wong@oracle.com>
Cc: Christoph Hellwig <hch@lst.de>,
	Damien Le Moal <Damien.LeMoal@wdc.com>,
	Andreas Gruenbacher <agruenba@redhat.com>,
	linux-xfs@vger.kernel.org, linux-fsdevel@vger.kernel.org,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCH 07/12] xfs: don't preallocate a transaction for file size updates
Date: Tue, 25 Jun 2019 09:15:23 +1000	[thread overview]
Message-ID: <20190624231523.GC7777@dread.disaster.area> (raw)
In-Reply-To: <20190624161720.GQ5387@magnolia>

On Mon, Jun 24, 2019 at 09:17:20AM -0700, Darrick J. Wong wrote:
> On Mon, Jun 24, 2019 at 07:52:48AM +0200, Christoph Hellwig wrote:
> > We have historically decided that we want to preallocate the xfs_trans
> > structure at writeback time so that we don't have to allocate on in
> > the I/O completion handler.  But we treat unwrittent extent and COW
> > fork conversions different already, which proves that the transaction
> > allocations in the end I/O handler are not a problem.  Removing the
> > preallocation gets rid of a lot of corner case code, and also ensures
> > we only allocate one and log a transaction when actually required,
> > as the ioend merging can reduce the number of actual i_size updates
> > significantly.
> 
> That's what I thought when I wrote the ioend merging patches, but IIRC
> Dave objected on the grounds that most file writes are trivial file
> extending writes and therefore we should leave this alone to avoid
> slowing down the ioend path even if it came at a cost of cancelling a
> lot of empty transactions.

The issue is stuff like extracting a tarball, where we might write a
hundred thousand files and they are all written in a single IO. i.e.
there is no IO completion merging at all.

> I wasn't 100% convinced it mattered but ran out of time in the
> development window and never got around to researching if it made any
> difference.

Yeah, it's not all that simple :/

In these cases, we always have to allocate a transaction for every
file being written. If we do it before we submit the IO, then all
the transactions are allocated from the single writeback context. If
we don't have log space, data writeback pauses while the tail of the
AIL is pushed, metadata writeback occurs, and then the transaction
allocation for data writeback is woken, and data writeback
submission continues. It's fairly orderly, and we don't end up
trying to write back data while we are doing bulk metadata flushing
from the AIL.

If we delay the transaction allocation to the ioend context and we
are low on log space, we end up blocking a kworker on a transaction
allocation which then has to wait for metadata writeback. The
kworker infrastructure will then issue the next ioend work, which
then blocks on transaction allocation. Delayed allocation can cause
thousands of small file IOs to be inflight at the same time due to
intra-file contiguous allocation and IO merging in the block layer,
hence we can have thousands of individual IO completions that
require transaction allocation to be completed.

> So, uh, how much of a hit do we take for having to allocate a
> transaction for a file size extension?  Particularly since we can
> combine those things now?

Unless we are out of log space, the transaction allocation and free
should be largely uncontended and so it's just a small amount of CPU
usage. i.e it's a slab allocation/free and then lockless space
reservation/free. If we are out of log space, then we sleep waiting
for space - the issue really comes down to where it is better to
sleep in that case....

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com

  reply	other threads:[~2019-06-24 23:16 UTC|newest]

Thread overview: 52+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-06-24  5:52 lift the xfs writepage code into iomap Christoph Hellwig
2019-06-24  5:52 ` [PATCH 01/12] list.h: add a list_pop helper Christoph Hellwig
2019-06-24 14:53   ` Darrick J. Wong
2019-06-24 15:51   ` Matthew Wilcox
2019-06-25 10:06     ` Christoph Hellwig
2019-06-24  5:52 ` [PATCH 02/12] xfs: simplify xfs_chain_bio Christoph Hellwig
2019-06-24 15:14   ` Darrick J. Wong
2019-06-24  5:52 ` [PATCH 03/12] xfs: fix a comment typo in xfs_submit_ioend Christoph Hellwig
2019-06-24 14:53   ` Darrick J. Wong
2019-06-24  5:52 ` [PATCH 04/12] xfs: initialize ioma->flags in xfs_bmbt_to_iomap Christoph Hellwig
2019-06-24 14:57   ` Darrick J. Wong
2019-06-25 10:07     ` Christoph Hellwig
2019-06-25 15:13       ` Darrick J. Wong
2019-06-25 15:21         ` Christoph Hellwig
2019-06-24  5:52 ` [PATCH 05/12] xfs: use a struct iomap in xfs_writepage_ctx Christoph Hellwig
2019-06-24 15:50   ` Darrick J. Wong
2019-06-24  5:52 ` [PATCH 06/12] xfs: remove XFS_TRANS_NOFS Christoph Hellwig
2019-06-24 15:58   ` Darrick J. Wong
2019-06-24 22:59   ` Dave Chinner
2019-06-25 10:12     ` Christoph Hellwig
2019-06-24  5:52 ` [PATCH 07/12] xfs: don't preallocate a transaction for file size updates Christoph Hellwig
2019-06-24 16:17   ` Darrick J. Wong
2019-06-24 23:15     ` Dave Chinner [this message]
2019-06-25 10:25       ` Christoph Hellwig
2019-06-27 22:23         ` Dave Chinner
2019-06-24  5:52 ` [PATCH 08/12] xfs: simplify xfs_ioend_can_merge Christoph Hellwig
2019-06-24 16:00   ` Darrick J. Wong
2019-06-24  5:52 ` [PATCH 09/12] xfs: refactor the ioend merging code Christoph Hellwig
2019-06-24 16:04   ` Darrick J. Wong
2019-06-24 16:06   ` Nikolay Borisov
2019-06-25 10:14     ` Christoph Hellwig
2019-06-25 12:42       ` Nikolay Borisov
2019-06-25 14:45         ` Darrick J. Wong
2019-06-24  5:52 ` [PATCH 10/12] xfs: remove the fork fields in the writepage_ctx and ioend Christoph Hellwig
2019-06-24 16:08   ` Darrick J. Wong
2019-06-24  5:52 ` [PATCH 11/12] iomap: move the xfs writeback code to iomap.c Christoph Hellwig
2019-06-24 15:46   ` Darrick J. Wong
2019-06-25 10:08     ` Christoph Hellwig
2019-06-24 23:43   ` Dave Chinner
2019-06-25 10:10     ` Christoph Hellwig
2019-06-28  0:45       ` Dave Chinner
2019-06-28  5:33         ` Christoph Hellwig
2019-07-01  0:08           ` Dave Chinner
2019-07-01  6:43             ` Christoph Hellwig
2019-07-01 23:09               ` Dave Chinner
2019-06-28 22:27         ` Luis Chamberlain
2019-07-11 21:31           ` Brendan Higgins
2019-06-24  5:52 ` [PATCH 12/12] iomap: add tracing for the address space operations Christoph Hellwig
2019-06-24 23:49   ` Dave Chinner
2019-06-25 10:15     ` Christoph Hellwig
2019-06-25 14:47       ` Darrick J. Wong
2019-06-27 22:35       ` Dave Chinner

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20190624231523.GC7777@dread.disaster.area \
    --to=david@fromorbit.com \
    --cc=Damien.LeMoal@wdc.com \
    --cc=agruenba@redhat.com \
    --cc=darrick.wong@oracle.com \
    --cc=hch@lst.de \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-xfs@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).