Linux-XFS Archive on lore.kernel.org
From: "Darrick J. Wong" <darrick.wong@oracle.com>
To: Christoph Hellwig <hch@infradead.org>
Cc: linux-xfs@vger.kernel.org
Subject: Re: [PATCH 2/2] xfs: relax unwritten writeback overhead under some circumstances
Date: Thu, 16 Jan 2020 15:15:24 -0800
Message-ID: <20200116231524.GL8247@magnolia>
In-Reply-To: <20200116164900.GB4593@infradead.org>

On Thu, Jan 16, 2020 at 08:49:00AM -0800, Christoph Hellwig wrote:
> On Wed, Jan 15, 2020 at 10:15:58PM -0800, Darrick J. Wong wrote:
> > From: Darrick J. Wong <darrick.wong@oracle.com>
> > 
> > In the previous patch, we solved a stale disk contents exposure problem
> > by forcing the delalloc write path to create unwritten extents, write
> > the data, and convert the extents to written after writeback completes.
> > 
> > This is a pretty huge hammer to use, so we'll relax the delalloc write
> > strategy to go straight to written extents (as we once did) if someone
> > tells us to write the entire file to disk.  This reopens the exposure
> > window slightly, but we'll only be affected if writeback completes out
> > of order and the system crashes during writeback.
> > 
> > Because once again we can map written extents past EOF, we also
> > enlarge the writepages window downward if the window is beyond the
> > on-disk size and there are written extents after the EOF block.  This
> > ensures that speculative post-EOF preallocations are not left uncovered.
> 
> This does sound really sketchy.  Do you have any performance numbers
> justifying something this nasty?

Nope! :D

IIRC Dave also expressed interest in performance impacts the last time
I sent this series, albeit more from the perspective of quantifying how
much pain we'd incur from forcing all writes to perform an unwritten
extent conversion at the end.

FWIW after months of running this on my internal systems, I haven't been
able to quantify any significant difference before and after, even with
rmap enabled.  There's slightly more log traffic from the extra
bmbt/rmapbt/inode core updates, but even then the log is fairly good at
deduping repeated updates.  Both transactions usually commit before the
log checkpoints.

Frankly I wouldn't apply this patch (or 'xfs: extend the range of
flush_unmap ranges') on the grounds that re-opening potential disclosure
flaws is never worth the risk.  I'm also pretty sure that being careful
to convert delalloc data fork extents to unwritten extents fixes the
stale disclosure flaw that Ritesh wrote about in ('iomap: direct-io:
Move inode_dio_begin before filemap_write_and_wait_range').

(As far as ext4 goes, I talked to Jan and Ted this morning and they
seemed to think that they could solve the race on their end by retaining
the unwritten state in the incore extent cache because ext4 apparently
doesn't commit the extent map update transaction until after writeback
completes.)

--D
