From: "Darrick J. Wong" <darrick.wong@oracle.com>
To: Dave Chinner <david@fromorbit.com>
Cc: Brian Foster <bfoster@redhat.com>,
	linux-xfs@vger.kernel.org, Christoph Hellwig <hch@lst.de>
Subject: Re: [PATCH 46/63] xfs: unshare a range of blocks via fallocate
Date: Mon, 10 Oct 2016 10:05:24 -0700
Message-ID: <20161010170524.GD5619@birch.djwong.org>
In-Reply-To: <20161007222508.GK9806@dastard>

On Sat, Oct 08, 2016 at 09:25:08AM +1100, Dave Chinner wrote:
> On Fri, Oct 07, 2016 at 02:15:40PM -0700, Darrick J. Wong wrote:
> > On Fri, Oct 07, 2016 at 04:58:38PM -0400, Brian Foster wrote:
> > > On Fri, Oct 07, 2016 at 01:26:39PM -0700, Darrick J. Wong wrote:
> > > > On Fri, Oct 07, 2016 at 02:05:07PM -0400, Brian Foster wrote:
> > > > > On Thu, Sep 29, 2016 at 08:10:39PM -0700, Darrick J. Wong wrote:
> > > > > The code that has been merged is now different from this code :/, but
> > > > > just a heads up that the code in the tree looks like it has another one
> > > > > of those potentially blind transaction commit sequences between
> > > > > xfs_reflink_try_clear_inode_flag() and xfs_reflink_clear_inode_flag().
> > > > 
> > > > _reflink_unshare jumps out if it's not a reflink inode before
> > > > calling _reflink_try_clear_inode_flag -> _reflink_clear_inode_flag.
> > > > We do not call _reflink_clear_inode_flag with a non-reflink inode.
> > > > As for blindly committing a transaction with no dirty data, that's
> > > > fine, _trans_commit checks for that case and simply frees everything
> > > > attached to the transaction.
> > > > 
> > > 
> > > Yeah, I saw that. That's what I was alluding to below wrt the usage
> > > being fine in the patch. It's just the pattern that's used that stands
> > > out.
> > > 
> > > With regard to the transaction.. sure, that situation may not be broken,
> > > but it's still not ideal if it's a log reservation we didn't have to
> > > make in the first place.
> > 
> > Yeah.  We must hold the ilock from the start of the extent iteration
> > until we clear (or not) the inode flag, but we have to allocate the
> > transaction before grabbing the ilock.  In other words, we don't know if
> > we need the transaction until it's too late to get one, hence this
> > suboptimal thing where we sometimes get a reservation and never commit
> > anything.  I don't know of any way to avoid that.
> 
> Getting a transaction we don't use isn't the end of the world -
> in most cases it's just a bit of wasted CPU time. Similarly to
> committing an empty transaction it has no actual effect except to
> increment the empty transaction stat. In this case, commit is just
> fine as xfs_trans_commit will detect that it is empty and do the
> cancel work directly.

This is going to become a bigger thing once we get to online scrub
because I use empty transactions to avoid deadlock problems.  I observed
that the routine to grab a buffer will lock the buffer and (optionally)
attach it to a transaction.  Subsequent attempts to re-grab a still-locked
buffer succeed if the buffer is attached to the same transaction, and
are made to wait for the lock if not.
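
For illustration, the grab/re-grab behaviour described above can be
modelled in a few lines of Python.  This is a toy sketch, not kernel
code: Buffer, Transaction, grab() and cancel() are names invented here,
and "would_block" stands in for a caller sleeping on the buffer lock.

```python
class Buffer:
    """Toy stand-in for a metadata buffer with a lock."""
    def __init__(self, blkno):
        self.blkno = blkno
        self.locked = False
        self.tp = None        # transaction the buffer is attached to, if any
        self.recursion = 0    # extra grabs made through the owning transaction


class Transaction:
    """Toy transaction: only remembers which buffers it has locked."""
    def __init__(self):
        self.bufs = []


def grab(buf, tp=None):
    """Try to lock buf, optionally attaching it to transaction tp."""
    if buf.locked:
        if tp is not None and buf.tp is tp:
            buf.recursion += 1
            return "regrabbed"      # already ours, so no second lock attempt
        return "would_block"        # a real caller would sleep on the lock
    buf.locked = True
    if tp is not None:
        buf.tp = tp
        tp.bufs.append(buf)
    return "locked"


def cancel(tp):
    """Unlock and detach every buffer the transaction was tracking."""
    for buf in tp.bufs:
        buf.locked, buf.tp, buf.recursion = False, None, 0
    tp.bufs.clear()


tp = Transaction()
a, b = Buffer(1), Buffer(2)
print(grab(a, tp))   # locked
print(grab(a, tp))   # regrabbed
print(grab(b))       # locked
print(grab(b))       # would_block
cancel(tp)
print(a.locked)      # False
```

The only property that matters for what follows is that a second grab
through the same transaction returns immediately instead of waiting.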

We can use this as a strategy to detect tree cycles:

n0 (root) -> n1 -> n2 -> n3
             ^------------+

By the time we hit the bad pointer in n3, we've locked n0-3 and attached
them to the empty transaction.  Next, we follow the bad pointer in n3 and
try to grab n1 again.  Since it's locked and attached to our empty
transaction, we can read the buffer and notice that the level is wrong,
and declare the tree to be corrupt.  On our way out, we call
xfs_trans_cancel to unlock everything.  It's a little uncomfortable to
be (ab)using transactions for their ability to track locked buffers, but
oh well.
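
A rough Python sketch of that cycle check, under simplifying
assumptions: every block has a single child pointer, levels count up
from 0 at the leaf, and the extra "leaf" block is added here only so a
healthy tree has somewhere to end.  Buf, grab() and walk() are invented
names, and "deadlock" marks the point where a real thread would sleep
on a lock it already holds.

```python
class Buf:
    """Toy stand-in for a locked metadata buffer holding one btree block."""
    def __init__(self, level, child=None):
        self.level = level    # height of this block above the leaves
        self.child = child    # key of the single child block, None in a leaf
        self.locked = False
        self.tp = None        # transaction this buffer is attached to, if any


def grab(buf, tp):
    """Lock buf; a re-grab succeeds only via the transaction that holds it."""
    if buf.locked:
        return tp is not None and buf.tp is tp
    buf.locked, buf.tp = True, tp
    return True


def walk(blocks, root, tp):
    """Descend one path from root, checking that levels count down by one."""
    held = []
    try:
        key, expected = root, blocks[root].level
        while key is not None:
            buf = blocks[key]
            if not grab(buf, tp):
                return "deadlock"      # would wait forever on our own lock
            held.append(buf)
            if buf.level != expected:
                return "corrupt"       # e.g. a pointer looping back up the tree
            key, expected = buf.child, expected - 1
        return "ok"
    finally:
        for buf in held:               # stands in for cancelling the transaction
            buf.locked, buf.tp = False, None


def make_tree(n3_child):
    return {
        "n0": Buf(4, "n1"),
        "n1": Buf(3, "n2"),
        "n2": Buf(2, "n3"),
        "n3": Buf(1, n3_child),
        "leaf": Buf(0),
    }


TP = object()   # any unique object can play the empty transaction here
print(walk(make_tree("leaf"), "n0", TP))   # ok
print(walk(make_tree("n1"), "n0", TP))     # corrupt
print(walk(make_tree("n1"), "n0", None))   # deadlock
```

With the transaction, the second visit to n1 returns the buffer and the
level mismatch is reported; without it, the walk stalls on n1.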

Note that we can also use this for escaping crosslinked btrees:

bno0 -> b1 -> b2 ---------+
                          V
          rmap0 -> r1 -> r2 -> r3

Let's say we're checking rmap records out of r3 and we want to make sure
that the bnobt does not have a record for this rmapping.  We start down
the bnobt until we hit the bad pointer in b2 that points to a block
we already locked while reading the rmapbt.  Having the transaction
allows the bnobt cursor to read r2 and fail the read verifier, after
which we can cancel the transaction and tell userspace that there's
something wrong.  If we didn't have the transaction, we'd try to lock a
buffer that we already locked, which deadlocks the system.
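
The crosslink case can be sketched the same way.  Again this is a toy
Python model with invented names (Buf, read_buf(), scrub()); the
"verifier" is reduced to a check of a per-block type tag, which is an
assumption made for the sketch rather than how the real read verifiers
work.

```python
class VerifierError(Exception):
    """Raised when a block is not the type the cursor expected."""


class WouldDeadlock(Exception):
    """Raised where a real thread would sleep on a lock it already holds."""


class Buf:
    def __init__(self, magic, child=None):
        self.magic = magic    # which btree this block claims to belong to
        self.child = child    # key of the next block on the path, or None
        self.locked = False
        self.tp = None        # transaction this buffer is attached to, if any


def read_buf(blocks, key, tp, want_magic):
    """Lock (or re-grab) a block, then run the toy verifier on it."""
    buf = blocks[key]
    if buf.locked and not (tp is not None and buf.tp is tp):
        raise WouldDeadlock(key)
    buf.locked, buf.tp = True, tp
    if buf.magic != want_magic:
        raise VerifierError(key)
    return buf


def walk(blocks, root, tp, want_magic):
    key = root
    while key is not None:
        key = read_buf(blocks, key, tp, want_magic).child


def scrub(use_tp):
    blocks = {
        "bno0": Buf("bno", "b1"), "b1": Buf("bno", "b2"),
        "b2": Buf("bno", "r2"),            # bad pointer into the rmapbt
        "rmap0": Buf("rmap", "r1"), "r1": Buf("rmap", "r2"),
        "r2": Buf("rmap", "r3"), "r3": Buf("rmap"),
    }
    tp = object() if use_tp else None      # stand-in for the empty transaction
    try:
        walk(blocks, "rmap0", tp, "rmap")  # rmapbt path is now locked
        walk(blocks, "bno0", tp, "bno")    # cross-check against the bnobt
        return "ok"
    except VerifierError:
        return "corrupt"
    except WouldDeadlock:
        return "deadlock"
    finally:
        for buf in blocks.values():        # stands in for cancelling
            buf.locked, buf.tp = False, None


print(scrub(use_tp=True))    # corrupt
print(scrub(use_tp=False))   # deadlock
```

Sharing one transaction between both cursors is what turns the second
visit to r2 into a failed check rather than a hang.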

I suppose I had better write all this down in xfs_scrub.c before I send
out patches for review.

--D

> 
> If I start to see too many empty transaction commits in my
> performance test runs, I'll let you know and we can start to look
> for solutions. But right now I wouldn't worry about it.
> 
> Cheers,
> 
> Dave.
> -- 
> Dave Chinner
> david@fromorbit.com
