From: "Darrick J. Wong" <darrick.wong@oracle.com>
To: david@fromorbit.com, darrick.wong@oracle.com
Cc: linux-fsdevel@vger.kernel.org, vishal.l.verma@intel.com,
	bfoster@redhat.com, xfs@oss.sgi.com
Subject: [PATCH v7 00/47] xfs: add reverse mapping support
Date: Wed, 20 Jul 2016 21:55:55 -0700
Message-ID: <146907695530.25461.3225785294902719773.stgit@birch.djwong.org>

Hi all,

This is the seventh revision of a patchset that adds kernel support to
XFS for tracking reverse mappings (rmap) of physical blocks to files
and metadata.  At reviewers' request following v6, I am splitting the
gigantic patchbombs into separate functional areas.  Given the
significant number of design assumptions that change with block
sharing, rmap and reflink are provided together.  There shouldn't be
any incompatible on-disk format changes, pending a thorough review of
the patches within.

The reverse mapping implementation features a simple per-AG b+tree
containing tuples of (physical block, owner, offset, blockcount) with
the key being the first three fields.  The large record size will
enable us to reconstruct corrupt block mapping btrees (bmbt); the
large key size is necessary to uniquely identify each rmap record in
the presence of shared physical blocks.  In contrast to previous
iterations of this patchset, it is no longer a requirement that there
be a 1:1 correspondence between bmbt and rmapbt records; each rmapbt
record can cover multiple bmbt records.
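
To make the record/key split concrete, here is a minimal userspace
sketch.  It is not the on-disk format or the code from the patches:
the struct layout, field widths, and the names rmap_rec, rmap_key_cmp,
and rmap_find are assumptions made up for illustration.  It only shows
why comparing (physical block, owner, offset) is enough to tell apart
two owners that share the same physical block.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical in-memory rmap record; NOT the on-disk layout. */
struct rmap_rec {
	uint64_t startblock;	/* first physical block of the extent */
	uint64_t owner;		/* inode number or metadata owner code */
	uint64_t offset;	/* logical offset within the owner */
	uint64_t blockcount;	/* extent length in blocks; not part of the key */
};

/* Three-way compare on the key fields (startblock, owner, offset). */
static int rmap_key_cmp(const struct rmap_rec *a, const struct rmap_rec *b)
{
	if (a->startblock != b->startblock)
		return a->startblock < b->startblock ? -1 : 1;
	if (a->owner != b->owner)
		return a->owner < b->owner ? -1 : 1;
	if (a->offset != b->offset)
		return a->offset < b->offset ? -1 : 1;
	return 0;
}

/*
 * Binary-search a key-sorted array for the record whose key matches
 * *key.  Returns the index, or -1 if there is no such record.
 */
static long rmap_find(const struct rmap_rec *recs, size_t nrecs,
		      const struct rmap_rec *key)
{
	size_t lo = 0, hi = nrecs;

	assert(recs != NULL || nrecs == 0);
	while (lo < hi) {
		size_t mid = lo + (hi - lo) / 2;
		int c = rmap_key_cmp(&recs[mid], key);

		if (c == 0)
			return (long)mid;
		if (c < 0)
			lo = mid + 1;
		else
			hi = mid;
	}
	return -1;
}
```

With only the physical block as the key, two files sharing block 100
would collide; with all three fields, each mapping stays addressable.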

Since the previous posting, I have made some major changes to the
underlying XFS common code.  First, I have extended the generic b+tree
implementation to support overlapping intervals, which is necessary
for the rmapbt on a reflink filesystem, where several rmapbt records
can represent the same physical block.  The new b+tree variant
introduces the notion of a "high key" for each record; it is the
highest key that can be used to identify a record.  On disk, an
overlapped-interval b+tree looks like a traditional b+tree except that
nodes store both the lowest key and the highest key accessible through
that subtree pointer.  There's a new interval query function that uses
both keys to iterate all records overlapping a given range of keys.
This change allows us to remove the old requirement that each bmbt
record correspond to a matching rmapbt record.
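
The low-key/high-key idea can be sketched in a few lines of userspace
C.  This is a toy two-level tree over one-dimensional block ranges,
not the XFS btree code: the types rec, leaf, node_ptr, and node, the
fixed fan-out, and the functions make_ptr and query_range are all
hypothetical names chosen for the example.  I am also assuming the
high key of a record is simply the last block it covers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_RECS 4
#define MAX_KIDS 4

struct rec { uint64_t start; uint64_t len; };

/* Leaf block: records sorted by start; ranges may overlap. */
struct leaf { size_t nrecs; struct rec recs[MAX_RECS]; };

/* Node entry: lowest and highest key reachable through child. */
struct node_ptr { uint64_t low; uint64_t high; const struct leaf *child; };

struct node { size_t nptrs; struct node_ptr ptrs[MAX_KIDS]; };

/* High key of a record: the last block the record covers. */
static uint64_t rec_high(const struct rec *r)
{
	return r->start + r->len - 1;
}

/* Build the (low, high) pair a parent would store for a leaf. */
static struct node_ptr make_ptr(const struct leaf *l)
{
	struct node_ptr p;
	size_t i;

	assert(l->nrecs > 0);
	p.low = l->recs[0].start;	/* sorted, so first is lowest */
	p.high = rec_high(&l->recs[0]);
	p.child = l;
	for (i = 1; i < l->nrecs; i++)
		if (rec_high(&l->recs[i]) > p.high)
			p.high = rec_high(&l->recs[i]);
	return p;
}

typedef void (*visit_fn)(const struct rec *r, void *priv);

/* Visit every record overlapping [lo, hi]; return how many matched. */
static size_t query_range(const struct node *n, uint64_t lo, uint64_t hi,
			  visit_fn fn, void *priv)
{
	size_t hits = 0, i, j;

	for (i = 0; i < n->nptrs; i++) {
		const struct node_ptr *p = &n->ptrs[i];

		if (p->low > hi)
			break;		/* later subtrees start even higher */
		if (p->high < lo)
			continue;	/* whole subtree ends before the range */
		for (j = 0; j < p->child->nrecs; j++) {
			const struct rec *r = &p->child->recs[j];

			if (r->start > hi)
				break;
			if (rec_high(r) < lo)
				continue;
			if (fn)
				fn(r, priv);
			hits++;
		}
	}
	return hits;
}
```

The point of storing the high key in the node is the "continue" test:
a whole subtree can be skipped without reading it, which a low-key-only
btree cannot do when records overlap.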

The second big change is to the xfs_bmap_free functions.  The existing
code implements a mechanism to defer metadata (specifically, free
space b+tree) updates across a transaction commit by logging redo
items that can be replayed during recovery.  It is an elegant way to
avoid running afoul of AG locking order rules /and/ it can in theory
be used to get around running out of transaction reservation.  That
said, I have refactored it into a generic "deferred operations"
mechanism that can defer arbitrary types of work to a subsequent
rolled transaction.  The framework thus allows me to schedule rmapbt,
refcountbt, and bmbt updates while maintaining correct redo in case of
failure.
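
For readers who have not seen the old xfs_bmap_free code, the shape of
a generic deferred-work queue can be sketched roughly as follows.
This is a userspace toy, not the xfs_defer API from the patches: every
name here (defer_ops, defer_add, defer_finish, the DEFER_* types, and
the example handlers) is hypothetical, and the "roll" is just a
counter standing in for logging an intent, committing, and continuing
in a fresh transaction.

```c
#include <assert.h>
#include <stddef.h>

enum defer_type {
	DEFER_BMAP_UNMAP,
	DEFER_RMAP_UPDATE,
	DEFER_FREE_EXTENT,
	DEFER_TYPE_MAX
};

struct defer_item { enum defer_type type; unsigned long arg; };

#define DEFER_MAX 16

struct defer_ops {
	struct defer_item items[DEFER_MAX];
	size_t head, tail;			/* pending work is [head, tail) */
	unsigned int rolls;			/* stand-in for transaction rolls */
	unsigned int done[DEFER_TYPE_MAX];	/* finished items per type */
};

typedef int (*defer_finish_fn)(struct defer_ops *dop,
			       const struct defer_item *item);

static void defer_init(struct defer_ops *dop)
{
	int i;

	dop->head = dop->tail = 0;
	dop->rolls = 0;
	for (i = 0; i < DEFER_TYPE_MAX; i++)
		dop->done[i] = 0;
}

/* Queue one unit of work; returns -1 if the toy queue is full. */
static int defer_add(struct defer_ops *dop, enum defer_type type,
		     unsigned long arg)
{
	if (dop->tail == DEFER_MAX)
		return -1;
	dop->items[dop->tail].type = type;
	dop->items[dop->tail].arg = arg;
	dop->tail++;
	return 0;
}

/* Run pending work until the queue drains or a handler fails. */
static int defer_finish(struct defer_ops *dop,
			const defer_finish_fn handlers[DEFER_TYPE_MAX])
{
	while (dop->head < dop->tail) {
		struct defer_item item = dop->items[dop->head++];
		int error;

		assert(item.type < DEFER_TYPE_MAX);
		dop->rolls++;	/* real code: log intent, roll transaction */
		error = handlers[item.type](dop, &item);
		if (error)
			return error;
		dop->done[item.type]++;	/* real code: log the "done" item */
	}
	return 0;
}

/* Example handler: unmapping queues follow-up rmap and free work. */
static int finish_bmap_unmap(struct defer_ops *dop,
			     const struct defer_item *item)
{
	if (defer_add(dop, DEFER_RMAP_UPDATE, item->arg))
		return -1;
	return defer_add(dop, DEFER_FREE_EXTENT, item->arg);
}

/* Example handler: leaf work that queues nothing further. */
static int finish_simple(struct defer_ops *dop,
			 const struct defer_item *item)
{
	(void)dop;
	(void)item;
	return 0;
}

static const defer_finish_fn example_handlers[DEFER_TYPE_MAX] = {
	[DEFER_BMAP_UNMAP]  = finish_bmap_unmap,
	[DEFER_RMAP_UPDATE] = finish_simple,
	[DEFER_FREE_EXTENT] = finish_simple,
};
```

The property worth noticing is that a handler may queue more work, and
defer_finish keeps going until nothing is pending, so each step gets
its own transaction instead of one transaction taking every lock.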

At the very end of the patchset is an initial implementation of a
GETFSMAP ioctl for userland to query the physical block mapping of a
filesystem.

The first few patches fix various vfs/xfs bugs, add an enhancement to
the xfs_buf tracepoints so that we can analyze buffer deadlocks, and
merge differences between the kernel and userspace libxfs so that the
rest of the patches apply consistently.

If you're going to start using this mess, you probably ought to just
pull from my github trees for kernel[1], xfsprogs[2], and xfstests[3].
There are also updates for xfs-docs[4] and man-pages[5].  The kernel
patches should apply to dchinner's for-next; xfsprogs patches to
for-next; and xfstests patches to master.

The patches have been xfstested on x64, i386, ppc64, and armv7l.
All four architectures pass all 'clone' group tests.

This is an extraordinary way to eat your data.  Enjoy!
Comments and questions are, as always, welcome.

--D

[1] https://github.com/djwong/linux/tree/for-dave-for-4.8
[2] https://github.com/djwong/xfsprogs/tree/djwong-experimental
[3] https://github.com/djwong/xfstests/tree/djwong-devel
[4] https://github.com/djwong/xfs-documentation/tree/djwong-devel
[5] https://github.com/djwong/man-pages/tree/djwong-devel

Thread overview: 121+ messages
2016-07-21  4:55 Darrick J. Wong [this message]
2016-07-21  4:56 ` [PATCH 01/47] vfs: fix return type of ioctl_file_dedupe_range Darrick J. Wong
2016-08-01  6:33   ` Christoph Hellwig
2016-07-21  4:56 ` [PATCH 02/47] vfs: support FS_XFLAG_REFLINK and FS_XFLAG_COWEXTSIZE Darrick J. Wong
2016-08-01  6:33   ` Christoph Hellwig
2016-07-21  4:56 ` [PATCH 03/47] xfs: fix attr shortform structure alignment on cris Darrick J. Wong
2016-07-26 16:36   ` Brian Foster
2016-08-01  6:34   ` Christoph Hellwig
2016-07-21  4:56 ` [PATCH 04/47] xfs: fix locking of the rt bitmap/summary inodes Darrick J. Wong
2016-07-26 16:36   ` Brian Foster
2016-07-28 18:58     ` Darrick J. Wong
2016-08-01  6:34   ` Christoph Hellwig
2016-07-21  4:56 ` [PATCH 05/47] xfs: set *stat=1 after iroot realloc Darrick J. Wong
2016-07-26 16:36   ` Brian Foster
2016-08-01  6:35   ` Christoph Hellwig
2016-07-21  4:56 ` [PATCH 06/47] xfs: during btree split, save new block key & ptr for future insertion Darrick J. Wong
2016-07-26 16:36   ` Brian Foster
2016-08-01  6:37   ` Christoph Hellwig
2016-07-21  4:56 ` [PATCH 07/47] xfs: add function pointers for get/update keys to the btree Darrick J. Wong
2016-07-26 19:09   ` Brian Foster
2016-07-28 19:13     ` Darrick J. Wong
2016-07-28 19:46   ` [PATCH v2 " Darrick J. Wong
2016-08-01 15:57     ` Brian Foster
2016-08-01 17:54       ` Darrick J. Wong
2016-08-01  6:39   ` [PATCH " Christoph Hellwig
2016-08-01 17:33     ` Darrick J. Wong
2016-08-02 12:23       ` Christoph Hellwig
2016-08-03  0:12         ` Darrick J. Wong
2016-07-21  4:56 ` [PATCH 08/47] xfs: support btrees with overlapping intervals for keys Darrick J. Wong
2016-08-01  6:48   ` Christoph Hellwig
2016-08-01 19:11     ` Darrick J. Wong
2016-08-02 12:03       ` Christoph Hellwig
2016-08-03  3:29         ` Darrick J. Wong
2016-08-02 14:04       ` Brian Foster
2016-08-03  1:06         ` Dave Chinner
2016-08-01 17:47   ` Brian Foster
2016-08-01 19:18     ` Darrick J. Wong
2016-07-21  4:56 ` [PATCH 09/47] xfs: introduce interval queries on btrees Darrick J. Wong
2016-08-01  8:00   ` Christoph Hellwig
2016-07-21  4:57 ` [PATCH 10/47] xfs: refactor btree owner change into a separate visit-blocks function Darrick J. Wong
2016-08-01  6:50   ` Christoph Hellwig
2016-07-21  4:57 ` [PATCH 11/47] xfs: move deferred operations into a separate file Darrick J. Wong
2016-08-01  7:08   ` Christoph Hellwig
2016-08-01  8:02   ` Christoph Hellwig
2016-08-02 22:39     ` Dave Chinner
2016-08-03  9:16       ` Christoph Hellwig
2016-08-03 22:57         ` Dave Chinner
2016-08-04 16:00           ` Christoph Hellwig
2016-08-04 23:44             ` Dave Chinner
2016-08-02 17:30   ` Brian Foster
2016-07-21  4:57 ` [PATCH 12/47] xfs: add tracepoints for the deferred ops mechanism Darrick J. Wong
2016-07-21  4:57 ` [PATCH 13/47] xfs: clean up typedef usage in the EFI/EFD handling code Darrick J. Wong
2016-08-01  7:09   ` Christoph Hellwig
2016-07-21  4:57 ` [PATCH 14/47] xfs: enable the xfs_defer mechanism to process extents to free Darrick J. Wong
2016-08-01  7:09   ` Christoph Hellwig
2016-08-02 17:30   ` Brian Foster
2016-07-21  4:57 ` [PATCH 15/47] xfs: rework xfs_bmap_free callers to use xfs_defer_ops Darrick J. Wong
2016-08-02 17:30   ` Brian Foster
2016-07-21  4:57 ` [PATCH 16/47] xfs: change xfs_bmap_{finish, cancel, init, free} -> xfs_defer_* Darrick J. Wong
2016-08-02 17:30   ` Brian Foster
2016-08-02 20:47     ` Darrick J. Wong
2016-07-21  4:57 ` [PATCH 17/47] xfs: rename flist/free_list to dfops Darrick J. Wong
2016-08-02 17:30   ` Brian Foster
2016-07-21  4:58 ` [PATCH 18/47] xfs: refactor redo intent item processing Darrick J. Wong
2016-08-01  8:10   ` Christoph Hellwig
2016-08-02 20:35     ` Darrick J. Wong
2016-08-02 18:47   ` Brian Foster
2016-07-21  4:58 ` [PATCH 19/47] xfs: add tracepoints and error injection for deferred extent freeing Darrick J. Wong
2016-08-02 18:48   ` Brian Foster
2016-08-02 20:24     ` Darrick J. Wong
2016-08-02 21:38       ` Brian Foster
2016-08-02 22:43         ` Darrick J. Wong
2016-07-21  4:58 ` [PATCH 20/47] xfs: increase XFS_BTREE_MAXLEVELS to fit the rmapbt Darrick J. Wong
2016-08-02 18:48   ` Brian Foster
2016-08-02 20:06     ` Darrick J. Wong
2016-08-02 21:38       ` Brian Foster
2016-07-21  4:58 ` [PATCH 21/47] xfs: introduce rmap btree definitions Darrick J. Wong
2016-07-21  4:58 ` [PATCH 22/47] xfs: add rmap btree stats infrastructure Darrick J. Wong
2016-07-21  4:58 ` [PATCH 23/47] xfs: rmap btree add more reserved blocks Darrick J. Wong
2016-07-21  4:58 ` [PATCH 24/47] xfs: add owner field to extent allocation and freeing Darrick J. Wong
2016-07-21  4:58 ` [PATCH 25/47] xfs: introduce rmap extent operation stubs Darrick J. Wong
2016-07-21  4:58 ` [PATCH 26/47] xfs: define the on-disk rmap btree format Darrick J. Wong
2016-07-21  4:59 ` [PATCH 27/47] xfs: add rmap btree growfs support Darrick J. Wong
2016-07-21  4:59 ` [PATCH 28/47] xfs: rmap btree transaction reservations Darrick J. Wong
2016-07-21  4:59 ` [PATCH 29/47] xfs: rmap btree requires more reserved free space Darrick J. Wong
2016-07-21  4:59 ` [PATCH 30/47] xfs: add rmap btree operations Darrick J. Wong
2016-07-21  4:59 ` [PATCH 31/47] xfs: support overlapping intervals in the rmap btree Darrick J. Wong
2016-07-21  4:59 ` [PATCH 32/47] xfs: teach rmapbt to support interval queries Darrick J. Wong
2016-07-21  4:59 ` [PATCH 33/47] xfs: add tracepoints for the rmap functions Darrick J. Wong
2016-07-21  4:59 ` [PATCH 34/47] xfs: add an extent to the rmap btree Darrick J. Wong
2016-07-21  4:59 ` [PATCH 35/47] xfs: remove an extent from " Darrick J. Wong
2016-07-21  5:00 ` [PATCH 36/47] xfs: convert unwritten status of reverse mappings Darrick J. Wong
2016-08-03  2:00   ` Dave Chinner
2016-07-21  5:00 ` [PATCH 37/47] xfs: add rmap btree insert and delete helpers Darrick J. Wong
2016-07-21  5:00 ` [PATCH 38/47] xfs: create rmap update intent log items Darrick J. Wong
2016-08-01  7:12   ` Christoph Hellwig
2016-08-01 18:08     ` Darrick J. Wong
2016-07-21  5:00 ` [PATCH 39/47] xfs: log rmap intent items Darrick J. Wong
2016-07-21  5:00 ` [PATCH 40/47] xfs: enable the xfs_defer mechanism to process rmaps to update Darrick J. Wong
2016-07-21  5:00 ` [PATCH 41/47] xfs: propagate bmap updates to rmapbt Darrick J. Wong
2016-07-21  5:00 ` [PATCH 42/47] xfs: add rmap btree geometry feature flag Darrick J. Wong
2016-07-21  5:00 ` [PATCH 43/47] xfs: add rmap btree block detection to log recovery Darrick J. Wong
2016-07-21  5:00 ` [PATCH 44/47] xfs: disable XFS_IOC_SWAPEXT when rmap btree is enabled Darrick J. Wong
2016-07-21  5:01 ` [PATCH 45/47] xfs: don't update rmapbt when fixing agfl Darrick J. Wong
2016-07-21  5:01 ` [PATCH 46/47] xfs: enable the rmap btree functionality Darrick J. Wong
2016-07-21  5:01 ` [PATCH 47/47] xfs: introduce the XFS_IOC_GETFSMAP ioctl Darrick J. Wong
2016-07-23  4:28   ` [PATCH v2 " Darrick J. Wong
2016-08-03 19:45 ` [PATCH v7 00/47] xfs: add reverse mapping support Mark Fasheh
2016-08-03 20:55   ` Darrick J. Wong
2016-08-04  0:58     ` Darrick J. Wong
2016-08-04  2:18       ` Mark Fasheh
2016-08-04 15:48         ` Darrick J. Wong
2016-08-04 23:50           ` Dave Chinner
2016-08-05  0:49             ` Darrick J. Wong
2016-08-05  7:01             ` Artem Bityutskiy
2016-08-05  7:22               ` Darrick J. Wong
2016-08-05 10:49               ` Dave Chinner
2016-08-05 11:57                 ` Artem Bityutskiy
2016-08-05 22:26                   ` Dave Chinner
2016-08-05 18:36             ` Mark Fasheh
2016-08-05 22:39               ` Dave Chinner
