From: "Darrick J. Wong" <darrick.wong@oracle.com>
To: david@fromorbit.com, darrick.wong@oracle.com
Cc: linux-fsdevel@vger.kernel.org, xfs@oss.sgi.com
Subject: [PATCH 18/58] xfs: enhance rmap btree operations
Date: Tue, 06 Oct 2015 21:57:02 -0700
Message-ID: <20151007045701.30457.40870.stgit@birch.djwong.org>
In-Reply-To: <20151007045443.30457.47038.stgit@birch.djwong.org>

Adapt the rmap btree to store owner offsets within each rmap record,
and to handle the primary key being extended to [agblk, owner, offset].
The expansion of the primary key is crucial to allowing multiple owners
per extent.  Unfortunately, doing so adds the requirement that all rmap
records for file extents (metadata always has one owner) correspond to
some bmbt entry somewhere.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
 fs/xfs/libxfs/xfs_rmap.c       |   32 +++++++++++++++++---
 fs/xfs/libxfs/xfs_rmap_btree.c |   65 ++++++++++++++++++++++++++++++----------
 fs/xfs/libxfs/xfs_rmap_btree.h |    7 ++++
 3 files changed, 84 insertions(+), 20 deletions(-)
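
Note on the key ordering described above: a minimal standalone sketch of a
lexicographic compare over [agblk, owner, offset]. The struct and function
names here are hypothetical and are not part of this patch. The sketch
assumes owner and offset compare as plain unsigned integers, and it uses
explicit comparisons instead of subtraction so that two 64-bit values which
differ by more than INT64_MAX cannot produce a result with the wrong sign.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical in-core key; field meanings follow the patch description. */
struct rmap_key {
	uint32_t	startblock;	/* AG block number */
	uint64_t	owner;		/* owner identifier */
	uint64_t	offset;		/* offset within the owner */
};

/* Three-way compare of two unsigned values without subtraction. */
static int cmp_u64(uint64_t a, uint64_t b)
{
	return (a > b) - (a < b);
}

/*
 * Lexicographic compare: a later field is only consulted when every
 * earlier field is equal.  Returns <0, 0 or >0.
 */
int rmap_key_cmp(const struct rmap_key *k1, const struct rmap_key *k2)
{
	int	d;

	d = cmp_u64(k1->startblock, k2->startblock);
	if (d)
		return d;
	d = cmp_u64(k1->owner, k2->owner);
	if (d)
		return d;
	return cmp_u64(k1->offset, k2->offset);
}

/* Assumption of this sketch: "in order" means k1 sorts at or before k2. */
int rmap_keys_inorder(const struct rmap_key *k1, const struct rmap_key *k2)
{
	return rmap_key_cmp(k1, k2) <= 0;
}

/* Small self-check showing the intended behaviour. */
void rmap_key_selfcheck(void)
{
	struct rmap_key	a = { 10, 5, 0 };
	struct rmap_key	b = { 10, 5, 8 };
	struct rmap_key	c = { 11, 1, 0 };
	struct rmap_key	d = { 10, UINT64_MAX, 0 };

	assert(rmap_key_cmp(&a, &b) < 0);	/* same block and owner, lower offset */
	assert(rmap_key_cmp(&c, &b) > 0);	/* higher block wins despite lower owner */
	assert(rmap_key_cmp(&d, &a) > 0);	/* very large owner still sorts last */
	assert(rmap_key_cmp(&a, &a) == 0);
}
```

The third assertion is the case where a subtraction cast to a signed 64-bit
type would report the opposite order.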


diff --git a/fs/xfs/libxfs/xfs_rmap.c b/fs/xfs/libxfs/xfs_rmap.c
index 64b2525..f6fe742 100644
--- a/fs/xfs/libxfs/xfs_rmap.c
+++ b/fs/xfs/libxfs/xfs_rmap.c
@@ -37,26 +37,48 @@
 #include "xfs_extent_busy.h"
 
 /*
- * Lookup the first record less than or equal to [bno, len]
+ * Lookup the first record less than or equal to [bno, len, owner, offset]
  * in the btree given by cur.
  */
-STATIC int
+int
 xfs_rmap_lookup_le(
 	struct xfs_btree_cur	*cur,
 	xfs_agblock_t		bno,
 	xfs_extlen_t		len,
 	uint64_t		owner,
+	uint64_t		offset,
 	int			*stat)
 {
 	cur->bc_rec.r.rm_startblock = bno;
 	cur->bc_rec.r.rm_blockcount = len;
 	cur->bc_rec.r.rm_owner = owner;
+	cur->bc_rec.r.rm_offset = offset;
 	return xfs_btree_lookup(cur, XFS_LOOKUP_LE, stat);
 }
 
 /*
+ * Lookup the record exactly matching [bno, len, owner, offset]
+ * in the btree given by cur.
+ */
+int
+xfs_rmap_lookup_eq(
+	struct xfs_btree_cur	*cur,
+	xfs_agblock_t		bno,
+	xfs_extlen_t		len,
+	uint64_t		owner,
+	uint64_t		offset,
+	int			*stat)
+{
+	cur->bc_rec.r.rm_startblock = bno;
+	cur->bc_rec.r.rm_blockcount = len;
+	cur->bc_rec.r.rm_owner = owner;
+	cur->bc_rec.r.rm_offset = offset;
+	return xfs_btree_lookup(cur, XFS_LOOKUP_EQ, stat);
+}
+
+/*
  * Update the record referred to by cur to the value given
- * by [bno, len, ref].
+ * by [bno, len, owner, offset].
  * This either works (return 0) or gets an EFSCORRUPTED error.
  */
 STATIC int
@@ -69,13 +91,14 @@ xfs_rmap_update(
 	rec.rmap.rm_startblock = cpu_to_be32(irec->rm_startblock);
 	rec.rmap.rm_blockcount = cpu_to_be32(irec->rm_blockcount);
 	rec.rmap.rm_owner = cpu_to_be64(irec->rm_owner);
+	rec.rmap.rm_offset = cpu_to_be64(irec->rm_offset);
 	return xfs_btree_update(cur, &rec);
 }
 
 /*
  * Get the data from the pointed-to record.
  */
-STATIC int
+int
 xfs_rmap_get_rec(
 	struct xfs_btree_cur	*cur,
 	struct xfs_rmap_irec	*irec,
@@ -91,6 +114,7 @@ xfs_rmap_get_rec(
 	irec->rm_startblock = be32_to_cpu(rec->rmap.rm_startblock);
 	irec->rm_blockcount = be32_to_cpu(rec->rmap.rm_blockcount);
 	irec->rm_owner = be64_to_cpu(rec->rmap.rm_owner);
+	irec->rm_offset = be64_to_cpu(rec->rmap.rm_offset);
 	return 0;
 }
 
diff --git a/fs/xfs/libxfs/xfs_rmap_btree.c b/fs/xfs/libxfs/xfs_rmap_btree.c
index 58bdac3..5fe717b 100644
--- a/fs/xfs/libxfs/xfs_rmap_btree.c
+++ b/fs/xfs/libxfs/xfs_rmap_btree.c
@@ -37,21 +37,26 @@
 /*
  * Reverse map btree.
  *
- * This is a per-ag tree used to track the owner of a given extent. Owner
- * records are inserted when an extent is allocated, and removed when an extent
- * is freed. There can only be one owner of an extent, usually an inode or some
- * other metadata structure like a AG btree.
+ * This is a per-ag tree used to track the owner(s) of a given extent. With
+ * reflink it is possible for there to be multiple owners, which is a departure
+ * from classic XFS. Owner records for data extents are inserted when the
+ * extent is mapped and removed when an extent is unmapped.  Owner records for
+ * all other block types (i.e. metadata) are inserted when an extent is
+ * allocated and removed when an extent is freed. There can only be one owner
+ * of a metadata extent, usually an inode or some other metadata structure like
+ * an AG btree.
  *
  * The rmap btree is part of the free space management, so blocks for the tree
  * are sourced from the agfl. Hence we need transaction reservation support for
  * this tree so that the freelist is always large enough. This also impacts on
  * the minimum space we need to leave free in the AG.
  *
- * The tree is ordered by block number - there's no need to order/search by
- * extent size for online updating/management of the tree, and the reverse
- * lookups are going to be "who owns this block" and so are by-block ordering is
- * perfect for this.
- *
+ * The tree is ordered by [ag block, owner, offset]. This is a large key size,
+ * but it is the only way to enforce unique keys when a block can be owned by
+ * multiple files at any offset. There's no need to order/search by extent
+ * size for online updating/management of the tree. It is intended that most
+ * reverse lookups will be to find the owner(s) of a particular block, or to
+ * try to recover tree and file data from corrupt primary metadata.
  */
 
 static struct xfs_btree_cur *
@@ -165,6 +170,8 @@ xfs_rmapbt_init_key_from_rec(
 	union xfs_btree_rec	*rec)
 {
 	key->rmap.rm_startblock = rec->rmap.rm_startblock;
+	key->rmap.rm_owner = rec->rmap.rm_owner;
+	key->rmap.rm_offset = rec->rmap.rm_offset;
 }
 
 STATIC void
@@ -173,6 +180,8 @@ xfs_rmapbt_init_rec_from_key(
 	union xfs_btree_rec	*rec)
 {
 	rec->rmap.rm_startblock = key->rmap.rm_startblock;
+	rec->rmap.rm_owner = key->rmap.rm_owner;
+	rec->rmap.rm_offset = key->rmap.rm_offset;
 }
 
 STATIC void
@@ -183,6 +192,7 @@ xfs_rmapbt_init_rec_from_cur(
 	rec->rmap.rm_startblock = cpu_to_be32(cur->bc_rec.r.rm_startblock);
 	rec->rmap.rm_blockcount = cpu_to_be32(cur->bc_rec.r.rm_blockcount);
 	rec->rmap.rm_owner = cpu_to_be64(cur->bc_rec.r.rm_owner);
+	rec->rmap.rm_offset = cpu_to_be64(cur->bc_rec.r.rm_offset);
 }
 
 STATIC void
@@ -205,8 +215,16 @@ xfs_rmapbt_key_diff(
 {
 	struct xfs_rmap_irec	*rec = &cur->bc_rec.r;
 	struct xfs_rmap_key	*kp = &key->rmap;
-
-	return (__int64_t)be32_to_cpu(kp->rm_startblock) - rec->rm_startblock;
+	__int64_t		d;
+
+	d = (__int64_t)be32_to_cpu(kp->rm_startblock) - rec->rm_startblock;
+	if (d)
+		return d;
+	d = (__int64_t)be64_to_cpu(kp->rm_owner) - rec->rm_owner;
+	if (d)
+		return d;
+	d = (__int64_t)be64_to_cpu(kp->rm_offset) - rec->rm_offset;
+	return d;
 }
 
 static bool
@@ -307,8 +325,16 @@ xfs_rmapbt_keys_inorder(
 	union xfs_btree_key	*k1,
 	union xfs_btree_key	*k2)
 {
-	return be32_to_cpu(k1->rmap.rm_startblock) <
-	       be32_to_cpu(k2->rmap.rm_startblock);
+	if (be32_to_cpu(k1->rmap.rm_startblock) !=
+	    be32_to_cpu(k2->rmap.rm_startblock))
+		return be32_to_cpu(k1->rmap.rm_startblock) <
+		       be32_to_cpu(k2->rmap.rm_startblock);
+	if (be64_to_cpu(k1->rmap.rm_owner) !=
+	    be64_to_cpu(k2->rmap.rm_owner))
+		return be64_to_cpu(k1->rmap.rm_owner) <
+		       be64_to_cpu(k2->rmap.rm_owner);
+	return be64_to_cpu(k1->rmap.rm_offset) <=
+	       be64_to_cpu(k2->rmap.rm_offset);
 }
 
 STATIC int
@@ -317,9 +343,16 @@ xfs_rmapbt_recs_inorder(
 	union xfs_btree_rec	*r1,
 	union xfs_btree_rec	*r2)
 {
-	return be32_to_cpu(r1->rmap.rm_startblock) +
-		be32_to_cpu(r1->rmap.rm_blockcount) <=
-		be32_to_cpu(r2->rmap.rm_startblock);
+	if (be32_to_cpu(r1->rmap.rm_startblock) !=
+	    be32_to_cpu(r2->rmap.rm_startblock))
+		return be32_to_cpu(r1->rmap.rm_startblock) <
+		       be32_to_cpu(r2->rmap.rm_startblock);
+	if (be64_to_cpu(r1->rmap.rm_owner) !=
+	    be64_to_cpu(r2->rmap.rm_owner))
+		return be64_to_cpu(r1->rmap.rm_owner) <
+		       be64_to_cpu(r2->rmap.rm_owner);
+	return be64_to_cpu(r1->rmap.rm_offset) <=
+	       be64_to_cpu(r2->rmap.rm_offset);
 }
 #endif	/* DEBUG */
 
diff --git a/fs/xfs/libxfs/xfs_rmap_btree.h b/fs/xfs/libxfs/xfs_rmap_btree.h
index 2e02362..a5c97f8 100644
--- a/fs/xfs/libxfs/xfs_rmap_btree.h
+++ b/fs/xfs/libxfs/xfs_rmap_btree.h
@@ -51,6 +51,13 @@ struct xfs_btree_cur *xfs_rmapbt_init_cursor(struct xfs_mount *mp,
 				xfs_agnumber_t agno);
 int xfs_rmapbt_maxrecs(struct xfs_mount *mp, int blocklen, int leaf);
 
+int xfs_rmap_lookup_le(struct xfs_btree_cur *cur, xfs_agblock_t bno,
+		xfs_extlen_t len, uint64_t owner, uint64_t offset, int *stat);
+int xfs_rmap_lookup_eq(struct xfs_btree_cur *cur, xfs_agblock_t bno,
+		xfs_extlen_t len, uint64_t owner, uint64_t offset, int *stat);
+int xfs_rmap_get_rec(struct xfs_btree_cur *cur, struct xfs_rmap_irec *irec,
+		int *stat);
+
 int xfs_rmap_alloc(struct xfs_trans *tp, struct xfs_buf *agbp,
 		   xfs_agnumber_t agno, xfs_agblock_t bno, xfs_extlen_t len,
 		   struct xfs_owner_info *oinfo);
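
The difference between the "less than or equal" and "exact match" lookups
can be illustrated on a flat sorted array. This is only a sketch under
stated assumptions: the real lookups walk a multi-level btree through a
cursor and report their result through *stat, whereas the hypothetical
helpers below (rmap_array_lookup_le, rmap_array_lookup_eq, struct rmap_rec)
return an array index or -1. The extent length is left out of the search
key because the key described in this patch is [agblk, owner, offset].

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical in-core record, sorted by [startblock, owner, offset]. */
struct rmap_rec {
	uint32_t	startblock;
	uint32_t	blockcount;
	uint64_t	owner;
	uint64_t	offset;
};

/* Compare a record's key fields against a search key: <0, 0 or >0. */
static int rec_key_cmp(const struct rmap_rec *r, uint32_t bno,
		       uint64_t owner, uint64_t offset)
{
	if (r->startblock != bno)
		return r->startblock < bno ? -1 : 1;
	if (r->owner != owner)
		return r->owner < owner ? -1 : 1;
	if (r->offset != offset)
		return r->offset < offset ? -1 : 1;
	return 0;
}

/* Index of the last record whose key is <= the search key, or -1. */
long rmap_array_lookup_le(const struct rmap_rec *recs, size_t nr,
			  uint32_t bno, uint64_t owner, uint64_t offset)
{
	size_t	lo = 0, hi = nr;

	/* Invariant: recs[0..lo) are <= key, recs[hi..nr) are > key. */
	while (lo < hi) {
		size_t	mid = lo + (hi - lo) / 2;

		if (rec_key_cmp(&recs[mid], bno, owner, offset) <= 0)
			lo = mid + 1;
		else
			hi = mid;
	}
	return (long)lo - 1;
}

/* Index of the record whose key matches exactly, or -1. */
long rmap_array_lookup_eq(const struct rmap_rec *recs, size_t nr,
			  uint32_t bno, uint64_t owner, uint64_t offset)
{
	long	i = rmap_array_lookup_le(recs, nr, bno, owner, offset);

	if (i >= 0 && rec_key_cmp(&recs[i], bno, owner, offset) == 0)
		return i;
	return -1;
}
```

With two records at the same start block but different owners, the exact
lookup tells them apart, while the "less than or equal" lookup returns the
nearest record at or before the requested key.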

