All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0/3] clean up speculative preallocation tracking and tagging
@ 2016-11-22 13:00 Brian Foster
  2016-11-22 13:01 ` [PATCH 1/3] xfs: track preallocation separately in xfs_bmapi_reserve_delalloc() Brian Foster
                   ` (3 more replies)
  0 siblings, 4 replies; 5+ messages in thread
From: Brian Foster @ 2016-11-22 13:00 UTC (permalink / raw)
  To: linux-xfs

Hi all,

These are a few patches that resulted from the following discussion with
regard to how to tag reflink inodes correctly for COW fork
preallocation:

  http://www.spinics.net/lists/linux-xfs/msg02155.html

The problem with the current code is that xfs_reflink_reserve_cow()
doesn't consider preallocation due to start offset alignment. The
problem with the first pass patch above is that the same function
doesn't distinguish between extent preallocation and extent merge.

The solution in this series pushes down the incorporation of
preallocation to the bmapi call, where it already has the additional
context to identify whether an extent allocation was widened due to
cowextszhint alignment. The callers are still responsible for defining
how much to preallocate, to throttle, retry if necessary, etc.

Note that this series is based on top of Christoph's recent extent
lookup cleanup patches. This survives xfstests for me on a reflink=1 fs.
Thoughts, reviews, flames appreciated.

Brian

Brian Foster (3):
  xfs: track preallocation separately in xfs_bmapi_reserve_delalloc()
  xfs: clean up cow fork reservation and tag inodes correctly
  xfs: pass post-eof speculative prealloc blocks to bmapi

 fs/xfs/libxfs/xfs_bmap.c | 23 +++++++++++++++++++++--
 fs/xfs/libxfs/xfs_bmap.h |  2 +-
 fs/xfs/xfs_iomap.c       | 33 +++++++++++++--------------------
 fs/xfs/xfs_reflink.c     | 29 +++--------------------------
 4 files changed, 38 insertions(+), 49 deletions(-)

-- 
2.7.4


^ permalink raw reply	[flat|nested] 5+ messages in thread

* [PATCH 1/3] xfs: track preallocation separately in xfs_bmapi_reserve_delalloc()
  2016-11-22 13:00 [PATCH 0/3] clean up speculative preallocation tracking and tagging Brian Foster
@ 2016-11-22 13:01 ` Brian Foster
  2016-11-22 13:01 ` [PATCH 2/3] xfs: clean up cow fork reservation and tag inodes correctly Brian Foster
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 5+ messages in thread
From: Brian Foster @ 2016-11-22 13:01 UTC (permalink / raw)
  To: linux-xfs

Speculative preallocation is currently processed entirely by the callers
of xfs_bmapi_reserve_delalloc(). The caller determines how much
preallocation to include, adjusts the extent length and passes down the
resulting request.

While this works fine for post-eof speculative preallocation, it is not
as reliable for COW fork preallocation. COW fork preallocation is
implemented via the cowextszhint, which aligns the start offset as well
as the length of the extent. Further, it is difficult for the caller to
accurately identify when preallocation occurs because the returned
extent could have been merged with neighboring extents in the fork.

To simplify this situation and facilitate further COW fork preallocation
enhancements, update xfs_bmapi_reserve_delalloc() to take a separate
preallocation parameter to incorporate into the allocation request. The
preallocation blocks value is tacked onto the end of the request and
adjusted to accommodate neighboring extents and extent size limits.
Since xfs_bmapi_reserve_delalloc() now knows precisely how much
preallocation was included in the allocation, it can also tag the inodes
appropriately to support preallocation reclaim.

Note that xfs_bmapi_reserve_delalloc() callers are not yet updated to
use the preallocation mechanism. This patch should not change behavior
outside of correctly tagging reflink inodes when start offset
preallocation occurs (which the caller does not handle correctly).

Signed-off-by: Brian Foster <bfoster@redhat.com>
---
 fs/xfs/libxfs/xfs_bmap.c | 23 +++++++++++++++++++++--
 fs/xfs/libxfs/xfs_bmap.h |  2 +-
 fs/xfs/xfs_iomap.c       |  2 +-
 fs/xfs/xfs_reflink.c     |  2 +-
 4 files changed, 24 insertions(+), 5 deletions(-)

diff --git a/fs/xfs/libxfs/xfs_bmap.c b/fs/xfs/libxfs/xfs_bmap.c
index 3ab68db..d17fe6a 100644
--- a/fs/xfs/libxfs/xfs_bmap.c
+++ b/fs/xfs/libxfs/xfs_bmap.c
@@ -49,6 +49,7 @@
 #include "xfs_rmap.h"
 #include "xfs_ag_resv.h"
 #include "xfs_refcount.h"
+#include "xfs_icache.h"
 
 
 kmem_zone_t		*xfs_bmap_free_item_zone;
@@ -4141,8 +4142,9 @@ int
 xfs_bmapi_reserve_delalloc(
 	struct xfs_inode	*ip,
 	int			whichfork,
-	xfs_fileoff_t		aoff,
+	xfs_fileoff_t		off,
 	xfs_filblks_t		len,
+	xfs_filblks_t		prealloc,
 	struct xfs_bmbt_irec	*got,
 	xfs_extnum_t		*lastx,
 	int			eof)
@@ -4154,10 +4156,17 @@ xfs_bmapi_reserve_delalloc(
 	char			rt = XFS_IS_REALTIME_INODE(ip);
 	xfs_extlen_t		extsz;
 	int			error;
+	xfs_fileoff_t		aoff = off;
 
-	alen = XFS_FILBLKS_MIN(len, MAXEXTLEN);
+	/*
+	 * Cap the alloc length. Keep track of prealloc so we know whether to
+	 * tag the inode before we return.
+	 */
+	alen = XFS_FILBLKS_MIN(len + prealloc, MAXEXTLEN);
 	if (!eof)
 		alen = XFS_FILBLKS_MIN(alen, got->br_startoff - aoff);
+	if (prealloc && alen >= len)
+		prealloc = alen - len;
 
 	/* Figure out the extent size, adjust alen */
 	if (whichfork == XFS_COW_FORK)
@@ -4223,6 +4232,16 @@ xfs_bmapi_reserve_delalloc(
 	 */
 	xfs_bmbt_get_all(xfs_iext_get_ext(ifp, *lastx), got);
 
+	/*
+	 * Tag the inode if blocks were preallocated. Note that COW fork
+	 * preallocation can occur at the start or end of the extent, even when
+	 * prealloc == 0, so we must also check the aligned offset and length.
+	 */
+	if (whichfork == XFS_DATA_FORK && prealloc)
+		xfs_inode_set_eofblocks_tag(ip);
+	if (whichfork == XFS_COW_FORK && (prealloc || aoff < off || alen > len))
+		xfs_inode_set_cowblocks_tag(ip);
+
 	ASSERT(got->br_startoff <= aoff);
 	ASSERT(got->br_startoff + got->br_blockcount >= aoff + alen);
 	ASSERT(isnullstartblock(got->br_startblock));
diff --git a/fs/xfs/libxfs/xfs_bmap.h b/fs/xfs/libxfs/xfs_bmap.h
index ffed1f9..cecd094 100644
--- a/fs/xfs/libxfs/xfs_bmap.h
+++ b/fs/xfs/libxfs/xfs_bmap.h
@@ -238,7 +238,7 @@ int	xfs_bmap_shift_extents(struct xfs_trans *tp, struct xfs_inode *ip,
 		int num_exts);
 int	xfs_bmap_split_extent(struct xfs_inode *ip, xfs_fileoff_t split_offset);
 int	xfs_bmapi_reserve_delalloc(struct xfs_inode *ip, int whichfork,
-		xfs_fileoff_t aoff, xfs_filblks_t len,
+		xfs_fileoff_t off, xfs_filblks_t len, xfs_filblks_t prealloc,
 		struct xfs_bmbt_irec *got, xfs_extnum_t *lastx, int eof);
 
 enum xfs_bmap_intent_type {
diff --git a/fs/xfs/xfs_iomap.c b/fs/xfs/xfs_iomap.c
index 05eedfb..4062727 100644
--- a/fs/xfs/xfs_iomap.c
+++ b/fs/xfs/xfs_iomap.c
@@ -620,7 +620,7 @@ xfs_file_iomap_begin_delay(
 
 retry:
 	error = xfs_bmapi_reserve_delalloc(ip, XFS_DATA_FORK, offset_fsb,
-			end_fsb - offset_fsb, &got, &idx, eof);
+			end_fsb - offset_fsb, 0, &got, &idx, eof);
 	switch (error) {
 	case 0:
 		break;
diff --git a/fs/xfs/xfs_reflink.c b/fs/xfs/xfs_reflink.c
index 4b024e7..812c632 100644
--- a/fs/xfs/xfs_reflink.c
+++ b/fs/xfs/xfs_reflink.c
@@ -295,7 +295,7 @@ xfs_reflink_reserve_cow(
 
 retry:
 	error = xfs_bmapi_reserve_delalloc(ip, XFS_COW_FORK, imap->br_startoff,
-			end_fsb - imap->br_startoff, &got, &idx, eof);
+			end_fsb - imap->br_startoff, 0, &got, &idx, eof);
 	switch (error) {
 	case 0:
 		break;
-- 
2.7.4


^ permalink raw reply related	[flat|nested] 5+ messages in thread

* [PATCH 2/3] xfs: clean up cow fork reservation and tag inodes correctly
  2016-11-22 13:00 [PATCH 0/3] clean up speculative preallocation tracking and tagging Brian Foster
  2016-11-22 13:01 ` [PATCH 1/3] xfs: track preallocation separately in xfs_bmapi_reserve_delalloc() Brian Foster
@ 2016-11-22 13:01 ` Brian Foster
  2016-11-22 13:01 ` [PATCH 3/3] xfs: pass post-eof speculative prealloc blocks to bmapi Brian Foster
  2016-11-28  4:01 ` [PATCH 0/3] clean up speculative preallocation tracking and tagging Dave Chinner
  3 siblings, 0 replies; 5+ messages in thread
From: Brian Foster @ 2016-11-22 13:01 UTC (permalink / raw)
  To: linux-xfs

COW fork reservation is implemented via delayed allocation. The code is
modeled after the traditional delalloc allocation code, but is slightly
different in terms of how preallocation occurs. Rather than post-eof
speculative preallocation, COW fork preallocation is implemented via a
COW extent size hint that is designed to minimize fragmentation as a
reflinked file is split over time.

xfs_reflink_reserve_cow() still uses logic that is oriented towards
dealing with post-eof speculative preallocation, however, and is stale
or not necessarily correct. First, the EOF alignment to the COW extent
size hint is implemented in xfs_bmapi_reserve_delalloc() (which does so
correctly by aligning the start and end offsets) and so is not necessary
in xfs_reflink_reserve_cow(). The backoff and retry logic on ENOSPC is
also ineffective for the same reason, as xfs_bmapi_reserve_delalloc()
will simply perform the same allocation request on the retry. Finally,
since the COW extent size hint aligns the start and end offset of the
range to allocate, the end_fsb != orig_end_fsb logic is not sufficient.
Indeed, if a write request happens to end on an aligned offset, it is
possible that we do not tag the inode for COW preallocation even though
xfs_bmapi_reserve_delalloc() may have preallocated at the start offset.

Kill the unnecessary, duplicate code in xfs_reflink_reserve_cow().
Remove the inode tag logic as well since xfs_bmapi_reserve_delalloc()
has been updated to tag the inode correctly.

Signed-off-by: Brian Foster <bfoster@redhat.com>
---
 fs/xfs/xfs_reflink.c | 29 +++--------------------------
 1 file changed, 3 insertions(+), 26 deletions(-)

diff --git a/fs/xfs/xfs_reflink.c b/fs/xfs/xfs_reflink.c
index 812c632..73675459 100644
--- a/fs/xfs/xfs_reflink.c
+++ b/fs/xfs/xfs_reflink.c
@@ -245,11 +245,9 @@ xfs_reflink_reserve_cow(
 {
 	struct xfs_ifork	*ifp = XFS_IFORK_PTR(ip, XFS_COW_FORK);
 	struct xfs_bmbt_irec	got;
-	xfs_fileoff_t		end_fsb, orig_end_fsb;
 	int			error = 0;
 	bool			eof = false, trimmed;
 	xfs_extnum_t		idx;
-	xfs_extlen_t		align;
 
 	/*
 	 * Search the COW fork extent list first.  This serves two purposes:
@@ -287,33 +285,12 @@ xfs_reflink_reserve_cow(
 	if (error)
 		return error;
 
-	end_fsb = orig_end_fsb = imap->br_startoff + imap->br_blockcount;
-
-	align = xfs_eof_alignment(ip, xfs_get_cowextsz_hint(ip));
-	if (align)
-		end_fsb = roundup_64(end_fsb, align);
-
-retry:
 	error = xfs_bmapi_reserve_delalloc(ip, XFS_COW_FORK, imap->br_startoff,
-			end_fsb - imap->br_startoff, 0, &got, &idx, eof);
-	switch (error) {
-	case 0:
-		break;
-	case -ENOSPC:
-	case -EDQUOT:
-		/* retry without any preallocation */
+			imap->br_blockcount, 0, &got, &idx, eof);
+	if (error == -ENOSPC || error == -EDQUOT)
 		trace_xfs_reflink_cow_enospc(ip, imap);
-		if (end_fsb != orig_end_fsb) {
-			end_fsb = orig_end_fsb;
-			goto retry;
-		}
-		/*FALLTHRU*/
-	default:
+	if (error)
 		return error;
-	}
-
-	if (end_fsb != orig_end_fsb)
-		xfs_inode_set_cowblocks_tag(ip);
 
 	trace_xfs_reflink_cow_alloc(ip, &got);
 	return 0;
-- 
2.7.4


^ permalink raw reply related	[flat|nested] 5+ messages in thread

* [PATCH 3/3] xfs: pass post-eof speculative prealloc blocks to bmapi
  2016-11-22 13:00 [PATCH 0/3] clean up speculative preallocation tracking and tagging Brian Foster
  2016-11-22 13:01 ` [PATCH 1/3] xfs: track preallocation separately in xfs_bmapi_reserve_delalloc() Brian Foster
  2016-11-22 13:01 ` [PATCH 2/3] xfs: clean up cow fork reservation and tag inodes correctly Brian Foster
@ 2016-11-22 13:01 ` Brian Foster
  2016-11-28  4:01 ` [PATCH 0/3] clean up speculative preallocation tracking and tagging Dave Chinner
  3 siblings, 0 replies; 5+ messages in thread
From: Brian Foster @ 2016-11-22 13:01 UTC (permalink / raw)
  To: linux-xfs

xfs_file_iomap_begin_delay() implements post-eof speculative
preallocation by extending the block count of the requested delayed
allocation. Now that xfs_bmapi_reserve_delalloc() has been updated to
handle prealloc blocks separately and tag the inode, update
xfs_file_iomap_begin_delay() to use the new parameter and rely on the
former to tag the inode.

Note that this patch does not change behavior.

Signed-off-by: Brian Foster <bfoster@redhat.com>
---
 fs/xfs/xfs_iomap.c | 33 +++++++++++++--------------------
 1 file changed, 13 insertions(+), 20 deletions(-)

diff --git a/fs/xfs/xfs_iomap.c b/fs/xfs/xfs_iomap.c
index 4062727..4f46f49 100644
--- a/fs/xfs/xfs_iomap.c
+++ b/fs/xfs/xfs_iomap.c
@@ -536,10 +536,11 @@ xfs_file_iomap_begin_delay(
 	xfs_fileoff_t		offset_fsb = XFS_B_TO_FSBT(mp, offset);
 	xfs_fileoff_t		maxbytes_fsb =
 		XFS_B_TO_FSB(mp, mp->m_super->s_maxbytes);
-	xfs_fileoff_t		end_fsb, orig_end_fsb;
+	xfs_fileoff_t		end_fsb;
 	int			error = 0, eof = 0;
 	struct xfs_bmbt_irec	got;
 	xfs_extnum_t		idx;
+	xfs_fsblock_t		prealloc_blocks = 0;
 
 	ASSERT(!XFS_IS_REALTIME_INODE(ip));
 	ASSERT(!xfs_get_extsz_hint(ip));
@@ -594,33 +595,32 @@ xfs_file_iomap_begin_delay(
 	 * the lower level functions are updated.
 	 */
 	count = min_t(loff_t, count, 1024 * PAGE_SIZE);
-	end_fsb = orig_end_fsb =
-		min(XFS_B_TO_FSB(mp, offset + count), maxbytes_fsb);
+	end_fsb = min(XFS_B_TO_FSB(mp, offset + count), maxbytes_fsb);
 
 	if (eof) {
-		xfs_fsblock_t	prealloc_blocks;
-
 		prealloc_blocks = xfs_iomap_prealloc_size(ip, offset, count, idx);
 		if (prealloc_blocks) {
 			xfs_extlen_t	align;
 			xfs_off_t	end_offset;
+			xfs_fileoff_t	p_end_fsb;
 
 			end_offset = XFS_WRITEIO_ALIGN(mp, offset + count - 1);
-			end_fsb = XFS_B_TO_FSBT(mp, end_offset) +
-				prealloc_blocks;
+			p_end_fsb = XFS_B_TO_FSBT(mp, end_offset) +
+					prealloc_blocks;
 
 			align = xfs_eof_alignment(ip, 0);
 			if (align)
-				end_fsb = roundup_64(end_fsb, align);
+				p_end_fsb = roundup_64(p_end_fsb, align);
 
-			end_fsb = min(end_fsb, maxbytes_fsb);
-			ASSERT(end_fsb > offset_fsb);
+			p_end_fsb = min(p_end_fsb, maxbytes_fsb);
+			ASSERT(p_end_fsb > offset_fsb);
+			prealloc_blocks = p_end_fsb - end_fsb;
 		}
 	}
 
 retry:
 	error = xfs_bmapi_reserve_delalloc(ip, XFS_DATA_FORK, offset_fsb,
-			end_fsb - offset_fsb, 0, &got, &idx, eof);
+			end_fsb - offset_fsb, prealloc_blocks, &got, &idx, eof);
 	switch (error) {
 	case 0:
 		break;
@@ -628,8 +628,8 @@ xfs_file_iomap_begin_delay(
 	case -EDQUOT:
 		/* retry without any preallocation */
 		trace_xfs_delalloc_enospc(ip, offset, count);
-		if (end_fsb != orig_end_fsb) {
-			end_fsb = orig_end_fsb;
+		if (prealloc_blocks) {
+			prealloc_blocks = 0;
 			goto retry;
 		}
 		/*FALLTHRU*/
@@ -637,13 +637,6 @@ xfs_file_iomap_begin_delay(
 		goto out_unlock;
 	}
 
-	/*
-	 * Tag the inode as speculatively preallocated so we can reclaim this
-	 * space on demand, if necessary.
-	 */
-	if (end_fsb != orig_end_fsb)
-		xfs_inode_set_eofblocks_tag(ip);
-
 	trace_xfs_iomap_alloc(ip, offset, count, 0, &got);
 done:
 	if (isnullstartblock(got.br_startblock))
-- 
2.7.4


^ permalink raw reply related	[flat|nested] 5+ messages in thread

* Re: [PATCH 0/3] clean up speculative preallocation tracking and tagging
  2016-11-22 13:00 [PATCH 0/3] clean up speculative preallocation tracking and tagging Brian Foster
                   ` (2 preceding siblings ...)
  2016-11-22 13:01 ` [PATCH 3/3] xfs: pass post-eof speculative prealloc blocks to bmapi Brian Foster
@ 2016-11-28  4:01 ` Dave Chinner
  3 siblings, 0 replies; 5+ messages in thread
From: Dave Chinner @ 2016-11-28  4:01 UTC (permalink / raw)
  To: Brian Foster; +Cc: linux-xfs

On Tue, Nov 22, 2016 at 08:00:59AM -0500, Brian Foster wrote:
> Hi all,
> 
> These are a few patches that resulted from the following discussion with
> regard to how to tag reflink inodes correctly for COW fork
> preallocation:
> 
>   http://www.spinics.net/lists/linux-xfs/msg02155.html
> 
> The problem with the current code is that xfs_reflink_reserve_cow()
> doesn't consider preallocation due to start offset alignment. The
> problem with the first pass patch above is that the same function
> doesn't distinguish between extent preallocation and extent merge.
> 
> The solution in this series pushes down the incorporation of
> preallocation to the bmapi call, where it already has the additional
> context to identify whether an extent allocation was widened due to
> cowextszhint alignment. The callers are still responsible for defining
> how much to preallocate, to throttle, retry if necessary, etc.
> 
> Note that this series is based on top of Christoph's recent extent
> lookup cleanup patches. This survives xfstests for me on a reflink=1 fs.
> Thoughts, reviews, flames appreciated.

All in all looks like a nice little cleanup. Tests out just fine
here, too, so consider it:

Reviewed-by: Dave Chinner <dchinner@redhat.com>

Cheers,

Dave.

-- 
Dave Chinner
david@fromorbit.com

^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2016-11-28  4:01 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-11-22 13:00 [PATCH 0/3] clean up speculative preallocation tracking and tagging Brian Foster
2016-11-22 13:01 ` [PATCH 1/3] xfs: track preallocation separately in xfs_bmapi_reserve_delalloc() Brian Foster
2016-11-22 13:01 ` [PATCH 2/3] xfs: clean up cow fork reservation and tag inodes correctly Brian Foster
2016-11-22 13:01 ` [PATCH 3/3] xfs: pass post-eof speculative prealloc blocks to bmapi Brian Foster
2016-11-28  4:01 ` [PATCH 0/3] clean up speculative preallocation tracking and tagging Dave Chinner

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.