All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH RFC 0/4] xfs: basic cow fork speculative preallocation
@ 2016-11-08 20:27 Brian Foster
  2016-11-08 20:27 ` [PATCH RFC 1/4] xfs: clean up cow fork reservation and tag inodes correctly Brian Foster
                   ` (4 more replies)
  0 siblings, 5 replies; 18+ messages in thread
From: Brian Foster @ 2016-11-08 20:27 UTC (permalink / raw)
  To: linux-xfs

Hi all,

This is an experiment based on an idea for COW fork speculative
preallocation. This is experimental, lightly/barely tested and sent in
RFC form to solicit thoughts, ideas or flames before I spend time taking
it further.

Patch 1 probably stands on its own. Patches 2 and 3 are some refactoring
and patch 4 implements the basic idea, which is described in the commit
log description. The testing I've done so far is basically similar to
how one would test the effects of traditional speculative preallocation:
write to multiple reflinked files in parallel and examine the resulting
fragmentation. Specifically, I wrote sequentially to 16 different
reflinked files of the same 8GB original (which has two data extents,
completely shared). Without preallocation, the test results in ~248
extents across the 16 files. With preallocation, the test results in 32
extents across the 16 files (i.e., 2 extents per file, same as the
source file).

An obvious tradeoff is the unnecessarily aggressive allocation that
might occur in the event of random writes to a large file (such as in
the cloned VM disk image use case), but my thinking is that the
cowblocks tagging and reclaim infrastructure should manage that
sufficiently (lack of testing notwithstanding). In any event, I'm
interested in any thoughts along the lines of whether this is useful at
all, alternative algorithm ideas, etc.

Brian

Brian Foster (4):
  xfs: clean up cow fork reservation and tag inodes correctly
  xfs: logically separate iomap range from allocation range
  xfs: reuse xfs_file_iomap_begin_delay() for cow fork delalloc
  xfs: implement basic COW fork speculative preallocation

 fs/xfs/xfs_iomap.c   | 132 +++++++++++++++++++++++++++++++++++++++++----------
 fs/xfs/xfs_reflink.c |  28 ++---------
 2 files changed, 111 insertions(+), 49 deletions(-)

-- 
2.7.4


^ permalink raw reply	[flat|nested] 18+ messages in thread

* [PATCH RFC 1/4] xfs: clean up cow fork reservation and tag inodes correctly
  2016-11-08 20:27 [PATCH RFC 0/4] xfs: basic cow fork speculative preallocation Brian Foster
@ 2016-11-08 20:27 ` Brian Foster
  2016-11-15 14:16   ` Christoph Hellwig
  2016-11-08 20:27 ` [PATCH RFC 2/4] xfs: logically separate iomap range from allocation range Brian Foster
                   ` (3 subsequent siblings)
  4 siblings, 1 reply; 18+ messages in thread
From: Brian Foster @ 2016-11-08 20:27 UTC (permalink / raw)
  To: linux-xfs

COW fork reservation is implemented via delayed allocation. The code is
modeled after the traditional delalloc allocation code, but is slightly
different in terms of how preallocation occurs. Rather than post-eof
speculative preallocation, COW fork preallocation is implemented via a
COW extent size hint that is designed to minimize fragmentation as a
reflinked file is split over time.

xfs_reflink_reserve_cow() still uses logic that is oriented towards
dealing with post-eof speculative preallocation, however, and is stale
or not necessarily correct. First, the EOF alignment to the COW extent
size hint is implemented in xfs_bmapi_reserve_delalloc() (which does so
correctly by aligning the start and end offsets) and so is not necessary
in xfs_reflink_reserve_cow(). The backoff and retry logic on ENOSPC is
also ineffective for the same reason, as xfs_bmapi_reserve_delalloc()
will simply perform the same allocation request on the retry. Finally,
since the COW extent size hint aligns the start and end offset of the
range to allocate, the end_fsb != orig_end_fsb logic is not sufficient.
Indeed, if a write request happens to end on an aligned offset, it is
possible that we do not tag the inode for COW preallocation even though
xfs_bmapi_reserve_delalloc() may have preallocated at the start offset.

Kill the unnecessary, duplicate code in xfs_reflink_reserve_cow() and
update the preallocation tag logic to check whether the bmap allocation
actually matches the range that was requested.

Signed-off-by: Brian Foster <bfoster@redhat.com>
---
 fs/xfs/xfs_reflink.c | 28 ++++------------------------
 1 file changed, 4 insertions(+), 24 deletions(-)

diff --git a/fs/xfs/xfs_reflink.c b/fs/xfs/xfs_reflink.c
index a279b4e..b19a7e0 100644
--- a/fs/xfs/xfs_reflink.c
+++ b/fs/xfs/xfs_reflink.c
@@ -244,11 +244,9 @@ xfs_reflink_reserve_cow(
 	bool			*shared)
 {
 	struct xfs_bmbt_irec	got, prev;
-	xfs_fileoff_t		end_fsb, orig_end_fsb;
 	int			eof = 0, error = 0;
 	bool			trimmed;
 	xfs_extnum_t		idx;
-	xfs_extlen_t		align;
 
 	/*
 	 * Search the COW fork extent list first.  This serves two purposes:
@@ -285,32 +283,14 @@ xfs_reflink_reserve_cow(
 	if (error)
 		return error;
 
-	end_fsb = orig_end_fsb = imap->br_startoff + imap->br_blockcount;
 
-	align = xfs_eof_alignment(ip, xfs_get_cowextsz_hint(ip));
-	if (align)
-		end_fsb = roundup_64(end_fsb, align);
-
-retry:
 	error = xfs_bmapi_reserve_delalloc(ip, XFS_COW_FORK, imap->br_startoff,
-			end_fsb - imap->br_startoff, &got, &prev, &idx, eof);
-	switch (error) {
-	case 0:
-		break;
-	case -ENOSPC:
-	case -EDQUOT:
-		/* retry without any preallocation */
-		trace_xfs_reflink_cow_enospc(ip, imap);
-		if (end_fsb != orig_end_fsb) {
-			end_fsb = orig_end_fsb;
-			goto retry;
-		}
-		/*FALLTHRU*/
-	default:
+			imap->br_blockcount, &got, &prev, &idx, eof);
+	if (error)
 		return error;
-	}
 
-	if (end_fsb != orig_end_fsb)
+	if (imap->br_startoff != got.br_startoff ||
+	    imap->br_blockcount != got.br_blockcount)
 		xfs_inode_set_cowblocks_tag(ip);
 
 	trace_xfs_reflink_cow_alloc(ip, &got);
-- 
2.7.4


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH RFC 2/4] xfs: logically separate iomap range from allocation range
  2016-11-08 20:27 [PATCH RFC 0/4] xfs: basic cow fork speculative preallocation Brian Foster
  2016-11-08 20:27 ` [PATCH RFC 1/4] xfs: clean up cow fork reservation and tag inodes correctly Brian Foster
@ 2016-11-08 20:27 ` Brian Foster
  2016-11-15 14:18   ` Christoph Hellwig
  2016-11-08 20:27 ` [PATCH RFC 3/4] xfs: reuse xfs_file_iomap_begin_delay() for cow fork delalloc Brian Foster
                   ` (2 subsequent siblings)
  4 siblings, 1 reply; 18+ messages in thread
From: Brian Foster @ 2016-11-08 20:27 UTC (permalink / raw)
  To: linux-xfs

The xfs_file_iomap_begin_delay() function currently converts the bmbt
record output from the xfs_bmapi_reserve_delalloc() call to the iomap
mapping for the higher level iomap code. In preparation to reuse
xfs_file_iomap_begin_delay() for data fork and COW fork allocation,
logically separate the iomap mapping provided to the caller from the
bmbt record returned by xfs_bmapi_reserve_delalloc().

This is necessary because while COW reservation involves delalloc
allocation to the COW fork, the mapping returned to the caller must
still refer to the shared blocks from the data fork. Note that this
patch does not change behavior in any way.

Signed-off-by: Brian Foster <bfoster@redhat.com>
---
 fs/xfs/xfs_iomap.c | 19 +++++++++++--------
 1 file changed, 11 insertions(+), 8 deletions(-)

diff --git a/fs/xfs/xfs_iomap.c b/fs/xfs/xfs_iomap.c
index 193aee4..7446531 100644
--- a/fs/xfs/xfs_iomap.c
+++ b/fs/xfs/xfs_iomap.c
@@ -539,6 +539,7 @@ xfs_file_iomap_begin_delay(
 	int			error = 0, eof = 0;
 	struct xfs_bmbt_irec	got;
 	struct xfs_bmbt_irec	prev;
+	struct xfs_bmbt_irec	imap;	/* for iomap */
 	xfs_extnum_t		idx;
 
 	ASSERT(!XFS_IS_REALTIME_INODE(ip));
@@ -565,19 +566,20 @@ xfs_file_iomap_begin_delay(
 
 	xfs_bmap_search_extents(ip, offset_fsb, XFS_DATA_FORK, &eof, &idx,
 			&got, &prev);
+	imap = got;
 	if (!eof && got.br_startoff <= offset_fsb) {
 		if (xfs_is_reflink_inode(ip)) {
 			bool		shared;
 
 			end_fsb = min(XFS_B_TO_FSB(mp, offset + count),
 					maxbytes_fsb);
-			xfs_trim_extent(&got, offset_fsb, end_fsb - offset_fsb);
-			error = xfs_reflink_reserve_cow(ip, &got, &shared);
+			xfs_trim_extent(&imap, offset_fsb, end_fsb - offset_fsb);
+			error = xfs_reflink_reserve_cow(ip, &imap, &shared);
 			if (error)
 				goto out_unlock;
 		}
 
-		trace_xfs_iomap_found(ip, offset, count, 0, &got);
+		trace_xfs_iomap_found(ip, offset, count, 0, &imap);
 		goto done;
 	}
 
@@ -648,17 +650,18 @@ xfs_file_iomap_begin_delay(
 		xfs_inode_set_eofblocks_tag(ip);
 
 	trace_xfs_iomap_alloc(ip, offset, count, 0, &got);
+	imap = got;
 done:
-	if (isnullstartblock(got.br_startblock))
-		got.br_startblock = DELAYSTARTBLOCK;
+	if (isnullstartblock(imap.br_startblock))
+		imap.br_startblock = DELAYSTARTBLOCK;
 
-	if (!got.br_startblock) {
-		error = xfs_alert_fsblock_zero(ip, &got);
+	if (!imap.br_startblock) {
+		error = xfs_alert_fsblock_zero(ip, &imap);
 		if (error)
 			goto out_unlock;
 	}
 
-	xfs_bmbt_to_iomap(ip, iomap, &got);
+	xfs_bmbt_to_iomap(ip, iomap, &imap);
 
 out_unlock:
 	xfs_iunlock(ip, XFS_ILOCK_EXCL);
-- 
2.7.4


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH RFC 3/4] xfs: reuse xfs_file_iomap_begin_delay() for cow fork delalloc
  2016-11-08 20:27 [PATCH RFC 0/4] xfs: basic cow fork speculative preallocation Brian Foster
  2016-11-08 20:27 ` [PATCH RFC 1/4] xfs: clean up cow fork reservation and tag inodes correctly Brian Foster
  2016-11-08 20:27 ` [PATCH RFC 2/4] xfs: logically separate iomap range from allocation range Brian Foster
@ 2016-11-08 20:27 ` Brian Foster
  2016-11-15 14:28   ` Christoph Hellwig
  2016-11-08 20:27 ` [PATCH RFC 4/4] xfs: implement basic COW fork speculative preallocation Brian Foster
  2016-11-08 20:48 ` [PATCH RFC 0/4] xfs: basic cow " Darrick J. Wong
  4 siblings, 1 reply; 18+ messages in thread
From: Brian Foster @ 2016-11-08 20:27 UTC (permalink / raw)
  To: linux-xfs

COW fork reservation (delayed allocation) is implemented in
xfs_reflink_reserve_cow() and is generally based on the traditional data
fork delalloc logic in xfs_file_iomap_begin_delay(). In preparation for
further changes to implement more aggressive COW fork preallocation,
refactor the COW reservation code to reuse xfs_file_iomap_begin_delay()
for data fork allocation or COW fork reservation. This patch does not
change behavior.

Signed-off-by: Brian Foster <bfoster@redhat.com>
---
 fs/xfs/xfs_iomap.c | 79 +++++++++++++++++++++++++++++++++++++++---------------
 1 file changed, 58 insertions(+), 21 deletions(-)

diff --git a/fs/xfs/xfs_iomap.c b/fs/xfs/xfs_iomap.c
index 7446531..40bf66c 100644
--- a/fs/xfs/xfs_iomap.c
+++ b/fs/xfs/xfs_iomap.c
@@ -532,7 +532,7 @@ xfs_file_iomap_begin_delay(
 	struct xfs_inode	*ip = XFS_I(inode);
 	struct xfs_mount	*mp = ip->i_mount;
 	struct xfs_ifork	*ifp = XFS_IFORK_PTR(ip, XFS_DATA_FORK);
-	xfs_fileoff_t		offset_fsb = XFS_B_TO_FSBT(mp, offset);
+	xfs_fileoff_t		offset_fsb;
 	xfs_fileoff_t		maxbytes_fsb =
 		XFS_B_TO_FSB(mp, mp->m_super->s_maxbytes);
 	xfs_fileoff_t		end_fsb, orig_end_fsb;
@@ -541,10 +541,14 @@ xfs_file_iomap_begin_delay(
 	struct xfs_bmbt_irec	prev;
 	struct xfs_bmbt_irec	imap;	/* for iomap */
 	xfs_extnum_t		idx;
+	int			fork = XFS_DATA_FORK;
 
 	ASSERT(!XFS_IS_REALTIME_INODE(ip));
 	ASSERT(!xfs_get_extsz_hint(ip));
 
+	offset_fsb = XFS_B_TO_FSBT(mp, offset);
+	end_fsb = min(XFS_B_TO_FSB(mp, offset + count), maxbytes_fsb);
+
 	xfs_ilock(ip, XFS_ILOCK_EXCL);
 
 	if (unlikely(XFS_TEST_ERROR(
@@ -564,23 +568,50 @@ xfs_file_iomap_begin_delay(
 			goto out_unlock;
 	}
 
+	/*
+	 * Search for a preexisting extent. COW fork allocation may still be
+	 * required for reflink inodes if the data extent is shared.
+	 */
 	xfs_bmap_search_extents(ip, offset_fsb, XFS_DATA_FORK, &eof, &idx,
 			&got, &prev);
 	imap = got;
 	if (!eof && got.br_startoff <= offset_fsb) {
 		if (xfs_is_reflink_inode(ip)) {
-			bool		shared;
+			bool		shared, trimmed;
 
-			end_fsb = min(XFS_B_TO_FSB(mp, offset + count),
-					maxbytes_fsb);
-			xfs_trim_extent(&imap, offset_fsb, end_fsb - offset_fsb);
-			error = xfs_reflink_reserve_cow(ip, &imap, &shared);
+			/*
+			 * Assume the data extent is shared if an extent exists
+			 * in the cow fork.
+			 */
+			xfs_trim_extent(&imap, offset_fsb,
+					end_fsb - offset_fsb);
+			xfs_bmap_search_extents(ip, imap.br_startoff,
+					XFS_COW_FORK, &eof, &idx, &got, &prev);
+			if (!eof && got.br_startoff <= imap.br_startoff) {
+				trace_xfs_reflink_cow_found(ip, &got);
+				xfs_trim_extent(&imap, got.br_startoff,
+						got.br_blockcount);
+				goto done;
+			}
+
+			/*
+			 * No existing cow fork extent. Now we have to actually
+			 * check if the data extent is shared and trim the
+			 * mapping to the next (un)shared boundary.
+			 */
+			error = xfs_reflink_trim_around_shared(ip, &imap,
+						       &shared, &trimmed);
 			if (error)
 				goto out_unlock;
+			if (!shared)
+				goto done;
+
+			end_fsb = imap.br_startoff + imap.br_blockcount;
+			fork = XFS_COW_FORK;
+		} else {
+			trace_xfs_iomap_found(ip, offset, count, 0, &imap);
+			goto done;
 		}
-
-		trace_xfs_iomap_found(ip, offset, count, 0, &imap);
-		goto done;
 	}
 
 	error = xfs_qm_dqattach_locked(ip, 0);
@@ -597,10 +628,10 @@ xfs_file_iomap_begin_delay(
 	 * the lower level functions are updated.
 	 */
 	count = min_t(loff_t, count, 1024 * PAGE_SIZE);
-	end_fsb = orig_end_fsb =
-		min(XFS_B_TO_FSB(mp, offset + count), maxbytes_fsb);
+	end_fsb = orig_end_fsb = min(XFS_B_TO_FSB(mp, offset + count), end_fsb);
+	xfs_trim_extent(&imap, offset_fsb, end_fsb - offset_fsb);
 
-	if (eof) {
+	if (eof && fork == XFS_DATA_FORK) {
 		xfs_fsblock_t	prealloc_blocks;
 
 		prealloc_blocks =
@@ -623,9 +654,8 @@ xfs_file_iomap_begin_delay(
 	}
 
 retry:
-	error = xfs_bmapi_reserve_delalloc(ip, XFS_DATA_FORK, offset_fsb,
-			end_fsb - offset_fsb, &got,
-			&prev, &idx, eof);
+	error = xfs_bmapi_reserve_delalloc(ip, fork, offset_fsb,
+			end_fsb - offset_fsb, &got, &prev, &idx, eof);
 	switch (error) {
 	case 0:
 		break;
@@ -643,14 +673,21 @@ xfs_file_iomap_begin_delay(
 	}
 
 	/*
-	 * Tag the inode as speculatively preallocated so we can reclaim this
-	 * space on demand, if necessary.
+	 * Tag the inode if we've added post-eof or cow fork preallocation so we
+	 * can reclaim this space on demand.
 	 */
-	if (end_fsb != orig_end_fsb)
-		xfs_inode_set_eofblocks_tag(ip);
+	if (fork == XFS_DATA_FORK) {
+		trace_xfs_iomap_alloc(ip, offset, count, 0, &got);
+		if (end_fsb != orig_end_fsb)
+			xfs_inode_set_eofblocks_tag(ip);
+		imap = got;
+	} else {
+		trace_xfs_reflink_cow_alloc(ip, &got);
+		if (got.br_startblock != offset_fsb ||
+		    got.br_blockcount != end_fsb - offset_fsb)
+			xfs_inode_set_cowblocks_tag(ip);
+	}
 
-	trace_xfs_iomap_alloc(ip, offset, count, 0, &got);
-	imap = got;
 done:
 	if (isnullstartblock(imap.br_startblock))
 		imap.br_startblock = DELAYSTARTBLOCK;
-- 
2.7.4


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH RFC 4/4] xfs: implement basic COW fork speculative preallocation
  2016-11-08 20:27 [PATCH RFC 0/4] xfs: basic cow fork speculative preallocation Brian Foster
                   ` (2 preceding siblings ...)
  2016-11-08 20:27 ` [PATCH RFC 3/4] xfs: reuse xfs_file_iomap_begin_delay() for cow fork delalloc Brian Foster
@ 2016-11-08 20:27 ` Brian Foster
  2016-11-08 20:48 ` [PATCH RFC 0/4] xfs: basic cow " Darrick J. Wong
  4 siblings, 0 replies; 18+ messages in thread
From: Brian Foster @ 2016-11-08 20:27 UTC (permalink / raw)
  To: linux-xfs

COW fork preallocation is currently limited to what is specified by the
COW extent size hint, which is typically much less aggressive than
traditional speculative preallocation added when sufficiently large
files are extended. This type of algorithm is not relevant for COW
reservation since by design, COW reservation never involves extending
the size of a file.

That said, we can be more aggressive with COW fork preallocation given
that we have extended the same inode tagging and reclaim infrastructure
used for post-eof preallocation to support COW fork preallocation. This
provides the ability to reclaim COW fork preallocation in the background
or on demand.

As such, add a simple COW fork speculative preallocation algorithm that
extends COW fork reservations due to file writes out to the next data
fork extent, unshared boundary or the next preexisting extent in the COW
fork, whichever limit we hit first. This provides a prealloc algorithm
that, like post-eof speculative preallocation, is based on the size of
preexisting extents.

XXX: This requires refinements such as throttling, reclaim, etc., as
noted in the comments.

Signed-off-by: Brian Foster <bfoster@redhat.com>
---
 fs/xfs/xfs_iomap.c | 48 +++++++++++++++++++++++++++++++++++++++++++++---
 1 file changed, 45 insertions(+), 3 deletions(-)

diff --git a/fs/xfs/xfs_iomap.c b/fs/xfs/xfs_iomap.c
index 40bf66c..43936aa 100644
--- a/fs/xfs/xfs_iomap.c
+++ b/fs/xfs/xfs_iomap.c
@@ -540,8 +540,11 @@ xfs_file_iomap_begin_delay(
 	struct xfs_bmbt_irec	got;
 	struct xfs_bmbt_irec	prev;
 	struct xfs_bmbt_irec	imap;	/* for iomap */
+	struct xfs_bmbt_irec	drec;	/* raw data fork record */
 	xfs_extnum_t		idx;
 	int			fork = XFS_DATA_FORK;
+	bool			shared;
+	bool			trimmed;
 
 	ASSERT(!XFS_IS_REALTIME_INODE(ip));
 	ASSERT(!xfs_get_extsz_hint(ip));
@@ -574,11 +577,9 @@ xfs_file_iomap_begin_delay(
 	 */
 	xfs_bmap_search_extents(ip, offset_fsb, XFS_DATA_FORK, &eof, &idx,
 			&got, &prev);
-	imap = got;
+	drec = imap = got;
 	if (!eof && got.br_startoff <= offset_fsb) {
 		if (xfs_is_reflink_inode(ip)) {
-			bool		shared, trimmed;
-
 			/*
 			 * Assume the data extent is shared if an extent exists
 			 * in the cow fork.
@@ -651,6 +652,47 @@ xfs_file_iomap_begin_delay(
 			end_fsb = min(end_fsb, maxbytes_fsb);
 			ASSERT(end_fsb > offset_fsb);
 		}
+	} else if (fork == XFS_COW_FORK && !trimmed) {
+		struct xfs_bmbt_irec	tmp = drec;
+		xfs_extlen_t		len;
+
+		/*
+		 * If we get here, we have a shared data extent without a COW
+		 * fork reservation and the range of the write doesn't cross an
+		 * unshared boundary. To implement COW fork preallocation,
+		 * allocate as much as possible up until the next data fork
+		 * extent, the next data fork unshared boundary or the next
+		 * existing extent in the COW fork.
+		 */
+		ASSERT(shared && offset_fsb >= tmp.br_startoff);
+
+		/*
+		 * Trim the original data fork extent to the start of the write
+		 * and the next unshared boundary. This defines the maximum COW
+		 * fork preallocation. bmapi_reserve_delalloc() will trim to the
+		 * next COW fork extent (got) if one exists.
+		 */
+		len = tmp.br_blockcount - (offset_fsb - tmp.br_startoff);
+		xfs_trim_extent(&tmp, offset_fsb, len);
+		error = xfs_reflink_trim_around_shared(ip, &tmp, &shared,
+						       &trimmed);
+		if (error)
+			goto out_unlock;
+		ASSERT(shared);
+		end_fsb = tmp.br_startoff + tmp.br_blockcount;
+
+		/*
+		 * TODO:
+		 * - Throttling based on low free space conditions (try to
+		 *   refactor into xfs_iomap_prealloc_size()).
+		 * - Associated scan/reclaim mechanism on buffered write ENOSPC.
+		 * - Alignment? Might not want to overlap unshared blocks.
+		 *   bmapi_reserve_delalloc() might do this anyways due to
+		 *   cowextszhint.
+		 * - Adopt similar cowextsz hint behavior as for traditional
+		 *   extsz hint? E.g., cowextsz hint overrides prealloc?
+		 * 	- allocsize mount option?
+		 */
 	}
 
 retry:
-- 
2.7.4


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* Re: [PATCH RFC 0/4] xfs: basic cow fork speculative preallocation
  2016-11-08 20:27 [PATCH RFC 0/4] xfs: basic cow fork speculative preallocation Brian Foster
                   ` (3 preceding siblings ...)
  2016-11-08 20:27 ` [PATCH RFC 4/4] xfs: implement basic COW fork speculative preallocation Brian Foster
@ 2016-11-08 20:48 ` Darrick J. Wong
  2016-11-08 22:39   ` Brian Foster
  4 siblings, 1 reply; 18+ messages in thread
From: Darrick J. Wong @ 2016-11-08 20:48 UTC (permalink / raw)
  To: Brian Foster; +Cc: linux-xfs

On Tue, Nov 08, 2016 at 03:27:32PM -0500, Brian Foster wrote:
> Hi all,
> 
> This is an experiment based on an idea for COW fork speculative
> preallocation. This is experimental, lightly/barely tested and sent in
> RFC form to solicit thoughts, ideas or flames before I spend time taking
> it further.
> 
> Patch 1 probably stands on its own. Patches 2 and 3 are some refactoring
> and patch 4 implements the basic idea, which is described in the commit
> log description. The testing I've done so far is basically similar to
> how one would test the effects of traditional speculative preallocation:
> write to multiple reflinked files in parallel and examine the resulting
> fragmentation. Specifically, I wrote sequentially to 16 different
> reflinked files of the same 8GB original (which has two data extents,
> completely shared). Without preallocation, the test results in ~248
> extents across the 16 files. With preallocation, the test results in 32
> extents across the 16 files (i.e., 2 extents per file, same as the
> source file).
> 
> An obvious tradeoff is the unnecessarily aggressive allocation that
> might occur in the event of random writes to a large file (such as in
> the cloned VM disk image use case), but my thinking is that the
> cowblocks tagging and reclaim infrastructure should manage that
> sufficiently (lack of testing notwithstanding). In any event, I'm
> interested in any thoughts along the lines of whether this is useful at
> all, alternative algorithm ideas, etc.

Was about to step out to lunch when this came in, but...

Is there an xfstest for this, so I can play too? :)

As far as random writes go, some of the reflink tests look at fragmentation
behavior.  generic/301 generic/302 xfs/180 xfs/182 xfs/184 xfs/192 xfs/193
xfs/198 xfs/200 xfs/204 xfs/208 xfs/208 xfs/211 xfs/215 xfs/218 xfs/219 xfs/221
xfs/223 xfs/224 xfs/225 xfs/226 xfs/228 xfs/230 xfs/231 xfs/232 xfs/344 xfs/345
xfs/346 xfs/347 are the ones that grep 'new extents:' picked up.

Will look at the patches when I get back.

--D

> 
> Brian
> 
> Brian Foster (4):
>   xfs: clean up cow fork reservation and tag inodes correctly
>   xfs: logically separate iomap range from allocation range
>   xfs: reuse xfs_file_iomap_begin_delay() for cow fork delalloc
>   xfs: implement basic COW fork speculative preallocation
> 
>  fs/xfs/xfs_iomap.c   | 132 +++++++++++++++++++++++++++++++++++++++++----------
>  fs/xfs/xfs_reflink.c |  28 ++---------
>  2 files changed, 111 insertions(+), 49 deletions(-)
> 
> -- 
> 2.7.4
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-xfs" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH RFC 0/4] xfs: basic cow fork speculative preallocation
  2016-11-08 20:48 ` [PATCH RFC 0/4] xfs: basic cow " Darrick J. Wong
@ 2016-11-08 22:39   ` Brian Foster
  2016-11-08 23:34     ` Dave Chinner
  0 siblings, 1 reply; 18+ messages in thread
From: Brian Foster @ 2016-11-08 22:39 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: linux-xfs

On Tue, Nov 08, 2016 at 12:48:00PM -0800, Darrick J. Wong wrote:
> On Tue, Nov 08, 2016 at 03:27:32PM -0500, Brian Foster wrote:
> > Hi all,
> > 
> > This is an experiment based on an idea for COW fork speculative
> > preallocation. This is experimental, lightly/barely tested and sent in
> > RFC form to solicit thoughts, ideas or flames before I spend time taking
> > it further.
> > 
> > Patch 1 probably stands on its own. Patches 2 and 3 are some refactoring
> > and patch 4 implements the basic idea, which is described in the commit
> > log description. The testing I've done so far is basically similar to
> > how one would test the effects of traditional speculative preallocation:
> > write to multiple reflinked files in parallel and examine the resulting
> > fragmentation. Specifically, I wrote sequentially to 16 different
> > reflinked files of the same 8GB original (which has two data extents,
> > completely shared). Without preallocation, the test results in ~248
> > extents across the 16 files. With preallocation, the test results in 32
> > extents across the 16 files (i.e., 2 extents per file, same as the
> > source file).
> > 
> > An obvious tradeoff is the unnecessarily aggressive allocation that
> > might occur in the event of random writes to a large file (such as in
> > the cloned VM disk image use case), but my thinking is that the
> > cowblocks tagging and reclaim infrastructure should manage that
> > sufficiently (lack of testing notwithstanding). In any event, I'm
> > interested in any thoughts along the lines of whether this is useful at
> > all, alternative algorithm ideas, etc.
> 
> Was about to step out to lunch when this came in, but...
> 
> Is there an xfstest for this, so I can play too? :)
> 

Not yet.. I've only xfstests tested insofar as it hasn't blown anything
up yet. :) Otherwise, I've just run manual write tests to observe
whether it is doing what I expect it to in simple cases. It clearly
needs more work, as noted in the patch, but if this is something worth
pursuing further I can certainly come up with some tests as well.

FWIW, that COW fork fiemap hack I sent a bit ago came in handy for
playing with this as well. :)

> As far as random writes go, some of the reflink tests look at fragmentation
> behavior.  generic/301 generic/302 xfs/180 xfs/182 xfs/184 xfs/192 xfs/193
> xfs/198 xfs/200 xfs/204 xfs/208 xfs/208 xfs/211 xfs/215 xfs/218 xfs/219 xfs/221
> xfs/223 xfs/224 xfs/225 xfs/226 xfs/228 xfs/230 xfs/231 xfs/232 xfs/344 xfs/345
> xfs/346 xfs/347 are the ones that grep 'new extents:' picked up.
> 

Ok.

> Will look at the patches when I get back.
> 

Thanks!

Brian

> --D
> 
> > 
> > Brian
> > 
> > Brian Foster (4):
> >   xfs: clean up cow fork reservation and tag inodes correctly
> >   xfs: logically separate iomap range from allocation range
> >   xfs: reuse xfs_file_iomap_begin_delay() for cow fork delalloc
> >   xfs: implement basic COW fork speculative preallocation
> > 
> >  fs/xfs/xfs_iomap.c   | 132 +++++++++++++++++++++++++++++++++++++++++----------
> >  fs/xfs/xfs_reflink.c |  28 ++---------
> >  2 files changed, 111 insertions(+), 49 deletions(-)
> > 
> > -- 
> > 2.7.4
> > 
> > --
> > To unsubscribe from this list: send the line "unsubscribe linux-xfs" in
> > the body of a message to majordomo@vger.kernel.org
> > More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH RFC 0/4] xfs: basic cow fork speculative preallocation
  2016-11-08 22:39   ` Brian Foster
@ 2016-11-08 23:34     ` Dave Chinner
  0 siblings, 0 replies; 18+ messages in thread
From: Dave Chinner @ 2016-11-08 23:34 UTC (permalink / raw)
  To: Brian Foster; +Cc: Darrick J. Wong, linux-xfs

On Tue, Nov 08, 2016 at 05:39:31PM -0500, Brian Foster wrote:
> On Tue, Nov 08, 2016 at 12:48:00PM -0800, Darrick J. Wong wrote:
> > On Tue, Nov 08, 2016 at 03:27:32PM -0500, Brian Foster wrote:
> > > Hi all,
> > > 
> > > This is an experiment based on an idea for COW fork speculative
> > > preallocation. This is experimental, lightly/barely tested and sent in
> > > RFC form to solicit thoughts, ideas or flames before I spend time taking
> > > it further.
> > > 
> > > Patch 1 probably stands on its own. Patches 2 and 3 are some refactoring
> > > and patch 4 implements the basic idea, which is described in the commit
> > > log description. The testing I've done so far is basically similar to
> > > how one would test the effects of traditional speculative preallocation:
> > > write to multiple reflinked files in parallel and examine the resulting
> > > fragmentation. Specifically, I wrote sequentially to 16 different
> > > reflinked files of the same 8GB original (which has two data extents,
> > > completely shared). Without preallocation, the test results in ~248
> > > extents across the 16 files. With preallocation, the test results in 32
> > > extents across the 16 files (i.e., 2 extents per file, same as the
> > > source file).
> > > 
> > > An obvious tradeoff is the unnecessarily aggressive allocation that
> > > might occur in the event of random writes to a large file (such as in
> > > the cloned VM disk image use case), but my thinking is that the
> > > cowblocks tagging and reclaim infrastructure should manage that
> > > sufficiently (lack of testing notwithstanding). In any event, I'm
> > > interested in any thoughts along the lines of whether this is useful at
> > > all, alternative algorithm ideas, etc.
> > 
> > Was about to step out to lunch when this came in, but...
> > 
> > Is there an xfstest for this, so I can play too? :)
> > 
> 
> Not yet.. I've only xfstests tested insofar as it hasn't blown anything
> up yet. :) Otherwise, I've just run manual write tests to observe
> whether it is doing what I expect it to in simple cases. It clearly
> needs more work, as noted in the patch, but if this is something worth
> pursuing further I can certainly come up with some tests as well.

I think it definitely has value for preventing COW overwrite
fragmentation - this will be an issue if people start reflinking
files widely (e.g. container roots) and then occasionally 
overwriting files completely.

> FWIW, that COW fork fiemap hack I sent a bit ago came in handy for
> playing with this as well. :)

It might be worth keeping these two patchsets together for the
purposes of development and review. The fiemap hack by itself is
neat, but having a demonstrated use for development of new features
makes it more than just a "neat hack". :P

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH RFC 1/4] xfs: clean up cow fork reservation and tag inodes correctly
  2016-11-08 20:27 ` [PATCH RFC 1/4] xfs: clean up cow fork reservation and tag inodes correctly Brian Foster
@ 2016-11-15 14:16   ` Christoph Hellwig
  2016-11-15 18:11     ` Brian Foster
  0 siblings, 1 reply; 18+ messages in thread
From: Christoph Hellwig @ 2016-11-15 14:16 UTC (permalink / raw)
  To: Brian Foster; +Cc: linux-xfs

> +	if (imap->br_startoff != got.br_startoff ||
> +	    imap->br_blockcount != got.br_blockcount)
>  		xfs_inode_set_cowblocks_tag(ip);

Can't got.br_blockcount be smaller than imap->br_blockcount if we have
an existing COW fork reservation lying around behind the whole we're
filling?  Also they way xfs_bmapi_reserve_delalloc works the startoff
will be the same.  E.g. this check should probably be:

	if (got.br_blockcount > imap->br_blockcount)

Except for that the patch looks good and is a nice cleanup.

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH RFC 2/4] xfs: logically separate iomap range from allocation range
  2016-11-08 20:27 ` [PATCH RFC 2/4] xfs: logically separate iomap range from allocation range Brian Foster
@ 2016-11-15 14:18   ` Christoph Hellwig
  2016-11-15 18:11     ` Brian Foster
  0 siblings, 1 reply; 18+ messages in thread
From: Christoph Hellwig @ 2016-11-15 14:18 UTC (permalink / raw)
  To: Brian Foster; +Cc: linux-xfs

On Tue, Nov 08, 2016 at 03:27:34PM -0500, Brian Foster wrote:
> The xfs_file_iomap_begin_delay() function currently converts the bmbt
> record output from the xfs_bmapi_reserve_delalloc() call to the iomap
> mapping for the higher level iomap code. In preparation to reuse
> xfs_file_iomap_begin_delay() for data fork and COW fork allocation,
> logically separate the iomap mapping provided to the caller from the
> bmbt record returned by xfs_bmapi_reserve_delalloc().
> 
> This is necessary because while COW reservation involves delalloc
> allocation to the COW fork, the mapping returned to the caller must
> still refer to the shared blocks from the data fork. Note that this
> patch does not change behavior in any way.

On it's own this patch looks highly confusing.  I'll keep reading
the rest of the series if it makes more sense with that, but in doubt
it probably should be merged into whatever patches it helps with.

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH RFC 3/4] xfs: reuse xfs_file_iomap_begin_delay() for cow fork delalloc
  2016-11-08 20:27 ` [PATCH RFC 3/4] xfs: reuse xfs_file_iomap_begin_delay() for cow fork delalloc Brian Foster
@ 2016-11-15 14:28   ` Christoph Hellwig
  2016-11-15 18:11     ` Brian Foster
  0 siblings, 1 reply; 18+ messages in thread
From: Christoph Hellwig @ 2016-11-15 14:28 UTC (permalink / raw)
  To: Brian Foster; +Cc: linux-xfs

> +	/*
> +	 * Search for a preexisting extent. COW fork allocation may still be
> +	 * required for reflink inodes if the data extent is shared.
> +	 */
>  	xfs_bmap_search_extents(ip, offset_fsb, XFS_DATA_FORK, &eof, &idx,
>  			&got, &prev);
>  	imap = got;

Maybe we should look up directly into imap and now duplicate that
information for imap and got?

>  		if (xfs_is_reflink_inode(ip)) {
> +			bool		shared, trimmed;
>  
> +			/*
> +			 * Assume the data extent is shared if an extent exists
> +			 * in the cow fork.
> +			 */
> +			xfs_trim_extent(&imap, offset_fsb,
> +					end_fsb - offset_fsb);
> +			xfs_bmap_search_extents(ip, imap.br_startoff,
> +					XFS_COW_FORK, &eof, &idx, &got, &prev);
> +			if (!eof && got.br_startoff <= imap.br_startoff) {
> +				trace_xfs_reflink_cow_found(ip, &got);
> +				xfs_trim_extent(&imap, got.br_startoff,
> +						got.br_blockcount);
> +				goto done;
> +			}
> +
> +			/*
> +			 * No existing cow fork extent. Now we have to actually
> +			 * check if the data extent is shared and trim the
> +			 * mapping to the next (un)shared boundary.
> +			 */
> +			error = xfs_reflink_trim_around_shared(ip, &imap,
> +						       &shared, &trimmed);
>  			if (error)
>  				goto out_unlock;
> +			if (!shared)
> +				goto done;
> +
> +			end_fsb = imap.br_startoff + imap.br_blockcount;

I think the code for looking at the COW fork should go into a little
helper, even if it means passing a few arguments to it.

> +			fork = XFS_COW_FORK;
> +		} else {
> +			trace_xfs_iomap_found(ip, offset, count, 0, &imap);
> +			goto done;
>  		}

And while we're at it - try to always handle the a that ends with a
goto first, that helps to reduce the indentation level.


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH RFC 1/4] xfs: clean up cow fork reservation and tag inodes correctly
  2016-11-15 14:16   ` Christoph Hellwig
@ 2016-11-15 18:11     ` Brian Foster
  2016-11-18  8:11       ` Christoph Hellwig
  0 siblings, 1 reply; 18+ messages in thread
From: Brian Foster @ 2016-11-15 18:11 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: linux-xfs

On Tue, Nov 15, 2016 at 06:16:21AM -0800, Christoph Hellwig wrote:
> > +	if (imap->br_startoff != got.br_startoff ||
> > +	    imap->br_blockcount != got.br_blockcount)
> >  		xfs_inode_set_cowblocks_tag(ip);
> 
> Can't got.br_blockcount be smaller than imap->br_blockcount if we have
> an existing COW fork reservation lying around behind the whole we're
> filling?  Also they way xfs_bmapi_reserve_delalloc works the startoff
> will be the same.  E.g. this check should probably be:
> 

Good point, though I think it can be smaller or larger without
necessarily having preallocation due to being merged with surrounding
extents. I'm not quite sure what the right answer for that is with
regard to tagging, but I do think it's better to have false positive
tagging than false negatives.

startoff can actually change due to merges as well, but merging aside,
I'm not following how you expect startoff to always be the same in the
event of the extent size hint. E.g., I see the following on a quick test
(file is a 1m reflinked file):

# xfs_io -c fiemap -c "fiemap -c" /mnt/file 
/mnt/file:
        0: [0..2047]: 160..2207
/mnt/file:
# xfs_io -c "pwrite 32k 4k" /mnt/file 
wrote 4096/4096 bytes at offset 32768
4 KiB, 1 ops; 0.0000 sec (16.910 MiB/sec and 4329.0043 ops/sec)
# xfs_io -c fiemap -c "fiemap -c" /mnt/file 
/mnt/file:
        0: [0..63]: 160..223
        1: [64..71]: 2272..2279
        2: [72..2047]: 232..2207
/mnt/file:
        0: [0..63]: 2208..2271
        1: [64..71]: hole
        2: [72..255]: 2280..2463
        3: [256..2047]: hole

So the cow extent size hint rounds out the allocation in the cow fork to
the aligned start/end offsets. In fact, I think it would do so
regardless of whether those covered blocks are even shared, but that's a
separate issue.

Given all of that, I could tweak the check to:

	if (got.br_startoff < imap->br_startoff ||
	    got.br_blockcount > imap->br_blockcount)
		...

Thoughts?

Brian

> 	if (got.br_blockcount > imap->br_blockcount)
> 
> Except for that the patch looks good and is a nice cleanup.

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH RFC 2/4] xfs: logically separate iomap range from allocation range
  2016-11-15 14:18   ` Christoph Hellwig
@ 2016-11-15 18:11     ` Brian Foster
  0 siblings, 0 replies; 18+ messages in thread
From: Brian Foster @ 2016-11-15 18:11 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: linux-xfs

On Tue, Nov 15, 2016 at 06:18:04AM -0800, Christoph Hellwig wrote:
> On Tue, Nov 08, 2016 at 03:27:34PM -0500, Brian Foster wrote:
> > The xfs_file_iomap_begin_delay() function currently converts the bmbt
> > record output from the xfs_bmapi_reserve_delalloc() call to the iomap
> > mapping for the higher level iomap code. In preparation to reuse
> > xfs_file_iomap_begin_delay() for data fork and COW fork allocation,
> > logically separate the iomap mapping provided to the caller from the
> > bmbt record returned by xfs_bmapi_reserve_delalloc().
> > 
> > This is necessary because while COW reservation involves delalloc
> > allocation to the COW fork, the mapping returned to the caller must
> > still refer to the shared blocks from the data fork. Note that this
> > patch does not change behavior in any way.
> 
> On it's own this patch looks highly confusing.  I'll keep reading
> the rest of the series if it makes more sense with that, but in doubt
> it probably should be merged into whatever patches it helps with.

Heh, Ok. I'll revisit this when the rest of the code is more fleshed
out.

Brian

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH RFC 3/4] xfs: reuse xfs_file_iomap_begin_delay() for cow fork delalloc
  2016-11-15 14:28   ` Christoph Hellwig
@ 2016-11-15 18:11     ` Brian Foster
  2016-11-18  8:13       ` Christoph Hellwig
  0 siblings, 1 reply; 18+ messages in thread
From: Brian Foster @ 2016-11-15 18:11 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: linux-xfs

On Tue, Nov 15, 2016 at 06:28:26AM -0800, Christoph Hellwig wrote:
> > +	/*
> > +	 * Search for a preexisting extent. COW fork allocation may still be
> > +	 * required for reflink inodes if the data extent is shared.
> > +	 */
> >  	xfs_bmap_search_extents(ip, offset_fsb, XFS_DATA_FORK, &eof, &idx,
> >  			&got, &prev);
> >  	imap = got;
> 
> Maybe we should look up directly into imap and now duplicate that
> information for imap and got?
> 

Didn't you recently change this code from doing that? I'm not following
how changing it back helps us...

> >  		if (xfs_is_reflink_inode(ip)) {
> > +			bool		shared, trimmed;
> >  
> > +			/*
> > +			 * Assume the data extent is shared if an extent exists
> > +			 * in the cow fork.
> > +			 */
> > +			xfs_trim_extent(&imap, offset_fsb,
> > +					end_fsb - offset_fsb);
> > +			xfs_bmap_search_extents(ip, imap.br_startoff,
> > +					XFS_COW_FORK, &eof, &idx, &got, &prev);
> > +			if (!eof && got.br_startoff <= imap.br_startoff) {
> > +				trace_xfs_reflink_cow_found(ip, &got);
> > +				xfs_trim_extent(&imap, got.br_startoff,
> > +						got.br_blockcount);
> > +				goto done;
> > +			}
> > +
> > +			/*
> > +			 * No existing cow fork extent. Now we have to actually
> > +			 * check if the data extent is shared and trim the
> > +			 * mapping to the next (un)shared boundary.
> > +			 */
> > +			error = xfs_reflink_trim_around_shared(ip, &imap,
> > +						       &shared, &trimmed);
> >  			if (error)
> >  				goto out_unlock;
> > +			if (!shared)
> > +				goto done;
> > +
> > +			end_fsb = imap.br_startoff + imap.br_blockcount;
> 
> I think the code for looking at the COW fork should go into a little
> helper, even if it means passing a few arguments to it.
> 

Ok.

> > +			fork = XFS_COW_FORK;
> > +		} else {
> > +			trace_xfs_iomap_found(ip, offset, count, 0, &imap);
> > +			goto done;
> >  		}
> 
> And while we're at it - try to always handle the a that ends with a
> goto first, that helps to reduce the indentation level.
> 

Indeed, that looks like a nice cleanup.

Brian

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH RFC 1/4] xfs: clean up cow fork reservation and tag inodes correctly
  2016-11-15 18:11     ` Brian Foster
@ 2016-11-18  8:11       ` Christoph Hellwig
  2016-11-18 15:10         ` Brian Foster
  0 siblings, 1 reply; 18+ messages in thread
From: Christoph Hellwig @ 2016-11-18  8:11 UTC (permalink / raw)
  To: Brian Foster; +Cc: Christoph Hellwig, linux-xfs

On Tue, Nov 15, 2016 at 01:11:01PM -0500, Brian Foster wrote:
> On Tue, Nov 15, 2016 at 06:16:21AM -0800, Christoph Hellwig wrote:
> > > +	if (imap->br_startoff != got.br_startoff ||
> > > +	    imap->br_blockcount != got.br_blockcount)
> > >  		xfs_inode_set_cowblocks_tag(ip);
> > 
> > Can't got.br_blockcount be smaller than imap->br_blockcount if we have
> > an existing COW fork reservation lying around behind the whole we're
> > filling?  Also they way xfs_bmapi_reserve_delalloc works the startoff
> > will be the same.  E.g. this check should probably be:
> > 
> 
> Good point, though I think it can be smaller or larger without
> necessarily having preallocation due to being merged with surrounding
> extents. I'm not quite sure what the right answer for that is with
> regard to tagging, but I do think it's better to have false positive
> tagging than false negatives.

Good point, merging can change both the start and length.  Based on
that I think tagging in the caller of xfs_bmapi_reserve_delalloc is
wrong, and we need to do it inside xfs_bmapi_reserve_delalloc where
we know if we did speculative preallocation.

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH RFC 3/4] xfs: reuse xfs_file_iomap_begin_delay() for cow fork delalloc
  2016-11-15 18:11     ` Brian Foster
@ 2016-11-18  8:13       ` Christoph Hellwig
  2016-11-18 15:11         ` Brian Foster
  0 siblings, 1 reply; 18+ messages in thread
From: Christoph Hellwig @ 2016-11-18  8:13 UTC (permalink / raw)
  To: Brian Foster; +Cc: Christoph Hellwig, linux-xfs

On Tue, Nov 15, 2016 at 01:11:25PM -0500, Brian Foster wrote:
> On Tue, Nov 15, 2016 at 06:28:26AM -0800, Christoph Hellwig wrote:
> > > +	/*
> > > +	 * Search for a preexisting extent. COW fork allocation may still be
> > > +	 * required for reflink inodes if the data extent is shared.
> > > +	 */
> > >  	xfs_bmap_search_extents(ip, offset_fsb, XFS_DATA_FORK, &eof, &idx,
> > >  			&got, &prev);
> > >  	imap = got;
> > 
> > Maybe we should look up directly into imap and now duplicate that
> > information for imap and got?
> > 
> 
> Didn't you recently change this code from doing that? I'm not following
> how changing it back helps us...

You only introduce imap in the previous patch.  I'd just try to avoid
copying the irec structures as much as possible.


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH RFC 1/4] xfs: clean up cow fork reservation and tag inodes correctly
  2016-11-18  8:11       ` Christoph Hellwig
@ 2016-11-18 15:10         ` Brian Foster
  0 siblings, 0 replies; 18+ messages in thread
From: Brian Foster @ 2016-11-18 15:10 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: linux-xfs

On Fri, Nov 18, 2016 at 12:11:46AM -0800, Christoph Hellwig wrote:
> On Tue, Nov 15, 2016 at 01:11:01PM -0500, Brian Foster wrote:
> > On Tue, Nov 15, 2016 at 06:16:21AM -0800, Christoph Hellwig wrote:
> > > > +	if (imap->br_startoff != got.br_startoff ||
> > > > +	    imap->br_blockcount != got.br_blockcount)
> > > >  		xfs_inode_set_cowblocks_tag(ip);
> > > 
> > > Can't got.br_blockcount be smaller than imap->br_blockcount if we have
> > > an existing COW fork reservation lying around behind the whole we're
> > > filling?  Also they way xfs_bmapi_reserve_delalloc works the startoff
> > > will be the same.  E.g. this check should probably be:
> > > 
> > 
> > Good point, though I think it can be smaller or larger without
> > necessarily having preallocation due to being merged with surrounding
> > extents. I'm not quite sure what the right answer for that is with
> > regard to tagging, but I do think it's better to have false positive
> > tagging than false negatives.
> 
> Good point, merging can change both the start and length.  Based on
> that I think tagging in the caller of xfs_bmapi_reserve_delalloc is
> wrong, and we need to do it inside xfs_bmapi_reserve_delalloc where
> we know if we did speculative preallocation.

xfs_bmapi_reserve_delalloc() doesn't really know if it's doing
preallocation outside of the extent size hint alignment. I don't think
we want to try and bury all of the prealloc stuff down in the bmapi
code, though we may be able to add a prealloc parameter to
bmapi_reserve_delalloc() such that it can add the prealloc itself and
also tell the difference between prealloc and merge cases. I'll play
around with it, thanks.

Brian

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH RFC 3/4] xfs: reuse xfs_file_iomap_begin_delay() for cow fork delalloc
  2016-11-18  8:13       ` Christoph Hellwig
@ 2016-11-18 15:11         ` Brian Foster
  0 siblings, 0 replies; 18+ messages in thread
From: Brian Foster @ 2016-11-18 15:11 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: linux-xfs

On Fri, Nov 18, 2016 at 12:13:02AM -0800, Christoph Hellwig wrote:
> On Tue, Nov 15, 2016 at 01:11:25PM -0500, Brian Foster wrote:
> > On Tue, Nov 15, 2016 at 06:28:26AM -0800, Christoph Hellwig wrote:
> > > > +	/*
> > > > +	 * Search for a preexisting extent. COW fork allocation may still be
> > > > +	 * required for reflink inodes if the data extent is shared.
> > > > +	 */
> > > >  	xfs_bmap_search_extents(ip, offset_fsb, XFS_DATA_FORK, &eof, &idx,
> > > >  			&got, &prev);
> > > >  	imap = got;
> > > 
> > > Maybe we should look up directly into imap and now duplicate that
> > > information for imap and got?
> > > 
> > 
> > Didn't you recently change this code from doing that? I'm not following
> > how changing it back helps us...
> 
> You only introduce imap in the previous patch.  I'd just try to avoid
> copying the irec structures as much as possible.
> 

Ok, this took some playing around to try and get right since the imap
record must continue to refer to the data extent (trimmed appropriately
and whatnot) in the cow reservation case.. It's possible there's some
unnecessary duplication in the current form. I'll make another pass at
that once I have this rebased against your extent search cleanups.

Brian

^ permalink raw reply	[flat|nested] 18+ messages in thread

end of thread, other threads:[~2016-11-18 15:11 UTC | newest]

Thread overview: 18+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-11-08 20:27 [PATCH RFC 0/4] xfs: basic cow fork speculative preallocation Brian Foster
2016-11-08 20:27 ` [PATCH RFC 1/4] xfs: clean up cow fork reservation and tag inodes correctly Brian Foster
2016-11-15 14:16   ` Christoph Hellwig
2016-11-15 18:11     ` Brian Foster
2016-11-18  8:11       ` Christoph Hellwig
2016-11-18 15:10         ` Brian Foster
2016-11-08 20:27 ` [PATCH RFC 2/4] xfs: logically separate iomap range from allocation range Brian Foster
2016-11-15 14:18   ` Christoph Hellwig
2016-11-15 18:11     ` Brian Foster
2016-11-08 20:27 ` [PATCH RFC 3/4] xfs: reuse xfs_file_iomap_begin_delay() for cow fork delalloc Brian Foster
2016-11-15 14:28   ` Christoph Hellwig
2016-11-15 18:11     ` Brian Foster
2016-11-18  8:13       ` Christoph Hellwig
2016-11-18 15:11         ` Brian Foster
2016-11-08 20:27 ` [PATCH RFC 4/4] xfs: implement basic COW fork speculative preallocation Brian Foster
2016-11-08 20:48 ` [PATCH RFC 0/4] xfs: basic cow " Darrick J. Wong
2016-11-08 22:39   ` Brian Foster
2016-11-08 23:34     ` Dave Chinner

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.