All of lore.kernel.org
 help / color / mirror / Atom feed
* more reflink fixes & debug
@ 2018-09-20 14:42 Christoph Hellwig
  2018-09-20 14:42 ` [PATCH 1/5] xfs: cancel COW blocks before swapext Christoph Hellwig
                   ` (4 more replies)
  0 siblings, 5 replies; 16+ messages in thread
From: Christoph Hellwig @ 2018-09-20 14:42 UTC (permalink / raw)
  To: linux-xfs

Hi all,

this series starts adds more reflink fixes on top of the series
from two days ago, plus two new debug patches to dump leftover
delalloc extents and a debug-only always_cow mode.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* [PATCH 1/5] xfs: cancel COW blocks before swapext
  2018-09-20 14:42 more reflink fixes & debug Christoph Hellwig
@ 2018-09-20 14:42 ` Christoph Hellwig
  2018-09-20 20:04   ` Darrick J. Wong
  2018-09-27 15:07   ` Brian Foster
  2018-09-20 14:42 ` [PATCH 2/5] xfs: skip delalloc COW blocks in xfs_reflink_end_cow Christoph Hellwig
                   ` (3 subsequent siblings)
  4 siblings, 2 replies; 16+ messages in thread
From: Christoph Hellwig @ 2018-09-20 14:42 UTC (permalink / raw)
  To: linux-xfs

We need to make sure we have no outstanding COW blocks before we swap
extents, as there is nothing preventing us from having COW preallocations
on an inode that swapext is called on.  That case can easily be
reproduced by the upcoming always_cow mode.

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 fs/xfs/xfs_bmap_util.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/fs/xfs/xfs_bmap_util.c b/fs/xfs/xfs_bmap_util.c
index 414dbc31139c..e0e9cbc98ccd 100644
--- a/fs/xfs/xfs_bmap_util.c
+++ b/fs/xfs/xfs_bmap_util.c
@@ -1515,6 +1515,12 @@ xfs_swap_extent_flush(
 		return error;
 	truncate_pagecache_range(VFS_I(ip), 0, -1);
 
+	if (xfs_inode_has_cow_data(ip)) {
+		error = xfs_reflink_cancel_cow_range(ip, 0, NULLFILEOFF, true);
+		if (error)
+			return error;
+	}
+
 	/* Verify O_DIRECT for ftmp */
 	if (VFS_I(ip)->i_mapping->nrpages)
 		return -EINVAL;
-- 
2.18.0

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH 2/5] xfs: skip delalloc COW blocks in xfs_reflink_end_cow
  2018-09-20 14:42 more reflink fixes & debug Christoph Hellwig
  2018-09-20 14:42 ` [PATCH 1/5] xfs: cancel COW blocks before swapext Christoph Hellwig
@ 2018-09-20 14:42 ` Christoph Hellwig
  2018-09-20 20:22   ` Darrick J. Wong
  2018-09-20 14:42 ` [PATCH 3/5] xfs: fix fork selection in xfs_find_trim_cow_extent Christoph Hellwig
                   ` (2 subsequent siblings)
  4 siblings, 1 reply; 16+ messages in thread
From: Christoph Hellwig @ 2018-09-20 14:42 UTC (permalink / raw)
  To: linux-xfs

The iomap direct I/O code issues a single ->end_io call for the whole
I/O request, and if some of the extents cowered needed a COW operation
it will call xfs_reflink_end_cow over the whole range.

When we do AIO writes we drop the iolock after doing the initial setup,
but before the I/O completion.  Between dropping the lock and completing
the I/O we can have a racing buffered write create new delalloc COW fork
extents in the region covered by the outstanding direct I/O write, and
thus see delalloc COW fork extents in xfs_reflink_end_cow.  As
concurrent writes are fundamentally racy and no guarantees are given we
can simply skip those.

This can be easily reproduced with xfstests generic/208 in always_cow
mode.

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 fs/xfs/xfs_reflink.c | 10 ++++------
 1 file changed, 4 insertions(+), 6 deletions(-)

diff --git a/fs/xfs/xfs_reflink.c b/fs/xfs/xfs_reflink.c
index ac94ace45424..d1758771f21a 100644
--- a/fs/xfs/xfs_reflink.c
+++ b/fs/xfs/xfs_reflink.c
@@ -686,14 +686,12 @@ xfs_reflink_end_cow(
 		if (!del.br_blockcount)
 			goto prev_extent;
 
-		ASSERT(!isnullstartblock(got.br_startblock));
-
 		/*
-		 * Don't remap unwritten extents; these are
-		 * speculatively preallocated CoW extents that have been
-		 * allocated but have not yet been involved in a write.
+		 * Only remap real extent that contain data.  With AIO
+		 * speculatively preallocations can leak into the range we
+		 * are called upon, and we need to skip them.
 		 */
-		if (got.br_state == XFS_EXT_UNWRITTEN)
+		if (!xfs_bmap_is_real_extent(&got))
 			goto prev_extent;
 
 		/* Unmap the old blocks in the data fork. */
-- 
2.18.0

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH 3/5] xfs: fix fork selection in xfs_find_trim_cow_extent
  2018-09-20 14:42 more reflink fixes & debug Christoph Hellwig
  2018-09-20 14:42 ` [PATCH 1/5] xfs: cancel COW blocks before swapext Christoph Hellwig
  2018-09-20 14:42 ` [PATCH 2/5] xfs: skip delalloc COW blocks in xfs_reflink_end_cow Christoph Hellwig
@ 2018-09-20 14:42 ` Christoph Hellwig
  2018-09-20 14:42 ` [PATCH 4/5] xfs: print dangling delalloc extents Christoph Hellwig
  2018-09-20 14:42 ` [PATCH 5/5] xfs: introduce an always_cow mode Christoph Hellwig
  4 siblings, 0 replies; 16+ messages in thread
From: Christoph Hellwig @ 2018-09-20 14:42 UTC (permalink / raw)
  To: linux-xfs

We should want to write directly into the data fork for blocks that don't
have an extent in the COW fork covering them yet.

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 fs/xfs/xfs_reflink.c | 8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/fs/xfs/xfs_reflink.c b/fs/xfs/xfs_reflink.c
index d1758771f21a..b25ea7a3a0e7 100644
--- a/fs/xfs/xfs_reflink.c
+++ b/fs/xfs/xfs_reflink.c
@@ -369,9 +369,13 @@ xfs_find_trim_cow_extent(
 	 * If we don't find an overlapping extent, trim the range we need to
 	 * allocate to fit the hole we found.
 	 */
-	if (!xfs_iext_lookup_extent(ip, ip->i_cowfp, offset_fsb, &icur, &got) ||
-	    got.br_startoff > offset_fsb)
+	if (!xfs_iext_lookup_extent(ip, ip->i_cowfp, offset_fsb, &icur, &got))
+		got.br_startoff = offset_fsb + count_fsb;
+	if (got.br_startoff > offset_fsb) {
+		xfs_trim_extent(imap, imap->br_startoff,
+				got.br_startoff - imap->br_startoff);
 		return xfs_reflink_trim_around_shared(ip, imap, shared);
+	}
 
 	*shared = true;
 	if (isnullstartblock(got.br_startblock)) {
-- 
2.18.0

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH 4/5] xfs: print dangling delalloc extents
  2018-09-20 14:42 more reflink fixes & debug Christoph Hellwig
                   ` (2 preceding siblings ...)
  2018-09-20 14:42 ` [PATCH 3/5] xfs: fix fork selection in xfs_find_trim_cow_extent Christoph Hellwig
@ 2018-09-20 14:42 ` Christoph Hellwig
  2018-09-20 20:06   ` Darrick J. Wong
  2018-09-27 15:07   ` Brian Foster
  2018-09-20 14:42 ` [PATCH 5/5] xfs: introduce an always_cow mode Christoph Hellwig
  4 siblings, 2 replies; 16+ messages in thread
From: Christoph Hellwig @ 2018-09-20 14:42 UTC (permalink / raw)
  To: linux-xfs

Instead of just asserting that we have no delalloc space dangling
in an inode that gets freed print the actual offenders for debug
mode.

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 fs/xfs/xfs_super.c | 32 +++++++++++++++++++++++++++++++-
 1 file changed, 31 insertions(+), 1 deletion(-)

diff --git a/fs/xfs/xfs_super.c b/fs/xfs/xfs_super.c
index 207ee302b1bb..62850b420dc7 100644
--- a/fs/xfs/xfs_super.c
+++ b/fs/xfs/xfs_super.c
@@ -933,6 +933,31 @@ xfs_fs_alloc_inode(
 	return NULL;
 }
 
+#ifdef DEBUG
+static void
+xfs_check_delalloc(
+	struct xfs_inode	*ip,
+	int			whichfork)
+{
+	struct xfs_ifork	*ifp = XFS_IFORK_PTR(ip, whichfork);
+	struct xfs_bmbt_irec	got;
+	struct xfs_iext_cursor	icur;
+
+	if (!ifp || !xfs_iext_lookup_extent(ip, ifp, 0, &icur, &got))
+		return;
+	do {
+		if (isnullstartblock(got.br_startblock)) {
+			xfs_warn(ip->i_mount,
+				"%s fork has delalloc extent at [0x%llx:0x%llx]",
+				whichfork == XFS_DATA_FORK ? "data" : "cow",
+				got.br_startoff, got.br_blockcount);
+		}
+	} while (xfs_iext_next_extent(ifp, &icur, &got));
+}
+#else
+#define xfs_check_delalloc(ip, whichfork)	do { } while (0)
+#endif
+
 /*
  * Now that the generic code is guaranteed not to be accessing
  * the linux inode, we can inactivate and reclaim the inode.
@@ -951,7 +976,12 @@ xfs_fs_destroy_inode(
 
 	xfs_inactive(ip);
 
-	ASSERT(XFS_FORCED_SHUTDOWN(ip->i_mount) || ip->i_delayed_blks == 0);
+	if (!XFS_FORCED_SHUTDOWN(ip->i_mount) && ip->i_delayed_blks) {
+		xfs_check_delalloc(ip, XFS_DATA_FORK);
+		xfs_check_delalloc(ip, XFS_COW_FORK);
+		ASSERT(0);
+	}
+
 	XFS_STATS_INC(ip->i_mount, vn_reclaim);
 
 	/*
-- 
2.18.0

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH 5/5] xfs: introduce an always_cow mode
  2018-09-20 14:42 more reflink fixes & debug Christoph Hellwig
                   ` (3 preceding siblings ...)
  2018-09-20 14:42 ` [PATCH 4/5] xfs: print dangling delalloc extents Christoph Hellwig
@ 2018-09-20 14:42 ` Christoph Hellwig
  2018-09-20 20:17   ` Darrick J. Wong
  4 siblings, 1 reply; 16+ messages in thread
From: Christoph Hellwig @ 2018-09-20 14:42 UTC (permalink / raw)
  To: linux-xfs

When the (debug-only) always_cow module parameter is set, we will always
write out place, even if the file is not reflinked.  In addition to being
a useful debug aid this prepares for modes where we must always write
out of place.  An example for that is the upcoming support for atomic
overwrites.

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 fs/xfs/xfs_aops.c    |  2 +-
 fs/xfs/xfs_file.c    |  2 +-
 fs/xfs/xfs_iomap.c   |  8 +++----
 fs/xfs/xfs_reflink.c | 50 ++++++++++++++++++++++++++++++++------------
 fs/xfs/xfs_reflink.h | 13 ++++++++++++
 5 files changed, 56 insertions(+), 19 deletions(-)

diff --git a/fs/xfs/xfs_aops.c b/fs/xfs/xfs_aops.c
index 775cdcfe70c2..65352d52d24a 100644
--- a/fs/xfs/xfs_aops.c
+++ b/fs/xfs/xfs_aops.c
@@ -980,7 +980,7 @@ xfs_vm_bmap(
 	 * Since we don't pass back blockdev info, we can't return bmap
 	 * information for rt files either.
 	 */
-	if (xfs_is_reflink_inode(ip) || XFS_IS_REALTIME_INODE(ip))
+	if (xfs_is_cow_inode(ip) || XFS_IS_REALTIME_INODE(ip))
 		return 0;
 	return iomap_bmap(mapping, block, &xfs_iomap_ops);
 }
diff --git a/fs/xfs/xfs_file.c b/fs/xfs/xfs_file.c
index fa8fbc84eec8..2756fc583716 100644
--- a/fs/xfs/xfs_file.c
+++ b/fs/xfs/xfs_file.c
@@ -507,7 +507,7 @@ xfs_file_dio_aio_write(
 		 * We can't properly handle unaligned direct I/O to reflink
 		 * files yet, as we can't unshare a partial block.
 		 */
-		if (xfs_is_reflink_inode(ip)) {
+		if (xfs_is_cow_inode(ip)) {
 			trace_xfs_reflink_bounce_dio_write(ip, iocb->ki_pos, count);
 			return -EREMCHG;
 		}
diff --git a/fs/xfs/xfs_iomap.c b/fs/xfs/xfs_iomap.c
index 1bfc40ce581a..398c5b23a368 100644
--- a/fs/xfs/xfs_iomap.c
+++ b/fs/xfs/xfs_iomap.c
@@ -570,7 +570,7 @@ xfs_delalloc_iomap_begin(
 	if (eof)
 		got.br_startoff = maxbytes_fsb;
 	if (got.br_startoff <= offset_fsb) {
-		if (xfs_is_reflink_inode(ip) &&
+		if (xfs_is_cow_inode(ip) &&
 		    ((flags & IOMAP_WRITE) ||
 		     got.br_state != XFS_EXT_UNWRITTEN)) {
 			end_fsb = min(XFS_B_TO_FSB(mp, offset + count),
@@ -1048,7 +1048,7 @@ xfs_ilock_for_iomap(
 	 * COW writes may allocate delalloc space or convert unwritten COW
 	 * extents, so we need to make sure to take the lock exclusively here.
 	 */
-	if (xfs_is_reflink_inode(ip) && is_write) {
+	if (xfs_is_cow_inode(ip) && is_write) {
 		/*
 		 * FIXME: It could still overwrite on unshared extents and not
 		 * need allocation.
@@ -1082,7 +1082,7 @@ xfs_ilock_for_iomap(
 	 * check, so if we got ILOCK_SHARED for a write and but we're now a
 	 * reflink inode we have to switch to ILOCK_EXCL and relock.
 	 */
-	if (mode == XFS_ILOCK_SHARED && is_write && xfs_is_reflink_inode(ip)) {
+	if (mode == XFS_ILOCK_SHARED && is_write && xfs_is_cow_inode(ip)) {
 		xfs_iunlock(ip, mode);
 		mode = XFS_ILOCK_EXCL;
 		goto relock;
@@ -1151,7 +1151,7 @@ xfs_file_iomap_begin(
 	 * Break shared extents if necessary. Checks for non-blocking IO have
 	 * been done up front, so we don't need to do them here.
 	 */
-	if (xfs_is_reflink_inode(ip)) {
+	if (xfs_is_cow_inode(ip)) {
 		error = xfs_reflink_allocate_cow(ip, &imap, &shared, &lockmode);
 		if (error)
 			goto out_unlock;
diff --git a/fs/xfs/xfs_reflink.c b/fs/xfs/xfs_reflink.c
index b25ea7a3a0e7..a893f6d65ff0 100644
--- a/fs/xfs/xfs_reflink.c
+++ b/fs/xfs/xfs_reflink.c
@@ -129,6 +129,11 @@
  * ioend, the better.
  */
 
+#ifdef DEBUG
+bool xfs_always_cow;
+module_param_named(always_cow, xfs_always_cow, bool, 0644);
+#endif
+
 /*
  * Given an AG extent, find the lowest-numbered run of shared blocks
  * within that range and return the range in fbno/flen.  If
@@ -192,7 +197,7 @@ xfs_reflink_trim_around_shared(
 	int			error = 0;
 
 	/* Holes, unwritten, and delalloc extents cannot be shared */
-	if (!xfs_is_reflink_inode(ip) || !xfs_bmap_is_real_extent(irec)) {
+	if (!xfs_is_cow_inode(ip) || !xfs_bmap_is_real_extent(irec)) {
 		*shared = false;
 		return 0;
 	}
@@ -234,6 +239,22 @@ xfs_reflink_trim_around_shared(
 	}
 }
 
+static bool
+xfs_inode_need_cow(
+	struct xfs_inode	*ip,
+	struct xfs_bmbt_irec	*imap,
+	bool			*shared)
+{
+	/* We can't update any real extents in always COW mode. */
+	if (xfs_always_cow && !isnullstartblock(imap->br_startblock)) {
+		*shared = true;
+		return 0;
+	}
+
+	/* Trim the mapping to the nearest shared extent boundary. */
+	return xfs_reflink_trim_around_shared(ip, imap, shared);
+}
+
 /*
  * Trim the passed in imap to the next shared/unshared extent boundary, and
  * if imap->br_startoff points to a shared extent reserve space for it in the
@@ -248,13 +269,17 @@ xfs_reflink_reserve_cow(
 	struct xfs_inode	*ip,
 	struct xfs_bmbt_irec	*imap)
 {
-	struct xfs_ifork	*ifp = XFS_IFORK_PTR(ip, XFS_COW_FORK);
 	struct xfs_bmbt_irec	got;
 	int			error = 0;
 	bool			eof = false;
 	struct xfs_iext_cursor	icur;
 	bool			shared;
 
+	if (!ip->i_cowfp) {
+		ASSERT(!xfs_is_reflink_inode(ip));
+		xfs_ifork_init_cow(ip);
+	}
+
 	/*
 	 * Search the COW fork extent list first.  This serves two purposes:
 	 * first this implement the speculative preallocation using cowextisze,
@@ -263,8 +288,8 @@ xfs_reflink_reserve_cow(
 	 * extent list is generally faster than going out to the shared extent
 	 * tree.
 	 */
-
-	if (!xfs_iext_lookup_extent(ip, ifp, imap->br_startoff, &icur, &got))
+	if (!xfs_iext_lookup_extent(ip, ip->i_cowfp, imap->br_startoff, &icur,
+			&got))
 		eof = true;
 	if (!eof && got.br_startoff <= imap->br_startoff) {
 		trace_xfs_reflink_cow_found(ip, imap);
@@ -273,14 +298,10 @@ xfs_reflink_reserve_cow(
 	}
 
 	/* Trim the mapping to the nearest shared extent boundary. */
-	error = xfs_reflink_trim_around_shared(ip, imap, &shared);
-	if (error)
+	error = xfs_inode_need_cow(ip, imap, &shared);
+	if (error || !shared)
 		return error;
 
-	/* Not shared?  Just report the (potentially capped) extent. */
-	if (!shared)
-		return 0;
-
 	/*
 	 * Fork all the shared blocks from our write offset until the end of
 	 * the extent.
@@ -374,7 +395,7 @@ xfs_find_trim_cow_extent(
 	if (got.br_startoff > offset_fsb) {
 		xfs_trim_extent(imap, imap->br_startoff,
 				got.br_startoff - imap->br_startoff);
-		return xfs_reflink_trim_around_shared(ip, imap, shared);
+		return xfs_inode_need_cow(ip, imap, shared);
 	}
 
 	*shared = true;
@@ -408,7 +429,10 @@ xfs_reflink_allocate_cow(
 	xfs_extlen_t		resblks = 0;
 
 	ASSERT(xfs_isilocked(ip, XFS_ILOCK_EXCL));
-	ASSERT(xfs_is_reflink_inode(ip));
+	if (!ip->i_cowfp) {
+		ASSERT(!xfs_is_reflink_inode(ip));
+		xfs_ifork_init_cow(ip);
+	}
 
 	error = xfs_find_trim_cow_extent(ip, imap, shared, &found);
 	if (error || !*shared)
@@ -585,7 +609,7 @@ xfs_reflink_cancel_cow_range(
 	int			error;
 
 	trace_xfs_reflink_cancel_cow_range(ip, offset, count);
-	ASSERT(xfs_is_reflink_inode(ip));
+	ASSERT(ip->i_cowfp);
 
 	offset_fsb = XFS_B_TO_FSBT(ip->i_mount, offset);
 	if (count == NULLFILEOFF)
diff --git a/fs/xfs/xfs_reflink.h b/fs/xfs/xfs_reflink.h
index 7f47202b5639..b89e07239b82 100644
--- a/fs/xfs/xfs_reflink.h
+++ b/fs/xfs/xfs_reflink.h
@@ -6,6 +6,19 @@
 #ifndef __XFS_REFLINK_H
 #define __XFS_REFLINK_H 1
 
+#ifdef DEBUG
+extern bool xfs_always_cow;
+#else
+#define xfs_always_cow	false
+#endif
+
+static inline bool xfs_is_cow_inode(struct xfs_inode *ip)
+{
+	return xfs_is_reflink_inode(ip) ||
+		(xfs_always_cow &&
+		 xfs_sb_version_hasreflink(&ip->i_mount->m_sb));
+}
+
 extern int xfs_reflink_find_shared(struct xfs_mount *mp, struct xfs_trans *tp,
 		xfs_agnumber_t agno, xfs_agblock_t agbno, xfs_extlen_t aglen,
 		xfs_agblock_t *fbno, xfs_extlen_t *flen, bool find_maximal);
-- 
2.18.0

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* Re: [PATCH 1/5] xfs: cancel COW blocks before swapext
  2018-09-20 14:42 ` [PATCH 1/5] xfs: cancel COW blocks before swapext Christoph Hellwig
@ 2018-09-20 20:04   ` Darrick J. Wong
  2018-09-27 15:07   ` Brian Foster
  1 sibling, 0 replies; 16+ messages in thread
From: Darrick J. Wong @ 2018-09-20 20:04 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: linux-xfs

On Thu, Sep 20, 2018 at 04:42:16PM +0200, Christoph Hellwig wrote:
> We need to make sure we have no outstanding COW blocks before we swap
> extents, as there is nothing preventing us from having COW preallocations
> on an inode that swapext is called on.  That case can easily be
> reproduced by the upcoming always_cow mode.
> 
> Signed-off-by: Christoph Hellwig <hch@lst.de>

Seems reasonable to me,
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>

--D

> ---
>  fs/xfs/xfs_bmap_util.c | 6 ++++++
>  1 file changed, 6 insertions(+)
> 
> diff --git a/fs/xfs/xfs_bmap_util.c b/fs/xfs/xfs_bmap_util.c
> index 414dbc31139c..e0e9cbc98ccd 100644
> --- a/fs/xfs/xfs_bmap_util.c
> +++ b/fs/xfs/xfs_bmap_util.c
> @@ -1515,6 +1515,12 @@ xfs_swap_extent_flush(
>  		return error;
>  	truncate_pagecache_range(VFS_I(ip), 0, -1);
>  
> +	if (xfs_inode_has_cow_data(ip)) {
> +		error = xfs_reflink_cancel_cow_range(ip, 0, NULLFILEOFF, true);
> +		if (error)
> +			return error;
> +	}
> +
>  	/* Verify O_DIRECT for ftmp */
>  	if (VFS_I(ip)->i_mapping->nrpages)
>  		return -EINVAL;
> -- 
> 2.18.0
> 

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 4/5] xfs: print dangling delalloc extents
  2018-09-20 14:42 ` [PATCH 4/5] xfs: print dangling delalloc extents Christoph Hellwig
@ 2018-09-20 20:06   ` Darrick J. Wong
  2018-09-27 15:07   ` Brian Foster
  1 sibling, 0 replies; 16+ messages in thread
From: Darrick J. Wong @ 2018-09-20 20:06 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: linux-xfs

On Thu, Sep 20, 2018 at 04:42:19PM +0200, Christoph Hellwig wrote:
> Instead of just asserting that we have no delalloc space dangling
> in an inode that gets freed print the actual offenders for debug
> mode.
> 
> Signed-off-by: Christoph Hellwig <hch@lst.de>

Looks ok,
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>

--D

> ---
>  fs/xfs/xfs_super.c | 32 +++++++++++++++++++++++++++++++-
>  1 file changed, 31 insertions(+), 1 deletion(-)
> 
> diff --git a/fs/xfs/xfs_super.c b/fs/xfs/xfs_super.c
> index 207ee302b1bb..62850b420dc7 100644
> --- a/fs/xfs/xfs_super.c
> +++ b/fs/xfs/xfs_super.c
> @@ -933,6 +933,31 @@ xfs_fs_alloc_inode(
>  	return NULL;
>  }
>  
> +#ifdef DEBUG
> +static void
> +xfs_check_delalloc(
> +	struct xfs_inode	*ip,
> +	int			whichfork)
> +{
> +	struct xfs_ifork	*ifp = XFS_IFORK_PTR(ip, whichfork);
> +	struct xfs_bmbt_irec	got;
> +	struct xfs_iext_cursor	icur;
> +
> +	if (!ifp || !xfs_iext_lookup_extent(ip, ifp, 0, &icur, &got))
> +		return;
> +	do {
> +		if (isnullstartblock(got.br_startblock)) {
> +			xfs_warn(ip->i_mount,
> +				"%s fork has delalloc extent at [0x%llx:0x%llx]",
> +				whichfork == XFS_DATA_FORK ? "data" : "cow",
> +				got.br_startoff, got.br_blockcount);
> +		}
> +	} while (xfs_iext_next_extent(ifp, &icur, &got));
> +}
> +#else
> +#define xfs_check_delalloc(ip, whichfork)	do { } while (0)
> +#endif
> +
>  /*
>   * Now that the generic code is guaranteed not to be accessing
>   * the linux inode, we can inactivate and reclaim the inode.
> @@ -951,7 +976,12 @@ xfs_fs_destroy_inode(
>  
>  	xfs_inactive(ip);
>  
> -	ASSERT(XFS_FORCED_SHUTDOWN(ip->i_mount) || ip->i_delayed_blks == 0);
> +	if (!XFS_FORCED_SHUTDOWN(ip->i_mount) && ip->i_delayed_blks) {
> +		xfs_check_delalloc(ip, XFS_DATA_FORK);
> +		xfs_check_delalloc(ip, XFS_COW_FORK);
> +		ASSERT(0);
> +	}
> +
>  	XFS_STATS_INC(ip->i_mount, vn_reclaim);
>  
>  	/*
> -- 
> 2.18.0
> 

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 5/5] xfs: introduce an always_cow mode
  2018-09-20 14:42 ` [PATCH 5/5] xfs: introduce an always_cow mode Christoph Hellwig
@ 2018-09-20 20:17   ` Darrick J. Wong
  2018-09-20 21:23     ` Dave Chinner
  0 siblings, 1 reply; 16+ messages in thread
From: Darrick J. Wong @ 2018-09-20 20:17 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: linux-xfs

On Thu, Sep 20, 2018 at 04:42:20PM +0200, Christoph Hellwig wrote:
> When the (debug-only) always_cow module parameter is set, we will always
> write out place, even if the file is not reflinked.  In addition to being
> a useful debug aid this prepares for modes where we must always write
> out of place.  An example for that is the upcoming support for atomic
> overwrites.
> 
> Signed-off-by: Christoph Hellwig <hch@lst.de>
> ---
>  fs/xfs/xfs_aops.c    |  2 +-
>  fs/xfs/xfs_file.c    |  2 +-
>  fs/xfs/xfs_iomap.c   |  8 +++----
>  fs/xfs/xfs_reflink.c | 50 ++++++++++++++++++++++++++++++++------------
>  fs/xfs/xfs_reflink.h | 13 ++++++++++++
>  5 files changed, 56 insertions(+), 19 deletions(-)
> 
> diff --git a/fs/xfs/xfs_aops.c b/fs/xfs/xfs_aops.c
> index 775cdcfe70c2..65352d52d24a 100644
> --- a/fs/xfs/xfs_aops.c
> +++ b/fs/xfs/xfs_aops.c
> @@ -980,7 +980,7 @@ xfs_vm_bmap(
>  	 * Since we don't pass back blockdev info, we can't return bmap
>  	 * information for rt files either.
>  	 */
> -	if (xfs_is_reflink_inode(ip) || XFS_IS_REALTIME_INODE(ip))
> +	if (xfs_is_cow_inode(ip) || XFS_IS_REALTIME_INODE(ip))
>  		return 0;
>  	return iomap_bmap(mapping, block, &xfs_iomap_ops);
>  }
> diff --git a/fs/xfs/xfs_file.c b/fs/xfs/xfs_file.c
> index fa8fbc84eec8..2756fc583716 100644
> --- a/fs/xfs/xfs_file.c
> +++ b/fs/xfs/xfs_file.c
> @@ -507,7 +507,7 @@ xfs_file_dio_aio_write(
>  		 * We can't properly handle unaligned direct I/O to reflink
>  		 * files yet, as we can't unshare a partial block.
>  		 */
> -		if (xfs_is_reflink_inode(ip)) {
> +		if (xfs_is_cow_inode(ip)) {
>  			trace_xfs_reflink_bounce_dio_write(ip, iocb->ki_pos, count);
>  			return -EREMCHG;
>  		}
> diff --git a/fs/xfs/xfs_iomap.c b/fs/xfs/xfs_iomap.c
> index 1bfc40ce581a..398c5b23a368 100644
> --- a/fs/xfs/xfs_iomap.c
> +++ b/fs/xfs/xfs_iomap.c
> @@ -570,7 +570,7 @@ xfs_delalloc_iomap_begin(
>  	if (eof)
>  		got.br_startoff = maxbytes_fsb;
>  	if (got.br_startoff <= offset_fsb) {
> -		if (xfs_is_reflink_inode(ip) &&
> +		if (xfs_is_cow_inode(ip) &&
>  		    ((flags & IOMAP_WRITE) ||
>  		     got.br_state != XFS_EXT_UNWRITTEN)) {
>  			end_fsb = min(XFS_B_TO_FSB(mp, offset + count),
> @@ -1048,7 +1048,7 @@ xfs_ilock_for_iomap(
>  	 * COW writes may allocate delalloc space or convert unwritten COW
>  	 * extents, so we need to make sure to take the lock exclusively here.
>  	 */
> -	if (xfs_is_reflink_inode(ip) && is_write) {
> +	if (xfs_is_cow_inode(ip) && is_write) {
>  		/*
>  		 * FIXME: It could still overwrite on unshared extents and not
>  		 * need allocation.
> @@ -1082,7 +1082,7 @@ xfs_ilock_for_iomap(
>  	 * check, so if we got ILOCK_SHARED for a write and but we're now a
>  	 * reflink inode we have to switch to ILOCK_EXCL and relock.
>  	 */
> -	if (mode == XFS_ILOCK_SHARED && is_write && xfs_is_reflink_inode(ip)) {
> +	if (mode == XFS_ILOCK_SHARED && is_write && xfs_is_cow_inode(ip)) {
>  		xfs_iunlock(ip, mode);
>  		mode = XFS_ILOCK_EXCL;
>  		goto relock;
> @@ -1151,7 +1151,7 @@ xfs_file_iomap_begin(
>  	 * Break shared extents if necessary. Checks for non-blocking IO have
>  	 * been done up front, so we don't need to do them here.
>  	 */
> -	if (xfs_is_reflink_inode(ip)) {
> +	if (xfs_is_cow_inode(ip)) {
>  		error = xfs_reflink_allocate_cow(ip, &imap, &shared, &lockmode);
>  		if (error)
>  			goto out_unlock;
> diff --git a/fs/xfs/xfs_reflink.c b/fs/xfs/xfs_reflink.c
> index b25ea7a3a0e7..a893f6d65ff0 100644
> --- a/fs/xfs/xfs_reflink.c
> +++ b/fs/xfs/xfs_reflink.c
> @@ -129,6 +129,11 @@
>   * ioend, the better.
>   */
>  
> +#ifdef DEBUG
> +bool xfs_always_cow;
> +module_param_named(always_cow, xfs_always_cow, bool, 0644);
> +#endif

I would have expected this to end up in /proc/sys/fs/xfs/always_cow
along with all the other behavior knobs, but I guess this is a debugging
thing...?

> +
>  /*
>   * Given an AG extent, find the lowest-numbered run of shared blocks
>   * within that range and return the range in fbno/flen.  If
> @@ -192,7 +197,7 @@ xfs_reflink_trim_around_shared(
>  	int			error = 0;
>  
>  	/* Holes, unwritten, and delalloc extents cannot be shared */
> -	if (!xfs_is_reflink_inode(ip) || !xfs_bmap_is_real_extent(irec)) {
> +	if (!xfs_is_cow_inode(ip) || !xfs_bmap_is_real_extent(irec)) {
>  		*shared = false;
>  		return 0;
>  	}
> @@ -234,6 +239,22 @@ xfs_reflink_trim_around_shared(
>  	}
>  }
>  
> +static bool
> +xfs_inode_need_cow(
> +	struct xfs_inode	*ip,
> +	struct xfs_bmbt_irec	*imap,
> +	bool			*shared)
> +{
> +	/* We can't update any real extents in always COW mode. */

/me struggled with this for a minute before figuring out what the
comment was trying to tell me.  Is the following accurate?

"In always_cow mode, behave as if the incoming data fork imap is
completely shared." ?

--D

> +	if (xfs_always_cow && !isnullstartblock(imap->br_startblock)) {
> +		*shared = true;
> +		return 0;
> +	}
> +
> +	/* Trim the mapping to the nearest shared extent boundary. */
> +	return xfs_reflink_trim_around_shared(ip, imap, shared);
> +}
> +
>  /*
>   * Trim the passed in imap to the next shared/unshared extent boundary, and
>   * if imap->br_startoff points to a shared extent reserve space for it in the
> @@ -248,13 +269,17 @@ xfs_reflink_reserve_cow(
>  	struct xfs_inode	*ip,
>  	struct xfs_bmbt_irec	*imap)
>  {
> -	struct xfs_ifork	*ifp = XFS_IFORK_PTR(ip, XFS_COW_FORK);
>  	struct xfs_bmbt_irec	got;
>  	int			error = 0;
>  	bool			eof = false;
>  	struct xfs_iext_cursor	icur;
>  	bool			shared;
>  
> +	if (!ip->i_cowfp) {
> +		ASSERT(!xfs_is_reflink_inode(ip));
> +		xfs_ifork_init_cow(ip);
> +	}
> +
>  	/*
>  	 * Search the COW fork extent list first.  This serves two purposes:
>  	 * first this implement the speculative preallocation using cowextisze,
> @@ -263,8 +288,8 @@ xfs_reflink_reserve_cow(
>  	 * extent list is generally faster than going out to the shared extent
>  	 * tree.
>  	 */
> -
> -	if (!xfs_iext_lookup_extent(ip, ifp, imap->br_startoff, &icur, &got))
> +	if (!xfs_iext_lookup_extent(ip, ip->i_cowfp, imap->br_startoff, &icur,
> +			&got))
>  		eof = true;
>  	if (!eof && got.br_startoff <= imap->br_startoff) {
>  		trace_xfs_reflink_cow_found(ip, imap);
> @@ -273,14 +298,10 @@ xfs_reflink_reserve_cow(
>  	}
>  
>  	/* Trim the mapping to the nearest shared extent boundary. */
> -	error = xfs_reflink_trim_around_shared(ip, imap, &shared);
> -	if (error)
> +	error = xfs_inode_need_cow(ip, imap, &shared);
> +	if (error || !shared)
>  		return error;
>  
> -	/* Not shared?  Just report the (potentially capped) extent. */
> -	if (!shared)
> -		return 0;
> -
>  	/*
>  	 * Fork all the shared blocks from our write offset until the end of
>  	 * the extent.
> @@ -374,7 +395,7 @@ xfs_find_trim_cow_extent(
>  	if (got.br_startoff > offset_fsb) {
>  		xfs_trim_extent(imap, imap->br_startoff,
>  				got.br_startoff - imap->br_startoff);
> -		return xfs_reflink_trim_around_shared(ip, imap, shared);
> +		return xfs_inode_need_cow(ip, imap, shared);
>  	}
>  
>  	*shared = true;
> @@ -408,7 +429,10 @@ xfs_reflink_allocate_cow(
>  	xfs_extlen_t		resblks = 0;
>  
>  	ASSERT(xfs_isilocked(ip, XFS_ILOCK_EXCL));
> -	ASSERT(xfs_is_reflink_inode(ip));
> +	if (!ip->i_cowfp) {
> +		ASSERT(!xfs_is_reflink_inode(ip));
> +		xfs_ifork_init_cow(ip);
> +	}
>  
>  	error = xfs_find_trim_cow_extent(ip, imap, shared, &found);
>  	if (error || !*shared)
> @@ -585,7 +609,7 @@ xfs_reflink_cancel_cow_range(
>  	int			error;
>  
>  	trace_xfs_reflink_cancel_cow_range(ip, offset, count);
> -	ASSERT(xfs_is_reflink_inode(ip));
> +	ASSERT(ip->i_cowfp);
>  
>  	offset_fsb = XFS_B_TO_FSBT(ip->i_mount, offset);
>  	if (count == NULLFILEOFF)
> diff --git a/fs/xfs/xfs_reflink.h b/fs/xfs/xfs_reflink.h
> index 7f47202b5639..b89e07239b82 100644
> --- a/fs/xfs/xfs_reflink.h
> +++ b/fs/xfs/xfs_reflink.h
> @@ -6,6 +6,19 @@
>  #ifndef __XFS_REFLINK_H
>  #define __XFS_REFLINK_H 1
>  
> +#ifdef DEBUG
> +extern bool xfs_always_cow;
> +#else
> +#define xfs_always_cow	false
> +#endif
> +
> +static inline bool xfs_is_cow_inode(struct xfs_inode *ip)
> +{
> +	return xfs_is_reflink_inode(ip) ||
> +		(xfs_always_cow &&
> +		 xfs_sb_version_hasreflink(&ip->i_mount->m_sb));
> +}
> +
>  extern int xfs_reflink_find_shared(struct xfs_mount *mp, struct xfs_trans *tp,
>  		xfs_agnumber_t agno, xfs_agblock_t agbno, xfs_extlen_t aglen,
>  		xfs_agblock_t *fbno, xfs_extlen_t *flen, bool find_maximal);
> -- 
> 2.18.0
> 

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 2/5] xfs: skip delalloc COW blocks in xfs_reflink_end_cow
  2018-09-20 14:42 ` [PATCH 2/5] xfs: skip delalloc COW blocks in xfs_reflink_end_cow Christoph Hellwig
@ 2018-09-20 20:22   ` Darrick J. Wong
  0 siblings, 0 replies; 16+ messages in thread
From: Darrick J. Wong @ 2018-09-20 20:22 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: linux-xfs

On Thu, Sep 20, 2018 at 04:42:17PM +0200, Christoph Hellwig wrote:
> The iomap direct I/O code issues a single ->end_io call for the whole
> I/O request, and if some of the extents cowered needed a COW operation
> it will call xfs_reflink_end_cow over the whole range.
> 
> When we do AIO writes we drop the iolock after doing the initial setup,
> but before the I/O completion.  Between dropping the lock and completing
> the I/O we can have a racing buffered write create new delalloc COW fork
> extents in the region covered by the outstanding direct I/O write, and
> thus see delalloc COW fork extents in xfs_reflink_end_cow.  As
> concurrent writes are fundamentally racy and no guarantees are given we
> can simply skip those.
> 
> This can be easily reproduced with xfstests generic/208 in always_cow
> mode.
> 
> Signed-off-by: Christoph Hellwig <hch@lst.de>

Looks ok to me...
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>

--D

> ---
>  fs/xfs/xfs_reflink.c | 10 ++++------
>  1 file changed, 4 insertions(+), 6 deletions(-)
> 
> diff --git a/fs/xfs/xfs_reflink.c b/fs/xfs/xfs_reflink.c
> index ac94ace45424..d1758771f21a 100644
> --- a/fs/xfs/xfs_reflink.c
> +++ b/fs/xfs/xfs_reflink.c
> @@ -686,14 +686,12 @@ xfs_reflink_end_cow(
>  		if (!del.br_blockcount)
>  			goto prev_extent;
>  
> -		ASSERT(!isnullstartblock(got.br_startblock));
> -
>  		/*
> -		 * Don't remap unwritten extents; these are
> -		 * speculatively preallocated CoW extents that have been
> -		 * allocated but have not yet been involved in a write.
> +		 * Only remap real extent that contain data.  With AIO
> +		 * speculatively preallocations can leak into the range we
> +		 * are called upon, and we need to skip them.
>  		 */
> -		if (got.br_state == XFS_EXT_UNWRITTEN)
> +		if (!xfs_bmap_is_real_extent(&got))
>  			goto prev_extent;
>  
>  		/* Unmap the old blocks in the data fork. */
> -- 
> 2.18.0
> 

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 5/5] xfs: introduce an always_cow mode
  2018-09-20 20:17   ` Darrick J. Wong
@ 2018-09-20 21:23     ` Dave Chinner
  2018-09-21  5:23       ` Christoph Hellwig
  0 siblings, 1 reply; 16+ messages in thread
From: Dave Chinner @ 2018-09-20 21:23 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: Christoph Hellwig, linux-xfs

On Thu, Sep 20, 2018 at 01:17:02PM -0700, Darrick J. Wong wrote:
> On Thu, Sep 20, 2018 at 04:42:20PM +0200, Christoph Hellwig wrote:
> > When the (debug-only) always_cow module parameter is set, we will always
> > write out place, even if the file is not reflinked.  In addition to being
> > a useful debug aid this prepares for modes where we must always write
> > out of place.  An example for that is the upcoming support for atomic
> > overwrites.
....
> > diff --git a/fs/xfs/xfs_reflink.c b/fs/xfs/xfs_reflink.c
> > index b25ea7a3a0e7..a893f6d65ff0 100644
> > --- a/fs/xfs/xfs_reflink.c
> > +++ b/fs/xfs/xfs_reflink.c
> > @@ -129,6 +129,11 @@
> >   * ioend, the better.
> >   */
> >  
> > +#ifdef DEBUG
> > +bool xfs_always_cow;
> > +module_param_named(always_cow, xfs_always_cow, bool, 0644);
> > +#endif
> 
> I would have expected this to end up in /proc/sys/fs/xfs/always_cow
> along with all the other behavior knobs, but I guess this is a debugging
> thing...?

sys/fs/xfs/debug/always_cow is the place for this if it is truly a
global debug option. However, that's probably not the best way to
enable this sort of debug for people with XFS root filesystems
trying to test XFS filesystems. I suspect it should be a per-mount
configuration (i.e. /sys/fs/xfs/<dev>/debug/always_cow)

However, this mode of operation seems like it is going to form the
basis of other per-filesystem features, so it would be good to know
how/if those features make use of this always_cow mode or are
controlled by some other set of knobs first.

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 5/5] xfs: introduce an always_cow mode
  2018-09-20 21:23     ` Dave Chinner
@ 2018-09-21  5:23       ` Christoph Hellwig
  0 siblings, 0 replies; 16+ messages in thread
From: Christoph Hellwig @ 2018-09-21  5:23 UTC (permalink / raw)
  To: Dave Chinner; +Cc: Darrick J. Wong, Christoph Hellwig, linux-xfs

On Fri, Sep 21, 2018 at 07:23:57AM +1000, Dave Chinner wrote:
> sys/fs/xfs/debug/always_cow is the place for this if it is truly a
> global debug option. However, that's probably not the best way to
> enable this sort of debug for people with XFS root filesystems
> trying to test XFS filesystems. I suspect it should be a per-mount
> configuration (i.e. /sys/fs/xfs/<dev>/debug/always_cow)

True.

> However, this mode of operation seems like it is going to form the
> basis of other per-filesystem features, so it would be good to know
> how/if those features make use of this always_cow mode or are
> controlled by some other set of knobs first.

The current atomic write support is a per-inode fcntl based flag.
(used to be an O_ATOMIC per open file flag, but that had all kinds
of issues).

The zoned device support doesn't really existing for a bit of prototype
write path code, but the current assumption is that all files in
ѕequential write only zones have the RT bit set, and will get this
behavior on that an a per-subperblock flag.

In other words - the know here is supposed to remain a debug build
only debug nob.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 1/5] xfs: cancel COW blocks before swapext
  2018-09-20 14:42 ` [PATCH 1/5] xfs: cancel COW blocks before swapext Christoph Hellwig
  2018-09-20 20:04   ` Darrick J. Wong
@ 2018-09-27 15:07   ` Brian Foster
  2018-09-30 22:50     ` Christoph Hellwig
  1 sibling, 1 reply; 16+ messages in thread
From: Brian Foster @ 2018-09-27 15:07 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: linux-xfs

On Thu, Sep 20, 2018 at 04:42:16PM +0200, Christoph Hellwig wrote:
> We need to make sure we have no outstanding COW blocks before we swap
> extents, as there is nothing preventing us from having COW preallocations
> on an inode that swapext is called on.  That case can easily be
> reproduced by the upcoming always_cow mode.
> 

Why? Is the current code to swap the cow forks not sufficient or
problematic in some way?

> Signed-off-by: Christoph Hellwig <hch@lst.de>
> ---
>  fs/xfs/xfs_bmap_util.c | 6 ++++++
>  1 file changed, 6 insertions(+)
> 
> diff --git a/fs/xfs/xfs_bmap_util.c b/fs/xfs/xfs_bmap_util.c
> index 414dbc31139c..e0e9cbc98ccd 100644
> --- a/fs/xfs/xfs_bmap_util.c
> +++ b/fs/xfs/xfs_bmap_util.c
> @@ -1515,6 +1515,12 @@ xfs_swap_extent_flush(
>  		return error;
>  	truncate_pagecache_range(VFS_I(ip), 0, -1);
>  
> +	if (xfs_inode_has_cow_data(ip)) {
> +		error = xfs_reflink_cancel_cow_range(ip, 0, NULLFILEOFF, true);
> +		if (error)
> +			return error;
> +	}
> +

The change seems fine, but doesn't this make the follow on code to swap
the cow blocks tags spurious..?

Brian

>  	/* Verify O_DIRECT for ftmp */
>  	if (VFS_I(ip)->i_mapping->nrpages)
>  		return -EINVAL;
> -- 
> 2.18.0
> 

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 4/5] xfs: print dangling delalloc extents
  2018-09-20 14:42 ` [PATCH 4/5] xfs: print dangling delalloc extents Christoph Hellwig
  2018-09-20 20:06   ` Darrick J. Wong
@ 2018-09-27 15:07   ` Brian Foster
  1 sibling, 0 replies; 16+ messages in thread
From: Brian Foster @ 2018-09-27 15:07 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: linux-xfs

On Thu, Sep 20, 2018 at 04:42:19PM +0200, Christoph Hellwig wrote:
> Instead of just asserting that we have no delalloc space dangling
> in an inode that gets freed print the actual offenders for debug
> mode.
> 
> Signed-off-by: Christoph Hellwig <hch@lst.de>
> ---
>  fs/xfs/xfs_super.c | 32 +++++++++++++++++++++++++++++++-
>  1 file changed, 31 insertions(+), 1 deletion(-)
> 
> diff --git a/fs/xfs/xfs_super.c b/fs/xfs/xfs_super.c
> index 207ee302b1bb..62850b420dc7 100644
> --- a/fs/xfs/xfs_super.c
> +++ b/fs/xfs/xfs_super.c
> @@ -933,6 +933,31 @@ xfs_fs_alloc_inode(
>  	return NULL;
>  }
>  
> +#ifdef DEBUG
> +static void
> +xfs_check_delalloc(
> +	struct xfs_inode	*ip,
> +	int			whichfork)
> +{
> +	struct xfs_ifork	*ifp = XFS_IFORK_PTR(ip, whichfork);
> +	struct xfs_bmbt_irec	got;
> +	struct xfs_iext_cursor	icur;
> +
> +	if (!ifp || !xfs_iext_lookup_extent(ip, ifp, 0, &icur, &got))
> +		return;
> +	do {
> +		if (isnullstartblock(got.br_startblock)) {
> +			xfs_warn(ip->i_mount,
> +				"%s fork has delalloc extent at [0x%llx:0x%llx]",
> +				whichfork == XFS_DATA_FORK ? "data" : "cow",
> +				got.br_startoff, got.br_blockcount);

If we're going to print warnings, can we include inode-identifying info
as well?

Brian

> +		}
> +	} while (xfs_iext_next_extent(ifp, &icur, &got));
> +}
> +#else
> +#define xfs_check_delalloc(ip, whichfork)	do { } while (0)
> +#endif
> +
>  /*
>   * Now that the generic code is guaranteed not to be accessing
>   * the linux inode, we can inactivate and reclaim the inode.
> @@ -951,7 +976,12 @@ xfs_fs_destroy_inode(
>  
>  	xfs_inactive(ip);
>  
> -	ASSERT(XFS_FORCED_SHUTDOWN(ip->i_mount) || ip->i_delayed_blks == 0);
> +	if (!XFS_FORCED_SHUTDOWN(ip->i_mount) && ip->i_delayed_blks) {
> +		xfs_check_delalloc(ip, XFS_DATA_FORK);
> +		xfs_check_delalloc(ip, XFS_COW_FORK);
> +		ASSERT(0);
> +	}
> +
>  	XFS_STATS_INC(ip->i_mount, vn_reclaim);
>  
>  	/*
> -- 
> 2.18.0
> 

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 1/5] xfs: cancel COW blocks before swapext
  2018-09-27 15:07   ` Brian Foster
@ 2018-09-30 22:50     ` Christoph Hellwig
  2018-10-01 11:03       ` Brian Foster
  0 siblings, 1 reply; 16+ messages in thread
From: Christoph Hellwig @ 2018-09-30 22:50 UTC (permalink / raw)
  To: Brian Foster; +Cc: Christoph Hellwig, linux-xfs

On Thu, Sep 27, 2018 at 11:07:31AM -0400, Brian Foster wrote:
> On Thu, Sep 20, 2018 at 04:42:16PM +0200, Christoph Hellwig wrote:
> > We need to make sure we have no outstanding COW blocks before we swap
> > extents, as there is nothing preventing us from having COW preallocations
> > on an inode that swapext is called on.  That case can easily be
> > reproduced by the upcoming always_cow mode.
> > 
> 
> Why? Is the current code to swap the cow forks not sufficient or
> problematic in some way?

The code didn't seem happy with non-delalloc COW fork extents sticking
around over swapext, but I'd have to go back and look up why.

It seemed easiest to just cancel them as there is no reason to keep
these alive over a swapext.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 1/5] xfs: cancel COW blocks before swapext
  2018-09-30 22:50     ` Christoph Hellwig
@ 2018-10-01 11:03       ` Brian Foster
  0 siblings, 0 replies; 16+ messages in thread
From: Brian Foster @ 2018-10-01 11:03 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: linux-xfs

On Mon, Oct 01, 2018 at 12:50:30AM +0200, Christoph Hellwig wrote:
> On Thu, Sep 27, 2018 at 11:07:31AM -0400, Brian Foster wrote:
> > On Thu, Sep 20, 2018 at 04:42:16PM +0200, Christoph Hellwig wrote:
> > > We need to make sure we have no outstanding COW blocks before we swap
> > > extents, as there is nothing preventing us from having COW preallocations
> > > on an inode that swapext is called on.  That case can easily be
> > > reproduced by the upcoming always_cow mode.
> > > 
> > 
> > Why? Is the current code to swap the cow forks not sufficient or
> > problematic in some way?
> 
> The code didn't seem happy with non-delalloc COW fork extents sticking
> around over swapext, but I'd have to go back and look up why.
> 

It would be nice to have at least a high level note in the commit log
for historical purposes, so we have a reference on what needs fixing if
we revisit it in the future.

> It seemed easiest to just cancel them as there is no reason to keep
> these alive over a swapext.

Seems reasonable to me. I'd just perhaps replace the cowblocks flag swap
code with asserts or something to be consistent, because it seems that
should never be set on either inode with this change in place.

Brian

^ permalink raw reply	[flat|nested] 16+ messages in thread

end of thread, other threads:[~2018-10-01 17:41 UTC | newest]

Thread overview: 16+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-09-20 14:42 more reflink fixes & debug Christoph Hellwig
2018-09-20 14:42 ` [PATCH 1/5] xfs: cancel COW blocks before swapext Christoph Hellwig
2018-09-20 20:04   ` Darrick J. Wong
2018-09-27 15:07   ` Brian Foster
2018-09-30 22:50     ` Christoph Hellwig
2018-10-01 11:03       ` Brian Foster
2018-09-20 14:42 ` [PATCH 2/5] xfs: skip delalloc COW blocks in xfs_reflink_end_cow Christoph Hellwig
2018-09-20 20:22   ` Darrick J. Wong
2018-09-20 14:42 ` [PATCH 3/5] xfs: fix fork selection in xfs_find_trim_cow_extent Christoph Hellwig
2018-09-20 14:42 ` [PATCH 4/5] xfs: print dangling delalloc extents Christoph Hellwig
2018-09-20 20:06   ` Darrick J. Wong
2018-09-27 15:07   ` Brian Foster
2018-09-20 14:42 ` [PATCH 5/5] xfs: introduce an always_cow mode Christoph Hellwig
2018-09-20 20:17   ` Darrick J. Wong
2018-09-20 21:23     ` Dave Chinner
2018-09-21  5:23       ` Christoph Hellwig

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.