* [PATCH 0/3] xfs: small fixes for 5.19 cycle
@ 2022-05-24  2:21 Dave Chinner
  2022-05-24  2:21 ` [PATCH 1/3] xfs: avoid unnecessary runtime sibling pointer endian conversions Dave Chinner
                   ` (2 more replies)
  0 siblings, 3 replies; 12+ messages in thread
From: Dave Chinner @ 2022-05-24  2:21 UTC (permalink / raw)
  To: linux-xfs

Hi folks,

In this series are two small changes to debug code that have made
test runs a little more resilient for me over this cycle. One is a
fix to an assert that is popping on an error handling path that
doesn't take into account an error occurring. The other is to
convert a hard ASSERT fail to an XFS_IS_CORRUPT check that will dump
a failure to dmesg and fail tests that way instead of hanging the
machine by killing an unmount process.

The other patch is a small optimisation to the new btree sibling
pointer checking. This explicitly inlines the new checking function,
reducing the code size significantly and making the code simpler and
faster to execute. This should address the small performance
regressions reported on AIM7 workloads caused by the sibling checks.

Cheers,

Dave.



* [PATCH 1/3] xfs: avoid unnecessary runtime sibling pointer endian conversions
  2022-05-24  2:21 [PATCH 0/3] xfs: small fixes for 5.19 cycle Dave Chinner
@ 2022-05-24  2:21 ` Dave Chinner
  2022-05-24  3:46   ` Darrick J. Wong
  2022-05-24  8:13   ` Christoph Hellwig
  2022-05-24  2:21 ` [PATCH 2/3] xfs: don't assert fail on perag references on teardown Dave Chinner
  2022-05-24  2:21 ` [PATCH 3/3] xfs: assert in xfs_btree_del_cursor should take into account error Dave Chinner
  2 siblings, 2 replies; 12+ messages in thread
From: Dave Chinner @ 2022-05-24  2:21 UTC (permalink / raw)
  To: linux-xfs

From: Dave Chinner <dchinner@redhat.com>

Commit dc04db2aa7c9 has caused a small aim7 regression, showing a
small increase in CPU usage in __xfs_btree_check_sblock() as a
result of the extra checking.

This is likely due to the endian conversion of the sibling pointers
being unconditional instead of relying on the compiler to endian
convert the NULL pointer at compile time and avoiding the runtime
conversion for this common case.

Rework the checks so that endian conversion of the sibling pointers
is only done if they are not null as the original code did.
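
As a rough userspace illustration of that idea (not the kernel code
itself): the null test compares the raw on-disk value against a
constant that the compiler can convert ahead of time, and the
conversion to CPU byte order only happens for non-null siblings.
The names be64_t, NULL_FSBLOCK, be64_encode(), be64_decode() and
sibling_is_self() are hypothetical stand-ins for this sketch only.

```c
#include <stdint.h>

/* Hypothetical stand-ins for the kernel's __be64 and NULLFSBLOCK. */
typedef uint64_t be64_t;
#define NULL_FSBLOCK	UINT64_MAX

/* Store a CPU-order value as big-endian bytes, on any host endianness. */
static inline be64_t be64_encode(uint64_t v)
{
	be64_t out = 0;
	unsigned char *p = (unsigned char *)&out;
	int i;

	for (i = 0; i < 8; i++)
		p[i] = (unsigned char)(v >> (56 - 8 * i));
	return out;
}

/* Read big-endian bytes back into a CPU-order value. */
static inline uint64_t be64_decode(be64_t v)
{
	const unsigned char *p = (const unsigned char *)&v;
	uint64_t out = 0;
	int i;

	for (i = 0; i < 8; i++)
		out = (out << 8) | p[i];
	return out;
}

/*
 * Returns 1 if the on-disk sibling points back at this block, else 0.
 * The common null case is decided without decoding dsibling; only
 * non-null siblings pay for the conversion to CPU byte order.
 */
static inline int sibling_is_self(uint64_t self, be64_t dsibling)
{
	if (dsibling == be64_encode(NULL_FSBLOCK))
		return 0;
	return be64_decode(dsibling) == self;
}
```

Whether the encode of the constant is actually folded at build time
depends on the compiler and optimisation level, so treat this as a
sketch of the ordering of the checks rather than a measurement.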

.... and these need to be "inline" because the compiler completely
fails to inline them automatically like it should be doing.

$ size fs/xfs/libxfs/xfs_btree.o*
   text	   data	    bss	    dec	    hex	filename
  51874	    240	      0	  52114	   cb92 fs/xfs/libxfs/xfs_btree.o.orig
  51562	    240	      0	  51802	   ca5a fs/xfs/libxfs/xfs_btree.o.inline

Just when you think the tools have advanced sufficiently we don't
have to care about stuff like this anymore, along comes a reminder
that *our tools still suck*.

Fixes: dc04db2aa7c9 ("xfs: detect self referencing btree sibling pointers")
Reported-by: kernel test robot <oliver.sang@intel.com>
Signed-off-by: Dave Chinner <dchinner@redhat.com>
---
 fs/xfs/libxfs/xfs_btree.c | 47 +++++++++++++++++++++++++++------------
 1 file changed, 33 insertions(+), 14 deletions(-)

diff --git a/fs/xfs/libxfs/xfs_btree.c b/fs/xfs/libxfs/xfs_btree.c
index 2aa300f7461f..786ec1cb1bba 100644
--- a/fs/xfs/libxfs/xfs_btree.c
+++ b/fs/xfs/libxfs/xfs_btree.c
@@ -51,16 +51,31 @@ xfs_btree_magic(
 	return magic;
 }
 
-static xfs_failaddr_t
+/*
+ * These sibling pointer checks are optimised for null sibling pointers. This
+ * happens a lot, and we don't need to byte swap at runtime if the sibling
+ * pointer is NULL.
+ *
+ * These are explicitly marked at inline because the cost of calling them as
+ * functions instead of inlining them is about 36 bytes extra code per call site
+ * on x86-64. Yes, gcc-11 fails to inline them, and explicit inlining of these
+ * two sibling check functions reduces the compiled code size by over 300
+ * bytes.
+ */
+static inline xfs_failaddr_t
 xfs_btree_check_lblock_siblings(
 	struct xfs_mount	*mp,
 	struct xfs_btree_cur	*cur,
 	int			level,
 	xfs_fsblock_t		fsb,
-	xfs_fsblock_t		sibling)
+	__be64			dsibling)
 {
-	if (sibling == NULLFSBLOCK)
+	xfs_fsblock_t		sibling;
+
+	if (dsibling == cpu_to_be64(NULLFSBLOCK))
 		return NULL;
+
+	sibling = be64_to_cpu(dsibling);
 	if (sibling == fsb)
 		return __this_address;
 	if (level >= 0) {
@@ -74,17 +89,21 @@ xfs_btree_check_lblock_siblings(
 	return NULL;
 }
 
-static xfs_failaddr_t
+static inline xfs_failaddr_t
 xfs_btree_check_sblock_siblings(
 	struct xfs_mount	*mp,
 	struct xfs_btree_cur	*cur,
 	int			level,
 	xfs_agnumber_t		agno,
 	xfs_agblock_t		agbno,
-	xfs_agblock_t		sibling)
+	__be32			dsibling)
 {
-	if (sibling == NULLAGBLOCK)
+	xfs_agblock_t		sibling;
+
+	if (dsibling == cpu_to_be32(NULLAGBLOCK))
 		return NULL;
+
+	sibling = be32_to_cpu(dsibling);
 	if (sibling == agbno)
 		return __this_address;
 	if (level >= 0) {
@@ -136,10 +155,10 @@ __xfs_btree_check_lblock(
 		fsb = XFS_DADDR_TO_FSB(mp, xfs_buf_daddr(bp));
 
 	fa = xfs_btree_check_lblock_siblings(mp, cur, level, fsb,
-			be64_to_cpu(block->bb_u.l.bb_leftsib));
+			block->bb_u.l.bb_leftsib);
 	if (!fa)
 		fa = xfs_btree_check_lblock_siblings(mp, cur, level, fsb,
-				be64_to_cpu(block->bb_u.l.bb_rightsib));
+				block->bb_u.l.bb_rightsib);
 	return fa;
 }
 
@@ -204,10 +223,10 @@ __xfs_btree_check_sblock(
 	}
 
 	fa = xfs_btree_check_sblock_siblings(mp, cur, level, agno, agbno,
-			be32_to_cpu(block->bb_u.s.bb_leftsib));
+			block->bb_u.s.bb_leftsib);
 	if (!fa)
 		fa = xfs_btree_check_sblock_siblings(mp, cur, level, agno,
-				 agbno, be32_to_cpu(block->bb_u.s.bb_rightsib));
+				 agbno, block->bb_u.s.bb_rightsib);
 	return fa;
 }
 
@@ -4523,10 +4542,10 @@ xfs_btree_lblock_verify(
 	/* sibling pointer verification */
 	fsb = XFS_DADDR_TO_FSB(mp, xfs_buf_daddr(bp));
 	fa = xfs_btree_check_lblock_siblings(mp, NULL, -1, fsb,
-			be64_to_cpu(block->bb_u.l.bb_leftsib));
+			block->bb_u.l.bb_leftsib);
 	if (!fa)
 		fa = xfs_btree_check_lblock_siblings(mp, NULL, -1, fsb,
-				be64_to_cpu(block->bb_u.l.bb_rightsib));
+				block->bb_u.l.bb_rightsib);
 	return fa;
 }
 
@@ -4580,10 +4599,10 @@ xfs_btree_sblock_verify(
 	agno = xfs_daddr_to_agno(mp, xfs_buf_daddr(bp));
 	agbno = xfs_daddr_to_agbno(mp, xfs_buf_daddr(bp));
 	fa = xfs_btree_check_sblock_siblings(mp, NULL, -1, agno, agbno,
-			be32_to_cpu(block->bb_u.s.bb_leftsib));
+			block->bb_u.s.bb_leftsib);
 	if (!fa)
 		fa = xfs_btree_check_sblock_siblings(mp, NULL, -1, agno, agbno,
-				be32_to_cpu(block->bb_u.s.bb_rightsib));
+				block->bb_u.s.bb_rightsib);
 	return fa;
 }
 
-- 
2.35.1



* [PATCH 2/3] xfs: don't assert fail on perag references on teardown
  2022-05-24  2:21 [PATCH 0/3] xfs: small fixes for 5.19 cycle Dave Chinner
  2022-05-24  2:21 ` [PATCH 1/3] xfs: avoid unnecessary runtime sibling pointer endian conversions Dave Chinner
@ 2022-05-24  2:21 ` Dave Chinner
  2022-05-24  3:48   ` Darrick J. Wong
  2022-05-24  8:14   ` Christoph Hellwig
  2022-05-24  2:21 ` [PATCH 3/3] xfs: assert in xfs_btree_del_cursor should take into account error Dave Chinner
  2 siblings, 2 replies; 12+ messages in thread
From: Dave Chinner @ 2022-05-24  2:21 UTC (permalink / raw)
  To: linux-xfs

From: Dave Chinner <dchinner@redhat.com>

Not fatal, the assert is there to catch developer attention. I'm
seeing this occasionally during recoveryloop testing after a
shutdown, and I don't want this to stop an overnight recoveryloop
run as it is currently doing.

Convert the ASSERT to an XFS_IS_CORRUPT() check so it will dump a
corruption report into the log and cause a test failure that way,
but it won't stop the machine dead.
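
The difference between the two styles of check can be sketched in
plain userspace C. This is only an illustration under assumptions:
CHECK_CORRUPT(), report_if_corrupt(), corruption_reports and
teardown_object() are hypothetical names, not the XFS macro or its
implementation. The point is that the check logs and hands the
condition back to the caller instead of stopping execution.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical counter so a test harness can notice reports. */
static int corruption_reports;

/* Log the failed condition and return it; never abort. */
static bool report_if_corrupt(bool cond, const char *expr,
			      const char *file, int line)
{
	if (cond) {
		corruption_reports++;
		fprintf(stderr, "corruption detected: %s, file: %s, line: %d\n",
			expr, file, line);
	}
	return cond;
}

#define CHECK_CORRUPT(expr) \
	report_if_corrupt((expr), #expr, __FILE__, __LINE__)

/*
 * Hypothetical teardown path: a leaked reference is reported, but
 * teardown carries on so the caller (e.g. an unmount) can finish.
 */
static bool teardown_object(int refcount)
{
	bool leaked = CHECK_CORRUPT(refcount != 0);

	/* ... release the remaining resources regardless ... */
	return leaked;
}
```

A test harness that scans the log for the report still sees a
failure, but the process doing the teardown is not killed.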

Signed-off-by: Dave Chinner <dchinner@redhat.com>
---
 fs/xfs/libxfs/xfs_ag.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/fs/xfs/libxfs/xfs_ag.c b/fs/xfs/libxfs/xfs_ag.c
index 1e4ee042d52f..3e920cf1b454 100644
--- a/fs/xfs/libxfs/xfs_ag.c
+++ b/fs/xfs/libxfs/xfs_ag.c
@@ -173,7 +173,6 @@ __xfs_free_perag(
 	struct xfs_perag *pag = container_of(head, struct xfs_perag, rcu_head);
 
 	ASSERT(!delayed_work_pending(&pag->pag_blockgc_work));
-	ASSERT(atomic_read(&pag->pag_ref) == 0);
 	kmem_free(pag);
 }
 
@@ -192,7 +191,7 @@ xfs_free_perag(
 		pag = radix_tree_delete(&mp->m_perag_tree, agno);
 		spin_unlock(&mp->m_perag_lock);
 		ASSERT(pag);
-		ASSERT(atomic_read(&pag->pag_ref) == 0);
+		XFS_IS_CORRUPT(pag->pag_mount, atomic_read(&pag->pag_ref) != 0);
 
 		cancel_delayed_work_sync(&pag->pag_blockgc_work);
 		xfs_iunlink_destroy(pag);
-- 
2.35.1



* [PATCH 3/3] xfs: assert in xfs_btree_del_cursor should take into account error
  2022-05-24  2:21 [PATCH 0/3] xfs: small fixes for 5.19 cycle Dave Chinner
  2022-05-24  2:21 ` [PATCH 1/3] xfs: avoid unnecessary runtime sibling pointer endian conversions Dave Chinner
  2022-05-24  2:21 ` [PATCH 2/3] xfs: don't assert fail on perag references on teardown Dave Chinner
@ 2022-05-24  2:21 ` Dave Chinner
  2022-05-24  3:48   ` Darrick J. Wong
  2022-05-24  8:15   ` Christoph Hellwig
  2 siblings, 2 replies; 12+ messages in thread
From: Dave Chinner @ 2022-05-24  2:21 UTC (permalink / raw)
  To: linux-xfs

From: Dave Chinner <dchinner@redhat.com>

xfs/538 on a 1kB block filesystem failed with this assert:

XFS: Assertion failed: cur->bc_btnum != XFS_BTNUM_BMAP || cur->bc_ino.allocated == 0 || xfs_is_shutdown(cur->bc_mp), file: fs/xfs/libxfs/xfs_btree.c, line: 448

The problem was that an allocation failed unexpectedly in
xfs_bmbt_alloc_block() after roughly 150,000 minlen allocation error
injections, resulting in an EFSCORRUPTED error being returned to
xfs_bmapi_write(). The error occurred on extent-to-btree format
conversion allocating the new root block:

 RIP: 0010:xfs_bmbt_alloc_block+0x177/0x210
 Call Trace:
  <TASK>
  xfs_btree_new_iroot+0xdf/0x520
  xfs_btree_make_block_unfull+0x10d/0x1c0
  xfs_btree_insrec+0x364/0x790
  xfs_btree_insert+0xaa/0x210
  xfs_bmap_add_extent_hole_real+0x1fe/0x9a0
  xfs_bmapi_allocate+0x34c/0x420
  xfs_bmapi_write+0x53c/0x9c0
  xfs_alloc_file_space+0xee/0x320
  xfs_file_fallocate+0x36b/0x450
  vfs_fallocate+0x148/0x340
  __x64_sys_fallocate+0x3c/0x70
  do_syscall_64+0x35/0x80
  entry_SYSCALL_64_after_hwframe+0x44/0xa

Why the allocation failed at this point is unknown, but it is likely
that we ran the transaction out of reserved space, and the filesystem
out of space for bmbt blocks, because all the minlen allocations
being done caused worst case fragmentation of a large allocation.

Regardless of the cause, we've then called xfs_bmapi_finish() which
calls xfs_btree_del_cursor(cur, error) to tear down the cursor.

So we have a failed operation, error != 0, cur->bc_ino.allocated > 0
and the filesystem is still up. The assert fails to take into
account that allocation can fail with an error and the transaction
teardown will shut the filesystem down if necessary. i.e. the
assert needs to check "|| error != 0" as well, because at this point
shutdown is pending because the current transaction is dirty....
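
Written out as a standalone predicate, the condition the assert
needs to accept looks roughly like this. It is a userspace sketch
only; cursor_teardown_state_ok() and its parameters are hypothetical
names standing in for the cursor fields and the error argument.

```c
#include <stdbool.h>

/*
 * True when the cursor state at teardown is acceptable: either this
 * is not a bmbt cursor, or no blocks are left unaccounted, or the
 * filesystem is already shut down, or the operation failed and the
 * dirty transaction cancel is about to shut it down.
 */
static bool cursor_teardown_state_ok(bool is_bmap_cursor, int allocated,
				     bool fs_shutdown, int error)
{
	return !is_bmap_cursor || allocated == 0 || fs_shutdown ||
	       error != 0;
}
```

The case reported above is (bmbt cursor, allocated > 0, not shut
down, error != 0): it is accepted once the error term is included,
while the same state with error == 0 is still flagged.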

Signed-off-by: Dave Chinner <dchinner@redhat.com>
---
 fs/xfs/libxfs/xfs_btree.c | 8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/fs/xfs/libxfs/xfs_btree.c b/fs/xfs/libxfs/xfs_btree.c
index 786ec1cb1bba..32100cfb9dfc 100644
--- a/fs/xfs/libxfs/xfs_btree.c
+++ b/fs/xfs/libxfs/xfs_btree.c
@@ -445,8 +445,14 @@ xfs_btree_del_cursor(
 			break;
 	}
 
+	/*
+	 * If we are doing a BMBT update, the number of unaccounted blocks
+	 * allocated during this cursor life time should be zero. If it's not
+	 * zero, then we should be shut down or on our way to shutdown due to
+	 * cancelling a dirty transaction on error.
+	 */
 	ASSERT(cur->bc_btnum != XFS_BTNUM_BMAP || cur->bc_ino.allocated == 0 ||
-	       xfs_is_shutdown(cur->bc_mp));
+	       xfs_is_shutdown(cur->bc_mp) || error != 0);
 	if (unlikely(cur->bc_flags & XFS_BTREE_STAGING))
 		kmem_free(cur->bc_ops);
 	if (!(cur->bc_flags & XFS_BTREE_LONG_PTRS) && cur->bc_ag.pag)
-- 
2.35.1



* Re: [PATCH 1/3] xfs: avoid unnecessary runtime sibling pointer endian conversions
  2022-05-24  2:21 ` [PATCH 1/3] xfs: avoid unnecessary runtime sibling pointer endian conversions Dave Chinner
@ 2022-05-24  3:46   ` Darrick J. Wong
  2022-05-24  8:13   ` Christoph Hellwig
  1 sibling, 0 replies; 12+ messages in thread
From: Darrick J. Wong @ 2022-05-24  3:46 UTC (permalink / raw)
  To: Dave Chinner; +Cc: linux-xfs

On Tue, May 24, 2022 at 12:21:56PM +1000, Dave Chinner wrote:
> From: Dave Chinner <dchinner@redhat.com>
> 
> Commit dc04db2aa7c9 has caused a small aim7 regression, showing a
> small increase in CPU usage in __xfs_btree_check_sblock() as a
> result of the extra checking.
> 
> This is likely due to the endian conversion of the sibling pointers
> being unconditional instead of relying on the compiler to endian
> convert the NULL pointer at compile time and avoiding the runtime
> conversion for this common case.
> 
> Rework the checks so that endian conversion of the sibling pointers
> is only done if they are not null as the original code did.
> 
> .... and these need to be "inline" because the compiler completely
> fails to inline them automatically like it should be doing.
> 
> $ size fs/xfs/libxfs/xfs_btree.o*
>    text	   data	    bss	    dec	    hex	filename
>   51874	    240	      0	  52114	   cb92 fs/xfs/libxfs/xfs_btree.o.orig
>   51562	    240	      0	  51802	   ca5a fs/xfs/libxfs/xfs_btree.o.inline
> 
> Just when you think the tools have advanced sufficiently we don't
> have to care about stuff like this anymore, along comes a reminder
> that *our tools still suck*.
> 
> Fixes: dc04db2aa7c9 ("xfs: detect self referencing btree sibling pointers")
> Reported-by: kernel test robot <oliver.sang@intel.com>
> Signed-off-by: Dave Chinner <dchinner@redhat.com>
> ---
>  fs/xfs/libxfs/xfs_btree.c | 47 +++++++++++++++++++++++++++------------
>  1 file changed, 33 insertions(+), 14 deletions(-)
> 
> diff --git a/fs/xfs/libxfs/xfs_btree.c b/fs/xfs/libxfs/xfs_btree.c
> index 2aa300f7461f..786ec1cb1bba 100644
> --- a/fs/xfs/libxfs/xfs_btree.c
> +++ b/fs/xfs/libxfs/xfs_btree.c
> @@ -51,16 +51,31 @@ xfs_btree_magic(
>  	return magic;
>  }
>  
> -static xfs_failaddr_t
> +/*
> + * These sibling pointer checks are optimised for null sibling pointers. This
> + * happens a lot, and we don't need to byte swap at runtime if the sibling
> + * pointer is NULL.
> + *
> + * These are explicitly marked at inline because the cost of calling them as
> + * functions instead of inlining them is about 36 bytes extra code per call site
> + * on x86-64. Yes, gcc-11 fails to inline them, and explicit inlining of these
> + * two sibling check functions reduces the compiled code size by over 300
> + * bytes.
> + */
> +static inline xfs_failaddr_t
>  xfs_btree_check_lblock_siblings(
>  	struct xfs_mount	*mp,
>  	struct xfs_btree_cur	*cur,
>  	int			level,
>  	xfs_fsblock_t		fsb,
> -	xfs_fsblock_t		sibling)
> +	__be64			dsibling)
>  {
> -	if (sibling == NULLFSBLOCK)
> +	xfs_fsblock_t		sibling;
> +
> +	if (dsibling == cpu_to_be64(NULLFSBLOCK))
>  		return NULL;
> +
> +	sibling = be64_to_cpu(dsibling);
>  	if (sibling == fsb)
>  		return __this_address;
>  	if (level >= 0) {
> @@ -74,17 +89,21 @@ xfs_btree_check_lblock_siblings(
>  	return NULL;
>  }
>  
> -static xfs_failaddr_t
> +static inline xfs_failaddr_t
>  xfs_btree_check_sblock_siblings(
>  	struct xfs_mount	*mp,
>  	struct xfs_btree_cur	*cur,
>  	int			level,
>  	xfs_agnumber_t		agno,
>  	xfs_agblock_t		agbno,
> -	xfs_agblock_t		sibling)
> +	__be32			dsibling)
>  {
> -	if (sibling == NULLAGBLOCK)
> +	xfs_agblock_t		sibling;
> +
> +	if (dsibling == cpu_to_be32(NULLAGBLOCK))
>  		return NULL;
> +
> +	sibling = be32_to_cpu(dsibling);
>  	if (sibling == agbno)
>  		return __this_address;
>  	if (level >= 0) {
> @@ -136,10 +155,10 @@ __xfs_btree_check_lblock(
>  		fsb = XFS_DADDR_TO_FSB(mp, xfs_buf_daddr(bp));
>  
>  	fa = xfs_btree_check_lblock_siblings(mp, cur, level, fsb,
> -			be64_to_cpu(block->bb_u.l.bb_leftsib));
> +			block->bb_u.l.bb_leftsib);
>  	if (!fa)
>  		fa = xfs_btree_check_lblock_siblings(mp, cur, level, fsb,
> -				be64_to_cpu(block->bb_u.l.bb_rightsib));
> +				block->bb_u.l.bb_rightsib);
>  	return fa;
>  }
>  
> @@ -204,10 +223,10 @@ __xfs_btree_check_sblock(
>  	}
>  
>  	fa = xfs_btree_check_sblock_siblings(mp, cur, level, agno, agbno,
> -			be32_to_cpu(block->bb_u.s.bb_leftsib));
> +			block->bb_u.s.bb_leftsib);
>  	if (!fa)
>  		fa = xfs_btree_check_sblock_siblings(mp, cur, level, agno,
> -				 agbno, be32_to_cpu(block->bb_u.s.bb_rightsib));
> +				 agbno, block->bb_u.s.bb_rightsib);
>  	return fa;
>  }
>  
> @@ -4523,10 +4542,10 @@ xfs_btree_lblock_verify(
>  	/* sibling pointer verification */
>  	fsb = XFS_DADDR_TO_FSB(mp, xfs_buf_daddr(bp));
>  	fa = xfs_btree_check_lblock_siblings(mp, NULL, -1, fsb,
> -			be64_to_cpu(block->bb_u.l.bb_leftsib));
> +			block->bb_u.l.bb_leftsib);
>  	if (!fa)
>  		fa = xfs_btree_check_lblock_siblings(mp, NULL, -1, fsb,
> -				be64_to_cpu(block->bb_u.l.bb_rightsib));
> +				block->bb_u.l.bb_rightsib);
>  	return fa;

The next thing I wanna do is make __xfs_btree_check_[sl]block actually
print out the failaddr_t returned to it.

I half wonder if it would be *even faster* to pass in a *pointer* to the
sibling fields and use be64_to_cpup, but the time savings will probably
be eaten up on regrokking asm code, so

Reviewed-by: Darrick J. Wong <djwong@kernel.org>

--D

>  }
>  
> @@ -4580,10 +4599,10 @@ xfs_btree_sblock_verify(
>  	agno = xfs_daddr_to_agno(mp, xfs_buf_daddr(bp));
>  	agbno = xfs_daddr_to_agbno(mp, xfs_buf_daddr(bp));
>  	fa = xfs_btree_check_sblock_siblings(mp, NULL, -1, agno, agbno,
> -			be32_to_cpu(block->bb_u.s.bb_leftsib));
> +			block->bb_u.s.bb_leftsib);
>  	if (!fa)
>  		fa = xfs_btree_check_sblock_siblings(mp, NULL, -1, agno, agbno,
> -				be32_to_cpu(block->bb_u.s.bb_rightsib));
> +				block->bb_u.s.bb_rightsib);
>  	return fa;
>  }
>  
> -- 
> 2.35.1
> 


* Re: [PATCH 2/3] xfs: don't assert fail on perag references on teardown
  2022-05-24  2:21 ` [PATCH 2/3] xfs: don't assert fail on perag references on teardown Dave Chinner
@ 2022-05-24  3:48   ` Darrick J. Wong
  2022-05-24  4:00     ` Dave Chinner
  2022-05-24  8:14   ` Christoph Hellwig
  1 sibling, 1 reply; 12+ messages in thread
From: Darrick J. Wong @ 2022-05-24  3:48 UTC (permalink / raw)
  To: Dave Chinner; +Cc: linux-xfs

On Tue, May 24, 2022 at 12:21:57PM +1000, Dave Chinner wrote:
> From: Dave Chinner <dchinner@redhat.com>
> 
> Not fatal, the assert is there to catch developer attention. I'm
> seeing this occasionally during recoveryloop testing after a
> shutdown, and I don't want this to stop an overnight recoveryloop
> run as it is currently doing.
> 
> Convert the ASSERT to a XFS_IS_CORRUPT() check so it will dump a
> corruption report into the log and cause a test failure that way,
> but it won't stop the machine dead.
> 
> Signed-off-by: Dave Chinner <dchinner@redhat.com>
> ---
>  fs/xfs/libxfs/xfs_ag.c | 3 +--
>  1 file changed, 1 insertion(+), 2 deletions(-)
> 
> diff --git a/fs/xfs/libxfs/xfs_ag.c b/fs/xfs/libxfs/xfs_ag.c
> index 1e4ee042d52f..3e920cf1b454 100644
> --- a/fs/xfs/libxfs/xfs_ag.c
> +++ b/fs/xfs/libxfs/xfs_ag.c
> @@ -173,7 +173,6 @@ __xfs_free_perag(
>  	struct xfs_perag *pag = container_of(head, struct xfs_perag, rcu_head);
>  
>  	ASSERT(!delayed_work_pending(&pag->pag_blockgc_work));
> -	ASSERT(atomic_read(&pag->pag_ref) == 0);

Er, shouldn't this also be converted to XFS_IS_CORRUPT?  That's what the
commit message said...

--D

>  	kmem_free(pag);
>  }
>  
> @@ -192,7 +191,7 @@ xfs_free_perag(
>  		pag = radix_tree_delete(&mp->m_perag_tree, agno);
>  		spin_unlock(&mp->m_perag_lock);
>  		ASSERT(pag);
> -		ASSERT(atomic_read(&pag->pag_ref) == 0);
> +		XFS_IS_CORRUPT(pag->pag_mount, atomic_read(&pag->pag_ref) != 0);
>  
>  		cancel_delayed_work_sync(&pag->pag_blockgc_work);
>  		xfs_iunlink_destroy(pag);
> -- 
> 2.35.1
> 


* Re: [PATCH 3/3] xfs: assert in xfs_btree_del_cursor should take into account error
  2022-05-24  2:21 ` [PATCH 3/3] xfs: assert in xfs_btree_del_cursor should take into account error Dave Chinner
@ 2022-05-24  3:48   ` Darrick J. Wong
  2022-05-24  8:15   ` Christoph Hellwig
  1 sibling, 0 replies; 12+ messages in thread
From: Darrick J. Wong @ 2022-05-24  3:48 UTC (permalink / raw)
  To: Dave Chinner; +Cc: linux-xfs

On Tue, May 24, 2022 at 12:21:58PM +1000, Dave Chinner wrote:
> From: Dave Chinner <dchinner@redhat.com>
> 
> xfs/538 on a 1kB block filesystem failed with this assert:
> 
> XFS: Assertion failed: cur->bc_btnum != XFS_BTNUM_BMAP || cur->bc_ino.allocated == 0 || xfs_is_shutdown(cur->bc_mp), file: fs/xfs/libxfs/xfs_btree.c, line: 448
> 
> The problem was that an allocation failed unexpectedly in
> xfs_bmbt_alloc_block() after roughly 150,000 minlen allocation error
> injections, resulting in an EFSCORRUPTED error being returned to
> xfs_bmapi_write(). The error occurred on extent-to-btree format
> conversion allocating the new root block:
> 
>  RIP: 0010:xfs_bmbt_alloc_block+0x177/0x210
>  Call Trace:
>   <TASK>
>   xfs_btree_new_iroot+0xdf/0x520
>   xfs_btree_make_block_unfull+0x10d/0x1c0
>   xfs_btree_insrec+0x364/0x790
>   xfs_btree_insert+0xaa/0x210
>   xfs_bmap_add_extent_hole_real+0x1fe/0x9a0
>   xfs_bmapi_allocate+0x34c/0x420
>   xfs_bmapi_write+0x53c/0x9c0
>   xfs_alloc_file_space+0xee/0x320
>   xfs_file_fallocate+0x36b/0x450
>   vfs_fallocate+0x148/0x340
>   __x64_sys_fallocate+0x3c/0x70
>   do_syscall_64+0x35/0x80
>   entry_SYSCALL_64_after_hwframe+0x44/0xa
> 
> Why the allocation failed at this point is unknown, but is likely
> that we ran the transaction out of reserved space and filesystem out
> of space with bmbt blocks because of all the minlen allocations
> being done causing worst case fragmentation of a large allocation.
> 
> Regardless of the cause, we've then called xfs_bmapi_finish() which
> calls xfs_btree_del_cursor(cur, error) to tear down the cursor.
> 
> So we have a failed operation, error != 0, cur->bc_ino.allocated > 0
> and the filesystem is still up. The assert fails to take into
> account that allocation can fail with an error and the transaction
> teardown will shut the filesystem down if necessary. i.e. the
> assert needs to check "|| error != 0" as well, because at this point
> shutdown is pending because the current transaction is dirty....
> 
> Signed-off-by: Dave Chinner <dchinner@redhat.com>
> ---
>  fs/xfs/libxfs/xfs_btree.c | 8 +++++++-
>  1 file changed, 7 insertions(+), 1 deletion(-)
> 
> diff --git a/fs/xfs/libxfs/xfs_btree.c b/fs/xfs/libxfs/xfs_btree.c
> index 786ec1cb1bba..32100cfb9dfc 100644
> --- a/fs/xfs/libxfs/xfs_btree.c
> +++ b/fs/xfs/libxfs/xfs_btree.c
> @@ -445,8 +445,14 @@ xfs_btree_del_cursor(
>  			break;
>  	}
>  
> +	/*
> +	 * If we are doing a BMBT update, the number of unaccounted blocks
> +	 * allocated during this cursor life time should be zero. If it's not
> +	 * zero, then we should be shut down or on our way to shutdown due to
> +	 * cancelling a dirty transaction on error.
> +	 */
>  	ASSERT(cur->bc_btnum != XFS_BTNUM_BMAP || cur->bc_ino.allocated == 0 ||
> -	       xfs_is_shutdown(cur->bc_mp));
> +	       xfs_is_shutdown(cur->bc_mp) || error != 0);

Ewww, multiline assertions! 8-D

Reviewed-by: Darrick J. Wong <djwong@kernel.org>

--D

>  	if (unlikely(cur->bc_flags & XFS_BTREE_STAGING))
>  		kmem_free(cur->bc_ops);
>  	if (!(cur->bc_flags & XFS_BTREE_LONG_PTRS) && cur->bc_ag.pag)
> -- 
> 2.35.1
> 


* Re: [PATCH 2/3] xfs: don't assert fail on perag references on teardown
  2022-05-24  3:48   ` Darrick J. Wong
@ 2022-05-24  4:00     ` Dave Chinner
  2022-05-24  4:10       ` Darrick J. Wong
  0 siblings, 1 reply; 12+ messages in thread
From: Dave Chinner @ 2022-05-24  4:00 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: linux-xfs

On Mon, May 23, 2022 at 08:48:06PM -0700, Darrick J. Wong wrote:
> On Tue, May 24, 2022 at 12:21:57PM +1000, Dave Chinner wrote:
> > From: Dave Chinner <dchinner@redhat.com>
> > 
> > Not fatal, the assert is there to catch developer attention. I'm
> > seeing this occasionally during recoveryloop testing after a
> > shutdown, and I don't want this to stop an overnight recoveryloop
> > run as it is currently doing.
> > 
> > Convert the ASSERT to a XFS_IS_CORRUPT() check so it will dump a
> > corruption report into the log and cause a test failure that way,
> > but it won't stop the machine dead.
> > 
> > Signed-off-by: Dave Chinner <dchinner@redhat.com>
> > ---
> >  fs/xfs/libxfs/xfs_ag.c | 3 +--
> >  1 file changed, 1 insertion(+), 2 deletions(-)
> > 
> > diff --git a/fs/xfs/libxfs/xfs_ag.c b/fs/xfs/libxfs/xfs_ag.c
> > index 1e4ee042d52f..3e920cf1b454 100644
> > --- a/fs/xfs/libxfs/xfs_ag.c
> > +++ b/fs/xfs/libxfs/xfs_ag.c
> > @@ -173,7 +173,6 @@ __xfs_free_perag(
> >  	struct xfs_perag *pag = container_of(head, struct xfs_perag, rcu_head);
> >  
> >  	ASSERT(!delayed_work_pending(&pag->pag_blockgc_work));
> > -	ASSERT(atomic_read(&pag->pag_ref) == 0);
> 
> Er, shouldn't this also be converted to XFS_IS_CORRUPT?  That's what the
> commit message said...

That's in the RCU callback context and we never get here when the
ASSERT fires. i.e. the assert in xfs_free_perag fires before we
queue the rcu callback to free this, so checking it here is kinda
redundant.

i.e. it's not where this issue is being caught - it's
being caught by the check below (in xfs_free_perag()) where the
conversion to XFS_IS_CORRUPT is done....

Cheers,

Dave.

> >  	kmem_free(pag);
> >  }
> >  
> > @@ -192,7 +191,7 @@ xfs_free_perag(
> >  		pag = radix_tree_delete(&mp->m_perag_tree, agno);
> >  		spin_unlock(&mp->m_perag_lock);
> >  		ASSERT(pag);
> > -		ASSERT(atomic_read(&pag->pag_ref) == 0);
> > +		XFS_IS_CORRUPT(pag->pag_mount, atomic_read(&pag->pag_ref) != 0);
> >  
> >  		cancel_delayed_work_sync(&pag->pag_blockgc_work);
> >  		xfs_iunlink_destroy(pag);
> > -- 
> > 2.35.1
> > 
> 

-- 
Dave Chinner
david@fromorbit.com


* Re: [PATCH 2/3] xfs: don't assert fail on perag references on teardown
  2022-05-24  4:00     ` Dave Chinner
@ 2022-05-24  4:10       ` Darrick J. Wong
  0 siblings, 0 replies; 12+ messages in thread
From: Darrick J. Wong @ 2022-05-24  4:10 UTC (permalink / raw)
  To: Dave Chinner; +Cc: linux-xfs

On Tue, May 24, 2022 at 02:00:15PM +1000, Dave Chinner wrote:
> On Mon, May 23, 2022 at 08:48:06PM -0700, Darrick J. Wong wrote:
> > On Tue, May 24, 2022 at 12:21:57PM +1000, Dave Chinner wrote:
> > > From: Dave Chinner <dchinner@redhat.com>
> > > 
> > > Not fatal, the assert is there to catch developer attention. I'm
> > > seeing this occasionally during recoveryloop testing after a
> > > shutdown, and I don't want this to stop an overnight recoveryloop
> > > run as it is currently doing.
> > > 
> > > Convert the ASSERT to a XFS_IS_CORRUPT() check so it will dump a
> > > corruption report into the log and cause a test failure that way,
> > > but it won't stop the machine dead.
> > > 
> > > Signed-off-by: Dave Chinner <dchinner@redhat.com>
> > > ---
> > >  fs/xfs/libxfs/xfs_ag.c | 3 +--
> > >  1 file changed, 1 insertion(+), 2 deletions(-)
> > > 
> > > diff --git a/fs/xfs/libxfs/xfs_ag.c b/fs/xfs/libxfs/xfs_ag.c
> > > index 1e4ee042d52f..3e920cf1b454 100644
> > > --- a/fs/xfs/libxfs/xfs_ag.c
> > > +++ b/fs/xfs/libxfs/xfs_ag.c
> > > @@ -173,7 +173,6 @@ __xfs_free_perag(
> > >  	struct xfs_perag *pag = container_of(head, struct xfs_perag, rcu_head);
> > >  
> > >  	ASSERT(!delayed_work_pending(&pag->pag_blockgc_work));
> > > -	ASSERT(atomic_read(&pag->pag_ref) == 0);
> > 
> > Er, shouldn't this also be converted to XFS_IS_CORRUPT?  That's what the
> > commit message said...
> 
> That's in the RCU callback context and we never get here when the
> ASSERT fires. i.e. the assert in xfs_free_perag fires before we
> queue the rcu callback to free this, so checking it here is kinda
> redundant.
> 
> i.e. it's not where this issue is being caught - it's
> being caught by the check below (in xfs_free_perag()) where the
> conversion to XFS_IS_CORRUPT is done....

Ah, right.  Ok then,
Reviewed-by: Darrick J. Wong <djwong@kernel.org>

--D

> Cheers,
> 
> Dave.
> 
> > >  	kmem_free(pag);
> > >  }
> > >  
> > > @@ -192,7 +191,7 @@ xfs_free_perag(
> > >  		pag = radix_tree_delete(&mp->m_perag_tree, agno);
> > >  		spin_unlock(&mp->m_perag_lock);
> > >  		ASSERT(pag);
> > > -		ASSERT(atomic_read(&pag->pag_ref) == 0);
> > > +		XFS_IS_CORRUPT(pag->pag_mount, atomic_read(&pag->pag_ref) != 0);
> > >  
> > >  		cancel_delayed_work_sync(&pag->pag_blockgc_work);
> > >  		xfs_iunlink_destroy(pag);
> > > -- 
> > > 2.35.1
> > > 
> > 
> 
> -- 
> Dave Chinner
> david@fromorbit.com


* Re: [PATCH 1/3] xfs: avoid unnecessary runtime sibling pointer endian conversions
  2022-05-24  2:21 ` [PATCH 1/3] xfs: avoid unnecessary runtime sibling pointer endian conversions Dave Chinner
  2022-05-24  3:46   ` Darrick J. Wong
@ 2022-05-24  8:13   ` Christoph Hellwig
  1 sibling, 0 replies; 12+ messages in thread
From: Christoph Hellwig @ 2022-05-24  8:13 UTC (permalink / raw)
  To: Dave Chinner; +Cc: linux-xfs

Looks good:

Reviewed-by: Christoph Hellwig <hch@lst.de>


* Re: [PATCH 2/3] xfs: don't assert fail on perag references on teardown
  2022-05-24  2:21 ` [PATCH 2/3] xfs: don't assert fail on perag references on teardown Dave Chinner
  2022-05-24  3:48   ` Darrick J. Wong
@ 2022-05-24  8:14   ` Christoph Hellwig
  1 sibling, 0 replies; 12+ messages in thread
From: Christoph Hellwig @ 2022-05-24  8:14 UTC (permalink / raw)
  To: Dave Chinner; +Cc: linux-xfs

Looks good:

Reviewed-by: Christoph Hellwig <hch@lst.de>


* Re: [PATCH 3/3] xfs: assert in xfs_btree_del_cursor should take into account error
  2022-05-24  2:21 ` [PATCH 3/3] xfs: assert in xfs_btree_del_cursor should take into account error Dave Chinner
  2022-05-24  3:48   ` Darrick J. Wong
@ 2022-05-24  8:15   ` Christoph Hellwig
  1 sibling, 0 replies; 12+ messages in thread
From: Christoph Hellwig @ 2022-05-24  8:15 UTC (permalink / raw)
  To: Dave Chinner; +Cc: linux-xfs

Looks good:

Reviewed-by: Christoph Hellwig <hch@lst.de>
