All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCHSET v23.1 0/4] xfs: fix incorrect return values in online fsck
@ 2022-10-02 18:19 Darrick J. Wong
  2022-10-02 18:19 ` [PATCH 3/4] xfs: don't retry repairs harder when EAGAIN is returned Darrick J. Wong
                   ` (3 more replies)
  0 siblings, 4 replies; 14+ messages in thread
From: Darrick J. Wong @ 2022-10-02 18:19 UTC (permalink / raw)
  To: djwong; +Cc: linux-xfs

Hi all,

Here we fix a couple of problems with the errno values that we return to
userspace.

If you're going to start using this mess, you probably ought to just
pull from my git trees, which are linked below.

This is an extraordinary way to destroy everything.  Enjoy!
Comments and questions are, as always, welcome.

--D

kernel git tree:
https://git.kernel.org/cgit/linux/kernel/git/djwong/xfs-linux.git/log/?h=scrub-fix-return-value
---
 fs/xfs/scrub/common.h |    2 +-
 fs/xfs/scrub/quota.c  |    2 +-
 fs/xfs/scrub/repair.c |    9 ++++++---
 3 files changed, 8 insertions(+), 5 deletions(-)


^ permalink raw reply	[flat|nested] 14+ messages in thread

* [PATCH 1/4] xfs: return EINTR when a fatal signal terminates scrub
  2022-10-02 18:19 [PATCHSET v23.1 0/4] xfs: fix incorrect return values in online fsck Darrick J. Wong
  2022-10-02 18:19 ` [PATCH 3/4] xfs: don't retry repairs harder when EAGAIN is returned Darrick J. Wong
  2022-10-02 18:19 ` [PATCH 4/4] xfs: don't return -EFSCORRUPTED from repair when resources cannot be grabbed Darrick J. Wong
@ 2022-10-02 18:19 ` Darrick J. Wong
  2022-10-13 22:43   ` Dave Chinner
  2022-10-02 18:19 ` [PATCH 2/4] xfs: fix return code when fatal signal encountered during dquot scrub Darrick J. Wong
  3 siblings, 1 reply; 14+ messages in thread
From: Darrick J. Wong @ 2022-10-02 18:19 UTC (permalink / raw)
  To: djwong; +Cc: linux-xfs

From: Darrick J. Wong <djwong@kernel.org>

If the program calling online fsck is terminated with a fatal signal,
bail out to userspace by returning EINTR, not EAGAIN.  EAGAIN is used by
scrubbers to indicate that we should try again with more resources
locked, and not to indicate that the operation was cancelled.  The
miswiring is mostly harmless, but it shows up in the trace data.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
---
 fs/xfs/scrub/common.h |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)


diff --git a/fs/xfs/scrub/common.h b/fs/xfs/scrub/common.h
index 454145db10e7..b73648d81d23 100644
--- a/fs/xfs/scrub/common.h
+++ b/fs/xfs/scrub/common.h
@@ -25,7 +25,7 @@ xchk_should_terminate(
 
 	if (fatal_signal_pending(current)) {
 		if (*error == 0)
-			*error = -EAGAIN;
+			*error = -EINTR;
 		return true;
 	}
 	return false;


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH 2/4] xfs: fix return code when fatal signal encountered during dquot scrub
  2022-10-02 18:19 [PATCHSET v23.1 0/4] xfs: fix incorrect return values in online fsck Darrick J. Wong
                   ` (2 preceding siblings ...)
  2022-10-02 18:19 ` [PATCH 1/4] xfs: return EINTR when a fatal signal terminates scrub Darrick J. Wong
@ 2022-10-02 18:19 ` Darrick J. Wong
  2022-10-13 22:43   ` Dave Chinner
  3 siblings, 1 reply; 14+ messages in thread
From: Darrick J. Wong @ 2022-10-02 18:19 UTC (permalink / raw)
  To: djwong; +Cc: linux-xfs

From: Darrick J. Wong <djwong@kernel.org>

If the scrub process is sent a fatal signal while we're checking dquots,
the predicate for this will set the error code to -EINTR.  Don't then
squash that into -ECANCELED, because the wrong errno turns up in the
trace output.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
---
 fs/xfs/scrub/quota.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)


diff --git a/fs/xfs/scrub/quota.c b/fs/xfs/scrub/quota.c
index 21b4c9006859..0b643ff32b22 100644
--- a/fs/xfs/scrub/quota.c
+++ b/fs/xfs/scrub/quota.c
@@ -84,7 +84,7 @@ xchk_quota_item(
 	int			error = 0;
 
 	if (xchk_should_terminate(sc, &error))
-		return -ECANCELED;
+		return error;
 
 	/*
 	 * Except for the root dquot, the actual dquot we got must either have


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH 3/4] xfs: don't retry repairs harder when EAGAIN is returned
  2022-10-02 18:19 [PATCHSET v23.1 0/4] xfs: fix incorrect return values in online fsck Darrick J. Wong
@ 2022-10-02 18:19 ` Darrick J. Wong
  2022-10-13 22:46   ` Dave Chinner
  2022-10-02 18:19 ` [PATCH 4/4] xfs: don't return -EFSCORRUPTED from repair when resources cannot be grabbed Darrick J. Wong
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 14+ messages in thread
From: Darrick J. Wong @ 2022-10-02 18:19 UTC (permalink / raw)
  To: djwong; +Cc: linux-xfs

From: Darrick J. Wong <djwong@kernel.org>

Repair functions will not return EAGAIN -- if they were not able to
obtain resources, they should return EDEADLOCK (like the rest of online
fsck) to signal that we need to grab all the resources and try again.
Hence we don't need to deal with this case except as a debugging
assertion.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
---
 fs/xfs/scrub/repair.c |    5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)


diff --git a/fs/xfs/scrub/repair.c b/fs/xfs/scrub/repair.c
index 92c661b98892..f6c4cb013346 100644
--- a/fs/xfs/scrub/repair.c
+++ b/fs/xfs/scrub/repair.c
@@ -61,7 +61,6 @@ xrep_attempt(
 		sc->flags |= XREP_ALREADY_FIXED;
 		return -EAGAIN;
 	case -EDEADLOCK:
-	case -EAGAIN:
 		/* Tell the caller to try again having grabbed all the locks. */
 		if (!(sc->flags & XCHK_TRY_HARDER)) {
 			sc->flags |= XCHK_TRY_HARDER;
@@ -73,6 +72,10 @@ xrep_attempt(
 		 * so report back to userspace.
 		 */
 		return -EFSCORRUPTED;
+	case -EAGAIN:
+		/* Repair functions should return EDEADLOCK, not EAGAIN. */
+		ASSERT(0);
+		fallthrough;
 	default:
 		return error;
 	}


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH 4/4] xfs: don't return -EFSCORRUPTED from repair when resources cannot be grabbed
  2022-10-02 18:19 [PATCHSET v23.1 0/4] xfs: fix incorrect return values in online fsck Darrick J. Wong
  2022-10-02 18:19 ` [PATCH 3/4] xfs: don't retry repairs harder when EAGAIN is returned Darrick J. Wong
@ 2022-10-02 18:19 ` Darrick J. Wong
  2022-10-13 22:49   ` Dave Chinner
                     ` (2 more replies)
  2022-10-02 18:19 ` [PATCH 1/4] xfs: return EINTR when a fatal signal terminates scrub Darrick J. Wong
  2022-10-02 18:19 ` [PATCH 2/4] xfs: fix return code when fatal signal encountered during dquot scrub Darrick J. Wong
  3 siblings, 3 replies; 14+ messages in thread
From: Darrick J. Wong @ 2022-10-02 18:19 UTC (permalink / raw)
  To: djwong; +Cc: linux-xfs

From: Darrick J. Wong <djwong@kernel.org>

If we tried to repair something but the repair failed with -EDEADLOCK or
-EAGAIN, that means that the repair function couldn't grab some resource
it needed and wants us to try again.  If we try again (with TRY_HARDER)
but still can't do it, exit back to userspace, since xfs_scrub_metadata
requires xrep_attempt to return -EAGAIN.

This makes the return value diagnostics look less weird, and fixes a
wart that remains from very early in the repair implementation.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
---
 fs/xfs/scrub/repair.c |    4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)


diff --git a/fs/xfs/scrub/repair.c b/fs/xfs/scrub/repair.c
index f6c4cb013346..34fc0dc5f200 100644
--- a/fs/xfs/scrub/repair.c
+++ b/fs/xfs/scrub/repair.c
@@ -69,9 +69,9 @@ xrep_attempt(
 		/*
 		 * We tried harder but still couldn't grab all the resources
 		 * we needed to fix it.  The corruption has not been fixed,
-		 * so report back to userspace.
+		 * so exit to userspace.
 		 */
-		return -EFSCORRUPTED;
+		return 0;
 	case -EAGAIN:
 		/* Repair functions should return EDEADLOCK, not EAGAIN. */
 		ASSERT(0);


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* Re: [PATCH 1/4] xfs: return EINTR when a fatal signal terminates scrub
  2022-10-02 18:19 ` [PATCH 1/4] xfs: return EINTR when a fatal signal terminates scrub Darrick J. Wong
@ 2022-10-13 22:43   ` Dave Chinner
  0 siblings, 0 replies; 14+ messages in thread
From: Dave Chinner @ 2022-10-13 22:43 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: linux-xfs

On Sun, Oct 02, 2022 at 11:19:55AM -0700, Darrick J. Wong wrote:
> From: Darrick J. Wong <djwong@kernel.org>
> 
> If the program calling online fsck is terminated with a fatal signal,
> bail out to userspace by returning EINTR, not EAGAIN.  EAGAIN is used by
> scrubbers to indicate that we should try again with more resources
> locked, and not to indicate that the operation was cancelled.  The
> miswiring is mostly harmless, but it shows up in the trace data.
> 
> Signed-off-by: Darrick J. Wong <djwong@kernel.org>
> ---
>  fs/xfs/scrub/common.h |    2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> 
> diff --git a/fs/xfs/scrub/common.h b/fs/xfs/scrub/common.h
> index 454145db10e7..b73648d81d23 100644
> --- a/fs/xfs/scrub/common.h
> +++ b/fs/xfs/scrub/common.h
> @@ -25,7 +25,7 @@ xchk_should_terminate(
>  
>  	if (fatal_signal_pending(current)) {
>  		if (*error == 0)
> -			*error = -EAGAIN;
> +			*error = -EINTR;
>  		return true;
>  	}
>  	return false;

Makes sense.

Reviewed-by: Dave Chinner <dchinner@redhat.com>

-- 
Dave Chinner
david@fromorbit.com

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 2/4] xfs: fix return code when fatal signal encountered during dquot scrub
  2022-10-02 18:19 ` [PATCH 2/4] xfs: fix return code when fatal signal encountered during dquot scrub Darrick J. Wong
@ 2022-10-13 22:43   ` Dave Chinner
  0 siblings, 0 replies; 14+ messages in thread
From: Dave Chinner @ 2022-10-13 22:43 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: linux-xfs

On Sun, Oct 02, 2022 at 11:19:55AM -0700, Darrick J. Wong wrote:
> From: Darrick J. Wong <djwong@kernel.org>
> 
> If the scrub process is sent a fatal signal while we're checking dquots,
> the predicate for this will set the error code to -EINTR.  Don't then
> squash that into -ECANCELED, because the wrong errno turns up in the
> trace output.
> 
> Signed-off-by: Darrick J. Wong <djwong@kernel.org>
> ---
>  fs/xfs/scrub/quota.c |    2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> 
> diff --git a/fs/xfs/scrub/quota.c b/fs/xfs/scrub/quota.c
> index 21b4c9006859..0b643ff32b22 100644
> --- a/fs/xfs/scrub/quota.c
> +++ b/fs/xfs/scrub/quota.c
> @@ -84,7 +84,7 @@ xchk_quota_item(
>  	int			error = 0;
>  
>  	if (xchk_should_terminate(sc, &error))
> -		return -ECANCELED;
> +		return error;

*nod*

Reviewed-by: Dave Chinner <dchinner@redhat.com>

-- 
Dave Chinner
david@fromorbit.com

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 3/4] xfs: don't retry repairs harder when EAGAIN is returned
  2022-10-02 18:19 ` [PATCH 3/4] xfs: don't retry repairs harder when EAGAIN is returned Darrick J. Wong
@ 2022-10-13 22:46   ` Dave Chinner
  0 siblings, 0 replies; 14+ messages in thread
From: Dave Chinner @ 2022-10-13 22:46 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: linux-xfs

On Sun, Oct 02, 2022 at 11:19:55AM -0700, Darrick J. Wong wrote:
> From: Darrick J. Wong <djwong@kernel.org>
> 
> Repair functions will not return EAGAIN -- if they were not able to
> obtain resources, they should return EDEADLOCK (like the rest of online
> fsck) to signal that we need to grab all the resources and try again.
> Hence we don't need to deal with this case except as a debugging
> assertion.
> 
> Signed-off-by: Darrick J. Wong <djwong@kernel.org>
> ---
>  fs/xfs/scrub/repair.c |    5 ++++-
>  1 file changed, 4 insertions(+), 1 deletion(-)
> 
> 
> diff --git a/fs/xfs/scrub/repair.c b/fs/xfs/scrub/repair.c
> index 92c661b98892..f6c4cb013346 100644
> --- a/fs/xfs/scrub/repair.c
> +++ b/fs/xfs/scrub/repair.c
> @@ -61,7 +61,6 @@ xrep_attempt(
>  		sc->flags |= XREP_ALREADY_FIXED;
>  		return -EAGAIN;
>  	case -EDEADLOCK:
> -	case -EAGAIN:
>  		/* Tell the caller to try again having grabbed all the locks. */
>  		if (!(sc->flags & XCHK_TRY_HARDER)) {
>  			sc->flags |= XCHK_TRY_HARDER;
> @@ -73,6 +72,10 @@ xrep_attempt(
>  		 * so report back to userspace.
>  		 */
>  		return -EFSCORRUPTED;
> +	case -EAGAIN:
> +		/* Repair functions should return EDEADLOCK, not EAGAIN. */
> +		ASSERT(0);
> +		fallthrough;
>  	default:
>  		return error;
>  	}

I'd do this rather than ASSERT(0) and fall through:

	default:
		ASSERT(error != -EAGAIN);
		return error;
	}

but that's just personal preference. Change it if you want, either
way works so:

Reviewed-by: Dave Chinner <dchinner@redhat.com>

-- 
Dave Chinner
david@fromorbit.com

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 4/4] xfs: don't return -EFSCORRUPTED from repair when resources cannot be grabbed
  2022-10-02 18:19 ` [PATCH 4/4] xfs: don't return -EFSCORRUPTED from repair when resources cannot be grabbed Darrick J. Wong
@ 2022-10-13 22:49   ` Dave Chinner
  2022-10-14 21:44     ` Darrick J. Wong
  2022-11-04 20:35   ` [PATCH v23.2 " Darrick J. Wong
  2022-11-08  1:28   ` [PATCH v23.3 " Darrick J. Wong
  2 siblings, 1 reply; 14+ messages in thread
From: Dave Chinner @ 2022-10-13 22:49 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: linux-xfs

On Sun, Oct 02, 2022 at 11:19:55AM -0700, Darrick J. Wong wrote:
> From: Darrick J. Wong <djwong@kernel.org>
> 
> If we tried to repair something but the repair failed with -EDEADLOCK or
> -EAGAIN, that means that the repair function couldn't grab some resource

Nothing should fail with EAGAIN by this point?

> it needed and wants us to try again.  If we try again (with TRY_HARDER)
> but still can't do it, exit back to userspace, since xfs_scrub_metadata
> requires xrep_attempt to return -EAGAIN.

-EDEADLOCK, not -EAGAIN?

Confused.

-Dave.
-- 
Dave Chinner
david@fromorbit.com

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 4/4] xfs: don't return -EFSCORRUPTED from repair when resources cannot be grabbed
  2022-10-13 22:49   ` Dave Chinner
@ 2022-10-14 21:44     ` Darrick J. Wong
  0 siblings, 0 replies; 14+ messages in thread
From: Darrick J. Wong @ 2022-10-14 21:44 UTC (permalink / raw)
  To: Dave Chinner; +Cc: linux-xfs

On Fri, Oct 14, 2022 at 09:49:11AM +1100, Dave Chinner wrote:
> On Sun, Oct 02, 2022 at 11:19:55AM -0700, Darrick J. Wong wrote:
> > From: Darrick J. Wong <djwong@kernel.org>
> > 
> > If we tried to repair something but the repair failed with -EDEADLOCK or
> > -EAGAIN, that means that the repair function couldn't grab some resource
> 
> Nothing should fail with EAGAIN by this point?

Right.

> > it needed and wants us to try again.  If we try again (with TRY_HARDER)
> > but still can't do it, exit back to userspace, since xfs_scrub_metadata
> > requires xrep_attempt to return -EAGAIN.
> 
> -EDEADLOCK, not -EAGAIN?

That part of the message confused me too.

How about this revision?

"If we tried to repair something but the repair failed with -EDEADLOCK,
that means that the repair function couldn't grab some resource it
needed and wants us to try again.  If we try again (with TRY_HARDER) but
still can't get all the resources we need, the repair fails and errors
remain on the filesystem.

"Right now, repair returns the -EDEADLOCK to the caller, which passes it
up to userspace without copying the xfs_scrub_metadata structure back to
userspace.  This is very confusing for userspace since xfs_scrub merely
reports "Resource deadlock would occur" and gives no indication that
there are uncorrected errors on the filesystem.  Hence we want to return
0 here so that the ioctl code copies the CORRUPTION flag back to
userspace."

Clearer, I hope?

--D

> 
> Confused.
> 
> -Dave.
> -- 
> Dave Chinner
> david@fromorbit.com

^ permalink raw reply	[flat|nested] 14+ messages in thread

* [PATCH v23.2 4/4] xfs: don't return -EFSCORRUPTED from repair when resources cannot be grabbed
  2022-10-02 18:19 ` [PATCH 4/4] xfs: don't return -EFSCORRUPTED from repair when resources cannot be grabbed Darrick J. Wong
  2022-10-13 22:49   ` Dave Chinner
@ 2022-11-04 20:35   ` Darrick J. Wong
  2022-11-08  1:27     ` Darrick J. Wong
  2022-11-08  1:28   ` [PATCH v23.3 " Darrick J. Wong
  2 siblings, 1 reply; 14+ messages in thread
From: Darrick J. Wong @ 2022-11-04 20:35 UTC (permalink / raw)
  To: linux-xfs, Dave Chinner

From: Darrick J. Wong <djwong@kernel.org>

If we tried to repair something but the repair failed with -EDEADLOCK,
that means that the repair function couldn't grab some resource it
needed and wants us to try again.  If we try again (with TRY_HARDER) but
still can't get all the resources we need, the repair fails and errors
remain on the filesystem.

Right now, repair returns the -EDEADLOCK to the caller as -EFSCORRUPTED,
which results in XFS_SCRUB_OFLAG_CORRUPT being passed out to userspace.
This is not correct because repair has not determined that anything is
corrupt.  If the repair had been invoked on an object that could be
optimized but wasn't corrupt (OFLAG_PREEN), the inability to grab
resources will be reported to userspace as corrupt metadata, and users
will be unnecessarily alarmed that their suboptimal metadata turned into
a corruption.

Fix this by returning zero so that the results of the actual scrub will
be copied back out to userspace.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
---
v23.2: fix the commit message to discuss what's really going on in this
       patch.
---
 fs/xfs/scrub/repair.c |    4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/fs/xfs/scrub/repair.c b/fs/xfs/scrub/repair.c
index 7323bd9fddfb..86f770af6737 100644
--- a/fs/xfs/scrub/repair.c
+++ b/fs/xfs/scrub/repair.c
@@ -69,9 +69,9 @@ xrep_attempt(
 		/*
 		 * We tried harder but still couldn't grab all the resources
 		 * we needed to fix it.  The corruption has not been fixed,
-		 * so report back to userspace.
+		 * so exit to userspace with the corruption flags still set.
 		 */
-		return -EFSCORRUPTED;
+		return 0;
 	default:
 		/*
 		 * EAGAIN tells the caller to re-scrub, so we cannot return

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* Re: [PATCH v23.2 4/4] xfs: don't return -EFSCORRUPTED from repair when resources cannot be grabbed
  2022-11-04 20:35   ` [PATCH v23.2 " Darrick J. Wong
@ 2022-11-08  1:27     ` Darrick J. Wong
  0 siblings, 0 replies; 14+ messages in thread
From: Darrick J. Wong @ 2022-11-08  1:27 UTC (permalink / raw)
  To: linux-xfs, Dave Chinner

On Fri, Nov 04, 2022 at 01:35:44PM -0700, Darrick J. Wong wrote:
> From: Darrick J. Wong <djwong@kernel.org>
> 
> If we tried to repair something but the repair failed with -EDEADLOCK,
> that means that the repair function couldn't grab some resource it
> needed and wants us to try again.  If we try again (with TRY_HARDER) but
> still can't get all the resources we need, the repair fails and errors
> remain on the filesystem.
> 
> Right now, repair returns the -EDEADLOCK to the caller as -EFSCORRUPTED,
> which results in XFS_SCRUB_OFLAG_CORRUPT being passed out to userspace.
> This is not correct because repair has not determined that anything is
> corrupt.  If the repair had been invoked on an object that could be
> optimized but wasn't corrupt (OFLAG_PREEN), the inability to grab
> resources will be reported to userspace as corrupt metadata, and users
> will be unnecessarily alarmed that their suboptimal metadata turned into
> a corruption.
> 
> Fix this by returning zero so that the results of the actual scrub will
> be copied back out to userspace.
> 
> Signed-off-by: Darrick J. Wong <djwong@kernel.org>
> ---
> v23.2: fix the commit message to discuss what's really going on in this
>        patch.
> ---
>  fs/xfs/scrub/repair.c |    4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/fs/xfs/scrub/repair.c b/fs/xfs/scrub/repair.c
> index 7323bd9fddfb..86f770af6737 100644
> --- a/fs/xfs/scrub/repair.c
> +++ b/fs/xfs/scrub/repair.c
> @@ -69,9 +69,9 @@ xrep_attempt(
>  		/*
>  		 * We tried harder but still couldn't grab all the resources
>  		 * we needed to fix it.  The corruption has not been fixed,
> -		 * so report back to userspace.
> +		 * so exit to userspace with the corruption flags still set.

<groan> this wording is vague, i'll try again...

--D

>  		 */
> -		return -EFSCORRUPTED;
> +		return 0;
>  	default:
>  		/*
>  		 * EAGAIN tells the caller to re-scrub, so we cannot return

^ permalink raw reply	[flat|nested] 14+ messages in thread

* [PATCH v23.3 4/4] xfs: don't return -EFSCORRUPTED from repair when resources cannot be grabbed
  2022-10-02 18:19 ` [PATCH 4/4] xfs: don't return -EFSCORRUPTED from repair when resources cannot be grabbed Darrick J. Wong
  2022-10-13 22:49   ` Dave Chinner
  2022-11-04 20:35   ` [PATCH v23.2 " Darrick J. Wong
@ 2022-11-08  1:28   ` Darrick J. Wong
  2022-11-15  2:54     ` Dave Chinner
  2 siblings, 1 reply; 14+ messages in thread
From: Darrick J. Wong @ 2022-11-08  1:28 UTC (permalink / raw)
  To: linux-xfs, Dave Chinner

From: Darrick J. Wong <djwong@kernel.org>

If we tried to repair something but the repair failed with -EDEADLOCK,
that means that the repair function couldn't grab some resource it
needed and wants us to try again.  If we try again (with TRY_HARDER) but
still can't get all the resources we need, the repair fails and errors
remain on the filesystem.

Right now, repair returns the -EDEADLOCK to the caller as -EFSCORRUPTED,
which results in XFS_SCRUB_OFLAG_CORRUPT being passed out to userspace.
This is not correct because repair has not determined that anything is
corrupt.  If the repair had been invoked on an object that could be
optimized but wasn't corrupt (OFLAG_PREEN), the inability to grab
resources will be reported to userspace as corrupt metadata, and users
will be unnecessarily alarmed that their suboptimal metadata turned into
a corruption.

Fix this by returning zero so that the results of the actual scrub will
be copied back out to userspace.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
---
v23.3: fix vague wording of comment
v23.2: fix the commit message to discuss what's really going on in this
patch.
---
 fs/xfs/scrub/repair.c |    4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/fs/xfs/scrub/repair.c b/fs/xfs/scrub/repair.c
index 7323bd9fddfb..4b92f9253ccd 100644
--- a/fs/xfs/scrub/repair.c
+++ b/fs/xfs/scrub/repair.c
@@ -69,9 +69,9 @@ xrep_attempt(
 		/*
 		 * We tried harder but still couldn't grab all the resources
 		 * we needed to fix it.  The corruption has not been fixed,
-		 * so report back to userspace.
+		 * so exit to userspace with the scan's output flags unchanged.
 		 */
-		return -EFSCORRUPTED;
+		return 0;
 	default:
 		/*
 		 * EAGAIN tells the caller to re-scrub, so we cannot return

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* Re: [PATCH v23.3 4/4] xfs: don't return -EFSCORRUPTED from repair when resources cannot be grabbed
  2022-11-08  1:28   ` [PATCH v23.3 " Darrick J. Wong
@ 2022-11-15  2:54     ` Dave Chinner
  0 siblings, 0 replies; 14+ messages in thread
From: Dave Chinner @ 2022-11-15  2:54 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: linux-xfs

On Mon, Nov 07, 2022 at 05:28:41PM -0800, Darrick J. Wong wrote:
> From: Darrick J. Wong <djwong@kernel.org>
> 
> If we tried to repair something but the repair failed with -EDEADLOCK,
> that means that the repair function couldn't grab some resource it
> needed and wants us to try again.  If we try again (with TRY_HARDER) but
> still can't get all the resources we need, the repair fails and errors
> remain on the filesystem.
> 
> Right now, repair returns the -EDEADLOCK to the caller as -EFSCORRUPTED,
> which results in XFS_SCRUB_OFLAG_CORRUPT being passed out to userspace.
> This is not correct because repair has not determined that anything is
> corrupt.  If the repair had been invoked on an object that could be
> optimized but wasn't corrupt (OFLAG_PREEN), the inability to grab
> resources will be reported to userspace as corrupt metadata, and users
> will be unnecessarily alarmed that their suboptimal metadata turned into
> a corruption.
> 
> Fix this by returning zero so that the results of the actual scrub will
> be copied back out to userspace.
> 
> Signed-off-by: Darrick J. Wong <djwong@kernel.org>
> ---
> v23.3: fix vague wording of comment
> v23.2: fix the commit message to discuss what's really going on in this
> patch.
> ---
>  fs/xfs/scrub/repair.c |    4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)

Looks good.

Reviewed-by: Dave Chinner <dchinner@redhat.com>

-- 
Dave Chinner
david@fromorbit.com

^ permalink raw reply	[flat|nested] 14+ messages in thread

end of thread, other threads:[~2022-11-15  2:54 UTC | newest]

Thread overview: 14+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-10-02 18:19 [PATCHSET v23.1 0/4] xfs: fix incorrect return values in online fsck Darrick J. Wong
2022-10-02 18:19 ` [PATCH 3/4] xfs: don't retry repairs harder when EAGAIN is returned Darrick J. Wong
2022-10-13 22:46   ` Dave Chinner
2022-10-02 18:19 ` [PATCH 4/4] xfs: don't return -EFSCORRUPTED from repair when resources cannot be grabbed Darrick J. Wong
2022-10-13 22:49   ` Dave Chinner
2022-10-14 21:44     ` Darrick J. Wong
2022-11-04 20:35   ` [PATCH v23.2 " Darrick J. Wong
2022-11-08  1:27     ` Darrick J. Wong
2022-11-08  1:28   ` [PATCH v23.3 " Darrick J. Wong
2022-11-15  2:54     ` Dave Chinner
2022-10-02 18:19 ` [PATCH 1/4] xfs: return EINTR when a fatal signal terminates scrub Darrick J. Wong
2022-10-13 22:43   ` Dave Chinner
2022-10-02 18:19 ` [PATCH 2/4] xfs: fix return code when fatal signal encountered during dquot scrub Darrick J. Wong
2022-10-13 22:43   ` Dave Chinner

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.