* [PATCH] xfs: limit superblock corruption errors to probable corruption
@ 2014-01-29  5:11 Eric Sandeen
  2014-01-30 20:26 ` Brian Foster
  0 siblings, 1 reply; 6+ messages in thread
From: Eric Sandeen @ 2014-01-29  5:11 UTC (permalink / raw)
  To: xfs-oss

Today, if

xfs_sb_read_verify
  xfs_sb_verify
    xfs_mount_validate_sb

detects superblock corruption, it'll be extremely noisy, dumping
2 stacks, 2 hexdumps, etc.

This is because we call XFS_CORRUPTION_ERROR in xfs_mount_validate_sb
as well as in xfs_sb_read_verify.

Also, *any* error in xfs_mount_validate_sb which is not corruption
per se (things like too-big-blocksize, bad version, bad magic, v1 dirs,
rw-incompat, etc., which do not return EFSCORRUPTED) will still do the
whole XFS_CORRUPTION_ERROR spew when xfs_sb_read_verify sees nearly
any error (everything except EWRONGFS).  And it suggests to the user
that they should run xfs_repair, even if the root cause of the mount
failure is a simple incompatibility.

I'll submit that the probably-not-corrupted errors don't warrant
this much noise, so this patch removes the high-level
XFS_CORRUPTION_ERROR which was firing for every error return
except EWRONGFS.

It also adds one to the path which detects a failed checksum.

The idea is, if it's really _corruption_ we can call
XFS_CORRUPTION_ERROR at the point of detection.  More benign
incompatibilities can do a little printk & fail the mount without
so much drama.

Signed-off-by: Eric Sandeen <sandeen@redhat.com>
---

I could see an argument where we might still want the hexdump
for things like bad magic - ok, just what *was* the magic?  But
I think we do need to reserve the oops-mimicking backtraces for
the most severe problems.  Discuss.  ;)
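
To make the intended split concrete, here is a minimal userspace sketch
(not kernel code) of the policy the patch is aiming for.  Everything in
it is a hypothetical stand-in: the sb_error values, struct fake_sb,
report_corruption() and warn_incompat() only model EFSCORRUPTED-style
returns, XFS_CORRUPTION_ERROR and a plain printk.  It also simplifies
the checksum path, which in the real patch only fails bad secondaries
on a known V5 filesystem.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical stand-ins for kernel error codes; the values are arbitrary. */
enum sb_error { SB_OK = 0, SB_EWRONGFS, SB_EUNSUPPORTED, SB_EFSCORRUPTED };

#define FAKE_SB_MAGIC		0x58465342u	/* illustrative magic value */
#define FAKE_MAX_VERSION	5u		/* illustrative newest version */

struct fake_sb {
	unsigned int	magic;
	unsigned int	version;
	int		crc_ok;		/* 1 if the stored checksum matched */
};

int loud_reports;	/* heavyweight "corruption" reports emitted */
int quiet_warnings;	/* one-line incompatibility messages emitted */

/* Models the noisy report (stack trace, hexdump, repair advice). */
void report_corruption(const char *func)
{
	loud_reports++;
	fprintf(stderr, "%s: corruption detected\n", func);
}

/* Models a short printk for a benign incompatibility. */
void warn_incompat(const char *msg)
{
	quiet_warnings++;
	fprintf(stderr, "%s\n", msg);
}

/* Incompatibilities get a short message and a non-corruption error. */
enum sb_error validate_sb(const struct fake_sb *sb)
{
	assert(sb != NULL);
	if (sb->magic != FAKE_SB_MAGIC) {
		warn_incompat("bad magic number");
		return SB_EWRONGFS;
	}
	if (sb->version > FAKE_MAX_VERSION) {
		warn_incompat("unsupported superblock version");
		return SB_EUNSUPPORTED;
	}
	return SB_OK;
}

/* The checksum failure is reported loudly right where it is detected. */
enum sb_error read_verify(const struct fake_sb *sb)
{
	assert(sb != NULL);
	if (!sb->crc_ok) {
		report_corruption(__func__);
		return SB_EFSCORRUPTED;
	}
	return validate_sb(sb);
}
```

With this shape, a too-new version fails the "mount" with one short
line, and only the checksum failure produces the loud report.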

diff --git a/fs/xfs/xfs_sb.c b/fs/xfs/xfs_sb.c
index 511cce9..b575317 100644
--- a/fs/xfs/xfs_sb.c
+++ b/fs/xfs/xfs_sb.c
@@ -617,6 +617,8 @@ xfs_sb_read_verify(
 			/* Only fail bad secondaries on a known V5 filesystem */
 			if (bp->b_bn != XFS_SB_DADDR &&
 			    xfs_sb_version_hascrc(&mp->m_sb)) {
+				XFS_CORRUPTION_ERROR(__func__, XFS_ERRLEVEL_LOW,
+						     mp, bp->b_addr);
 				error = EFSCORRUPTED;
 				goto out_error;
 			}
@@ -625,12 +627,8 @@ xfs_sb_read_verify(
 	error = xfs_sb_verify(bp, true);
 
 out_error:
-	if (error) {
-		if (error != EWRONGFS)
-			XFS_CORRUPTION_ERROR(__func__, XFS_ERRLEVEL_LOW,
-					     mp, bp->b_addr);
+	if (error)
 		xfs_buf_ioerror(bp, error);
-	}
 }
 
 /*

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs


* Re: [PATCH] xfs: limit superblock corruption errors to probable corruption
  2014-01-29  5:11 [PATCH] xfs: limit superblock corruption errors to probable corruption Eric Sandeen
@ 2014-01-30 20:26 ` Brian Foster
  2014-01-30 20:30   ` Eric Sandeen
  0 siblings, 1 reply; 6+ messages in thread
From: Brian Foster @ 2014-01-30 20:26 UTC (permalink / raw)
  To: Eric Sandeen, xfs-oss

On 01/29/2014 12:11 AM, Eric Sandeen wrote:
> Today, if
> 
> xfs_sb_read_verify
>   xfs_sb_verify
>     xfs_mount_validate_sb
> 
> detects superblock corruption, it'll be extremely noisy, dumping
> 2 stacks, 2 hexdumps, etc.
> 
> This is because we call XFS_CORRUPTION_ERROR in xfs_mount_validate_sb
> as well as in xfs_sb_read_verify.
> 
> Also, *any* errors in xfs_mount_validate_sb which are not corruption
> per se; things like too-big-blocksize, bad version, bad magic, v1 dirs,
> rw-incompat etc - things which do not return EFSCORRUPTED - will
> still do the whole XFS_CORRUPTION_ERROR spew when xfs_sb_read_verify
> sees any error at all.  And it suggests to the user that they 
> should run xfs_repair, even if the root cause of the mount failure
> is a simple incompatibility.
> 
> I'll submit that the probably-not-corrupted errors don't warrant
> this much noise, so this patch removes the high-level
> XFS_CORRUPTION_ERROR which was firing for every error return
> except EWRONGFS.
> 
> It also adds one to the path which detects a failed checksum.
> 
> The idea is, if it's really _corruption_ we can call
> XFS_CORRUPTION_ERROR at the point of detection.  More benign
> incompatibilities can do a little printk & fail the mount without
> so much drama.
> 
> Signed-off-by: Eric Sandeen <sandeen@redhat.com>
> ---
> 
> I could see an argument where we might still want the hexdump
> for things like bad magic - ok, just what *was* the magic?  But
> I think we do need to reserve the oops-mimicking backtraces for
> the most severe problems.  Discuss.  ;)
> 

This seems pretty reasonable to me, particularly if pretty much any
error via the xfs_sb_verify() path dumps corruption noise...

> diff --git a/fs/xfs/xfs_sb.c b/fs/xfs/xfs_sb.c
> index 511cce9..b575317 100644
> --- a/fs/xfs/xfs_sb.c
> +++ b/fs/xfs/xfs_sb.c
> @@ -617,6 +617,8 @@ xfs_sb_read_verify(
>  			/* Only fail bad secondaries on a known V5 filesystem */
>  			if (bp->b_bn != XFS_SB_DADDR &&
>  			    xfs_sb_version_hascrc(&mp->m_sb)) {
> +				XFS_CORRUPTION_ERROR(__func__, XFS_ERRLEVEL_LOW,
> +						     mp, bp->b_addr);
>  				error = EFSCORRUPTED;
>  				goto out_error;
>  			}
> @@ -625,12 +627,8 @@ xfs_sb_read_verify(
>  	error = xfs_sb_verify(bp, true);
>  
>  out_error:
> -	if (error) {
> -		if (error != EWRONGFS)
> -			XFS_CORRUPTION_ERROR(__func__, XFS_ERRLEVEL_LOW,
> -					     mp, bp->b_addr);
> +	if (error)
>  		xfs_buf_ioerror(bp, error);
> -	}
>  }

... but why not leave the corruption output here in out_error, change
the check to (error == EFSCORRUPTED) and remove the now duplicate
corruption message in xfs_mount_validate_sb() (or replace it with a
warn/notice message)? This would catch the other EFSCORRUPTED returns in
a consistent manner, including another potential duplicate in the write
verifier. I guess we'd lose a little specificity between the crc failure
and sb validation, but we could add a warn/notice for the former too.
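
A rough userspace sketch of that alternative, for illustration only:
the validation helper just classifies the problem, and the single
report lives at the out_error label, gated on the corruption error
code.  All names here (sb_error, validate_sb_quiet,
read_verify_central, report_corruption, the FAKE_* constants) are
hypothetical stand-ins and not the kernel's API, and the blocksize
checks are only an example of "damage" versus "unsupported".

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical stand-ins for kernel error codes; the values are arbitrary. */
enum sb_error { SB_OK = 0, SB_EWRONGFS, SB_EUNSUPPORTED, SB_EFSCORRUPTED };

#define FAKE_SB_MAGIC	0x58465342u	/* illustrative magic value */
#define FAKE_PAGE_SIZE	4096u		/* illustrative page size */

int corruption_reports;	/* heavyweight reports emitted */

/* Models the noisy corruption report. */
void report_corruption(const char *func)
{
	assert(func != NULL);
	corruption_reports++;
	fprintf(stderr, "%s: corruption detected\n", func);
}

/* Classifies the problem only; it never reports corruption itself. */
enum sb_error validate_sb_quiet(unsigned int magic, unsigned int blocksize)
{
	if (magic != FAKE_SB_MAGIC)
		return SB_EWRONGFS;
	/* Zero or not a power of two: treated as damage in this sketch. */
	if (blocksize == 0 || (blocksize & (blocksize - 1)) != 0)
		return SB_EFSCORRUPTED;
	/* Well-formed but larger than we can handle: incompatibility. */
	if (blocksize > FAKE_PAGE_SIZE)
		return SB_EUNSUPPORTED;
	return SB_OK;
}

/* One reporting site for every corruption return, whatever its source. */
enum sb_error read_verify_central(int crc_ok, unsigned int magic,
				  unsigned int blocksize)
{
	enum sb_error error;

	if (!crc_ok) {
		error = SB_EFSCORRUPTED;
		goto out_error;
	}
	error = validate_sb_quiet(magic, blocksize);

out_error:
	if (error == SB_EFSCORRUPTED)
		report_corruption(__func__);
	return error;
}
```

The trade-off shows up directly: the report is emitted exactly once per
failure, but it is attributed to the verifier rather than to the
specific check that failed.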

Brian

>  
>  /*
> 

* Re: [PATCH] xfs: limit superblock corruption errors to probable corruption
  2014-01-30 20:26 ` Brian Foster
@ 2014-01-30 20:30   ` Eric Sandeen
  2014-01-30 20:54     ` Brian Foster
  0 siblings, 1 reply; 6+ messages in thread
From: Eric Sandeen @ 2014-01-30 20:30 UTC (permalink / raw)
  To: Brian Foster, xfs-oss

On 1/30/14, 2:26 PM, Brian Foster wrote:
>> diff --git a/fs/xfs/xfs_sb.c b/fs/xfs/xfs_sb.c
>> index 511cce9..b575317 100644
>> --- a/fs/xfs/xfs_sb.c
>> +++ b/fs/xfs/xfs_sb.c
>> @@ -617,6 +617,8 @@ xfs_sb_read_verify(
>>  			/* Only fail bad secondaries on a known V5 filesystem */
>>  			if (bp->b_bn != XFS_SB_DADDR &&
>>  			    xfs_sb_version_hascrc(&mp->m_sb)) {
>> +				XFS_CORRUPTION_ERROR(__func__, XFS_ERRLEVEL_LOW,
>> +						     mp, bp->b_addr);
>>  				error = EFSCORRUPTED;
>>  				goto out_error;
>>  			}
>> @@ -625,12 +627,8 @@ xfs_sb_read_verify(
>>  	error = xfs_sb_verify(bp, true);
>>  
>>  out_error:
>> -	if (error) {
>> -		if (error != EWRONGFS)
>> -			XFS_CORRUPTION_ERROR(__func__, XFS_ERRLEVEL_LOW,
>> -					     mp, bp->b_addr);
>> +	if (error)
>>  		xfs_buf_ioerror(bp, error);
>> -	}
>>  }
> ... but why not leave the corruption output here in out_error, change
> the check to (error == EFSCORRUPTED) and remove the now duplicate
> corruption message in xfs_mount_validate_sb() (or replace it with a
> warn/notice message)? This would catch the other EFSCORRUPTED returns in
> a consistent manner, including another potential duplicate in the write
> verifier. I guess we'd lose a little specificity between the crc failure
> and sb validation, but we could add a warn/notice for the former too.
> 
> Brian
> 

Well, I went back and forth on this.  It's probably philosophical. ;)

Should we emit the corruption error at the point of corruption detection,
or at a higher level?  I guess my concern was that while *this* caller
might catch the return & yell, if another caller got added it might not.

Putting it at the point of detection seemed foolproof in that regard.
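
As a small illustration of that concern, here is a userspace sketch
(not kernel code) in which the report is emitted inside the validator
itself.  The names sb_error, validate_sb, read_verify, later_caller and
report_corruption are hypothetical stand-ins; later_caller represents
some future call site that only propagates the error and knows nothing
about reporting.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical stand-ins for kernel error codes; the values are arbitrary. */
enum sb_error { SB_OK = 0, SB_EFSCORRUPTED };

int corruption_reports;	/* heavyweight reports emitted */

/* Models the noisy corruption report. */
void report_corruption(const char *func)
{
	assert(func != NULL);
	corruption_reports++;
	fprintf(stderr, "%s: corruption detected\n", func);
}

/* Detection and reporting happen in the same place. */
enum sb_error validate_sb(unsigned int blocksize)
{
	/* Zero or not a power of two: treated as damage in this sketch. */
	if (blocksize == 0 || (blocksize & (blocksize - 1)) != 0) {
		report_corruption(__func__);
		return SB_EFSCORRUPTED;
	}
	return SB_OK;
}

/* Existing caller: just propagates the result. */
enum sb_error read_verify(unsigned int blocksize)
{
	return validate_sb(blocksize);
}

/* Hypothetical caller added later; it also only propagates the result. */
enum sb_error later_caller(unsigned int blocksize)
{
	return validate_sb(blocksize);
}
```

Neither caller has to remember to report anything; the cost is that
every detection site in the validator must carry its own report.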

-Eric


* Re: [PATCH] xfs: limit superblock corruption errors to probable corruption
  2014-01-30 20:30   ` Eric Sandeen
@ 2014-01-30 20:54     ` Brian Foster
  2014-02-06  6:43       ` Dave Chinner
  0 siblings, 1 reply; 6+ messages in thread
From: Brian Foster @ 2014-01-30 20:54 UTC (permalink / raw)
  To: Eric Sandeen, xfs-oss

On 01/30/2014 03:30 PM, Eric Sandeen wrote:
> On 1/30/14, 2:26 PM, Brian Foster wrote:
>>> diff --git a/fs/xfs/xfs_sb.c b/fs/xfs/xfs_sb.c
>>> index 511cce9..b575317 100644
>>> --- a/fs/xfs/xfs_sb.c
>>> +++ b/fs/xfs/xfs_sb.c
>>> @@ -617,6 +617,8 @@ xfs_sb_read_verify(
>>>  			/* Only fail bad secondaries on a known V5 filesystem */
>>>  			if (bp->b_bn != XFS_SB_DADDR &&
>>>  			    xfs_sb_version_hascrc(&mp->m_sb)) {
>>> +				XFS_CORRUPTION_ERROR(__func__, XFS_ERRLEVEL_LOW,
>>> +						     mp, bp->b_addr);
>>>  				error = EFSCORRUPTED;
>>>  				goto out_error;
>>>  			}
>>> @@ -625,12 +627,8 @@ xfs_sb_read_verify(
>>>  	error = xfs_sb_verify(bp, true);
>>>  
>>>  out_error:
>>> -	if (error) {
>>> -		if (error != EWRONGFS)
>>> -			XFS_CORRUPTION_ERROR(__func__, XFS_ERRLEVEL_LOW,
>>> -					     mp, bp->b_addr);
>>> +	if (error)
>>>  		xfs_buf_ioerror(bp, error);
>>> -	}
>>>  }
>> ... but why not leave the corruption output here in out_error, change
>> the check to (error == EFSCORRUPTED) and remove the now duplicate
>> corruption message in xfs_mount_validate_sb() (or replace it with a
>> warn/notice message)? This would catch the other EFSCORRUPTED returns in
>> a consistent manner, including another potential duplicate in the write
>> verifier. I guess we'd lose a little specificity between the crc failure
>> and sb validation, but we could add a warn/notice for the former too.
>>
>> Brian
>>
> 
> Well, I went back and forth on this.  It's probably philosophical. ;)
> 
> Should we emit the corruption error at the point of corruption detection,
> or at a higher level?  I guess my concern was that while *this* caller
> might catch the return & yell, if another caller got added it might not.
> 
> Putting it at the point of detection seemed foolproof in that regard.
> 

Yeah, that makes sense too. If we were consistent, that model would
suggest the write verifier corruption message could go and we'd embed
corruption errors along with the other associated EFSCORRUPTED returns
(at least where the resulting message is appropriate) in
xfs_mount_validate_sb().

Either way seems reasonable to me. I guess if all the remaining
situations are in fact real corruption situations, the point of
detection approach is probably more resilient. It would still be nice to
make the verifiers consistent in that though. ;)

Brian

> -Eric
> 


* Re: [PATCH] xfs: limit superblock corruption errors to probable corruption
  2014-01-30 20:54     ` Brian Foster
@ 2014-02-06  6:43       ` Dave Chinner
  2014-02-07  4:23         ` Eric Sandeen
  0 siblings, 1 reply; 6+ messages in thread
From: Dave Chinner @ 2014-02-06  6:43 UTC (permalink / raw)
  To: Brian Foster; +Cc: Eric Sandeen, xfs-oss

On Thu, Jan 30, 2014 at 03:54:16PM -0500, Brian Foster wrote:
> On 01/30/2014 03:30 PM, Eric Sandeen wrote:
> > On 1/30/14, 2:26 PM, Brian Foster wrote:
> >>> diff --git a/fs/xfs/xfs_sb.c b/fs/xfs/xfs_sb.c
> >>> index 511cce9..b575317 100644
> >>> --- a/fs/xfs/xfs_sb.c
> >>> +++ b/fs/xfs/xfs_sb.c
> >>> @@ -617,6 +617,8 @@ xfs_sb_read_verify(
> >>>  			/* Only fail bad secondaries on a known V5 filesystem */
> >>>  			if (bp->b_bn != XFS_SB_DADDR &&
> >>>  			    xfs_sb_version_hascrc(&mp->m_sb)) {
> >>> +				XFS_CORRUPTION_ERROR(__func__, XFS_ERRLEVEL_LOW,
> >>> +						     mp, bp->b_addr);
> >>>  				error = EFSCORRUPTED;
> >>>  				goto out_error;
> >>>  			}
> >>> @@ -625,12 +627,8 @@ xfs_sb_read_verify(
> >>>  	error = xfs_sb_verify(bp, true);
> >>>  
> >>>  out_error:
> >>> -	if (error) {
> >>> -		if (error != EWRONGFS)
> >>> -			XFS_CORRUPTION_ERROR(__func__, XFS_ERRLEVEL_LOW,
> >>> -					     mp, bp->b_addr);
> >>> +	if (error)
> >>>  		xfs_buf_ioerror(bp, error);
> >>> -	}
> >>>  }
> >> ... but why not leave the corruption output here in out_error, change
> >> the check to (error == EFSCORRUPTED) and remove the now duplicate
> >> corruption message in xfs_mount_validate_sb() (or replace it with a
> >> warn/notice message)? This would catch the other EFSCORRUPTED returns in
> >> a consistent manner, including another potential duplicate in the write
> >> verifier. I guess we'd lose a little specificity between the crc failure
> >> and sb validation, but we could add a warn/notice for the former too.
> >>
> >> Brian
> >>
> > 
> > Well, I went back and forth on this.  It's probably philosophical. ;)
> > 
> > Should we emit the corruption error at the point of corruption detection,
> > or at a higher level?  I guess my concern was that while *this* caller
> > might catch the return & yell, if another caller got added it might not.
> > 
> > Putting it at the point of detection seemed foolproof in that regard.
> > 
> 
> Yeah, that makes sense too. If we were consistent, that model would
> suggest the write verifier corruption message could go and we'd embed
> corruption errors along with the other associated EFSCORRUPTED returns
> (at least where the resulting message is appropriate) in
> xfs_mount_validate_sb().
> 
> Either way seems reasonable to me. I guess if all the remaining
> situations are in fact real corruption situations, the point of
> detection approach is probably more resilient. It would still be nice to
> make the verifiers consistent in that though. ;)

And the conclusion to this discussion is ...?

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com


* Re: [PATCH] xfs: limit superblock corruption errors to probable corruption
  2014-02-06  6:43       ` Dave Chinner
@ 2014-02-07  4:23         ` Eric Sandeen
  0 siblings, 0 replies; 6+ messages in thread
From: Eric Sandeen @ 2014-02-07  4:23 UTC (permalink / raw)
  To: Dave Chinner, Brian Foster; +Cc: Eric Sandeen, xfs-oss

On 2/6/14, 12:43 AM, Dave Chinner wrote:
> On Thu, Jan 30, 2014 at 03:54:16PM -0500, Brian Foster wrote:
>> On 01/30/2014 03:30 PM, Eric Sandeen wrote:
>>> On 1/30/14, 2:26 PM, Brian Foster wrote:
>>>>> diff --git a/fs/xfs/xfs_sb.c b/fs/xfs/xfs_sb.c
>>>>> index 511cce9..b575317 100644
>>>>> --- a/fs/xfs/xfs_sb.c
>>>>> +++ b/fs/xfs/xfs_sb.c
>>>>> @@ -617,6 +617,8 @@ xfs_sb_read_verify(
>>>>>  			/* Only fail bad secondaries on a known V5 filesystem */
>>>>>  			if (bp->b_bn != XFS_SB_DADDR &&
>>>>>  			    xfs_sb_version_hascrc(&mp->m_sb)) {
>>>>> +				XFS_CORRUPTION_ERROR(__func__, XFS_ERRLEVEL_LOW,
>>>>> +						     mp, bp->b_addr);
>>>>>  				error = EFSCORRUPTED;
>>>>>  				goto out_error;
>>>>>  			}
>>>>> @@ -625,12 +627,8 @@ xfs_sb_read_verify(
>>>>>  	error = xfs_sb_verify(bp, true);
>>>>>  
>>>>>  out_error:
>>>>> -	if (error) {
>>>>> -		if (error != EWRONGFS)
>>>>> -			XFS_CORRUPTION_ERROR(__func__, XFS_ERRLEVEL_LOW,
>>>>> -					     mp, bp->b_addr);
>>>>> +	if (error)
>>>>>  		xfs_buf_ioerror(bp, error);
>>>>> -	}
>>>>>  }
>>>> ... but why not leave the corruption output here in out_error, change
>>>> the check to (error == EFSCORRUPTED) and remove the now duplicate
>>>> corruption message in xfs_mount_validate_sb() (or replace it with a
>>>> warn/notice message)? This would catch the other EFSCORRUPTED returns in
>>>> a consistent manner, including another potential duplicate in the write
>>>> verifier. I guess we'd lose a little specificity between the crc failure
>>>> and sb validation, but we could add a warn/notice for the former too.
>>>>
>>>> Brian
>>>>
>>>
>>> Well, I went back and forth on this.  It's probably philosophical. ;)
>>>
>>> Should we emit the corruption error at the point of corruption detection,
>>> or at a higher level?  I guess my concern was that while *this* caller
>>> might catch the return & yell, if another caller got added it might not.
>>>
>>> Putting it at the point of detection seemed foolproof in that regard.
>>>
>>
>> Yeah, that makes sense too. If we were consistent, that model would
>> suggest the write verifier corruption message could go and we'd embed
>> corruption errors along with the other associated EFSCORRUPTED returns
>> (at least where the resulting message is appropriate) in
>> xfs_mount_validate_sb().
>>
>> Either way seems reasonable to me. I guess if all the remaining
>> situations are in fact real corruption situations, the point of
>> detection approach is probably more resilient. It would still be nice to
>> make the verifiers consistent in that though. ;)
> 
> And the conclusion to this discussion is ...?

I think Brian has some valid points, I'll take another look at it.

Thanks,
-Eric

> Cheers,
> 
> Dave.
> 
