* [PATCH] xfs_repair: update the manual content about xfs_repair exit status
@ 2017-01-11  5:18 Zorro Lang
  2017-01-11 13:47 ` Eric Sandeen
  0 siblings, 1 reply; 9+ messages in thread
From: Zorro Lang @ 2017-01-11  5:18 UTC (permalink / raw)
  To: linux-xfs

The xfs_repair(8) man page says "xfs_repair run without the -n option will
always return a status code of 0". That's not correct.

xfs_repair will return 2 if it finds a filesystem log which needs to be
replayed or cleared, 1 if a runtime error is encountered, and 0 in
all other cases.

Signed-off-by: Zorro Lang <zlang@redhat.com>
---

Hi,

This patch has been sitting in my local xfsprogs repo for a long
time, so I'm sending it out again :)

Thanks,
Zorro

 man/man8/xfs_repair.8 | 10 ++++++++--
 1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/man/man8/xfs_repair.8 b/man/man8/xfs_repair.8
index 1b4d9e3..314f2c2 100644
--- a/man/man8/xfs_repair.8
+++ b/man/man8/xfs_repair.8
@@ -504,12 +504,18 @@ that is known to be free. The entry is therefore invalid and is deleted.
 This message refers to a large directory.
 If the directory were small, the message would read "junking entry ...".
 .SH EXIT STATUS
+.TP
 .B xfs_repair \-n
-(no modify node)
+(no modify mode)
 will return a status of 1 if filesystem corruption was detected and
 0 if no filesystem corruption was detected.
+.TP
 .B xfs_repair
-run without the \-n option will always return a status code of 0.
+run without the \-n option will return a status code of 2 if it finds a
+filesystem log which needs to be replayed (by a mount/umount cycle) or
+cleared (by -L option), 1 if a runtime error is encountered, filesystem
+may be even more broken than before, so repair needs to be run again,
+and 0 in all other cases, whether or not filesystem corruption was detected.
 .SH BUGS
 The filesystem to be checked and repaired must have been
 unmounted cleanly using normal system administration procedures
-- 
2.7.4



* Re: [PATCH] xfs_repair: update the manual content about xfs_repair exit status
  2017-01-11  5:18 [PATCH] xfs_repair: update the manual content about xfs_repair exit status Zorro Lang
@ 2017-01-11 13:47 ` Eric Sandeen
  2017-01-11 17:45   ` Darrick J. Wong
  2017-01-12  4:53   ` Zorro Lang
  0 siblings, 2 replies; 9+ messages in thread
From: Eric Sandeen @ 2017-01-11 13:47 UTC (permalink / raw)
  To: Zorro Lang, linux-xfs



On 1/10/17 11:18 PM, Zorro Lang wrote:
> The man 8 xfs_repair said "xfs_repair run without the -n option will
> always return a status code of 0". That's not correct.
> 
> xfs_repair will return 2 if it finds a fs log which needs to be
> replayed or cleared, 1 if runtime error is encountered, and 0 for
> all other cases.
> 
> Signed-off-by: Zorro Lang <zlang@redhat.com>
> ---
> 
> Hi,
> 
> This patch has been stayed in my local xfsprogs repo for a long
> time. So I'm sending it out again :)

Yep, sorry about that.  Last comment on it was that the sentence
had become a run-on sentence ...

How about:

.B xfs_repair
run without the -n option will always return a status code of 0 if it
runs without problems, regardless of whether filesystem corruption was
detected.  If an unexpected runtime error is encountered, it will return
a status code of 1, and xfs_repair should be restarted.  If a dirty
log is encountered which prevents it from continuing, it will return a
status code of 2.

(I think that the right place to document mount/unmount and/or -L is
/not/ in the status code docs - if we need that info, it should go
elsewhere.)
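
For anyone scripting around these statuses, here is a minimal Python
sketch of the mapping as worded above. The names RepairOutcome and
interpret_exit_status are made up for illustration and are not part of
xfsprogs. Any status this thread does not discuss is reported as
UNKNOWN rather than guessed at.

```python
from enum import Enum


class RepairOutcome(Enum):
    """Meanings of xfs_repair exit statuses as discussed in this thread."""
    CLEAN = "no filesystem corruption detected"
    CORRUPT = "filesystem corruption detected (nothing was modified)"
    COMPLETED = "ran without problems, whether or not corruption was found"
    RUNTIME_ERROR = "unexpected runtime error; xfs_repair should be restarted"
    DIRTY_LOG = "dirty log prevented xfs_repair from continuing"
    UNKNOWN = "status not described in this thread"


def interpret_exit_status(status: int, no_modify: bool) -> RepairOutcome:
    """Map an exit status to its meaning; no_modify is True for 'xfs_repair -n'."""
    if no_modify:
        # xfs_repair -n: 1 means corruption was detected, 0 means none was.
        table = {0: RepairOutcome.CLEAN, 1: RepairOutcome.CORRUPT}
    else:
        # Without -n: 0 says nothing about whether corruption was found.
        table = {
            0: RepairOutcome.COMPLETED,
            1: RepairOutcome.RUNTIME_ERROR,
            2: RepairOutcome.DIRTY_LOG,
        }
    return table.get(status, RepairOutcome.UNKNOWN)
```

The point of keeping two tables is that a status of 1 means different
things in the two modes, so a caller has to know how xfs_repair was
invoked before interpreting it.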

-eric

> Thanks,
> Zorro
> 
>  man/man8/xfs_repair.8 | 10 ++++++++--
>  1 file changed, 8 insertions(+), 2 deletions(-)
> 
> diff --git a/man/man8/xfs_repair.8 b/man/man8/xfs_repair.8
> index 1b4d9e3..314f2c2 100644
> --- a/man/man8/xfs_repair.8
> +++ b/man/man8/xfs_repair.8
> @@ -504,12 +504,18 @@ that is known to be free. The entry is therefore invalid and is deleted.
>  This message refers to a large directory.
>  If the directory were small, the message would read "junking entry ...".
>  .SH EXIT STATUS
> +.TP
>  .B xfs_repair \-n
> -(no modify node)
> +(no modify mode)
>  will return a status of 1 if filesystem corruption was detected and
>  0 if no filesystem corruption was detected.
> +.TP
>  .B xfs_repair
> -run without the \-n option will always return a status code of 0.
> +run without the \-n option will return a status code of 2 if it finds a
> +filesystem log which needs to be replayed (by a mount/umount cycle) or
> +cleared (by -L option), 1 if a runtime error is encountered, filesystem
> +may be even more broken than before, so repair needs to be run again,
> +and 0 in all other cases, whether or not filesystem corruption was detected.
>  .SH BUGS
>  The filesystem to be checked and repaired must have been
>  unmounted cleanly using normal system administration procedures
> 


* Re: [PATCH] xfs_repair: update the manual content about xfs_repair exit status
  2017-01-11 13:47 ` Eric Sandeen
@ 2017-01-11 17:45   ` Darrick J. Wong
  2017-01-12  5:00     ` Zorro Lang
  2017-01-12  4:53   ` Zorro Lang
  1 sibling, 1 reply; 9+ messages in thread
From: Darrick J. Wong @ 2017-01-11 17:45 UTC (permalink / raw)
  To: Eric Sandeen; +Cc: Zorro Lang, linux-xfs

On Wed, Jan 11, 2017 at 07:47:41AM -0600, Eric Sandeen wrote:
> 
> 
> On 1/10/17 11:18 PM, Zorro Lang wrote:
> > The man 8 xfs_repair said "xfs_repair run without the -n option will
> > always return a status code of 0". That's not correct.
> > 
> > xfs_repair will return 2 if it finds a fs log which needs to be
> > replayed or cleared, 1 if runtime error is encountered, and 0 for
> > all other cases.
> > 
> > Signed-off-by: Zorro Lang <zlang@redhat.com>
> > ---
> > 
> > Hi,
> > 
> > This patch has been stayed in my local xfsprogs repo for a long
> > time. So I'm sending it out again :)
> 
> Yep, sorry about that.  Last comment on it was that the sentence
> had become a run-on sentence ...
> 
> How about:
> 
> .B xfs_repair
> run without the -n option will always return a status code of 0 if it
> runs without problems, regardless of whether filesystem corruption was
> detected.  If an unexpected runtime error is encountered, it will return
> a status code of 1, and xfs_repair should be restarted.  If a dirty
> log is encountered which prevents it from continuing, it will return a
> status code of 2.
> 
> (I think that the right place to document mount/unmount and/or -L is
> /not/ in the status code docs - if we need that info, it should go
> elsewhere.)

We sort of mumble about needing to mount and umount to clear a dirty log
in the BUGS section, but I think we should just add a section about
dirty logs and what to do with them, then link to it from the status
code section and the -L option section.

"DIRTY LOGS

"Due to the design of the XFS log, a dirty log can only be replayed on a
machine having the same CPU architecture as the machine which was
writing to the log.  xfs_repair cannot replay a dirty log and will
return a status code of 2 when it detects a dirty log.

"In this situation, the log can be replayed by mounting and immediately
unmounting the filesystem on the same class of machine that crashed.
Please make sure that the machine's hardware is reliable before
replaying to avoid compounding the problems.

"If mounting fails, the log can be erased by running xfs_repair with
the -L option.  All metadata updates in progress at the time of the
crash will be lost, which may cause significant filesystem damage.  This
should only be used as a last resort."
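
The procedure in the proposed text can be sketched roughly as follows.
This is only an illustration, assuming the device and mount point are
supplied by the caller; repair_with_log_replay, run_command and the
allow_zero_log switch are hypothetical names, not anything in xfsprogs.
The command runner is injectable so the logic can be exercised without
touching a real filesystem.

```python
import subprocess
from typing import Callable, Sequence

# A runner takes an argv list and returns the command's exit status.
Runner = Callable[[Sequence[str]], int]


def run_command(argv: Sequence[str]) -> int:
    """Default runner: execute the command and return its exit status."""
    return subprocess.run(list(argv)).returncode


def repair_with_log_replay(device: str, mountpoint: str,
                           run: Runner = run_command,
                           allow_zero_log: bool = False) -> int:
    """Run xfs_repair; on status 2 (dirty log), try to replay the log first."""
    status = run(["xfs_repair", device])
    if status != 2:
        return status  # 0 or 1: no dirty log involved

    # Replay the log by mounting and immediately unmounting.
    if run(["mount", device, mountpoint]) == 0:
        if run(["umount", mountpoint]) != 0:
            raise RuntimeError("umount failed after log replay")
        return run(["xfs_repair", device])

    # Mounting failed. Zeroing the log loses in-flight metadata updates,
    # so only do it when the caller explicitly opted in (last resort).
    if allow_zero_log:
        return run(["xfs_repair", "-L", device])
    return status
```

Leaving allow_zero_log off by default mirrors the "last resort" wording:
the caller gets status 2 back and has to decide about -L deliberately.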

(If you decide to add that paragraph,
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>)

Though I wonder why we don't just fix the endianness issues with the log,
and teach xfs_repair how to replay dirty logs...

--D

> 
> -eric
> 
> > Thanks,
> > Zorro
> > 
> >  man/man8/xfs_repair.8 | 10 ++++++++--
> >  1 file changed, 8 insertions(+), 2 deletions(-)
> > 
> > diff --git a/man/man8/xfs_repair.8 b/man/man8/xfs_repair.8
> > index 1b4d9e3..314f2c2 100644
> > --- a/man/man8/xfs_repair.8
> > +++ b/man/man8/xfs_repair.8
> > @@ -504,12 +504,18 @@ that is known to be free. The entry is therefore invalid and is deleted.
> >  This message refers to a large directory.
> >  If the directory were small, the message would read "junking entry ...".
> >  .SH EXIT STATUS
> > +.TP
> >  .B xfs_repair \-n
> > -(no modify node)
> > +(no modify mode)
> >  will return a status of 1 if filesystem corruption was detected and
> >  0 if no filesystem corruption was detected.
> > +.TP
> >  .B xfs_repair
> > -run without the \-n option will always return a status code of 0.
> > +run without the \-n option will return a status code of 2 if it finds a
> > +filesystem log which needs to be replayed (by a mount/umount cycle) or
> > +cleared (by -L option), 1 if a runtime error is encountered, filesystem
> > +may be even more broken than before, so repair needs to be run again,
> > +and 0 in all other cases, whether or not filesystem corruption was detected.
> >  .SH BUGS
> >  The filesystem to be checked and repaired must have been
> >  unmounted cleanly using normal system administration procedures
> > 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-xfs" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html


* Re: [PATCH] xfs_repair: update the manual content about xfs_repair exit status
  2017-01-11 13:47 ` Eric Sandeen
  2017-01-11 17:45   ` Darrick J. Wong
@ 2017-01-12  4:53   ` Zorro Lang
  1 sibling, 0 replies; 9+ messages in thread
From: Zorro Lang @ 2017-01-12  4:53 UTC (permalink / raw)
  To: Eric Sandeen; +Cc: linux-xfs

On Wed, Jan 11, 2017 at 07:47:41AM -0600, Eric Sandeen wrote:
> 
> 
> On 1/10/17 11:18 PM, Zorro Lang wrote:
> > The man 8 xfs_repair said "xfs_repair run without the -n option will
> > always return a status code of 0". That's not correct.
> > 
> > xfs_repair will return 2 if it finds a fs log which needs to be
> > replayed or cleared, 1 if runtime error is encountered, and 0 for
> > all other cases.
> > 
> > Signed-off-by: Zorro Lang <zlang@redhat.com>
> > ---
> > 
> > Hi,
> > 
> > This patch has been stayed in my local xfsprogs repo for a long
> > time. So I'm sending it out again :)
> 
> Yep, sorry about that.  Last comment on it was that the sentence
> had become a run-on sentence ...
> 
> How about:
> 
> .B xfs_repair
> run without the -n option will always return a status code of 0 if it
> runs without problems, regardless of whether filesystem corruption was
> detected.  If an unexpected runtime error is encountered, it will return
> a status code of 1, and xfs_repair should be restarted.  If a dirty
> log is encountered which prevents it from continuing, it will return a

If we use "dirty log" here, I think many people won't know how to deal
with a dirty log (or even what a dirty log is :)

> status code of 2.
> 
> (I think that the right place to document mount/unmount and/or -L is
> /not/ in the status code docs - if we need that info, it should go
> elsewhere.)

Hmm, that makes sense. Maybe I can add the description of "-L" after
the "-L" option description line.

Thanks,
Zorro

> 
> -eric
> 
> > Thanks,
> > Zorro
> > 
> >  man/man8/xfs_repair.8 | 10 ++++++++--
> >  1 file changed, 8 insertions(+), 2 deletions(-)
> > 
> > diff --git a/man/man8/xfs_repair.8 b/man/man8/xfs_repair.8
> > index 1b4d9e3..314f2c2 100644
> > --- a/man/man8/xfs_repair.8
> > +++ b/man/man8/xfs_repair.8
> > @@ -504,12 +504,18 @@ that is known to be free. The entry is therefore invalid and is deleted.
> >  This message refers to a large directory.
> >  If the directory were small, the message would read "junking entry ...".
> >  .SH EXIT STATUS
> > +.TP
> >  .B xfs_repair \-n
> > -(no modify node)
> > +(no modify mode)
> >  will return a status of 1 if filesystem corruption was detected and
> >  0 if no filesystem corruption was detected.
> > +.TP
> >  .B xfs_repair
> > -run without the \-n option will always return a status code of 0.
> > +run without the \-n option will return a status code of 2 if it finds a
> > +filesystem log which needs to be replayed (by a mount/umount cycle) or
> > +cleared (by -L option), 1 if a runtime error is encountered, filesystem
> > +may be even more broken than before, so repair needs to be run again,
> > +and 0 in all other cases, whether or not filesystem corruption was detected.
> >  .SH BUGS
> >  The filesystem to be checked and repaired must have been
> >  unmounted cleanly using normal system administration procedures
> > 


* Re: [PATCH] xfs_repair: update the manual content about xfs_repair exit status
  2017-01-11 17:45   ` Darrick J. Wong
@ 2017-01-12  5:00     ` Zorro Lang
  0 siblings, 0 replies; 9+ messages in thread
From: Zorro Lang @ 2017-01-12  5:00 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: Eric Sandeen, linux-xfs

On Wed, Jan 11, 2017 at 09:45:30AM -0800, Darrick J. Wong wrote:
> On Wed, Jan 11, 2017 at 07:47:41AM -0600, Eric Sandeen wrote:
> > 
> > 
> > On 1/10/17 11:18 PM, Zorro Lang wrote:
> > > The man 8 xfs_repair said "xfs_repair run without the -n option will
> > > always return a status code of 0". That's not correct.
> > > 
> > > xfs_repair will return 2 if it finds a fs log which needs to be
> > > replayed or cleared, 1 if runtime error is encountered, and 0 for
> > > all other cases.
> > > 
> > > Signed-off-by: Zorro Lang <zlang@redhat.com>
> > > ---
> > > 
> > > Hi,
> > > 
> > > This patch has been stayed in my local xfsprogs repo for a long
> > > time. So I'm sending it out again :)
> > 
> > Yep, sorry about that.  Last comment on it was that the sentence
> > had become a run-on sentence ...
> > 
> > How about:
> > 
> > .B xfs_repair
> > run without the -n option will always return a status code of 0 if it
> > runs without problems, regardless of whether filesystem corruption was
> > detected.  If an unexpected runtime error is encountered, it will return
> > a status code of 1, and xfs_repair should be restarted.  If a dirty
> > log is encountered which prevents it from continuing, it will return a
> > status code of 2.
> > 
> > (I think that the right place to document mount/unmount and/or -L is
> > /not/ in the status code docs - if we need that info, it should go
> > elsewhere.)
> 
> We sort of mumble about needing to mount and umount to clear a dirty log
> in the BUGS section, but I think we should just add a section about
> dirty logs and what to do with them, then link to it from the status
> code section and the -L option section.
> 
> "DIRTY LOGS
> 
> "Due to the design of the XFS log, a dirty log can only be replayed on a
> machine having the same CPU architecture as the machine which was
> writing to the log.  xfs_repair cannot replay a dirty log and will
> return a status code of 2 when it detects a dirty log.
> 
> "In this situation, the log can be replayed by mounting and immediately
> unmounting the filesystem on the same class of machine that crashed.
> Please make sure that the machine's hardware is reliable before
> replaying to avoid compounding the problems.
> 
> "If mounting fails, the log can be erased by running xfs_repair with
> the -L option.  All metadata updates in progress at the time of the
> crash will be lost, which may cause significant filesystem damage.  This
> should only be used as a last resort."

Ah, this is what I said in my last email. If we use 'dirty log' directly,
maybe we should briefly explain what a dirty log is and how to deal with it :)

I don't think there is a single best way to describe something in a doc,
so I'd like to follow the maintainer's preference :)

Thanks,
Zorro

> 
> (If you decide to add that paragraph,
> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>)
> 
> Though I wonder why don't just fix the endianness issues with the log,
> and teach xfs_repair how to replay them...
> 
> --D
> 
> > 
> > -eric
> > 
> > > Thanks,
> > > Zorro
> > > 
> > >  man/man8/xfs_repair.8 | 10 ++++++++--
> > >  1 file changed, 8 insertions(+), 2 deletions(-)
> > > 
> > > diff --git a/man/man8/xfs_repair.8 b/man/man8/xfs_repair.8
> > > index 1b4d9e3..314f2c2 100644
> > > --- a/man/man8/xfs_repair.8
> > > +++ b/man/man8/xfs_repair.8
> > > @@ -504,12 +504,18 @@ that is known to be free. The entry is therefore invalid and is deleted.
> > >  This message refers to a large directory.
> > >  If the directory were small, the message would read "junking entry ...".
> > >  .SH EXIT STATUS
> > > +.TP
> > >  .B xfs_repair \-n
> > > -(no modify node)
> > > +(no modify mode)
> > >  will return a status of 1 if filesystem corruption was detected and
> > >  0 if no filesystem corruption was detected.
> > > +.TP
> > >  .B xfs_repair
> > > -run without the \-n option will always return a status code of 0.
> > > +run without the \-n option will return a status code of 2 if it finds a
> > > +filesystem log which needs to be replayed (by a mount/umount cycle) or
> > > +cleared (by -L option), 1 if a runtime error is encountered, filesystem
> > > +may be even more broken than before, so repair needs to be run again,
> > > +and 0 in all other cases, whether or not filesystem corruption was detected.
> > >  .SH BUGS
> > >  The filesystem to be checked and repaired must have been
> > >  unmounted cleanly using normal system administration procedures
> > > 


* Re: [PATCH] xfs_repair: update the manual content about xfs_repair exit status
  2016-09-13 14:44   ` Zorro Lang
@ 2016-09-13 14:49     ` Eric Sandeen
  0 siblings, 0 replies; 9+ messages in thread
From: Eric Sandeen @ 2016-09-13 14:49 UTC (permalink / raw)
  To: Zorro Lang; +Cc: linux-xfs, xfs



On 9/13/16 9:44 AM, Zorro Lang wrote:
> On Mon, Sep 12, 2016 at 11:01:12AM -0500, Eric Sandeen wrote:
>> On 9/9/16 11:47 PM, Zorro Lang wrote:
>>> The man 8 xfs_repair said "xfs_repair run without the -n option will
>>> always return a status code of 0". That's not correct.
>>>
>>> xfs_repair will return 2 if it find valuable metadata changes in log
>>> which needs to be replayed, 1 if it can't fix the corruption or some
>>> other errors happened and 0 if nothing wrong or all the corruptions
>>> were fixed.
>>>
>>> Generally xfs_repair -L will always return 0, except it can't clear
>>> the log.
>>
>> And I think that's an operational type error, not the result
>> of a filesystem problem; more like an IO error, or a code bug,
>> I *think* ... more below.
>>
>>
>>> Signed-off-by: Zorro Lang <zlang@redhat.com>
>>> ---
>>>
>>> Hi,
>>>
>>> I  trusted the xfs_repair manpage, and thought xfs_repair will always return 0.
>>> But recently I found it lies when I tried to review someone xfstests case.
>>>
>>> A correct manpage will help more people to write right cases, so I try to modify
>>> the manpage, by search all exit/do_error in xfsprogs/repair. I'm not the best
>>> one who learn about xfs_repair, so I just hope I did the right thing:-P Please
>>> feel free to correct me.
>>>
>>> Thanks,
>>> Zorro
>>>
>>>  man/man8/xfs_repair.8 | 13 ++++++++++++-
>>>  1 file changed, 12 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/man/man8/xfs_repair.8 b/man/man8/xfs_repair.8
>>> index 1b4d9e3..1f8f13b 100644
>>> --- a/man/man8/xfs_repair.8
>>> +++ b/man/man8/xfs_repair.8
>>> @@ -504,12 +504,23 @@ that is known to be free. The entry is therefore invalid and is deleted.
>>>  This message refers to a large directory.
>>>  If the directory were small, the message would read "junking entry ...".
>>>  .SH EXIT STATUS
>>> +.TP
>>>  .B xfs_repair \-n
>>>  (no modify node)
>>>  will return a status of 1 if filesystem corruption was detected and
>>>  0 if no filesystem corruption was detected.
>>> +.TP
>>>  .B xfs_repair
>>> -run without the \-n option will always return a status code of 0.
>>> +run without the \-n option will return a status code of 2 if it find the
>>> +filesystem has valuable metadata changes in log which needs to be
>>> +replayed, 1 if there's corruption left to be fixed
>>
>> I'm not sure that's the best description; from a quick look, I think
>> those exit values of 1 result from do_error(), and in repair that's
>> (usually?) due to something like a memory allocation failure, or an
>> inconsistent state in the tool; more like hitting an ASSERT.  That might
>> leave corruption, but only as a follow-on effect.
> 
> Hi Eric,
> 
> Many thanks for you can help to review this patch.
> 
> I've check all code will exit(1), generally it caused by memory or disk
> errors. But some other situations likes:
>  - No enough matching AGs or superblocks
>  - Primary superblock bad after phase 1
>  - Sector size on host filesystem larger than image sector size, when try
>    to repair a file image
>  ...
> 
> will exit(1) too.

Sigh, ok.  I guess the exit(1) has proliferated a lot.  :(

> But yes, they're all belong to runtime error:) There're too many situations
> can return 1. But only one place can return 2, so we can say except return 0
> and 2, others will return 1 :-P
> 
> 
>>
>>> + or can't find log head
>>> +and tail or some other errors happened, 
>>
>> Which is the same as above, I think - an internal error.
>>
>>> and 0 if nothing wrong or all the
>>> +corruptions were fixed.
>>> +.TP
>>> +.B xfs_repair \-L
>>> +(Force Log Zeroing)
>>> +will return a status code of 1 if it can't clear the log, or will always
>>> +return 0.
>>
>>
>> How about something like this:
>>
>>  .B xfs_repair \-n
>>  (no modify node)
>>  will return a status of 1 if filesystem corruption was detected and
>>  0 if no filesystem corruption was detected.
>>  .TP
>>  .B xfs_repair
>>  run without the \-n option will return a status code of 2 if it finds a
>>  filesystem log which needs to be replayed (by a mount/umount cycle), 1 if
>>  a runtime error is encountered, and 0 in all other cases, whether or not
>>  filesystem corruption was detected.
> 
> Your patch(xfs_repair: exit with status 2 if log dirtiness is unknown) will
> make xfs_repair return 2, when it can't find log head/tail. I think xfs_repair
> won't think the log needs to be replayed if it can't find the log tail/head.
> 
> So how about "return a status code of 2 if it finds filesystem log needs to be
> replayed or cleared"?

That seems reasonable...

-Eric

> Thanks,
> Zorro
> 
>>
>> and I'd leave out the bit about xfs_repair -L; really that's just a runtime
>> error - if we clear the log and then can't find the head/tail, something
>> strange has gone wrong.
>>
>> Thanks,
>>
>> -Eric
>>
>>>  .SH BUGS
>>>  The filesystem to be checked and repaired must have been
>>>  unmounted cleanly using normal system administration procedures
>>>
>>
>> _______________________________________________
>> xfs mailing list
>> xfs@oss.sgi.com
>> http://oss.sgi.com/mailman/listinfo/xfs
> 



* Re: [PATCH] xfs_repair: update the manual content about xfs_repair exit status
  2016-09-12 16:01 ` Eric Sandeen
@ 2016-09-13 14:44   ` Zorro Lang
  2016-09-13 14:49     ` Eric Sandeen
  0 siblings, 1 reply; 9+ messages in thread
From: Zorro Lang @ 2016-09-13 14:44 UTC (permalink / raw)
  To: Eric Sandeen; +Cc: linux-xfs, xfs

On Mon, Sep 12, 2016 at 11:01:12AM -0500, Eric Sandeen wrote:
> On 9/9/16 11:47 PM, Zorro Lang wrote:
> > The man 8 xfs_repair said "xfs_repair run without the -n option will
> > always return a status code of 0". That's not correct.
> > 
> > xfs_repair will return 2 if it find valuable metadata changes in log
> > which needs to be replayed, 1 if it can't fix the corruption or some
> > other errors happened and 0 if nothing wrong or all the corruptions
> > were fixed.
> > 
> > Generally xfs_repair -L will always return 0, except it can't clear
> > the log.
> 
> And I think that's an operational type error, not the result
> of a filesystem problem; more like an IO error, or a code bug,
> I *think* ... more below.
> 
> 
> > Signed-off-by: Zorro Lang <zlang@redhat.com>
> > ---
> > 
> > Hi,
> > 
> > I  trusted the xfs_repair manpage, and thought xfs_repair will always return 0.
> > But recently I found it lies when I tried to review someone xfstests case.
> > 
> > A correct manpage will help more people to write right cases, so I try to modify
> > the manpage, by search all exit/do_error in xfsprogs/repair. I'm not the best
> > one who learn about xfs_repair, so I just hope I did the right thing:-P Please
> > feel free to correct me.
> > 
> > Thanks,
> > Zorro
> > 
> >  man/man8/xfs_repair.8 | 13 ++++++++++++-
> >  1 file changed, 12 insertions(+), 1 deletion(-)
> > 
> > diff --git a/man/man8/xfs_repair.8 b/man/man8/xfs_repair.8
> > index 1b4d9e3..1f8f13b 100644
> > --- a/man/man8/xfs_repair.8
> > +++ b/man/man8/xfs_repair.8
> > @@ -504,12 +504,23 @@ that is known to be free. The entry is therefore invalid and is deleted.
> >  This message refers to a large directory.
> >  If the directory were small, the message would read "junking entry ...".
> >  .SH EXIT STATUS
> > +.TP
> >  .B xfs_repair \-n
> >  (no modify node)
> >  will return a status of 1 if filesystem corruption was detected and
> >  0 if no filesystem corruption was detected.
> > +.TP
> >  .B xfs_repair
> > -run without the \-n option will always return a status code of 0.
> > +run without the \-n option will return a status code of 2 if it find the
> > +filesystem has valuable metadata changes in log which needs to be
> > +replayed, 1 if there's corruption left to be fixed
> 
> I'm not sure that's the best description; from a quick look, I think
> those exit values of 1 result from do_error(), and in repair that's
> (usually?) due to something like a memory allocation failure, or an
> inconsistent state in the tool; more like hitting an ASSERT.  That might
> leave corruption, but only as a follow-on effect.

Hi Eric,

Many thanks for helping to review this patch.

I've checked all the code paths that call exit(1); they are generally
caused by memory or disk errors. But some other situations, such as:
 - Not enough matching AGs or superblocks
 - Primary superblock bad after phase 1
 - Sector size on host filesystem larger than image sector size, when trying
   to repair a file image
 ...

will exit(1) too.

But yes, they all count as runtime errors :) There are too many situations
that can return 1, but only one place can return 2, so we can say that apart
from the cases that return 0 and 2, everything else returns 1 :-P


>
> > + or can't find log head
> > +and tail or some other errors happened, 
> 
> Which is the same as above, I think - an internal error.
> 
> > and 0 if nothing wrong or all the
> > +corruptions were fixed.
> > +.TP
> > +.B xfs_repair \-L
> > +(Force Log Zeroing)
> > +will return a status code of 1 if it can't clear the log, or will always
> > +return 0.
> 
> 
> How about something like this:
> 
>  .B xfs_repair \-n
>  (no modify node)
>  will return a status of 1 if filesystem corruption was detected and
>  0 if no filesystem corruption was detected.
>  .TP
>  .B xfs_repair
>  run without the \-n option will return a status code of 2 if it finds a
>  filesystem log which needs to be replayed (by a mount/umount cycle), 1 if
>  a runtime error is encountered, and 0 in all other cases, whether or not
>  filesystem corruption was detected.

Your patch (xfs_repair: exit with status 2 if log dirtiness is unknown) will
make xfs_repair return 2 when it can't find the log head/tail. I don't think
xfs_repair will consider that the log needs to be replayed if it can't find
the log head/tail.

So how about "return a status code of 2 if it finds a filesystem log which
needs to be replayed or cleared"?

Thanks,
Zorro

> 
> and I'd leave out the bit about xfs_repair -L; really that's just a runtime
> error - if we clear the log and then can't find the head/tail, something
> strange has gone wrong.
> 
> Thanks,
> 
> -Eric
> 
> >  .SH BUGS
> >  The filesystem to be checked and repaired must have been
> >  unmounted cleanly using normal system administration procedures
> > 
> 


* Re: [PATCH] xfs_repair: update the manual content about xfs_repair exit status
  2016-09-10  4:47 Zorro Lang
@ 2016-09-12 16:01 ` Eric Sandeen
  2016-09-13 14:44   ` Zorro Lang
  0 siblings, 1 reply; 9+ messages in thread
From: Eric Sandeen @ 2016-09-12 16:01 UTC (permalink / raw)
  To: Zorro Lang, linux-xfs; +Cc: xfs

On 9/9/16 11:47 PM, Zorro Lang wrote:
> The man 8 xfs_repair said "xfs_repair run without the -n option will
> always return a status code of 0". That's not correct.
> 
> xfs_repair will return 2 if it find valuable metadata changes in log
> which needs to be replayed, 1 if it can't fix the corruption or some
> other errors happened and 0 if nothing wrong or all the corruptions
> were fixed.
> 
> Generally xfs_repair -L will always return 0, except it can't clear
> the log.

And I think that's an operational type error, not the result
of a filesystem problem; more like an IO error, or a code bug,
I *think* ... more below.


> Signed-off-by: Zorro Lang <zlang@redhat.com>
> ---
> 
> Hi,
> 
> I  trusted the xfs_repair manpage, and thought xfs_repair will always return 0.
> But recently I found it lies when I tried to review someone xfstests case.
> 
> A correct manpage will help more people to write right cases, so I try to modify
> the manpage, by search all exit/do_error in xfsprogs/repair. I'm not the best
> one who learn about xfs_repair, so I just hope I did the right thing:-P Please
> feel free to correct me.
> 
> Thanks,
> Zorro
> 
>  man/man8/xfs_repair.8 | 13 ++++++++++++-
>  1 file changed, 12 insertions(+), 1 deletion(-)
> 
> diff --git a/man/man8/xfs_repair.8 b/man/man8/xfs_repair.8
> index 1b4d9e3..1f8f13b 100644
> --- a/man/man8/xfs_repair.8
> +++ b/man/man8/xfs_repair.8
> @@ -504,12 +504,23 @@ that is known to be free. The entry is therefore invalid and is deleted.
>  This message refers to a large directory.
>  If the directory were small, the message would read "junking entry ...".
>  .SH EXIT STATUS
> +.TP
>  .B xfs_repair \-n
>  (no modify node)
>  will return a status of 1 if filesystem corruption was detected and
>  0 if no filesystem corruption was detected.
> +.TP
>  .B xfs_repair
> -run without the \-n option will always return a status code of 0.
> +run without the \-n option will return a status code of 2 if it find the
> +filesystem has valuable metadata changes in log which needs to be
> +replayed, 1 if there's corruption left to be fixed

I'm not sure that's the best description; from a quick look, I think
those exit values of 1 result from do_error(), and in repair that's
(usually?) due to something like a memory allocation failure, or an
inconsistent state in the tool; more like hitting an ASSERT.  That might
leave corruption, but only as a follow-on effect.

> + or can't find log head
> +and tail or some other errors happened, 

Which is the same as above, I think - an internal error.

> and 0 if nothing wrong or all the
> +corruptions were fixed.
> +.TP
> +.B xfs_repair \-L
> +(Force Log Zeroing)
> +will return a status code of 1 if it can't clear the log, or will always
> +return 0.


How about something like this:

 .B xfs_repair \-n
 (no modify node)
 will return a status of 1 if filesystem corruption was detected and
 0 if no filesystem corruption was detected.
 .TP
 .B xfs_repair
 run without the \-n option will return a status code of 2 if it finds a
 filesystem log which needs to be replayed (by a mount/umount cycle), 1 if
 a runtime error is encountered, and 0 in all other cases, whether or not
 filesystem corruption was detected.

and I'd leave out the bit about xfs_repair -L; really that's just a runtime
error - if we clear the log and then can't find the head/tail, something
strange has gone wrong.
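
For anyone writing test cases against these values, the wording proposed above could be sketched as a small lookup. This is only an illustration of the suggested man page text, not code from xfsprogs: the function name `describe_repair_status` and the "unexpected status" fallback are hypothetical, and the returned strings are paraphrases of the proposal.

```python
def describe_repair_status(status: int, no_modify: bool = False) -> str:
    """Interpret an xfs_repair exit status per the proposed man page text.

    Hypothetical helper for illustration; not part of xfsprogs.
    """
    if no_modify:
        # xfs_repair -n: 1 if corruption was detected, 0 if none was detected.
        if status == 0:
            return "no corruption detected"
        if status == 1:
            return "corruption detected"
        # Anything else is not covered by the proposed text (assumption).
        return "unexpected status"

    # xfs_repair without -n.
    if status == 2:
        return "dirty log: replay it with a mount/umount cycle"
    if status == 1:
        return "runtime error"
    if status == 0:
        # 0 is returned whether or not corruption was detected.
        return "completed (corruption may or may not have been found)"
    return "unexpected status"
```

A caller would pass in the return code of an actual xfs_repair invocation, together with whether -n was used. Note that under this reading a 0 from a modifying run says nothing about whether corruption was found, so a test that wants to know that still needs a separate -n run.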

Thanks,

-Eric

>  .SH BUGS
>  The filesystem to be checked and repaired must have been
>  unmounted cleanly using normal system administration procedures
> 



* [PATCH] xfs_repair: update the manual content about xfs_repair exit status
@ 2016-09-10  4:47 Zorro Lang
  2016-09-12 16:01 ` Eric Sandeen
  0 siblings, 1 reply; 9+ messages in thread
From: Zorro Lang @ 2016-09-10  4:47 UTC (permalink / raw)
  To: linux-xfs; +Cc: xfs, Zorro Lang

The xfs_repair(8) man page says "xfs_repair run without the -n option will
always return a status code of 0". That's not correct.

xfs_repair will return 2 if it finds valuable metadata changes in the log
which need to be replayed, 1 if it can't fix the corruption or some
other error happened, and 0 if nothing is wrong or all the corruption
was fixed.

Generally xfs_repair -L will always return 0, unless it can't clear
the log.

Signed-off-by: Zorro Lang <zlang@redhat.com>
---

Hi,

I trusted the xfs_repair manpage, and thought xfs_repair would always return 0.
But recently I found that it lies, when I was reviewing someone's xfstests case.

A correct manpage will help more people write correct test cases, so I tried to fix
the manpage by searching for all exit/do_error calls in xfsprogs/repair. I'm not the
person who knows xfs_repair best, so I just hope I did the right thing :-P Please
feel free to correct me.

Thanks,
Zorro

 man/man8/xfs_repair.8 | 13 ++++++++++++-
 1 file changed, 12 insertions(+), 1 deletion(-)

diff --git a/man/man8/xfs_repair.8 b/man/man8/xfs_repair.8
index 1b4d9e3..1f8f13b 100644
--- a/man/man8/xfs_repair.8
+++ b/man/man8/xfs_repair.8
@@ -504,12 +504,23 @@ that is known to be free. The entry is therefore invalid and is deleted.
 This message refers to a large directory.
 If the directory were small, the message would read "junking entry ...".
 .SH EXIT STATUS
+.TP
 .B xfs_repair \-n
 (no modify node)
 will return a status of 1 if filesystem corruption was detected and
 0 if no filesystem corruption was detected.
+.TP
 .B xfs_repair
-run without the \-n option will always return a status code of 0.
+run without the \-n option will return a status code of 2 if it find the
+filesystem has valuable metadata changes in log which needs to be
+replayed, 1 if there's corruption left to be fixed or can't find log head
+and tail or some other errors happened, and 0 if nothing wrong or all the
+corruptions were fixed.
+.TP
+.B xfs_repair \-L
+(Force Log Zeroing)
+will return a status code of 1 if it can't clear the log, or will always
+return 0.
 .SH BUGS
 The filesystem to be checked and repaired must have been
 unmounted cleanly using normal system administration procedures
-- 
2.7.4



end of thread, other threads:[~2017-01-12  5:01 UTC | newest]

Thread overview: 9+ messages
2017-01-11  5:18 [PATCH] xfs_repair: update the manual content about xfs_repair exit status Zorro Lang
2017-01-11 13:47 ` Eric Sandeen
2017-01-11 17:45   ` Darrick J. Wong
2017-01-12  5:00     ` Zorro Lang
2017-01-12  4:53   ` Zorro Lang
  -- strict thread matches above, loose matches on Subject: below --
2016-09-10  4:47 Zorro Lang
2016-09-12 16:01 ` Eric Sandeen
2016-09-13 14:44   ` Zorro Lang
2016-09-13 14:49     ` Eric Sandeen
