All of lore.kernel.org
 help / color / mirror / Atom feed
From: Zorro Lang <zlang@redhat.com>
To: Xiao Yang <yangx.jy@cn.fujitsu.com>
Cc: fstests@vger.kernel.org, linux-xfs@vger.kernel.org, xfs@oss.sgi.com
Subject: Re: [PATCH v2] xfs/098: fix xfs_repair on newer xfsprogs
Date: Fri, 9 Sep 2016 05:28:32 -0700 (PDT)	[thread overview]
Message-ID: <20160909122831.GD12847@dhcp12-143.nay.redhat.com> (raw)
In-Reply-To: <57D28101.6000902@cn.fujitsu.com>

On Fri, Sep 09, 2016 at 05:29:37PM +0800, Xiao Yang wrote:
> 于 2016/08/26 17:05, Zorro Lang 写道:
> >On Fri, Aug 26, 2016 at 02:10:00PM +0800, Xiao Yang wrote:
> >>On 2016/08/26 12:48, Zorro Lang wrote:
> >>>On Fri, Aug 26, 2016 at 11:36:52AM +0800, Xiao Yang wrote:
> >>>>Make sure xfs_repair can't clear the log by default when it is corrupted.
> >>>>xfs_repair always and only clear the log when the -L parameter is specified.
> >>>>This has updated by:
> >>>>Commit f2053bc ("xfs_repair: don't clear the log by default")
> >>>>
> >>>>Signed-off-by: Xiao Yang<yangx.jy@cn.fujitsu.com>
> >>>>---
> >>>>  common/rc     | 4 ++--
> >>>>  tests/xfs/098 | 2 +-
> >>>>  2 files changed, 3 insertions(+), 3 deletions(-)
> >>>>
> >>>>diff --git a/common/rc b/common/rc
> >>>>index 3fb0600..c693a31 100644
> >>>>--- a/common/rc
> >>>>+++ b/common/rc
> >>>>@@ -1143,9 +1143,9 @@ _repair_scratch_fs()
> >>>Hi Xiao
> >>>
> >>>You should explain why you changed this function in commit log. Or
> >>>the reviewer can't understand why you change it.
> >>>
> >>>>      xfs)
> >>>>          _scratch_xfs_repair "$@" 2>&1
> >>>>  	res=$?
> >>>>-	if [ "$res" -eq 2 ]; then
> >>>>+	if [ "$res" -ne 0 ]; then
> >>>Hi Darrick,
> >>>
> >>>The xfs_repair manpage said:
> >>>xfs_repair run without the -n option will always return a status code of 0.
> >>>
> >>>I don't understand why you think it return 2 here? (Please check below)
> >>>
> >>Hi Zorro
> >>
> >>I don't understand why it return 2 here too.  I want to change this
> >>function because xfs_repair
> >>without -L option return 1 when log is corrupted on newer xfsprogs-dev.
> >>>>  		echo "xfs_repair returns $res; replay log?"
> >>>>-		_scratch_mount
> >>>>+		_scratch_mount 2>&1
> >>>>  		res=$?
> >>>>  		if [ "$res" -gt 0 ]; then
> >>>>  			echo "mount returns $res; zap log?"
> >>>>diff --git a/tests/xfs/098 b/tests/xfs/098
> >>>>index d91d617..eb33bb1 100755
> >>>>--- a/tests/xfs/098
> >>>>+++ b/tests/xfs/098
> >>>>@@ -93,7 +93,7 @@ echo "+ mount image"
> >>>>  _scratch_mount 2>/dev/null&&   _fail "mount should not succeed"
> >>>>
> >>>>  echo "+ repair fs"
> >>>>-_scratch_xfs_repair>>   $seqres.full 2>&1
> >>>>+_repair_scratch_fs>>   $seqres.full
> >You should print the stderr to $seqres.full too. Because in
> >"_repair_scratch_fs", its code likes below:
> >
> >     xfs)
> >         _scratch_xfs_repair "$@" 2>&1
> >>>>This repair won't clear the corrupted log anymore.
> >         res=$?
> >         if [ "$res" -eq 2 ]; then
> >                 echo "xfs_repair returns $res; replay log?"
> >                 _scratch_mount
> >>>>So this mount maybe failed if it can't deal with the corrupted log.
> >>>>If it print some error messages, it'll break the golden image of xfs/098
> >                 res=$?
> >                 if [ "$res" -gt 0 ]; then
> >                         echo "mount returns $res; zap log?"
> >                         _scratch_xfs_repair -L 2>&1
> >
> >
> >>>If just call xfs_repair without any options, the _repair_scratch_fs won't
> >>>help to call xfs_repair -L I think.
> >>>
> >>>So I think this patch won't fix the problem.
> >>>
> >>>Feel free to correct me, if I misunderstand something:)
> >>>
> >>>Thanks,
> >>>Zorro
> >>>
> >>If xfs_repair without any option succeed to repair filesystem when
> >>log is corrupted,
> >>_repair_scratch_fs don't need  to call  xfs_repair -L.  If it failed
> >>to repair filesystem,
> >>_repair_scratch_fs needs  to call  xfs_repair -L.
> >Oh, sorry, I just tried to run ths case. The "_scratch_xfs_repair" really return
> >non-zero when it try to repair a corrupted xfs...
> >
> >But the manpage(man xfs_repair) really said:
> >xfs_repair run without the -n option will always return a status code of 0.
> >
> >Maybe we should update the manpage? I'll check it later.
> >
> >Any way, there's still a problem in your patch, please see above:
> >
> >Thanks,
> >Zorro
> Hi Zorro
> Do you know why it returns 2 instead of 1 when we use xfs_repair
> without any options.
> I can't understand it, because it always return 1 on my machine.

Hi Xiao,

Please CC the mail list, there's no secret. And the most important
thing is if I said something wrong, others great developers maybe
glad to correct me:-P

I've asked DJ Wong about the return value of xfs_repair, and he
already replied:

"xfs_repair returns 2 when the log is corrupted, 1 when there's corruption left
to be fixed *or* some kind of operation error happened, and 0 if either it
found nothing wrong or all the corruptions were fixed."

I'm sure that email has been sent to you too.

If you can't understand why it return 1, you can check your xfs/098.full file,
you'll find:

"Phase 1 - find and verify superblock...
Phase 2 - using internal log
        - zero log...
Log inconsistent (didn't find previous header)
failed to find log head
zero_log: cannot find log head/tail (xlog_find_tail=5)

fatal error -- ERROR: The log head and/or tail cannot be discovered. Attempt to mount the
filesystem to replay the log or use the -L option to destroy the log and
attempt a repair.
xfs_repair failed, err=1"

This output from below xfsprogs code:

        error = xlog_find_tail(log, &head_blk, &tail_blk);
        if (error) {
                do_warn(
                _("zero_log: cannot find log head/tail (xlog_find_tail=%d)\n"),
                        error);
                if (!no_modify && !zap_log)
>>> [exit from here] >>>    do_error(_(
"ERROR: The log head and/or tail cannot be discovered. Attempt to mount the\n"
"filesystem to replay the log or use the -L option to destroy the log and\n"
"attempt a repair.\n"));
        } else {
                if (verbose) {
                        do_warn(
        _("zero_log: head block %" PRId64 " tail block %" PRId64 "\n"),
                                head_blk, tail_blk);
                }
                if (!no_modify && head_blk != tail_blk) {
                        if (zap_log) {
                                do_warn(_(
"ALERT: The filesystem has valuable metadata changes in a log which is being\n"
"destroyed because the -L option was used.\n"));
                        } else {
                                do_warn(_(
"ERROR: The filesystem has valuable metadata changes in a log which needs to\n"
"be replayed.  Mount the filesystem to replay the log, and unmount it before\n"
"re-running xfs_repair.  If you are unable to mount the filesystem, then use\n"
"the -L option to destroy the log and attempt a repair.\n"
"Note that destroying the log may cause corruption -- please attempt a mount\n"
"of the filesystem before doing this.\n"));
                                exit(2);
                        }
                }
        }

I've marked [exit from here] for you. do_error will call exit(1). And the output
message already tell you the reason about why it fail.

You can keep reading, there's a "exit(2)" at the end of above code. I can't find
more exit(2) from xfsprogs/repair/ . So maybe this's the only one place which
can return 2. From the information above that exit(2), you can see that
xfs_repair will return 2 when it find there're some valuable metadata changes in
a log. It think a mount operation maybe can replay this log, so it return 2 and
suggest the user try to mount the filesystem. If mount can't replay the log, -L
is the next choice.

So I think the _repair_scratch_fs function in xfstests/common/rc doesn't think
about above situation. xfs_repair doesn't always return 2 if log corrupted.
Only xfs_repair feel log can be replay, it'll return 2, or it'll return 1. So
maybe we should change "if [ $res -eq 2 ]" to "if [ $res -ne 0 ]". Or we need
to change xfs_repair to make it return 2:-P

For xfs/098's problem, you can change the line#96:
from
_scratch_xfs_repair >> $seqres.full 2>&1
to
_repair_scratch_fs >> $seqres.full 2>&1

And _repair_scratch_fs need to be modified as I said above. I think I should write
a patch to describe the return value of xfs_repair(without -n). The current
xfs_repair manpage said:
"xfs_repair run without the -n option will always return a status code of 0."
it's wrong.

OK, I've talked too much. If anyone feel anything wrong, please corrent me:)

Thanks,
Zorro

> 
> Thanks,
> yang
> >>Thanks
> >>Xiao Yang.
> >>>>  echo "+ mount image (2)"
> >>>>  _scratch_mount
> >>>>-- 
> >>>>1.8.3.1
> >>>>
> >>>>
> >>>>
> >>>>--
> >>>>To unsubscribe from this list: send the line "unsubscribe fstests" in
> >>>>the body of a message to majordomo@vger.kernel.org
> >>>>More majordomo info at  http://vger.kernel.org/majordomo-info.html
> >>>.
> >>>
> >>
> >>
> >>--
> >>To unsubscribe from this list: send the line "unsubscribe fstests" in
> >>the body of a message to majordomo@vger.kernel.org
> >>More majordomo info at  http://vger.kernel.org/majordomo-info.html
> >
> >.
> >
> 
> 
> 

WARNING: multiple messages have this Message-ID (diff)
From: Zorro Lang <zlang@redhat.com>
To: Xiao Yang <yangx.jy@cn.fujitsu.com>
Cc: linux-xfs@vger.kernel.org, fstests@vger.kernel.org, xfs@oss.sgi.com
Subject: Re: [PATCH v2] xfs/098: fix xfs_repair on newer xfsprogs
Date: Fri, 9 Sep 2016 20:28:32 +0800	[thread overview]
Message-ID: <20160909122831.GD12847@dhcp12-143.nay.redhat.com> (raw)
In-Reply-To: <57D28101.6000902@cn.fujitsu.com>

On Fri, Sep 09, 2016 at 05:29:37PM +0800, Xiao Yang wrote:
> 于 2016/08/26 17:05, Zorro Lang 写道:
> >On Fri, Aug 26, 2016 at 02:10:00PM +0800, Xiao Yang wrote:
> >>On 2016/08/26 12:48, Zorro Lang wrote:
> >>>On Fri, Aug 26, 2016 at 11:36:52AM +0800, Xiao Yang wrote:
> >>>>Make sure xfs_repair can't clear the log by default when it is corrupted.
> >>>>xfs_repair always and only clear the log when the -L parameter is specified.
> >>>>This has updated by:
> >>>>Commit f2053bc ("xfs_repair: don't clear the log by default")
> >>>>
> >>>>Signed-off-by: Xiao Yang<yangx.jy@cn.fujitsu.com>
> >>>>---
> >>>>  common/rc     | 4 ++--
> >>>>  tests/xfs/098 | 2 +-
> >>>>  2 files changed, 3 insertions(+), 3 deletions(-)
> >>>>
> >>>>diff --git a/common/rc b/common/rc
> >>>>index 3fb0600..c693a31 100644
> >>>>--- a/common/rc
> >>>>+++ b/common/rc
> >>>>@@ -1143,9 +1143,9 @@ _repair_scratch_fs()
> >>>Hi Xiao
> >>>
> >>>You should explain why you changed this function in commit log. Or
> >>>the reviewer can't understand why you change it.
> >>>
> >>>>      xfs)
> >>>>          _scratch_xfs_repair "$@" 2>&1
> >>>>  	res=$?
> >>>>-	if [ "$res" -eq 2 ]; then
> >>>>+	if [ "$res" -ne 0 ]; then
> >>>Hi Darrick,
> >>>
> >>>The xfs_repair manpage said:
> >>>xfs_repair run without the -n option will always return a status code of 0.
> >>>
> >>>I don't understand why you think it return 2 here? (Please check below)
> >>>
> >>Hi Zorro
> >>
> >>I don't understand why it return 2 here too.  I want to change this
> >>function because xfs_repair
> >>without -L option return 1 when log is corrupted on newer xfsprogs-dev.
> >>>>  		echo "xfs_repair returns $res; replay log?"
> >>>>-		_scratch_mount
> >>>>+		_scratch_mount 2>&1
> >>>>  		res=$?
> >>>>  		if [ "$res" -gt 0 ]; then
> >>>>  			echo "mount returns $res; zap log?"
> >>>>diff --git a/tests/xfs/098 b/tests/xfs/098
> >>>>index d91d617..eb33bb1 100755
> >>>>--- a/tests/xfs/098
> >>>>+++ b/tests/xfs/098
> >>>>@@ -93,7 +93,7 @@ echo "+ mount image"
> >>>>  _scratch_mount 2>/dev/null&&   _fail "mount should not succeed"
> >>>>
> >>>>  echo "+ repair fs"
> >>>>-_scratch_xfs_repair>>   $seqres.full 2>&1
> >>>>+_repair_scratch_fs>>   $seqres.full
> >You should print the stderr to $seqres.full too. Because in
> >"_repair_scratch_fs", its code likes below:
> >
> >     xfs)
> >         _scratch_xfs_repair "$@" 2>&1
> >>>>This repair won't clear the corrupted log anymore.
> >         res=$?
> >         if [ "$res" -eq 2 ]; then
> >                 echo "xfs_repair returns $res; replay log?"
> >                 _scratch_mount
> >>>>So this mount maybe failed if it can't deal with the corrupted log.
> >>>>If it print some error messages, it'll break the golden image of xfs/098
> >                 res=$?
> >                 if [ "$res" -gt 0 ]; then
> >                         echo "mount returns $res; zap log?"
> >                         _scratch_xfs_repair -L 2>&1
> >
> >
> >>>If just call xfs_repair without any options, the _repair_scratch_fs won't
> >>>help to call xfs_repair -L I think.
> >>>
> >>>So I think this patch won't fix the problem.
> >>>
> >>>Feel free to correct me, if I misunderstand something:)
> >>>
> >>>Thanks,
> >>>Zorro
> >>>
> >>If xfs_repair without any option succeed to repair filesystem when
> >>log is corrupted,
> >>_repair_scratch_fs don't need  to call  xfs_repair -L.  If it failed
> >>to repair filesystem,
> >>_repair_scratch_fs needs  to call  xfs_repair -L.
> >Oh, sorry, I just tried to run ths case. The "_scratch_xfs_repair" really return
> >non-zero when it try to repair a corrupted xfs...
> >
> >But the manpage(man xfs_repair) really said:
> >xfs_repair run without the -n option will always return a status code of 0.
> >
> >Maybe we should update the manpage? I'll check it later.
> >
> >Any way, there's still a problem in your patch, please see above:
> >
> >Thanks,
> >Zorro
> Hi Zorro
> Do you know why it returns 2 instead of 1 when we use xfs_repair
> without any options.
> I can't understand it, because it always return 1 on my machine.

Hi Xiao,

Please CC the mail list, there's no secret. And the most important
thing is if I said something wrong, others great developers maybe
glad to correct me:-P

I've asked DJ Wong about the return value of xfs_repair, and he
already replied:

"xfs_repair returns 2 when the log is corrupted, 1 when there's corruption left
to be fixed *or* some kind of operation error happened, and 0 if either it
found nothing wrong or all the corruptions were fixed."

I'm sure that email has been sent to you too.

If you can't understand why it return 1, you can check your xfs/098.full file,
you'll find:

"Phase 1 - find and verify superblock...
Phase 2 - using internal log
        - zero log...
Log inconsistent (didn't find previous header)
failed to find log head
zero_log: cannot find log head/tail (xlog_find_tail=5)

fatal error -- ERROR: The log head and/or tail cannot be discovered. Attempt to mount the
filesystem to replay the log or use the -L option to destroy the log and
attempt a repair.
xfs_repair failed, err=1"

This output from below xfsprogs code:

        error = xlog_find_tail(log, &head_blk, &tail_blk);
        if (error) {
                do_warn(
                _("zero_log: cannot find log head/tail (xlog_find_tail=%d)\n"),
                        error);
                if (!no_modify && !zap_log)
>>> [exit from here] >>>    do_error(_(
"ERROR: The log head and/or tail cannot be discovered. Attempt to mount the\n"
"filesystem to replay the log or use the -L option to destroy the log and\n"
"attempt a repair.\n"));
        } else {
                if (verbose) {
                        do_warn(
        _("zero_log: head block %" PRId64 " tail block %" PRId64 "\n"),
                                head_blk, tail_blk);
                }
                if (!no_modify && head_blk != tail_blk) {
                        if (zap_log) {
                                do_warn(_(
"ALERT: The filesystem has valuable metadata changes in a log which is being\n"
"destroyed because the -L option was used.\n"));
                        } else {
                                do_warn(_(
"ERROR: The filesystem has valuable metadata changes in a log which needs to\n"
"be replayed.  Mount the filesystem to replay the log, and unmount it before\n"
"re-running xfs_repair.  If you are unable to mount the filesystem, then use\n"
"the -L option to destroy the log and attempt a repair.\n"
"Note that destroying the log may cause corruption -- please attempt a mount\n"
"of the filesystem before doing this.\n"));
                                exit(2);
                        }
                }
        }

I've marked [exit from here] for you. do_error will call exit(1). And the output
message already tell you the reason about why it fail.

You can keep reading, there's a "exit(2)" at the end of above code. I can't find
more exit(2) from xfsprogs/repair/ . So maybe this's the only one place which
can return 2. From the information above that exit(2), you can see that
xfs_repair will return 2 when it find there're some valuable metadata changes in
a log. It think a mount operation maybe can replay this log, so it return 2 and
suggest the user try to mount the filesystem. If mount can't replay the log, -L
is the next choice.

So I think the _repair_scratch_fs function in xfstests/common/rc doesn't think
about above situation. xfs_repair doesn't always return 2 if log corrupted.
Only xfs_repair feel log can be replay, it'll return 2, or it'll return 1. So
maybe we should change "if [ $res -eq 2 ]" to "if [ $res -ne 0 ]". Or we need
to change xfs_repair to make it return 2:-P

For xfs/098's problem, you can change the line#96:
from
_scratch_xfs_repair >> $seqres.full 2>&1
to
_repair_scratch_fs >> $seqres.full 2>&1

And _repair_scratch_fs need to be modified as I said above. I think I should write
a patch to describe the return value of xfs_repair(without -n). The current
xfs_repair manpage said:
"xfs_repair run without the -n option will always return a status code of 0."
it's wrong.

OK, I've talked too much. If anyone feel anything wrong, please corrent me:)

Thanks,
Zorro

> 
> Thanks,
> yang
> >>Thanks
> >>Xiao Yang.
> >>>>  echo "+ mount image (2)"
> >>>>  _scratch_mount
> >>>>-- 
> >>>>1.8.3.1
> >>>>
> >>>>
> >>>>
> >>>>--
> >>>>To unsubscribe from this list: send the line "unsubscribe fstests" in
> >>>>the body of a message to majordomo@vger.kernel.org
> >>>>More majordomo info at  http://vger.kernel.org/majordomo-info.html
> >>>.
> >>>
> >>
> >>
> >>--
> >>To unsubscribe from this list: send the line "unsubscribe fstests" in
> >>the body of a message to majordomo@vger.kernel.org
> >>More majordomo info at  http://vger.kernel.org/majordomo-info.html
> >
> >.
> >
> 
> 
> 

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

  parent reply	other threads:[~2016-09-09 12:28 UTC|newest]

Thread overview: 22+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-08-25  9:22 [PATCH] xfs/098: fix xfs_repair on newer xfsprogs Xiao Yang
2016-08-25 12:09 ` Zorro Lang
2016-08-25 15:40   ` Darrick J. Wong
2016-08-25 16:32     ` Zorro Lang
2016-08-26  3:32     ` Xiao Yang
2016-08-26 16:18       ` Darrick J. Wong
2016-08-26  3:36     ` [PATCH v2] " Xiao Yang
2016-08-26  4:42       ` Eryu Guan
2016-08-26  5:44         ` Xiao Yang
2016-08-26  4:48       ` Zorro Lang
2016-08-26  6:10         ` Xiao Yang
2016-08-26  9:05           ` Zorro Lang
     [not found]             ` <57D28101.6000902@cn.fujitsu.com>
2016-09-09 12:28               ` Zorro Lang [this message]
2016-09-09 12:28                 ` Zorro Lang
2016-09-12  1:07                 ` Xiao Yang
2016-09-12  1:07                   ` Xiao Yang
2016-09-12  5:13                 ` [PATCH v3] " Xiao Yang
2016-09-12 12:59                   ` Eric Sandeen
2016-09-13  6:12                     ` Xiao Yang
2016-09-13  7:08                     ` Zorro Lang
2016-09-14  1:43                       ` Xiao Yang
2016-09-14  2:52                       ` [PATCH v4] " Xiao Yang

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20160909122831.GD12847@dhcp12-143.nay.redhat.com \
    --to=zlang@redhat.com \
    --cc=fstests@vger.kernel.org \
    --cc=linux-xfs@vger.kernel.org \
    --cc=xfs@oss.sgi.com \
    --cc=yangx.jy@cn.fujitsu.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.