All of lore.kernel.org
 help / color / mirror / Atom feed
From: "Darrick J. Wong" <djwong@kernel.org>
To: Joseph Qi <jiangqi903@gmail.com>
Cc: Srikanth C S <srikanth.c.s@oracle.com>,
	linux-xfs@vger.kernel.org, darrick.wong@oracle.com,
	rajesh.sivaramasubramaniom@oracle.com, junxiao.bi@oracle.com,
	david@fromorbit.com
Subject: Re: [PATCH v3] fsck.xfs: mount/umount xfs fs to replay log before running xfs_repair
Date: Mon, 14 Nov 2022 14:59:02 -0800	[thread overview]
Message-ID: <Y3LINrI4HbuxjRn+@magnolia> (raw)
In-Reply-To: <190cba7f-30a0-4eb3-6587-5b42d7251868@gmail.com>

On Fri, Nov 11, 2022 at 02:24:25PM +0800, Joseph Qi wrote:
> Hi,
> 
> On 11/4/22 2:10 PM, Srikanth C S wrote:
> > After a recent data center crash, we had to recover root filesystems
> > on several thousands of VMs via a boot time fsck. Since these
> > machines are remotely manageable, support can inject the kernel
> > command line with 'fsck.mode=force fsck.repair=yes' to kick off
> > xfs_repair if the machine won't come up or if they suspect there
> > might be deeper issues with latent errors in the fs metadata, which
> > is what they did to try to get everyone running ASAP while
> > anticipating any future problems. But, fsck.xfs does not address the
> > journal replay in case of a crash.
> > 
> > fsck.xfs does xfs_repair -e if fsck.mode=force is set. It is
> > possible that when the machine crashes, the fs is in inconsistent
> > state with the journal log not yet replayed. This can drop the machine
> > into the rescue shell because xfs_fsck.sh does not know how to clean the
> > log. Since the administrator told us to force repairs, address the
> > deficiency by cleaning the log and rerunning xfs_repair.
> > 
> > Run xfs_repair -e when fsck.mode=force and repair=auto or yes.
> > Replay the logs only if fsck.mode=force and fsck.repair=yes. For
> > other option -fa and -f drop to the rescue shell if repair detects
> > any corruptions.
> > 
> > Signed-off-by: Srikanth C S <srikanth.c.s@oracle.com>
> > ---
> >  fsck/xfs_fsck.sh | 31 +++++++++++++++++++++++++++++--
> >  1 file changed, 29 insertions(+), 2 deletions(-)
> > 
> > diff --git a/fsck/xfs_fsck.sh b/fsck/xfs_fsck.sh
> > index 6af0f22..62a1e0b 100755
> > --- a/fsck/xfs_fsck.sh
> > +++ b/fsck/xfs_fsck.sh
> > @@ -31,10 +31,12 @@ repair2fsck_code() {
> >  
> >  AUTO=false
> >  FORCE=false
> > +REPAIR=false
> >  while getopts ":aApyf" c
> >  do
> >         case $c in
> > -       a|A|p|y)        AUTO=true;;
> > +       a|A|p)          AUTO=true;;
> > +       y)              REPAIR=true;;
> >         f)              FORCE=true;;
> >         esac
> >  done
> > @@ -64,7 +66,32 @@ fi
> >  
> >  if $FORCE; then
> >         xfs_repair -e $DEV
> > -       repair2fsck_code $?
> > +       error=$?
> > +       if [ $error -eq 2 ] && [ $REPAIR = true ]; then
> > +               echo "Replaying log for $DEV"
> > +               mkdir -p /tmp/repair_mnt || exit 1
> > +               for x in $(cat /proc/cmdline); do
> > +                       case $x in
> > +                               root=*)
> > +                                       ROOT="${x#root=}"
> > +                               ;;
> > +                               rootflags=*)
> > +                                       ROOTFLAGS="-o ${x#rootflags=}"
> > +                               ;;
> > +                       esac
> > +               done
> > +               test -b "$ROOT" || ROOT=$(blkid -t "$ROOT" -o device)
> > +               if [ $(basename $DEV) = $(basename $ROOT) ]; then
> > +                       mount $DEV /tmp/repair_mnt $ROOTFLAGS || exit 1
> > +               else
> > +                       mount $DEV /tmp/repair_mnt || exit 1
> > +               fi
> 
> If do normal boot, it will try to mount according to fstab.
> So in the crash case you've described, it seems that it can't mount
> successfully? Or am I missing something?

Yes, we're assuming that support has injected the magic command lines
into the bootloader to trigger xfs_repair after boot failed due to a
bad/corrupt rootfs.

--D

> Thanks,
> Joseph
> 
> > +               umount /tmp/repair_mnt
> > +               xfs_repair -e $DEV
> > +               error=$?
> > +               rm -d /tmp/repair_mnt
> > +       fi
> > +       repair2fsck_code $error
> >         exit $?
> >  fi
> >  

  reply	other threads:[~2022-11-14 22:59 UTC|newest]

Thread overview: 10+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-11-04  6:10 [PATCH v3] fsck.xfs: mount/umount xfs fs to replay log before running xfs_repair Srikanth C S
2022-11-07  6:00 ` Gao Xiang
2022-11-07 16:55   ` Darrick J. Wong
2022-11-07 17:24     ` Gao Xiang
2022-11-11  6:24 ` Joseph Qi
2022-11-14 22:59   ` Darrick J. Wong [this message]
     [not found] <NdSU2Rq0FpWJ3II4JAnJNk-0HW5bns_UxhQ03sSOaek-nu9QPA-ZMx0HDXFtVx8ahgKhWe0Wcfh13NH0ZSwJjg==@protonmail.internalid>
2022-11-23  6:30 ` Srikanth C S
2022-11-23  8:36   ` Carlos Maiolino
2022-12-12 12:13   ` Carlos Maiolino
2022-12-13  9:39   ` Carlos Maiolino

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=Y3LINrI4HbuxjRn+@magnolia \
    --to=djwong@kernel.org \
    --cc=darrick.wong@oracle.com \
    --cc=david@fromorbit.com \
    --cc=jiangqi903@gmail.com \
    --cc=junxiao.bi@oracle.com \
    --cc=linux-xfs@vger.kernel.org \
    --cc=rajesh.sivaramasubramaniom@oracle.com \
    --cc=srikanth.c.s@oracle.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.