fsck.xfs proposed improvements

* fsck.xfs proposed improvements
       [not found] <mailman.0.1240318659.128675.xfs@oss.sgi.com>
@ 2009-04-21 14:23 ` Mike Ashton
  2009-04-21 22:09   ` Russell Cattelan
  0 siblings, 1 reply; 9+ messages in thread
From: Mike Ashton @ 2009-04-21 14:23 UTC (permalink / raw)
  To: xfs

Hello folks,

I've been using XFS as my filesystem of choice for many, many years
now and for all the years of, er, joy, I have encountered a few
difficulties with filesystem recovery after machine crashes/hard
reboots and so on.  Google confirms that I'm not alone in this.

You're all probably perfectly well aware that fsck.xfs is a shell
script that does nothing much, on the premise that XFS has a journal
and therefore doesn't suffer from the routine corruption of more
primitive filesystems.  However, I have found that the journal itself
is prone to corruption (bad clientid, and friends) on contemporary,
even enterprise class, hardware.  Now I don't doubt this is due to
stupidities in the underlying hardware - SATA disks' naughty
non-battery write caches or what have you - and XFS is not to blame,
but I feel we maybe need to be more pragmatic about these annoying
realities.

I'm also sure that this is not the first time this design decision has
been challenged, although a search of the list archives implies that
it hasn't been suggested in the forum.  Forgive me if I'm wrong there.

I'm here to make the case for fsck.xfs being enhanced to verify the
journal and invoke xfs_repair -L in the event that it's screwed.  Now,
I'm sure half of you just sprayed coffee at the screen and are already
firing up an angry reply, but bear with me.  Automatic filesystem
repair is a normal, everyday necessity.  It's what non-journaling
filesystems do all the time; the days of offering the sysadmin the
choice of whether to repair this inode count, or that dnode entry are
long gone.  A filesystem with a corrupted journal is no use to me; I'm
not going to be able to repair the journal.  All I'm going to do is
invoke xfs_repair -L and pray.  I'm happy for that, *as an option* (as
it is on all fsck invocations), to happen on boot without my
intervention.

I'd like that to happen.  I do not accept that fsck.xfs has a null
function.  The filesystem is kept consistent by the journal, but the
journal needs to be verified and the filesystem repaired otherwise.
Otherwise, fsck passes, mount fails, my computer doesn't boot and that
makes me a sad panda.  Thankfully this would be a pretty quick
operation - I'm sure there's a lot of cleverness that could be
incorporated into a binary fsck.xfs that could detect, report on and
repair all sorts of exciting situations, but you can even do it
primitively in shell by simply trying to mount it.  I've included an
example of what I mean at the end.

Hopefully, you'll give this some serious consideration.  I'm quite
sure this is going to end up being a bun-fight issue, but I'm in no
way implying that you didn't think about what you were doing when you
made the decision to make fsck.xfs do nothing.  I'm just asking that
you consider again whether it now needs to do something, because that
hasn't worked as a strategy, even if that is due to hardware
manufacturers cutting corners.

Thanks,
Mike.

#!/bin/sh -f
#
# Copyright (c) 2006 Silicon Graphics, Inc.  All Rights Reserved.
#

AUTO=false
while getopts ":aApy" c
do
        case $c in
        a|A|p|y)        AUTO=true;;
        esac
done
# The device is the last command-line argument.
eval DEV=\${$#}
if [ ! -e "$DEV" ]; then
        echo "$0: $DEV does not exist"
        exit 8          # fsck(8): operational error
fi
if $AUTO; then
        # A read-write initrd lets us create /mnt; when / is mounted
        # read-only directly, /mnt must already exist.
        mkdir -p /mnt
        if [ ! -d /mnt ]
        then
                echo no /mnt to test XFS journal recovery
                exit 0
        fi
        if mount -t xfs "$DEV" /mnt -o ro,norecovery
        then
                umount /mnt
                echo "$DEV is an xfs filesystem"
                if mount -t xfs "$DEV" /mnt
                then
                        echo "Recovery by journal successful"
                        umount /mnt
                else
                        echo "writable mount of $DEV failed - invoking xfs_repair"
                        xfs_repair -L "$DEV"
                fi
        else
                echo "$DEV appears not to be an xfs filesystem"
        fi
else
        echo "If you wish to check the consistency of an XFS filesystem or"
        echo "repair a damaged filesystem, see xfs_check(8) and xfs_repair(8)."
fi
exit 0
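
One thing the script above does not do is report the outcome to its
caller: it ends in "exit 0" however xfs_repair fared.  A minimal sketch
of mapping the result onto the conventional fsck(8) exit codes (0 = no
errors, 1 = errors corrected, 4 = errors left uncorrected) follows.  The
function name fsck_exit_code and its two arguments are invented for
illustration and are not part of xfsprogs.

```shell
#!/bin/sh
# Sketch only: translate the outcome of the mount test and the repair
# into the exit codes that fsck(8) documents for its helpers.
#   $1 = "ok" if the read-write test mount (log replay) succeeded
#   $2 = exit status of "xfs_repair -L" (ignored when $1 is "ok")
fsck_exit_code() {
        if [ "$1" = "ok" ]; then
                echo 0          # log replayed, nothing to repair
        elif [ "$2" -eq 0 ]; then
                echo 1          # repair ran and reported success
        else
                echo 4          # repair failed: errors left uncorrected
        fi
}
```

In the script this could be used as: xfs_repair -L "$DEV"; rc=$?;
exit "$(fsck_exit_code failed "$rc")".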

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

* Re: fsck.xfs proposed improvements
  2009-04-21 14:23 ` fsck.xfs proposed improvements Mike Ashton
@ 2009-04-21 22:09   ` Russell Cattelan
  2009-04-22  9:45     ` Mike Ashton
  0 siblings, 1 reply; 9+ messages in thread
From: Russell Cattelan @ 2009-04-21 22:09 UTC (permalink / raw)
  To: Mike Ashton; +Cc: xfs

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Mike Ashton wrote:
> Hello folks,
>
> I've been using XFS as my filesystem of choice for many, many years
> now and for all the years of, er, joy, I have encountered a few
> difficulties with filesystem recovery after machine crashes/hard
> reboots and so on.  Google confirms that I'm not alone in this.
>
> You're all probably perfectly well aware that fsck.xfs is a shell
> script that does nothing much, on the premise that XFS has a journal
> and therefore doesn't suffer from the routine corruption of more
> primitive filesystems.  However, I have found that the journal itself
> is prone to corruption (bad clientid, and friends) on contemporary,
> even enterprise class, hardware.  Now I don't doubt this is due to
> stupidities in the underlying hardware - SATA disks' naughty
> non-battery write caches or what have you - and XFS is not to blame,
> but I feel we maybe need to be more pragmatic about these annoying
> realities.
>
> I'm also sure that this is not the first time this design decision has
> been challenged, although a search of the list archives implies that
> it hasn't been suggested in the forum.  Forgive me if I'm wrong there.
>
> I'm here to make the case for fsck.xfs being enhanced to verify the
> journal and invoke xfs_repair -L in the event that it's screwed.  Now,
> I'm sure half of you just sprayed coffee at the screen and are already
> firing up an angry reply, but bear with me.  Automatic filesystem
> repair is a normal, everyday necessity.  It's what non-journaling
> filesystems do all the time; the days of offering the sysadmin the
> choice of whether to repair this inode count, or that dnode entry are
> long gone.  A filesystem with a corrupted journal is no use to me; I'm
> not going to be able to repair the journal.  All I'm going to do is
> invoke xfs_repair -L and pray.  I'm happy for that, *as an option* (
> as it is on all fsck invocations) to happen on boot without my
> intervention. 
>
> I'd like that to happen.  I do not accept that fsck.xfs has a null
> function.  The filesystem is kept consistent by the journal, but the
> journal needs to be verified and the filesystem repaired otherwise.
> Otherwise, fsck passes, mount fails, my computer doesn't boot and that
> makes me a sad panda.  Thankfully this would be a pretty quick
> operation - I'm sure there's a lot of cleverness that could be
> incorporated into a binary fsck.xfs that could detect, report on and
> repair all sorts of exciting situations, but you can even do it
> primitively in shell by simply trying to mount it.  I've included an
> example of what I mean at the end.
>
> Hopefully, you'll give this some serious consideration.  I'm quite
> sure this is going to end up being a bun-fight issue, but I'm in no
> way implying that you didn't think about what you were doing when you
> made the decision to make fsck.xfs do nothing.  I'm just asking that
> you consider again whether it now needs to do something, because that
> hasn't worked as a strategy, even if that is due to hardware
> manufacturers cutting corners.
Well, step back a bit: fsck.xfs exists simply to satisfy the initial
boot scripts that invoke fsck -t $fs_type.  The reason fsck.xfs does
nothing, and should continue to do nothing, is that by the time you
have access to the boot scripts and the fsck.xfs program, the root
filesystem has already been mounted.  That means the root filesystem
has successfully made it through either a clean mount or a log replay
mount, neither of which needs additional verification.

It would not be unreasonable to do what you are suggesting in an
initrd startup script, provided xfs_repair was included in the initrd
(which has size and library requirements).

This would probably be a matter of first implementing it and then
convincing the mkinitrd maintainers to add the support.
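
As a rough illustration of the initrd approach, the fragment below
tries the normal mount (which replays the log) and falls back to
xfs_repair -L only when that fails and the tool is actually present in
the initrd.  It is a sketch under assumptions: the function name, the
mount point passed in, and the MOUNT/REPAIR override variables are
invented here, and no particular mkinitrd implementation is implied.

```shell
#!/bin/sh
# Hypothetical initrd helper; MOUNT and REPAIR can be overridden for testing.
MOUNT=${MOUNT:-mount}
REPAIR=${REPAIR:-xfs_repair}

# mount_root_or_repair DEVICE MOUNTPOINT
# Returns 0 if the root filesystem ends up mounted, non-zero otherwise.
mount_root_or_repair() {
        dev=$1
        mnt=$2
        # Normal case: clean mount or successful log replay.
        if $MOUNT -t xfs -o ro "$dev" "$mnt"; then
                return 0
        fi
        # xfs_repair may have been left out of the initrd for size reasons.
        if ! command -v "$REPAIR" >/dev/null 2>&1; then
                echo "mount of $dev failed and $REPAIR is not available" >&2
                return 1
        fi
        echo "mount of $dev failed - zeroing log with $REPAIR -L" >&2
        $REPAIR -L "$dev" || return 1
        # Retry the mount once after the repair.
        $MOUNT -t xfs -o ro "$dev" "$mnt"
}
```

Zeroing the log discards whatever it contained, so a real hook would
probably want this behaviour to be opt-in rather than the default.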

- -Russell Cattelan





>
> Thanks,
> Mike.
>
> #!/bin/sh -f
> #
> # Copyright (c) 2006 Silicon Graphics, Inc.  All Rights Reserved.
> #
>
> AUTO=false
> while getopts ":aApy" c
> do
>         case $c in
>         a|A|p|y)        AUTO=true;;
>         esac
> done
> eval DEV=\${$#}
> if [ ! -e $DEV ]; then
>         echo "$0: $DEV does not exist"
>         exit 8
> fi
> if $AUTO; then
> # rw initrd should allow mkdir but direct mounting of / read-only, we require to have a /mnt already
>         mkdir -p /mnt
>         if [ ! -d /mnt ]
>         then
>                 echo no /mnt to test XFS journal recovery
>                 exit 0
>         fi
>         if mount -t xfs "$DEV" /mnt -o ro,norecovery
>         then
>                 umount /mnt
>                 echo "$DEV is an xfs filesystem"
>                 if mount -t xfs "$DEV" /mnt
>                 then
>                         echo "Recovery by journal successful"
>                         umount /mnt
>                 else
>                         echo "writable mount of $DEV failed - invoking xfs_repair"
>                         xfs_repair -L "$DEV"
>                 fi
>         else
>                 echo "$DEV appears not to be an xfs filesystem"
>         fi
> else
>         echo "If you wish to check the consistency of an XFS filesystem or"
>         echo "repair a damaged filesystem, see xfs_check(8) and xfs_repair(8)."
> fi
> exit 0

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.7 (Darwin)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFJ7kQeNRmM+OaGhBgRAr5CAJ9jIct6ae0NpY/VRazObuW2C3gKIwCfeItG
hF1kk8ymY6CwLg/N8pNlD1o=
=7hvb
-----END PGP SIGNATURE-----

* Re: fsck.xfs proposed improvements
  2009-04-21 22:09   ` Russell Cattelan
@ 2009-04-22  9:45     ` Mike Ashton
  2009-04-22 21:45       ` Andi Kleen
  0 siblings, 1 reply; 9+ messages in thread
From: Mike Ashton @ 2009-04-22  9:45 UTC (permalink / raw)
  To: xfs

On Tue, Apr 21, 2009 at 05:09:34PM -0500, Russell Cattelan wrote:

Hi Russell (and others reading), and thanks for your reply.

> Well step back a bit, fsck.xfs exists simply to satisfy the initial
> boot scripts that invoke fsck -t $fs_type.  The reason fsck.xfs
> does nothing and should continue to do nothing is that by the time
> you have access to the boot scripts and the fsck.xfs program the
> root filesystem has already been mounted. Which means the root file
> system has successfully made it through either a clean mount or a
> log replay mount, neither of which needs additional verification.

Now that's an interesting point; I hadn't seen it quite like that
before.  It's now very clear to me that there's a semantic
inconsistency between xfs and, say, ext2 in that the initial read only
mount of ext2 is more directly analogous to a read-only _norecovery_
mount of xfs.  The filesystem at that stage might be in an
inconsistent state, but there's an expectation that you'll be able to
read fsck (/xfs_repair) from it.  By handling the "fsck stage" at the
time of the initial read only mount, some fragility has been
introduced into the process.  The filesystem now only mounts if it's
in a consistent state (bad!), even though we've redefined what
"consistent" means to refer to journal integrity rather than the
underlying filesystem integrity (good).  With badly behaved hardware,
which seems prevalent, or any bugs which do get into xfs, we could
actually end up with xfs being less fault tolerant and less reliable
in general use than other filesystems, which would be a bit of a
shame.

> It would not be unreasonable to do what you are suggesting in an
> initrd startup script, provided xfs_repair was included in the
> initrd (which has size and library requirements).

I think we can do it on direct mounts, but only if we can get to the
bottom of readonly/norecovery semantics.  Obviously we don't
necessarily want readonly mounts to be non recovered by default
(although there is an argument for that).  

Would it be crazy to propose a filesystem flag to control which
default recovery behaviour a filesystem has?  A root filesystem isn't
mounted read-only except a) on boot and b) when being tinkered with by
someone competent, so I think it would be useful to be able to tell
such a filesystem that it shouldn't attempt journal recovery on
readonly mount, which would enable a meaningful use of a meaningful
fsck.  What do you think about that?

> This would probably be a matter of first implementing it and then
> convincing the mkinitrd maintainers to add the support.

I'm a bit out of my depth with the politics of that.  This would be a
different person for each distribution?

Mike.

* Re: fsck.xfs proposed improvements
  2009-04-22  9:45     ` Mike Ashton
@ 2009-04-22 21:45       ` Andi Kleen
  2009-04-23  8:49         ` Mike Ashton
  0 siblings, 1 reply; 9+ messages in thread
From: Andi Kleen @ 2009-04-22 21:45 UTC (permalink / raw)
  To: Mike Ashton; +Cc: xfs

Mike Ashton <mike@fysh.org> writes:

> With badly behaved hardware,
> which seem prevalent, or any bugs which do get into xfs we could
> actually end up with xfs being less fault tolerant and less reliable
> in general use than other filesystems, which would be a bit of a
> shame.

Most Linux file systems are not very fault tolerant in this sense;
e.g. on ext3 you have to press return and accept lots of scary
messages to get through fsck.

-Andi
 

-- 
ak@linux.intel.com -- Speaking for myself only.

* Re: fsck.xfs proposed improvements
  2009-04-22 21:45       ` Andi Kleen
@ 2009-04-23  8:49         ` Mike Ashton
  2009-04-23 12:45           ` Eric Sandeen
  0 siblings, 1 reply; 9+ messages in thread
From: Mike Ashton @ 2009-04-23  8:49 UTC (permalink / raw)
  To: Andi Kleen; +Cc: xfs

On Wed, Apr 22, 2009 at 11:45:11PM +0200, Andi Kleen wrote:
> Mike Ashton <mike@fysh.org> writes:
> 
> > With badly behaved hardware,
> > which seem prevalent, or any bugs which do get into xfs we could
> > actually end up with xfs being less fault tolerant and less reliable
> > in general use than other filesystems, which would be a bit of a
> > shame.
> 
> Most Linux file systems are not very fault tolerant in this sense;
> e.g. on ext3 you have to press return and accept lots of scary
> messages to get through fsck.

Perhaps, but anecdotally/subjectively I've never had an ext3-based
system fail to boot because I turned it off and on again.  I've had
this happen with xfs root filesystems about 15 times over the past few
years.  I'm getting to the point where I'm starting to question the
wisdom of choosing xfs for my systems - whether it's actually mature
enough for use in server environments - which given that it's the one
which ought to be a total no-brainer in this respect, is a worry.

I think even if I can't persuade you guys to make official
improvements, I've got enough information to make ad-hoc improvements
to my own systems, but I'm going to have a hard time on the advocacy
front.  xfs rocks, but a system is only as good as its last power cut
(or something).

I'm hopeful that my readonly/norecovery tuning idea might catch
someone's imagination, but we'll have to see.

Mike.

* Re: fsck.xfs proposed improvements
  2009-04-23  8:49         ` Mike Ashton
@ 2009-04-23 12:45           ` Eric Sandeen
       [not found]             ` <20090423141432.GC16600@fysh.org>
  0 siblings, 1 reply; 9+ messages in thread
From: Eric Sandeen @ 2009-04-23 12:45 UTC (permalink / raw)
  To: Mike Ashton; +Cc: Andi Kleen, xfs

Mike Ashton wrote:
> On Wed, Apr 22, 2009 at 11:45:11PM +0200, Andi Kleen wrote:
>> Mike Ashton <mike@fysh.org> writes:
>>
>>> With badly behaved hardware,
>>> which seem prevalent, or any bugs which do get into xfs we could
>>> actually end up with xfs being less fault tolerant and less reliable
>>> in general use than other filesystems, which would be a bit of a
>>> shame.
>> Most Linux file systems are not very fault tolerant in this sense;
>> e.g. on ext3 you have to press return and accept lots of scary
>> messages to get through fsck.
> 
> Perhaps, but anecdotally/subjectively I've never had a ext3 based
> system fail to boot because I turned it off and on again.  

<hand_wave> xfs log replay may be more sensitive... </hand_wave>

> I've had
> this happen with xfs root filesystems about 15 times over the past few
> years.  I'm getting to the point where I'm starting to question the
> wisdom of choosing xfs for my systems - whether it's actually mature
> enough for use in server environments - which given that it's the one
> which ought to be a total no-brainer in this respect, is a worry.

Server environments probably *normally* are in better shape for power
consistency, but still...

> I think even if I can't persuade you guys to make official
> improvements, I've got enough information to make ad-hoc improvements
> to my own systems, but I'm going to have a hard time on the advocacy
> front.  xfs rocks, but a system is only as good as its last power cut
> (or something).
> 
> I'm hopeful that my readonly/norecovery tuning idea might catch
> someone's imagination, but we'll have to see.

It certainly does sound like an interesting idea, but others' concerns
are relevant too.  The issues around how the root filesystem gets
mounted would need to be pretty clearly addressed.  Maybe you can spell
out your original proposal again, with updates to handle that issue?

(as an aside, there have been arguments in the past that readonly mounts
should not do recovery at all - i.e. "mount -o ro" doesn't just mean
that you can only read the filesystem, but that the mount will only ever
read the block device...)

-Eric

* Re: fsck.xfs proposed improvements
       [not found]             ` <20090423141432.GC16600@fysh.org>
@ 2009-04-23 14:35               ` Mike Ashton
  2009-04-23 16:19                 ` Russell Cattelan
  0 siblings, 1 reply; 9+ messages in thread
From: Mike Ashton @ 2009-04-23 14:35 UTC (permalink / raw)
  To: xfs

On Thu, Apr 23, 2009 at 07:45:25AM -0500, Eric Sandeen wrote:

> It certainly does sound like an interesting idea, but others' concerns
> are relevant too.  The issues around how the root filesystem gets
> mounted would need to be pretty clearly addressed.  Maybe you can spell
> out your original proposal again, with updates to handle that issue?
>
> (as an aside, there have been arguments in the past that readonly mounts
> should not do recovery at all - i.e. "mount -o ro" doesn't just mean
> that you can only read the filesystem, but that the mount will only ever
> read the block device...)

I propose firstly that that behaviour should be configurable by
per-filesystem tuning, making it possible to set a root filesystem to
default to norecovery on a read-only mount.  Then non-initrd mounting
of / should always succeed, getting us access to fsck.xfs.

I secondly, and I'm going for broke here, propose that
xfs_check/xfs_repair (as invocations, not the code!) should be
deprecated and both programs should be called fsck.xfs. When called
with that name, they would have the following (familiar) semantics:

fsck.xfs: verify journal integrity.
        If it's good, return "filesystem is clean" and exit.
        If it's bad, invoke xfs_clean behaviour

fsck.xfs -f: invoke xfs_clean behaviour even with a good journal

fsck.xfs -a: verify journal integrity.
        If it's good, return "filesystem is clean" and exit.
        If it's bad, invoke xfs_repair -L behaviour

(and so on)
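
The option handling proposed above can be sketched as a small decision
function.  This is only an illustration of the semantics as described:
it prints the action that would be taken instead of running anything,
the function name fsck_xfs_action is made up, and the "good"/"bad"
argument stands in for a journal check that does not exist as a
separate tool today.

```shell
#!/bin/sh
# Sketch of the proposed fsck.xfs semantics; prints the chosen action.
#   $1 = mode: "" (no option), "-f" or "-a"
#   $2 = "good" or "bad", the result of a hypothetical journal check
fsck_xfs_action() {
        case "$1:$2" in
        -f:*)   echo "xfs_clean behaviour" ;;      # forced, journal state ignored
        *:good) echo "filesystem is clean" ;;      # usable journal, nothing to do
        -a:bad) echo "xfs_repair -L behaviour" ;;  # unattended repair
        :bad)   echo "xfs_clean behaviour" ;;      # default when the journal is bad
        *)      echo "usage error"; return 16 ;;   # fsck(8) usage/syntax error code
        esac
}
```

For example, fsck_xfs_action -a bad prints "xfs_repair -L behaviour",
while fsck_xfs_action -a good prints "filesystem is clean".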

This makes fsck.xfs behave analogously to fsck.ext2 and friends, with
its clean and dirty flag.  The improvement xfs offers over ext2 in
this area is that a filesystem is clean not only if it was shut down
cleanly, but also if it was shut down uncleanly with a usable journal.
What it avoids is behaving worse than ext2, where fsck.xfs thinks
(incorrectly) that a filesystem repair will never be needed and gives
a filesystem that won't mount a clean bill of health.

With both these proposals implemented, both initrd and non-initrd boot
processes would correctly handle xfs filesystem checking, using the
xfs journal to give the current excellent general case performance but
provide a safe approach to corrupted journals, without the need for
specific xfs-related care from distribution maintainers.

Thanks,
Mike.

* Re: fsck.xfs proposed improvements
  2009-04-23 14:35               ` Mike Ashton
@ 2009-04-23 16:19                 ` Russell Cattelan
  2009-04-24  9:21                   ` Mike Ashton
  0 siblings, 1 reply; 9+ messages in thread
From: Russell Cattelan @ 2009-04-23 16:19 UTC (permalink / raw)
  To: Mike Ashton; +Cc: xfs

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Mike Ashton wrote:
> On Thu, Apr 23, 2009 at 07:45:25AM -0500, Eric Sandeen wrote:
>
>> It certainly does sound like an interesting idea, but others'
>> concerns are relevant too.  The issues around how the root
>> filesystem gets mounted would need to be pretty clearly
>> addressed.  Maybe you can spell out your original proposal again,
>> with updates to handle that issue?
>>
>> (as an aside, there have been arguments in the past that readonly
>> mounts should not do recovery at all - i.e. "mount -o ro" doesn't
>> just mean that you can only read the filesystem, but that the
>> mount will only ever read the block device...)
>
> I propose firstly that that behaviour should be configurable by per
>  filesystem tuning, making it possible to set a root filesystem to
> default to norecovery on a read-only mount.  Then non-initrd
> mounting of / should always succeed, getting us access to fsck.xfs.
>
Traditional thinking with a journaled filesystem has been that if
there is a dirty log then you do not want to risk mounting the
filesystem in an inconsistent state, thereby risking a system crash
or filesystem shutdown due to that inconsistent state.  By replaying
the log even on a read-only mount, the filesystem is brought back
into a known good state.

So there are risks in mounting without recovery, but I'm leaning
toward it being an acceptable risk in a single-user state that would
allow access to the root filesystem.
>
> I secondly, and I'm going for broke here, propose that
> xfs_check/xfs_repair (as invocations, not the code!) should be
> deprecated and both programs should be called fsck.xfs. When called
>  with that name, they would have the following (familiar)
> semantics:
Well, I wouldn't go that far; xfs_check is already a wrapper around
xfs_db, which is a very different animal from xfs_repair.
>
> fsck.xfs: verify journal integrity. If it's good, return
> "filesystem is clean" and exit. If it's bad, invoke xfs_clean
> behaviour
>
> fsck.xfs -f:   invoke xfs_clean behaviour even with a good journal
>
> fsck.xfs -a: verify journal integrity If it's good, return
> "filesystem is clean" and exit. If it's bad, invoke xfs_repair -L
> behaviour
>
> (and so on)
Well, again, step back: most of the time at boot the mount of root
succeeds, the log has been replayed and the fs is consistent.  I don't
think changing that to a mount -o norecovery all the time is a good
idea; that exposes every mount to a potentially inconsistent state
for the rare case that the log is corrupted.

So even if we do a norecovery mount and then drop the system into
single user due to a corrupted log, the only option at that point is
xfs_repair -L, which is not a recommended thing to do unless some
manual analysis is done and the inevitable data loss is understood.

It would be nice to eventually have an xfs_repair that could replay
the log from userspace, but that has not been implemented yet; it
would allow for a clean repair from userspace.  But again, if the log
is corrupted it may not be able to handle things any better than the
kernel log recovery.

Also, in the case of a mount -o norecovery with any subsequent repair
being done, it is probably best to reboot at that point to ensure
there is no bad FS data that may be in cache.
>
> This makes fsck.xfs behave analogously to fsck.ext2 and friends,
> with its clean and dirty flag.  The improvement xfs offers over
> ext2 in this area is that a filesystem is not only clean if shut
> down cleanly, but is also clean if shut down uncleanly but with a
> usable journal, but without behaving worse than ext2 by fsck.xfs
> thinking (incorrectly) that a filesystem repair will never be
> needed and giving a filesystem that won't mount a clean bill of
> health.
Given that xfs was supposed to be a fresh way of doing filesystems
compared with the traditional UFS-based filesystems, trying to make
xfs behave like ext2/ext* is not really a step forwards.

But I think some improvement could be made to provide a less painful
way of recovering a root fs that has a bad log.

>
> With both these proposals implemented, both initrd and non-initrd
> boot processes would correctly handle xfs filesystem checking,
> using the xfs journal to give the current excellent general case
> performance but provide a safe approach to corrupted journals,
> without the need for specific xfs-related care from distribution
> maintainers.
>
> Thanks, Mike.

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.7 (Darwin)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFJ8JUYNRmM+OaGhBgRAtllAJ9Ha2DGfoMalyjnfEggS0YhXL24BQCfZfuc
K5SglBMCSIIfzyjUsjFgTrE=
=fDCQ
-----END PGP SIGNATURE-----

* Re: fsck.xfs proposed improvements
  2009-04-23 16:19                 ` Russell Cattelan
@ 2009-04-24  9:21                   ` Mike Ashton
  0 siblings, 0 replies; 9+ messages in thread
From: Mike Ashton @ 2009-04-24  9:21 UTC (permalink / raw)
  To: xfs

On Thu, Apr 23, 2009 at 11:19:45AM -0500, Russell Cattelan wrote:

> Traditional thinking with a journaled filesystem has been that if
> there is a dirty log then you do not want to risk mounting the
> filesystem in an inconsistent state and thereby risking a system
> crash or file system shutdown due to that inconsistent state.

Although I don't think you're doing anything more dangerous than
mounting a non-fsck'd non-journaling filesystem read-only, which is
the traditional unix boot method when you're not using initrd, I do
accept that I've introduced a non-zero chance of a system crash in
situations where everything is fine.  I think I've thought of a
compromise.

I propose the addition of a new mount semantic, let's call it
"tryrecovery" for now, which will replay a log if possible or mount
the filesystem in an inconsistent state otherwise.  So you would mark
a filesystem as being a root fs, enabling this behaviour, and the
kernel's attempt to mount its root filesystem would invoke this
behaviour without the explicit knowledge of lilo, grub, kernel
parameters, etc.

I believe this would address both our concerns.  In the general case,
the behaviour will be as it is now: the journal is replayed, the root
filesystem is mounted in a known good state and there's no chance of
a crash.  But if everything's gone to hell, we allow fingers-crossed
access to the filesystem in order to get at the xfs_repair tool.
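
Until something like that exists in the kernel, the same fallback can
be approximated from userspace by chaining two mount options that do
exist today (ro, and ro,norecovery).  The sketch below is only that, an
approximation: "tryrecovery" is not a real mount option, and the
function name and the MOUNT override variable are invented for the
example.

```shell
#!/bin/sh
# Userspace approximation of the proposed "tryrecovery" behaviour.
# MOUNT can be overridden for testing.
MOUNT=${MOUNT:-mount}

# try_recovery_mount DEVICE MOUNTPOINT
# Prints "recovered" or "norecovery" to say which mount succeeded;
# returns non-zero if neither did.
try_recovery_mount() {
        # First choice: ordinary read-only mount, which replays the log.
        if $MOUNT -t xfs -o ro "$1" "$2" 2>/dev/null; then
                echo recovered
        # Fallback: skip log replay and accept a possibly inconsistent view.
        elif $MOUNT -t xfs -o ro,norecovery "$1" "$2"; then
                echo norecovery
        else
                return 1
        fi
}
```

A caller that sees "norecovery" knows the filesystem may be
inconsistent and can go straight to running xfs_repair from it.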

> Also in the case of a mount -o norecovery with any subsequent repair
> being done, it is probably best to reboot at that point to ensure
> there is no bad FS data that may be in cache.

A remount to read/write ought to invalidate any cache/buffers for
exactly that reason.

Cheers,
Mike.
