From: "Darrick J. Wong" <darrick.wong@oracle.com>
To: Viacheslav Dubeyko <slava@dubeyko.com>
Cc: lsf-pc@lists.linux-foundation.org, linux-fsdevel@vger.kernel.org,
	linux-xfs@vger.kernel.org, Vyacheslav.Dubeyko@wdc.com
Subject: Re: [LSF/MM TOPIC] online filesystem repair
Date: Mon, 16 Jan 2017 22:24:53 -0800
Message-ID: <20170117062453.GJ14038@birch.djwong.org>
In-Reply-To: <1484524890.27533.16.camel@dubeyko.com>

On Sun, Jan 15, 2017 at 04:01:30PM -0800, Viacheslav Dubeyko wrote:
> On Fri, 2017-01-13 at 23:54 -0800, Darrick J. Wong wrote:
> > Hi,
> > 
> > I've been working on implementing online metadata scrubbing and repair
> > in XFS.  Most of the code is self contained inside XFS, but there's a
> > small amount of interaction with the VFS freezer code that has to happen
> > in order to shut down the filesystem to rebuild the extent backref
> > records.  It might be interesting to discuss the (fairly slight)
> > requirements upon the VFS to support repairs, and/or have a BoF to
> > discuss how to build an online checker if any of the other filesystems
> > are interested in this.
> > 
> 
> How do you imagine a generic way to support repairs for different file
> systems? From one point of view, a generic way of repairing file systems
> online could be a really great subsystem.

I don't, sadly.  There's not even a way to /check/ all fs metadata in a
"generic" manner -- we can use the standard VFS interfaces to read
all metadata, but this is fraught.  Even if we assume the fs can
spot-check obviously garbage values, that's still not the appropriate place
for a full scan.

> But, from another point of view, every file system has its own
> architecture, its own set of metadata, and its own way of doing fsck
> checking/recovery.

Yes, and this wouldn't change.  The particular mechanism of fixing a
piece of metadata will always be fs-dependent, but the thing that I'm
interested in discussing is how do we avoid having these kinds of things
interact badly with the VFS?

> As far as I can judge, there is a significant amount of research
> effort in this direction (Recon [1], [2], for example).

Yes, I remember Recon.  I appreciated the insight that while it's
impossible to block everything for a full scan, it /is/ possible to
check a single object and its relation to other metadata items.  The xfs
scrubber also takes an incremental approach to verifying a filesystem;
we'll lock each metadata object and verify that its relationships with
the other metadata make sense.  So long as we aren't bombarding the fs
with heavy metadata update workloads, of course.
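
As a rough illustration of the per-object approach described above, here is a toy Python sketch. It is not the xfs scrubber: the MetaObject class, the name-based cross-references, and the in-memory index are all hypothetical stand-ins, and a real checker would verify far more than dangling references.

```python
import threading
from dataclasses import dataclass, field


@dataclass
class MetaObject:
    """Hypothetical metadata object: a name, cross-references, and a lock."""
    name: str
    refs: list = field(default_factory=list)  # names of objects this one points to
    lock: threading.Lock = field(default_factory=threading.Lock)


def scrub_one(obj, index):
    """Lock a single object and return the references that do not resolve."""
    with obj.lock:
        return [r for r in obj.refs if r not in index]


def scrub_all(index):
    """Check objects one at a time instead of blocking the whole filesystem."""
    problems = {}
    for name in sorted(index):
        bad = scrub_one(index[name], index)
        if bad:
            problems[name] = bad
    return problems


index = {
    "inode 128": MetaObject("inode 128", refs=["extent 10"]),
    "inode 129": MetaObject("inode 129", refs=["extent 99"]),  # dangling reference
    "extent 10": MetaObject("extent 10"),
}
print(scrub_all(index))
```

Only one object's lock is held at any moment, so other activity on the remaining objects is not blocked while the scan proceeds.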

On the repair side of things, xfs added reverse-mapping records, which
the repair code uses to regenerate damaged primary metadata.  After we
land inode parent pointers we'll be able to do the same reconstructions
that we can now do for block allocations...
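
To make the idea of regenerating primary metadata from reverse mappings concrete, here is a minimal Python sketch. The Rmap record layout and rebuild_block_maps() are assumptions made up for illustration; they are not the on-disk xfs rmap format or the actual repair code.

```python
from collections import defaultdict, namedtuple

# Hypothetical reverse-mapping record: a physical extent plus who owns it
# and where it sits in the owner's file.
Rmap = namedtuple("Rmap", "start_block length owner file_offset")


def rebuild_block_maps(rmaps):
    """Regenerate each owner's forward map (file offset -> physical extent)."""
    maps = defaultdict(list)
    for r in rmaps:
        maps[r.owner].append((r.file_offset, r.start_block, r.length))
    # Sort each owner's extents by file offset, as a forward map would be.
    return {owner: sorted(extents) for owner, extents in maps.items()}


rmaps = [
    Rmap(start_block=200, length=4, owner=128, file_offset=8),
    Rmap(start_block=100, length=8, owner=128, file_offset=0),
    Rmap(start_block=300, length=2, owner=129, file_offset=0),
]
forward = rebuild_block_maps(rmaps)
print(forward)
```

The point of the sketch is only that the reverse records carry enough information (owner and offset) to reconstruct the forward mapping if it is damaged.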

...but there are some sticky problems with repairing the reverse
mappings.  The normal locking order for that part of xfs is sb_writers
-> inode -> ag header -> rmap btree blocks, but to repair we have to
freeze the filesystem against writes so that we can scan all the inodes.
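
The ordering problem can be sketched with a toy lock-order checker in Python. The lock names come from the order given above; the "naive repair" sequence is an assumption about what a rebuild would look like if it held the AG header and rmap btree and then went looking at inodes, and first_inversion() is a hypothetical helper, not kernel code.

```python
# Normal locking order for this part of xfs, as stated above.
LOCK_ORDER = ["sb_writers", "inode", "ag header", "rmap btree blocks"]
RANK = {name: i for i, name in enumerate(LOCK_ORDER)}


def first_inversion(acquisitions):
    """Return the first (held, wanted) pair that violates LOCK_ORDER, or None."""
    highest = None
    for name in acquisitions:
        if highest is not None and RANK[name] < RANK[highest]:
            return (highest, name)
        if highest is None or RANK[name] > RANK[highest]:
            highest = name
    return None


normal = ["sb_writers", "inode", "ag header", "rmap btree blocks"]
# Assumed shape of a naive rebuild: hold the AG metadata, then scan inodes.
naive_repair = ["ag header", "rmap btree blocks", "inode"]

print(first_inversion(normal))
print(first_inversion(naive_repair))
```

Under that assumption the rebuild takes inode locks after AG-level locks, which is the reverse of the normal order; freezing the filesystem against writes keeps other writers out while the scan runs.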

> But we still don't have any real general online file system repair
> subsystem in the Linux kernel.

I think the ocfs2 developers have encoded some ability to repair
metadata over the past year, though it seems limited to fixing some
parts of inodes.  btrfs stores duplicate copies and restores when
necessary, I think.  Unfortunately, fixing disk corruption is something
that's not easily genericized, which means that I don't think we'll ever
achieve a general subsystem.

But we could at least figure out what in the VFS has to change (if
anything) to support this type of usage.

> Do you have some new insight? What's different about your
> vision? If we have an online file system repair subsystem, then how
> will a file system driver need to interact with it in order to do
> internal repairs?

It's pretty much all private xfs userspace ioctls[1] with a driver
program[2].

--D

[1] https://git.kernel.org/cgit/linux/kernel/git/djwong/xfs-linux.git/log/?h=djwong-devel
[2] https://git.kernel.org/cgit/linux/kernel/git/djwong/xfsprogs-dev.git/log/?h=djwong-devel

> 
> Thanks,
> Vyacheslav Dubeyko.
> 
> [1] http://www.eecg.toronto.edu/~ashvin/publications/recon-fs-consistency-runtime.pdf
> [2] https://www.researchgate.net/publication/269300836_Managing_the_file_system_from_the_kernel
> 

