* xfs_scrub: call for testing
@ 2018-02-02 21:36 Eric Sandeen
  2018-02-02 21:51 ` Darrick J. Wong
                   ` (3 more replies)
  0 siblings, 4 replies; 14+ messages in thread
From: Eric Sandeen @ 2018-02-02 21:36 UTC
  To: linux-xfs

Hey all - 

Darrick's done a great job with landing the xfs_scrub code in
upstream kernel v4.15, and now merged on the for-next branch of
xfsprogs to be released in xfsprogs-4.15.0.

As with any big new body of code, there might be some rough
edges despite best efforts.  It'd be great to have people do
some testing at this semi-early stage.

The 10,000ft overview is that the new xfs_scrub command can
/validate/ a lot of what's on disk while the filesystem
is mounted; and the ability to repair will come in the future.

For now, with the 4.15 kernel, functionality is limited to
"scrubbing" meaning that it will simply check for consistency;
in 4.15 there is no facility to repair or optimize/preen the
filesystem.

I'd really value feedback on scrub as it stands at this point -
Is the documentation clear?  Is the output correct?  Do the
tool's arguments make sense?  Does it segfault?  Does it
find real errors?  Does it crash your kernel? Does it
eat your data?

(haha no it won't eat your data)
((haha no can't promise that with 100% certainty))

Thanks,
-Eric

* Re: xfs_scrub: call for testing
  2018-02-02 21:36 xfs_scrub: call for testing Eric Sandeen
@ 2018-02-02 21:51 ` Darrick J. Wong
  2018-02-05 15:10 ` Emmanuel Florac
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 14+ messages in thread
From: Darrick J. Wong @ 2018-02-02 21:51 UTC
  To: Eric Sandeen; +Cc: linux-xfs

On Fri, Feb 02, 2018 at 03:36:33PM -0600, Eric Sandeen wrote:
> Hey all - 
> 
> Darrick's done a great job with landing the xfs_scrub code in
> upstream kernel v4.15, and now merged on the for-next branch of
> xfsprogs to be released in xfsprogs-4.15.0.
> 
> As with any big new body of code, there might be some rough
> edges despite best efforts.  It'd be great to have people do
> some testing at this semi-early stage.
> 
> The 10,000ft overview is that the new xfs_scrub command can
> /validate/ a lot of what's on disk while the filesystem
> is mounted; and the ability to repair will come in the future.
> 
> For now, with the 4.15 kernel, functionality is limited to
> "scrubbing" meaning that it will simply check for consistency;
> in 4.15 there is no facility to repair or optimize/preen the
> filesystem.

FWIW the 4.16 kernel enhances scrub to cross-reference metadata
structures against each other for stronger checking (not to mention
picking up a pile of bug fixes), so eventually you'll want to move on to
that for testing.  Of course we're not even at -rc1 yet, so meh. :)

Longer term, I have also landed the dangerous_repair xfstest group, which
uses xfs_db to fuzz every field in a filesystem to see whether scrub
complains and whether xfs_repair does something about it.  Right now it's
a bit of a mess because it trips over scrub/repair not complaining about
fields that have no bad values (think inode timestamps), but I'm working
on a larger analysis to triage known failures and fix the things that
the tools should catch but don't.

> I'd really value feedback on scrub as it stands at this point -
> Is the documentation clear?  Is the output correct?  Do the
> tool's arguments make sense?  Does it segfault?  Does it
> find real errors?  Does it crash your kernel? Does it
> eat your data?

Yes, please look at those things!


Thanks to Eric for reviewing all the userspace patches, and Dave and
Brian for reviewing all the kernel patches!

--D

> (haha no it won't eat your data)
> ((haha no can't promise that with 100% certainty))
> 
> Thanks,
> -Eric
> --
> To unsubscribe from this list: send the line "unsubscribe linux-xfs" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

* Re: xfs_scrub: call for testing
  2018-02-02 21:36 xfs_scrub: call for testing Eric Sandeen
  2018-02-02 21:51 ` Darrick J. Wong
@ 2018-02-05 15:10 ` Emmanuel Florac
  2018-02-05 15:49   ` Eric Sandeen
  2018-02-15 18:18 ` Emmanuel Florac
  2018-04-02  0:10 ` Chris Murphy
  3 siblings, 1 reply; 14+ messages in thread
From: Emmanuel Florac @ 2018-02-05 15:10 UTC
  To: Eric Sandeen; +Cc: linux-xfs

On Fri, 2 Feb 2018 15:36:33 -0600,
Eric Sandeen <sandeen@sandeen.net> wrote:

> I'd really value feedback on scrub as it stands at this point -
> Is the documentation clear?  Is the output correct?  Do the
> tool's arguments make sense?  Does it segfault?  Does it
> find real errors?  Does it crash your kernel? Does it
> eat your data?

Wouldn't it be better to remove the parts about repairing the
filesystem from the documentation? The man page states that it *can't*
repair the filesystem, but nonetheless explains under which
circumstances it *won't* be able to repair (in some theoretical future
version with repair capabilities, I suppose). Ditto for the -n and -y
options: I suppose they're both basically no-ops at the moment? It's
quite unclear what they actually do.

Regarding FITRIM for flash storage, I think most people refer to it as
"TRIM", not by the ioctl name FITRIM. Using "TRIM" would probably be more
understandable IMO.

Because I'm such a funny boy, I just wanted to see what happens when
running xfs_scrub on an unsupported kernel. On both a 4.14.x and a
3.18.x it seems about right:

root@bareos16:~# ./xfs_scrub /mnt/raid/
EXPERIMENTAL xfs_scrub program in use! Use at your own risk!
Error: /mnt/raid: Kernel metadata scrubbing facility is not available.
Info: /mnt/raid: Scrub aborted after phase 1.
/mnt/raid: 2 errors found.

I don't have any system running 4.15 to test its effects, but I'll do
so as soon as possible.

-- 
------------------------------------------------------------------------
Emmanuel Florac     |   Direction technique
                    |   Intellique
                    |	<eflorac@intellique.com>
                    |   +33 1 78 94 84 02
------------------------------------------------------------------------

* Re: xfs_scrub: call for testing
  2018-02-05 15:10 ` Emmanuel Florac
@ 2018-02-05 15:49   ` Eric Sandeen
  2018-02-05 16:44     ` Darrick J. Wong
  2018-02-05 17:08     ` Emmanuel Florac
  0 siblings, 2 replies; 14+ messages in thread
From: Eric Sandeen @ 2018-02-05 15:49 UTC
  To: Emmanuel Florac; +Cc: linux-xfs


On 2/5/18 9:10 AM, Emmanuel Florac wrote:
> On Fri, 2 Feb 2018 15:36:33 -0600,
> Eric Sandeen <sandeen@sandeen.net> wrote:
> 
>> I'd really value feedback on scrub as it stands at this point -
>> Is the documentation clear?  Is the output correct?  Do the
>> tool's arguments make sense?  Does it segfault?  Does it
>> find real errors?  Does it crash your kernel? Does it
>> eat your data?
> 
> Wouldn't it be better to remove the parts about repairing the
> filesystem from the documentation? The man page states that it *can't*
> repair the filesystem, but nonetheless explains under which
> circumstances it *won't* be able to repair (in some theoretical future
> version with repair capabilities, I suppose). Ditto for the -n and -y
> options: I suppose they're both basically no-ops at the moment? It's
> quite unclear what they actually do.

I'll take another look at the manpage.  The userspace tool today /can/
do some degree of optimization or repair if the kernel supports it,
so I was reluctant to suggest removing all such language.

So, "-n" is not a no-op, it's a check-only ("scrub") pass vs. the default
no-argument action of "optimizing," or the extra -y action which would repair.
If that's not all clear, I'd appreciate suggestions to clean it up.

> Regarding FITRIM for flash storage, I think most people refer to it as
> "TRIM", not by the ioctl name FITRIM. Using "TRIM" would probably be more
> understandable IMO.

Fair point, thanks.

> Because I'm such a funny boy, I just wanted to see what happens when
> running xfs_scrub on an unsupported kernel. On both a 4.14.x and a
> 3.18.x it seems about right:
> 
> root@bareos16:~# ./xfs_scrub /mnt/raid/
> EXPERIMENTAL xfs_scrub program in use! Use at your own risk!
> Error: /mnt/raid: Kernel metadata scrubbing facility is not available.
> Info: /mnt/raid: Scrub aborted after phase 1.
> /mnt/raid: 2 errors found.

Yup.  TBH I'm not a fan of listing "your kernel can't scrub" as
"errors found."  I think we should find a way around that.

> I don't have any system running 4.15 to test its effects, but I'll do
> so as soon as possible.

Cool, thanks.

-Eric


* Re: xfs_scrub: call for testing
  2018-02-05 15:49   ` Eric Sandeen
@ 2018-02-05 16:44     ` Darrick J. Wong
  2018-02-05 16:55       ` Eric Sandeen
  2018-02-05 17:08     ` Emmanuel Florac
  1 sibling, 1 reply; 14+ messages in thread
From: Darrick J. Wong @ 2018-02-05 16:44 UTC
  To: Eric Sandeen; +Cc: Emmanuel Florac, linux-xfs

On Mon, Feb 05, 2018 at 09:49:41AM -0600, Eric Sandeen wrote:
> On 2/5/18 9:10 AM, Emmanuel Florac wrote:
> > On Fri, 2 Feb 2018 15:36:33 -0600,
> > Eric Sandeen <sandeen@sandeen.net> wrote:
> > 
> >> I'd really value feedback on scrub as it stands at this point -
> >> Is the documentation clear?  Is the output correct?  Do the
> >> tool's arguments make sense?  Does it segfault?  Does it
> >> find real errors?  Does it crash your kernel? Does it
> >> eat your data?
> > 
> > Wouldn't it be better to remove the parts about repairing the
> > filesystem from the documentation? The man page states that it *can't*
> > repair the filesystem, but nonetheless explains under which
> > circumstances it *won't* be able to repair (in some theoretical future
> > version with repair capabilities, I suppose). Ditto for the -n and -y
> > options: I suppose they're both basically no-ops at the moment? It's
> > quite unclear what they actually do.
> 
> I'll take another look at the manpage.  The userspace tool today /can/
> do some degree of optimization or repair if the kernel supports it,
> so I was reluctant to suggest removing all such language.

Yes.  If check doesn't find any errors and we're in preen or repair mode
then we can trim the free space.  They're not completely no-op...

> So, "-n" is not a no-op, it's a check-only ("scrub") pass vs. the default
> no-argument action of "optimizing," or the extra -y action which would repair.
> If that's not all clear, I'd appreciate suggestions to clean it up.

-n	Only check filesystem metadata.  Do not repair or optimize
	anything.

-y	Check filesystem metadata and try to repair errors.  If the
	errors cannot be fixed online, the filesystem must be taken
	offline and repaired with xfs_repair(8).
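
To make the three behaviors under discussion concrete, here is a minimal Python sketch of how the flags map to modes, following the description quoted above: -n is check-only, no flag is the default "optimize" pass, and -y also attempts repairs. The function name `scrub_mode` and the mode strings are hypothetical and are not taken from the xfs_scrub sources; rejecting -n combined with -y is likewise an assumption.

```python
import argparse


def scrub_mode(argv):
    """Map command-line flags to a mode name (hypothetical helper)."""
    parser = argparse.ArgumentParser(prog="scrub-mode-sketch")
    # Assumption: -n and -y contradict each other, so refuse both at once.
    group = parser.add_mutually_exclusive_group()
    group.add_argument("-n", dest="check_only", action="store_true",
                       help="only check metadata; do not repair or optimize")
    group.add_argument("-y", dest="repair", action="store_true",
                       help="check metadata and try to repair errors")
    parser.add_argument("mountpoint")
    args = parser.parse_args(argv)

    if args.check_only:
        return "check"
    if args.repair:
        return "repair"
    return "optimize"  # default when neither flag is given


print(scrub_mode(["-n", "/mnt/raid"]))   # check
print(scrub_mode(["/mnt/raid"]))         # optimize
print(scrub_mode(["-y", "/mnt/raid"]))   # repair
```

On a 4.15 kernel only the "check" mode has kernel support, which is why a run without -n aborts in phase 1.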

> > Regarding FITRIM for flash storage, I think most people refer to it as
> > "TRIM", not by the ioctl name FITRIM. Using "TRIM" would probably be more
> > understandable IMO.
> 
> Fair point, thanks.
> 
> > Because I'm such a funny boy, I just wanted to see what happens when
> > running xfs_scrub on an unsupported kernel. On both a 4.14.x and a
> > 3.18.x it seems about right:
> > 
> > root@bareos16:~# ./xfs_scrub /mnt/raid/
> > EXPERIMENTAL xfs_scrub program in use! Use at your own risk!
> > Error: /mnt/raid: Kernel metadata scrubbing facility is not available.
> > Info: /mnt/raid: Scrub aborted after phase 1.
> > /mnt/raid: 2 errors found.
> 
> Yup.  TBH I'm not a fan of listing "your kernel can't scrub" as
> "errors found."  I think we should find a way around that.

What do you mean by that?

--D

> 
> > I don't have any system running 4.15 to test its effects, but I'll do
> > so as soon as possible.
> 
> Cool, thanks.
> 
> -Eric
> 




* Re: xfs_scrub: call for testing
  2018-02-05 16:44     ` Darrick J. Wong
@ 2018-02-05 16:55       ` Eric Sandeen
  2018-02-05 22:40         ` Darrick J. Wong
  0 siblings, 1 reply; 14+ messages in thread
From: Eric Sandeen @ 2018-02-05 16:55 UTC
  To: Darrick J. Wong; +Cc: Emmanuel Florac, linux-xfs



On 2/5/18 10:44 AM, Darrick J. Wong wrote:
>>> root@bareos16:~# ./xfs_scrub /mnt/raid/
>>> EXPERIMENTAL xfs_scrub program in use! Use at your own risk!
>>> Error: /mnt/raid: Kernel metadata scrubbing facility is not available.
>>> Info: /mnt/raid: Scrub aborted after phase 1.
>>> /mnt/raid: 2 errors found.
>> Yup.  TBH I'm not a fan of listing "your kernel can't scrub" as
>> "errors found."  I think we should find a way around that.
> What do you mean by that?
> 
> --D
> 

When people run a tool like scrub and they see "errors found"
at the end of the run, I think it's very easy to have that register
as "filesystem errors found" which is not the case here.

If a kernel can't do scrub at all, "2 errors found" is a confusing
message - I'd rather find a way to not conflate filesystem
errors with operational errors (or simply missing capabilities).

As a first cut I might suggest that if required capabilities for
the requested action were not found, we should not even print
the "errors found" line, just the informational text.  i.e.
just reset the error counters in that specific case before exit.
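
A minimal Python sketch of that first cut, assuming filesystem corruption and operational problems are tracked separately and the "errors found" line is printed only for the former. The `ScrubReport` class and its method names are hypothetical illustrations, not xfs_scrub's actual structure.

```python
from dataclasses import dataclass, field


@dataclass
class ScrubReport:
    """Hypothetical report object; not xfs_scrub's real data structure."""
    mountpoint: str
    corruption_errors: int = 0
    runtime_notes: list = field(default_factory=list)

    def record_corruption(self):
        self.corruption_errors += 1

    def record_runtime_problem(self, message):
        # Missing kernel support and similar conditions are informational,
        # so they never touch the corruption counter.
        self.runtime_notes.append(message)

    def summary_lines(self):
        lines = [f"Info: {self.mountpoint}: {note}" for note in self.runtime_notes]
        # Only real filesystem errors produce the "errors found" line.
        if self.corruption_errors:
            lines.append(f"{self.mountpoint}: {self.corruption_errors} errors found.")
        return lines


report = ScrubReport("/mnt/raid")
report.record_runtime_problem("Kernel metadata scrubbing facility is not available.")
report.record_runtime_problem("Scrub aborted after phase 1.")
for line in report.summary_lines():
    print(line)
```

With this split, the unsupported-kernel case prints two informational lines and no error count.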

-Eric

* Re: xfs_scrub: call for testing
  2018-02-05 15:49   ` Eric Sandeen
  2018-02-05 16:44     ` Darrick J. Wong
@ 2018-02-05 17:08     ` Emmanuel Florac
  2018-02-05 22:39       ` Darrick J. Wong
  1 sibling, 1 reply; 14+ messages in thread
From: Emmanuel Florac @ 2018-02-05 17:08 UTC
  To: Eric Sandeen; +Cc: linux-xfs

On Mon, 5 Feb 2018 09:49:41 -0600,
Eric Sandeen <sandeen@sandeen.net> wrote:

> > 
> > Wouldn't it be better to remove the parts about repairing the
> > filesystem from the documentation? The man page states that it *can't*
> > repair the filesystem, but nonetheless explains under which
> > circumstances it *won't* be able to repair (in some theoretical
> > future version with repair capabilities, I suppose). Ditto for the
> > -n and -y options: I suppose they're both basically no-ops at the
> > moment? It's quite unclear what they actually do.
> 
> I'll take another look at the manpage.  The userspace tool today /can/
> do some degree of optimization or repair if the kernel supports it,
> so I was reluctant to suggest removing all such language.
> 
> So, "-n" is not a no-op, it's a check-only ("scrub") pass vs. the
> default no-argument action of "optimizing," or the extra -y action
> which would repair. If that's not all clear, I'd appreciate
> suggestions to clean it up.
> 

Now I'm wondering: is the default action of "optimizing" really
useful? Wouldn't it be better to simply have a check-only (-n) mode,
and a full-fledged mode when given no argument?
Or maybe do a simple optimisation, optionally, when given the '-y' (or
some other) flag?

I say that after having a look at the man pages of some comparable
utilities, namely xfs_repair, "btrfs scrub" and "zpool scrub", which all
default to "full operation" without options.

-- 
------------------------------------------------------------------------
Emmanuel Florac     |   Direction technique
                    |   Intellique
                    |	<eflorac@intellique.com>
                    |   +33 1 78 94 84 02
------------------------------------------------------------------------

* Re: xfs_scrub: call for testing
  2018-02-05 17:08     ` Emmanuel Florac
@ 2018-02-05 22:39       ` Darrick J. Wong
  0 siblings, 0 replies; 14+ messages in thread
From: Darrick J. Wong @ 2018-02-05 22:39 UTC
  To: Emmanuel Florac; +Cc: Eric Sandeen, linux-xfs

On Mon, Feb 05, 2018 at 06:08:07PM +0100, Emmanuel Florac wrote:
> On Mon, 5 Feb 2018 09:49:41 -0600,
> Eric Sandeen <sandeen@sandeen.net> wrote:
> 
> > > 
> > > Wouldn't it be better to remove the parts about repairing the
> > > filesystem from the documentation? The man page states that it *can't*
> > > repair the filesystem, but nonetheless explains under which
> > > circumstances it *won't* be able to repair (in some theoretical
> > > future version with repair capabilities, I suppose). Ditto for the
> > > -n and -y options: I suppose they're both basically no-ops at the
> > > moment? It's quite unclear what they actually do.
> > 
> > I'll take another look at the manpage.  The userspace tool today /can/
> > do some degree of optimization or repair if the kernel supports it,
> > so I was reluctant to suggest removing all such language.
> > 
> > So, "-n" is not a no-op, it's a check-only ("scrub") pass vs. the
> > default no-argument action of "optimizing," or the extra -y action
> > which would repair. If that's not all clear, I'd appreciate
> > suggestions to clean it up.
> > 
> 
> Now I'm wondering: is the default action of "optimizing" really
> useful? Wouldn't it be better to simply have a check-only (-n) mode,
> and a full-fledged mode when given no argument?
> Or maybe do a simple optimisation, optionally, when given the '-y' (or
> some other) flag?
> 
> I say that after having a look at the man pages of some comparable
> utilities, namely xfs_repair, "btrfs scrub" and "zpool scrub", which all
> default to "full operation" without options.

I don't care /that/ much about what 'zpool scrub' does, but I do see
your point that from the admin's perspective either we fix everything or
we don't, so there's no need for a -y and we can do what repair does (-n
means dry run, lack of -n means fix it).

--D

> 
> -- 
> ------------------------------------------------------------------------
> Emmanuel Florac     |   Direction technique
>                     |   Intellique
>                     |	<eflorac@intellique.com>
>                     |   +33 1 78 94 84 02
> ------------------------------------------------------------------------



* Re: xfs_scrub: call for testing
  2018-02-05 16:55       ` Eric Sandeen
@ 2018-02-05 22:40         ` Darrick J. Wong
  0 siblings, 0 replies; 14+ messages in thread
From: Darrick J. Wong @ 2018-02-05 22:40 UTC
  To: Eric Sandeen; +Cc: Emmanuel Florac, linux-xfs

On Mon, Feb 05, 2018 at 10:55:22AM -0600, Eric Sandeen wrote:
> 
> 
> On 2/5/18 10:44 AM, Darrick J. Wong wrote:
> >>> root@bareos16:~# ./xfs_scrub /mnt/raid/
> >>> EXPERIMENTAL xfs_scrub program in use! Use at your own risk!
> >>> Error: /mnt/raid: Kernel metadata scrubbing facility is not available.
> >>> Info: /mnt/raid: Scrub aborted after phase 1.
> >>> /mnt/raid: 2 errors found.
> >> Yup.  TBH I'm not a fan of listing "your kernel can't scrub" as
> >> "errors found."  I think we should find a way around that.
> > What do you mean by that?
> > 
> > --D
> > 
> 
> When people run a tool like scrub and they see "errors found"
> at the end of the run, I think it's very easy to have that register
> as "filesystem errors found" which is not the case here.
> 
> If a kernel can't do scrub at all, "2 errors found" is a confusing
> message - I'd rather find a way to not conflate filesystem
> errors with operational errors (or simply missing capabilities).
> 
> As a first cut I might suggest that if required capabilities for
> the requested action were not found, we should not even print
> the "errors found" line, just the informational text.  i.e.
> just reset the error counters in that specific case before exit.

Yeah.  As you and I have been batting around on IRC all day, I've
changed most of the non-fs-corruption str_warn/str_error calls into
str_info since they pretty much all abort the scrub anyway (which is
itself recorded as a runtime error).

--D

> -Eric

* Re: xfs_scrub: call for testing
  2018-02-02 21:36 xfs_scrub: call for testing Eric Sandeen
  2018-02-02 21:51 ` Darrick J. Wong
  2018-02-05 15:10 ` Emmanuel Florac
@ 2018-02-15 18:18 ` Emmanuel Florac
  2018-04-02  0:10 ` Chris Murphy
  3 siblings, 0 replies; 14+ messages in thread
From: Emmanuel Florac @ 2018-02-15 18:18 UTC
  To: Eric Sandeen; +Cc: linux-xfs

On Fri, 2 Feb 2018 15:36:33 -0600,
Eric Sandeen <sandeen@sandeen.net> wrote:

> Hey all - 
> 
> Darrick's done a great job with landing the xfs_scrub code in
> upstream kernel v4.15, and now merged on the for-next branch of
> xfsprogs to be released in xfsprogs-4.15.0.
> 
> As with any big new body of code, there might be some rough
> edges despite best efforts.  It'd be great to have people do
> some testing at this semi-early stage.
> 
> The 10,000ft overview is that the new xfs_scrub command can
> /validate/ a lot of what's on disk while the filesystem
> is mounted; and the ability to repair will come in the future.
> 
> For now, with the 4.15 kernel, functionality is limited to
> "scrubbing" meaning that it will simply check for consistency;
> in 4.15 there is no facility to repair or optimize/preen the
> filesystem.
> 
> I'd really value feedback on scrub as it stands at this point -
> Is the documentation clear?  Is the output correct?  Do the
> tool's arguments make sense?  Does it segfault?  Does it
> find real errors?  Does it crash your kernel? Does it
> eat your data?
> 

It's enabled:

~# grep SCRUB /boot/config-4.15.3-storiq64
CONFIG_EDAC_ATOMIC_SCRUB=y
CONFIG_XEN_SCRUB_PAGES=y
CONFIG_XFS_ONLINE_SCRUB=y

We live dangerously here:

[   24.217353] XFS (vdb): EXPERIMENTAL reflink feature enabled. Use at your own risk!
[   24.217646] XFS (vdb): Mounting V5 Filesystem
[   24.247828] XFS (vdb): Starting recovery (logdev: internal)
[   24.251312] XFS (vdb): Ending recovery (logdev: internal)

So it works:

~# xfs_scrub /mnt/raid/
EXPERIMENTAL xfs_scrub program in use! Use at your own risk!
Error: /mnt/raid: Kernel metadata optimization facility is not available.  Use -n to scrub.
Info: /mnt/raid: Scrub aborted after phase 1.
/mnt/raid: 2 errors found.
~# 
~# xfs_scrub -n /mnt/raid/
EXPERIMENTAL xfs_scrub program in use! Use at your own risk!
Info: AG 1 superblock: Optimization is possible.
Info: AG 2 superblock: Optimization is possible.
Info: AG 3 superblock: Optimization is possible.
515,5MiB data used;  19,0 inodes used.
372,6MiB data found; 19,0 inodes found.
~# 
~# 
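
The config check at the start of this report can also be scripted. The following Python sketch parses kernel config text and reports whether CONFIG_XFS_ONLINE_SCRUB is enabled. The function name `scrub_enabled` is made up for illustration, and the sample text simply repeats the three config lines shown above.

```python
def scrub_enabled(config_text, option="CONFIG_XFS_ONLINE_SCRUB"):
    """Return True if `option` is set to y in kernel config text."""
    for raw in config_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blanks and "# CONFIG_FOO is not set" comments
        name, _, value = line.partition("=")
        if name == option:
            return value == "y"
    return False  # option not mentioned at all


sample = """\
CONFIG_EDAC_ATOMIC_SCRUB=y
CONFIG_XEN_SCRUB_PAGES=y
CONFIG_XFS_ONLINE_SCRUB=y
"""
print(scrub_enabled(sample))
```

On a real system the text would come from a file such as /boot/config-4.15.3-storiq64; where the config lives on other systems is an assumption that varies by distribution.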
~# 

After a dirty unmount:

root@storiq-clef-usb:~# xfs_scrub -n /mnt/raid/
EXPERIMENTAL xfs_scrub program in use! Use at your own risk!
Info: AG 1 superblock: Optimization is possible.
Info: AG 2 superblock: Optimization is possible.
Info: AG 3 superblock: Optimization is possible.
877,9MiB data used;  27,0 inodes used.
735,0MiB data found; 27,0 inodes found.

After mangling the device with a hex editor:

~# xfs_scrub -n /mnt/raid/
EXPERIMENTAL xfs_scrub program in use! Use at your own risk!
Info: AG 1 superblock: Optimization is possible.
Info: AG 2 superblock: Optimization is possible.
Info: AG 3 superblock: Optimization is possible.
Error: Inode 96 directory entries: Repairs are required.
Error: Inode 99 inode record: Repairs are required.
Error: Inode 99 data block map: Repairs are required.
Error: Inode 99 attr block map: Repairs are required.
Error: Inode 99 CoW block map: Repairs are required.
Error: Inode 99 extended attributes: Repairs are required.
Error: Inode 99 parent pointer: Repairs are required.
Info: /mnt/raid: Filesystem has errors, skipping connectivity checks.
877,9MiB data used;  27,0 inodes used.
735,0MiB data found; 27,0 inodes found.
/mnt/raid: 7 errors found.  Unmount and run xfs_repair.
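
For anyone collecting results from several test runs, output like the above is easy to tally. This Python sketch groups the "Repairs are required" lines by inode. The regular expression is inferred only from the lines in this message, so treat the format as an assumption rather than a stable interface; the name `errors_by_inode` is hypothetical.

```python
import re
from collections import defaultdict

# Matches lines such as "Error: Inode 99 data block map: Repairs are required."
ERROR_RE = re.compile(r"^Error: Inode (\d+) (.+): Repairs are required\.$")


def errors_by_inode(output):
    """Group damaged metadata types by inode number."""
    found = defaultdict(list)
    for line in output.splitlines():
        match = ERROR_RE.match(line.strip())
        if match:
            found[int(match.group(1))].append(match.group(2))
    return dict(found)


sample = """\
Info: AG 1 superblock: Optimization is possible.
Error: Inode 96 directory entries: Repairs are required.
Error: Inode 99 inode record: Repairs are required.
Error: Inode 99 data block map: Repairs are required.
"""
for inode, items in sorted(errors_by_inode(sample).items()):
    print(inode, items)
```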

I then mangled the file some more, after which it would no longer mount
(structure needs repair).

Here is the output of dmesg during these experiments:

[  239.408757] Call Trace:
[  239.408776]  xfs_corruption_error+0x85/0x90 [xfs]
[  239.408794]  ? xfs_iget+0x30d/0x700 [xfs]
[  239.408811]  xfs_iread+0x1bd/0x1f0 [xfs]
[  239.408829]  ? xfs_iget+0x30d/0x700 [xfs]
[  239.408847]  xfs_iget+0x30d/0x700 [xfs]
[  239.408850]  ? kstrtoll+0x22/0x70
[  239.408865]  xfs_scrub_get_inode+0x79/0x180 [xfs]
[  239.408880]  xfs_scrub_setup_inode_bmap+0x11/0xb0 [xfs]
[  239.408895]  xfs_scrub_metadata+0x22d/0x2b0 [xfs]
[  239.408915]  ? xfs_scrub_bmap+0x380/0x380 [xfs]
[  239.408952]  xfs_ioc_scrub_metadata+0x41/0x70 [xfs]
[  239.408975]  xfs_file_ioctl+0x8be/0xa30 [xfs]
[  239.408996]  ? __queue_work+0xf7/0x2a0
[  239.408997]  ? pty_write+0x42/0x50
[  239.408999]  ? __clear_rsb+0x25/0x3d
[  239.409001]  ? __clear_rsb+0x15/0x3d
[  239.409003]  ? __clear_rsb+0x25/0x3d
[  239.409004]  ? __clear_rsb+0x15/0x3d
[  239.409006]  ? __clear_rsb+0x25/0x3d
[  239.409007]  ? __clear_rsb+0x15/0x3d
[  239.409018]  ? __clear_rsb+0x25/0x3d
[  239.409020]  ? __clear_rsb+0x15/0x3d
[  239.409021]  ? __clear_rsb+0x25/0x3d
[  239.409023]  ? __clear_rsb+0x15/0x3d
[  239.409025]  ? __clear_rsb+0x25/0x3d
[  239.409026]  ? __clear_rsb+0x15/0x3d
[  239.409028]  ? __clear_rsb+0x25/0x3d
[  239.409029]  ? __clear_rsb+0x15/0x3d
[  239.409031]  ? __clear_rsb+0x25/0x3d
[  239.409032]  ? __clear_rsb+0x15/0x3d
[  239.409034]  ? __clear_rsb+0x25/0x3d
[  239.409035]  ? __clear_rsb+0x15/0x3d
[  239.409037]  ? __clear_rsb+0x25/0x3d
[  239.409038]  ? __clear_rsb+0x15/0x3d
[  239.409040]  ? __clear_rsb+0x25/0x3d
[  239.409042]  ? __clear_rsb+0x15/0x3d
[  239.409043]  ? __clear_rsb+0x25/0x3d
[  239.409045]  ? __clear_rsb+0x15/0x3d
[  239.409046]  ? __clear_rsb+0x25/0x3d
[  239.409048]  ? __clear_rsb+0x15/0x3d
[  239.409049]  ? __clear_rsb+0x25/0x3d
[  239.409051]  ? __clear_rsb+0x15/0x3d
[  239.409053]  do_vfs_ioctl+0x86/0x5a0
[  239.409055]  ? __schedule+0x214/0x6d0
[  239.409057]  SyS_ioctl+0x36/0x70
[  239.409058]  ? exit_to_usermode_loop+0x6a/0x90
[  239.409060]  do_syscall_64+0x60/0x190
[  239.409062]  entry_SYSCALL_64_after_hwframe+0x21/0x86
[  239.409063] RIP: 0033:0x7f9c0acb41c7
[  239.409064] RSP: 002b:00007ffe2d223d48 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
[  239.409065] RAX: ffffffffffffffda RBX: 000000000000000b RCX: 00007f9c0acb41c7
[  239.409066] RDX: 00007ffe2d223e90 RSI: 00000000c040583c RDI: 0000000000000003
[  239.409067] RBP: 00007ffe2d223e90 R08: 000000000040e254 R09: 00007f9c0ac1e99a
[  239.409068] R10: 00007f9c0af77460 R11: 0000000000000246 R12: 0000000000000003
[  239.409069] R13: 00007ffe2d226940 R14: 0000000000000001 R15: 00007ffe2d223f60
[  239.409071] XFS (vdb): Corruption detected. Unmount and run xfs_repair
[  239.409240] XFS (vdb): xfs_iread: validation failed for inode 99
[  239.409242] 00000000a80c7e65: 49 4e 41 ed 03 01 00 00 00 00 03 e8 00 00 00 64  INA............d
[  239.409243] 00000000ba03f95e: 00 00 00 02 00 00 00 00 00 00 00 00 00 00 00 00  ................
[  239.409244] 000000004897c577: 5a 85 bd d9 16 ce 05 cf 59 5d 16 8d 1d e6 0d 5e  Z.......Y].....^
[  239.409245] 00000000a7d0657f: 59 63 97 ca 2e d7 1c f7 00 00 00 00 00 00 01 41  Yc.............A
[  239.409267] XFS (vdb): Internal error xfs_iread at line 514 of file fs/xfs/libxfs/xfs_inode_buf.c.  Caller xfs_iget+0x30d/0x700 [xfs]
[  239.409268] CPU: 0 PID: 2080 Comm: xfs_scrub Not tainted 4.15.3-storiq64 #1
[  239.409269] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.7.5-20140531_083030-gandalf 04/01/2014
[  239.409270] Call Trace:
[  239.409290]  xfs_corruption_error+0x85/0x90 [xfs]
[  239.409308]  ? xfs_iget+0x30d/0x700 [xfs]
[  239.409325]  xfs_iread+0x1bd/0x1f0 [xfs]
[  239.409348]  ? xfs_iget+0x30d/0x700 [xfs]
[  239.409379]  xfs_iget+0x30d/0x700 [xfs]
[  239.409382]  ? kstrtoll+0x22/0x70
[  239.409397]  xfs_scrub_get_inode+0x79/0x180 [xfs]
[  239.409413]  xfs_scrub_setup_inode_bmap+0x11/0xb0 [xfs]
[  239.409427]  xfs_scrub_metadata+0x22d/0x2b0 [xfs]
[  239.409443]  ? xfs_scrub_bmap+0x380/0x380 [xfs]
[  239.409461]  xfs_ioc_scrub_metadata+0x41/0x70 [xfs]
[  239.409479]  xfs_file_ioctl+0x8be/0xa30 [xfs]
[  239.409481]  ? __queue_work+0xf7/0x2a0
[  239.409482]  ? pty_write+0x42/0x50
[  239.409484]  ? __clear_rsb+0x25/0x3d
[  239.409486]  ? __clear_rsb+0x15/0x3d
[  239.409488]  ? __clear_rsb+0x25/0x3d
[  239.409489]  ? __clear_rsb+0x15/0x3d
[  239.409491]  ? __clear_rsb+0x25/0x3d
[  239.409493]  ? __clear_rsb+0x15/0x3d
[  239.409494]  ? __clear_rsb+0x25/0x3d
[  239.409496]  ? __clear_rsb+0x15/0x3d
[  239.409497]  ? __clear_rsb+0x25/0x3d
[  239.409499]  ? __clear_rsb+0x15/0x3d
[  239.409501]  ? __clear_rsb+0x25/0x3d
[  239.409502]  ? __clear_rsb+0x15/0x3d
[  239.409504]  ? __clear_rsb+0x25/0x3d
[  239.409505]  ? __clear_rsb+0x15/0x3d
[  239.409507]  ? __clear_rsb+0x25/0x3d
[  239.409509]  ? __clear_rsb+0x15/0x3d
[  239.409510]  ? __clear_rsb+0x25/0x3d
[  239.409512]  ? __clear_rsb+0x15/0x3d
[  239.409514]  ? __clear_rsb+0x25/0x3d
[  239.409515]  ? __clear_rsb+0x15/0x3d
[  239.409517]  ? __clear_rsb+0x25/0x3d
[  239.409518]  ? __clear_rsb+0x15/0x3d
[  239.409520]  ? __clear_rsb+0x25/0x3d
[  239.409522]  ? __clear_rsb+0x15/0x3d
[  239.409523]  ? __clear_rsb+0x25/0x3d
[  239.409525]  ? __clear_rsb+0x15/0x3d
[  239.409526]  ? __clear_rsb+0x25/0x3d
[  239.409528]  ? __clear_rsb+0x15/0x3d
[  239.409530]  do_vfs_ioctl+0x86/0x5a0
[  239.409532]  ? __schedule+0x214/0x6d0
[  239.409534]  SyS_ioctl+0x36/0x70
[  239.409535]  ? exit_to_usermode_loop+0x6a/0x90
[  239.409537]  do_syscall_64+0x60/0x190
[  239.409539]  entry_SYSCALL_64_after_hwframe+0x21/0x86
[  239.409540] RIP: 0033:0x7f9c0acb41c7
[  239.409541] RSP: 002b:00007ffe2d223d48 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
[  239.409542] RAX: ffffffffffffffda RBX: 000000000000000b RCX: 00007f9c0acb41c7
[  239.409543] RDX: 00007ffe2d223e90 RSI: 00000000c040583c RDI: 0000000000000003
[  239.409544] RBP: 00007ffe2d223e90 R08: 000000000040e263 R09: 00007f9c0ac1e99a
[  239.409545] R10: 00007f9c0af77460 R11: 0000000000000246 R12: 0000000000000003
[  239.409546] R13: 00007ffe2d226940 R14: 0000000000000001 R15: 00007ffe2d223f60
[  239.409547] XFS (vdb): Corruption detected. Unmount and run xfs_repair
[  239.409728] XFS (vdb): xfs_iread: validation failed for inode 99
[  239.409730] 00000000a80c7e65: 49 4e 41 ed 03 01 00 00 00 00 03 e8 00 00 00 64  INA............d
[  239.409732] 00000000ba03f95e: 00 00 00 02 00 00 00 00 00 00 00 00 00 00 00 00  ................
[  239.409733] 000000004897c577: 5a 85 bd d9 16 ce 05 cf 59 5d 16 8d 1d e6 0d 5e  Z.......Y].....^
[  239.409733] 00000000a7d0657f: 59 63 97 ca 2e d7 1c f7 00 00 00 00 00 00 01 41  Yc.............A
[  239.409752] XFS (vdb): Internal error xfs_iread at line 514 of file fs/xfs/libxfs/xfs_inode_buf.c.  Caller xfs_iget+0x30d/0x700 [xfs]
[  239.409754] CPU: 0 PID: 2080 Comm: xfs_scrub Not tainted 4.15.3-storiq64 #1
[  239.409755] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.7.5-20140531_083030-gandalf 04/01/2014
[  239.409755] Call Trace:
[  239.409779]  xfs_corruption_error+0x85/0x90 [xfs]
[  239.409814]  ? xfs_iget+0x30d/0x700 [xfs]
[  239.409832]  xfs_iread+0x1bd/0x1f0 [xfs]
[  239.409851]  ? xfs_iget+0x30d/0x700 [xfs]
[  239.409869]  xfs_iget+0x30d/0x700 [xfs]
[  239.409885]  xfs_scrub_get_inode+0x79/0x180 [xfs]
[  239.409900]  xfs_scrub_setup_inode_contents+0x11/0x70 [xfs]
[  239.409916]  ? xfs_scrub_setup_xattr+0x3e/0x60 [xfs]
[  239.409930]  xfs_scrub_metadata+0x22d/0x2b0 [xfs]
[  239.409946]  ? xfs_scrub_xattr_rec+0x130/0x130 [xfs]
[  239.409970]  xfs_ioc_scrub_metadata+0x41/0x70 [xfs]
[  239.410005]  xfs_file_ioctl+0x8be/0xa30 [xfs]
[  239.410007]  ? __queue_work+0xf7/0x2a0
[  239.410018]  ? pty_write+0x42/0x50
[  239.410020]  ? __clear_rsb+0x25/0x3d
[  239.410022]  ? __clear_rsb+0x15/0x3d
[  239.410024]  ? __clear_rsb+0x25/0x3d
[  239.410025]  ? __clear_rsb+0x15/0x3d
[  239.410027]  ? __clear_rsb+0x25/0x3d
[  239.410028]  ? __clear_rsb+0x15/0x3d
[  239.410030]  ? __clear_rsb+0x25/0x3d
[  239.410031]  ? __clear_rsb+0x15/0x3d
[  239.410033]  ? __clear_rsb+0x25/0x3d
[  239.410034]  ? __clear_rsb+0x15/0x3d
[  239.410036]  ? __clear_rsb+0x25/0x3d
[  239.410037]  ? __clear_rsb+0x15/0x3d
[  239.410039]  ? __clear_rsb+0x25/0x3d
[  239.410040]  ? __clear_rsb+0x15/0x3d
[  239.410042]  ? __clear_rsb+0x25/0x3d
[  239.410044]  ? __clear_rsb+0x15/0x3d
[  239.410045]  ? __clear_rsb+0x25/0x3d
[  239.410047]  ? __clear_rsb+0x15/0x3d
[  239.410048]  ? __clear_rsb+0x25/0x3d
[  239.410050]  ? __clear_rsb+0x15/0x3d
[  239.410051]  ? __clear_rsb+0x25/0x3d
[  239.410053]  ? __clear_rsb+0x15/0x3d
[  239.410054]  ? __clear_rsb+0x25/0x3d
[  239.410056]  ? __clear_rsb+0x15/0x3d
[  239.410057]  ? __clear_rsb+0x25/0x3d
[  239.410059]  ? __clear_rsb+0x15/0x3d
[  239.410060]  ? __clear_rsb+0x25/0x3d
[  239.410062]  ? __clear_rsb+0x15/0x3d
[  239.410064]  do_vfs_ioctl+0x86/0x5a0
[  239.410066]  ? __schedule+0x214/0x6d0
[  239.410068]  SyS_ioctl+0x36/0x70
[  239.410069]  ? exit_to_usermode_loop+0x6a/0x90
[  239.410071]  do_syscall_64+0x60/0x190
[  239.410073]  entry_SYSCALL_64_after_hwframe+0x21/0x86
[  239.410074] RIP: 0033:0x7f9c0acb41c7
[  239.410075] RSP: 002b:00007ffe2d223d48 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
[  239.410077] RAX: ffffffffffffffda RBX: 000000000000000b RCX: 00007f9c0acb41c7
[  239.410077] RDX: 00007ffe2d223e90 RSI: 00000000c040583c RDI: 0000000000000003
[  239.410078] RBP: 00007ffe2d223e90 R08: 000000000040e283 R09: 00007f9c0ac1e99a
[  239.410079] R10: 00007f9c0af77460 R11: 0000000000000246 R12: 0000000000000003
[  239.410080] R13: 00007ffe2d226940 R14: 0000000000000001 R15: 00007ffe2d223f60
[  239.410082] XFS (vdb): Corruption detected. Unmount and run xfs_repair
[  239.410256] XFS (vdb): xfs_iread: validation failed for inode 99
[  239.410258] 00000000a80c7e65: 49 4e 41 ed 03 01 00 00 00 00 03 e8 00 00 00 64  INA............d
[  239.410259] 00000000ba03f95e: 00 00 00 02 00 00 00 00 00 00 00 00 00 00 00 00  ................
[  239.410260] 000000004897c577: 5a 85 bd d9 16 ce 05 cf 59 5d 16 8d 1d e6 0d 5e  Z.......Y].....^
[  239.410261] 00000000a7d0657f: 59 63 97 ca 2e d7 1c f7 00 00 00 00 00 00 01 41  Yc.............A
[  239.410282] XFS (vdb): Internal error xfs_iread at line 514 of file fs/xfs/libxfs/xfs_inode_buf.c.  Caller xfs_iget+0x30d/0x700 [xfs]
[  239.410284] CPU: 0 PID: 2080 Comm: xfs_scrub Not tainted 4.15.3-storiq64 #1
[  239.410285] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.7.5-20140531_083030-gandalf 04/01/2014
[  239.410285] Call Trace:
[  239.410304]  xfs_corruption_error+0x85/0x90 [xfs]
[  239.410323]  ? xfs_iget+0x30d/0x700 [xfs]
[  239.410340]  xfs_iread+0x1bd/0x1f0 [xfs]
[  239.410362]  ? xfs_iget+0x30d/0x700 [xfs]
[  239.410396]  xfs_iget+0x30d/0x700 [xfs]
[  239.410412]  xfs_scrub_get_inode+0x79/0x180 [xfs]
[  239.410427]  xfs_scrub_setup_inode_contents+0x11/0x70 [xfs]
[  239.410442]  xfs_scrub_metadata+0x22d/0x2b0 [xfs]
[  239.410456]  ? xfs_scrub_parent_validate+0x240/0x240 [xfs]
[  239.410475]  xfs_ioc_scrub_metadata+0x41/0x70 [xfs]
[  239.410494]  xfs_file_ioctl+0x8be/0xa30 [xfs]
[  239.410496]  ? __queue_work+0xf7/0x2a0
[  239.410498]  ? pty_write+0x42/0x50
[  239.410500]  ? __clear_rsb+0x25/0x3d
[  239.410501]  ? __clear_rsb+0x15/0x3d
[  239.410503]  ? __clear_rsb+0x25/0x3d
[  239.410505]  ? __clear_rsb+0x15/0x3d
[  239.410506]  ? __clear_rsb+0x25/0x3d
[  239.410508]  ? __clear_rsb+0x15/0x3d
[  239.410509]  ? __clear_rsb+0x25/0x3d
[  239.410511]  ? __clear_rsb+0x15/0x3d
[  239.410512]  ? __clear_rsb+0x25/0x3d
[  239.410514]  ? __clear_rsb+0x15/0x3d
[  239.410515]  ? __clear_rsb+0x25/0x3d
[  239.410517]  ? __clear_rsb+0x15/0x3d
[  239.410519]  ? __clear_rsb+0x25/0x3d
[  239.410520]  ? __clear_rsb+0x15/0x3d
[  239.410522]  ? __clear_rsb+0x25/0x3d
[  239.410523]  ? __clear_rsb+0x15/0x3d
[  239.410525]  ? __clear_rsb+0x25/0x3d
[  239.410526]  ? __clear_rsb+0x15/0x3d
[  239.410528]  ? __clear_rsb+0x25/0x3d
[  239.410529]  ? __clear_rsb+0x15/0x3d
[  239.410531]  ? __clear_rsb+0x25/0x3d
[  239.410532]  ? __clear_rsb+0x15/0x3d
[  239.410534]  ? __clear_rsb+0x25/0x3d
[  239.410535]  ? __clear_rsb+0x15/0x3d
[  239.410537]  ? __clear_rsb+0x25/0x3d
[  239.410539]  ? __clear_rsb+0x15/0x3d
[  239.410540]  ? __clear_rsb+0x25/0x3d
[  239.410542]  ? __clear_rsb+0x15/0x3d
[  239.410543]  do_vfs_ioctl+0x86/0x5a0
[  239.410546]  ? __schedule+0x214/0x6d0
[  239.410547]  SyS_ioctl+0x36/0x70
[  239.410549]  ? exit_to_usermode_loop+0x6a/0x90
[  239.410550]  do_syscall_64+0x60/0x190
[  239.410552]  entry_SYSCALL_64_after_hwframe+0x21/0x86
[  239.410553] RIP: 0033:0x7f9c0acb41c7
[  239.410554] RSP: 002b:00007ffe2d223d48 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
[  239.410556] RAX: ffffffffffffffda RBX: 000000000000000b RCX: 00007f9c0acb41c7
[  239.410557] RDX: 00007ffe2d223e90 RSI: 00000000c040583c RDI: 0000000000000003
[  239.410558] RBP: 00007ffe2d223e90 R08: 000000000040e2a5 R09: 00007f9c0ac1e99a
[  239.410558] R10: 00007f9c0af77460 R11: 0000000000000246 R12: 0000000000000003
[  239.410559] R13: 00007ffe2d226940 R14: 0000000000000001 R15: 00007ffe2d223f60
[  239.410561] XFS (vdb): Corruption detected. Unmount and run xfs_repair
[  239.410679] systemd-journald[197]: /dev/kmsg buffer overrun, some messages lost.
[  289.069032] XFS (vdb): Unmounting Filesystem
[  347.669570] XFS (vdb): EXPERIMENTAL reflink feature enabled. Use at your own risk!
[  347.669785] XFS (vdb): Mounting V5 Filesystem
[  347.678771] XFS (vdb): Ending clean mount
[  402.947693] XFS (vdb): Unmounting Filesystem
[  476.483462] XFS (vdb): EXPERIMENTAL reflink feature enabled. Use at your own risk!
[  476.483653] XFS (vdb): Mounting V5 Filesystem
[  476.494540] XFS (vdb): xfs_iread: validation failed for inode 96
[  476.494548] 00000000687389b7: 49 4e 41 ed 03 01 00 00 00 00 00 00 00 00 00 00  INA.............
[  476.494550] 00000000119c2671: 00 00 00 05 00 00 00 00 00 00 00 00 00 00 00 00  ................
[  476.494552] 00000000f81aa31f: 5a 85 cc 89 3a 7d aa 56 5a 85 cb d6 2c 7a 9d 23  Z...:}.VZ...,z.#
[  476.494553] 0000000073ce204e: 5a 85 cb d6 2c 7a 9d 23 00 00 00 00 00 00 00 3c  Z...,z.#.......<
[  476.494622] XFS (vdb): Internal error xfs_iread at line 514 of file fs/xfs/libxfs/xfs_inode_buf.c.  Caller xfs_iget+0x30d/0x700 [xfs]
[  476.494627] CPU: 0 PID: 2450 Comm: mount Not tainted 4.15.3-storiq64 #1
[  476.494629] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.7.5-20140531_083030-gandalf 04/01/2014
[  476.494631] Call Trace:
[  476.494662]  xfs_corruption_error+0x85/0x90 [xfs]
[  476.494686]  ? xfs_iget+0x30d/0x700 [xfs]
[  476.494707]  xfs_iread+0x1bd/0x1f0 [xfs]
[  476.494730]  ? xfs_iget+0x30d/0x700 [xfs]
[  476.494751]  xfs_iget+0x30d/0x700 [xfs]
[  476.494773]  ? xlog_cil_init_post_recovery+0x27/0x50 [xfs]
[  476.494793]  xfs_mountfs+0x542/0x900 [xfs]
[  476.494815]  xfs_fs_fill_super+0x35d/0x4f0 [xfs]
[  476.494837]  ? xfs_test_remount_options.isra.24+0x50/0x50 [xfs]
[  476.494843]  mount_bdev+0x16d/0x1a0
[  476.494869]  mount_fs+0xc/0x70
[  476.494879]  vfs_kern_mount+0x59/0x110
[  476.494893]  ? __get_fs_type+0x17/0x30
[  476.494896]  do_mount+0x196/0xb50
[  476.494909]  ? memdup_user+0x39/0x60
[  476.494912]  SyS_mount+0x7f/0xc0
[  476.494917]  do_syscall_64+0x60/0x190
[  476.494924]  entry_SYSCALL_64_after_hwframe+0x21/0x86
[  476.494928] RIP: 0033:0x7f8a43e06d8a
[  476.494930] RSP: 002b:00007ffc2baaa498 EFLAGS: 00000202 ORIG_RAX: 00000000000000a5
[  476.494933] RAX: ffffffffffffffda RBX: 000000000060b040 RCX: 00007f8a43e06d8a
[  476.494934] RDX: 000000000060f620 RSI: 000000000060df30 RDI: 000000000060b220
[  476.494935] RBP: 0000000000000000 R08: 0000000000000000 R09: 00007f8a43d6999a
[  476.494936] R10: 00000000c0ed0000 R11: 0000000000000202 R12: 000000000060b220
[  476.494937] R13: 000000000060f620 R14: 0000000000000000 R15: 0000000000000001
[  476.494940] XFS (vdb): Corruption detected. Unmount and run xfs_repair
[  476.494948] XFS (vdb): failed to read root inode
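
For readers following the hex dumps, here is a minimal sketch of how the first 64 bytes of the inode 96 dump above can be decoded. It assumes the big-endian field order of the start of the on-disk inode core (`struct xfs_dinode` in the kernel's `xfs_format.h`) as I understand it; the field names and offsets should be checked against that header before relying on them. The helper names (`dump_to_bytes`, `decode_core`) are made up for this illustration, and the log itself does not say which validation check failed.

```python
import re
import stat
import struct

# Hex dump rows copied from the "validation failed for inode 96" report.
DUMP = """\
[  476.494548] 00000000687389b7: 49 4e 41 ed 03 01 00 00 00 00 00 00 00 00 00 00  INA.............
[  476.494550] 00000000119c2671: 00 00 00 05 00 00 00 00 00 00 00 00 00 00 00 00  ................
[  476.494552] 00000000f81aa31f: 5a 85 cc 89 3a 7d aa 56 5a 85 cb d6 2c 7a 9d 23  Z...:}.VZ...,z.#
[  476.494553] 0000000073ce204e: 5a 85 cb d6 2c 7a 9d 23 00 00 00 00 00 00 00 3c  Z...,z.#.......<
"""

# Sixteen space-separated hex byte pairs following the "address: " prefix.
HEX_ROW = re.compile(r": ((?:[0-9a-f]{2} ){15}[0-9a-f]{2})")

# Assumed layout: magic, mode, version, format, onlink, uid, gid, nlink,
# projid_lo, projid_hi, 6 pad bytes, flushiter, three (sec, nsec)
# timestamps, then the 64-bit size. All fields big-endian.
CORE = struct.Struct(">2sHBBHIIIHH6xHiiiiiiQ")
FIELDS = ("magic", "mode", "version", "format", "onlink", "uid", "gid",
          "nlink", "projid_lo", "projid_hi", "flushiter",
          "atime_sec", "atime_nsec", "mtime_sec", "mtime_nsec",
          "ctime_sec", "ctime_nsec", "size")


def dump_to_bytes(text):
    """Collect the hex columns of every dump row into one bytes object."""
    return bytes.fromhex(" ".join(HEX_ROW.findall(text)))


def decode_core(buf):
    """Unpack the first 64 bytes of an inode into a field dictionary."""
    return dict(zip(FIELDS, CORE.unpack(buf[:CORE.size])))


core = decode_core(dump_to_bytes(DUMP))
print(core["magic"], stat.filemode(core["mode"]),
      "nlink", core["nlink"], "size", core["size"])
```

Under those assumptions the dump reads as a version 3 directory inode owned by uid 0 with five links, which is at least consistent with it being the root inode that the mount failed to read.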


Et voilà :)
-- 
------------------------------------------------------------------------
Emmanuel Florac     |   Direction technique
                    |   Intellique
                    |	<eflorac@intellique.com>
                    |   +33 1 78 94 84 02
------------------------------------------------------------------------


* Re: xfs_scrub: call for testing
  2018-02-02 21:36 xfs_scrub: call for testing Eric Sandeen
                   ` (2 preceding siblings ...)
  2018-02-15 18:18 ` Emmanuel Florac
@ 2018-04-02  0:10 ` Chris Murphy
  2018-04-02  2:01   ` Eric Sandeen
  2018-04-02  2:44   ` Darrick J. Wong
  3 siblings, 2 replies; 14+ messages in thread
From: Chris Murphy @ 2018-04-02  0:10 UTC (permalink / raw)
  To: Eric Sandeen; +Cc: linux-xfs

On Fri, Feb 2, 2018 at 2:36 PM, Eric Sandeen <sandeen@sandeen.net> wrote:


>
> I'd really value feedback on scrub as it stand at this point -
> Is the documentation clear?  Is the output correct?  Do the
> tool's arguments make sense?  Does it segfault?  Does it
> find real errors?  Does it crash your kernel? Does it
> eat your data?

xfsprogs-4.15.1-1.fc27.x86_64
I've built mainline 4.16.0-rc7 with CONFIG_XFS_ONLINE_SCRUB=y but I get this:

[chris@f27s ~]$ sudo xfs_scrub -T -v /mnt/fourth
EXPERIMENTAL xfs_scrub program in use! Use at your own risk!
Phase 1: Find filesystem geometry.
/mnt/fourth: using 4 threads to scrub.
Info: /mnt/fourth: Kernel metadata repair facility not detected.
Info: /mnt/fourth: Kernel metadata repair facility is not available.  Use -n to scrub.
Info: /mnt/fourth: Scrub aborted after phase 1.
/mnt/fourth: errors found: 1
Memory used: 132k/0k (17k/116k), time:  0.40/ 0.00/ 0.01s
I/O: 84.0KiB in, 0.0B out, 84.0KiB tot
I/O rate: 211.4KiB/s in, 0.0B/s out, 211.4KiB/s tot

Kernel message

mount
[  329.930644] SGI XFS with ACLs, security attributes, scrub, no debug enabled
[  329.940197] XFS (dm-16): Mounting V5 Filesystem
[  330.553888] XFS (dm-16): Ending clean mount

scrub command
[  373.941951] XFS (dm-16): EXPERIMENTAL online scrub feature in use. Use at your own risk!


[chris@f27s ~]$ grep XFS linux/.config
CONFIG_XFS_FS=m
CONFIG_XFS_QUOTA=y
CONFIG_XFS_POSIX_ACL=y
# CONFIG_XFS_RT is not set
CONFIG_XFS_ONLINE_SCRUB=y
CONFIG_XFS_WARN=y
# CONFIG_XFS_DEBUG is not set
# CONFIG_VXFS_FS is not set
[chris@f27s ~]$


When I try it with just -n then I get entries

Info: AG 2 superblock: Optimization is possible.

that appears to be working but it's also a noop.

Also the -n output has many unicode complaints with full filenames in
the output. e.g.

Info: inode 1815560251 (27/3620923): Unicode name
"Chapter\xc2\xa02.\xc2\xa0Securing Your Network.webloc" in directory
should be normalized as "Chapter 2. Securing Your Network.webloc".

This should be less verbose by default. Perhaps something generic
like, "Some unicode filenames should be normalized, use -v for verbose
output."  I don't want to have to report a scrub with a bunch of
filenames interlaced in the output. Plus that's nothing I can or even
want to do anything about. That file likely came from a Mac using
netatalk or samba. Hopefully the repair mode for scrub doesn't
normalize these files, I don't really want to find out the hard way
that corrected files then can't be properly read over that same
network connection or otherwise have filenames apparently mangled.



-- 
Chris Murphy


* Re: xfs_scrub: call for testing
  2018-04-02  0:10 ` Chris Murphy
@ 2018-04-02  2:01   ` Eric Sandeen
  2018-04-02  4:23     ` Chris Murphy
  2018-04-02  2:44   ` Darrick J. Wong
  1 sibling, 1 reply; 14+ messages in thread
From: Eric Sandeen @ 2018-04-02  2:01 UTC (permalink / raw)
  To: Chris Murphy; +Cc: linux-xfs



On 4/1/18 7:10 PM, Chris Murphy wrote:
> On Fri, Feb 2, 2018 at 2:36 PM, Eric Sandeen <sandeen@sandeen.net> wrote:
> 
> 
>>
>> I'd really value feedback on scrub as it stand at this point -
>> Is the documentation clear?  Is the output correct?  Do the
>> tool's arguments make sense?  Does it segfault?  Does it
>> find real errors?  Does it crash your kernel? Does it
>> eat your data?
> 
> xfsprogs-4.15.1-1.fc27.x86_64
> I've built mainline 4.16.0-rc7 with CONFIG_XFS_ONLINE_SCRUB=y but I get this:
> 
> [chris@f27s ~]$ sudo xfs_scrub -T -v /mnt/fourth
> EXPERIMENTAL xfs_scrub program in use! Use at your own risk!
> Phase 1: Find filesystem geometry.
> /mnt/fourth: using 4 threads to scrub.
> Info: /mnt/fourth: Kernel metadata repair facility not detected.
> Info: /mnt/fourth: Kernel metadata repair facility is not available.
> Use -n to scrub.

Yup, that's all it can do in that kernel.

> Info: /mnt/fourth: Scrub aborted after phase 1.
> /mnt/fourth: errors found: 1

I think we've fixed it now to not call this an error.

> Memory used: 132k/0k (17k/116k), time:  0.40/ 0.00/ 0.01s
> I/O: 84.0KiB in, 0.0B out, 84.0KiB tot
> I/O rate: 211.4KiB/s in, 0.0B/s out, 211.4KiB/s tot
> 
> Kernel message
> 
> mount
> [  329.930644] SGI XFS with ACLs, security attributes, scrub, no debug enabled
> [  329.940197] XFS (dm-16): Mounting V5 Filesystem
> [  330.553888] XFS (dm-16): Ending clean mount
> 
> scrub command
> [  373.941951] XFS (dm-16): EXPERIMENTAL online scrub feature in use.
> Use at your own risk!
> 
> 
> [chris@f27s ~]$ grep XFS linux/.config
> CONFIG_XFS_FS=m
> CONFIG_XFS_QUOTA=y
> CONFIG_XFS_POSIX_ACL=y
> # CONFIG_XFS_RT is not set
> CONFIG_XFS_ONLINE_SCRUB=y
> CONFIG_XFS_WARN=y
> # CONFIG_XFS_DEBUG is not set
> # CONFIG_VXFS_FS is not set
> [chris@f27s ~]$
> 
> 
> When I try it with just -n then I get entries
> 
> Info: AG 2 superblock: Optimization is possible.
> 
> that appears to be working but it's also a noop.

So essentially it found nothing to worry about.  But it seems that
this is not particularly obvious or informative to the user when
presented this way ...

> Also the -n output has many unicode complaints with full filenames in
> the output. e.g.
> 
> Info: inode 1815560251 (27/3620923): Unicode name
> "Chapter\xc2\xa02.\xc2\xa0Securing Your Network.webloc" in directory
> should be normalized as "Chapter 2. Securing Your Network.webloc".

TBH, I don't quite understand the "should" - should according to
who, and why?  It's very difficult to determine which oddly-encoded
names are actually security risks (which is, I think, the intention
here.)

> This should be less verbose by default. Perhaps something generic
> like, "Some unicode filenames should be normalized, use -v for verbose
> output."  I don't want to have to report a scrub with a bunch of
> filenames interlaced in the output. Plus that's nothing I can or even
> want to do anything about. That file likely came from a Mac using
> netatalk or samba. Hopefully the repair mode for scrub doesn't
> normalize these files, I don't really want to find out the hard way
> that corrected files then can't be properly read over that same
> network connection or otherwise have filenames apparently mangled.

Very good points.

Personally, I've been conflicted about the whole name-checking thing.
On the one hand, you don't want to have to make two passes, because
that's not very efficient.  On the other hand, mingling name 'suggestions'
in with possible corruption errors is a good way to miss critical
information...

Thanks,
-Eric


* Re: xfs_scrub: call for testing
  2018-04-02  0:10 ` Chris Murphy
  2018-04-02  2:01   ` Eric Sandeen
@ 2018-04-02  2:44   ` Darrick J. Wong
  1 sibling, 0 replies; 14+ messages in thread
From: Darrick J. Wong @ 2018-04-02  2:44 UTC (permalink / raw)
  To: Chris Murphy; +Cc: Eric Sandeen, linux-xfs

On Sun, Apr 01, 2018 at 06:10:26PM -0600, Chris Murphy wrote:
> On Fri, Feb 2, 2018 at 2:36 PM, Eric Sandeen <sandeen@sandeen.net> wrote:
> 
> 
> >
> > I'd really value feedback on scrub as it stand at this point -
> > Is the documentation clear?  Is the output correct?  Do the
> > tool's arguments make sense?  Does it segfault?  Does it
> > find real errors?  Does it crash your kernel? Does it
> > eat your data?
> 
> xfsprogs-4.15.1-1.fc27.x86_64
> I've built mainline 4.16.0-rc7 with CONFIG_XFS_ONLINE_SCRUB=y but I get this:
> 
> [chris@f27s ~]$ sudo xfs_scrub -T -v /mnt/fourth
> EXPERIMENTAL xfs_scrub program in use! Use at your own risk!
> Phase 1: Find filesystem geometry.
> /mnt/fourth: using 4 threads to scrub.
> Info: /mnt/fourth: Kernel metadata repair facility not detected.
> Info: /mnt/fourth: Kernel metadata repair facility is not available.

Yeah, xfs_scrub without -n requires the online repair code to be in the
kernel, which likely missed 4.17.  Previous drafts of this program would
automatically move to -n mode if repair wasn't found, but Eric argued
that we shouldn't downgrade modes on the user like that.
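
The policy described here (no silent fallback from repair mode to scrub-only mode) could be sketched roughly as follows. This is an illustration only, not the xfs_scrub source; `select_mode`, `RepairUnavailable`, and the mode strings are hypothetical names.

```python
class RepairUnavailable(Exception):
    """Raised when repair was requested but the kernel cannot do it."""


def select_mode(dry_run, kernel_has_repair):
    """Pick the operating mode without ever downgrading behind the user's back.

    dry_run:            True when the user passed -n (check only).
    kernel_has_repair:  True when the kernel offers online repair.
    """
    if dry_run:
        return "scrub-only"
    if kernel_has_repair:
        return "scrub-and-repair"
    # An earlier draft would have returned "scrub-only" here; the behaviour
    # described above is to stop and tell the user to ask for -n explicitly.
    raise RepairUnavailable(
        "Kernel metadata repair facility is not available.  Use -n to scrub.")


if __name__ == "__main__":
    print(select_mode(dry_run=True, kernel_has_repair=False))
    try:
        select_mode(dry_run=False, kernel_has_repair=False)
    except RepairUnavailable as err:
        print("aborted:", err)
```

The point of raising rather than falling back is that the user always gets the mode they asked for, or an explicit refusal.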

> Use -n to scrub.
> Info: /mnt/fourth: Scrub aborted after phase 1.
> /mnt/fourth: errors found: 1
> Memory used: 132k/0k (17k/116k), time:  0.40/ 0.00/ 0.01s
> I/O: 84.0KiB in, 0.0B out, 84.0KiB tot
> I/O rate: 211.4KiB/s in, 0.0B/s out, 211.4KiB/s tot
> 
> Kernel message
> 
> mount
> [  329.930644] SGI XFS with ACLs, security attributes, scrub, no debug enabled
> [  329.940197] XFS (dm-16): Mounting V5 Filesystem
> [  330.553888] XFS (dm-16): Ending clean mount
> 
> scrub command
> [  373.941951] XFS (dm-16): EXPERIMENTAL online scrub feature in use.
> Use at your own risk!
> 
> 
> [chris@f27s ~]$ grep XFS linux/.config
> CONFIG_XFS_FS=m
> CONFIG_XFS_QUOTA=y
> CONFIG_XFS_POSIX_ACL=y
> # CONFIG_XFS_RT is not set
> CONFIG_XFS_ONLINE_SCRUB=y
> CONFIG_XFS_WARN=y
> # CONFIG_XFS_DEBUG is not set
> # CONFIG_VXFS_FS is not set
> [chris@f27s ~]$
> 
> 
> When I try it with just -n then I get entries
> 
> Info: AG 2 superblock: Optimization is possible.
> 
> that appears to be working but it's also a noop.

Right, it's observing that some of the secondary sb fields don't match
the primary, which is ok since xfs_repair will set them if it ever
decides to use them to repair the primary.

(And yes it will spew corruption errors if the fields that repair can't/
doesn't fix up don't match.)

> Also the -n output has many unicode complaints with full filenames in
> the output. e.g.
> 
> Info: inode 1815560251 (27/3620923): Unicode name
> "Chapter\xc2\xa02.\xc2\xa0Securing Your Network.webloc" in directory
> should be normalized as "Chapter 2. Securing Your Network.webloc".
> 
> This should be less verbose by default. Perhaps something generic

All that's being replaced in 4.16 with something less chatty (see below).

> like, "Some unicode filenames should be normalized, use -v for verbose
> output."  I don't want to have to report a scrub with a bunch of
> filenames interlaced in the output. Plus that's nothing I can or even
> want to do anything about. That file likely came from a Mac using
> netatalk or samba. Hopefully the repair mode for scrub doesn't
> normalize these files, I don't really want to find out the hard way

xfs_scrub doesn't do anything with weird filenames (or anything tagged
Warning: or Info:), since those aren't fs corruptions; they're merely
things that, depending on the circumstances, could be fishy or could be
fine, and so require administrator review.

Hmmm, 0xc2a0 is the UTF-8 encoding of a nonbreaking space (U+00A0), and
there are two of them in the filename.  Yeah that probably did come from
a Mac. :)
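
A quick way to check the byte sequence from the report is sketched below. It assumes the name is valid UTF-8 and uses NFKC as the normalization form; the thread does not say which form xfs_scrub applies, so treat that choice as an assumption.

```python
import unicodedata

# Filename bytes exactly as quoted in the xfs_scrub Info: message.
raw = b"Chapter\xc2\xa02.\xc2\xa0Securing Your Network.webloc"

name = raw.decode("utf-8")
nbsp_count = name.count("\u00a0")          # U+00A0 NO-BREAK SPACE
normalized = unicodedata.normalize("NFKC", name)

print(nbsp_count)
print(repr(normalized))
```

NFKC maps U+00A0 to an ordinary space, which matches the "should be normalized as" name in the message.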

As for xfs_scrub in 4.16, I'm replacing the Unicode scanning engine with
a more robust one that only complains about filenames if there are
multiple in the same directory that actually look kinda similar.  In
other words it'll shut up unless that directory contains (for example)
"Chapter\xc2\xa02.txt" and "Chapter 2.txt" and they don't both point to
the same inode.
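
The "only complain about look-alike names in the same directory" idea can be sketched like this. It is a rough illustration that uses NFKC normalization as a stand-in for whatever similarity test the new engine applies; `confusable_names` and the inode numbers are invented for the example.

```python
import unicodedata
from collections import defaultdict


def confusable_names(entries):
    """Return groups of names in one directory that collide after normalization.

    entries: iterable of (name_bytes, inode_number) pairs.
    A group is reported only if it holds more than one distinct name and
    those names do not all point to the same inode.
    """
    groups = defaultdict(set)
    for raw, ino in entries:
        # Undecodable bytes are kept visible so distinct names stay distinct.
        name = raw.decode("utf-8", errors="backslashreplace")
        key = unicodedata.normalize("NFKC", name)
        groups[key].add((raw, ino))

    report = []
    for members in groups.values():
        inodes = {ino for _, ino in members}
        if len(members) > 1 and len(inodes) > 1:
            report.append(sorted(raw for raw, _ in members))
    return report


if __name__ == "__main__":
    directory = [
        (b"Chapter\xc2\xa02.txt", 100),
        (b"Chapter 2.txt", 101),
        (b"notes.txt", 102),
    ]
    print(confusable_names(directory))
```

A directory holding only one of the two "Chapter 2" spellings, or both as hard links to the same inode, produces an empty report.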

> that corrected files then can't be properly read over that same
> network connection or otherwise have filenames apparently mangled.

Thanks for testing! :)

--D

> 
> 
> -- 
> Chris Murphy


* Re: xfs_scrub: call for testing
  2018-04-02  2:01   ` Eric Sandeen
@ 2018-04-02  4:23     ` Chris Murphy
  0 siblings, 0 replies; 14+ messages in thread
From: Chris Murphy @ 2018-04-02  4:23 UTC (permalink / raw)
  To: Eric Sandeen; +Cc: Chris Murphy, linux-xfs

On Sun, Apr 1, 2018 at 8:01 PM, Eric Sandeen <sandeen@sandeen.net> wrote:
>
>
> On 4/1/18 7:10 PM, Chris Murphy wrote:
>> On Fri, Feb 2, 2018 at 2:36 PM, Eric Sandeen <sandeen@sandeen.net> wrote:
>>
>>
>>>
>>> I'd really value feedback on scrub as it stand at this point -
>>> Is the documentation clear?  Is the output correct?  Do the
>>> tool's arguments make sense?  Does it segfault?  Does it
>>> find real errors?  Does it crash your kernel? Does it
>>> eat your data?
>>
>> xfsprogs-4.15.1-1.fc27.x86_64
>> I've built mainline 4.16.0-rc7 with CONFIG_XFS_ONLINE_SCRUB=y but I get this:
>>
>> [chris@f27s ~]$ sudo xfs_scrub -T -v /mnt/fourth
>> EXPERIMENTAL xfs_scrub program in use! Use at your own risk!
>> Phase 1: Find filesystem geometry.
>> /mnt/fourth: using 4 threads to scrub.
>> Info: /mnt/fourth: Kernel metadata repair facility not detected.
>> Info: /mnt/fourth: Kernel metadata repair facility is not available.
>> Use -n to scrub.
>
> Yup, that's all it can do in that kernel.
>
>> Info: /mnt/fourth: Scrub aborted after phase 1.
>> /mnt/fourth: errors found: 1
>
> I think we've fixed it now to not call this an error.
>
>> Memory used: 132k/0k (17k/116k), time:  0.40/ 0.00/ 0.01s
>> I/O: 84.0KiB in, 0.0B out, 84.0KiB tot
>> I/O rate: 211.4KiB/s in, 0.0B/s out, 211.4KiB/s tot
>>
>> Kernel message
>>
>> mount
>> [  329.930644] SGI XFS with ACLs, security attributes, scrub, no debug enabled
>> [  329.940197] XFS (dm-16): Mounting V5 Filesystem
>> [  330.553888] XFS (dm-16): Ending clean mount
>>
>> scrub command
>> [  373.941951] XFS (dm-16): EXPERIMENTAL online scrub feature in use.
>> Use at your own risk!
>>
>>
>> [chris@f27s ~]$ grep XFS linux/.config
>> CONFIG_XFS_FS=m
>> CONFIG_XFS_QUOTA=y
>> CONFIG_XFS_POSIX_ACL=y
>> # CONFIG_XFS_RT is not set
>> CONFIG_XFS_ONLINE_SCRUB=y
>> CONFIG_XFS_WARN=y
>> # CONFIG_XFS_DEBUG is not set
>> # CONFIG_VXFS_FS is not set
>> [chris@f27s ~]$
>>
>>
>> When I try it with just -n then I get entries
>>
>> Info: AG 2 superblock: Optimization is possible.
>>
>> that appears to be working but it's also a noop.
>
> So essentially it found nothing to worry about.  But it seems that
> this is not particularly obvious or informative to the user when
> presented this way ...
>
>> Also the -n output has many unicode complaints with full filenames in
>> the output. e.g.
>>
>> Info: inode 1815560251 (27/3620923): Unicode name
>> "Chapter\xc2\xa02.\xc2\xa0Securing Your Network.webloc" in directory
>> should be normalized as "Chapter 2. Securing Your Network.webloc".
>
> TBH, I don't quite understand the "should" - should according to
> who, and why?  It's very difficult to determine which oddly-encoded
> names are actually security risks (which is, I think, the intention
> here.)
>
>> This should be less verbose by default. Perhaps something generic
>> like, "Some unicode filenames should be normalized, use -v for verbose
>> output."  I don't want to have to report a scrub with a bunch of
>> filenames interlaced in the output. Plus that's nothing I can or even
>> want to do anything about. That file likely came from a Mac using
>> netatalk or samba. Hopefully the repair mode for scrub doesn't
>> normalize these files, I don't really want to find out the hard way
>> that corrected files then can't be properly read over that same
>> network connection or otherwise have filenames apparently mangled.
>
> Very good points.
>
> Personally, I've been conflicted about the whole name-checking thing.
> On the one hand, you don't want to have to make two passes, because
> that's not very efficient.  On the other hand, mingling name 'suggestions'
> in with possible corruption errors is a good way to miss critical
> information...

At first I agreed, running it twice is a pain, so I thought: Maybe
spit out the filename checking output at the very end, ensuring that
portion can be easily trimmed away from the report? And if there's an
explicit output to file option, it could automatically split these
parts out into two files: scrub error+repair log, and filename
encoding info log.
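
As a rough sketch of that split, done outside xfs_scrub on its captured output: the snippet assumes each message arrives on a single line and that filename advisories are `Info:` lines containing the words "Unicode name", as in the example earlier in this thread. `split_report` is a hypothetical helper, not an xfs_scrub option.

```python
def split_report(lines):
    """Separate filename-encoding advisories from the rest of a scrub report.

    Returns (scrub_log, name_log), both lists of lines in original order.
    """
    scrub_log, name_log = [], []
    for line in lines:
        if line.startswith("Info:") and "Unicode name" in line:
            name_log.append(line)
        else:
            scrub_log.append(line)
    return scrub_log, name_log


if __name__ == "__main__":
    sample = [
        "Phase 1: Find filesystem geometry.",
        "Info: AG 2 superblock: Optimization is possible.",
        'Info: inode 1815560251 (27/3620923): Unicode name '
        '"Chapter\\xc2\\xa02.\\xc2\\xa0Securing Your Network.webloc" in directory '
        'should be normalized as "Chapter 2. Securing Your Network.webloc".',
    ]
    scrub_log, name_log = split_report(sample)
    print(len(scrub_log), "scrub lines,", len(name_log), "filename lines")
```

Each list could then be written to its own file, so the scrub report can be posted without the filenames.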

But then I thought, OK this could be a huge file system and the
problem filename list is so big it's just effectively useless. Why are
they bad in the first place? If the filename non-compliance is bad
enough in XFS terms that it's becoming confused how to store the
filename, presumably we get EAGAIN or even something more useful? Or
heck maybe even the VFS should do these kinds of rejections, if
they're bad enough.

*shrug*


-- 
Chris Murphy



Thread overview: 14+ messages
2018-02-02 21:36 xfs_scrub: call for testing Eric Sandeen
2018-02-02 21:51 ` Darrick J. Wong
2018-02-05 15:10 ` Emmanuel Florac
2018-02-05 15:49   ` Eric Sandeen
2018-02-05 16:44     ` Darrick J. Wong
2018-02-05 16:55       ` Eric Sandeen
2018-02-05 22:40         ` Darrick J. Wong
2018-02-05 17:08     ` Emmanuel Florac
2018-02-05 22:39       ` Darrick J. Wong
2018-02-15 18:18 ` Emmanuel Florac
2018-04-02  0:10 ` Chris Murphy
2018-04-02  2:01   ` Eric Sandeen
2018-04-02  4:23     ` Chris Murphy
2018-04-02  2:44   ` Darrick J. Wong
