* Metadata corruption detected at xfs_inode_buf_verify
@ 2017-04-06  0:36 Christian Kujau
  2017-04-06  8:08 ` Carlos Maiolino
  0 siblings, 1 reply; 8+ messages in thread
From: Christian Kujau @ 2017-04-06  0:36 UTC (permalink / raw)
  To: linux-xfs

Hi,

I have a raspberrypi and try to use that as a small file server. After
trying to delete a ~1TB directory, it locked up and was then power cycled.
Now I cannot mount the XFS partition any more:

===============================================
$ mount -t xfs /dev/mapper/owc1 /mnt/disk
mount: mount /dev/mapper/owc1 on /mnt/disk failed: Structure needs cleaning

   [producing lots of messages in dmesg, see below]


$ xfs_repair -v /dev/mapper/owc1
Phase 1 - find and verify superblock...
        - reporting progress in intervals of 15 minutes
        - block cache size set to 26024 entries
Phase 2 - using internal log
        - zero log...
zero_log: head block 3455547 tail block 3450129
ERROR: The filesystem has valuable metadata changes in a log which needs 
[...]
===============================================

Mounting with -o ro,norecovery works, and the directory I wanted to remove 
is still there, in parts - only about 200 GB have been deleted.

Full script log and .config: http://nerdbynature.de/bits/4.11.0-rc5/xfs/

Any ideas on how to solve this?

Thanks,
Christian.
-- 
BOFH excuse #431:

Borg implants are failing


* Re: Metadata corruption detected at xfs_inode_buf_verify
  2017-04-06  0:36 Metadata corruption detected at xfs_inode_buf_verify Christian Kujau
@ 2017-04-06  8:08 ` Carlos Maiolino
  2017-04-06 17:50   ` Christian Kujau
  0 siblings, 1 reply; 8+ messages in thread
From: Carlos Maiolino @ 2017-04-06  8:08 UTC (permalink / raw)
  To: Christian Kujau; +Cc: linux-xfs

On Wed, Apr 05, 2017 at 05:36:13PM -0700, Christian Kujau wrote:
> Hi,
> 
> I have a raspberrypi and try to use that as a small file server. After
> trying to delete a ~1TB directory, it locked up and was then power cycled.
> Now I cannot mount the XFS partition any more:
> 

The Rpi will usually power cycle when it overheats, and possibly when an NMI is
triggered, since (I suspect) the Rpi has no support for NMIs, so the CPU will just
reset.
If you got a soft lockup, your Rpi might have been power cycled due to overheating,
or due to an unsupported NMI if you got a hard lockup.

"Removing a 1TB directory" doesn't say much: you can have a single 1TB file, which
will be quite fast to delete, or you can have 1 million 1MiB files, which will
require extra processing (and maybe your Rpi couldn't handle that?)


> ===============================================
> $ mount -t xfs /dev/mapper/owc1 /mnt/disk
> mount: mount /dev/mapper/owc1 on /mnt/disk failed: Structure needs cleaning
>
You couldn't mount the FS, so you couldn't replay whatever was in the XFS
journal.
 
>    [producing lots of messages in dmesg, see below]
> 
> 
> $ xfs_repair -v /dev/mapper/owc1
> Phase 1 - find and verify superblock...
>         - reporting progress in intervals of 15 minutes
>         - block cache size set to 26024 entries
> Phase 2 - using internal log
>         - zero log...
> zero_log: head block 3455547 tail block 3450129
> ERROR: The filesystem has valuable metadata changes in a log which needs 
> [...]
> ===============================================
>
 
> Mounting with -o ro,norecovery works, and the directory I wanted to remove 
> is still there, in parts - only about 200 GB have been deleted.
> 
Some of the remaining changes might have been logged, but since you couldn't
replay the log, you won't see those files as deleted.

> Full script log and .config: http://nerdbynature.de/bits/4.11.0-rc5/xfs/
> 
> Any ideas on how to solve this?
> 
You will need to run xfs_repair on the filesystem, zeroing the log; then you will
be able to remount it RW and continue with your operations.

See xfs_repair -L option.
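
The order of operations suggested here, together with the read-only mount mentioned earlier in the thread, can be sketched as a checklist. This is only a sketch, not an official procedure: the function name `xfs_recovery_steps` is made up for illustration, the device and mount point are the ones from this thread, and the function merely prints the commands so that nothing is modified by accident. The `xfs_repair -n` (no-modify) preview step is an addition that nobody in the thread mentioned.

```shell
#!/bin/sh
# Hypothetical helper: prints the recovery steps in order, runs nothing.
xfs_recovery_steps() {
    dev=$1
    mnt=$2
    # 1. A normal mount replays the log; if this works, no repair is needed.
    echo "mount -t xfs $dev $mnt"
    # 2. If it fails, mount read-only without log recovery to copy data off.
    echo "mount -t xfs -o ro,norecovery $dev $mnt"
    echo "umount $mnt"
    # 3. Preview what xfs_repair would change (no-modify mode).
    echo "xfs_repair -n $dev"
    # 4. Last resort: zero the log and repair; unreplayed changes are lost.
    echo "xfs_repair -L $dev"
    # 5. Mount read-write again.
    echo "mount -t xfs $dev $mnt"
}

xfs_recovery_steps /dev/mapper/owc1 /mnt/disk
```

Step 4 discards whatever was still in the journal, which is why xfs_repair's own error message asks for a mount attempt first.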


> Thanks,
> Christian.
> -- 
> BOFH excuse #431:
> 

-- 
Carlos


* Re: Metadata corruption detected at xfs_inode_buf_verify
  2017-04-06  8:08 ` Carlos Maiolino
@ 2017-04-06 17:50   ` Christian Kujau
  0 siblings, 0 replies; 8+ messages in thread
From: Christian Kujau @ 2017-04-06 17:50 UTC (permalink / raw)
  To: Carlos Maiolino; +Cc: linux-xfs

On Thu, 6 Apr 2017, Carlos Maiolino wrote:
> > I have a raspberrypi and try to use that as a small file server. After
> > trying to delete a ~1TB directory, it locked up and was then power cycled.
> > Now I cannot mount the XFS partition any more:
> > 
> 
> The Rpi will usually power cycle when it overheats, and possibly when an NMI is

Sorry, I should have been more clear: the Rpi locked up during "rm"[0] and 
then I power-cycled the box. And then I noticed the xfs trouble.

> "Removing a 1TB directory" doesn't say much: you can have a single 1TB file, which
> will be quite fast to delete, or you can have 1 million 1MiB files, which will
> require extra processing (and maybe your Rpi couldn't handle that?)

It was a 1 TB directory with lots of small and semi-large files, and yes, 
that may have been too much for this Rpi. I spoke to Eric Sandeen on IRC 
and he suspected the same.

> See xfs_repair -L option.

Yes, that's what I did in the end :-\ After the fsck was finished, the 1 
TB directory was still there (the "rm" managed to remove ~300 GB or so) 
and I removed the rest on a larger machine, so it's all good for now.

Thanks for replying,
Christian.

[0] http://nerdbynature.de/bits/4.11.0-rc5/xfs/kern_log.txt

-- 
BOFH excuse #152:

My pony-tail hit the on/off switch on the power strip.


* Re: Metadata corruption detected at xfs_inode_buf_verify
  2017-07-17  7:47 ` Christian Kujau
  2017-07-17 17:48   ` Brian Foster
@ 2017-07-19  6:54   ` Christoph Hellwig
  1 sibling, 0 replies; 8+ messages in thread
From: Christoph Hellwig @ 2017-07-19  6:54 UTC (permalink / raw)
  To: Christian Kujau; +Cc: linux-xfs

On Mon, Jul 17, 2017 at 12:47:16AM -0700, Christian Kujau wrote:
>  | and I'm pretty sure that the USB mass storage interface/driver does not 
>  | pass through cache flushes or support FUA operations.

The usb-storage driver of course implements cache flushes and FUA
properly - it's just a SCSI transport after all.  But that doesn't mean
that the device itself can't have issues, of which there have been
plenty.
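
What the kernel believes about a disk's write cache and FUA support can be read from sysfs. The sketch below assumes a kernel recent enough to expose the `queue/write_cache` and `queue/fua` attributes; the function name `show_flush_support` and the optional second argument (an alternative sysfs root, handy for testing) are made up for illustration. Note that this only shows what the device advertises, not whether the enclosure firmware actually honors flushes.

```shell
#!/bin/sh
# Hypothetical helper: print the write-cache and FUA settings the kernel
# has recorded for a block device, e.g. "sda".
show_flush_support() {
    disk=$1
    sysroot=${2:-/sys}
    q="$sysroot/block/$disk/queue"
    for attr in write_cache fua; do
        if [ -r "$q/$attr" ]; then
            printf '%s %s: %s\n' "$disk" "$attr" "$(cat "$q/$attr")"
        else
            printf '%s %s: (not exposed by this kernel)\n' "$disk" "$attr"
        fi
    done
}

# Example:
#   show_flush_support sda
```

A `write_cache` value of "write back" means the kernel will send cache flushes to the device; "write through" means it assumes none are needed.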


* Re: Metadata corruption detected at xfs_inode_buf_verify
  2017-07-17 17:48   ` Brian Foster
@ 2017-07-17 18:44     ` Christian Kujau
  0 siblings, 0 replies; 8+ messages in thread
From: Christian Kujau @ 2017-07-17 18:44 UTC (permalink / raw)
  To: Brian Foster; +Cc: linux-xfs

On Mon, 17 Jul 2017, Brian Foster wrote:
> Hard to say for sure. I do wonder whether this is related to any of the
> issues that Emmanuel ran into in his recent post[1]. Care to post the
> metadump somewhere where it can be downloaded? Note that it usually can
> be compressed to save upload/download time.

OK, so I tried again and this time with the stock version of xfsprogs from 
Debian/stretch (v4.9.0) and it was able to repair the file system:

  http://nerdbynature.de/bits/4.12.0-rc7/screenlog_seagate.txt

When I moved the disk enclosure from my aarch64 RPI system (running 
current Arch Linux) to my x86-64 Debian/stretch box, I used the self 
compiled git checkout of xfsprogs, because the Debian version is always 
somewhat behind, of course.

> [1] http://www.spinics.net/lists/linux-xfs/msg08176.html

Well, in my case it succeeded in moving some data to lost+found, as the file
system did have enough free space.


In other news, the other disk I suspected as "bad" before isn't bad at 
all, turns out that one of the controller ports in this disk enclosure 
("ElitePro Dual USB") may have a problem, but on the working port both
disks can be read from start to finish and xfs_repair v4.9.0 was able to 
repair both file systems now :-)

Thanks for your input,
Christian.

PS: I'll send you the link to the metadump off-list, because:
 | xfs_metadump: Filesystem log is dirty; image will contain unobfuscated metadata in log.
-- 
BOFH excuse #29:

It works the way the Wang did, what's the problem


* Re: Metadata corruption detected at xfs_inode_buf_verify
  2017-07-17  7:47 ` Christian Kujau
@ 2017-07-17 17:48   ` Brian Foster
  2017-07-17 18:44     ` Christian Kujau
  2017-07-19  6:54   ` Christoph Hellwig
  1 sibling, 1 reply; 8+ messages in thread
From: Brian Foster @ 2017-07-17 17:48 UTC (permalink / raw)
  To: Christian Kujau; +Cc: linux-xfs

On Mon, Jul 17, 2017 at 12:47:16AM -0700, Christian Kujau wrote:
> On Sun, 16 Jul 2017, Christian Kujau wrote:
> > so, the disk enclosure attached to my RaspberryPi[0] had "some" kind of 
> > failure last night: one of the disks appears to have some kind of hardware 
> > problem, the other is fine, but the XFS file system cannot be mounted. 
> > Instead of using the RPI to try the repair, I attached the enclosure to an 
> > Intel i7 machine (16 GB RAM) and attempted to mount:
> 
> After a late night chat on #xfs, it seems that the corruption may have 
> happened due to a problem with the storage driver on this RPI and Dave
> commented:
> 
>  | and I'm pretty sure that the USB mass storage interface/driver does not 
>  | pass through cache flushes or support FUA operations.
>  | which would explain why it appears that the inode cluster 
>  | initialisation IO isn't on disk, and it wasn't replayed by log recovery
>  | before inode updates were recovered....
> 
> (Put here so that it can be found in the archives, in case this happens to 
> someone else)
> 
> So, that being said, the now corrupted XFS still cannot be repaired by 
> xfs_repair (compiled from a git checkout two days ago). The full 
> xfs_repair run can be found in the screenlog:
> 
>  http://nerdbynature.de/bits/4.12.0-rc7/screenlog_1.txt
>  The xfs_logprint and xfs_metadump outputs can be provided on request.
> 
> 
> Is this something xfs_repair should be able to fix or is the filesystem
> just too mangled in this case?
> 

Hard to say for sure. I do wonder whether this is related to any of the
issues that Emmanuel ran into in his recent post[1]. Care to post the
metadump somewhere where it can be downloaded? Note that it usually can
be compressed to save upload/download time.
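
A minimal sketch of compressing the image before upload, assuming the metadump has already been written to a file. The function name `compress_metadump` and the file name in the example are made up for illustration. Bear in mind that xfs_metadump warns when the log is dirty, because the image then contains unobfuscated metadata, so a private link may be preferable to a public one.

```shell
#!/bin/sh
# Hypothetical helper: compress a metadump image, keeping the original.
compress_metadump() {
    img=$1
    gzip -9 -c "$img" > "$img.gz" || return 1
    # Show both sizes so the saving is visible.
    ls -l "$img" "$img.gz"
}

# Example:
#   xfs_metadump /dev/mapper/owc1 metadump.img
#   compress_metadump metadump.img
```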

Brian

[1] http://www.spinics.net/lists/linux-xfs/msg08176.html

> Thanks,
> Christian.
> -- 
> BOFH excuse #431:
> 
> Borg implants are failing


* Re: Metadata corruption detected at xfs_inode_buf_verify
  2017-07-16 18:01 Christian Kujau
@ 2017-07-17  7:47 ` Christian Kujau
  2017-07-17 17:48   ` Brian Foster
  2017-07-19  6:54   ` Christoph Hellwig
  0 siblings, 2 replies; 8+ messages in thread
From: Christian Kujau @ 2017-07-17  7:47 UTC (permalink / raw)
  To: linux-xfs

On Sun, 16 Jul 2017, Christian Kujau wrote:
> so, the disk enclosure attached to my RaspberryPi[0] had "some" kind of 
> failure last night: one of the disks appears to have some kind of hardware 
> problem, the other is fine, but the XFS file system cannot be mounted. 
> Instead of using the RPI to try the repair, I attached the enclosure to an 
> Intel i7 machine (16 GB RAM) and attempted to mount:

After a late night chat on #xfs, it seems that the corruption may have 
happened due to a problem with the storage driver on this RPI and Dave
commented:

 | and I'm pretty sure that the USB mass storage interface/driver does not 
 | pass through cache flushes or support FUA operations.
 | which would explain why it appears that the inode cluster 
 | initialisation IO isn't on disk, and it wasn't replayed by log recovery
 | before inode updates were recovered....

(Put here so that it can be found in the archives, in case this happens to 
someone else)

So, that being said, the now corrupted XFS still cannot be repaired by 
xfs_repair (compiled from a git checkout two days ago). The full 
xfs_repair run can be found in the screenlog:

 http://nerdbynature.de/bits/4.12.0-rc7/screenlog_1.txt
 The xfs_logprint and xfs_metadump outputs can be provided on request.


Is this something xfs_repair should be able to fix or is the filesystem
just too mangled in this case?

Thanks,
Christian.
-- 
BOFH excuse #431:

Borg implants are failing


* Metadata corruption detected at xfs_inode_buf_verify
@ 2017-07-16 18:01 Christian Kujau
  2017-07-17  7:47 ` Christian Kujau
  0 siblings, 1 reply; 8+ messages in thread
From: Christian Kujau @ 2017-07-16 18:01 UTC (permalink / raw)
  To: linux-xfs

Hi,

so, the disk enclosure attached to my RaspberryPi[0] had "some" kind of 
failure last night: one of the disks appears to have some kind of hardware 
problem, the other is fine, but the XFS file system cannot be mounted. 
Instead of using the RPI to try the repair, I attached the enclosure to an 
Intel i7 machine (16 GB RAM) and attempted to mount:

====================================================
# uname -r
4.9.0-3-amd64

# mount -t xfs /dev/mapper/owc1 /mnt/
mount: mount /dev/mapper/owc1 on /mnt/ failed: Structure needs cleaning

# dmesg
XFS (dm-1): Mounting V5 Filesystem
XFS (dm-1): Starting recovery (logdev: internal)
XFS (dm-1): Metadata corruption detected at xfs_inode_buf_verify+0x6e/0xf0 [xfs], xfs_inode block 0x5a48d610
XFS (dm-1): Unmount and run xfs_repair 
XFS (dm-1): First 64 bytes of corrupted metadata buffer:
ffff9cb8cc428000: dc 70 f3 22 07 71 ab 49 6c a6 5c 23 c9 b1 31 37  .p.".q.Il.\#..17
ffff9cb8cc428010: 3f db 62 33 54 87 4d 7d 1e 09 cc 4b fb 2c b0 22  ?.b3T.M}...K.,."
ffff9cb8cc428020: a9 54 91 1a 41 40 fe e1 16 7e 82 e1 56 b4 a8 9a  .T..A@...~..V...
ffff9cb8cc428030: 29 67 de c0 75 01 75 77 3a 1b af 5a 60 1c 4c c7  )g..u.uw:..Z`.L.
XFS (dm-1): Metadata corruption detected at xfs_inode_buf_verify+0x6e/0xf0 [xfs], xfs_inode block 0x5a48d610
XFS (dm-1): Unmount and run xfs_repair 
XFS (dm-1): First 64 bytes of corrupted metadata buffer:
[...]
====================================================

But it cannot be repaired either:

====================================================
# xfs_repair -V
xfs_repair version 4.11.0

# time xfs_repair -v /dev/mapper/owc1; echo $?
Phase 1 - find and verify superblock...
        - reporting progress in intervals of 15 minutes
        - block cache size set to 1425416 entries
Phase 2 - using internal log
        - zero log...
zero_log: head block 3282156 tail block 3268500
ERROR: The filesystem has valuable metadata changes in a log which needs 
to be replayed.  Mount the filesystem to replay the log, and unmount it 
before re-running xfs_repair.  If you are unable to mount the filesystem, 
then use the -L option to destroy the log and attempt a repair.
Note that destroying the log may cause corruption -- please attempt a 
mount of the filesystem before doing this.

real    0m21.893s
user    0m0.004s
sys     0m0.008s
2
====================================================
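
xfs_repair stops here because the log head and tail differ, which means the log still holds changes that have not been replayed. As a rough illustration, the distance between the two can be computed from the zero_log line. This is a sketch under one assumption, namely that the log has not wrapped around (head is larger than tail); the function name `log_gap` is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: blocks between log tail and head, assuming no wrap.
log_gap() {
    head_blk=$1
    tail_blk=$2
    echo $((head_blk - tail_blk))
}

log_gap 3282156 3268500   # values from the zero_log line above
```

For this run the result is 13656 log blocks that -L would throw away.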


I saved the outputs of xfs_logprint, xfs_logprint -t and xfs_metadump and 
then used the -L option to attempt the repair, but this failed (after 
Phase 6) with:


====================================================
# time xfs_repair -v /dev/mapper/owc1; echo $?
[...]
entry ".." in directory inode 4574814913 points to free inode 268568894, 
marking entry to be junked
bad hash table for directory inode 4574814913 (no data entry): rebuilding
rebuilding directory inode 4574814913
Invalid inode number 0x0
xfs_dir_ino_validate: XFS_ERROR_REPORT

fatal error -- couldn't map inode 4574814915, err = 117

real    37m25.619s
user    0m41.780s
sys     0m14.852s
1
====================================================
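
Saving the diagnostics before running -L, as described above, can be sketched as follows. The function name `save_xfs_diagnostics` and the output file names are made up for illustration; the three commands are the ones named in this message, and the filesystem is assumed to be unmounted.

```shell
#!/bin/sh
# Hypothetical helper: save log and metadata dumps before zeroing the log.
save_xfs_diagnostics() {
    dev=$1
    outdir=$2
    mkdir -p "$outdir" || return 1
    # Raw log contents, then the transactional view of the log.
    xfs_logprint "$dev" > "$outdir/logprint.txt"
    xfs_logprint -t "$dev" > "$outdir/logprint-t.txt"
    # Metadata-only image of the filesystem (no file contents).
    xfs_metadump "$dev" "$outdir/metadump.img"
}

# Example (run as root):
#   save_xfs_diagnostics /dev/mapper/owc1 /root/xfs-diag
```

Once the log has been zeroed these outputs cannot be regenerated, so they are worth keeping until the repair is known to be good.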

Full dmesg and command outputs: http://nerdbynature.de/bits/4.12.0-rc7/

The logprint and metadump outputs can be provided on request.

Any ideas on how to tackle this?

Thanks,
Christian.

[0] https://www.spinics.net/lists/linux-xfs/msg05618.html
-- 
BOFH excuse #183:

filesystem not big enough for Jumbo Kernel Patch


end of thread, other threads:[~2017-07-19  6:54 UTC | newest]

Thread overview: 8+ messages
2017-04-06  0:36 Metadata corruption detected at xfs_inode_buf_verify Christian Kujau
2017-04-06  8:08 ` Carlos Maiolino
2017-04-06 17:50   ` Christian Kujau
2017-07-16 18:01 Christian Kujau
2017-07-17  7:47 ` Christian Kujau
2017-07-17 17:48   ` Brian Foster
2017-07-17 18:44     ` Christian Kujau
2017-07-19  6:54   ` Christoph Hellwig
