* csum failed on nonexistent inode
@ 2016-04-04  7:50 Jérôme Poulin
  2016-04-04  9:42 ` Henk Slager
  2016-04-04 20:17 ` Kai Krakow
From: Jérôme Poulin @ 2016-04-04  7:50 UTC (permalink / raw)
  To: linux-btrfs

Hi all,

I have a BTRFS filesystem on disks running RAID10 for both metadata
and data. One of the disks has been going bad, and scrub was showing
18 uncorrectable errors (which is weird in RAID10). I tried using
--repair-sector with hdparm, even though it shouldn't be necessary
since BTRFS would overwrite the sector. The repair fixed the sector in
SMART, but BTRFS was still showing 18 uncorrectable errors.
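
(For reference, the hdparm sequence was roughly the following. Note
that <sector> here is the raw-disk sector, which, because of the
LUKS/LVM layers, is not the same number as the dm-level sector in the
dmesg output below; /dev/sdX stands in for the real disk.)

# hdparm --read-sector <sector> /dev/sdX
(fails with an I/O error on the bad sector and logs a SMART entry)
# hdparm --yes-i-know-what-i-am-doing --repair-sector <sector> /dev/sdX
(zero-fills the sector; hdparm insists on the confirmation flag)
# hdparm --read-sector <sector> /dev/sdX
(now returns the zeroed sector)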

I finally decided to give up on this opportunity to test the
error-correction properties of BTRFS (this is a home system, and
backed up) and installed a brand-new disk in the machine. After
running btrfs replace, everything seemed fine, so I decided to run
btrfs scrub again, and I still got the same 18 uncorrectable errors.
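
(The replace itself was along these lines; the device paths are
placeholders for my actual LUKS mappings:)

# btrfs replace start /dev/mapper/luksbtrfsdataOLD /dev/mapper/luksbtrfsdataNEW /mnt/btrfs
# btrfs replace status /mnt/btrfs
(replace runs in the background; status reports progress)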

Later on, since the new disk had more space, I decided to run a
balance to free up the new space, but the balance stopped with csum
errors too. Here is the output of several programs.

How can I get rid of csum errors that reference an inode which does
not exist? Also, the expected checksum looks suspiciously the same for
multiple errors. Could it be bad RAM in that case? Can I convince
BTRFS to update the csum?

# btrfs inspect-internal logical-resolve -v 1809149952 /mnt/btrfs/
ioctl ret=-1, error: No such file or directory
# btrfs inspect-internal inode-resolve -v 296 /mnt/btrfs/
ioctl ret=-1, error: No such file or directory


dmesg after first bad sector:
avr 01 18:29:52 p4.i.ticpu.net kernel: BTRFS info (device dm-43): read
error corrected: ino 1 off 655368716288 (dev /dev/dm-42 sector
2939136)
avr 01 18:29:52 p4.i.ticpu.net kernel: BTRFS info (device dm-43): read
error corrected: ino 1 off 655368720384 (dev /dev/dm-42 sector
2939144)
avr 01 18:29:52 p4.i.ticpu.net kernel: BTRFS info (device dm-43): read
error corrected: ino 1 off 655368724480 (dev /dev/dm-42 sector
2939152)
avr 01 18:29:52 p4.i.ticpu.net kernel: BTRFS info (device dm-43): read
error corrected: ino 1 off 655368728576 (dev /dev/dm-42 sector
2939160)

dmesg after balance:
[1738474.444648] BTRFS warning (device dm-40): csum failed ino 296 off
1809195008 csum 1515428513 expected csum 2566472073
[1738474.444649] BTRFS warning (device dm-40): csum failed ino 296 off
1809084416 csum 4147641019 expected csum 1755301217
[1738474.444702] BTRFS warning (device dm-40): csum failed ino 296 off
1809199104 csum 1927504681 expected csum 2566472073
[1738474.444717] BTRFS warning (device dm-40): csum failed ino 296 off
1809211392 csum 3086571080 expected csum 2566472073
[1738474.444917] BTRFS warning (device dm-40): csum failed ino 296 off
1809084416 csum 4147641019 expected csum 1755301217
[1738474.444962] BTRFS warning (device dm-40): csum failed ino 296 off
1809195008 csum 1515428513 expected csum 2566472073
[1738474.444998] BTRFS warning (device dm-40): csum failed ino 296 off
1809199104 csum 1927504681 expected csum 2566472073
[1738474.445034] BTRFS warning (device dm-40): csum failed ino 296 off
1809211392 csum 3086571080 expected csum 2566472073
[1738474.473286] BTRFS warning (device dm-40): csum failed ino 296 off
1809149952 csum 3254083717 expected csum 2566472073
[1738474.473357] BTRFS warning (device dm-40): csum failed ino 296 off
1809162240 csum 3157020538 expected csum 2566472073

btrfs check:
./btrfs check /dev/mapper/luksbtrfsdata2
Checking filesystem on /dev/mapper/luksbtrfsdata2
UUID: 805f6ad7-1188-448d-aee4-8ddeeb70c8a7
checking extents
bad metadata [1453741768704, 1453741785088) crossing stripe boundary
bad metadata [1454487764992, 1454487781376) crossing stripe boundary
bad metadata [1454828552192, 1454828568576) crossing stripe boundary
bad metadata [1454879735808, 1454879752192) crossing stripe boundary
bad metadata [1455087222784, 1455087239168) crossing stripe boundary
bad metadata [1456269426688, 1456269443072) crossing stripe boundary
bad metadata [1456273227776, 1456273244160) crossing stripe boundary
bad metadata [1456404234240, 1456404250624) crossing stripe boundary
bad metadata [1456418914304, 1456418930688) crossing stripe boundary
checking free space cache
checking fs roots
checking csums
checking root refs
found 689292505473 bytes used err is 0
total csum bytes: 660112536
total tree bytes: 1764098048
total fs tree bytes: 961921024
total extent tree bytes: 79331328
btree space waste bytes: 232774315
file data blocks allocated: 4148513517568
 referenced 972284129280

btrfs scrub:
I don't have the output handy, but the dmesg output showed pairs of
logical blocks, as with balance, and no errors were corrected.
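
(The scrub was started and its per-device state read back with
something like the following; -B keeps scrub in the foreground and -d
prints per-device statistics:)

# btrfs scrub start -Bd /mnt/btrfs
# btrfs scrub status -d /mnt/btrfs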


* Re: csum failed on nonexistent inode
  2016-04-04  7:50 csum failed on nonexistent inode Jérôme Poulin
@ 2016-04-04  9:42 ` Henk Slager
  2016-04-10 15:34   ` Jérôme Poulin
  2016-04-04 20:17 ` Kai Krakow
From: Henk Slager @ 2016-04-04  9:42 UTC (permalink / raw)
  To: Jérôme Poulin; +Cc: linux-btrfs

On Mon, Apr 4, 2016 at 9:50 AM, Jérôme Poulin <jeromepoulin@gmail.com> wrote:
> Hi all,
>
> I have a BTRFS filesystem on disks running RAID10 for both metadata
> and data. One of the disks has been going bad, and scrub was showing
> 18 uncorrectable errors (which is weird in RAID10). I tried using
> --repair-sector with hdparm, even though it shouldn't be necessary
> since BTRFS would overwrite the sector. The repair fixed the sector in
> SMART, but BTRFS was still showing 18 uncorrectable errors.
>
> I finally decided to give up on this opportunity to test the
> error-correction properties of BTRFS (this is a home system, and
> backed up) and installed a brand-new disk in the machine. After
> running btrfs replace, everything seemed fine, so I decided to run
> btrfs scrub again, and I still got the same 18 uncorrectable errors.

You might want this patch:
http://www.spinics.net/lists/linux-btrfs/msg53552.html

As a workaround, you can reset the counters on the new/healthy device with:

btrfs device stats [-z] <path>|<device>
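
For example (the mount point is a placeholder):

# btrfs device stats /mnt/btrfs
(prints the per-device error counters)
# btrfs device stats -z /mnt/btrfs
(prints them and resets them to zero)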

> Later on, since the new disk had more space, I decided to run a
> balance to free up the new space, but the balance stopped with csum
> errors too. Here is the output of several programs.
>
> How can I get rid of csum errors that reference an inode which does
> not exist? Also, the expected checksum looks suspiciously the same for
> multiple errors. Could it be bad RAM in that case? Can I convince
> BTRFS to update the csum?
>
> # btrfs inspect-internal logical-resolve -v 1809149952 /mnt/btrfs/
> ioctl ret=-1, error: No such file or directory
> # btrfs inspect-internal inode-resolve -v 296 /mnt/btrfs/
> ioctl ret=-1, error: No such file or directory
>
>
> dmesg after first bad sector:
> avr 01 18:29:52 p4.i.ticpu.net kernel: BTRFS info (device dm-43): read
> error corrected: ino 1 off 655368716288 (dev /dev/dm-42 sector
> 2939136)
> avr 01 18:29:52 p4.i.ticpu.net kernel: BTRFS info (device dm-43): read
> error corrected: ino 1 off 655368720384 (dev /dev/dm-42 sector
> 2939144)
> avr 01 18:29:52 p4.i.ticpu.net kernel: BTRFS info (device dm-43): read
> error corrected: ino 1 off 655368724480 (dev /dev/dm-42 sector
> 2939152)
> avr 01 18:29:52 p4.i.ticpu.net kernel: BTRFS info (device dm-43): read
> error corrected: ino 1 off 655368728576 (dev /dev/dm-42 sector
> 2939160)
>
> dmesg after balance:
> [1738474.444648] BTRFS warning (device dm-40): csum failed ino 296 off
> 1809195008 csum 1515428513 expected csum 2566472073
> [1738474.444649] BTRFS warning (device dm-40): csum failed ino 296 off
> 1809084416 csum 4147641019 expected csum 1755301217
> [1738474.444702] BTRFS warning (device dm-40): csum failed ino 296 off
> 1809199104 csum 1927504681 expected csum 2566472073
> [1738474.444717] BTRFS warning (device dm-40): csum failed ino 296 off
> 1809211392 csum 3086571080 expected csum 2566472073
> [1738474.444917] BTRFS warning (device dm-40): csum failed ino 296 off
> 1809084416 csum 4147641019 expected csum 1755301217
> [1738474.444962] BTRFS warning (device dm-40): csum failed ino 296 off
> 1809195008 csum 1515428513 expected csum 2566472073
> [1738474.444998] BTRFS warning (device dm-40): csum failed ino 296 off
> 1809199104 csum 1927504681 expected csum 2566472073
> [1738474.445034] BTRFS warning (device dm-40): csum failed ino 296 off
> 1809211392 csum 3086571080 expected csum 2566472073
> [1738474.473286] BTRFS warning (device dm-40): csum failed ino 296 off
> 1809149952 csum 3254083717 expected csum 2566472073
> [1738474.473357] BTRFS warning (device dm-40): csum failed ino 296 off
> 1809162240 csum 3157020538 expected csum 2566472073
>
> btrfs check:
> ./btrfs check /dev/mapper/luksbtrfsdata2
> Checking filesystem on /dev/mapper/luksbtrfsdata2
> UUID: 805f6ad7-1188-448d-aee4-8ddeeb70c8a7
> checking extents
> bad metadata [1453741768704, 1453741785088) crossing stripe boundary
> bad metadata [1454487764992, 1454487781376) crossing stripe boundary
> bad metadata [1454828552192, 1454828568576) crossing stripe boundary
> bad metadata [1454879735808, 1454879752192) crossing stripe boundary
> bad metadata [1455087222784, 1455087239168) crossing stripe boundary
> bad metadata [1456269426688, 1456269443072) crossing stripe boundary
> bad metadata [1456273227776, 1456273244160) crossing stripe boundary
> bad metadata [1456404234240, 1456404250624) crossing stripe boundary
> bad metadata [1456418914304, 1456418930688) crossing stripe boundary

Those are false alerts; this patch handles them:
https://patchwork.kernel.org/patch/8706891/

> checking free space cache
> checking fs roots
> checking csums
> checking root refs
> found 689292505473 bytes used err is 0
> total csum bytes: 660112536
> total tree bytes: 1764098048
> total fs tree bytes: 961921024
> total extent tree bytes: 79331328
> btree space waste bytes: 232774315
> file data blocks allocated: 4148513517568
>  referenced 972284129280
>
> btrfs scrub:
> I don't have the output handy, but the dmesg output showed pairs of
> logical blocks, as with balance, and no errors were corrected.


* Re: csum failed on nonexistent inode
  2016-04-04  7:50 csum failed on nonexistent inode Jérôme Poulin
  2016-04-04  9:42 ` Henk Slager
@ 2016-04-04 20:17 ` Kai Krakow
  2016-04-08  4:25   ` Jérôme Poulin
From: Kai Krakow @ 2016-04-04 20:17 UTC (permalink / raw)
  To: linux-btrfs

On Mon, 4 Apr 2016 03:50:54 -0400,
Jérôme Poulin <jeromepoulin@gmail.com> wrote:

> How can I get rid of csum errors that reference an inode which does
> not exist? Also, the expected checksum looks suspiciously the same for
> multiple errors. Could it be bad RAM in that case? Can I convince
> BTRFS to update the csum?
> 
> # btrfs inspect-internal logical-resolve -v 1809149952 /mnt/btrfs/
> ioctl ret=-1, error: No such file or directory
> # btrfs inspect-internal inode-resolve -v 296 /mnt/btrfs/
> ioctl ret=-1, error: No such file or directory

I fell into that pitfall, too. If you have multiple subvolumes, you
need to pass the correct subvolume path for the inode to properly
resolve.

Maybe that's the case for you?

First, take a look at what "btrfs subvol list /mnt/btrfs" shows you.
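
Then something along these lines (the inode number is from your mail;
the subvolume path is whatever the list shows):

# btrfs subvolume list /mnt/btrfs
(note the subvolume paths in the last column)
# btrfs inspect-internal inode-resolve -v 296 /mnt/btrfs/<subvolume-path>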

-- 
Regards,
Kai

Replies to list-only preferred.




* Re: csum failed on nonexistent inode
  2016-04-04 20:17 ` Kai Krakow
@ 2016-04-08  4:25   ` Jérôme Poulin
From: Jérôme Poulin @ 2016-04-08  4:25 UTC (permalink / raw)
  To: Kai Krakow; +Cc: linux-btrfs

On Mon, Apr 4, 2016 at 4:17 PM, Kai Krakow <hurikhan77@gmail.com> wrote:
>
> On Mon, 4 Apr 2016 03:50:54 -0400,
> Jérôme Poulin <jeromepoulin@gmail.com> wrote:
>
> > How can I get rid of csum errors that reference an inode which does
> > not exist? Also, the expected checksum looks suspiciously the same for
> > multiple errors. Could it be bad RAM in that case? Can I convince
> > BTRFS to update the csum?
> >
> > # btrfs inspect-internal logical-resolve -v 1809149952 /mnt/btrfs/
> > ioctl ret=-1, error: No such file or directory
> > # btrfs inspect-internal inode-resolve -v 296 /mnt/btrfs/
> > ioctl ret=-1, error: No such file or directory
>
> I fell into that pitfall, too. If you have multiple subvolumes, you
> need to pass the correct subvolume path for the inode to properly
> resolve.
>
> Maybe that's the case for you?
>

You are absolutely right for the inode case; however, that did not
help with logical-resolve.

The file found via the inode does not seem to be corrupted, though.

# btrfs sub li /mnt/btrfs/ | cut -d' ' -f9 | xargs -n1 btrfs inspect logical-resolve -v 1809149952
ioctl ret=-1, error: No such file or directory
ioctl ret=-1, error: No such file or directory
ioctl ret=-1, error: No such file or directory
ioctl ret=-1, error: No such file or directory
...

# btrfs sub li /mnt/btrfs/ | cut -d' ' -f9 | xargs -n1 btrfs inspect inode-resolve -v 296
ioctl ret=-1, error: No such file or directory
ioctl ret=0, bytes_left=4018, bytes_missing=0, cnt=1, missed=0
backups/runboy/data/www/dev/.virtualenv/lib/python3.4/_collections_abc.py
ioctl ret=0, bytes_left=4018, bytes_missing=0, cnt=1, missed=0
backups@2016-03-23-01-56/www/dev/.virtualenv/lib/python3.4/_collections_abc.py
ioctl ret=0, bytes_left=4018, bytes_missing=0, cnt=1, missed=0
backups@2016-03-23-02-04/www/dev/.virtualenv/lib/python3.4/_collections_abc.py
ioctl ret=0, bytes_left=4018, bytes_missing=0, cnt=1, missed=0
backups@2016-03-23-05-05/www/dev/.virtualenv/lib/python3.4/_collections_abc.py
ioctl ret=0, bytes_left=4018, bytes_missing=0, cnt=1, missed=0
...


* Re: csum failed on nonexistent inode
  2016-04-04  9:42 ` Henk Slager
@ 2016-04-10 15:34   ` Jérôme Poulin
  2016-04-10 17:25     ` Henk Slager
From: Jérôme Poulin @ 2016-04-10 15:34 UTC (permalink / raw)
  To: Henk Slager; +Cc: linux-btrfs

On Mon, Apr 4, 2016 at 5:42 AM, Henk Slager <eye1tm@gmail.com> wrote:
>
> You might want this patch:
> http://www.spinics.net/lists/linux-btrfs/msg53552.html
>
> As a workaround, you can reset the counters on the new/healthy device with:
>
> btrfs device stats [-z] <path>|<device>
>

I did reset the stats and launched another scrub. Still, since the
logical blocks are the same on both devices and the checksums differ,
it really seems like my problem was originally created when I booted
this computer with bad memory (maybe?). Could it be that the checksum
was saved to disk bad in the first place, and BTRFS doesn't want to
read the data back?

Is it possible to reset the checksum on those? I couldn't find what
file or metadata the blocks were pointing to.


> >
> > btrfs check:
> > ./btrfs check /dev/mapper/luksbtrfsdata2
> > Checking filesystem on /dev/mapper/luksbtrfsdata2
> > UUID: 805f6ad7-1188-448d-aee4-8ddeeb70c8a7
> > checking extents
> > bad metadata [1453741768704, 1453741785088) crossing stripe boundary
> > bad metadata [1454487764992, 1454487781376) crossing stripe boundary
> > bad metadata [1454828552192, 1454828568576) crossing stripe boundary
> > bad metadata [1454879735808, 1454879752192) crossing stripe boundary
> > bad metadata [1455087222784, 1455087239168) crossing stripe boundary
> > bad metadata [1456269426688, 1456269443072) crossing stripe boundary
> > bad metadata [1456273227776, 1456273244160) crossing stripe boundary
> > bad metadata [1456404234240, 1456404250624) crossing stripe boundary
> > bad metadata [1456418914304, 1456418930688) crossing stripe boundary
>
> Those are false alerts; this patch handles them:
> https://patchwork.kernel.org/patch/8706891/
>

Since those are false alerts, I'll just ignore them for now.


* Re: csum failed on nonexistent inode
  2016-04-10 15:34   ` Jérôme Poulin
@ 2016-04-10 17:25     ` Henk Slager
  2016-04-11  1:48       ` Jérôme Poulin
  2016-04-11  1:50       ` Jérôme Poulin
From: Henk Slager @ 2016-04-10 17:25 UTC (permalink / raw)
  To: Jérôme Poulin; +Cc: linux-btrfs

>> You might want this patch:
>> http://www.spinics.net/lists/linux-btrfs/msg53552.html
>>
>> As a workaround, you can reset the counters on the new/healthy device with:
>>
>> btrfs device stats [-z] <path>|<device>
>>
>
> I did reset the stats and launched another scrub. Still, since the
> logical blocks are the same on both devices and the checksums differ,
> it really seems like my problem was originally created when I booted
> this computer with bad memory (maybe?). Could it be that the checksum
> was saved to disk bad in the first place, and BTRFS doesn't want to
> read the data back?

It was not fully clear what the sequence of events was:
- HW problem
- btrfs SW problem
- 1st scrub
- the --repair-sector with hdparm
- 2nd scrub
- 3rd scrub?

There is also DM between the hard disk and btrfs, and I am not sure
whether the hdparm action repaired things or corrupted them further.

How do you know for sure that the contents of the 'logical blocks' are
the same on both devices?

If btrfs wants to read a disk block and its csum doesn't match, then
it is an I/O error, with the same effect as an uncorrected bad sector
in the old days. But in this case your (former/old) disk might still
be OK; as you suggest, it might be due to some other error (HW or SW)
that content and csum don't match. It is hard to trace back based on
the info in this email thread. It looks like replace just copied the
problem, and the problem now sits at the filesystem level.

> Is it possible to reset the checksum on those? I couldn't find what
> file or metadata the blocks were pointing to.

Could it be that they have been removed in the meantime?
You might need to run scrub again in order to try to find the problem
spots/files.

Fixing individual csums has been asked about before; I don't remember
whether anyone actually fixed them with their own extra scripts or
C code. A brute-force method is to recalculate and rewrite all csums:
btrfs check --init-csum-tree, as you probably know. But maybe you want
an rsync -c compare against backups first. Kernel/tools versions and
btrfs fi us output might also give some hints.
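
A rough sketch of that brute-force route (the fs must be unmounted for
check; device and backup paths are placeholders, and remember that
rewriting the csum tree makes any real data corruption undetectable
afterwards):

# umount /mnt/btrfs
# btrfs check --init-csum-tree /dev/mapper/luksbtrfsdata2
# mount /mnt/btrfs
# rsync -avcn /path/to/backup/ /mnt/btrfs/
(-c compares file contents by checksum, -n is a dry run that just
lists the files that differ)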


* Re: csum failed on nonexistent inode
  2016-04-10 17:25     ` Henk Slager
@ 2016-04-11  1:48       ` Jérôme Poulin
  2016-04-11 11:34         ` Henk Slager
  2016-04-11  1:50       ` Jérôme Poulin
From: Jérôme Poulin @ 2016-04-11  1:48 UTC (permalink / raw)
  To: Henk Slager; +Cc: linux-btrfs

Sorry for the confusion; allow me to clarify, and I will summarize
what I have learned, since I now understand that the corruption was
present before the disk went bad.

Note that this BTRFS filesystem was once on MD RAID5 on LVM on LUKS
before being moved in-place to BTRFS RAID10 directly on LVM on LUKS.
Balance worked fine at the time, though.

Also note that this computer was booted twice, for periods of about
30 minutes, with bad RAM before it was replaced.

I think my checksum errors were present, but unknown to me, before
the hardware disk failure. The bad memory might be the root cause of
this problem, but I can't be sure.


On Sun, Apr 10, 2016 at 1:25 PM, Henk Slager <eye1tm@gmail.com> wrote:
> It was not fully clear what the sequence of events were:
> - HW problem
> - btrfs SW problem
> - 1st scrub
> - the --repair-sector with hdparm
> - 2nd scrub
> - 3rd scrub?
>

1. Errors in dmesg and confirmation from smartd that hardware problems
were present.
2. Attempted to repair the sector using --repair-sector, which reset
the sector to zeroes.
3. Scrub detected errors and fixed some, but 18 were uncorrectable.
4. Disk was changed using btrfs replace. Corruption still present.
5. Balance attempted, but it aborts when encountering the first
uncorrectable error.
6. Attempted to locate the bad sector/inode, without success, leading
to another scrub with the same errors.
7. Attempted to reset stats and scrub again. Still getting the same errors.
8. New disk added and data profile converted from RAID10 to RAID1;
balance aborted on the first uncorrectable error (the convert command
is sketched just below).
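
For step 8, the conversion was of this form (the mount point is a
placeholder):

# btrfs balance start -dconvert=raid1 /mnt/btrfs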


> There is also DM between the hard disk and btrfs, and I am not sure
> whether the hdparm action repaired things or corrupted them further.
>

After using --repair-sector, I confirmed with --read-sector that the
sector had been reset to zeroes. I had also tried read-sector first,
which failed and added an entry to the SMART log. After repair-sector,
read-sector returned the zeroed sector.

> How do you know for sure that the contents of the 'logical blocks' are
> the same on both devices?
>

After a balance, here is what dmesg shows (complete warning output):
BTRFS warning (device dm-36): csum failed ino 330 off 1809084416 csum
4147641019 expected csum 1755301217
BTRFS warning (device dm-36): csum failed ino 330 off 1809195008 csum
1515428513 expected csum 2566472073
BTRFS warning (device dm-36): csum failed ino 330 off 1809199104 csum
1927504681 expected csum 2566472073
BTRFS warning (device dm-36): csum failed ino 330 off 1809211392 csum
3086571080 expected csum 2566472073
BTRFS warning (device dm-36): csum failed ino 330 off 1809149952 csum
3254083717 expected csum 2566472073
BTRFS warning (device dm-36): csum failed ino 330 off 1809162240 csum
3157020538 expected csum 2566472073
BTRFS warning (device dm-36): csum failed ino 330 off 1809166336 csum
1092724678 expected csum 2566472073
BTRFS warning (device dm-36): csum failed ino 330 off 1809178624 csum
4235459038 expected csum 2566472073
BTRFS warning (device dm-36): csum failed ino 330 off 1809182720 csum
1764946502 expected csum 2566472073
BTRFS warning (device dm-36): csum failed ino 330 off 1809084416 csum
4147641019 expected csum 1755301217


After a scrub (complete error output):
BTRFS error (device dm-36): bdev /dev/dm-32 errs: wr 0, rd 0, flush 0,
corrupt 1, gen 0
BTRFS error (device dm-36): bdev /dev/dm-32 errs: wr 0, rd 0, flush 0,
corrupt 2, gen 0
BTRFS error (device dm-36): unable to fixup (regular) error at logical
1296334876672 on dev /dev/dm-32
BTRFS error (device dm-36): unable to fixup (regular) error at logical
1296334987264 on dev /dev/dm-32
BTRFS error (device dm-36): bdev /dev/dm-32 errs: wr 0, rd 0, flush 0,
corrupt 3, gen 0
BTRFS error (device dm-36): unable to fixup (regular) error at logical
1296334991360 on dev /dev/dm-32
BTRFS error (device dm-36): bdev /dev/dm-32 errs: wr 0, rd 0, flush 0,
corrupt 4, gen 0
BTRFS error (device dm-36): unable to fixup (regular) error at logical
1296335003648 on dev /dev/dm-32
BTRFS error (device dm-36): bdev /dev/dm-36 errs: wr 0, rd 0, flush 0,
corrupt 1, gen 0
BTRFS error (device dm-36): bdev /dev/dm-36 errs: wr 0, rd 0, flush 0,
corrupt 2, gen 0
BTRFS error (device dm-36): unable to fixup (regular) error at logical
1296334876672 on dev /dev/dm-36
BTRFS error (device dm-36): unable to fixup (regular) error at logical
1296334987264 on dev /dev/dm-36
BTRFS error (device dm-36): bdev /dev/dm-36 errs: wr 0, rd 0, flush 0,
corrupt 3, gen 0
BTRFS error (device dm-36): unable to fixup (regular) error at logical
1296334991360 on dev /dev/dm-36
BTRFS error (device dm-36): bdev /dev/dm-36 errs: wr 0, rd 0, flush 0,
corrupt 4, gen 0
BTRFS error (device dm-36): unable to fixup (regular) error at logical
1296335003648 on dev /dev/dm-36
BTRFS error (device dm-36): bdev /dev/dm-35 errs: wr 0, rd 0, flush 0,
corrupt 1, gen 0
BTRFS error (device dm-36): unable to fixup (regular) error at logical
1296334942208 on dev /dev/dm-35
BTRFS error (device dm-36): bdev /dev/dm-35 errs: wr 0, rd 0, flush 0,
corrupt 2, gen 0
BTRFS error (device dm-36): unable to fixup (regular) error at logical
1296334954496 on dev /dev/dm-35
BTRFS error (device dm-36): bdev /dev/dm-35 errs: wr 0, rd 0, flush 0,
corrupt 3, gen 0
BTRFS error (device dm-36): unable to fixup (regular) error at logical
1296334958592 on dev /dev/dm-35
BTRFS error (device dm-36): bdev /dev/dm-35 errs: wr 0, rd 0, flush 0,
corrupt 4, gen 0
BTRFS error (device dm-36): unable to fixup (regular) error at logical
1296334970880 on dev /dev/dm-35
BTRFS error (device dm-36): bdev /dev/dm-35 errs: wr 0, rd 0, flush 0,
corrupt 5, gen 0
BTRFS error (device dm-36): unable to fixup (regular) error at logical
1296334974976 on dev /dev/dm-35
BTRFS error (device dm-36): bdev /dev/dm-34 errs: wr 0, rd 0, flush 0,
corrupt 1, gen 0
BTRFS error (device dm-36): unable to fixup (regular) error at logical
1296334942208 on dev /dev/dm-34
BTRFS error (device dm-36): bdev /dev/dm-34 errs: wr 0, rd 0, flush 0,
corrupt 2, gen 0
BTRFS error (device dm-36): unable to fixup (regular) error at logical
1296334954496 on dev /dev/dm-34
BTRFS error (device dm-36): bdev /dev/dm-34 errs: wr 0, rd 0, flush 0,
corrupt 3, gen 0
BTRFS error (device dm-36): unable to fixup (regular) error at logical
1296334958592 on dev /dev/dm-34
BTRFS error (device dm-36): bdev /dev/dm-34 errs: wr 0, rd 0, flush 0,
corrupt 4, gen 0
BTRFS error (device dm-36): unable to fixup (regular) error at logical
1296334970880 on dev /dev/dm-34
BTRFS error (device dm-36): bdev /dev/dm-34 errs: wr 0, rd 0, flush 0,
corrupt 5, gen 0
BTRFS error (device dm-36): unable to fixup (regular) error at logical
1296334974976 on dev /dev/dm-34

device stats:
[/dev/mapper/luksbtrfsdata1 /dev/dm-32].corruption_errs 4
[/dev/mapper/luksbtrfsdata6 /dev/dm-36].corruption_errs 4
[/dev/mapper/luksbtrfsdata3 /dev/dm-34].corruption_errs 5
[/dev/mapper/luksbtrfsdata2 /dev/dm-33].corruption_errs 0
[/dev/mapper/luksbtrfsdata5 /dev/dm-35].corruption_errs 5
[/dev/mapper/luksbtrfsdata7 /dev/dm-48].corruption_errs 0



If we combine everything, we notice that...
* dm-32 and dm-36 have the same number of uncorrectable errors.
* dm-34 and dm-35 have the same number of uncorrectable errors.
* Scrub output is not helpful for identifying checksum errors; balance
output is not useful for identifying the physical device.
* Scrub output confirms where the errors are, and each logical sector
appears twice on different devices.
* Balance output also shows each offset twice with VERY suspicious
expected checksums.

A wild guess would be that memory corruption caused the checksums to
be incorrectly written to disk.


> If btrfs wants to read a disk block and its csum doesn't match, then
> it is an I/O error, with the same effect as an uncorrected bad sector
> in the old days. But in this case your (former/old) disk might still
> be OK; as you suggest, it might be due to some other error (HW or SW)
> that content and csum don't match. It is hard to trace back based on
> the info in this email thread. It looks like replace just copied the
> problem, and the problem now sits at the filesystem level.
>

It seems like btrfs replace did indeed just copy the problem as-is,
which is good since I could not have removed the old defective disk
otherwise.

>> Is it possible to reset the checksum on those? I couldn't find what
>> file or metadata the blocks were pointing to.
>
> Could it be that they have been removed in the meantime?
> You might need to run scrub again in order to try to find the problem
> spots/files.
>

Scrub / inspect-internal didn't help me find the file or metadata,
even with crazy commands like:
btrfs sub li /mnt/btrfs/ | cut -d' ' -f9 | xargs -n1 btrfs inspect logical-resolve -v 1296334991360

I md5sum'ed every file in the output: no known problems, no I/O
errors.

> Fixing individual csums has been asked about before; I don't remember
> whether anyone actually fixed them with their own extra scripts or
> C code. A brute-force method is to recalculate and rewrite all csums:
> btrfs check --init-csum-tree, as you probably know. But maybe you want
> an rsync -c compare against backups first. Kernel/tools versions and
> btrfs fi us output might also give some hints.

I thought about using init-csum-tree, but you are right: that wouldn't
let me identify the problem or which files/metadata are affected.

Here is the requested output:

btrfs fi us /mnt/btrfs/
Overall:
    Device size:           6.32TiB
    Device allocated:           1.28TiB
    Device unallocated:           5.04TiB
    Device missing:             0.00B
    Used:               1.27TiB
    Free (estimated):           2.52TiB    (min: 2.52TiB)
    Data ratio:                  2.00
    Metadata ratio:              2.00
    Global reserve:         512.00MiB    (used: 0.00B)

Data,RAID1: Size:76.00GiB, Used:74.13GiB
   /dev/dm-32      52.00GiB
   /dev/dm-36      24.00GiB
   /dev/dm-48      76.00GiB

Data,RAID10: Size:576.00GiB, Used:575.99GiB
   /dev/dm-32     105.00GiB
   /dev/dm-33     117.50GiB
   /dev/dm-34     118.00GiB
   /dev/dm-35     118.00GiB
   /dev/dm-36     117.50GiB

Metadata,RAID10: Size:3.09GiB, Used:1.68GiB
   /dev/dm-32     528.00MiB
   /dev/dm-33     528.00MiB
   /dev/dm-34     528.00MiB
   /dev/dm-35     528.00MiB
   /dev/dm-36     528.00MiB
   /dev/dm-48     528.00MiB

System,RAID10: Size:96.00MiB, Used:112.00KiB
   /dev/dm-32      16.00MiB
   /dev/dm-33      16.00MiB
   /dev/dm-34      16.00MiB
   /dev/dm-35      16.00MiB
   /dev/dm-36      16.00MiB
   /dev/dm-48      16.00MiB

Unallocated:
   /dev/dm-32       2.35TiB
   /dev/dm-33     161.97GiB
   /dev/dm-34     161.47GiB
   /dev/dm-35     161.47GiB
   /dev/dm-36       1.36TiB
   /dev/dm-48       1.42TiB


* Re: csum failed on nonexistent inode
  2016-04-10 17:25     ` Henk Slager
  2016-04-11  1:48       ` Jérôme Poulin
@ 2016-04-11  1:50       ` Jérôme Poulin
From: Jérôme Poulin @ 2016-04-11  1:50 UTC (permalink / raw)
  To: Henk Slager; +Cc: linux-btrfs

On Sun, Apr 10, 2016 at 1:25 PM, Henk Slager <eye1tm@gmail.com> wrote:
> Kernel/tools
> versions and btrfs fi us output might also give some hints.


I completely forgot to paste those:
BTRFS: btrfs-progs v4.4
Kernel: 4.6.0-rc2.


* Re: csum failed on nonexistent inode
  2016-04-11  1:48       ` Jérôme Poulin
@ 2016-04-11 11:34         ` Henk Slager
From: Henk Slager @ 2016-04-11 11:34 UTC (permalink / raw)
  To: Jérôme Poulin; +Cc: linux-btrfs

On Mon, Apr 11, 2016 at 3:48 AM, Jérôme Poulin <jeromepoulin@gmail.com> wrote:
> Sorry for the confusion; allow me to clarify, and I will summarize
> what I have learned, since I now understand that the corruption was
> present before the disk went bad.
>
> Note that this BTRFS filesystem was once on MD RAID5 on LVM on LUKS
> before being moved in-place to BTRFS RAID10 directly on LVM on LUKS.
> Balance worked fine at the time, though.

I haven't used LVM for years, but those in-place actions normally work
if the size calculations etc. are correct. Otherwise you would know
immediately.

> Also note that this computer was booted twice, for periods of about
> 30 minutes, with bad RAM before it was replaced.

This is very important info. It is now clear that there was bad memory
and that it was in use for only about half an hour at a time.

> I think my checksum errors were present, but unknown to me, before
> the hardware disk failure. The bad memory might be the root cause of
> this problem, but I can't be sure.

When I look at all the info now and also think of my own experience
with a bad RAM module and btrfs, I think this bad memory is the root
cause. I have seen btrfs RAID10 correct a few errors (likely coming
from earlier crashes with btrfs RAID5 on older disks). If it can't
correct an error, something else is wrong, likely affecting more
devices than the RAID profile is able to compensate for.

> On Sun, Apr 10, 2016 at 1:25 PM, Henk Slager <eye1tm@gmail.com> wrote:
>> It was not fully clear what the sequence of events was:
>> - HW problem
>> - btrfs SW problem
>> - 1st scrub
>> - the --repair-sector with hdparm
>> - 2nd scrub
>> - 3rd scrub?
>>
>
> 1. Errors in dmesg and confirmation from smartd that hardware problems
> were present.
> 2. Attempted to repair the sector using --repair-sector, which reset
> the sector to zeroes.
> 3. Scrub detected errors and fixed some, but 18 were uncorrectable.
> 4. Disk was changed using btrfs replace. Corruption still present.
> 5. Balance attempted, but it aborts when encountering the first
> uncorrectable error.
> 6. Attempted to locate the bad sector/inode, without success, leading
> to another scrub with the same errors.
> 7. Attempted to reset stats and scrub again. Still getting the same errors.
> 8. New disk added and data profile converted from RAID10 to RAID1;
> balance aborted on the first uncorrectable error.
>
>
>> There is also DM between the hard disk and btrfs, and I am not sure
>> whether the hdparm action repaired things or corrupted them further.
>>
>
> After using --repair-sector, I confirmed with --read-sector that the
> sector had been reset to zeroes. I had also tried read-sector first,
> which failed and added an entry to the SMART log. After repair-sector,
> read-sector returned the zeroed sector.
>
>> How do you know for sure that the contents of the 'logical blocks' are
>> the same on both devices?
>>
>
> After a balance, here is what dmesg shows (complete warning output):
> BTRFS warning (device dm-36): csum failed ino 330 off 1809084416 csum
> 4147641019 expected csum 1755301217
> BTRFS warning (device dm-36): csum failed ino 330 off 1809195008 csum
> 1515428513 expected csum 2566472073
> BTRFS warning (device dm-36): csum failed ino 330 off 1809199104 csum
> 1927504681 expected csum 2566472073
> BTRFS warning (device dm-36): csum failed ino 330 off 1809211392 csum
> 3086571080 expected csum 2566472073
> BTRFS warning (device dm-36): csum failed ino 330 off 1809149952 csum
> 3254083717 expected csum 2566472073
> BTRFS warning (device dm-36): csum failed ino 330 off 1809162240 csum
> 3157020538 expected csum 2566472073
> BTRFS warning (device dm-36): csum failed ino 330 off 1809166336 csum
> 1092724678 expected csum 2566472073
> BTRFS warning (device dm-36): csum failed ino 330 off 1809178624 csum
> 4235459038 expected csum 2566472073
> BTRFS warning (device dm-36): csum failed ino 330 off 1809182720 csum
> 1764946502 expected csum 2566472073
> BTRFS warning (device dm-36): csum failed ino 330 off 1809084416 csum
> 4147641019 expected csum 1755301217
>
>
> After a scrub (complete error output):
> BTRFS error (device dm-36): bdev /dev/dm-32 errs: wr 0, rd 0, flush 0,
> corrupt 1, gen 0
> BTRFS error (device dm-36): bdev /dev/dm-32 errs: wr 0, rd 0, flush 0,
> corrupt 2, gen 0
> BTRFS error (device dm-36): unable to fixup (regular) error at logical
> 1296334876672 on dev /dev/dm-32
> BTRFS error (device dm-36): unable to fixup (regular) error at logical
> 1296334987264 on dev /dev/dm-32
> BTRFS error (device dm-36): bdev /dev/dm-32 errs: wr 0, rd 0, flush 0,
> corrupt 3, gen 0
> BTRFS error (device dm-36): unable to fixup (regular) error at logical
> 1296334991360 on dev /dev/dm-32
> BTRFS error (device dm-36): bdev /dev/dm-32 errs: wr 0, rd 0, flush 0,
> corrupt 4, gen 0
> BTRFS error (device dm-36): unable to fixup (regular) error at logical
> 1296335003648 on dev /dev/dm-32
> BTRFS error (device dm-36): bdev /dev/dm-36 errs: wr 0, rd 0, flush 0,
> corrupt 1, gen 0
> BTRFS error (device dm-36): bdev /dev/dm-36 errs: wr 0, rd 0, flush 0,
> corrupt 2, gen 0
> BTRFS error (device dm-36): unable to fixup (regular) error at logical
> 1296334876672 on dev /dev/dm-36
> BTRFS error (device dm-36): unable to fixup (regular) error at logical
> 1296334987264 on dev /dev/dm-36
> BTRFS error (device dm-36): bdev /dev/dm-36 errs: wr 0, rd 0, flush 0,
> corrupt 3, gen 0
> BTRFS error (device dm-36): unable to fixup (regular) error at logical
> 1296334991360 on dev /dev/dm-36
> BTRFS error (device dm-36): bdev /dev/dm-36 errs: wr 0, rd 0, flush 0,
> corrupt 4, gen 0
> BTRFS error (device dm-36): unable to fixup (regular) error at logical
> 1296335003648 on dev /dev/dm-36
> BTRFS error (device dm-36): bdev /dev/dm-35 errs: wr 0, rd 0, flush 0,
> corrupt 1, gen 0
> BTRFS error (device dm-36): unable to fixup (regular) error at logical
> 1296334942208 on dev /dev/dm-35
> BTRFS error (device dm-36): bdev /dev/dm-35 errs: wr 0, rd 0, flush 0,
> corrupt 2, gen 0
> BTRFS error (device dm-36): unable to fixup (regular) error at logical
> 1296334954496 on dev /dev/dm-35
> BTRFS error (device dm-36): bdev /dev/dm-35 errs: wr 0, rd 0, flush 0,
> corrupt 3, gen 0
> BTRFS error (device dm-36): unable to fixup (regular) error at logical
> 1296334958592 on dev /dev/dm-35
> BTRFS error (device dm-36): bdev /dev/dm-35 errs: wr 0, rd 0, flush 0,
> corrupt 4, gen 0
> BTRFS error (device dm-36): unable to fixup (regular) error at logical
> 1296334970880 on dev /dev/dm-35
> BTRFS error (device dm-36): bdev /dev/dm-35 errs: wr 0, rd 0, flush 0,
> corrupt 5, gen 0
> BTRFS error (device dm-36): unable to fixup (regular) error at logical
> 1296334974976 on dev /dev/dm-35
> BTRFS error (device dm-36): bdev /dev/dm-34 errs: wr 0, rd 0, flush 0,
> corrupt 1, gen 0
> BTRFS error (device dm-36): unable to fixup (regular) error at logical
> 1296334942208 on dev /dev/dm-34
> BTRFS error (device dm-36): bdev /dev/dm-34 errs: wr 0, rd 0, flush 0,
> corrupt 2, gen 0
> BTRFS error (device dm-36): unable to fixup (regular) error at logical
> 1296334954496 on dev /dev/dm-34
> BTRFS error (device dm-36): bdev /dev/dm-34 errs: wr 0, rd 0, flush 0,
> corrupt 3, gen 0
> BTRFS error (device dm-36): unable to fixup (regular) error at logical
> 1296334958592 on dev /dev/dm-34
> BTRFS error (device dm-36): bdev /dev/dm-34 errs: wr 0, rd 0, flush 0,
> corrupt 4, gen 0
> BTRFS error (device dm-36): unable to fixup (regular) error at logical
> 1296334970880 on dev /dev/dm-34
> BTRFS error (device dm-36): bdev /dev/dm-34 errs: wr 0, rd 0, flush 0,
> corrupt 5, gen 0
> BTRFS error (device dm-36): unable to fixup (regular) error at logical
> 1296334974976 on dev /dev/dm-34
>
> device stats:
> [/dev/mapper/luksbtrfsdata1 /dev/dm-32].corruption_errs 4
> [/dev/mapper/luksbtrfsdata6 /dev/dm-36].corruption_errs 4
> [/dev/mapper/luksbtrfsdata3 /dev/dm-34].corruption_errs 5
> [/dev/mapper/luksbtrfsdata2 /dev/dm-33].corruption_errs 0
> [/dev/mapper/luksbtrfsdata5 /dev/dm-35].corruption_errs 5
> [/dev/mapper/luksbtrfsdata7 /dev/dm-48].corruption_errs 0
>
>
>
> If we combine everything, we notice that...
> * dm-32 and dm-36 have the same number of uncorrectable errors.
> * dm-34 and dm-35 have the same number of uncorrectable errors.
> * Scrub output is not helpful for identifying checksum errors; balance
> output is not useful for identifying the physical device.
> * Scrub output confirms where the errors are, and each logical sector
> appears twice on different devices.
> * Balance output also shows each offset twice with VERY suspicious
> expected checksums.
>
> A wild guess would be that memory corruption caused the checksums to
> be incorrectly written to disk.

As indicated, this is the most obvious reason. It looks like basic
RAID could not do its work, as all block copies (2 in this case) got
corrupted.

I think the corruptions are outside the data objects, and
unfortunately difficult to fix (e.g. by doing some file-level
modifications). That balance fails is not good; it also means that
other chunk-level actions would fail, so removing a device, for
example, might not be possible to complete. You'll have to see whether
you can avoid re-creating the fs, I think.

It doesn't seem to be a bug in btrfs, but one thing you might try,
just to see if you can fix it without using the backup, is to hack the
kernel so that it skips over the checksum-error cases in a first step,
and then in a next step let it correct again, hoping that CoW has
helped you. But maybe someone else sees a quick way to fix it.

>> If btrfs wants to read a disk block and its csum doesn't match, then
>> it is an I/O error, with the same effect as an uncorrected bad sector
>> in the old days. But in this case your (former/old) disk might still
>> be OK; as you suggest, it might be due to some other error (HW or SW)
>> that content and csum don't match. It is hard to trace back based on
>> the info in this email thread. It looks like replace just copied the
>> problem, and the problem now sits at the filesystem level.
>>
>
> It seems like btrfs replace did indeed just copy the problem as-is,
> which is good since I could not have removed the old defective disk
> otherwise.
>
>>> Is it possible to reset the checksum on those? I couldn't find what
>>> file or metadata the blocks were pointing to.
>>
>> Could it be that they have been removed in the meantime?
>> You might need to run scrub again in order to try to find the problem
>> spots/files.
>>
>
> Scrub / inspect-internal didn't help me find the file or metadata,
> even with crazy commands like:
> btrfs sub li /mnt/btrfs/ | cut -d' ' -f9 | xargs -n1 btrfs inspect logical-resolve -v 1296334991360
>
> I md5sum'ed every file in the output: no known problems, no I/O
> errors.
>
>> Fixing individual csums has been asked about before; I don't remember
>> whether anyone actually fixed them with their own extra scripts or
>> C code. A brute-force method is to recalculate and rewrite all csums:
>> btrfs check --init-csum-tree, as you probably know. But maybe you want
>> an rsync -c compare against backups first. Kernel/tools versions and
>> btrfs fi us output might also give some hints.
>
> I thought about using init-csum-tree, but you are right: that wouldn't
> let me identify the problem or which files/metadata are affected.
>
> Here is the requested output:
>
> btrfs fi us /mnt/btrfs/
> Overall:
>     Device size:           6.32TiB
>     Device allocated:           1.28TiB
>     Device unallocated:           5.04TiB
>     Device missing:             0.00B
>     Used:               1.27TiB
>     Free (estimated):           2.52TiB    (min: 2.52TiB)
>     Data ratio:                  2.00
>     Metadata ratio:              2.00
>     Global reserve:         512.00MiB    (used: 0.00B)
>
> Data,RAID1: Size:76.00GiB, Used:74.13GiB
>    /dev/dm-32      52.00GiB
>    /dev/dm-36      24.00GiB
>    /dev/dm-48      76.00GiB
>
> Data,RAID10: Size:576.00GiB, Used:575.99GiB
>    /dev/dm-32     105.00GiB
>    /dev/dm-33     117.50GiB
>    /dev/dm-34     118.00GiB
>    /dev/dm-35     118.00GiB
>    /dev/dm-36     117.50GiB
>
> Metadata,RAID10: Size:3.09GiB, Used:1.68GiB
>    /dev/dm-32     528.00MiB
>    /dev/dm-33     528.00MiB
>    /dev/dm-34     528.00MiB
>    /dev/dm-35     528.00MiB
>    /dev/dm-36     528.00MiB
>    /dev/dm-48     528.00MiB
>
> System,RAID10: Size:96.00MiB, Used:112.00KiB
>    /dev/dm-32      16.00MiB
>    /dev/dm-33      16.00MiB
>    /dev/dm-34      16.00MiB
>    /dev/dm-35      16.00MiB
>    /dev/dm-36      16.00MiB
>    /dev/dm-48      16.00MiB
>
> Unallocated:
>    /dev/dm-32       2.35TiB
>    /dev/dm-33     161.97GiB
>    /dev/dm-34     161.47GiB
>    /dev/dm-35     161.47GiB
>    /dev/dm-36       1.36TiB
>    /dev/dm-48       1.42TiB

