* unable to fixup (regular) error
@ 2013-08-14 22:17 Cameron Berkenpas
  0 siblings, 0 replies; 6+ messages in thread
From: Cameron Berkenpas @ 2013-08-14 22:17 UTC
  To: linux-btrfs

Hello,

I hope this is the correct mailing list.

I have btrfs running on a 6TB (5.5ish TiB) raid10 array on a 3ware 
9750-4i controller. I decided to run a scrub and got 5 checksum 
errors for the same file (errors from dmesg below).

I deleted the file without any issues, reran scrub, and now I don't see 
any errors. The file itself was unimportant as it was from a backup of 
another box that I already have multiple backups of (and the box itself 
is still fine). The disks in the array appear to all be fine and the 
array is also healthy. I also run "verify" regularly. Verify appears to 
be the controller's equivalent to scrub.

Additionally, according to smartctl, things are healthy, although it 
seems error logging isn't supported:
Vendor:               LSI
Product:              9750-4i    DISK
Revision:             5.12
User Capacity:        5,999,977,037,824 bytes [5.99 TB]
Logical block size:   512 bytes
Logical Unit id:      0x600050e016538a004567000011970000
Serial number:        9XK0C13D16538A004567
Device type:          disk
Local Time is:        Wed Aug 14 15:15:42 2013 PDT
Device supports SMART and is Disabled
Temperature Warning Disabled or Not Supported
SMART Health Status: OK

Error Counter logging not supported
Device does not support Self Test logging

Any idea what may have happened here? Is this something to worry about?

Thanks,

-Cameron

[101511.280510] btrfs: checksum error at logical 1590664605696 on dev 
/dev/sda3, sector 3119366104, root 681, inode 1668516, offset 3473408, 
length 4096, links 1 (path: path/to/some/file)
[101511.288676] btrfs: bdev /dev/sda3 errs: wr 0, rd 0, flush 0, corrupt 
1, gen 0
[101511.291611] btrfs: unable to fixup (regular) error at logical 
1590664605696 on dev /dev/sda3
[101511.390081] btrfs: checksum error at logical 1590664609792 on dev 
/dev/sda3, sector 3119366112, root 681, inode 1668516, offset 3477504, 
length 4096, links 1 (path: path/to/some/file)
[101511.399321] btrfs: bdev /dev/sda3 errs: wr 0, rd 0, flush 0, corrupt 
2, gen 0
[101511.402552] btrfs: unable to fixup (regular) error at logical 
1590664609792 on dev /dev/sda3
[101511.406038] btrfs: checksum error at logical 1590664613888 on dev 
/dev/sda3, sector 3119366120, root 681, inode 1668516, offset 3481600, 
length 4096, links 1 (path: path/to/some/file)
[101511.416438] btrfs: bdev /dev/sda3 errs: wr 0, rd 0, flush 0, corrupt 
3, gen 0
[101511.420050] btrfs: unable to fixup (regular) error at logical 
1590664613888 on dev /dev/sda3
[101511.424238] btrfs: checksum error at logical 1590664617984 on dev 
/dev/sda3, sector 3119366128, root 681, inode 1668516, offset 3485696, 
length 4096, links 1 (path: path/to/some/file)
[101511.435928] btrfs: bdev /dev/sda3 errs: wr 0, rd 0, flush 0, corrupt 
4, gen 0
[101511.440241] btrfs: unable to fixup (regular) error at logical 
1590664617984 on dev /dev/sda3
[101511.523988] btrfs: checksum error at logical 1590664622080 on dev 
/dev/sda3, sector 3119366136, root 681, inode 1668516, offset 3489792, 
length 4096, links 1 (path: path/to/some/file)
[101511.537097] btrfs: bdev /dev/sda3 errs: wr 0, rd 0, flush 0, corrupt 
5, gen 0
[101511.541636] btrfs: unable to fixup (regular) error at logical 
1590664622080 on dev /dev/sda3
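
The five checksum errors above can be checked for adjacency with a short script. This is only a sketch: the regular expression is an assumption fitted to the wording of this particular 2013 log (the 2018 log later in the thread is worded differently), and the names `parse_checksum_errors` and `is_contiguous` are made up for illustration.

```python
import re

# Pattern fitted to the scrub message quoted above; treat it as an
# assumption tied to this log, not a stable kernel interface.
CSUM_ERROR = re.compile(
    r"checksum error at logical (\d+) on dev (\S+), sector (\d+), "
    r"root (\d+), inode (\d+), offset (\d+), length (\d+)"
)
FIELDS = ("logical", "dev", "sector", "root", "inode", "offset", "length")


def parse_checksum_errors(log_text):
    """Return one dict per checksum-error message found in log_text."""
    # The mail client wrapped long lines, so collapse all whitespace first.
    flat = " ".join(log_text.split())
    errors = []
    for match in CSUM_ERROR.finditer(flat):
        record = dict(zip(FIELDS, match.groups()))
        errors.append(
            {k: v if k == "dev" else int(v) for k, v in record.items()}
        )
    return errors


def is_contiguous(errors):
    """True if every error starts where the previous one ended, same inode."""
    return all(
        b["inode"] == a["inode"] and b["offset"] == a["offset"] + a["length"]
        for a, b in zip(errors, errors[1:])
    )


# First two entries of the log above, wrapped as they appear in the mail.
SAMPLE = """
[101511.280510] btrfs: checksum error at logical 1590664605696 on dev
/dev/sda3, sector 3119366104, root 681, inode 1668516, offset 3473408,
length 4096, links 1 (path: path/to/some/file)
[101511.390081] btrfs: checksum error at logical 1590664609792 on dev
/dev/sda3, sector 3119366112, root 681, inode 1668516, offset 3477504,
length 4096, links 1 (path: path/to/some/file)
"""

errors = parse_checksum_errors(SAMPLE)
print(len(errors), is_contiguous(errors))
```

In the full log the file offsets advance by exactly 4096 bytes from one error to the next, so the five errors cover a single contiguous 20 KiB run of the same file rather than five scattered locations.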





* Re: unable to fixup (regular) error
  2018-11-27  7:18     ` Duncan
@ 2018-11-27  8:28       ` Alexander Fieroch
  0 siblings, 0 replies; 6+ messages in thread
From: Alexander Fieroch @ 2018-11-27  8:28 UTC
  To: 1i5t5.duncan; +Cc: linux-btrfs


Actually, the data on the raid0 is not mine but my users', and they knew 
and accepted the risk of raid0. So in my case it should be OK; I don't 
know the importance of the data files which are affected. I just wanted 
to help find a possible bug and experiment with a broken btrfs 
filesystem before I recreate it.
For my own files I'd prefer raid5 or raid6...

But thanks Duncan for your explanation! Of course there are people out 
there who do not know the difference between different raid levels and 
should be warned!

Best regards,
Alexander




* Re: unable to fixup (regular) error
  2018-11-26 10:23   ` Alexander Fieroch
@ 2018-11-27  7:18     ` Duncan
  2018-11-27  8:28       ` Alexander Fieroch
  0 siblings, 1 reply; 6+ messages in thread
From: Duncan @ 2018-11-27  7:18 UTC
  To: linux-btrfs

Alexander Fieroch posted on Mon, 26 Nov 2018 11:23:00 +0100 as excerpted:

> On 26.11.18 at 09:13, Qu Wenruo wrote:
>> The corruption itself looks like some disk error, not some btrfs error
>> like transid error.
> 
> You're right! SMART shows an increased reallocated sector count for
> one hard disk. Sorry, I failed to check this first...
> 
> I'll try to salvage my data...

FWIW as a general note about raid0 for updating your layout...

Because raid0 is less reliable than a single device (failure of any 
device of the raid0 is likely to take it out, and failure of any one of N 
is more likely than failure of any specific single device), admins 
generally consider it useful only for "throw-away" data, that is, data 
that can be lost without issue either because it really /is/ "throw-
away" (internet cache being a common example), or because it is 
considered a "throw-away" copy of the "real" data stored elsewhere, with 
that "real" copy being either the real working copy of which the raid0 is 
simply a faster cache, or with the raid0 being the working copy, but with 
sufficiently frequent backup updates that if the raid0 goes, it won't 
take anything of value with it (read as the effort to replace any data 
lost will be reasonably trivial, likely only a few minutes or hours, at 
worst perhaps a day's worth, of work, depending on how many people's work 
is involved and how much their time is considered to be worth).
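
The point that "failure of any one of N is more likely than failure of any specific single device" can be made concrete with a small calculation. This sketch assumes independent failures with the same per-device probability, which real disks only approximate, and the 3% figure is purely illustrative, not a measured failure rate.

```python
def any_device_failure_probability(p_device, n_devices):
    """Chance that at least one of n_devices fails in a given period.

    Assumes independent failures with the same per-device probability.
    """
    return 1.0 - (1.0 - p_device) ** n_devices


# p = 0.03 is an illustrative figure only.
for n in (1, 2, 4):
    print(n, round(any_device_failure_probability(0.03, n), 4))
```

Under these assumptions a two-device raid0 loses data with a probability of about 5.9%, close to double that of a single device, and the figure keeps climbing as devices are added.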

So if it's raid0, you shouldn't be needing to worry about trying to 
recover what's on it, and probably shouldn't even be trying to run a 
btrfs check on it at all as it's likely to be more trouble and take more 
time than the throw-away data on it is worth.  If something goes wrong 
with a raid0, just declare it lost, blow it away and recreate fresh, 
restoring from the "real" copy if necessary.  Because for an admin, 
really with any data but particularly for a raid0, it's more a matter of 
when it'll die than if.

If that's inappropriate for the value of the data and status of the 
backups/real-copies, then you should really be reconsidering whether 
raid0 of any sort is appropriate, because it almost certainly is not.


For btrfs, what you might try instead of raid0, is raid1 metadata at 
least, raid0 or single mode data if there's not room enough to do raid1 
data as well.  And the raid1 metadata would have very likely saved the 
filesystem in this case, with some loss of files possible depending on 
where the damage is, but with the second copy of the metadata from the 
good device being used to fill in for and (attempt to, if the bad device 
is actively getting worse it might be a losing battle) repair any 
metadata damage on the bad device, thus giving you a far better chance of 
saving the filesystem as a whole.

-- 
Duncan - List replies preferred.   No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master."  Richard Stallman



* Re: unable to fixup (regular) error
  2018-11-26  8:13 ` Qu Wenruo
@ 2018-11-26 10:23   ` Alexander Fieroch
  2018-11-27  7:18     ` Duncan
  0 siblings, 1 reply; 6+ messages in thread
From: Alexander Fieroch @ 2018-11-26 10:23 UTC
  To: Qu Wenruo, linux-btrfs


On 26.11.18 at 09:13, Qu Wenruo wrote:
> The corruption itself looks like some disk error, not some btrfs error
> like transid error.

You're right! SMART shows an increased reallocated sector count for 
one hard disk. Sorry, I failed to check this first...

I'll try to salvage my data...

Thanks!

Best,
Alexander




* Re: unable to fixup (regular) error
  2018-11-26  7:19 Alexander Fieroch
@ 2018-11-26  8:13 ` Qu Wenruo
  2018-11-26 10:23   ` Alexander Fieroch
  0 siblings, 1 reply; 6+ messages in thread
From: Qu Wenruo @ 2018-11-26  8:13 UTC
  To: Alexander Fieroch, linux-btrfs





On 2018/11/26 3:19 PM, Alexander Fieroch wrote:
> Hi,
> 
> My data partition with btrfs RAID 0 (/dev/sdc0 and /dev/sdd0) shows
> errors in syslog:
> 
> BTRFS error (device sdc): cleaner transaction attach returned -30
> BTRFS info (device sdc): disk space caching is enabled
> BTRFS info (device sdc): has skinny extents
> BTRFS info (device sdc): bdev /dev/sdc errs: wr 0, rd 0, flush 0,
> corrupt 3, gen 1
> BTRFS info (device sdc): bdev /dev/sdd errs: wr 0, rd 0, flush 0,
> corrupt 6, gen 2

Generation mismatch means something more serious.

> 
> 
> BTRFS error (device sdc): scrub: tree block 858803990528 spanning
> stripes, ignored. logical=858803929088

The "spanning stripes" message only means the scrub code can't really
check the tree block, since it crosses a stripe boundary.

It's normally nothing to worry about, and it's normally caused by an old
kernel. Newer kernels avoid creating such blocks, but for existing ones,
scrub will just skip them.
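
The boundary crossing can be checked with a little arithmetic. The sketch below assumes a 64 KiB stripe length with stripes aligned to multiples of 64 KiB in the logical address space, and a 16 KiB tree block size (16384 is the length printed in the check output further down). Both constants and both function names are assumptions for illustration, not values read from this filesystem.

```python
STRIPE_LEN = 64 * 1024  # assumed btrfs stripe length (64 KiB)
NODESIZE = 16 * 1024    # assumed tree block size (16 KiB)


def stripe_start(logical, stripe_len=STRIPE_LEN):
    """Logical address of the start of the stripe containing `logical`."""
    return logical - logical % stripe_len


def spans_stripes(block_start, nodesize=NODESIZE, stripe_len=STRIPE_LEN):
    """True if the block's first and last byte fall in different stripes."""
    last_byte = block_start + nodesize - 1
    return stripe_start(block_start, stripe_len) != stripe_start(
        last_byte, stripe_len
    )


block = 858803990528  # first tree block reported by scrub
print(spans_stripes(block))
print(stripe_start(block), stripe_start(block + NODESIZE - 1))
```

With these assumptions the block starts 61440 bytes into its stripe, so its last 12 KiB spill into the next one. The two stripe starts computed here, 858803929088 and 858803994624, are the same two logical= values that scrub printed for this block.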

> BTRFS error (device sdc): scrub: tree block 858803990528 spanning
> stripes, ignored. logical=858803994624
> BTRFS warning (device sdc): checksum error at logical 858803961856 on
> dev /dev/sdd, physical 385263894528: metadata leaf (level 0) in tree 7
> BTRFS warning (device sdc): checksum error at logical 858803961856 on
> dev /dev/sdd, physical 385263894528: metadata leaf (level 0) in tree 7

This means some csum tree blocks have been corrupted.

> BTRFS error (device sdc): bdev /dev/sdd errs: wr 0, rd 0, flush 0,
> corrupt 4, gen 1
> BTRFS error (device sdc): scrub: tree block 858820505600 spanning
> stripes, ignored. logical=858820444160
> BTRFS error (device sdc): scrub: tree block 858820505600 spanning
> stripes, ignored. logical=858820509696
> BTRFS error (device sdc): unable to fixup (regular) error at logical
> 858803961856 on dev /dev/sdd
> BTRFS error (device sdc): scrub: tree block 858821292032 spanning
> stripes, ignored. logical=858821230592
> BTRFS error (device sdc): scrub: tree block 858821292032 spanning
> stripes, ignored. logical=858821296128
> BTRFS warning (device sdc): checksum error at logical 858821263360 on
> dev /dev/sdd, physical 385281196032: metadata leaf (level 0) in tree 7
> BTRFS warning (device sdc): checksum error at logical 858821263360 on
> dev /dev/sdd, physical 385281196032: metadata leaf (level 0) in tree 7
> BTRFS error (device sdc): bdev /dev/sdd errs: wr 0, rd 0, flush 0,
> corrupt 5, gen 1
> BTRFS error (device sdc): unable to fixup (regular) error at logical
> 858821263360 on dev /dev/sdd
> BTRFS warning (device sdc): checksum/header error at logical
> 858820476928 on dev /dev/sdd, physical 385280409600: metadata leaf
> (level 0) in tree 7
> BTRFS warning (device sdc): checksum/header error at logical
> 858820476928 on dev /dev/sdd, physical 385280409600: metadata leaf
> (level 0) in tree 7
> BTRFS error (device sdc): bdev /dev/sdd errs: wr 0, rd 0, flush 0,
> corrupt 5, gen 2
> BTRFS warning (device sdc): checksum error at logical 858820489216 on
> dev /dev/sdd, physical 385280421888: metadata leaf (level 0) in tree 2
> BTRFS warning (device sdc): checksum error at logical 858820489216 on
> dev /dev/sdd, physical 385280421888: metadata leaf (level 0) in tree 2

This is an error in the extent tree, and I'd say it's a serious problem
which may affect later write operations.

> BTRFS error (device sdc): bdev /dev/sdd errs: wr 0, rd 0, flush 0,
> corrupt 6, gen 2
> BTRFS error (device sdc): unable to fixup (regular) error at logical
> 858820476928 on dev /dev/sdd
> BTRFS error (device sdc): unable to fixup (regular) error at logical
> 858820489216 on dev /dev/sdd0
> 
> 
> $ btrfs filesystem show /mnt/data/
> Label: none  uuid: 5e6506b0-bf15-4b2e-b5f4-322c44b89db6
>           Total devices 2 FS bytes used 10.17TiB
>           devid    1 size 5.46TiB used 5.43TiB path /dev/sdc
>           devid    2 size 5.46TiB used 5.43TiB path /dev/sdd
> 
> $ btrfs --version
> btrfs-progs v4.15.1
> 
> $ uname -a
> Linux gpur1 4.15.0-39-generic #42-Ubuntu SMP Tue Oct 23 15:48:01 UTC
> 2018 x86_64 x86_64 x86_64 GNU/Linux
> 
> 
> $ btrfs dev stats /dev/sdc
> [/dev/sdc].write_io_errs    0
> [/dev/sdc].read_io_errs     0
> [/dev/sdc].flush_io_errs    0
> [/dev/sdc].corruption_errs  3
> [/dev/sdc].generation_errs  1
> 
> $ btrfs dev stats /dev/sdd
> [/dev/sdd].write_io_errs    0
> [/dev/sdd].read_io_errs     0
> [/dev/sdd].flush_io_errs    0
> [/dev/sdd].corruption_errs  3
> [/dev/sdd].generation_errs  1
> 
> $ btrfs fi show
> Label: 'system'  uuid: ae121e8e-d483-45f4-8568-2817f5c5d497
>         Total devices 1 FS bytes used 194.05GiB
>         devid    1 size 228.66GiB used 199.03GiB path /dev/sda3
> Label: none  uuid: 5e6506b0-bf15-4b2e-b5f4-322c44b89db6
>         Total devices 2 FS bytes used 10.17TiB
>         devid    1 size 5.46TiB used 5.43TiB path /dev/sdc
>         devid    2 size 5.46TiB used 5.43TiB path /dev/sdd
> 
> $ btrfs fi df /mnt/data/
> Data, RAID0: total=10.84TiB, used=10.15TiB
> System, RAID1: total=8.00MiB, used=896.00KiB
> Metadata, RAID1: total=15.00GiB, used=13.28GiB
> GlobalReserve, single: total=512.00MiB, used=0.00
> 
> $ btrfs scrub start -B /dev/sdc
> ERROR: scrubbing /dev/sdc failed for device id 1: ret=-1, errno=5
> (Input/output error)
> scrub canceled for 5e6506b0-bf15-4b2e-b5f4-322c44b89db6
>          scrub started at Thu Nov 22 07:43:45 2018 and was aborted after
> 02:31:49
>          total bytes scrubbed: 1.58TiB with 10 errors
>          error details: verify=1 csum=3
>          corrected errors: 0, uncorrectable errors: 10, unverified
> errors: 0
> 
> 
> 
> I've tried
> $ btrfs check /dev/sdc
> Checking filesystem on /dev/sdc
> UUID: 5e6506b0-bf15-4b2e-b5f4-322c44b89db6
> btrfs check --repairchecking extents

Don't use --repair unless you know what you're doing.

>   ERROR: add_tree_backref failed (extent items shared block): File exists
> ERROR: add_tree_backref failed (extent items tree block): File exists
> ERROR: add_tree_backref failed (extent items tree block): File exists
> /dev/sdc
> ERROR: add_tree_backref failed (non-leaf block): File exists
> 
> ERROR: add_tree_backref failed (non-leaf block): File exists
> checksum verify failed on 858803961856 found B2C0FAD9 wanted F31F8495
> checksum verify failed on 858803961856 found B2C0FAD9 wanted F31F8495
> checksum verify failed on 858803961856 found B2C0FAD9 wanted F31F8495
> checksum verify failed on 858803961856 found B2C0FAD9 wanted F31F8495
> Csum didn't match
> checksum verify failed on 858821263360 found 15208BF4 wanted D68B2514
> checksum verify failed on 858821263360 found 15208BF4 wanted D68B2514
> checksum verify failed on 858821263360 found 15208BF4 wanted D68B2514
> checksum verify failed on 858821263360 found 15208BF4 wanted D68B2514
> Csum didn't match
> ref mismatch on [8631607296 77824] extent item 1, found 0
> incorrect local backref count on 8631607296 parent 858803974144 owner 0
> offset 0 found 0 wanted 1 back 0x55f8522a5b10
> backref disk bytenr does not match extent record, bytenr=8631607296, ref
> bytenr=0
> backpointer mismatch on [8631607296 77824]
> owner ref check failed [8631607296 77824]
> ref mismatch on [35613634560 77824] extent item 1, found 0
> incorrect local backref count on 35613634560 parent 858803974144 owner 0
> offset 0 found 0 wanted 1 back 0x55f86d87d810
> backref disk bytenr does not match extent record, bytenr=35613634560,
> ref bytenr=0
> backpointer mismatch on [35613634560 77824]
> owner ref check failed [35613634560 77824]
> ref mismatch on [36010762240 77824] extent item 1, found 0
> [...]
> ERROR: errors found in extent allocation tree or chunk allocation
> checking free space cache
> checking fs roots
> extent_io.c:605: free_extent_buffer_internal: BUG_ON `eb->refs < 0`
> triggered, value 1
> btrfs(+0x29d87)[0x55f83fd51d87]
> btrfs(+0x2a0b4)[0x55f83fd520b4]
> btrfs(alloc_extent_buffer+0x77)[0x55f83fd527af]
> btrfs(read_tree_block+0x44)[0x55f83fd45802]
> btrfs(btrfs_next_leaf+0x6e)[0x55f83fd43ad9]
> btrfs(count_csum_range+0x1e1)[0x55f83fd89fac]
> btrfs(+0x14b33)[0x55f83fd3cb33]
> btrfs(cmd_check+0x19fb)[0x55f83fd7bfe2]
> btrfs(main+0x143)[0x55f83fd3ec87]
> /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xe7)[0x7ff69e81cb97]
> btrfs(_start+0x2a)[0x55f83fd3ecca]
> Aborted (core dumped)
> 
> 
> 
> $ btrfs check --repair /dev/sdc
> enabling repair mode
> Checking filesystem on /dev/sdc
> UUID: 5e6506b0-bf15-4b2e-b5f4-322c44b89db6
> Fixed 0 roots.
> checking extents
> ERROR: add_tree_backref failed (extent items shared block): File exists
> ERROR: add_tree_backref failed (extent items tree block): File exists
> ERROR: add_tree_backref failed (extent items tree block): File exists
> ERROR: add_tree_backref failed (non-leaf block): File exists
> ERROR: add_tree_backref failed (non-leaf block): File exists
> checksum verify failed on 858803961856 found B2C0FAD9 wanted F31F8495
> checksum verify failed on 858803961856 found B2C0FAD9 wanted F31F8495
> checksum verify failed on 858803961856 found B2C0FAD9 wanted F31F8495
> checksum verify failed on 858803961856 found B2C0FAD9 wanted F31F8495
> Csum didn't match
> checksum verify failed on 858821263360 found 15208BF4 wanted D68B2514
> checksum verify failed on 858821263360 found 15208BF4 wanted D68B2514
> checksum verify failed on 858821263360 found 15208BF4 wanted D68B2514
> checksum verify failed on 858821263360 found 15208BF4 wanted D68B2514
> Csum didn't match
> well this shouldn't happen, extent record overlaps but is metadata?
> [858803974144, 16384]
> Aborted (core dumped)
> 
> 
> 
> 
> How can I fix the error?
> Is there any possibility to see which files are affected?

It's not data/files (at least from what I read); the problem is that some
essential metadata got corrupted.

Unlike in other traditional filesystems, corruption in the extent tree can
lead to a lot of problems, and it's pretty hard to fix due to its complexity.

The corruption itself looks like some disk error, not some btrfs error
like transid error.

I recommend mounting the fs RO and salvaging your data.
If something goes wrong even with the RO mount, you could use "btrfs
restore".

Since there is something wrong with the csum tree, some EIO errors should
be expected during the copy.

Thanks,
Qu

> Please have a look at the full log attached.
> 
> Thanks!
> 
> Best regards,
> Alexander
> 




* unable to fixup (regular) error
@ 2018-11-26  7:19 Alexander Fieroch
  2018-11-26  8:13 ` Qu Wenruo
  0 siblings, 1 reply; 6+ messages in thread
From: Alexander Fieroch @ 2018-11-26  7:19 UTC
  To: linux-btrfs



Hi,

My data partition with btrfs RAID 0 (/dev/sdc0 and /dev/sdd0) shows
errors in syslog:

BTRFS error (device sdc): cleaner transaction attach returned -30
BTRFS info (device sdc): disk space caching is enabled
BTRFS info (device sdc): has skinny extents
BTRFS info (device sdc): bdev /dev/sdc errs: wr 0, rd 0, flush 0,
corrupt 3, gen 1
BTRFS info (device sdc): bdev /dev/sdd errs: wr 0, rd 0, flush 0,
corrupt 6, gen 2


BTRFS error (device sdc): scrub: tree block 858803990528 spanning
stripes, ignored. logical=858803929088
BTRFS error (device sdc): scrub: tree block 858803990528 spanning
stripes, ignored. logical=858803994624
BTRFS warning (device sdc): checksum error at logical 858803961856 on
dev /dev/sdd, physical 385263894528: metadata leaf (level 0) in tree 7
BTRFS warning (device sdc): checksum error at logical 858803961856 on
dev /dev/sdd, physical 385263894528: metadata leaf (level 0) in tree 7
BTRFS error (device sdc): bdev /dev/sdd errs: wr 0, rd 0, flush 0,
corrupt 4, gen 1
BTRFS error (device sdc): scrub: tree block 858820505600 spanning
stripes, ignored. logical=858820444160
BTRFS error (device sdc): scrub: tree block 858820505600 spanning
stripes, ignored. logical=858820509696
BTRFS error (device sdc): unable to fixup (regular) error at logical
858803961856 on dev /dev/sdd
BTRFS error (device sdc): scrub: tree block 858821292032 spanning
stripes, ignored. logical=858821230592
BTRFS error (device sdc): scrub: tree block 858821292032 spanning
stripes, ignored. logical=858821296128
BTRFS warning (device sdc): checksum error at logical 858821263360 on
dev /dev/sdd, physical 385281196032: metadata leaf (level 0) in tree 7
BTRFS warning (device sdc): checksum error at logical 858821263360 on
dev /dev/sdd, physical 385281196032: metadata leaf (level 0) in tree 7
BTRFS error (device sdc): bdev /dev/sdd errs: wr 0, rd 0, flush 0,
corrupt 5, gen 1
BTRFS error (device sdc): unable to fixup (regular) error at logical
858821263360 on dev /dev/sdd
BTRFS warning (device sdc): checksum/header error at logical
858820476928 on dev /dev/sdd, physical 385280409600: metadata leaf
(level 0) in tree 7
BTRFS warning (device sdc): checksum/header error at logical
858820476928 on dev /dev/sdd, physical 385280409600: metadata leaf
(level 0) in tree 7
BTRFS error (device sdc): bdev /dev/sdd errs: wr 0, rd 0, flush 0,
corrupt 5, gen 2
BTRFS warning (device sdc): checksum error at logical 858820489216 on
dev /dev/sdd, physical 385280421888: metadata leaf (level 0) in tree 2
BTRFS warning (device sdc): checksum error at logical 858820489216 on
dev /dev/sdd, physical 385280421888: metadata leaf (level 0) in tree 2
BTRFS error (device sdc): bdev /dev/sdd errs: wr 0, rd 0, flush 0,
corrupt 6, gen 2
BTRFS error (device sdc): unable to fixup (regular) error at logical
858820476928 on dev /dev/sdd
BTRFS error (device sdc): unable to fixup (regular) error at logical
858820489216 on dev /dev/sdd0


$ btrfs filesystem show /mnt/data/
Label: none  uuid: 5e6506b0-bf15-4b2e-b5f4-322c44b89db6
           Total devices 2 FS bytes used 10.17TiB
           devid    1 size 5.46TiB used 5.43TiB path /dev/sdc
           devid    2 size 5.46TiB used 5.43TiB path /dev/sdd

$ btrfs --version
btrfs-progs v4.15.1

$ uname -a
Linux gpur1 4.15.0-39-generic #42-Ubuntu SMP Tue Oct 23 15:48:01 UTC
2018 x86_64 x86_64 x86_64 GNU/Linux


$ btrfs dev stats /dev/sdc
[/dev/sdc].write_io_errs    0
[/dev/sdc].read_io_errs     0
[/dev/sdc].flush_io_errs    0
[/dev/sdc].corruption_errs  3
[/dev/sdc].generation_errs  1

$ btrfs dev stats /dev/sdd
[/dev/sdd].write_io_errs    0
[/dev/sdd].read_io_errs     0
[/dev/sdd].flush_io_errs    0
[/dev/sdd].corruption_errs  3
[/dev/sdd].generation_errs  1
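
When several devices are involved, the per-device counters are easier to compare once parsed. The following is a rough sketch that reads text in the layout shown above and keeps only the non-zero counters; the line pattern is an assumption based on this output alone, and `parse_dev_stats` and `nonzero_counters` are hypothetical helper names.

```python
import re

# One counter per line, e.g. "[/dev/sdd].corruption_errs  3".
STAT_LINE = re.compile(
    r"^\[(?P<dev>[^\]]+)\]\.(?P<counter>\w+)\s+(?P<value>\d+)$"
)


def parse_dev_stats(text):
    """Map device -> {counter: value} from `btrfs dev stats` style text."""
    stats = {}
    for line in text.splitlines():
        match = STAT_LINE.match(line.strip())
        if match:
            counters = stats.setdefault(match["dev"], {})
            counters[match["counter"]] = int(match["value"])
    return stats


def nonzero_counters(stats):
    """Keep only the counters whose value is above zero."""
    return {
        dev: {name: value for name, value in counters.items() if value}
        for dev, counters in stats.items()
    }


# Copied from the /dev/sdd output above.
SAMPLE = """\
[/dev/sdd].write_io_errs    0
[/dev/sdd].read_io_errs     0
[/dev/sdd].flush_io_errs    0
[/dev/sdd].corruption_errs  3
[/dev/sdd].generation_errs  1
"""

print(nonzero_counters(parse_dev_stats(SAMPLE)))
```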

$ btrfs fi show
Label: 'system'  uuid: ae121e8e-d483-45f4-8568-2817f5c5d497
         Total devices 1 FS bytes used 194.05GiB
         devid    1 size 228.66GiB used 199.03GiB path /dev/sda3
Label: none  uuid: 5e6506b0-bf15-4b2e-b5f4-322c44b89db6
         Total devices 2 FS bytes used 10.17TiB
         devid    1 size 5.46TiB used 5.43TiB path /dev/sdc
         devid    2 size 5.46TiB used 5.43TiB path /dev/sdd

$ btrfs fi df /mnt/data/
Data, RAID0: total=10.84TiB, used=10.15TiB
System, RAID1: total=8.00MiB, used=896.00KiB
Metadata, RAID1: total=15.00GiB, used=13.28GiB
GlobalReserve, single: total=512.00MiB, used=0.00

$ btrfs scrub start -B /dev/sdc
ERROR: scrubbing /dev/sdc failed for device id 1: ret=-1, errno=5 
(Input/output error)
scrub canceled for 5e6506b0-bf15-4b2e-b5f4-322c44b89db6
          scrub started at Thu Nov 22 07:43:45 2018 and was aborted 
after 02:31:49
          total bytes scrubbed: 1.58TiB with 10 errors
          error details: verify=1 csum=3
          corrected errors: 0, uncorrectable errors: 10, unverified 
errors: 0



I've tried
$ btrfs check /dev/sdc
Checking filesystem on /dev/sdc
UUID: 5e6506b0-bf15-4b2e-b5f4-322c44b89db6
btrfs check --repairchecking extents
   ERROR: add_tree_backref failed (extent items shared block): File exists
ERROR: add_tree_backref failed (extent items tree block): File exists
ERROR: add_tree_backref failed (extent items tree block): File exists
/dev/sdc
ERROR: add_tree_backref failed (non-leaf block): File exists

ERROR: add_tree_backref failed (non-leaf block): File exists
checksum verify failed on 858803961856 found B2C0FAD9 wanted F31F8495
checksum verify failed on 858803961856 found B2C0FAD9 wanted F31F8495
checksum verify failed on 858803961856 found B2C0FAD9 wanted F31F8495
checksum verify failed on 858803961856 found B2C0FAD9 wanted F31F8495
Csum didn't match
checksum verify failed on 858821263360 found 15208BF4 wanted D68B2514
checksum verify failed on 858821263360 found 15208BF4 wanted D68B2514
checksum verify failed on 858821263360 found 15208BF4 wanted D68B2514
checksum verify failed on 858821263360 found 15208BF4 wanted D68B2514
Csum didn't match
ref mismatch on [8631607296 77824] extent item 1, found 0
incorrect local backref count on 8631607296 parent 858803974144 owner 0 
offset 0 found 0 wanted 1 back 0x55f8522a5b10
backref disk bytenr does not match extent record, bytenr=8631607296, ref 
bytenr=0
backpointer mismatch on [8631607296 77824]
owner ref check failed [8631607296 77824]
ref mismatch on [35613634560 77824] extent item 1, found 0
incorrect local backref count on 35613634560 parent 858803974144 owner 0 
offset 0 found 0 wanted 1 back 0x55f86d87d810
backref disk bytenr does not match extent record, bytenr=35613634560, 
ref bytenr=0
backpointer mismatch on [35613634560 77824]
owner ref check failed [35613634560 77824]
ref mismatch on [36010762240 77824] extent item 1, found 0
[...]
ERROR: errors found in extent allocation tree or chunk allocation
checking free space cache
checking fs roots
extent_io.c:605: free_extent_buffer_internal: BUG_ON `eb->refs < 0` 
triggered, value 1
btrfs(+0x29d87)[0x55f83fd51d87]
btrfs(+0x2a0b4)[0x55f83fd520b4]
btrfs(alloc_extent_buffer+0x77)[0x55f83fd527af]
btrfs(read_tree_block+0x44)[0x55f83fd45802]
btrfs(btrfs_next_leaf+0x6e)[0x55f83fd43ad9]
btrfs(count_csum_range+0x1e1)[0x55f83fd89fac]
btrfs(+0x14b33)[0x55f83fd3cb33]
btrfs(cmd_check+0x19fb)[0x55f83fd7bfe2]
btrfs(main+0x143)[0x55f83fd3ec87]
/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xe7)[0x7ff69e81cb97]
btrfs(_start+0x2a)[0x55f83fd3ecca]
Aborted (core dumped)



$ btrfs check --repair /dev/sdc
enabling repair mode
Checking filesystem on /dev/sdc
UUID: 5e6506b0-bf15-4b2e-b5f4-322c44b89db6
Fixed 0 roots.
checking extents
ERROR: add_tree_backref failed (extent items shared block): File exists
ERROR: add_tree_backref failed (extent items tree block): File exists
ERROR: add_tree_backref failed (extent items tree block): File exists
ERROR: add_tree_backref failed (non-leaf block): File exists
ERROR: add_tree_backref failed (non-leaf block): File exists
checksum verify failed on 858803961856 found B2C0FAD9 wanted F31F8495
checksum verify failed on 858803961856 found B2C0FAD9 wanted F31F8495
checksum verify failed on 858803961856 found B2C0FAD9 wanted F31F8495
checksum verify failed on 858803961856 found B2C0FAD9 wanted F31F8495
Csum didn't match
checksum verify failed on 858821263360 found 15208BF4 wanted D68B2514
checksum verify failed on 858821263360 found 15208BF4 wanted D68B2514
checksum verify failed on 858821263360 found 15208BF4 wanted D68B2514
checksum verify failed on 858821263360 found 15208BF4 wanted D68B2514
Csum didn't match
well this shouldn't happen, extent record overlaps but is metadata? 
[858803974144, 16384]
Aborted (core dumped)




How can I fix the error?
Is there any possibility to see which files are affected?
Please have a look at the full log attached.

Thanks!

Best regards,
Alexander


[-- Attachment #1.2: btrfs-failure.txt.gz --]
[-- Type: application/gzip, Size: 22645 bytes --]



Thread overview: 6+ messages
2013-08-14 22:17 unable to fixup (regular) error Cameron Berkenpas
2018-11-26  7:19 Alexander Fieroch
2018-11-26  8:13 ` Qu Wenruo
2018-11-26 10:23   ` Alexander Fieroch
2018-11-27  7:18     ` Duncan
2018-11-27  8:28       ` Alexander Fieroch
