* Trying to rescue my data :(
@ 2016-06-24 14:52 Steven Haigh
  2016-06-24 16:26 ` Steven Haigh
  0 siblings, 1 reply; 20+ messages in thread
From: Steven Haigh @ 2016-06-24 14:52 UTC (permalink / raw)
  To: linux-btrfs

[-- Attachment #1.1: Type: text/plain, Size: 6633 bytes --]

Ok, so I figured that despite what the BTRFS wiki seems to imply, the
'multi parity' support just isn't stable enough to be used. So, I'm
trying to revert to what I had before.

My setup consists of:
	* 2 x 3TB drives +
	* 3 x 2TB drives.

I've got (had?) about 4.9TB of data.

My idea was to convert the existing setup using a balance to a 'single'
setup, delete the 3 x 2TB drives from the BTRFS system, then create a
new mdadm based RAID6 (5 drives degraded to 3), create a new filesystem
on that, then copy the data across.

So, great - first the balance:
$ btrfs balance start -dconvert=single -mconvert=single -f (yes, I know
it'll reduce the metadata redundancy).

This promptly was followed by a system crash.

After a reboot, I can no longer mount the BTRFS in read-write:
[ 134.768908] BTRFS info (device xvdd): disk space caching is enabled
[ 134.769032] BTRFS: has skinny extents
[ 134.769856] BTRFS: failed to read the system array on xvdd
[ 134.776055] BTRFS: open_ctree failed
[ 143.900055] BTRFS info (device xvdd): allowing degraded mounts
[ 143.900152] BTRFS info (device xvdd): not using ssd allocation scheme
[ 143.900243] BTRFS info (device xvdd): disk space caching is enabled
[ 143.900330] BTRFS: has skinny extents
[ 143.901860] BTRFS warning (device xvdd): devid 4 uuid 61ccce61-9787-453e-b793-1b86f8015ee1 is missing
[ 146.539467] BTRFS: missing devices(1) exceeds the limit(0), writeable mount is not allowed
[ 146.552051] BTRFS: open_ctree failed

I can mount it read only - but then I also get crashes when it seems to
hit a read error:
BTRFS info (device xvdc): csum failed ino 42179 extent 8690008064 csum 3245290974 wanted 982056704 mirror 0
BTRFS info (device xvdc): csum failed ino 42179 extent 8690008064 csum 390821102 wanted 982056704 mirror 1
BTRFS info (device xvdc): csum failed ino 42179 extent 8690008064 csum 550556475 wanted 982056704 mirror 1
BTRFS info (device xvdc): csum failed ino 42179 extent 8690008064 csum 1279883714 wanted 982056704 mirror 1
BTRFS info (device xvdc): csum failed ino 42179 extent 8690008064 csum 2566472073 wanted 982056704 mirror 1
BTRFS info (device xvdc): csum failed ino 42179 extent 8690008064 csum 1876236691 wanted 982056704 mirror 1
BTRFS info (device xvdc): csum failed ino 42179 extent 8690008064 csum 3350537857 wanted 982056704 mirror 1
BTRFS info (device xvdc): csum failed ino 42179 extent 8690008064 csum 3319706190 wanted 982056704 mirror 1
BTRFS info (device xvdc): csum failed ino 42179 extent 8690008064 csum 2377458007 wanted 982056704 mirror 1
BTRFS info (device xvdc): csum failed ino 42179 extent 8690008064 csum 2066127208 wanted 982056704 mirror 1
BTRFS info (device xvdc): csum failed ino 42179 extent 8690008064 csum 657140479 wanted 982056704 mirror 1
BTRFS info (device xvdc): csum failed ino 42179 extent 8690008064 csum 1239359620 wanted 982056704 mirror 1
BTRFS info (device xvdc): csum failed ino 42179 extent 8690008064 csum 1598877324 wanted 982056704 mirror 1
BTRFS info (device xvdc): csum failed ino 42179 extent 8690008064 csum 1082738394 wanted 982056704 mirror 1
BTRFS info (device xvdc): csum failed ino 42179 extent 8690008064 csum 371906697 wanted 982056704 mirror 1
BTRFS info (device xvdc): csum failed ino 42179 extent 8690008064 csum 2156787247 wanted 982056704 mirror 1
BTRFS info (device xvdc): csum failed ino 42179 extent 8690008064 csum 3777709399 wanted 982056704 mirror 1
BTRFS info (device xvdc): csum failed ino 42179 extent 8690008064 csum 180814340 wanted 982056704 mirror 1
------------[ cut here ]------------
kernel BUG at fs/btrfs/extent_io.c:2401!
invalid opcode: 0000 [#1] SMP
Modules linked in: btrfs x86_pkg_temp_thermal coretemp crct10dif_pclmul
xor aesni_intel aes_x86_64 lrw gf128mul glue_helper pcspkr raid6_pq
ablk_helper cryptd nfsd auth_rpcgss nfs_acl lockd grace sunrpc ip_tables
xen_netfront crc32c_intel xen_gntalloc xen_evtchn ipv6 autofs4
BTRFS info (device xvdc): csum failed ino 42179 extent 8690008064 csum 2610978113 wanted 982056704 mirror 1
BTRFS info (device xvdc): csum failed ino 42179 extent 8690008064 csum 59610051 wanted 982056704 mirror 1
CPU: 1 PID: 1273 Comm: kworker/u4:4 Not tainted 4.4.13-1.el7xen.x86_64 #1
Workqueue: btrfs-endio btrfs_endio_helper [btrfs]
task: ffff880079ce12c0 ti: ffff880078788000 task.ti: ffff880078788000
RIP: e030:[<ffffffffa039e0e0>] [<ffffffffa039e0e0>] btrfs_check_repairable+0x100/0x110 [btrfs]
RSP: e02b:ffff88007878bcc8 EFLAGS: 00010297
RAX: 0000000000000001 RBX: ffff880079db2080 RCX: 0000000000000003
RDX: 0000000000000003 RSI: 000004db13730000 RDI: ffff88007889ef38
RBP: ffff88007878bce0 R08: 000004db01c00000 R09: 000004dbc1c00000
R10: ffff88006bb0c1b8 R11: 0000000000000000 R12: 0000000000000000
R13: ffff88007b213ea8 R14: 0000000000001000 R15: 0000000000000000
FS: 00007fbf2fdc0880(0000) GS:ffff88007f500000(0000) knlGS:0000000000000000
CS: e033 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007fbf2d96702b CR3: 000000007969f000 CR4: 0000000000042660
Stack:
 ffffea00019db180 0000000000010000 ffff88007b213f30 ffff88007878bd88
 ffffffffa03a0808 ffff880002d15500 ffff88007878bd18 ffff880079ce12c0
 ffff88007b213e40 000000000000001f ffff880000000000 ffff88006bb0c048
Call Trace:
 [<ffffffffa03a0808>] end_bio_extent_readpage+0x428/0x560 [btrfs]
 [<ffffffff812f40c0>] bio_endio+0x40/0x60
 [<ffffffffa0375a6c>] end_workqueue_fn+0x3c/0x40 [btrfs]
 [<ffffffffa03af3f1>] normal_work_helper+0xc1/0x300 [btrfs]
 [<ffffffff810a1352>] ? finish_task_switch+0x82/0x280
 [<ffffffffa03af702>] btrfs_endio_helper+0x12/0x20 [btrfs]
 [<ffffffff81093844>] process_one_work+0x154/0x400
 [<ffffffff8109438a>] worker_thread+0x11a/0x460
 [<ffffffff8165a24f>] ? __schedule+0x2bf/0x880
 [<ffffffff81094270>] ? rescuer_thread+0x2f0/0x2f0
 [<ffffffff810993f9>] kthread+0xc9/0xe0
 [<ffffffff81099330>] ? kthread_park+0x60/0x60
 [<ffffffff8165e14f>] ret_from_fork+0x3f/0x70
 [<ffffffff81099330>] ? kthread_park+0x60/0x60
Code: 00 31 c0 eb d5 8d 48 02 eb d9 31 c0 45 89 e0 48 c7 c6 a0 f8 3f a0
48 c7 c7 00 05 41 a0 e8 c9 f2 fa e0 31 c0 e9 70 ff ff ff 0f 0b <0f> 0b
66 66 66 66 66 2e 0f 1f 84 00 00 00 00 00 66 66 66 66 90
RIP [<ffffffffa039e0e0>] btrfs_check_repairable+0x100/0x110 [btrfs]
 RSP <ffff88007878bcc8>
------------[ cut here ]------------
<more crashes until the system hangs>

So, where to from here? Sadly, I feel there is data loss in my future,
but not sure how to minimise this :\

-- 
Steven Haigh

Email: netwiz@crc.id.au
Web: https://www.crc.id.au
Phone: (03) 9001 6090 - 0412 935 897

[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 819 bytes --]

^ permalink raw reply	[flat|nested] 20+ messages in thread
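The capacity side of the plan in the message above can be sanity-checked with a little arithmetic. The sketch below is only that: it assumes the sizes quoted in the thread are decimal GB, that the new RAID6 array is built from 2TB-sized members, and it uses the usual RAID6 rule that usable space is (members - 2) x smallest member. The function names are made up for illustration.

```shell
#!/bin/sh
# Rough capacity check for the migration plan (all sizes in decimal GB).
# Assumption: the "Tb" figures in the thread are treated as decimal terabytes.

# Usable space of a RAID6 array: (members - 2) * smallest member size.
raid6_usable_gb() {   # usage: raid6_usable_gb MEMBERS SMALLEST_MEMBER_GB
    echo $(( ($1 - 2) * $2 ))
}

# Space available with the 'single' profile: plain sum of the device sizes.
single_capacity_gb() {   # usage: single_capacity_gb SIZE_GB...
    total=0
    for size in "$@"; do
        total=$(( total + size ))
    done
    echo "$total"
}

data_gb=4900                                  # "about 4.9Tb of data"
single_gb=$(single_capacity_gb 3000 3000)     # the two 3TB drives kept in btrfs
raid6_gb=$(raid6_usable_gb 5 2000)            # 5-member RAID6 of 2TB members

echo "single on 2 x 3TB:        ${single_gb} GB"
echo "RAID6, 5 members of 2TB:  ${raid6_gb} GB"
if [ "$data_gb" -le "$single_gb" ]; then
    echo "data fits on the two 3TB drives"
fi
if [ "$data_gb" -le "$raid6_gb" ]; then
    echo "data fits on the new array"
fi
```

Both checks pass, but note that a 5-member RAID6 started with only 3 members present has no redundancy left until the other two are added. With mdadm such an array is typically created by passing the keyword `missing` in place of the absent members, e.g. `mdadm --create /dev/md0 --level=6 --raid-devices=5 /dev/sdX /dev/sdY /dev/sdZ missing missing` (device names here are placeholders, not taken from the thread).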
* Re: Trying to rescue my data :(
  2016-06-24 14:52 Trying to rescue my data :( Steven Haigh
@ 2016-06-24 16:26 ` Steven Haigh
  2016-06-24 16:59   ` ronnie sahlberg
  0 siblings, 1 reply; 20+ messages in thread
From: Steven Haigh @ 2016-06-24 16:26 UTC (permalink / raw)
  To: linux-btrfs

[-- Attachment #1.1: Type: text/plain, Size: 13359 bytes --]

On 25/06/16 00:52, Steven Haigh wrote:
> So, where to from here? Sadly, I feel there is data loss in my future,
> but not sure how to minimise this :\
>

The more I look at this, the more I'm wondering if this is a total
corruption scenario:

$ btrfs restore -D -l /dev/xvdc
warning, device 4 is missing
checksum verify failed on 11224137433088 found EF5DE164 wanted 62BE2322
bytenr mismatch, want=11224137433088, have=11224137564160
Couldn't read chunk tree
Could not open root, trying backup super
warning, device 2 is missing
warning, device 4 is missing
warning, device 5 is missing
warning, device 3 is missing
checksum verify failed on 11224137433088 found EF5DE164 wanted 62BE2322
checksum verify failed on 11224137433088 found EF5DE164 wanted 62BE2322
bytenr mismatch, want=11224137433088, have=59973363410688
Couldn't read chunk tree
Could not open root, trying backup super
warning, device 2 is missing
warning, device 4 is missing
warning, device 5 is missing
warning, device 3 is missing
checksum verify failed on 11224137433088 found EF5DE164 wanted 62BE2322
checksum verify failed on 11224137433088 found EF5DE164 wanted 62BE2322
bytenr mismatch, want=11224137433088, have=59973363410688
Couldn't read chunk tree
Could not open root, trying backup super

$ btrfs restore -D -l /dev/xvdd
warning, device 4 is missing
checksum verify failed on 11224137433088 found EF5DE164 wanted 62BE2322
bytenr mismatch, want=11224137433088, have=11224137564160
Couldn't read chunk tree
Could not open root, trying backup super
warning, device 1 is missing
warning, device 4 is missing
warning, device 5 is missing
warning, device 3 is missing
bytenr mismatch, want=11224137170944, have=0
ERROR: cannot read chunk root
Could not open root, trying backup super
warning, device 1 is missing
warning, device 4 is missing
warning, device 5 is missing
warning, device 3 is missing
bytenr mismatch, want=11224137170944, have=0
ERROR: cannot read chunk root
Could not open root, trying backup super

$ btrfs restore -D -l /dev/xvde
warning, device 4 is missing
checksum verify failed on 11224137433088 found EF5DE164 wanted 62BE2322
bytenr mismatch, want=11224137433088, have=11224137564160
Couldn't read chunk tree
Could not open root, trying backup super
warning, device 1 is missing
warning, device 2 is missing
warning, device 4 is missing
warning, device 5 is missing
checksum verify failed on 11224137170944 found C9115A93 wanted 14526E28
checksum verify failed on 11224137170944 found C9115A93 wanted 14526E28
bytenr mismatch, want=11224137170944, have=59973365311232
ERROR: cannot read chunk root
Could not open root, trying backup super
warning, device 1 is missing
warning, device 2 is missing
warning, device 4 is missing
warning, device 5 is missing
checksum verify failed on 11224137170944 found C9115A93 wanted 14526E28
checksum verify failed on 11224137170944 found C9115A93 wanted 14526E28
bytenr mismatch, want=11224137170944, have=59973365311232
ERROR: cannot read chunk root
Could not open root, trying backup super

$ btrfs restore -D -l /dev/xvdf
warning, device 4 is missing
checksum verify failed on 11224137433088 found EF5DE164 wanted 62BE2322
bytenr mismatch, want=11224137433088, have=11224137564160
Couldn't read chunk tree
Could not open root, trying backup super
warning, device 1 is missing
warning, device 2 is missing
warning, device 4 is missing
warning, device 5 is missing
warning, device 3 is missing
bytenr mismatch, want=11224137170944, have=0
ERROR: cannot read chunk root
Could not open root, trying backup super
warning, device 1 is missing
warning, device 2 is missing
warning, device 4 is missing
warning, device 5 is missing
warning, device 3 is missing
bytenr mismatch, want=11224137170944, have=0
ERROR: cannot read chunk root
Could not open root, trying backup super

$ btrfs restore -D -l /dev/xvdg
warning, device 4 is missing
checksum verify failed on 11224137433088 found EF5DE164 wanted 62BE2322
bytenr mismatch, want=11224137433088, have=11224137564160
Couldn't read chunk tree
Could not open root, trying backup super
warning, device 1 is missing
warning, device 2 is missing
warning, device 4 is missing
warning, device 3 is missing
bytenr mismatch, want=11224137170944, have=11224137105408
ERROR: cannot read chunk root
Could not open root, trying backup super
warning, device 1 is missing
warning, device 2 is missing
warning, device 4 is missing
warning, device 3 is missing
bytenr mismatch, want=11224137170944, have=11224137105408
ERROR: cannot read chunk root
Could not open root, trying backup super

If I mount it read only:
$ mount -o nossd,degraded,ro /dev/xvdc /mnt/fileshare/

$ btrfs device usage /mnt/fileshare/

/dev/xvdc, ID: 1
   Device size:      2.73TiB
   Device slack:     0.00B
   Data,single:      5.00GiB
   Data,RAID6:       1.60TiB
   Data,RAID6:       2.75GiB
   Data,RAID6:       1.00GiB
   Metadata,RAID6:   2.06GiB
   System,RAID6:     32.00MiB
   Unallocated:      1.12TiB

/dev/xvdd, ID: 2
   Device size:      2.73TiB
   Device slack:     0.00B
   Data,single:      1.00GiB
   Data,RAID6:       1.60TiB
   Data,RAID6:       7.07GiB
   Data,RAID6:       1.00GiB
   Metadata,RAID6:   2.06GiB
   System,RAID6:     32.00MiB
   Unallocated:      1.12TiB

/dev/xvde, ID: 3
   Device size:      1.82TiB
   Device slack:     0.00B
   Data,RAID6:       1.60TiB
   Data,RAID6:       7.07GiB
   Metadata,RAID6:   2.06GiB
   System,RAID6:     32.00MiB
   Unallocated:      213.23GiB

/dev/xvdf, ID: 6
   Device size:      1.82TiB
   Device slack:     0.00B
   Data,RAID6:       882.62GiB
   Data,RAID6:       1.00GiB
   Metadata,RAID6:   2.06GiB
   Unallocated:      977.33GiB

/dev/xvdg, ID: 5
   Device size:      1.82TiB
   Device slack:     0.00B
   Data,RAID6:       1.60TiB
   Data,RAID6:       7.07GiB
   Metadata,RAID6:   2.06GiB
   System,RAID6:     32.00MiB
   Unallocated:      213.23GiB

missing, ID: 4
   Device size:      0.00B
   Device slack:     16.00EiB
   Data,RAID6:       758.00GiB
   Data,RAID6:       4.31GiB
   System,RAID6:     32.00MiB
   Unallocated:      1.07TiB

Hoping this isn't a total loss ;)

-- 
Steven Haigh

Email: netwiz@crc.id.au
Web: https://www.crc.id.au
Phone: (03) 9001 6090 - 0412 935 897
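The `btrfs device usage` listing in the message above is easier to read once the per-device figures are totalled per profile, which gives a rough idea of how little of the data had been converted to 'single' before the crash. A minimal sketch, assuming the usual one-entry-per-line layout of that command's output; `usage_by_profile` is a made-up helper name, and the totals are raw per-device allocation (for RAID6 that includes parity), not user data.

```shell
#!/bin/sh
# Sum the "Data,...", "Metadata,..." and "System,..." lines of
# `btrfs device usage` output per profile and print the totals in GiB.
usage_by_profile() {
    awk '
    # Convert a size such as "1.60TiB" or "32.00MiB" to GiB.
    function to_gib(s,    n) {
        n = s + 0                      # awk keeps the leading numeric part
        if (s ~ /TiB$/) return n * 1024
        if (s ~ /GiB$/) return n
        if (s ~ /MiB$/) return n / 1024
        if (s ~ /KiB$/) return n / 1048576
        return n / 1073741824          # plain bytes, e.g. "0.00B"
    }
    /^[ \t]*(Data|Metadata|System),/ {
        profile = $1
        sub(/:$/, "", profile)         # "Data,RAID6:" -> "Data,RAID6"
        total[profile] += to_gib($2)
    }
    END {
        for (p in total)
            printf "%s %.2f GiB\n", p, total[p]
    }' | sort
}
```

Typical use would be `btrfs device usage /mnt/fileshare/ | usage_by_profile`. On the figures posted above, the only 'single' chunks are 5.00GiB on device 1 and 1.00GiB on device 2, so the balance appears to have converted very little before the machine went down.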
* Re: Trying to rescue my data :(
  2016-06-24 16:26 ` Steven Haigh
@ 2016-06-24 16:59   ` ronnie sahlberg
  2016-06-24 17:05     ` Steven Haigh
  0 siblings, 1 reply; 20+ messages in thread
From: ronnie sahlberg @ 2016-06-24 16:59 UTC (permalink / raw)
  To: Steven Haigh; +Cc: Btrfs BTRFS

What I would do in this situation:

1, Immediately stop writing to these disks/filesystem. ONLY access it
in read-only mode until you have salvaged what can be salvaged.
2, Get a new 5TB USB drive (they are cheap) and copy file by file off the array.
3, When you hit files that cause panics, make a note of the inode and
avoid touching that file again.

Will likely take a lot of work and time since I suspect it is a
largely manual process. But if the data is important ...


Once you have all salvageable data copied to the new drive you can
decide on how to proceed.
I.e. if you want to try to repair the filesystem (I have low
confidence in this for the parity RAID case) or if you will simply rebuild
a new fs from scratch.
* Re: Trying to rescue my data :(
  2016-06-24 16:59 ` ronnie sahlberg
@ 2016-06-24 17:05   ` Steven Haigh
  2016-06-24 17:40     ` Austin S. Hemmelgarn
  0 siblings, 1 reply; 20+ messages in thread
From: Steven Haigh @ 2016-06-24 17:05 UTC
To: linux-btrfs

On 25/06/16 02:59, ronnie sahlberg wrote:
> What I would do in this situation:
>
> 1. Immediately stop writing to these disks/filesystem. ONLY access it
> in read-only mode until you have salvaged what can be salvaged.

That's ok - I can't even mount it in RW mode :)

> 2. Get a new 5T USB drive (they are cheap) and copy file by file off the array.

I've actually got enough combined space to store stuff in other places
in the meantime...

> 3. When you hit files that cause panics, make a note of the inode and
> avoid touching that file again.

What I have in mind here is that a file seems to get CREATED when I copy
the file that crashes the system in the target directory. I'm thinking
if I 'cp -an source/ target/' that it will make this somewhat easier (it
won't overwrite the zero byte file).

> Will likely take a lot of work and time since I suspect it is a
> largely manual process. But if the data is important ...

Yeah - there's only about 80Gb on the array that I *really* care about -
the rest is just a bonus if it's there - not rage-worthy :P

> Once you have all salvageable data copied to the new drive you can
> decide on how to proceed.
> I.e. if you want to try to repair the filesystem (I have low
> confidence in this for parity raid case) or if you will simply rebuild
> a new fs from scratch.

I honestly think it'll be scorched earth and start again with a new FS.
I'm thinking of going back to mdadm for the RAID (which has worked
perfectly for years) and using maybe a vanilla BTRFS on top of that
block device.

Anything else seems like too much work for too little reward - and lack
of confidence.

-- 
Steven Haigh

Email: netwiz@crc.id.au
Web: https://www.crc.id.au
Phone: (03) 9001 6090 - 0412 935 897
* Re: Trying to rescue my data :(
  2016-06-24 17:05 ` Steven Haigh
@ 2016-06-24 17:40   ` Austin S. Hemmelgarn
  2016-06-24 17:43     ` Steven Haigh
  0 siblings, 1 reply; 20+ messages in thread
From: Austin S. Hemmelgarn @ 2016-06-24 17:40 UTC
To: Steven Haigh, linux-btrfs

On 2016-06-24 13:05, Steven Haigh wrote:
> On 25/06/16 02:59, ronnie sahlberg wrote:
> What I have in mind here is that a file seems to get CREATED when I copy
> the file that crashes the system in the target directory. I'm thinking
> if I 'cp -an source/ target/' that it will make this somewhat easier (it
> won't overwrite the zero byte file).

You may want to try with rsync (rsync -vahogSHAXOP should get just about
everything possible out of the filesystem except for some security
attributes (stuff like SELinux context), and will give you nice
information about progress as well). It will keep running in the face
of individual read errors, and will only try each file once. It also
has the advantage of showing you the transfer rate and exactly where in
the directory structure you are, and handles partial copies sanely too
(it's more reliable restarting an rsync transfer than a cp one that got
interrupted part way through).
* Re: Trying to rescue my data :(
  2016-06-24 17:40 ` Austin S. Hemmelgarn
@ 2016-06-24 17:43   ` Steven Haigh
  2016-06-24 17:50     ` Austin S. Hemmelgarn
  0 siblings, 1 reply; 20+ messages in thread
From: Steven Haigh @ 2016-06-24 17:43 UTC
To: linux-btrfs

On 25/06/16 03:40, Austin S. Hemmelgarn wrote:
> On 2016-06-24 13:05, Steven Haigh wrote:
>> On 25/06/16 02:59, ronnie sahlberg wrote:
>> What I have in mind here is that a file seems to get CREATED when I copy
>> the file that crashes the system in the target directory. I'm thinking
>> if I 'cp -an source/ target/' that it will make this somewhat easier (it
>> won't overwrite the zero byte file).
> You may want to try with rsync (rsync -vahogSHAXOP should get just about
> everything possible out of the filesystem except for some security
> attributes (stuff like SELinux context), and will give you nice
> information about progress as well). It will keep running in the face
> of individual read errors, and will only try each file once. It also
> has the advantage of showing you the transfer rate and exactly where in
> the directory structure you are, and handles partial copies sanely too
> (it's more reliable restarting an rsync transfer than a cp one that got
> interrupted part way through).

I may try that - I came up with this:

#!/bin/bash

mount -o ro,nossd,degraded /dev/xvdc /mnt/fileshare/

find /mnt/fileshare/data/Photos/ -type f -print0 |
    while IFS= read -r -d $'\0' line; do
        echo "Processing $line"
        DIR=`dirname "$line"`
        mkdir -p "/mnt/recover/$DIR"
        if [ ! -e "/mnt/recover/$line" ]; then
            echo "Copying $line to /mnt/recover/$line"
            touch "/mnt/recover/$line"
            sync
            cp -f "$line" "/mnt/recover/$line"
            sync
        fi
    done

umount /mnt/fileshare

I'm slowly picking through the data - and it has crashed a few times...
It seems that there are some checksum failures that don't crash the
entire system - so that's a good thing to know - not sure if that means
that it is correcting the data with parity - or something else.

I'll see how much data I can extract with this and go from there - as it
may be good enough to call it a success.

-- 
Steven Haigh

Email: netwiz@crc.id.au
Web: https://www.crc.id.au
Phone: (03) 9001 6090 - 0412 935 897
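
A minimal sketch of the same touch-then-copy idea from the script above, restructured as a function so that the mount step stays separate. The function name copy_once and the example paths are assumptions for illustration, not part of the original script. Unlike the original, it recreates only the path relative to the source root under the destination. It relies on nothing beyond find, mkdir, cp and sync.

```bash
# copy_once SRC_ROOT DST_ROOT   (hypothetical helper name)
# Copies every regular file under SRC_ROOT to the same relative path under
# DST_ROOT, but never touches a destination that already exists. A file
# whose read crashed the host on an earlier boot is therefore skipped.
copy_once() {
    src_root=${1%/}
    dst_root=${2%/}
    find "$src_root" -type f -exec sh -c '
        src_root=$1; dst_root=$2; shift 2
        for src do
            rel=${src#"$src_root"/}
            dst=$dst_root/$rel
            # Already copied, or attempted before a crash: leave it alone.
            [ -e "$dst" ] && continue
            mkdir -p "$(dirname "$dst")"
            # Create the placeholder and flush it BEFORE reading the source,
            # so it survives a crash triggered by the read itself.
            : > "$dst"
            sync
            cp -f -- "$src" "$dst" || echo "read error, left as-is: $rel" >&2
            sync
        done
    ' sh "$src_root" "$dst_root" {} +
}
```

Called as `copy_once /mnt/fileshare/data/Photos /mnt/recover/Photos` after the read-only mount, it skips anything that already has a placeholder, which is what lets the copy make forward progress across reboots. Zero-byte files left in the destination are then the list of files that were never read successfully.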
* Re: Trying to rescue my data :(
  2016-06-24 17:43 ` Steven Haigh
@ 2016-06-24 17:50   ` Austin S. Hemmelgarn
  2016-06-25  4:19     ` Steven Haigh
  0 siblings, 1 reply; 20+ messages in thread
From: Austin S. Hemmelgarn @ 2016-06-24 17:50 UTC
To: Steven Haigh, linux-btrfs

On 2016-06-24 13:43, Steven Haigh wrote:
> On 25/06/16 03:40, Austin S. Hemmelgarn wrote:
>> On 2016-06-24 13:05, Steven Haigh wrote:
>>> On 25/06/16 02:59, ronnie sahlberg wrote:
>>> What I have in mind here is that a file seems to get CREATED when I copy
>>> the file that crashes the system in the target directory. I'm thinking
>>> if I 'cp -an source/ target/' that it will make this somewhat easier (it
>>> won't overwrite the zero byte file).
>> You may want to try with rsync (rsync -vahogSHAXOP should get just about
>> everything possible out of the filesystem except for some security
>> attributes (stuff like SELinux context), and will give you nice
>> information about progress as well). It will keep running in the face
>> of individual read errors, and will only try each file once. It also
>> has the advantage of showing you the transfer rate and exactly where in
>> the directory structure you are, and handles partial copies sanely too
>> (it's more reliable restarting an rsync transfer than a cp one that got
>> interrupted part way through).
>
> I may try that - I came up with this:
> #!/bin/bash
>
> mount -o ro,nossd,degraded /dev/xvdc /mnt/fileshare/
>
> find /mnt/fileshare/data/Photos/ -type f -print0 |
>     while IFS= read -r -d $'\0' line; do
>         echo "Processing $line"
>         DIR=`dirname "$line"`
>         mkdir -p "/mnt/recover/$DIR"
>         if [ ! -e "/mnt/recover/$line" ]; then
>             echo "Copying $line to /mnt/recover/$line"
>             touch "/mnt/recover/$line"
>             sync
>             cp -f "$line" "/mnt/recover/$line"
>             sync
>         fi
>     done
>
> umount /mnt/fileshare
>
> I'm slowly picking through the data - and it has crashed a few times...
> It seems that there are some checksum failures that don't crash the
> entire system - so that's a good thing to know - not sure if that means
> that it is correcting the data with parity - or something else.
>
> I'll see how much data I can extract with this and go from there - as it
> may be good enough to call it a success.
>
Ah, if you're having issues with crashes when you hit errors, you may
want to avoid rsync then: it will try to reread any files that don't
match in size and mtime, so it would likely just keep crashing on the
same file over and over again.

Also, looking at the script you've got, that will probably run faster
too, because it shouldn't need to call stat() on everything like rsync
does (because of the size and mtime comparison).
* Re: Trying to rescue my data :(
  2016-06-24 17:50 ` Austin S. Hemmelgarn
@ 2016-06-25  4:19   ` Steven Haigh
  2016-06-25 16:25     ` Chris Murphy
  0 siblings, 1 reply; 20+ messages in thread
From: Steven Haigh @ 2016-06-25 4:19 UTC
To: linux-btrfs

On 25/06/2016 3:50 AM, Austin S. Hemmelgarn wrote:
> On 2016-06-24 13:43, Steven Haigh wrote:
>> On 25/06/16 03:40, Austin S. Hemmelgarn wrote:
>>> On 2016-06-24 13:05, Steven Haigh wrote:
>>>> On 25/06/16 02:59, ronnie sahlberg wrote:
>>>> What I have in mind here is that a file seems to get CREATED when I copy
>>>> the file that crashes the system in the target directory. I'm thinking
>>>> if I 'cp -an source/ target/' that it will make this somewhat easier (it
>>>> won't overwrite the zero byte file).
>>> You may want to try with rsync (rsync -vahogSHAXOP should get just about
>>> everything possible out of the filesystem except for some security
>>> attributes (stuff like SELinux context), and will give you nice
>>> information about progress as well). It will keep running in the face
>>> of individual read errors, and will only try each file once. It also
>>> has the advantage of showing you the transfer rate and exactly where in
>>> the directory structure you are, and handles partial copies sanely too
>>> (it's more reliable restarting an rsync transfer than a cp one that got
>>> interrupted part way through).
>>
>> I may try that - I came up with this:
>> #!/bin/bash
>>
>> mount -o ro,nossd,degraded /dev/xvdc /mnt/fileshare/
>>
>> find /mnt/fileshare/data/Photos/ -type f -print0 |
>>     while IFS= read -r -d $'\0' line; do
>>         echo "Processing $line"
>>         DIR=`dirname "$line"`
>>         mkdir -p "/mnt/recover/$DIR"
>>         if [ ! -e "/mnt/recover/$line" ]; then
>>             echo "Copying $line to /mnt/recover/$line"
>>             touch "/mnt/recover/$line"
>>             sync
>>             cp -f "$line" "/mnt/recover/$line"
>>             sync
>>         fi
>>     done
>>
>> umount /mnt/fileshare
>>
>> I'm slowly picking through the data - and it has crashed a few times...
>> It seems that there are some checksum failures that don't crash the
>> entire system - so that's a good thing to know - not sure if that means
>> that it is correcting the data with parity - or something else.
>>
>> I'll see how much data I can extract with this and go from there - as it
>> may be good enough to call it a success.
>>
> AH, if you're having issues with crashes when you hit errors, you may
> want to avoid rsync then, it will try to reread any files that don't
> match in size and mtime, so it would likely just keep crashing on the
> same file over and over again.
>
> Also, looking at the script you've got, that will probably run faster
> too because it shouldn't need to call stat() on everything like rsync
> does (because of the size and mtime comparison).

Well, as a data point, the data is slowly coming off the RAID6 array.
Some stuff is just dead and crashes the entire host whenever you try to
access it. At the moment, my average uptime is about 2-3 minutes...

I've added my recovery rsync script to /etc/rc.local - and I'm just
starting / destroying the VM every time it crashes.

I'm also rsync'ing the data from that system out to other areas of
storage so I can pull off as much data as possible (I don't have a spare
4.4Tb to use).

I lost a total of 5 photos out of 83Gb worth - which is good. My music
collection doesn't seem to be that lucky - which means lots of time
ripping CDs in the future :P

I haven't tried the applications / ISOs directory yet - but we'll see
how that goes when I get there...

The photos were the main thing I was concerned about, the rest is just
handy.

Interesting though that EVERY crash references:
    kernel BUG at fs/btrfs/extent_io.c:2401!

-- 
Steven Haigh

Email: netwiz@crc.id.au
Web: https://www.crc.id.au
Phone: (03) 9001 6090 - 0412 935 897
* Re: Trying to rescue my data :(
  2016-06-25  4:19 ` Steven Haigh
@ 2016-06-25 16:25   ` Chris Murphy
  2016-06-25 16:39     ` Steven Haigh
  0 siblings, 1 reply; 20+ messages in thread
From: Chris Murphy @ 2016-06-25 16:25 UTC
To: Steven Haigh; +Cc: Btrfs BTRFS

On Fri, Jun 24, 2016 at 10:19 PM, Steven Haigh <netwiz@crc.id.au> wrote:

>
> Interesting though that EVERY crash references:
>     kernel BUG at fs/btrfs/extent_io.c:2401!

Yeah because you're mounted ro, and if this is 4.4.13 unmodified btrfs
from kernel.org then that's the 3rd line:

    if (head->is_data) {
        ret = btrfs_del_csums(trans, root,
                              node->bytenr,
                              node->num_bytes);

So why/what is it cleaning up if it's mounted ro? Anyway, once you're
no longer making forward progress you could try something newer,
although it's a coin toss what to try. There are some issues with
4.6.0-4.6.2 but there have been a lot of changes in btrfs/extent_io.c
and btrfs/raid56.c between 4.4.13 that you're using and 4.6.2, so you
could try that or even build 4.7.rc4 or rc5 by tomorrowish and see how
that fares. It sounds like there's just too much (mostly metadata)
corruption for the degraded state to deal with so it may not matter.
I'm really skeptical of btrfsck on degraded fs's so I don't think
that'll help.

-- 
Chris Murphy
* Re: Trying to rescue my data :(
  2016-06-25 16:25 ` Chris Murphy
@ 2016-06-25 16:39   ` Steven Haigh
  2016-06-25 17:14     ` Chris Murphy
  2016-06-26  2:30     ` Duncan
  0 siblings, 2 replies; 20+ messages in thread
From: Steven Haigh @ 2016-06-25 16:39 UTC
To: Chris Murphy; +Cc: Btrfs BTRFS

On 26/06/16 02:25, Chris Murphy wrote:
> On Fri, Jun 24, 2016 at 10:19 PM, Steven Haigh <netwiz@crc.id.au> wrote:
>
>>
>> Interesting though that EVERY crash references:
>>     kernel BUG at fs/btrfs/extent_io.c:2401!
>
> Yeah because you're mounted ro, and if this is 4.4.13 unmodified btrfs
> from kernel.org then that's the 3rd line:
>
>     if (head->is_data) {
>         ret = btrfs_del_csums(trans, root,
>                               node->bytenr,
>                               node->num_bytes);
>
> So why/what is it cleaning up if it's mounted ro? Anyway, once you're
> no longer making forward progress you could try something newer,
> although it's a coin toss what to try. There are some issues with
> 4.6.0-4.6.2 but there have been a lot of changes in btrfs/extent_io.c
> and btrfs/raid56.c between 4.4.13 that you're using and 4.6.2, so you
> could try that or even build 4.7.rc4 or rc5 by tomorrowish and see how
> that fares. It sounds like there's just too much (mostly metadata)
> corruption for the degraded state to deal with so it may not matter.
> I'm really skeptical of btrfsck on degraded fs's so I don't think
> that'll help.

Well, I did end up recovering the data that I cared about. I'm not
really keen to ride the BTRFS RAID6 train again any time soon :\

I now have the same as I've had for years - md RAID6 with XFS on top of
it. I'm still copying data back to the array from the various places I
had to spread it across to have enough space.

What I find interesting is that the pattern of corruption in the BTRFS
RAID6 is quite clustered. I have ~80Gb of MP3s ripped over the years -
of that, the corruption would take out 3-4 songs in a row, then the next
10 albums or so were intact. What made recovery VERY hard is that it
got into several situations that just caused a complete system hang.

I tried it on bare metal - just in case it was a Xen thing, but it hard
hung the entire machine then. In every case, it was a flurry of csum
error messages, then instant death. I would have been much happier if
the file had been skipped or returned as unavailable instead of having
the entire machine crash.

I ended up putting the bit of script that I posted earlier in
/etc/rc.local - then just kept doing:

    xl destroy myvm && xl create /etc/xen/myvm -c

Wait for the crash, run the above again.

All in all, it took me about 350 boots with an average uptime of about 3
minutes to get the data out that I decided to keep. While not a BTRFS
loss, I did decide, given how long it was going to take, not to bother
recovering ~3.5Tb of other data that is easily available in other places
on the internet. If I really need the Fedora 24 KDE Spin ISO, or the
CentOS 6 Install DVD, etc etc I can download it again.

-- 
Steven Haigh

Email: netwiz@crc.id.au
Web: https://www.crc.id.au
Phone: (03) 9001 6090 - 0412 935 897
* Re: Trying to rescue my data :( 2016-06-25 16:39 ` Steven Haigh @ 2016-06-25 17:14 ` Chris Murphy 2016-06-26 2:30 ` Duncan 1 sibling, 0 replies; 20+ messages in thread From: Chris Murphy @ 2016-06-25 17:14 UTC (permalink / raw) To: Steven Haigh; +Cc: Chris Murphy, Btrfs BTRFS On Sat, Jun 25, 2016 at 10:39 AM, Steven Haigh <netwiz@crc.id.au> wrote: > Well, I did end up recovering the data that I cared about. I'm not > really keen to ride the BTRFS RAID6 train again any time soon :\ > > I now have the same as I've had for years - md RAID6 with XFS on top of > it. I'm still copying data back to the array from the various sources I > had to copy it to so I had enough space to do so. Just make sure you've got each drive's SCT ERC shorter than the kernel SCSI command timer for each block device in /sys/block/device-name/device/timeout or you can very easily end up with the same if not worse problem which is total array collapse. It's more rare to see the problem on mdraid6 because the extra parity ends up papering over the problem caused by this misconfiguration, but it's a misconfiguration that's the default unless you're using enterprise/NAS specific drives with short recoveries set on them by default. The linux-raid@ list is full of problems resulting from this issue. I think the obvious mistake here though is assuming reshapes entail no risk. There's a -f required for a reason. You could have ended up in just as bad situation doing a reshape without a backup of an md or lvm based array. Yes it should work, and if it doesn't it's a bug, but how much data do you want to lose today? > What I find interesting is that the patterns of corruption in the BTRFS > RAID6 is quite clustered. I have ~80Gb of MP3s ripped over the years - > of that, the corruption would take out 3-4 songs in a row, then the next > 10 albums or so were intact. What made recovery VERY hard, is that it > got to several situations that just caused a complete system hang. 
The data stripe size is 64KiB * (num of disks - 2). So in your case I
think that's 64 * 3 = 192KiB. That's less than the size of one song, so
that means roughly 15 bad stripes in a row. That's less than a block
group also.

The Btrfs conversion should be safer than methods used by mdadm and lvm
because the operation is cow. The raid6 block group is supposed to
remain intact and "live" if you will, until the single block group is
written to stable media. The full crash set of kernel messages might be
useful to find out what was happening that instigated all of this
corruption. But even so, the subsequent mount should at worst roll back
to a state with block groups of different profiles, in which the block
group from the most recent (failed) conversion is still an intact raid6
block group.

So, I'd still say btrfs-image it and host it somewhere, file a bug,
cross reference this thread in the bug, and the bug URL in this thread.
Might take months or even a year before a dev looks at it, but better
than nothing.

>
> I tried it on bare metal - just in case it was a Xen thing, but it hard
> hung the entire machine then. In every case, it was a flurry of csum
> error messages, then instant death. I would have been much happier if
> the file had been skipped or returned as unavailable instead of having
> the entire machine crash.

Of course. The unanswered question though is why are there so many
csum errors? Are these metadata csum errors, or are they EXTENT_CSUM
errors, and how are they becoming wrong? Wrongly read, wrongly written,
wrongly recomputed from parity? How did the parity go bad if that's the
case?

So it needs an autopsy or it just doesn't get better.

--
Chris Murphy
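[Editor's sketch, not part of the original message.] The SCT ERC advice in the message above boils down to one comparison per drive: the drive's error recovery limit has to be shorter than the kernel's SCSI command timer. The following is a minimal sketch of that check. It assumes the ERC value is expressed in deciseconds (the unit smartctl uses for scterc) and the kernel timer in seconds (the unit of /sys/block/<dev>/device/timeout). The function name erc_is_shorter is hypothetical, and the values 70 and 30 are illustrative only.

```shell
#!/bin/sh
# Sketch: compare a drive's SCT ERC limit with the kernel SCSI command timer.
# Assumptions: ERC in deciseconds, kernel timer in seconds.
# An ERC of 0 means "no limit", so it never counts as shorter.
erc_is_shorter() {
    erc_deciseconds=$1
    timeout_seconds=$2
    [ "$erc_deciseconds" -gt 0 ] &&
        [ "$erc_deciseconds" -lt $(( timeout_seconds * 10 )) ]
}

# Report the kernel command timer for every sd* block device present.
for t in /sys/block/sd*/device/timeout; do
    [ -r "$t" ] || continue
    dev=${t#/sys/block/}
    dev=${dev%%/*}
    echo "$dev: kernel command timer $(cat "$t")s"
done

# The ERC value itself has to be read per drive (needs root), for example:
#   smartctl -l scterc /dev/sdX          # show current read/write ERC
#   smartctl -l scterc,70,70 /dev/sdX    # set both to 7.0 seconds
# Where a drive does not support SCT ERC, a commonly suggested alternative
# is to raise the kernel timer instead:
#   echo 180 > /sys/block/sdX/device/timeout

# Illustrative values only: 7.0s ERC against a 30s command timer.
if erc_is_shorter 70 30; then
    echo "ERC 7.0s is shorter than a 30s command timer: ok"
fi
```

Neither setting is guaranteed to survive a reboot or power cycle, so a check like this would probably have to run at boot; treat that as an assumption to verify on your own hardware.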
* Re: Trying to rescue my data :( 2016-06-25 16:39 ` Steven Haigh 2016-06-25 17:14 ` Chris Murphy @ 2016-06-26 2:30 ` Duncan 2016-06-26 3:13 ` Steven Haigh 1 sibling, 1 reply; 20+ messages in thread From: Duncan @ 2016-06-26 2:30 UTC (permalink / raw) To: linux-btrfs Steven Haigh posted on Sun, 26 Jun 2016 02:39:23 +1000 as excerpted: > In every case, it was a flurry of csum error messages, then instant > death. This is very possibly a known bug in btrfs, that occurs even in raid1 where a later scrub repairs all csum errors. While in theory btrfs raid1 should simply pull from the mirrored copy if its first try fails checksum (assuming the second one passes, of course), and it seems to do this just fine if there's only an occasional csum error, if it gets too many at once, it *does* unfortunately crash, despite the second copy being available and being just fine as later demonstrated by the scrub fixing the bad copy from the good one. I'm used to dealing with that here any time I have a bad shutdown (and I'm running live-git kde, which currently has a bug that triggers a system crash if I let it idle and shut off the monitors, so I've been getting crash shutdowns and having to deal with this unfortunately often, recently). Fortunately I keep my root, with all system executables, etc, mounted read-only by default, so it's not affected and I can /almost/ boot normally after such a crash. The problem is /var/log and /home (which has some parts of /var that need to be writable symlinked into / home/var, so / can stay read-only). Something in the normal after-crash boot triggers enough csum errors there that I often crash again. So I have to boot to emergency mode and manually mount the filesystems in question, so nothing's trying to access them until I run the scrub and fix the csum errors. 
Scrub itself doesn't trigger the crash, thankfully, and once it has repaired all the csum errors due to partial writes on one mirror that either were never made or were properly completed on the other mirror, I can exit emergency mode and complete the normal boot (to the multi-user default target). As there's no more csum errors then because scrub fixed them all, the boot doesn't crash due to too many such errors, and I'm back in business. Tho I believe at least the csum bug that affects me may only trigger if compression is (or perhaps has been in the past) enabled. Since I run compress=lzo everywhere, that would certainly affect me. It would also explain why the bug has remained around for quite some time as well, since presumably the devs don't run with compression on enough for this to have become a personal itch they needed to scratch, thus its remaining untraced and unfixed. So if you weren't using the compress option, your bug is probably different, but either way, the whole thing about too many csum errors at once triggering a system crash sure does sound familiar, here. -- Duncan - List replies preferred. No HTML msgs. "Every nonfree program has a lord, a master -- and if you use the program, he is your master." Richard Stallman ^ permalink raw reply [flat|nested] 20+ messages in thread
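[Editor's sketch, not part of the original message.] The recovery routine described above (boot to emergency mode, mount the affected filesystems by hand, scrub them, then continue the boot) could be scripted roughly as follows. This is a sketch under assumptions: the mount points /home and /var/log are taken from the message but may not match any real layout, each is assumed to have an /etc/fstab entry, and the helper names run and scrub_before_boot are hypothetical. By default the script only prints what it would do.

```shell
#!/bin/sh
# Sketch of the manual recovery described above, meant to be run from
# emergency/rescue mode. DRY_RUN=1 (the default here) only prints commands.
DRY_RUN=${DRY_RUN:-1}

run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

scrub_before_boot() {
    for mnt in "$@"; do
        run mount "$mnt"                 # relies on an /etc/fstab entry
        run btrfs scrub start -B "$mnt"  # -B: stay in the foreground until done
        run btrfs scrub status "$mnt"    # summary, including uncorrectable errors
    done
}

# Hypothetical mount points, named after the filesystems in the message.
scrub_before_boot /home /var/log
```

Once the scrub reports no uncorrectable errors, leaving the emergency shell (or running systemctl default) should continue the normal boot.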
* Re: Trying to rescue my data :( 2016-06-26 2:30 ` Duncan @ 2016-06-26 3:13 ` Steven Haigh 2016-09-11 19:48 ` compress=lzo safe to use? (was: Re: Trying to rescue my data :() Martin Steigerwald 0 siblings, 1 reply; 20+ messages in thread From: Steven Haigh @ 2016-06-26 3:13 UTC (permalink / raw) To: linux-btrfs [-- Attachment #1.1: Type: text/plain, Size: 3072 bytes --] On 26/06/16 12:30, Duncan wrote: > Steven Haigh posted on Sun, 26 Jun 2016 02:39:23 +1000 as excerpted: > >> In every case, it was a flurry of csum error messages, then instant >> death. > > This is very possibly a known bug in btrfs, that occurs even in raid1 > where a later scrub repairs all csum errors. While in theory btrfs raid1 > should simply pull from the mirrored copy if its first try fails checksum > (assuming the second one passes, of course), and it seems to do this just > fine if there's only an occasional csum error, if it gets too many at > once, it *does* unfortunately crash, despite the second copy being > available and being just fine as later demonstrated by the scrub fixing > the bad copy from the good one. > > I'm used to dealing with that here any time I have a bad shutdown (and > I'm running live-git kde, which currently has a bug that triggers a > system crash if I let it idle and shut off the monitors, so I've been > getting crash shutdowns and having to deal with this unfortunately often, > recently). Fortunately I keep my root, with all system executables, etc, > mounted read-only by default, so it's not affected and I can /almost/ > boot normally after such a crash. The problem is /var/log and /home > (which has some parts of /var that need to be writable symlinked into / > home/var, so / can stay read-only). Something in the normal after-crash > boot triggers enough csum errors there that I often crash again. 
> > So I have to boot to emergency mode and manually mount the filesystems in > question, so nothing's trying to access them until I run the scrub and > fix the csum errors. Scrub itself doesn't trigger the crash, thankfully, > and once it has repaired all the csum errors due to partial writes on one > mirror that either were never made or were properly completed on the > other mirror, I can exit emergency mode and complete the normal boot (to > the multi-user default target). As there's no more csum errors then > because scrub fixed them all, the boot doesn't crash due to too many such > errors, and I'm back in business. > > > Tho I believe at least the csum bug that affects me may only trigger if > compression is (or perhaps has been in the past) enabled. Since I run > compress=lzo everywhere, that would certainly affect me. It would also > explain why the bug has remained around for quite some time as well, > since presumably the devs don't run with compression on enough for this > to have become a personal itch they needed to scratch, thus its remaining > untraced and unfixed. > > So if you weren't using the compress option, your bug is probably > different, but either way, the whole thing about too many csum errors at > once triggering a system crash sure does sound familiar, here. Yes, I was running the compress=lzo option as well... Maybe here lays a common problem? -- Steven Haigh Email: netwiz@crc.id.au Web: https://www.crc.id.au Phone: (03) 9001 6090 - 0412 935 897 [-- Attachment #2: OpenPGP digital signature --] [-- Type: application/pgp-signature, Size: 819 bytes --] ^ permalink raw reply [flat|nested] 20+ messages in thread
* compress=lzo safe to use? (was: Re: Trying to rescue my data :() 2016-06-26 3:13 ` Steven Haigh @ 2016-09-11 19:48 ` Martin Steigerwald 2016-09-11 20:06 ` Adam Borowski ` (2 more replies) 0 siblings, 3 replies; 20+ messages in thread From: Martin Steigerwald @ 2016-09-11 19:48 UTC (permalink / raw) To: Steven Haigh; +Cc: linux-btrfs Am Sonntag, 26. Juni 2016, 13:13:04 CEST schrieb Steven Haigh: > On 26/06/16 12:30, Duncan wrote: > > Steven Haigh posted on Sun, 26 Jun 2016 02:39:23 +1000 as excerpted: > >> In every case, it was a flurry of csum error messages, then instant > >> death. > > > > This is very possibly a known bug in btrfs, that occurs even in raid1 > > where a later scrub repairs all csum errors. While in theory btrfs raid1 > > should simply pull from the mirrored copy if its first try fails checksum > > (assuming the second one passes, of course), and it seems to do this just > > fine if there's only an occasional csum error, if it gets too many at > > once, it *does* unfortunately crash, despite the second copy being > > available and being just fine as later demonstrated by the scrub fixing > > the bad copy from the good one. > > > > I'm used to dealing with that here any time I have a bad shutdown (and > > I'm running live-git kde, which currently has a bug that triggers a > > system crash if I let it idle and shut off the monitors, so I've been > > getting crash shutdowns and having to deal with this unfortunately often, > > recently). Fortunately I keep my root, with all system executables, etc, > > mounted read-only by default, so it's not affected and I can /almost/ > > boot normally after such a crash. The problem is /var/log and /home > > (which has some parts of /var that need to be writable symlinked into / > > home/var, so / can stay read-only). Something in the normal after-crash > > boot triggers enough csum errors there that I often crash again. 
> > > > So I have to boot to emergency mode and manually mount the filesystems in > > question, so nothing's trying to access them until I run the scrub and > > fix the csum errors. Scrub itself doesn't trigger the crash, thankfully, > > and once it has repaired all the csum errors due to partial writes on one > > mirror that either were never made or were properly completed on the > > other mirror, I can exit emergency mode and complete the normal boot (to > > the multi-user default target). As there's no more csum errors then > > because scrub fixed them all, the boot doesn't crash due to too many such > > errors, and I'm back in business. > > > > > > Tho I believe at least the csum bug that affects me may only trigger if > > compression is (or perhaps has been in the past) enabled. Since I run > > compress=lzo everywhere, that would certainly affect me. It would also > > explain why the bug has remained around for quite some time as well, > > since presumably the devs don't run with compression on enough for this > > to have become a personal itch they needed to scratch, thus its remaining > > untraced and unfixed. > > > > So if you weren't using the compress option, your bug is probably > > different, but either way, the whole thing about too many csum errors at > > once triggering a system crash sure does sound familiar, here. > > Yes, I was running the compress=lzo option as well... Maybe here lays a > common problem? Hmm… I found this from being referred to by reading Debian wiki page on BTRFS¹. I use compress=lzo on BTRFS RAID 1 since April 2014 and I never found an issue. Steven, your filesystem wasn´t RAID 1 but RAID 5 or 6? I just want to assess whether using compress=lzo might be dangerous to use in my setup. Actually right now I like to keep using it, since I think at least one of the SSDs does not compress. And… well… /home and / where I use it are both quite full already. 
[1] https://wiki.debian.org/Btrfs#WARNINGS

Thanks,
--
Martin
* Re: compress=lzo safe to use? (was: Re: Trying to rescue my data :()
  2016-09-11 19:48 ` compress=lzo safe to use? (was: Re: Trying to rescue my data :() Martin Steigerwald
@ 2016-09-11 20:06 ` Adam Borowski
  2016-09-11 20:27 ` Chris Murphy
  2016-09-11 20:49 ` compress=lzo safe to use? Hans van Kranenburg
  2016-09-12 1:00 ` Steven Haigh
  2 siblings, 1 reply; 20+ messages in thread
From: Adam Borowski @ 2016-09-11 20:06 UTC (permalink / raw)
  To: Martin Steigerwald; +Cc: Steven Haigh, linux-btrfs

On Sun, Sep 11, 2016 at 09:48:35PM +0200, Martin Steigerwald wrote:
> Hmm… I found this from being referred to by reading Debian wiki page on
> BTRFS¹.
>
> I use compress=lzo on BTRFS RAID 1 since April 2014 and I never found an
> issue. Steven, your filesystem wasn´t RAID 1 but RAID 5 or 6?
>
> I just want to assess whether using compress=lzo might be dangerous to use in
> my setup. Actually right now I like to keep using it, since I think at least
> one of the SSDs does not compress. And… well… /home and / where I use it are
> both quite full already.
>
> [1] https://wiki.debian.org/Btrfs#WARNINGS

I have used compress=lzo for years, kernels 3.8, 3.13 and 3.14 (a bunch of
machines), without a single glitch; heavy snapshotting, single dev only, no
quota. Until recently I never balanced.

I did have a case of ENOSPC with <80% full on 4.7 which might or might not
be related to compress=lzo.

--
Second "wet cat laying down on a powered-on box-less SoC on the desk" close
shave in a week. Protect your ARMs, folks!
* Re: compress=lzo safe to use? (was: Re: Trying to rescue my data :() 2016-09-11 20:06 ` Adam Borowski @ 2016-09-11 20:27 ` Chris Murphy 0 siblings, 0 replies; 20+ messages in thread From: Chris Murphy @ 2016-09-11 20:27 UTC (permalink / raw) To: Adam Borowski; +Cc: Martin Steigerwald, Steven Haigh, Btrfs BTRFS On Sun, Sep 11, 2016 at 2:06 PM, Adam Borowski <kilobyte@angband.pl> wrote: > On Sun, Sep 11, 2016 at 09:48:35PM +0200, Martin Steigerwald wrote: >> Hmm… I found this from being referred to by reading Debian wiki page on >> BTRFS¹. >> >> I use compress=lzo on BTRFS RAID 1 since April 2014 and I never found an >> issue. Steven, your filesystem wasn´t RAID 1 but RAID 5 or 6? >> >> I just want to assess whether using compress=lzo might be dangerous to use in >> my setup. Actually right now I like to keep using it, since I think at least >> one of the SSDs does not compress. And… well… /home and / where I use it are >> both quite full already. >> >> [1] https://wiki.debian.org/Btrfs#WARNINGS > > I have used compress=lzo for years, kernels 3.8, 3.13 and 3.14 (a bunch of > machines), without a single glitch; heavy snapshotting, single dev only, no > quota. Until recently I did never balanced. > > I did have a case of ENOSPC with <80% full on 4.7 which might or might not > be related to compress=lzo. I'm not finding it off hand, but Duncan has some experience with this issue, where he'd occasionally have some sort of problem (hand wave), I don't know how serious it was, maybe just scary warnings like a call trace or something, but no actual problem? My recollection is that compression might be making certain edge case problems more difficult to recover from. I don't know why that would be, as metadata itself isn't compressed (the inline data saved in metadata nodes can be compressed). But there you go, if things start going wonky compression might make it more difficult. But that's speculative. 
And I also don't know if there's any difference between lzo and zlib
in this regard either.

--
Chris Murphy
* Re: compress=lzo safe to use? 2016-09-11 19:48 ` compress=lzo safe to use? (was: Re: Trying to rescue my data :() Martin Steigerwald 2016-09-11 20:06 ` Adam Borowski @ 2016-09-11 20:49 ` Hans van Kranenburg 2016-09-12 4:36 ` Duncan 2016-09-12 1:00 ` Steven Haigh 2 siblings, 1 reply; 20+ messages in thread From: Hans van Kranenburg @ 2016-09-11 20:49 UTC (permalink / raw) To: Martin Steigerwald, Steven Haigh; +Cc: linux-btrfs On 09/11/2016 09:48 PM, Martin Steigerwald wrote: > Am Sonntag, 26. Juni 2016, 13:13:04 CEST schrieb Steven Haigh: >> On 26/06/16 12:30, Duncan wrote: >>> Steven Haigh posted on Sun, 26 Jun 2016 02:39:23 +1000 as excerpted: >>>> In every case, it was a flurry of csum error messages, then instant >>>> death. >>> >>> This is very possibly a known bug in btrfs, that occurs even in raid1 >>> where a later scrub repairs all csum errors. While in theory btrfs raid1 >>> should simply pull from the mirrored copy if its first try fails checksum >>> (assuming the second one passes, of course), and it seems to do this just >>> fine if there's only an occasional csum error, if it gets too many at >>> once, it *does* unfortunately crash [...] [...] >>> different, but either way, the whole thing about too many csum errors at >>> once triggering a system crash sure does sound familiar, here. >> >> Yes, I was running the compress=lzo option as well... Maybe here lays a >> common problem? > > Hmm… I found this from being referred to by reading Debian wiki page on > BTRFS¹. > > I use compress=lzo on BTRFS RAID 1 since April 2014 and I never found an > issue. Steven, your filesystem wasn´t RAID 1 but RAID 5 or 6? To quote you from the "stability a joke" thread (which I guess this might be related to)... "For me so far even compress=lzo seems to be stable, but well for others it may not." So, you can use a lot of compress without problems for years. 
Only if your hardware is starting to break in a specific way, causing
lots and lots of checksum errors, the kernel might not be able to handle
all of them at the same time currently.

The compress might be super stable itself, but in this case another part
of the filesystem is not perfectly able to handle certain failure
scenarios involving it.

Another way to find out about "are there issues with compression" is
looking in the kernel git history.

When searching for "compression" and "corruption", you'll find fixes
like these:

commit 0305cd5f7fca85dae392b9ba85b116896eb7c1c7
Author: Filipe Manana <fdmanana@suse.com>
Date:   Fri Oct 16 12:34:25 2015 +0100

    Btrfs: fix truncation of compressed and inlined extents

commit 808f80b46790f27e145c72112189d6a3be2bc884
Author: Filipe Manana <fdmanana@suse.com>
Date:   Mon Sep 28 09:56:26 2015 +0100

    Btrfs: update fix for read corruption of compressed and shared extents

commit 005efedf2c7d0a270ffbe28d8997b03844f3e3e7
Author: Filipe Manana <fdmanana@suse.com>
Date:   Mon Sep 14 09:09:31 2015 +0100

    Btrfs: fix read corruption of compressed and shared extents

commit 619d8c4ef7c5dd346add55da82c9179cd2e3387e
Author: Filipe Manana <fdmanana@suse.com>
Date:   Sun May 3 01:56:00 2015 +0100

    Btrfs: incremental send, fix clone operations for compressed extents

These commits fix actual data corruption issues. Still, it might be bugs
that you've never seen, even when using a kernel with these bugs for
years, because they require a certain "nasty sequence of events" to
trigger.

But, when using compression you certainly want to have these commits in
the kernel you're running right now. And when the bugs caused
corruption, using a fixed kernel will not retroactively fix the corrupt
data.

Hint: "this was fixed in 4.x.y, so run that version or later" is not
always the only answer here, because you'll see that fixes like these
even show up in kernels like 3.16.y

But maybe I should continue by replying on the joke thread instead of
typing more here.
--
Hans van Kranenburg
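[Editor's sketch, not part of the original message.] The history search described in the message above can be done with plain git log. The sketch below assumes a local clone of a Linux kernel tree. The function names find_btrfs_fixes and has_fix and the KERNEL_TREE variable are hypothetical. Because stable kernels carry backports under new commit IDs, the second helper looks a fix up by its subject line rather than by hash.

```shell
#!/bin/sh
# Sketch: search a kernel git checkout for btrfs commits whose message
# mentions both "compress" and "corruption" (case-insensitive).
# Assumption: $1 is the path to a local clone of a Linux kernel tree.
find_btrfs_fixes() {
    repo=$1
    git -C "$repo" log --oneline -i --all-match \
        --grep='compress' --grep='corruption' -- fs/btrfs
}

# Check whether a given revision (branch or tag) contains a commit whose
# message includes the given subject line, treated as a fixed string.
has_fix() {
    repo=$1
    rev=$2
    subject=$3
    [ -n "$(git -C "$repo" log --oneline -F --grep="$subject" "$rev" -- fs/btrfs)" ]
}

# Only runs when KERNEL_TREE (hypothetical variable) points at a checkout.
if [ -n "${KERNEL_TREE:-}" ]; then
    find_btrfs_fixes "$KERNEL_TREE"
fi
```

A call such as has_fix "$KERNEL_TREE" <tag> 'Btrfs: fix read corruption of compressed and shared extents' would then answer whether the tree you actually run carries that fix. A hit only shows the patch is present; as the message says, it does not repair data that was already corrupted.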
* Re: compress=lzo safe to use? 2016-09-11 20:49 ` compress=lzo safe to use? Hans van Kranenburg @ 2016-09-12 4:36 ` Duncan 2016-09-17 9:30 ` Kai Krakow 0 siblings, 1 reply; 20+ messages in thread From: Duncan @ 2016-09-12 4:36 UTC (permalink / raw) To: linux-btrfs Hans van Kranenburg posted on Sun, 11 Sep 2016 22:49:58 +0200 as excerpted: > So, you can use a lot of compress without problems for years. > > Only if your hardware is starting to break in a specific way, causing > lots and lots of checksum errors, the kernel might not be able to handle > all of them at the same time currently. > > The compress might be super stable itself, but in this case another part > of the filesystem is not perfecty able to handle certain failure > scenario's involving it. Well put. In my case I had problems trigger due to exactly two things, tho there are obviously other ways of triggering the same issues, including a crash in the middle of a commit, with one copy of the raid1 already updated while the other is still being written.: 1) I first discovered the problem when one of my pair of ssds was going bad. Because I had btrfs raid1 and could normally scrub-fix things, and because I had backups anyway, I chose to continue running it for some time, just to see how it handled things, as more and more sectors became unwritable and were replaced by spares. By the end I had several MiB worth of spares in-use, altho smart reported I had only used about 15% of the available spares, but by then it was getting bad enough and the newness had worn off, so I just replaced it and got rid of the hassle. But as a result of the above, I had a *LOT* of practice with btrfs recovery, mostly running scrub. 
And what I found was that if btrfs raid1 encounters too many checksum
errors in compressed data it will crash btrfs and the kernel, even when
it *SHOULD* recover from the other device because it has a good copy,
as demonstrated by the fact that after a reboot, I could run a scrub
and fix everything, no uncorrected errors at all.

At first I thought it was just the way btrfs worked -- that it could
handle a few checksum errors but not too many at once. I had no idea it
was compression related. But nobody else seemed to mention the problem,
which I thought a bit strange, until someone /did/ mention it, and
furthermore, actually tested both compressed and uncompressed btrfs,
and found the problem only when btrfs was reading compressed data. If
the data wasn't compressed, btrfs went ahead and read the second copy
correctly, without crashing the system, every time.

The extra kink in this is that at the time, I had a boot-time service
set up to cache (via cat > /dev/null) a bunch of files in a particular
directory. This particular directory is a cache for news archives, with
articles on some groups going back over a decade to 2002, and my news
client (pan) is slow to start up with several gigs of cached messages
like that, so I had the boot-time service pre-cache everything, so by
the time I started X and pan, it would be done or nearly so and I'd not
have to wait for pan to start up.

The problem was that many of the new files were in this directory, and
all that activity tended to hit the going-bad sectors on that ssd rather
frequently, making one copy often bad. Additionally, these are mostly
text messages, so they compress quite well, meaning compress=lzo would
trigger compression on many of them. And because I had it reading them
at boot, the kernel tended to overload on checksum errors before it
finished booting, far more frequently than it would have otherwise.
Of course, that would crash the system before I could get a login in
order to run btrfs scrub and fix the problem. What I had to do then was
boot to rescue mode, with the filesystems mounted but before normal
services (including this caching service) ran, run the scrub from
there, and then continue boot, which would then work just fine because
I'd fixed all the checksum errors.

But, as I said I eventually got tired of the hassle and just replaced
the failing device. Btrfs replace worked nicely. =:^)

2a) My second trigger is that I've found that multiple devices, as in
multi-device btrfs, but also when I used to run mdraid, don't always
resume from suspend-to-RAM very well. Often one device takes longer to
wake up than the other(s), and the kernel will try to resume while one
still isn't responding properly. (FWIW, I ran into this problem on
spinning rust back on mdraid, but I see it now on ssds on btrfs as
well, so it seems to be a common issue, which probably remains
relatively obscure I'd guess because relatively few people with
multi-device btrfs or mdraid do suspend-to-ram.)

The result is that btrfs will try to write to the remaining device(s),
getting them out of sync with the one that isn't responding properly
yet. Ultimately this leads to a crash if I don't catch it and complete
a controlled shutdown before that, and sometimes I see the same
crash-on-boot-due-to-too-many-checksum-errors problem I saw with #1. I
no longer have that caching job running at boot and thus don't see it
as often, but it still happens occasionally. Again, once I boot to
rescue mode and run scrub, it fixes the problem and I can resume the
normal mode boot without further issue.

So I pretty much quit suspending to RAM, at least for any longer
period, and just shutdown and reboot, now. With systemd and ssds, the
boot doesn't take significantly longer anyway, tho it does mean I can't
simply resume and pick up where I was, I have to reopen my work, etc.
2b) Closely related to #2a and most recent, since I'm no longer trying to suspend to RAM, I think one of the ssds now has a bad backup capacitor or something, as if I leave it idle for too long it'll fail to respond once I start trying to use it again. Same story, the other device gets writes that the unresponsive device is missing, and eventually if I don't reboot I crash. Upon reboot, again, if there were too many things written to the device that stayed up that didn't make it to the other one, it can trigger a crash due to checksum failure. However, if I can get a command prompt, either because it boots all the way or because I boot to rescue mode, I can run a scrub and update the bad device from the good one, and then everything works fine once again... until the device goes unresponsive, again. Again, I once thought all this was just the stage at which btrfs was, until I found out that it doesn't seem to happen if btrfs compression isn't being used. Something about the way it recovers from checksum errors on compressed data differs from the way it recovers from checksum errors on uncompressed data, and there's a bug in the compressed data processing path. But beyond that, I'm not a dev and it gets a bit fuzzy, which also explains why I've not gone code diving and submitted patches to try to fix it, myself. But if I'm correct, it probably doesn't matter what the compression type is, only how much of it there is. So compress-force would tend to trigger the issue far more frequently than simply compress, unless of course your use-case is a corner-case like my trying to read all those compressible text messages into cache at boot was, but compress (or compress-force) =lzo vs =zlib shouldn't matter. -- Duncan - List replies preferred. No HTML msgs. "Every nonfree program has a lord, a master -- and if you use the program, he is your master." Richard Stallman ^ permalink raw reply [flat|nested] 20+ messages in thread
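[Editor's sketch, not part of the original message.] Since the message above ties the crash to compressed data regardless of algorithm, a first step when assessing your own exposure is simply to see which mounted btrfs filesystems have compress or compress-force set. The sketch below parses a mounts table in /proc/mounts format. It assumes btrfs reports these options as compress=... or compress-force=... in that table, and the function name btrfs_compress_mounts is hypothetical.

```shell
#!/bin/sh
# Sketch: list btrfs mounts whose options include compress or compress-force.
# Input is a mounts table in /proc/mounts format:
#   device mountpoint fstype options dump pass
btrfs_compress_mounts() {
    awk '$3 == "btrfs" {
        n = split($4, opts, ",")
        for (i = 1; i <= n; i++)
            if (opts[i] ~ /^compress(-force)?(=|$)/)
                print $2, opts[i]
    }' "${1:-/proc/mounts}"
}

# Only read the live table where it exists (Linux).
if [ -r /proc/mounts ]; then
    btrfs_compress_mounts
fi
```

This only shows the current mount options. Data written while compression was enabled earlier stays compressed on disk, so an empty result does not prove that no compressed extents exist.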
* Re: compress=lzo safe to use?
  2016-09-12 4:36 ` Duncan
@ 2016-09-17 9:30 ` Kai Krakow
  0 siblings, 0 replies; 20+ messages in thread
From: Kai Krakow @ 2016-09-17 9:30 UTC (permalink / raw)
  To: linux-btrfs

On Mon, 12 Sep 2016 04:36:07 +0000 (UTC), Duncan <1i5t5.duncan@cox.net>
wrote:

> Again, I once thought all this was just the stage at which btrfs was,
> until I found out that it doesn't seem to happen if btrfs compression
> isn't being used. Something about the way it recovers from checksum
> errors on compressed data differs from the way it recovers from
> checksum errors on uncompressed data, and there's a bug in the
> compressed data processing path. But beyond that, I'm not a dev and
> it gets a bit fuzzy, which also explains why I've not gone code
> diving and submitted patches to try to fix it, myself.

I suspect that may very well come from the decompression routine, which
crashes - and not from btrfs itself. So essentially, the decompression
needs to be fixed instead (which probably slows it down by factors).

Only when this is tested and fixed should one look into why btrfs
fails when decompression fails.

--
Regards,
Kai

Replies to list-only preferred.
* Re: compress=lzo safe to use? 2016-09-11 19:48 ` compress=lzo safe to use? (was: Re: Trying to rescue my data :() Martin Steigerwald 2016-09-11 20:06 ` Adam Borowski 2016-09-11 20:49 ` compress=lzo safe to use? Hans van Kranenburg @ 2016-09-12 1:00 ` Steven Haigh 2 siblings, 0 replies; 20+ messages in thread From: Steven Haigh @ 2016-09-12 1:00 UTC (permalink / raw) To: Martin Steigerwald; +Cc: linux-btrfs On 2016-09-12 05:48, Martin Steigerwald wrote: > Am Sonntag, 26. Juni 2016, 13:13:04 CEST schrieb Steven Haigh: >> On 26/06/16 12:30, Duncan wrote: >> > Steven Haigh posted on Sun, 26 Jun 2016 02:39:23 +1000 as excerpted: >> >> In every case, it was a flurry of csum error messages, then instant >> >> death. >> > >> > This is very possibly a known bug in btrfs, that occurs even in raid1 >> > where a later scrub repairs all csum errors. While in theory btrfs raid1 >> > should simply pull from the mirrored copy if its first try fails checksum >> > (assuming the second one passes, of course), and it seems to do this just >> > fine if there's only an occasional csum error, if it gets too many at >> > once, it *does* unfortunately crash, despite the second copy being >> > available and being just fine as later demonstrated by the scrub fixing >> > the bad copy from the good one. >> > >> > I'm used to dealing with that here any time I have a bad shutdown (and >> > I'm running live-git kde, which currently has a bug that triggers a >> > system crash if I let it idle and shut off the monitors, so I've been >> > getting crash shutdowns and having to deal with this unfortunately often, >> > recently). Fortunately I keep my root, with all system executables, etc, >> > mounted read-only by default, so it's not affected and I can /almost/ >> > boot normally after such a crash. The problem is /var/log and /home >> > (which has some parts of /var that need to be writable symlinked into / >> > home/var, so / can stay read-only). 
Something in the normal after-crash >> > boot triggers enough csum errors there that I often crash again. >> > >> > So I have to boot to emergency mode and manually mount the filesystems in >> > question, so nothing's trying to access them until I run the scrub and >> > fix the csum errors. Scrub itself doesn't trigger the crash, thankfully, >> > and once it has repaired all the csum errors due to partial writes on one >> > mirror that either were never made or were properly completed on the >> > other mirror, I can exit emergency mode and complete the normal boot (to >> > the multi-user default target). As there's no more csum errors then >> > because scrub fixed them all, the boot doesn't crash due to too many such >> > errors, and I'm back in business. >> > >> > >> > Tho I believe at least the csum bug that affects me may only trigger if >> > compression is (or perhaps has been in the past) enabled. Since I run >> > compress=lzo everywhere, that would certainly affect me. It would also >> > explain why the bug has remained around for quite some time as well, >> > since presumably the devs don't run with compression on enough for this >> > to have become a personal itch they needed to scratch, thus its remaining >> > untraced and unfixed. >> > >> > So if you weren't using the compress option, your bug is probably >> > different, but either way, the whole thing about too many csum errors at >> > once triggering a system crash sure does sound familiar, here. >> >> Yes, I was running the compress=lzo option as well... Maybe here lays >> a >> common problem? > > Hmm… I found this from being referred to by reading Debian wiki page on > BTRFS¹. > > I use compress=lzo on BTRFS RAID 1 since April 2014 and I never found > an > issue. Steven, your filesystem wasn´t RAID 1 but RAID 5 or 6? Yes, I was using RAID6 - and it has had a track record of eating data. 
There are lots of problems with the implementation / correctness of
RAID5/6 parity, which I'm pretty sure haven't been nailed down yet. The
recommendation at the moment is simply not to use the RAID5 or RAID6
modes of BTRFS. The last I heard, if you were using RAID5/6 in BTRFS, the
recommended action was to migrate your data to a different profile or a
different FS.

> I just want to assess whether using compress=lzo might be dangerous to
> use in my setup. Actually right now I like to keep using it, since I
> think at least one of the SSDs does not compress. And… well… /home and /
> where I use it are both quite full already.

I don't believe the compress=lzo option by itself was a problem - but it
*may* have an impact on the RAID5/6 parity problems? I'd be guessing
here, but am happy to be corrected.

-- 
Steven Haigh

Email: netwiz@crc.id.au
Web: https://www.crc.id.au
Phone: (03) 9001 6090 - 0412 935 897
end of thread, other threads:[~2016-09-17 9:31 UTC | newest]

Thread overview: 20+ messages
2016-06-24 14:52 Trying to rescue my data :( Steven Haigh
2016-06-24 16:26 ` Steven Haigh
2016-06-24 16:59 ` ronnie sahlberg
2016-06-24 17:05 ` Steven Haigh
2016-06-24 17:40 ` Austin S. Hemmelgarn
2016-06-24 17:43 ` Steven Haigh
2016-06-24 17:50 ` Austin S. Hemmelgarn
2016-06-25 4:19 ` Steven Haigh
2016-06-25 16:25 ` Chris Murphy
2016-06-25 16:39 ` Steven Haigh
2016-06-25 17:14 ` Chris Murphy
2016-06-26 2:30 ` Duncan
2016-06-26 3:13 ` Steven Haigh
2016-09-11 19:48 ` compress=lzo safe to use? (was: Re: Trying to rescue my data :() Martin Steigerwald
2016-09-11 20:06 ` Adam Borowski
2016-09-11 20:27 ` Chris Murphy
2016-09-11 20:49 ` compress=lzo safe to use? Hans van Kranenburg
2016-09-12 4:36 ` Duncan
2016-09-17 9:30 ` Kai Krakow
2016-09-12 1:00 ` Steven Haigh