* 4.11.3: BTRFS critical (device dm-1): unable to add free space :-17 => btrfs check --repair runs clean
@ 2017-06-20 14:39 Marc MERLIN
  2017-06-20 15:23 ` Hugo Mills
  0 siblings, 1 reply; 77+ messages in thread
From: Marc MERLIN @ 2017-06-20 14:39 UTC (permalink / raw)
  To: linux-btrfs

My filesystem got remounted read only, and yet after a lengthy
btrfs check --repair, it ran clean.

Any idea what went wrong?
[846332.992285] WARNING: CPU: 4 PID: 4095 at fs/btrfs/free-space-cache.c:1476 tree_insert_offset+0x78/0xb1
[846333.744721] BTRFS critical (device dm-1): unable to add free space :-17
[847312.529660] BTRFS: Transaction aborted (error -17)
[847313.218391] BTRFS: error (device dm-1) in btrfs_run_delayed_refs:2961: errno=-17 Object already exists
[847313.247668] BTRFS info (device dm-1): forced readonly

gargamel:~# btrfs check --repair /dev/mapper/dshelf2
enabling repair mode
Checking filesystem on /dev/mapper/dshelf2
UUID: 85441c59-ad11-4b25-b1fe-974f9e4acede
checking extents
Fixed 0 roots.
checking free space cache
cache and super generation don't match, space cache will be invalidated
checking fs roots
checking csums
checking root refs
found 5544539336704 bytes used, no error found
total csum bytes: 5344305964
total tree bytes: 70455754752
total fs tree bytes: 58427670528
total extent tree bytes: 5372461056
btree space waste bytes: 10620592981
file data blocks allocated: 7735818444800
 referenced 6155805896704


this is how it went read only:
[846332.977964] ------------[ cut here ]------------
[846332.992285] WARNING: CPU: 4 PID: 4095 at fs/btrfs/free-space-cache.c:1476 tree_insert_offset+0x78/0xb1
[846333.402648] CPU: 4 PID: 4095 Comm: btrfs-transacti Tainted: G U 4.11.3-amd64-preempt-sysrq-20170406 #5
[846333.434917] Hardware name: System manufacturer System Product Name/P8H67-M PRO, BIOS 3904 04/27/2013
[846333.463597] Call Trace:
[846333.469942] usb 2-1-port4: device 2-1.4 not suspended yet
[846333.489639] dump_stack+0x61/0x7d
[846333.500480] __warn+0xc2/0xdd
[846333.510956] warn_slowpath_null+0x1d/0x1f
[846333.524103] tree_insert_offset+0x78/0xb1
[846333.537337] link_free_space+0x2c/0x41
[846333.549991] __btrfs_add_free_space+0x89/0x3aa
[846333.564236] ? kmem_cache_free+0x3d/0x92
[846333.577702] btrfs_add_free_space+0x1d/0x1f
[846333.591179] unpin_extent_range+0xf3/0x2b0
[846333.605220] btrfs_finish_extent_commit+0xda/0x1d4
[846333.621324] btrfs_commit_transaction+0x629/0x79a
[846333.637205] ? add_wait_queue+0x44/0x44
[846333.649680] transaction_kthread+0xe2/0x178
[846333.663201] ? btrfs_cleanup_transaction+0x3e8/0x3e8
[846333.679033] kthread+0xfb/0x100
[846333.690261] ? init_completion+0x24/0x24
[846333.703239] ? do_fast_syscall_32+0xb7/0xfe
[846333.717649] ret_from_fork+0x2c/0x40
[846333.729656] ---[ end trace 27aa532d1886e536 ]---
[846333.744721] BTRFS critical (device dm-1): unable to add free space :-17

[847312.529660] BTRFS: Transaction aborted (error -17)
[847312.912784] CPU: 6 PID: 4094 Comm: btrfs-cleaner Tainted: G U W 4.11.3-amd64-preempt-sysrq-20170406 #5
[847312.913132] usb 2-1-port4: device 2-1.4 not suspended yet
[847312.962394] Hardware name: System manufacturer System Product Name/P8H67-M PRO, BIOS 3904 04/27/2013
[847312.990936] Call Trace:
[847312.999347] dump_stack+0x61/0x7d
[847313.010383] __warn+0xc2/0xdd
[847313.020351] warn_slowpath_fmt+0x5a/0x76
[847313.033274] btrfs_run_delayed_refs+0xb1/0x1cc
[847313.047655] btrfs_should_end_transaction+0x50/0x57
[847313.063910] btrfs_drop_snapshot+0x38a/0x6c4
[847313.078619] ? btrfs_kill_all_delayed_nodes+0x5f/0xd7
[847313.094916] ? _raw_spin_lock+0x15/0x17
[847313.108325] btrfs_clean_one_deleted_snapshot+0xce/0xdc
[847313.125493] cleaner_kthread+0x91/0x14b
[847313.138228] ? btrfs_destroy_pinned_extent+0xd2/0xd2
[847313.154308] kthread+0xfb/0x100
[847313.164900] ? init_completion+0x24/0x24
[847313.177781] ? do_fast_syscall_32+0xb7/0xfe
[847313.191490] ret_from_fork+0x2c/0x40
[847313.203432] ---[ end trace 27aa532d1886e537 ]---
[847313.218391] BTRFS: error (device dm-1) in btrfs_run_delayed_refs:2961: errno=-17 Object already exists
[847313.247668] BTRFS info (device dm-1): forced readonly

[849789.173126] BTRFS error (device dm-1): parent transid verify failed on 1935589703680 wanted 37959 found 3229
[849789.218675] BTRFS error (device dm-1): parent transid verify failed on 1935589703680 wanted 37959 found 3229

[863279.783590] BTRFS error (device dm-1): parent transid verify failed on 1932065177600 wanted 37959 found 3634
[863279.827526] BTRFS error (device dm-1): parent transid verify failed on 1932065177600 wanted 37959 found 3634
[863279.857797] BTRFS error (device dm-1): parent transid verify failed on 1932065177600 wanted 37959 found 3634
[863279.888096] BTRFS error (device dm-1): parent transid verify failed on 1932065177600 wanted 37959 found 3634
[863279.918393] BTRFS error (device dm-1): parent transid verify failed on 1932065177600 wanted 37959 found 3634
[863279.948740] BTRFS error (device dm-1): parent transid verify failed on 1932065177600 wanted 37959 found 3634
[863279.979033] BTRFS error (device dm-1): parent transid verify failed on 1932065177600 wanted 37959 found 3634
[863280.009362] BTRFS error (device dm-1): parent transid verify failed on 1932065177600 wanted 37959 found 3634
[863280.040438] BTRFS error (device dm-1): parent transid verify failed on 1932065177600 wanted 37959 found 3634
[863280.070966] BTRFS error (device dm-1): parent transid verify failed on 1932065177600 wanted 37959 found 3634

-- 
"A mouse is a device used to point at the xterm you want to type in" - A.S.R.
Microsoft is to operating systems ....
                                      .... what McDonalds is to gourmet cooking
Home page: http://marc.merlins.org/

^ permalink raw reply [flat|nested] 77+ messages in thread
* Re: 4.11.3: BTRFS critical (device dm-1): unable to add free space :-17 => btrfs check --repair runs clean
  2017-06-20 14:39 4.11.3: BTRFS critical (device dm-1): unable to add free space :-17 => btrfs check --repair runs clean Marc MERLIN
@ 2017-06-20 15:23 ` Hugo Mills
  2017-06-20 15:26   ` Marc MERLIN
  0 siblings, 1 reply; 77+ messages in thread
From: Hugo Mills @ 2017-06-20 15:23 UTC (permalink / raw)
  To: Marc MERLIN; +Cc: linux-btrfs

On Tue, Jun 20, 2017 at 07:39:16AM -0700, Marc MERLIN wrote:
> My filesystem got remounted read only, and yet after a lengthy
> btrfs check --repair, it ran clean.
>
> Any idea what went wrong?
> [846332.992285] WARNING: CPU: 4 PID: 4095 at fs/btrfs/free-space-cache.c:1476 tree_insert_offset+0x78/0xb1
> [846333.744721] BTRFS critical (device dm-1): unable to add free space :-17
> [847312.529660] BTRFS: Transaction aborted (error -17)
> [847313.218391] BTRFS: error (device dm-1) in btrfs_run_delayed_refs:2961: errno=-17 Object already exists

   Error 17 is EEXIST, so I'd guess (and it is a guess) that it's
trying to add a free space cache record for some space that already
has such a record. This might also match with:

[...]
> gargamel:~# btrfs check --repair /dev/mapper/dshelf2
[...]
> cache and super generation don't match, space cache will be invalidated
[...]

   I'd try clearing the cache (mount with -o clear_cache, once), and
then letting it rebuild.

   Hugo.

-- 
Hugo Mills             | I believe that it's closely correlated with the
hugo@... carfax.org.uk | aeroswine coefficient
http://carfax.org.uk/  |
PGP: E2AB1DE4          |                                      Adrian Bridgett
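
Hugo's reading of the error number can be checked from userspace. The following is a minimal sketch, assuming a Linux host with Python 3; the helper name describe_errno is made up for illustration, and the description string comes from the local C library, so it may not match the wording the kernel prints.

```python
import errno
import os


def describe_errno(code: int) -> str:
    """Return 'NAME (description)' for an errno given as e.g. -17 or 17."""
    # Kernel log messages print error codes as negative numbers.
    number = abs(code)
    name = errno.errorcode.get(number, "UNKNOWN")
    return f"{name} ({os.strerror(number)})"


# The code reported in "unable to add free space :-17".
print(describe_errno(-17))
```

The symbolic name is the part that matters here: it confirms that -17 in the log is EEXIST, which is what points at a duplicate entry being inserted.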
* Re: 4.11.3: BTRFS critical (device dm-1): unable to add free space :-17 => btrfs check --repair runs clean
  2017-06-20 15:23 ` Hugo Mills
@ 2017-06-20 15:26   ` Marc MERLIN
  2017-06-20 15:36     ` Hugo Mills
  0 siblings, 1 reply; 77+ messages in thread
From: Marc MERLIN @ 2017-06-20 15:26 UTC (permalink / raw)
  To: Hugo Mills, linux-btrfs

On Tue, Jun 20, 2017 at 03:23:54PM +0000, Hugo Mills wrote:
> On Tue, Jun 20, 2017 at 07:39:16AM -0700, Marc MERLIN wrote:
> > My filesystem got remounted read only, and yet after a lengthy
> > btrfs check --repair, it ran clean.
> >
> > Any idea what went wrong?
> > [846332.992285] WARNING: CPU: 4 PID: 4095 at fs/btrfs/free-space-cache.c:1476 tree_insert_offset+0x78/0xb1
> > [846333.744721] BTRFS critical (device dm-1): unable to add free space :-17
> > [847312.529660] BTRFS: Transaction aborted (error -17)
> > [847313.218391] BTRFS: error (device dm-1) in btrfs_run_delayed_refs:2961: errno=-17 Object already exists
>
>    Error 17 is EEXIST, so I'd guess (and it is a guess) that it's
> trying to add a free space cache record for some space that already
> has such a record. This might also match with:

Thanks for having a look. Is it a bug, or is it a problem with my storage
subsystem?

> [...]
> > gargamel:~# btrfs check --repair /dev/mapper/dshelf2
> [...]
> > cache and super generation don't match, space cache will be invalidated
> [...]
>
>    I'd try clearing the cache (mount with -o clear_cache, once), and
> then letting it rebuild.

"space cache will be invalidated " => doesn't that mean that my cache was
already cleared by check --repair, or are you saying I need to clear it
again?

Thanks,
Marc
* Re: 4.11.3: BTRFS critical (device dm-1): unable to add free space :-17 => btrfs check --repair runs clean
  2017-06-20 15:26   ` Marc MERLIN
@ 2017-06-20 15:36     ` Hugo Mills
  2017-06-20 15:44       ` Marc MERLIN
  0 siblings, 1 reply; 77+ messages in thread
From: Hugo Mills @ 2017-06-20 15:36 UTC (permalink / raw)
  To: Marc MERLIN; +Cc: linux-btrfs

On Tue, Jun 20, 2017 at 08:26:48AM -0700, Marc MERLIN wrote:
> On Tue, Jun 20, 2017 at 03:23:54PM +0000, Hugo Mills wrote:
> > On Tue, Jun 20, 2017 at 07:39:16AM -0700, Marc MERLIN wrote:
> > > My filesystem got remounted read only, and yet after a lengthy
> > > btrfs check --repair, it ran clean.
> > >
> > > Any idea what went wrong?
> > > [846332.992285] WARNING: CPU: 4 PID: 4095 at fs/btrfs/free-space-cache.c:1476 tree_insert_offset+0x78/0xb1
> > > [846333.744721] BTRFS critical (device dm-1): unable to add free space :-17
> > > [847312.529660] BTRFS: Transaction aborted (error -17)
> > > [847313.218391] BTRFS: error (device dm-1) in btrfs_run_delayed_refs:2961: errno=-17 Object already exists
> >
> >    Error 17 is EEXIST, so I'd guess (and it is a guess) that it's
> > trying to add a free space cache record for some space that already
> > has such a record. This might also match with:
>
> Thanks for having a look. Is it a bug, or is it a problem with my storage
> subsystem?

   Well, I'd say it's probably a problem with some inconsistent data
on the disk. How that data got there is another matter -- it may be
due to a bug which wrote the inconsistent data some time ago, and has
only now been found out.

> > [...]
> > > gargamel:~# btrfs check --repair /dev/mapper/dshelf2
> > [...]
> > > cache and super generation don't match, space cache will be invalidated
> > [...]
> >
> >    I'd try clearing the cache (mount with -o clear_cache, once), and
> > then letting it rebuild.
>
> "space cache will be invalidated " => doesn't that mean that my cache was
> already cleared by check --repair, or are you saying I need to clear it
> again?

   I'm never quite sure about that one. :)

   It can't hurt to clear it manually as well.

   Hugo.
* Re: 4.11.3: BTRFS critical (device dm-1): unable to add free space :-17 => btrfs check --repair runs clean
  2017-06-20 15:36     ` Hugo Mills
@ 2017-06-20 15:44       ` Marc MERLIN
  2017-06-20 23:12         ` Marc MERLIN
  2017-06-21  3:26         ` Chris Murphy
  0 siblings, 2 replies; 77+ messages in thread
From: Marc MERLIN @ 2017-06-20 15:44 UTC (permalink / raw)
  To: Hugo Mills, linux-btrfs

On Tue, Jun 20, 2017 at 03:36:01PM +0000, Hugo Mills wrote:
> > Thanks for having a look. Is it a bug, or is it a problem with my storage
> > subsystem?
>
>    Well, I'd say it's probably a problem with some inconsistent data
> on the disk. How that data got there is another matter -- it may be
> due to a bug which wrote the inconsistent data some time ago, and has
> only now been found out.

Understood.

> > "space cache will be invalidated " => doesn't that mean that my cache was
> > already cleared by check --repair, or are you saying I need to clear it
> > again?
>
>    I'm never quite sure about that one. :)
>
>    It can't hurt to clear it manually as well.

Sounds good, done.

In the meantime, I ran into this again:
https://bugzilla.kernel.org/show_bug.cgi?id=195863
btrfs check of a big filesystem kills the kernel due to OOM (but btrfs
userspace is not OOM killed)

Is it achievable at all for btrfs check to realize that it's taking all
the available RAM in kernel space, is about to crash the system, and
cancel the check before the system crashes?
I've already confirmed that it doesn't use swap.

I've just had to order new RAM to upgrade my machine from 24GB to 32GB,
but 32GB is max for that hardware, so hopefully the lowmem repair stuff
will work before I hit the 32GB limit next time.

In the meantime, though, it really shouldn't crash your system
(potentially causing more damage in the process because you end up with
an unclean shutdown).
Can anyone look at this?

Thanks,
Marc
* Re: 4.11.3: BTRFS critical (device dm-1): unable to add free space :-17 => btrfs check --repair runs clean
  2017-06-20 15:44       ` Marc MERLIN
@ 2017-06-20 23:12         ` Marc MERLIN
  2017-06-20 23:58           ` Marc MERLIN
                             ` (2 more replies)
  2017-06-21  3:26         ` Chris Murphy
  1 sibling, 3 replies; 77+ messages in thread
From: Marc MERLIN @ 2017-06-20 23:12 UTC (permalink / raw)
  To: Hugo Mills, linux-btrfs

On Tue, Jun 20, 2017 at 08:44:29AM -0700, Marc MERLIN wrote:
> On Tue, Jun 20, 2017 at 03:36:01PM +0000, Hugo Mills wrote:
> > > Thanks for having a look. Is it a bug, or is it a problem with my storage
> > > subsystem?
> >
> >    Well, I'd say it's probably a problem with some inconsistent data
> > on the disk. How that data got there is another matter -- it may be
> > due to a bug which wrote the inconsistent data some time ago, and has
> > only now been found out.
>
> Understood.
>
> > > "space cache will be invalidated " => doesn't that mean that my cache was
> > > already cleared by check --repair, or are you saying I need to clear it
> > > again?
> >
> >    I'm never quite sure about that one. :)
> >
> >    It can't hurt to clear it manually as well.
>
> Sounds good, done.

Except it didn't help :(
It worked for a while, and failed again. It looks like I'm hitting a persistent bug :(

[   86.383988] BTRFS: device label dshelf2 devid 1 transid 37975 /dev/mapper/dshelf2
[   98.232529] BTRFS info (device dm-1): use lzo compression
[   98.251982] BTRFS info (device dm-1): disk space caching is enabled
[   98.274847] BTRFS info (device dm-1): has skinny extents
[  104.171597] BTRFS info (device dm-1): detected SSD devices, enabling SSD mode
[  165.429894] BTRFS error (device dm-1): Duplicate entries in free space cache, dumping
[  165.455673] BTRFS warning (device dm-1): failed to load free space cache for block group 2039601954816, rebuilding it now
[  234.221435] BTRFS warning (device dm-1): block group 2837392130048 has wrong amount of free space
[  234.249264] BTRFS warning (device dm-1): failed to load free space cache for block group 2837392130048, rebuilding it now
[  234.636396] BTRFS warning (device dm-1): block group 2885173641216 has wrong amount of free space
[  234.664015] BTRFS warning (device dm-1): failed to load free space cache for block group 2885173641216, rebuilding it now
[  242.042940] BTRFS warning (device dm-1): block group 3116565004288 has wrong amount of free space
[  242.071207] BTRFS warning (device dm-1): failed to load free space cache for block group 3116565004288, rebuilding it now
[  273.910918] BTRFS warning (device dm-1): block group 3209980542976 has wrong amount of free space
[  273.937625] BTRFS warning (device dm-1): failed to load free space cache for block group 3209980542976, rebuilding it now
[  298.578615] BTRFS warning (device dm-1): block group 2305889927168 has wrong amount of free space
[  298.605250] BTRFS warning (device dm-1): failed to load free space cache for block group 2305889927168, rebuilding it now
[  873.265687] BTRFS: Transaction aborted (error -17)
[  873.948245] BTRFS: error (device dm-1) in btrfs_run_delayed_refs:2961: errno=-17 Object already exists
[  873.978884] BTRFS info (device dm-1): forced readonly

Given that check --repair ran clean when I ran it yesterday after this first happened,
and I then ran mount -o clear_cache , the cache got rebuilt, and I got the problem again,
this is not looking good, seems like a persistent bug :-/

I'm now going to remount this with nospace_cache to see if your guess about
space_cache was correct.
Other suggestions also welcome :)

Marc
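
To see at a glance which block groups had their cache rebuilt after a remount, the warnings in a log like the one above can be tallied with a short script. This is only a sketch: it assumes the exact message wording shown in this 4.11 log (other kernel versions may phrase it differently), and the function name and sample text are illustrative.

```python
import re

# Pattern for the warning as it appears in the 4.11 log quoted in this thread.
REBUILD_RE = re.compile(
    r"BTRFS warning \(device (?P<dev>[^)]+)\): failed to load free space cache "
    r"for block group (?P<bg>\d+), rebuilding it now"
)


def rebuilt_block_groups(dmesg_text: str) -> list[int]:
    """Return the block group numbers from 'rebuilding it now' warnings, in log order."""
    return [int(m.group("bg")) for m in REBUILD_RE.finditer(dmesg_text)]


# Three lines copied from the log above; only two are rebuild warnings.
sample = """\
[  165.455673] BTRFS warning (device dm-1): failed to load free space cache for block group 2039601954816, rebuilding it now
[  234.221435] BTRFS warning (device dm-1): block group 2837392130048 has wrong amount of free space
[  234.249264] BTRFS warning (device dm-1): failed to load free space cache for block group 2837392130048, rebuilding it now
"""

print(rebuilt_block_groups(sample))
```

Feeding it the full dmesg output instead of the sample would list every block group the kernel rebuilt, which makes it easier to compare runs before and after mount -o clear_cache.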
* Re: 4.11.3: BTRFS critical (device dm-1): unable to add free space :-17 => btrfs check --repair runs clean
  2017-06-20 23:12         ` Marc MERLIN
@ 2017-06-20 23:58           ` Marc MERLIN
  2017-06-21  3:31           ` Chris Murphy
  2017-06-21 12:04           ` 4.11.3: BTRFS critical (device dm-1): unable to add free space :-17 => btrfs check --repair runs clean Duncan
  2 siblings, 0 replies; 77+ messages in thread
From: Marc MERLIN @ 2017-06-20 23:58 UTC (permalink / raw)
  To: Hugo Mills, linux-btrfs

On Tue, Jun 20, 2017 at 04:12:03PM -0700, Marc MERLIN wrote:
> Given that check --repair ran clean when I ran it yesterday after this first happened,
> and I then ran mount -o clear_cache , the cache got rebuilt, and I got the problem again,
> this is not looking good, seems like a persistent bug :-/
>
> I'm now going to remount this with nospace_cache to see if your guess about
> space_cache was correct.

Now, it seems that disabling the cache is causing some serious hangs:

[ 2055.473113] INFO: task kworker/u16:17:7579 blocked for more than 120 seconds.
[ 2055.496148] Tainted: G U 4.11.6-amd64-preempt-sysrq-20170406 #6
[ 2055.520611] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 2055.545675] kworker/u16:17 D 0 7579 2 0x00000080
[ 2055.563626] Workqueue: writeback wb_workfn (flush-btrfs-4)
[ 2055.581458] Call Trace:
[ 2055.590154] __schedule+0x4ef/0x627
[ 2055.602830] schedule+0x89/0x9a
[ 2055.613618] io_schedule+0x16/0x38
[ 2055.625324] wait_on_page_bit_common+0xd8/0x151
[ 2055.640413] ? inode_to_bdi+0x35/0x35
[ 2055.653701] __lock_page+0x40/0x42
[ 2055.665431] lock_page+0x19/0x1c
[ 2055.676315] extent_write_cache_pages.constprop.31+0x173/0x368
[ 2055.695049] ? update_load_avg+0x227/0x3c6
[ 2055.708592] ? update_load_avg+0x3b1/0x3c6
[ 2055.722340] ? list_add+0x1a/0x34
[ 2055.733520] ? cfs_rq_throttled.isra.24+0xd/0x1d
[ 2055.748503] ? update_cfs_shares+0x2e/0xcf
[ 2055.761891] extent_writepages+0x5b/0x80
[ 2055.774854] ? __percpu_counter_compare+0x29/0x72
[ 2055.790054] ? insert_reserved_file_extent.constprop.41+0x28e/0x28e
[ 2055.809869] btrfs_writepages+0x28/0x2a
[ 2055.822516] do_writepages+0x20/0x29
[ 2055.834251] __writeback_single_inode+0x8a/0x328
[ 2055.849159] ? inode_cgwb_enabled+0xd/0x3b
[ 2055.862521] writeback_sb_inodes+0x22e/0x400
[ 2055.876310] __writeback_inodes_wb+0x6e/0xb0
[ 2055.890057] wb_writeback+0x163/0x2ca
[ 2055.902436] wb_workfn+0x1f7/0x2bf
[ 2055.913520] ? wb_workfn+0x1f7/0x2bf
[ 2055.925090] ? __switch_to+0x2c8/0x45f
[ 2055.937184] process_one_work+0x193/0x2b0
[ 2055.950034] ? rescuer_thread+0x2b1/0x2b1
[ 2055.962833] worker_thread+0x1e9/0x2c1
[ 2055.974826] ? rescuer_thread+0x2b1/0x2b1
[ 2055.988016] kthread+0xfb/0x100
[ 2055.998183] ? init_completion+0x24/0x24
[ 2056.010902] ? do_syscall_64+0x77/0x7d
[ 2056.022802] ret_from_fork+0x2c/0x40
[ 2056.034224] INFO: task rsync:27554 blocked for more than 120 seconds.
[ 2056.054213] Tainted: G U 4.11.6-amd64-preempt-sysrq-20170406 #6
[ 2056.077611] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 2056.101705] rsync D 0 27554 27526 0x20020080
[ 2056.119102] Call Trace:
[ 2056.127019] __schedule+0x4ef/0x627
[ 2056.138385] schedule+0x89/0x9a
[ 2056.148616] io_schedule+0x16/0x38
[ 2056.159682] wait_on_page_bit_common+0xd8/0x151
[ 2056.173787] ? inode_to_bdi+0x35/0x35
[ 2056.185336] __lock_page+0x40/0x42
[ 2056.196176] lock_page+0x19/0x1c
[ 2056.206420] extent_write_cache_pages.constprop.31+0x173/0x368
[ 2056.224786] ? _raw_read_unlock+0xe/0x1e
[ 2056.237221] ? btrfs_set_lock_blocking_rw+0x9a/0x9d
[ 2056.252388] extent_writepages+0x5b/0x80
[ 2056.264687] ? insert_reserved_file_extent.constprop.41+0x28e/0x28e
[ 2056.284051] btrfs_writepages+0x28/0x2a
[ 2056.296117] do_writepages+0x20/0x29
[ 2056.307426] __filemap_fdatawrite_range+0x97/0xc3
[ 2056.322374] filemap_flush+0x1c/0x1e
[ 2056.333627] btrfs_rename2+0x894/0xf6f
[ 2056.345376] ? capable_wrt_inode_uidgid+0x3f/0x4e
[ 2056.359977] ? generic_permission+0x11e/0x175
[ 2056.373719] vfs_rename+0x234/0x391
[ 2056.384805] ? vfs_rename+0x234/0x391
[ 2056.396341] SYSC_renameat2+0x327/0x448
[ 2056.408349] SyS_rename+0x1e/0x20
[ 2056.418806] do_fast_syscall_32+0xb7/0xfe
[ 2056.431325] entry_SYSENTER_compat+0x4c/0x5b
[ 2056.444642] RIP: 0023:0xf76feb39
[ 2056.454861] RSP: 002b:00000000ffe177bc EFLAGS: 00000292 ORIG_RAX: 0000000000000026
[ 2056.478081] RAX: ffffffffffffffda RBX: 00000000ffe18890 RCX: 00000000ffe1a890
[ 2056.500019] RDX: 0000000000000001 RSI: 00000000ffe1a890 RDI: 0000000000000003
[ 2056.521948] RBP: 00000000ffe177f8 R08: 0000000000000000 R09: 0000000000000000
[ 2056.543858] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
[ 2056.565809] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000

Marc
* Re: 4.11.3: BTRFS critical (device dm-1): unable to add free space :-17 => btrfs check --repair runs clean
  2017-06-20 23:12         ` Marc MERLIN
  2017-06-20 23:58           ` Marc MERLIN
@ 2017-06-21  3:31           ` Chris Murphy
  2017-06-21  3:43             ` Marc MERLIN
  2017-06-21 12:04           ` 4.11.3: BTRFS critical (device dm-1): unable to add free space :-17 => btrfs check --repair runs clean Duncan
  2 siblings, 1 reply; 77+ messages in thread
From: Chris Murphy @ 2017-06-21 3:31 UTC (permalink / raw)
  To: Marc MERLIN; +Cc: Hugo Mills, Btrfs BTRFS

On Tue, Jun 20, 2017 at 5:12 PM, Marc MERLIN <marc@merlins.org> wrote:

> I'm now going to remount this with nospace_cache to see if your guess about
> space_cache was correct.
> Other suggestions also welcome :)

What results do you get with lowmem mode? It won't repair without
additional patches, but might give a dev a clue what's going on. I
regularly see normal mode check finds no problems, and lowmem mode
finds problems. Lowmem mode is a total rewrite so it's a different
implementation and can find things normal mode won't.

-- 
Chris Murphy
* Re: 4.11.3: BTRFS critical (device dm-1): unable to add free space :-17 => btrfs check --repair runs clean
  2017-06-21  3:31           ` Chris Murphy
@ 2017-06-21  3:43             ` Marc MERLIN
  2017-06-21 15:13               ` How to fix errors that check --mode lomem finds, but --mode normal doesn't? Marc MERLIN
  0 siblings, 1 reply; 77+ messages in thread
From: Marc MERLIN @ 2017-06-21 3:43 UTC (permalink / raw)
  To: Chris Murphy; +Cc: Hugo Mills, Btrfs BTRFS

On Tue, Jun 20, 2017 at 09:31:42PM -0600, Chris Murphy wrote:
> On Tue, Jun 20, 2017 at 5:12 PM, Marc MERLIN <marc@merlins.org> wrote:
>
> > I'm now going to remount this with nospace_cache to see if your guess about
> > space_cache was correct.
> > Other suggestions also welcome :)
>
> What results do you get with lowmem mode? It won't repair without
> additional patches, but might give a dev a clue what's going on. I
> regularly see normal mode check finds no problems, and lowmem mode
> finds problems. Lowmem mode is a total rewrite so it's a different
> implementation and can find things normal mode won't.

Oh, I kind of forgot that lowmem mode looked for more things than regular
mode.
I will run this tonight and see what it says.

Marc
* How to fix errors that check --mode lomem finds, but --mode normal doesn't?
  2017-06-21  3:43             ` Marc MERLIN
@ 2017-06-21 15:13               ` Marc MERLIN
  2017-06-21 23:22                 ` Chris Murphy
  2017-06-22  2:22                 ` Qu Wenruo
  0 siblings, 2 replies; 77+ messages in thread
From: Marc MERLIN @ 2017-06-21 15:13 UTC (permalink / raw)
  To: Chris Murphy; +Cc: Hugo Mills, Btrfs BTRFS

On Tue, Jun 20, 2017 at 08:43:52PM -0700, Marc MERLIN wrote:
> On Tue, Jun 20, 2017 at 09:31:42PM -0600, Chris Murphy wrote:
> > On Tue, Jun 20, 2017 at 5:12 PM, Marc MERLIN <marc@merlins.org> wrote:
> >
> > > I'm now going to remount this with nospace_cache to see if your guess about
> > > space_cache was correct.
> > > Other suggestions also welcome :)
> >
> > What results do you get with lowmem mode? It won't repair without
> > additional patches, but might give a dev a clue what's going on. I
> > regularly see normal mode check finds no problems, and lowmem mode
> > finds problems. Lowmem mode is a total rewrite so it's a different
> > implementation and can find things normal mode won't.
>
> Oh, I kind of forgot that lowmem mode looked for more things than regular
> mode.
> I will run this tonight and see what it says.

It's probably still a ways from being finished given how slow lowmem is in
comparison, but sadly it found a bunch of problems which regular mode didn't
find.

I'm pretty bummed. I just spent way too long recreating this filesystem and
the multiple btrfs send/receive relationships from other machines. Took a bit
over a week :(

It looks like the errors are not major (especially if the regular mode
doesn't even see them), but without lowmem --repair, I'm kind of screwed.

I'm wondering if I could/should leave those errors unfixed until lowmem
--repair finally happens, or whether I'm looking at spending another week
rebuilding this filesystem :-/

gargamel:~# btrfs check -p --mode lowmem /dev/mapper/dshelf2
Checking filesystem on /dev/mapper/dshelf2
UUID: 85441c59-ad11-4b25-b1fe-974f9e4acede
ERROR: extent[3886187384832, 81920] referencer count mismatch (root: 11930, owner: 375444, offset: 1851654144) wanted: 1, have: 4
ERROR: extent[3886189391872, 122880] referencer count mismatch (root: 11712, owner: 863395, offset: 79659008) wanted: 1, have: 2
ERROR: extent[3933249708032, 69632] referencer count mismatch (root: 12424, owner: 6945, offset: 2083389440) wanted: 1, have: 2
ERROR: extent[3933249708032, 69632] referencer count mismatch (root: 12172, owner: 6945, offset: 2083389440) wanted: 1, have: 2
ERROR: extent[4571729862656, 876544] referencer count mismatch (root: 11058, owner: 375442, offset: 907706368) wanted: 6, have: 21
ERROR: extent[4641490833408, 270336] referencer count mismatch (root: 11712, owner: 375444, offset: 1848672256) wanted: 3, have: 5
ERROR: extent[4641490833408, 270336] referencer count mismatch (root: 11276, owner: 375444, offset: 1848672256) wanted: 3, have: 5
ERROR: extent[4641490833408, 270336] referencer count mismatch (root: 11058, owner: 375444, offset: 1848672256) wanted: 3, have: 5
ERROR: extent[4641490833408, 270336] referencer count mismatch (root: 11494, owner: 375444, offset: 1848672256) wanted: 3, have: 5
ERROR: extent[4658555617280, 122880] referencer count mismatch (root: 11712, owner: 375444, offset: 1848705024) wanted: 1, have: 3
ERROR: extent[4677858123776, 417792] referencer count mismatch (root: 11712, owner: 863395, offset: 79523840) wanted: 1, have: 3
ERROR: extent[4677858123776, 417792] referencer count mismatch (root: 11930, owner: 863395, offset: 79523840) wanted: 1, have: 3
ERROR: extent[4698380947456, 409600] referencer count mismatch (root: 11930, owner: 375444, offset: 1851596800) wanted: 3, have: 4
ERROR: extent[4720470421504, 667648] referencer count mismatch (root: 11058, owner: 3463478, offset: 2334720) wanted: 2, have: 10
ERROR: extent[4783941246976, 65536] referencer count mismatch (root: 9365, owner: 24493, offset: 4562944) wanted: 2, have: 3
ERROR: extent[5077564477440, 106496] referencer count mismatch (root: 9370, owner: 1602694, offset: 734756864) wanted: 1, have: 2
ERROR: extent[5136306929664, 131489792] referencer count mismatch (root: 11712, owner: 375441, offset: 910999552) wanted: 16, have: 1864
ERROR: extent[5136306929664, 131489792] referencer count mismatch (root: 11276, owner: 375441, offset: 910999552) wanted: 867, have: 1865
ERROR: extent[5136306929664, 131489792] referencer count mismatch (root: 11058, owner: 375441, offset: 910999552) wanted: 126, have: 1872
ERROR: extent[5136306929664, 131489792] referencer count mismatch (root: 11494, owner: 375441, offset: 910999552) wanted: 866, have: 1864
ERROR: extent[5136306929664, 131489792] referencer count mismatch (root: 11930, owner: 375441, offset: 910999552) wanted: 861, have: 1859
ERROR: extent[5136649891840, 66781184] referencer count mismatch (root: 11058, owner: 375442, offset: 192659456) wanted: 5, have: 19
ERROR: extent[5136879157248, 134217728] referencer count mismatch (root: 11930, owner: 375442, offset: 394543104) wanted: 10, have: 33
ERROR: extent[5137380671488, 80945152] referencer count mismatch (root: 11058, owner: 375442, offset: 875233280) wanted: 1, have: 21
ERROR: extent[5137380671488, 80945152] referencer count mismatch (root: 11930, owner: 375442, offset: 875233280) wanted: 11, have: 21
ERROR: extent[5138641395712, 524288] referencer count mismatch (root: 11494, owner: 375445, offset: 39845888) wanted: 1, have: 3
ERROR: extent[5190245990400, 53248] referencer count mismatch (root: 11712, owner: 863395, offset: 51118080) wanted: 1, have: 4
ERROR: extent[5190245990400, 53248] referencer count mismatch (root: 11276, owner: 863395, offset: 51118080)
wanted: 1, have: 4 ERROR: extent[5190245990400, 53248] referencer count mismatch (root: 11494, owner: 863395, offset: 51118080) wanted: 1, have: 4 ERROR: extent[5190274174976, 77824] referencer count mismatch (root: 11276, owner: 863395, offset: 74952704) wanted: 3, have: 5 ERROR: extent[5190274174976, 77824] referencer count mismatch (root: 11058, owner: 863395, offset: 74952704) wanted: 3, have: 5 ERROR: extent[5190274174976, 77824] referencer count mismatch (root: 11494, owner: 863395, offset: 74952704) wanted: 3, have: 5 ERROR: extent[5190274174976, 77824] referencer count mismatch (root: 11930, owner: 863395, offset: 74952704) wanted: 3, have: 5 ERROR: extent[5190275895296, 77824] referencer count mismatch (root: 11712, owner: 863395, offset: 77705216) wanted: 1, have: 6 ERROR: extent[5190275895296, 77824] referencer count mismatch (root: 11276, owner: 863395, offset: 77705216) wanted: 5, have: 6 ERROR: extent[5190275895296, 77824] referencer count mismatch (root: 11058, owner: 863395, offset: 77705216) wanted: 5, have: 6 ERROR: extent[5427326599168, 61440] referencer count mismatch (root: 9370, owner: 1225712, offset: 29753344) wanted: 1, have: 5 ERROR: extent[5456623030272, 24576] referencer count mismatch (root: 11058, owner: 2278892, offset: 786432) wanted: 2, have: 3 ERROR: extent[5851251269632, 134217728] referencer count mismatch (root: 9370, owner: 1602695, offset: 534061056) wanted: 3, have: 4 ERROR: errors found in extent allocation tree or chunk allocation cache and super generation don't match, space cache will be invalidated ERROR: root 3857 EXTENT_DATA[108864 4096] interrupt ERROR: root 3857 EXTENT_DATA[133050 4096] interrupt ERROR: root 3857 EXTENT_DATA[388570 4096] interrupt ERROR: root 3857 EXTENT_DATA[729583 4096] interrupt ERROR: root 3857 EXTENT_DATA[984778 4096] interrupt ERROR: root 3857 EXTENT_DATA[997394 4096] interrupt ERROR: root 3857 EXTENT_DATA[1002954 4096] interrupt ERROR: root 3857 EXTENT_DATA[1007491 4096] interrupt ERROR: root 
3857 EXTENT_DATA[1111463 4096] interrupt ERROR: root 3857 EXTENT_DATA[1111506 4096] interrupt ERROR: root 3857 EXTENT_DATA[1111536 4096] interrupt ERROR: root 3857 EXTENT_DATA[1111536 4096] interrupt ERROR: root 3857 EXTENT_DATA[1134500 4096] interrupt ERROR: root 3857 EXTENT_DATA[1136498 4096] interrupt ERROR: root 3857 EXTENT_DATA[1175965 4096] interrupt ERROR: root 3857 EXTENT_DATA[1185977 4096] interrupt ERROR: root 3857 EXTENT_DATA[1190919 4096] interrupt ERROR: root 3857 EXTENT_DATA[1201340 4096] interrupt ERROR: root 3857 EXTENT_DATA[1230370 4096] interrupt ERROR: root 3857 EXTENT_DATA[1230530 4096] interrupt ERROR: root 3857 EXTENT_DATA[1230530 4096] interrupt ERROR: root 3857 EXTENT_DATA[1235960 4096] interrupt ERROR: root 3857 EXTENT_DATA[1248784 4096] interrupt ERROR: root 3857 EXTENT_DATA[1271827 4096] interrupt ERROR: root 3857 EXTENT_DATA[1295242 4096] interrupt ERROR: root 3857 EXTENT_DATA[1406074 4096] interrupt ERROR: root 3857 EXTENT_DATA[1410780 4096] interrupt ERROR: root 3857 EXTENT_DATA[1412938 4096] interrupt ERROR: root 3857 EXTENT_DATA[1413532 4096] interrupt ERROR: root 3857 EXTENT_DATA[1413532 4096] interrupt ERROR: root 3857 EXTENT_DATA[1421245 4096] interrupt ERROR: root 3857 EXTENT_DATA[1423365 4096] interrupt ERROR: root 3857 EXTENT_DATA[1425985 4096] interrupt ERROR: root 3857 EXTENT_DATA[1429229 4096] interrupt ERROR: root 3857 EXTENT_DATA[1430615 4096] interrupt ERROR: root 3857 EXTENT_DATA[1443769 4096] interrupt ERROR: root 3860 EXTENT_DATA[599089 4096] interrupt (not finished, still going on) Marc -- "A mouse is a device used to point at the xterm you want to type in" - A.S.R. Microsoft is to operating systems .... .... what McDonalds is to gourmet cooking Home page: http://marc.merlins.org/ ^ permalink raw reply [flat|nested] 77+ messages in thread
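For anyone triaging output like the above, here is a minimal Python sketch (not part of btrfs-progs) that groups the "referencer count mismatch" lines by extent, so it is easier to see which extents are reported for several roots. The regular expression is an assumption inferred only from the lines quoted in this message, and `summarize_mismatches` and `SAMPLE` are hypothetical names used for illustration.

```python
import re
from collections import defaultdict

# Assumed line format, inferred from the lowmem output quoted in this thread.
MISMATCH_RE = re.compile(
    r"ERROR: extent\[(?P<bytenr>\d+), (?P<length>\d+)\] referencer count mismatch "
    r"\(root: (?P<root>\d+), owner: (?P<owner>\d+), offset: (?P<offset>\d+)\) "
    r"wanted: (?P<wanted>\d+), have: (?P<have>\d+)"
)


def summarize_mismatches(text):
    """Return {(bytenr, length): [(root, wanted, have), ...]} for every report found."""
    by_extent = defaultdict(list)
    for m in MISMATCH_RE.finditer(text):
        key = (int(m.group("bytenr")), int(m.group("length")))
        report = (int(m.group("root")), int(m.group("wanted")), int(m.group("have")))
        by_extent[key].append(report)
    return dict(by_extent)


# Three lines copied from the output above.
SAMPLE = (
    "ERROR: extent[3886187384832, 81920] referencer count mismatch "
    "(root: 11930, owner: 375444, offset: 1851654144) wanted: 1, have: 4\n"
    "ERROR: extent[3933249708032, 69632] referencer count mismatch "
    "(root: 12424, owner: 6945, offset: 2083389440) wanted: 1, have: 2\n"
    "ERROR: extent[3933249708032, 69632] referencer count mismatch "
    "(root: 12172, owner: 6945, offset: 2083389440) wanted: 1, have: 2\n"
)

if __name__ == "__main__":
    for (bytenr, length), reports in summarize_mismatches(SAMPLE).items():
        roots = ", ".join(str(root) for root, _, _ in reports)
        print(f"extent {bytenr} len {length}: {len(reports)} report(s), roots {roots}")
```

Because the pattern is matched with `finditer`, it does not depend on each report sitting on its own line, so it should also cope with output that has been re-wrapped by a mail client.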
* Re: How to fix errors that check --mode lomem finds, but --mode normal doesn't? 2017-06-21 15:13 ` How to fix errors that check --mode lomem finds, but --mode normal doesn't? Marc MERLIN @ 2017-06-21 23:22 ` Chris Murphy 2017-06-22 0:48 ` Marc MERLIN 2017-06-22 2:22 ` Qu Wenruo 1 sibling, 1 reply; 77+ messages in thread From: Chris Murphy @ 2017-06-21 23:22 UTC (permalink / raw) To: Marc MERLIN; +Cc: Chris Murphy, Hugo Mills, Btrfs BTRFS, Qu Wenruo On Wed, Jun 21, 2017 at 9:13 AM, Marc MERLIN <marc@merlins.org> wrote: > On Tue, Jun 20, 2017 at 08:43:52PM -0700, Marc MERLIN wrote: >> On Tue, Jun 20, 2017 at 09:31:42PM -0600, Chris Murphy wrote: >> > On Tue, Jun 20, 2017 at 5:12 PM, Marc MERLIN <marc@merlins.org> wrote: >> > >> > > I'm now going to remount this with nospace_cache to see if your guess about >> > > space_cache was correct. >> > > Other suggestions also welcome :) >> > >> > What results do you get with lowmem mode? It won't repair without >> > additional patches, but might give a dev a clue what's going on. I >> > regularly see normal mode check finds no problems, and lowmem mode >> > finds problems. Lowmem mode is a total rewrite so it's a different >> > implementation and can find things normal mode won't. >> >> Oh, I kind of forgot that lowmem mode looked for more things than regular >> mode. >> I will run this tonight and see what it says. > > It's probably still a ways from being finished given how slow lowmem is in > comparison, but sadly it found a bunch of problems which regular mode didn't > find. > > I'm pretty bummed. I just spent way too long recreating this filesystem and > the multiple btrfs send/receive relationships from other machines. Took a bit > over a week :( > > It looks like the errors are not major (especially if the regular mode > doesn't even see them), but without lowmem --repair, I'm kind of screwed. 
> > I'm wondering if I could/should leave those errors unfixed until lowmem --repair > finally happens, or whether I'm looking at spending another week rebuilding > this filesystem :-/ > > > gargamel:~# btrfs check -p --mode lowmem /dev/mapper/dshelf2 > Checking filesystem on /dev/mapper/dshelf2 > UUID: 85441c59-ad11-4b25-b1fe-974f9e4acede > ERROR: extent[3886187384832, 81920] referencer count mismatch (root: 11930, owner: 375444, offset: 1851654144) wanted: 1, have: 4 > ERROR: extent[3886189391872, 122880] referencer count mismatch (root: 11712, owner: 863395, offset: 79659008) wanted: 1, have: 2 > ERROR: extent[3933249708032, 69632] referencer count mismatch (root: 12424, owner: 6945, offset: 2083389440) wanted: 1, have: 2 > ERROR: extent[3933249708032, 69632] referencer count mismatch (root: 12172, owner: 6945, offset: 2083389440) wanted: 1, have: 2 > ERROR: extent[4571729862656, 876544] referencer count mismatch (root: 11058, owner: 375442, offset: 907706368) wanted: 6, have: 21 > ERROR: extent[4641490833408, 270336] referencer count mismatch (root: 11712, owner: 375444, offset: 1848672256) wanted: 3, have: 5 > ERROR: extent[4641490833408, 270336] referencer count mismatch (root: 11276, owner: 375444, offset: 1848672256) wanted: 3, have: 5 > ERROR: extent[4641490833408, 270336] referencer count mismatch (root: 11058, owner: 375444, offset: 1848672256) wanted: 3, have: 5 > ERROR: extent[4641490833408, 270336] referencer count mismatch (root: 11494, owner: 375444, offset: 1848672256) wanted: 3, have: 5 > ERROR: extent[4658555617280, 122880] referencer count mismatch (root: 11712, owner: 375444, offset: 1848705024) wanted: 1, have: 3 > ERROR: extent[4677858123776, 417792] referencer count mismatch (root: 11712, owner: 863395, offset: 79523840) wanted: 1, have: 3 > ERROR: extent[4677858123776, 417792] referencer count mismatch (root: 11930, owner: 863395, offset: 79523840) wanted: 1, have: 3 > ERROR: extent[4698380947456, 409600] referencer count mismatch (root: 
11930, owner: 375444, offset: 1851596800) wanted: 3, have: 4 > ERROR: extent[4720470421504, 667648] referencer count mismatch (root: 11058, owner: 3463478, offset: 2334720) wanted: 2, have: 10 > ERROR: extent[4783941246976, 65536] referencer count mismatch (root: 9365, owner: 24493, offset: 4562944) wanted: 2, have: 3 > ERROR: extent[5077564477440, 106496] referencer count mismatch (root: 9370, owner: 1602694, offset: 734756864) wanted: 1, have: 2 > ERROR: extent[5136306929664, 131489792] referencer count mismatch (root: 11712, owner: 375441, offset: 910999552) wanted: 16, have: 1864 > ERROR: extent[5136306929664, 131489792] referencer count mismatch (root: 11276, owner: 375441, offset: 910999552) wanted: 867, have: 1865 > ERROR: extent[5136306929664, 131489792] referencer count mismatch (root: 11058, owner: 375441, offset: 910999552) wanted: 126, have: 1872 > ERROR: extent[5136306929664, 131489792] referencer count mismatch (root: 11494, owner: 375441, offset: 910999552) wanted: 866, have: 1864 > ERROR: extent[5136306929664, 131489792] referencer count mismatch (root: 11930, owner: 375441, offset: 910999552) wanted: 861, have: 1859 > ERROR: extent[5136649891840, 66781184] referencer count mismatch (root: 11058, owner: 375442, offset: 192659456) wanted: 5, have: 19 > ERROR: extent[5136879157248, 134217728] referencer count mismatch (root: 11930, owner: 375442, offset: 394543104) wanted: 10, have: 33 > ERROR: extent[5137380671488, 80945152] referencer count mismatch (root: 11058, owner: 375442, offset: 875233280) wanted: 1, have: 21 > ERROR: extent[5137380671488, 80945152] referencer count mismatch (root: 11930, owner: 375442, offset: 875233280) wanted: 11, have: 21 > ERROR: extent[5138641395712, 524288] referencer count mismatch (root: 11494, owner: 375445, offset: 39845888) wanted: 1, have: 3 > ERROR: extent[5190245990400, 53248] referencer count mismatch (root: 11712, owner: 863395, offset: 51118080) wanted: 1, have: 4 > ERROR: extent[5190245990400, 53248] 
referencer count mismatch (root: 11276, owner: 863395, offset: 51118080) wanted: 1, have: 4 > ERROR: extent[5190245990400, 53248] referencer count mismatch (root: 11494, owner: 863395, offset: 51118080) wanted: 1, have: 4 > ERROR: extent[5190274174976, 77824] referencer count mismatch (root: 11276, owner: 863395, offset: 74952704) wanted: 3, have: 5 > ERROR: extent[5190274174976, 77824] referencer count mismatch (root: 11058, owner: 863395, offset: 74952704) wanted: 3, have: 5 > ERROR: extent[5190274174976, 77824] referencer count mismatch (root: 11494, owner: 863395, offset: 74952704) wanted: 3, have: 5 > ERROR: extent[5190274174976, 77824] referencer count mismatch (root: 11930, owner: 863395, offset: 74952704) wanted: 3, have: 5 > ERROR: extent[5190275895296, 77824] referencer count mismatch (root: 11712, owner: 863395, offset: 77705216) wanted: 1, have: 6 > ERROR: extent[5190275895296, 77824] referencer count mismatch (root: 11276, owner: 863395, offset: 77705216) wanted: 5, have: 6 > ERROR: extent[5190275895296, 77824] referencer count mismatch (root: 11058, owner: 863395, offset: 77705216) wanted: 5, have: 6 > ERROR: extent[5427326599168, 61440] referencer count mismatch (root: 9370, owner: 1225712, offset: 29753344) wanted: 1, have: 5 > ERROR: extent[5456623030272, 24576] referencer count mismatch (root: 11058, owner: 2278892, offset: 786432) wanted: 2, have: 3 > ERROR: extent[5851251269632, 134217728] referencer count mismatch (root: 9370, owner: 1602695, offset: 534061056) wanted: 3, have: 4 > ERROR: errors found in extent allocation tree or chunk allocation > cache and super generation don't match, space cache will be invalidated > ERROR: root 3857 EXTENT_DATA[108864 4096] interrupt > ERROR: root 3857 EXTENT_DATA[133050 4096] interrupt > ERROR: root 3857 EXTENT_DATA[388570 4096] interrupt > ERROR: root 3857 EXTENT_DATA[729583 4096] interrupt > ERROR: root 3857 EXTENT_DATA[984778 4096] interrupt > ERROR: root 3857 EXTENT_DATA[997394 4096] interrupt > 
ERROR: root 3857 EXTENT_DATA[1002954 4096] interrupt > ERROR: root 3857 EXTENT_DATA[1007491 4096] interrupt > ERROR: root 3857 EXTENT_DATA[1111463 4096] interrupt > ERROR: root 3857 EXTENT_DATA[1111506 4096] interrupt > ERROR: root 3857 EXTENT_DATA[1111536 4096] interrupt > ERROR: root 3857 EXTENT_DATA[1111536 4096] interrupt > ERROR: root 3857 EXTENT_DATA[1134500 4096] interrupt > ERROR: root 3857 EXTENT_DATA[1136498 4096] interrupt > ERROR: root 3857 EXTENT_DATA[1175965 4096] interrupt > ERROR: root 3857 EXTENT_DATA[1185977 4096] interrupt > ERROR: root 3857 EXTENT_DATA[1190919 4096] interrupt > ERROR: root 3857 EXTENT_DATA[1201340 4096] interrupt > ERROR: root 3857 EXTENT_DATA[1230370 4096] interrupt > ERROR: root 3857 EXTENT_DATA[1230530 4096] interrupt > ERROR: root 3857 EXTENT_DATA[1230530 4096] interrupt > ERROR: root 3857 EXTENT_DATA[1235960 4096] interrupt > ERROR: root 3857 EXTENT_DATA[1248784 4096] interrupt > ERROR: root 3857 EXTENT_DATA[1271827 4096] interrupt > ERROR: root 3857 EXTENT_DATA[1295242 4096] interrupt > ERROR: root 3857 EXTENT_DATA[1406074 4096] interrupt > ERROR: root 3857 EXTENT_DATA[1410780 4096] interrupt > ERROR: root 3857 EXTENT_DATA[1412938 4096] interrupt > ERROR: root 3857 EXTENT_DATA[1413532 4096] interrupt > ERROR: root 3857 EXTENT_DATA[1413532 4096] interrupt > ERROR: root 3857 EXTENT_DATA[1421245 4096] interrupt > ERROR: root 3857 EXTENT_DATA[1423365 4096] interrupt > ERROR: root 3857 EXTENT_DATA[1425985 4096] interrupt > ERROR: root 3857 EXTENT_DATA[1429229 4096] interrupt > ERROR: root 3857 EXTENT_DATA[1430615 4096] interrupt > ERROR: root 3857 EXTENT_DATA[1443769 4096] interrupt > ERROR: root 3860 EXTENT_DATA[599089 4096] interrupt > > (not finished, still going on) I don't know what it means. Maybe Qu has some idea. He might want a btrfs-image of this file system to see if it's a bug. There are still some bugs found with lowmem mode, so these could be bogus messages. 
But the file system clearly has problems. The question is why such a new file system has these kinds of problems, which normal repair can't fix because it doesn't even detect them; or maybe there is no problem on disk per se, and the problem is a bug. In that case, on the off chance, going back to a substantially older kernel might help. Maybe the latest 4.9 series kernel? -- Chris Murphy ^ permalink raw reply [flat|nested] 77+ messages in thread
* Re: How to fix errors that check --mode lomem finds, but --mode normal doesn't? 2017-06-21 23:22 ` Chris Murphy @ 2017-06-22 0:48 ` Marc MERLIN 0 siblings, 0 replies; 77+ messages in thread From: Marc MERLIN @ 2017-06-22 0:48 UTC (permalink / raw) To: Chris Murphy; +Cc: Hugo Mills, Btrfs BTRFS, Qu Wenruo On Wed, Jun 21, 2017 at 05:22:15PM -0600, Chris Murphy wrote: > I don't know what it means. Maybe Qu has some idea. He might want a > btrfs-image of this file system to see if it's a bug. There are still > some bugs found with lowmem mode, so these could be bogus messages. > But the file system clearly has problems, the question is why does > such a new file system have these kinds of problems that can't be > fixed by normal repair because they aren't even being detected; or > maybe there is no problem on disk per se, the problem might be a bug. Yes, that's indeed the question I was asking myself too :) Now, I did have a couple of drives that got kicked out of a (mdadm, not btrfs) raid array, causing the array to go away while btrfs was trying to write to it, but my understanding of btrfs write journalling is that the new data that was being written should have been discarded and I should have ended up at the previous good state. AFAIK, I'm pretty sure I didn't get any block layer corruption this time, I just got a drive effectively pulled from a running array (well 2, one went to degraded, and the 2nd one killed the array. 
I re-added them carefully and correctly in the right order and mdadm rebuilt what it needed using the write-intent bitmap) For what it's worth, I've had no end of trouble with SATA SAS cards and their 4 SATA cables in one: https://www.amazon.com/gp/product/B0050SLTPC/ref=oh_aui_search_detailpage?ie=UTF8&psc=1 https://www.amazon.com/gp/product/B013G4EMH8/ref=oh_aui_search_detailpage?ie=UTF8&psc=1 I have it stable now, but those cables are super sensitive and have caused drives to get kicked out if they weren't air canned first, and plugged in just right :-/ > In which case, off chance going back to a substantially older kernel > might help. Maybe the latest 4.9 series kernel? If there is reasonable evidence that it will help, I can give it a shot. Qu, or anyone: btrfs-image is going to take a long time (maybe a day or more), since I have to use at least -s before I can share the image, and if I need -ss, it's even slower from what I remember. So please suggest the fastest image options I can use. It's a quad core HT machine, so should I use btrfs-image -c0 -t8 -s /dev/ image (I'm assuming -c9 will not be faster and that -ss will be even slower) Thanks, Marc -- "A mouse is a device used to point at the xterm you want to type in" - A.S.R. Microsoft is to operating systems .... .... what McDonalds is to gourmet cooking Home page: http://marc.merlins.org/ ^ permalink raw reply [flat|nested] 77+ messages in thread
* Re: How to fix errors that check --mode lomem finds, but --mode normal doesn't? 2017-06-21 15:13 ` How to fix errors that check --mode lomem finds, but --mode normal doesn't? Marc MERLIN 2017-06-21 23:22 ` Chris Murphy @ 2017-06-22 2:22 ` Qu Wenruo 2017-06-22 2:53 ` Marc MERLIN 1 sibling, 1 reply; 77+ messages in thread From: Qu Wenruo @ 2017-06-22 2:22 UTC (permalink / raw) To: Marc MERLIN, Chris Murphy; +Cc: Hugo Mills, Btrfs BTRFS At 06/21/2017 11:13 PM, Marc MERLIN wrote: > On Tue, Jun 20, 2017 at 08:43:52PM -0700, Marc MERLIN wrote: >> On Tue, Jun 20, 2017 at 09:31:42PM -0600, Chris Murphy wrote: >>> On Tue, Jun 20, 2017 at 5:12 PM, Marc MERLIN <marc@merlins.org> wrote: >>> >>>> I'm now going to remount this with nospace_cache to see if your guess about >>>> space_cache was correct. >>>> Other suggestions also welcome :) >>> >>> What results do you get with lowmem mode? It won't repair without >>> additional patches, but might give a dev a clue what's going on. I >>> regularly see normal mode check finds no problems, and lowmem mode >>> finds problems. Lowmem mode is a total rewrite so it's a different >>> implementation and can find things normal mode won't. >> >> Oh, I kind of forgot that lowmem mode looked for more things than regular >> mode. >> I will run this tonight and see what it says. > > It's probably still a ways from being finished given how slow lowmem is in > comparison, but sadly it found a bunch of problems which regular mode didn't > find. > > I'm pretty bummed. I just spent way too long recreating this filesystem and > the multiple btrfs send/receive relationships from other machines. Took a bit > over a week :( > > It looks like the errors are not major (especially if the regular mode > doesn't even see them), but without lowmem --repair, I'm kind of screwed. 
> > I'm wondering if I could/should leave those errors unfixed until lowmem --repair > finally happens, or whether I'm looking at spending another week rebuilding > this filesystem :-/ > > > gargamel:~# btrfs check -p --mode lowmem /dev/mapper/dshelf2 > Checking filesystem on /dev/mapper/dshelf2 > UUID: 85441c59-ad11-4b25-b1fe-974f9e4acede > ERROR: extent[3886187384832, 81920] referencer count mismatch (root: 11930, owner: 375444, offset: 1851654144) wanted: 1, have: 4 This means that in the extent tree, btrfs says there is only one reference to this extent, but lowmem mode finds 4. It would be a great help if you could dump the extent tree for it. # btrfs-debug-tree <dev> | grep -C 10 3886187384832 > ERROR: extent[3886189391872, 122880] referencer count mismatch (root: 11712, owner: 863395, offset: 79659008) wanted: 1, have: 2 > ERROR: extent[3933249708032, 69632] referencer count mismatch (root: 12424, owner: 6945, offset: 2083389440) wanted: 1, have: 2 > ERROR: extent[3933249708032, 69632] referencer count mismatch (root: 12172, owner: 6945, offset: 2083389440) wanted: 1, have: 2 > ERROR: extent[4571729862656, 876544] referencer count mismatch (root: 11058, owner: 375442, offset: 907706368) wanted: 6, have: 21 > ERROR: extent[4641490833408, 270336] referencer count mismatch (root: 11712, owner: 375444, offset: 1848672256) wanted: 3, have: 5 > ERROR: extent[4641490833408, 270336] referencer count mismatch (root: 11276, owner: 375444, offset: 1848672256) wanted: 3, have: 5 > ERROR: extent[4641490833408, 270336] referencer count mismatch (root: 11058, owner: 375444, offset: 1848672256) wanted: 3, have: 5 > ERROR: extent[4641490833408, 270336] referencer count mismatch (root: 11494, owner: 375444, offset: 1848672256) wanted: 3, have: 5 > ERROR: extent[4658555617280, 122880] referencer count mismatch (root: 11712, owner: 375444, offset: 1848705024) wanted: 1, have: 3 > ERROR: extent[4677858123776, 417792] referencer count mismatch (root: 11712, owner: 863395, offset: 
79523840) wanted: 1, have: 3 > ERROR: extent[4677858123776, 417792] referencer count mismatch (root: 11930, owner: 863395, offset: 79523840) wanted: 1, have: 3 > ERROR: extent[4698380947456, 409600] referencer count mismatch (root: 11930, owner: 375444, offset: 1851596800) wanted: 3, have: 4 > ERROR: extent[4720470421504, 667648] referencer count mismatch (root: 11058, owner: 3463478, offset: 2334720) wanted: 2, have: 10 > ERROR: extent[4783941246976, 65536] referencer count mismatch (root: 9365, owner: 24493, offset: 4562944) wanted: 2, have: 3 > ERROR: extent[5077564477440, 106496] referencer count mismatch (root: 9370, owner: 1602694, offset: 734756864) wanted: 1, have: 2 > ERROR: extent[5136306929664, 131489792] referencer count mismatch (root: 11712, owner: 375441, offset: 910999552) wanted: 16, have: 1864 > ERROR: extent[5136306929664, 131489792] referencer count mismatch (root: 11276, owner: 375441, offset: 910999552) wanted: 867, have: 1865 > ERROR: extent[5136306929664, 131489792] referencer count mismatch (root: 11058, owner: 375441, offset: 910999552) wanted: 126, have: 1872 > ERROR: extent[5136306929664, 131489792] referencer count mismatch (root: 11494, owner: 375441, offset: 910999552) wanted: 866, have: 1864 > ERROR: extent[5136306929664, 131489792] referencer count mismatch (root: 11930, owner: 375441, offset: 910999552) wanted: 861, have: 1859 > ERROR: extent[5136649891840, 66781184] referencer count mismatch (root: 11058, owner: 375442, offset: 192659456) wanted: 5, have: 19 > ERROR: extent[5136879157248, 134217728] referencer count mismatch (root: 11930, owner: 375442, offset: 394543104) wanted: 10, have: 33 > ERROR: extent[5137380671488, 80945152] referencer count mismatch (root: 11058, owner: 375442, offset: 875233280) wanted: 1, have: 21 > ERROR: extent[5137380671488, 80945152] referencer count mismatch (root: 11930, owner: 375442, offset: 875233280) wanted: 11, have: 21 > ERROR: extent[5138641395712, 524288] referencer count mismatch (root: 
11494, owner: 375445, offset: 39845888) wanted: 1, have: 3 > ERROR: extent[5190245990400, 53248] referencer count mismatch (root: 11712, owner: 863395, offset: 51118080) wanted: 1, have: 4 > ERROR: extent[5190245990400, 53248] referencer count mismatch (root: 11276, owner: 863395, offset: 51118080) wanted: 1, have: 4 > ERROR: extent[5190245990400, 53248] referencer count mismatch (root: 11494, owner: 863395, offset: 51118080) wanted: 1, have: 4 > ERROR: extent[5190274174976, 77824] referencer count mismatch (root: 11276, owner: 863395, offset: 74952704) wanted: 3, have: 5 > ERROR: extent[5190274174976, 77824] referencer count mismatch (root: 11058, owner: 863395, offset: 74952704) wanted: 3, have: 5 > ERROR: extent[5190274174976, 77824] referencer count mismatch (root: 11494, owner: 863395, offset: 74952704) wanted: 3, have: 5 > ERROR: extent[5190274174976, 77824] referencer count mismatch (root: 11930, owner: 863395, offset: 74952704) wanted: 3, have: 5 > ERROR: extent[5190275895296, 77824] referencer count mismatch (root: 11712, owner: 863395, offset: 77705216) wanted: 1, have: 6 > ERROR: extent[5190275895296, 77824] referencer count mismatch (root: 11276, owner: 863395, offset: 77705216) wanted: 5, have: 6 > ERROR: extent[5190275895296, 77824] referencer count mismatch (root: 11058, owner: 863395, offset: 77705216) wanted: 5, have: 6 > ERROR: extent[5427326599168, 61440] referencer count mismatch (root: 9370, owner: 1225712, offset: 29753344) wanted: 1, have: 5 > ERROR: extent[5456623030272, 24576] referencer count mismatch (root: 11058, owner: 2278892, offset: 786432) wanted: 2, have: 3 > ERROR: extent[5851251269632, 134217728] referencer count mismatch (root: 9370, owner: 1602695, offset: 534061056) wanted: 3, have: 4 > ERROR: errors found in extent allocation tree or chunk allocation > cache and super generation don't match, space cache will be invalidated > ERROR: root 3857 EXTENT_DATA[108864 4096] interrupt This means that, for root 3857, inode 108864, file 
offset 4096, there is a gap before that extent. In NO_HOLES mode it's allowed, but if NO_HOLES incompat flag is not set, this should be a problem. I wonder if this is a problem caused by inlined compressed file extent. This can also be dumped by the following command. # btrfs-debug-tree -t 3857 <dev> | grep -C 10 108864 Thanks, Qu > ERROR: root 3857 EXTENT_DATA[133050 4096] interrupt > ERROR: root 3857 EXTENT_DATA[388570 4096] interrupt > ERROR: root 3857 EXTENT_DATA[729583 4096] interrupt > ERROR: root 3857 EXTENT_DATA[984778 4096] interrupt > ERROR: root 3857 EXTENT_DATA[997394 4096] interrupt > ERROR: root 3857 EXTENT_DATA[1002954 4096] interrupt > ERROR: root 3857 EXTENT_DATA[1007491 4096] interrupt > ERROR: root 3857 EXTENT_DATA[1111463 4096] interrupt > ERROR: root 3857 EXTENT_DATA[1111506 4096] interrupt > ERROR: root 3857 EXTENT_DATA[1111536 4096] interrupt > ERROR: root 3857 EXTENT_DATA[1111536 4096] interrupt > ERROR: root 3857 EXTENT_DATA[1134500 4096] interrupt > ERROR: root 3857 EXTENT_DATA[1136498 4096] interrupt > ERROR: root 3857 EXTENT_DATA[1175965 4096] interrupt > ERROR: root 3857 EXTENT_DATA[1185977 4096] interrupt > ERROR: root 3857 EXTENT_DATA[1190919 4096] interrupt > ERROR: root 3857 EXTENT_DATA[1201340 4096] interrupt > ERROR: root 3857 EXTENT_DATA[1230370 4096] interrupt > ERROR: root 3857 EXTENT_DATA[1230530 4096] interrupt > ERROR: root 3857 EXTENT_DATA[1230530 4096] interrupt > ERROR: root 3857 EXTENT_DATA[1235960 4096] interrupt > ERROR: root 3857 EXTENT_DATA[1248784 4096] interrupt > ERROR: root 3857 EXTENT_DATA[1271827 4096] interrupt > ERROR: root 3857 EXTENT_DATA[1295242 4096] interrupt > ERROR: root 3857 EXTENT_DATA[1406074 4096] interrupt > ERROR: root 3857 EXTENT_DATA[1410780 4096] interrupt > ERROR: root 3857 EXTENT_DATA[1412938 4096] interrupt > ERROR: root 3857 EXTENT_DATA[1413532 4096] interrupt > ERROR: root 3857 EXTENT_DATA[1413532 4096] interrupt > ERROR: root 3857 EXTENT_DATA[1421245 4096] interrupt > ERROR: root 3857 
EXTENT_DATA[1423365 4096] interrupt > ERROR: root 3857 EXTENT_DATA[1425985 4096] interrupt > ERROR: root 3857 EXTENT_DATA[1429229 4096] interrupt > ERROR: root 3857 EXTENT_DATA[1430615 4096] interrupt > ERROR: root 3857 EXTENT_DATA[1443769 4096] interrupt > ERROR: root 3860 EXTENT_DATA[599089 4096] interrupt > > (not finished, still going on) > > Marc > ^ permalink raw reply [flat|nested] 77+ messages in thread
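The "interrupt" explanation above (a gap before a file extent, which is only allowed with NO_HOLES) can be illustrated with a simplified model. This is a sketch under stated assumptions, not the btrfs-progs implementation: it treats an inode's file extents as plain `(file_offset, length)` pairs, ignores inline extents and block alignment, and `find_discontinuities` is a hypothetical helper name.

```python
def find_discontinuities(extents):
    """Return the file offsets of extents that are preceded by an uncovered gap.

    `extents` is an iterable of (file_offset, length) pairs for one inode.
    Without NO_HOLES, every byte range from 0 up to the end of the last
    extent is expected to be covered by some extent item (holes included),
    so any extent that starts beyond the covered range is reported.
    """
    gaps = []
    covered_until = 0  # end of the range covered so far
    for offset, length in sorted(extents):
        if offset > covered_until:
            gaps.append(offset)
        covered_until = max(covered_until, offset + length)
    return gaps


if __name__ == "__main__":
    # Contiguous extents: nothing to report.
    print(find_discontinuities([(0, 4096), (4096, 4096)]))
    # An extent at file offset 4096 with nothing covering 0..4096,
    # analogous in shape to "EXTENT_DATA[108864 4096] interrupt".
    print(find_discontinuities([(4096, 4096)]))
```

In this model, a report for offset 4096 simply means nothing was found covering the first 4096 bytes of the file; whether that comes from an inlined compressed extent, as suggested above, cannot be determined from the offsets alone.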
* Re: How to fix errors that check --mode lomem finds, but --mode normal doesn't? 2017-06-22 2:22 ` Qu Wenruo @ 2017-06-22 2:53 ` Marc MERLIN 2017-06-22 4:08 ` Qu Wenruo 2017-06-22 4:08 ` Qu Wenruo 0 siblings, 2 replies; 77+ messages in thread From: Marc MERLIN @ 2017-06-22 2:53 UTC (permalink / raw) To: Qu Wenruo; +Cc: Chris Murphy, Hugo Mills, Btrfs BTRFS [-- Attachment #1: Type: text/plain, Size: 3830 bytes --] Ok, first it finished (almost 24H) (...) ERROR: root 3862 EXTENT_DATA[18170706 135168] interrupt ERROR: root 3862 EXTENT_DATA[18170706 1048576] interrupt ERROR: root 3864 EXTENT_DATA[109336 4096] interrupt ERROR: errors found in fs roots found 5544779108352 bytes used, error(s) found total csum bytes: 5344523140 total tree bytes: 71323041792 total fs tree bytes: 59288403968 total extent tree bytes: 5378260992 btree space waste bytes: 10912166856 file data blocks allocated: 7830914256896 referenced 6244104495104 Thanks for your reply Qu On Thu, Jun 22, 2017 at 10:22:57AM +0800, Qu Wenruo wrote: > >gargamel:~# btrfs check -p --mode lowmem /dev/mapper/dshelf2 > >Checking filesystem on /dev/mapper/dshelf2 > >UUID: 85441c59-ad11-4b25-b1fe-974f9e4acede > >ERROR: extent[3886187384832, 81920] referencer count mismatch (root: > >11930, owner: 375444, offset: 1851654144) wanted: 1, have: 4 > > This means that in extent tree, btrfs says there is only one referring > to this extent, but lowmem mode find 4. > > It would provide great help if you could dump extent tree for it. 
> # btrfs-debug-tree <dev> | grep -C 10 3886187384832 extent data backref root 11712 objectid 375444 offset 1851572224 count 1 extent data backref root 11276 objectid 375444 offset 1851572224 count 1 extent data backref root 11058 objectid 375444 offset 1851572224 count 1 extent data backref root 11494 objectid 375444 offset 1851572224 count 1 item 37 key (3886187352064 EXTENT_ITEM 32768) itemoff 11381 itemsize 140 extent refs 4 gen 32382 flags DATA extent data backref root 11712 objectid 375444 offset 1851596800 count 1 extent data backref root 11276 objectid 375444 offset 1851596800 count 1 extent data backref root 11058 objectid 375444 offset 1851596800 count 1 extent data backref root 11494 objectid 375444 offset 1851596800 count 1 item 38 key (3886187384832 EXTENT_ITEM 81920) itemoff 11212 itemsize 169 extent refs 16 gen 32382 flags DATA extent data backref root 11712 objectid 375444 offset 1851654144 count 4 extent data backref root 11276 objectid 375444 offset 1851654144 count 4 extent data backref root 11058 objectid 375444 offset 1851654144 count 3 extent data backref root 11494 objectid 375444 offset 1851654144 count 4 extent data backref root 11930 objectid 375444 offset 1851654144 count 1 item 39 key (3886187466752 EXTENT_ITEM 16384) itemoff 11043 itemsize 169 extent refs 5 gen 32382 flags DATA extent data backref root 11712 objectid 375444 offset 1851744256 count 1 extent data backref root 11276 objectid 375444 offset 1851744256 count 1 > >ERROR: errors found in extent allocation tree or chunk allocation > >cache and super generation don't match, space cache will be invalidated > >ERROR: root 3857 EXTENT_DATA[108864 4096] interrupt > > This means that, for root 3857, inode 108864, file offset 4096, there is > a gap before that extent. > In NO_HOLES mode it's allowed, but if NO_HOLES incompat flag is not set, > this should be a problem. > > I wonder if this is a problem caused by inlined compressed file extent. 
> > This can also be dumped by the following command. > # btrfs-debug-tree -t 3857 <dev> | grep -C 10 108864 This one is much bigger (192KB), I've bzipped and attached it. Thanks for having a look, I appreciate it. Marc -- "A mouse is a device used to point at the xterm you want to type in" - A.S.R. Microsoft is to operating systems .... .... what McDonalds is to gourmet cooking Home page: http://marc.merlins.org/ [-- Attachment #2: out.bz2 --] [-- Type: application/octet-stream, Size: 23826 bytes --] ^ permalink raw reply [flat|nested] 77+ messages in thread
* Re: How to fix errors that check --mode lomem finds, but --mode normal doesn't? 2017-06-22 2:53 ` Marc MERLIN @ 2017-06-22 4:08 ` Qu Wenruo 2017-06-23 4:06 ` Marc MERLIN 2017-06-22 4:08 ` Qu Wenruo 1 sibling, 1 reply; 77+ messages in thread From: Qu Wenruo @ 2017-06-22 4:08 UTC (permalink / raw) To: Marc MERLIN; +Cc: Chris Murphy, Hugo Mills, Btrfs BTRFS At 06/22/2017 10:53 AM, Marc MERLIN wrote: > Ok, first it finished (almost 24H) > > (...) > ERROR: root 3862 EXTENT_DATA[18170706 135168] interrupt > ERROR: root 3862 EXTENT_DATA[18170706 1048576] interrupt > ERROR: root 3864 EXTENT_DATA[109336 4096] interrupt > ERROR: errors found in fs roots > found 5544779108352 bytes used, error(s) found > total csum bytes: 5344523140 > total tree bytes: 71323041792 > total fs tree bytes: 59288403968 > total extent tree bytes: 5378260992 > btree space waste bytes: 10912166856 > file data blocks allocated: 7830914256896 > referenced 6244104495104 > > Thanks for your reply Qu > > On Thu, Jun 22, 2017 at 10:22:57AM +0800, Qu Wenruo wrote: >>> gargamel:~# btrfs check -p --mode lowmem /dev/mapper/dshelf2 >>> Checking filesystem on /dev/mapper/dshelf2 >>> UUID: 85441c59-ad11-4b25-b1fe-974f9e4acede >>> ERROR: extent[3886187384832, 81920] referencer count mismatch (root: >>> 11930, owner: 375444, offset: 1851654144) wanted: 1, have: 4 >> >> This means that in extent tree, btrfs says there is only one referring >> to this extent, but lowmem mode find 4. >> >> It would provide great help if you could dump extent tree for it. 
>> # btrfs-debug-tree <dev> | grep -C 10 3886187384832
>
>     extent data backref root 11712 objectid 375444 offset 1851572224 count 1
>     extent data backref root 11276 objectid 375444 offset 1851572224 count 1
>     extent data backref root 11058 objectid 375444 offset 1851572224 count 1
>     extent data backref root 11494 objectid 375444 offset 1851572224 count 1
>   item 37 key (3886187352064 EXTENT_ITEM 32768) itemoff 11381 itemsize 140
>     extent refs 4 gen 32382 flags DATA
>     extent data backref root 11712 objectid 375444 offset 1851596800 count 1
>     extent data backref root 11276 objectid 375444 offset 1851596800 count 1
>     extent data backref root 11058 objectid 375444 offset 1851596800 count 1
>     extent data backref root 11494 objectid 375444 offset 1851596800 count 1
>   item 38 key (3886187384832 EXTENT_ITEM 81920) itemoff 11212 itemsize 169
>     extent refs 16 gen 32382 flags DATA
>     extent data backref root 11712 objectid 375444 offset 1851654144 count 4
>     extent data backref root 11276 objectid 375444 offset 1851654144 count 4
>     extent data backref root 11058 objectid 375444 offset 1851654144 count 3
>     extent data backref root 11494 objectid 375444 offset 1851654144 count 4
>     extent data backref root 11930 objectid 375444 offset 1851654144 count 1
>   item 39 key (3886187466752 EXTENT_ITEM 16384) itemoff 11043 itemsize 169
>     extent refs 5 gen 32382 flags DATA
>     extent data backref root 11712 objectid 375444 offset 1851744256 count 1
>     extent data backref root 11276 objectid 375444 offset 1851744256 count 1

Well, this is only the output from the extent tree.

I was also expecting output from the subvolume (11930) tree.

It could be done by
# btrfs-debug-tree -t 11930 | grep -C 10 3886187384832

But please note that this dump may contain filenames; feel free to
mask the filenames.
Thanks,
Qu

>
>
>>> ERROR: errors found in extent allocation tree or chunk allocation
>>> cache and super generation don't match, space cache will be invalidated
>>> ERROR: root 3857 EXTENT_DATA[108864 4096] interrupt
>>
>> This means that, for root 3857, inode 108864, file offset 4096, there is
>> a gap before that extent.
>> In NO_HOLES mode it's allowed, but if NO_HOLES incompat flag is not set,
>> this should be a problem.
>>
>> I wonder if this is a problem caused by inlined compressed file extent.
>>
>> This can also be dumped by the following command.
>> # btrfs-debug-tree -t 3857 <dev> | grep -C 10 108864
>
> This one is much bigger (192KB), I've bzipped and attached it.

Thanks for this one.
It is indeed caused by an inlined compressed extent.

Lu Fengqi will send a patch fixing it.

Thanks,
Qu

>
> Thanks for having a look, I appreciate it.
>
> Marc
>

^ permalink raw reply [flat|nested] 77+ messages in thread
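The "interrupt" message explained above (a gap before a file extent, which is only acceptable when NO_HOLES is set) can be illustrated with a small sketch. This is only an assumption-laden illustration in Python, not the btrfs-progs implementation: `find_gaps`, its `(file_offset, num_bytes)` tuples, and the sample numbers are all hypothetical.

```python
def find_gaps(extents, no_holes=False):
    """Return (gap_start, gap_end) ranges not covered by any file extent.

    extents: list of (file_offset, num_bytes) tuples for ONE inode,
             sorted by file_offset (hypothetical input format).
    no_holes: when True, gaps are allowed and nothing is reported,
              mirroring the NO_HOLES remark in the thread.
    """
    if no_holes:
        return []
    gaps = []
    expected = 0  # file offset where the next extent should start
    for file_offset, num_bytes in extents:
        if file_offset > expected:
            gaps.append((expected, file_offset))
        expected = max(expected, file_offset + num_bytes)
    return gaps


# Made-up example: a first extent covering 2048 bytes, then an extent
# starting at file offset 4096, leaves the range [2048, 4096) uncovered.
sample = [(0, 2048), (4096, 4096)]
print(find_gaps(sample))
print(find_gaps(sample, no_holes=True))
```

The sketch reports the gap by the offset where coverage stops, whereas the check output in this thread names the extent that follows the gap (for example file offset 4096).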
* Re: How to fix errors that check --mode lomem finds, but --mode normal doesn't? 2017-06-22 4:08 ` Qu Wenruo @ 2017-06-23 4:06 ` Marc MERLIN 2017-06-23 8:54 ` Lu Fengqi 0 siblings, 1 reply; 77+ messages in thread From: Marc MERLIN @ 2017-06-23 4:06 UTC (permalink / raw) To: Qu Wenruo; +Cc: Chris Murphy, Hugo Mills, Btrfs BTRFS On Thu, Jun 22, 2017 at 12:08:44PM +0800, Qu Wenruo wrote: > > On Thu, Jun 22, 2017 at 10:22:57AM +0800, Qu Wenruo wrote: > > > > gargamel:~# btrfs check -p --mode lowmem /dev/mapper/dshelf2 > > > > Checking filesystem on /dev/mapper/dshelf2 > > > > UUID: 85441c59-ad11-4b25-b1fe-974f9e4acede > > > > ERROR: extent[3886187384832, 81920] referencer count mismatch (root: > > > > 11930, owner: 375444, offset: 1851654144) wanted: 1, have: 4 > > > > > > This means that in extent tree, btrfs says there is only one referring > > > to this extent, but lowmem mode find 4. > > > > > > It would provide great help if you could dump extent tree for it. > > > # btrfs-debug-tree <dev> | grep -C 10 3886187384832 > > extent data backref root 11712 objectid 375444 offset 1851572224 count 1 > > extent data backref root 11276 objectid 375444 offset 1851572224 count 1 > > extent data backref root 11058 objectid 375444 offset 1851572224 count 1 > > extent data backref root 11494 objectid 375444 offset 1851572224 count 1 > > item 37 key (3886187352064 EXTENT_ITEM 32768) itemoff 11381 itemsize 140 > > extent refs 4 gen 32382 flags DATA > > extent data backref root 11712 objectid 375444 offset 1851596800 count 1 > > extent data backref root 11276 objectid 375444 offset 1851596800 count 1 > > extent data backref root 11058 objectid 375444 offset 1851596800 count 1 > > extent data backref root 11494 objectid 375444 offset 1851596800 count 1 > > item 38 key (3886187384832 EXTENT_ITEM 81920) itemoff 11212 itemsize 169 > > extent refs 16 gen 32382 flags DATA > > extent data backref root 11712 objectid 375444 offset 1851654144 count 4 > > extent data backref root 11276 
objectid 375444 offset 1851654144 count 4 > > extent data backref root 11058 objectid 375444 offset 1851654144 count 3 > > extent data backref root 11494 objectid 375444 offset 1851654144 count 4 > > extent data backref root 11930 objectid 375444 offset 1851654144 count 1 > > item 39 key (3886187466752 EXTENT_ITEM 16384) itemoff 11043 itemsize 169 > > extent refs 5 gen 32382 flags DATA > > extent data backref root 11712 objectid 375444 offset 1851744256 count 1 > > extent data backref root 11276 objectid 375444 offset 1851744256 count 1 > > Well, there is only the output from extent tree. > > I was also expecting output from subvolue (11930) tree. > > It could be done by > # btrfs-debug-tree -t 11930 | grep -C 10 3886187384832 > > But please pay attention that, this dump may contain filenames, feel free to > mask the filenames. There you go: gargamel:~# btrfs-debug-tree /dev/mapper/dsh | grep -C 10 3886187384832 dshelf1@ dshelf2@ extent compression 0 (none) item 201 key (375444 EXTENT_DATA 1851654144) itemoff 5577 itemsize 53 extent data disk byte 5613689888768 nr 8192 extent data offset 0 nr 8192 ram 8192 extent compression 0 (none) item 3 key (375444 EXTENT_DATA 1851744256) itemoff 16071 itemsize 53 generation 32382 type 1 (regular) extent data disk byte 5613689888768 nr 8192 extent data offset 0 nr 8192 ram 8192 extent compression 0 (none) item 201 key (375444 EXTENT_DATA 1851654144) itemoff 5577 itemsize 53 generation 32961 type 1 (regular) extent data disk byte 4686293291008 nr 16384 extent data offset 0 nr 16384 ram 16384 extent compression 0 (none) item 202 key (375444 EXTENT_DATA 1851670528) itemoff 5524 itemsize 53 generation 32382 type 1 (regular) extent data disk byte 3886187384832 nr 81920 extent data offset 16384 nr 8192 ram 81920 extent compression 0 (none) item 203 key (375444 EXTENT_DATA 1851678720) itemoff 5471 itemsize 53 generation 33534 type 1 (regular) extent data disk byte 5540480962560 nr 8192 extent data offset 0 nr 8192 ram 8192 extent 
compression 0 (none) item 204 key (375444 EXTENT_DATA 1851686912) itemoff 5418 itemsize 53 generation 32961 type 1 (regular) extent data disk byte 4686293307392 nr 16384 extent data offset 8192 nr 8192 ram 16384 extent compression 0 (none) item 205 key (375444 EXTENT_DATA 1851695104) itemoff 5365 itemsize 53 generation 32382 type 1 (regular) extent data disk byte 3886187384832 nr 81920 extent data offset 40960 nr 8192 ram 81920 extent compression 0 (none) item 206 key (375444 EXTENT_DATA 1851703296) itemoff 5312 itemsize 53 generation 32961 type 1 (regular) extent data disk byte 4686293323776 nr 8192 extent data offset 0 nr 8192 ram 8192 extent compression 0 (none) item 207 key (375444 EXTENT_DATA 1851711488) itemoff 5259 itemsize 53 generation 32382 type 1 (regular) extent data disk byte 3886187384832 nr 81920 extent data offset 57344 nr 8192 ram 81920 extent compression 0 (none) leaf 5715801047040 items 105 free space 8093 generation 36595 owner 11930 fs uuid 85441c59-ad11-4b25-b1fe-974f9e4acede chunk uuid ed705b7b-2fa6-43f6-a4a1-941c8463ee68 item 0 key (375444 EXTENT_DATA 1851719680) itemoff 16230 itemsize 53 generation 34868 type 1 (regular) extent data disk byte 5591266127872 nr 8192 extent data offset 0 nr 8192 ram 8192 extent compression 0 (none) item 1 key (375444 EXTENT_DATA 1851727872) itemoff 16177 itemsize 53 generation 32382 type 1 (regular) extent data disk byte 3886187384832 nr 81920 extent data offset 73728 nr 8192 ram 81920 extent compression 0 (none) item 2 key (375444 EXTENT_DATA 1851736064) itemoff 16124 itemsize 53 generation 31782 type 1 (regular) extent data disk byte 5922189430784 nr 106496 extent data offset 81920 nr 8192 ram 106496 extent compression 0 (none) item 3 key (375444 EXTENT_DATA 1851744256) itemoff 16071 itemsize 53 generation 32382 type 1 (regular) extent data disk byte 3886187466752 nr 16384 > Thanks for this one. > And it is caused by inlined compressed extent. > > Lu Fengqi will send patch fixing it. 
I got the patch and will test it, thank you. Marc -- "A mouse is a device used to point at the xterm you want to type in" - A.S.R. Microsoft is to operating systems .... .... what McDonalds is to gourmet cooking Home page: http://marc.merlins.org/ | PGP 1024R/763BE901 ^ permalink raw reply [flat|nested] 77+ messages in thread
* Re: How to fix errors that check --mode lomem finds, but --mode normal doesn't? 2017-06-23 4:06 ` Marc MERLIN @ 2017-06-23 8:54 ` Lu Fengqi 2017-06-23 16:17 ` Marc MERLIN 0 siblings, 1 reply; 77+ messages in thread
From: Lu Fengqi @ 2017-06-23 8:54 UTC (permalink / raw)
To: Marc MERLIN; +Cc: Qu Wenruo, Chris Murphy, Hugo Mills, Btrfs BTRFS

On 2017-06-23 12:06, Marc MERLIN wrote:
>> Well, there is only the output from extent tree.
>>
>> I was also expecting output from subvolume (11930) tree.
>>
>> It could be done by
>> # btrfs-debug-tree -t 11930 | grep -C 10 3886187384832
>>

I apologize if this was not made clear.

>> But please pay attention that, this dump may contain filenames, feel free to
>> mask the filenames.
>
> There you go:
> gargamel:~# btrfs-debug-tree /dev/mapper/dsh | grep -C 10 3886187384832

Could you dump the file tree (11930) with the following command?
# btrfs-debug-tree -t 11930 /dev/mapper/dsh | grep -C 10 3886187384832

I wonder if this extent is referenced by this file tree four times.

I hope this will not cause you too much trouble.

-- 
Thanks,
Lu

^ permalink raw reply [flat|nested] 77+ messages in thread
* Re: How to fix errors that check --mode lomem finds, but --mode normal doesn't? 2017-06-23 8:54 ` Lu Fengqi @ 2017-06-23 16:17 ` Marc MERLIN 2017-06-24 2:34 ` Marc MERLIN 0 siblings, 1 reply; 77+ messages in thread From: Marc MERLIN @ 2017-06-23 16:17 UTC (permalink / raw) To: Lu Fengqi; +Cc: Qu Wenruo, Chris Murphy, Hugo Mills, Btrfs BTRFS On Fri, Jun 23, 2017 at 04:54:01PM +0800, Lu Fengqi wrote: > On 2017年06月23日 12:06, Marc MERLIN wrote: > > > Well, there is only the output from extent tree. > > > > > > I was also expecting output from subvolue (11930) tree. > > > > > > It could be done by > > > # btrfs-debug-tree -t 11930 | grep -C 10 3886187384832 > > > > I apologize if this was not made clear. > > > > But please pay attention that, this dump may contain filenames, feel free to > > > mask the filenames. > > There you go: > > gargamel:~# btrfs-debug-tree /dev/mapper/dsh | grep -C 10 3886187384832 > > Could you dump file tree (11930) by the following command. > # btrfs-debug-tree -t 11930 /dev/mapper/dsh | grep -C 10 3886187384832 Sure thing, there you go: extent data disk byte 5613689888768 nr 8192 extent data offset 0 nr 8192 ram 8192 extent compression 0 (none) item 201 key (375444 EXTENT_DATA 1851654144) itemoff 5577 itemsize 53 generation 32961 type 1 (regular) extent data disk byte 4686293291008 nr 16384 extent data offset 0 nr 16384 ram 16384 extent compression 0 (none) item 202 key (375444 EXTENT_DATA 1851670528) itemoff 5524 itemsize 53 generation 32382 type 1 (regular) extent data disk byte 3886187384832 nr 81920 extent data offset 16384 nr 8192 ram 81920 extent compression 0 (none) item 203 key (375444 EXTENT_DATA 1851678720) itemoff 5471 itemsize 53 generation 33534 type 1 (regular) extent data disk byte 5540480962560 nr 8192 extent data offset 0 nr 8192 ram 8192 extent compression 0 (none) item 204 key (375444 EXTENT_DATA 1851686912) itemoff 5418 itemsize 53 generation 32961 type 1 (regular) extent data disk byte 4686293307392 nr 16384 extent data 
offset 8192 nr 8192 ram 16384 extent compression 0 (none) item 205 key (375444 EXTENT_DATA 1851695104) itemoff 5365 itemsize 53 generation 32382 type 1 (regular) extent data disk byte 3886187384832 nr 81920 extent data offset 40960 nr 8192 ram 81920 extent compression 0 (none) item 206 key (375444 EXTENT_DATA 1851703296) itemoff 5312 itemsize 53 generation 32961 type 1 (regular) extent data disk byte 4686293323776 nr 8192 extent data offset 0 nr 8192 ram 8192 extent compression 0 (none) item 207 key (375444 EXTENT_DATA 1851711488) itemoff 5259 itemsize 53 generation 32382 type 1 (regular) extent data disk byte 3886187384832 nr 81920 extent data offset 57344 nr 8192 ram 81920 extent compression 0 (none) leaf 5715801047040 items 105 free space 8093 generation 36595 owner 11930 leaf 5715801047040 flags 0x1(WRITTEN) backref revision 1 fs uuid 85441c59-ad11-4b25-b1fe-974f9e4acede chunk uuid ed705b7b-2fa6-43f6-a4a1-941c8463ee68 item 0 key (375444 EXTENT_DATA 1851719680) itemoff 16230 itemsize 53 generation 34868 type 1 (regular) extent data disk byte 5591266127872 nr 8192 extent data offset 0 nr 8192 ram 8192 extent compression 0 (none) item 1 key (375444 EXTENT_DATA 1851727872) itemoff 16177 itemsize 53 generation 32382 type 1 (regular) extent data disk byte 3886187384832 nr 81920 extent data offset 73728 nr 8192 ram 81920 extent compression 0 (none) item 2 key (375444 EXTENT_DATA 1851736064) itemoff 16124 itemsize 53 generation 31782 type 1 (regular) extent data disk byte 5922189430784 nr 106496 extent data offset 81920 nr 8192 ram 106496 extent compression 0 (none) item 3 key (375444 EXTENT_DATA 1851744256) itemoff 16071 itemsize 53 generation 32382 type 1 (regular) extent data disk byte 3886187466752 nr 16384 Thanks for looking at this. I have applied your patch and I'm still re-running check in lowmem. It takes about 24H so I'll post the full results when it's done. Marc -- "A mouse is a device used to point at the xterm you want to type in" - A.S.R. 
Microsoft is to operating systems .... .... what McDonalds is to gourmet cooking Home page: http://marc.merlins.org/ | PGP 1024R/763BE901 ^ permalink raw reply [flat|nested] 77+ messages in thread
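The dump of tree 11930 above can be cross-checked against the "wanted: 1, have: 4" report by hand. The sketch below (Python, illustrative only) assumes that an extent data backref offset equals the file offset of the EXTENT_DATA key minus the "extent data offset" field; the tuple layout and the `count_backrefs` helper are hypothetical, while the numbers are copied from items 202, 205, 207 and item 1 of the following leaf in the dump.

```python
from collections import Counter

# (file offset from the EXTENT_DATA key, disk byte, extent data offset),
# copied from the tree 11930 dump in this thread for inode 375444.
FILE_EXTENTS = [
    (1851670528, 3886187384832, 16384),  # item 202
    (1851695104, 3886187384832, 40960),  # item 205
    (1851711488, 3886187384832, 57344),  # item 207
    (1851727872, 3886187384832, 73728),  # item 1 of the next leaf
]


def count_backrefs(file_extents, disk_bytenr):
    """Count references to disk_bytenr, grouped by computed backref offset."""
    counts = Counter()
    for file_offset, bytenr, extent_offset in file_extents:
        if bytenr == disk_bytenr:
            # Assumption: backref offset = file offset - offset into extent.
            counts[file_offset - extent_offset] += 1
    return counts


counts = count_backrefs(FILE_EXTENTS, 3886187384832)
for backref_offset, count in counts.items():
    print(f"backref offset {backref_offset}: {count} reference(s)")
```

Under that assumption, all four file extents visible in the dump map to backref offset 1851654144, which is the offset named in the error. That matches the "have: 4" side of the report, while the extent tree item quoted earlier records "count 1" for root 11930. The sketch does not say which side is wrong.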
* Re: How to fix errors that check --mode lomem finds, but --mode normal doesn't? 2017-06-23 16:17 ` Marc MERLIN @ 2017-06-24 2:34 ` Marc MERLIN 2017-06-26 10:46 ` Lu Fengqi 0 siblings, 1 reply; 77+ messages in thread From: Marc MERLIN @ 2017-06-24 2:34 UTC (permalink / raw) To: Lu Fengqi; +Cc: Qu Wenruo, Chris Murphy, Hugo Mills, Btrfs BTRFS On Fri, Jun 23, 2017 at 09:17:50AM -0700, Marc MERLIN wrote: > Thanks for looking at this. > I have applied your patch and I'm still re-running check in lowmem. It takes about 24H so I'll > post the full results when it's done. Ok, here is the output of the check with btrfs-progs freshly synced from git, including Lu's just added patch. Obviously while I'm happy to give further debug info on why my filesystem is in that state and while check --repair sees nothing to repair, suggestions on how to clean those warnings up, unless they are not going to affect filesystem operation, would be greatly appreciated :) Thanks, Marc Checking filesystem on /dev/mapper/dshelf2 UUID: 85441c59-ad11-4b25-b1fe-974f9e4acede ERROR: extent[3886187384832, 81920] referencer count mismatch (root: 11930, owner: 375444, offset: 1851654144) wanted: 1, have: 4 ERROR: extent[3886189391872, 122880] referencer count mismatch (root: 11712, owner: 863395, offset: 79659008) wanted: 1, have: 2 ERROR: extent[3886189391872, 122880] referencer count mismatch (root: 11494, owner: 863395, offset: 79659008) wanted: 1, have: 2 ERROR: extent[3886189391872, 122880] referencer count mismatch (root: 11930, owner: 863395, offset: 79659008) wanted: 1, have: 2 ERROR: extent[3933249708032, 69632] referencer count mismatch (root: 12424, owner: 6945, offset: 2083389440) wanted: 1, have: 2 ERROR: extent[3933249708032, 69632] referencer count mismatch (root: 12172, owner: 6945, offset: 2083389440) wanted: 1, have: 2 ERROR: extent[4571729862656, 876544] referencer count mismatch (root: 11058, owner: 375442, offset: 907706368) wanted: 6, have: 21 ERROR: extent[4641490833408, 
270336] referencer count mismatch (root: 11712, owner: 375444, offset: 1848672256) wanted: 3, have: 5 ERROR: extent[4641490833408, 270336] referencer count mismatch (root: 11276, owner: 375444, offset: 1848672256) wanted: 3, have: 5 ERROR: extent[4641490833408, 270336] referencer count mismatch (root: 11058, owner: 375444, offset: 1848672256) wanted: 3, have: 5 ERROR: extent[4641490833408, 270336] referencer count mismatch (root: 11494, owner: 375444, offset: 1848672256) wanted: 3, have: 5 ERROR: extent[4658555617280, 122880] referencer count mismatch (root: 11712, owner: 375444, offset: 1848705024) wanted: 1, have: 3 ERROR: extent[4658555617280, 122880] referencer count mismatch (root: 11276, owner: 375444, offset: 1848705024) wanted: 1, have: 3 ERROR: extent[4658555617280, 122880] referencer count mismatch (root: 11058, owner: 375444, offset: 1848705024) wanted: 1, have: 3 ERROR: extent[4658555617280, 122880] referencer count mismatch (root: 11494, owner: 375444, offset: 1848705024) wanted: 1, have: 3 ERROR: extent[4677858123776, 417792] referencer count mismatch (root: 11712, owner: 863395, offset: 79523840) wanted: 1, have: 3 ERROR: extent[4677858123776, 417792] referencer count mismatch (root: 11494, owner: 863395, offset: 79523840) wanted: 1, have: 3 ERROR: extent[4677858123776, 417792] referencer count mismatch (root: 11930, owner: 863395, offset: 79523840) wanted: 1, have: 3 ERROR: extent[4698380947456, 409600] referencer count mismatch (root: 11930, owner: 375444, offset: 1851596800) wanted: 3, have: 4 ERROR: extent[4720470421504, 667648] referencer count mismatch (root: 11058, owner: 3463478, offset: 2334720) wanted: 2, have: 10 ERROR: extent[4783941246976, 65536] referencer count mismatch (root: 9365, owner: 24493, offset: 4562944) wanted: 2, have: 3 ERROR: extent[5077564477440, 106496] referencer count mismatch (root: 9370, owner: 1602694, offset: 734756864) wanted: 1, have: 2 ERROR: extent[5136306929664, 131489792] referencer count mismatch (root: 
11712, owner: 375441, offset: 910999552) wanted: 16, have: 1864 ERROR: extent[5136306929664, 131489792] referencer count mismatch (root: 11276, owner: 375441, offset: 910999552) wanted: 867, have: 1865 ERROR: extent[5136306929664, 131489792] referencer count mismatch (root: 11058, owner: 375441, offset: 910999552) wanted: 126, have: 1872 ERROR: extent[5136306929664, 131489792] referencer count mismatch (root: 11494, owner: 375441, offset: 910999552) wanted: 866, have: 1864 ERROR: extent[5136306929664, 131489792] referencer count mismatch (root: 11930, owner: 375441, offset: 910999552) wanted: 861, have: 1859 ERROR: extent[5136649891840, 66781184] referencer count mismatch (root: 11058, owner: 375442, offset: 192659456) wanted: 5, have: 19 ERROR: extent[5136879157248, 134217728] referencer count mismatch (root: 11930, owner: 375442, offset: 394543104) wanted: 10, have: 33 ERROR: extent[5137380671488, 80945152] referencer count mismatch (root: 11058, owner: 375442, offset: 875233280) wanted: 1, have: 21 ERROR: extent[5137380671488, 80945152] referencer count mismatch (root: 11930, owner: 375442, offset: 875233280) wanted: 11, have: 21 ERROR: extent[5138641395712, 524288] referencer count mismatch (root: 11494, owner: 375445, offset: 39845888) wanted: 1, have: 3 ERROR: extent[5190245990400, 53248] referencer count mismatch (root: 11712, owner: 863395, offset: 51118080) wanted: 1, have: 4 ERROR: extent[5190245990400, 53248] referencer count mismatch (root: 11276, owner: 863395, offset: 51118080) wanted: 1, have: 4 ERROR: extent[5190245990400, 53248] referencer count mismatch (root: 11494, owner: 863395, offset: 51118080) wanted: 1, have: 4 ERROR: extent[5190274174976, 77824] referencer count mismatch (root: 11276, owner: 863395, offset: 74952704) wanted: 3, have: 5 ERROR: extent[5190274174976, 77824] referencer count mismatch (root: 11058, owner: 863395, offset: 74952704) wanted: 3, have: 5 ERROR: extent[5190274174976, 77824] referencer count mismatch (root: 11494, 
owner: 863395, offset: 74952704) wanted: 3, have: 5 ERROR: extent[5190274174976, 77824] referencer count mismatch (root: 11930, owner: 863395, offset: 74952704) wanted: 3, have: 5 ERROR: extent[5190275895296, 77824] referencer count mismatch (root: 11712, owner: 863395, offset: 77705216) wanted: 1, have: 6 ERROR: extent[5190275895296, 77824] referencer count mismatch (root: 11276, owner: 863395, offset: 77705216) wanted: 5, have: 6 ERROR: extent[5190275895296, 77824] referencer count mismatch (root: 11058, owner: 863395, offset: 77705216) wanted: 5, have: 6 ERROR: extent[5427326599168, 61440] referencer count mismatch (root: 9370, owner: 1225712, offset: 29753344) wanted: 1, have: 5 ERROR: extent[5456623030272, 24576] referencer count mismatch (root: 11058, owner: 2278892, offset: 786432) wanted: 2, have: 3 ERROR: extent[5851251269632, 134217728] referencer count mismatch (root: 9370, owner: 1602695, offset: 534061056) wanted: 3, have: 4 ERROR: errors found in extent allocation tree or chunk allocation cache and super generation don't match, space cache will be invalidated ERROR: root 3862 EXTENT_DATA[18170706 4096] interrupt ERROR: root 3862 EXTENT_DATA[18170706 16384] interrupt ERROR: root 3862 EXTENT_DATA[18170706 20480] interrupt ERROR: root 3862 EXTENT_DATA[18170706 135168] interrupt ERROR: root 3862 EXTENT_DATA[18170706 1048576] interrupt ERROR: errors found in fs roots found 5544779108352 bytes used, error(s) found total csum bytes: 5344523140 total tree bytes: 71323041792 total fs tree bytes: 59288403968 total extent tree bytes: 5378260992 btree space waste bytes: 10912166856 file data blocks allocated: 7830914256896 referenced 6244104495104 -- "A mouse is a device used to point at the xterm you want to type in" - A.S.R. Microsoft is to operating systems .... .... what McDonalds is to gourmet cooking Home page: http://marc.merlins.org/ ^ permalink raw reply [flat|nested] 77+ messages in thread
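With this many "referencer count mismatch" lines, it can help to group them per extent before deciding which trees to dump. The following Python sketch is only an illustration: the regular expression is inferred from the message format shown in this thread (not taken from btrfs-progs), `summarize` is a hypothetical helper, and the sample holds just three lines copied from the output above.

```python
import re
from collections import defaultdict

# Pattern inferred from the error lines in this thread; \s+ tolerates the
# line wrapping that mail clients introduce inside a message.
MISMATCH = re.compile(
    r"ERROR:\s+extent\[(\d+),\s+(\d+)\]\s+referencer\s+count\s+mismatch\s+"
    r"\(root:\s+(\d+),\s+owner:\s+(\d+),\s+offset:\s+(\d+)\)\s+"
    r"wanted:\s+(\d+),\s+have:\s+(\d+)"
)


def summarize(text):
    """Group mismatch reports by (extent bytenr, extent length)."""
    by_extent = defaultdict(list)
    for match in MISMATCH.finditer(text):
        bytenr, length, root, _owner, _offset, wanted, have = map(int, match.groups())
        by_extent[(bytenr, length)].append((root, wanted, have))
    return dict(by_extent)


SAMPLE = """
ERROR: extent[3886187384832, 81920] referencer count mismatch (root:
11930, owner: 375444, offset: 1851654144) wanted: 1, have: 4
ERROR: extent[3886189391872, 122880] referencer count mismatch (root: 11712, owner: 863395, offset: 79659008) wanted: 1, have: 2
ERROR: extent[3886189391872, 122880] referencer count mismatch (root: 11494, owner: 863395, offset: 79659008) wanted: 1, have: 2
"""

summary = summarize(SAMPLE)
for (bytenr, length), reports in summary.items():
    roots = ", ".join(str(root) for root, _, _ in reports)
    print(f"extent {bytenr} (+{length}): {len(reports)} report(s), roots {roots}")
```

Feeding the full check output to `summarize` would show which extents are reported for several roots at once, which is where a `btrfs-debug-tree -t <root>` dump is most likely to be informative.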
* Re: How to fix errors that check --mode lomem finds, but --mode normal doesn't? 2017-06-24 2:34 ` Marc MERLIN @ 2017-06-26 10:46 ` Lu Fengqi 2017-06-27 23:11 ` Marc MERLIN 0 siblings, 1 reply; 77+ messages in thread
From: Lu Fengqi @ 2017-06-26 10:46 UTC (permalink / raw)
To: Marc MERLIN; +Cc: Qu Wenruo, Chris Murphy, Hugo Mills, Btrfs BTRFS

On 2017-06-24 10:34, Marc MERLIN wrote:
> On Fri, Jun 23, 2017 at 09:17:50AM -0700, Marc MERLIN wrote:
>> Thanks for looking at this.
>> I have applied your patch and I'm still re-running check in lowmem. It takes about 24H so I'll
>> post the full results when it's done.
>
> Ok, here is the output of the check with btrfs-progs freshly synced from
> git, including Lu's just added patch.
>
> Obviously while I'm happy to give further debug info on why my filesystem is in that state and
> while check --repair sees nothing to repair, suggestions on how to clean those warnings up, unless they are not going to affect filesystem operation, would be greatly appreciated :)
>
> Thanks,
> Marc

Thanks for the updated information. I'm sorry that the false alert made
you nervous.

>
> ERROR: root 3862 EXTENT_DATA[18170706 4096] interrupt
> ERROR: root 3862 EXTENT_DATA[18170706 16384] interrupt
> ERROR: root 3862 EXTENT_DATA[18170706 20480] interrupt
> ERROR: root 3862 EXTENT_DATA[18170706 135168] interrupt
> ERROR: root 3862 EXTENT_DATA[18170706 1048576] interrupt
> ERROR: errors found in fs roots

However, this looks like another problem. Could you dump this file tree
with the following command?
# btrfs-debug-tree -t 3862 <dev> | grep -C 10 18170706

-- 
Thanks,
Lu

^ permalink raw reply [flat|nested] 77+ messages in thread
* Re: How to fix errors that check --mode lomem finds, but --mode normal doesn't? 2017-06-26 10:46 ` Lu Fengqi @ 2017-06-27 23:11 ` Marc MERLIN 2017-06-28 7:10 ` Lu Fengqi 0 siblings, 1 reply; 77+ messages in thread From: Marc MERLIN @ 2017-06-27 23:11 UTC (permalink / raw) To: Lu Fengqi; +Cc: Qu Wenruo, Chris Murphy, Hugo Mills, Btrfs BTRFS On Mon, Jun 26, 2017 at 06:46:16PM +0800, Lu Fengqi wrote: > Thanks for the updated information. I'm sorry that the false alert make > you feel nervous. If you can help me find out whether those are real errors that I need to fix (and can't yet since there is no --repair), or whether they are not real problems, I can ignore them as long as the other check --mode normal runs clean (it does), we'll be good :) > >ERROR: root 3862 EXTENT_DATA[18170706 4096] interrupt > >ERROR: root 3862 EXTENT_DATA[18170706 16384] interrupt > >ERROR: root 3862 EXTENT_DATA[18170706 20480] interrupt > >ERROR: root 3862 EXTENT_DATA[18170706 135168] interrupt > >ERROR: root 3862 EXTENT_DATA[18170706 1048576] interrupt > >ERROR: errors found in fs roots > > However, this looks like another problem. Could you dump this file tree > by the following command? 
> # btrfs-debug-tree -t 3862 <dev> | grep -C 10 18170706 argamel:~# btrfs-debug-tree -t 3862 /dev/mapper/dshelf2 | grep -C 10 18170706 transid 522 data_len 0 name_len 45 name: 007b01c69a8d_9582a070_425a45d7@mindspring.com item 89 key (835232 DIR_ITEM 1181375325) itemoff 10304 itemsize 64 location key (877605 INODE_ITEM 0) type FILE transid 530 data_len 0 name_len 34 name: PO.5078346.7051988@codeweavers.com item 90 key (835232 DIR_ITEM 1181489720) itemoff 10226 itemsize 78 location key (873230 INODE_ITEM 0) type FILE transid 529 data_len 0 name_len 48 name: 8-8356087-5CruUNCbuG3Kg9zuO@mail1.fireflypro.com item 91 key (835232 DIR_ITEM 1181707066) itemoff 10148 itemsize 78 location key (869906 INODE_ITEM 0) type FILE transid 528 data_len 0 name_len 48 name: 7-7026088-1uLzZJuFzYD6h4rzV@max.firmalliance.biz item 92 key (835232 DIR_ITEM 1181727135) itemoff 10084 itemsize 64 location key (877380 INODE_ITEM 0) type FILE transid 530 data_len 0 name_len 34 name: NJ.5943286.7059518@codeweavers.com item 93 key (835232 DIR_ITEM 1181873033) itemoff 10038 itemsize 46 location key (859092 INODE_ITEM 0) type FILE transid 526 data_len 0 name_len 16 name: mdadm_detail.0 item 83 key (2640780 DIR_ITEM 3316050734) itemoff 12739 itemsize 39 location key (15689752 INODE_ITEM 0) type FILE transid 8178 data_len 0 name_len 9 name: sda3.dd.0 item 84 key (2640780 DIR_ITEM 3349213389) itemoff 12697 itemsize 42 location key (2667656 INODE_ITEM 0) type FILE transid 885 data_len 0 name_len 12 name: sdb2.dd.1.gz item 85 key (2640780 DIR_ITEM 3351742419) itemoff 12663 itemsize 34 location key (18170706 INODE_ITEM 0) type FILE transid 37866 data_len 0 name_len 4 name: dm-0 item 86 key (2640780 DIR_ITEM 3354578455) itemoff 12624 itemsize 39 location key (13847590 INODE_ITEM 0) type FILE transid 2387 data_len 0 name_len 9 name: sda7.3.gz item 87 key (2640780 DIR_ITEM 3361267344) itemoff 12586 itemsize 38 location key (2667594 INODE_ITEM 0) type FILE transid 885 data_len 0 name_len 8 name: .profile -- 
name: sdc1.dd.4.gz item 70 key (2640780 DIR_INDEX 1685) itemoff 13162 itemsize 42 location key (17548883 INODE_ITEM 0) type FILE transid 34469 data_len 0 name_len 12 name: sdc1.dd.5.gz item 71 key (2640780 DIR_INDEX 1687) itemoff 13120 itemsize 42 location key (17548884 INODE_ITEM 0) type FILE transid 34469 data_len 0 name_len 12 name: sdc1.dd.6.gz item 72 key (2640780 DIR_INDEX 2039) itemoff 13086 itemsize 34 location key (18170706 INODE_ITEM 0) type FILE transid 37866 data_len 0 name_len 4 name: dm-0 item 73 key (2640780 DIR_INDEX 2041) itemoff 13051 itemsize 35 location key (18170707 INODE_ITEM 0) type FILE transid 37866 data_len 0 name_len 5 name: fdisk item 74 key (2640780 DIR_INDEX 2043) itemoff 13007 itemsize 44 location key (18170708 INODE_ITEM 0) type FILE transid 37866 data_len 0 name_len 14 name: mdadm_detail.0 Marc -- "A mouse is a device used to point at the xterm you want to type in" - A.S.R. Microsoft is to operating systems .... .... what McDonalds is to gourmet cooking Home page: http://marc.merlins.org/ ^ permalink raw reply [flat|nested] 77+ messages in thread
* Re: How to fix errors that check --mode lomem finds, but --mode normal doesn't?
  2017-06-27 23:11 ` Marc MERLIN
@ 2017-06-28  7:10   ` Lu Fengqi
  2017-06-28 14:43     ` Marc MERLIN
  0 siblings, 1 reply; 77+ messages in thread
From: Lu Fengqi @ 2017-06-28 7:10 UTC (permalink / raw)
  To: Marc MERLIN; +Cc: Qu Wenruo, Chris Murphy, Hugo Mills, Btrfs BTRFS

On Tue, Jun 27, 2017 at 04:11:46PM -0700, Marc MERLIN wrote:
>On Mon, Jun 26, 2017 at 06:46:16PM +0800, Lu Fengqi wrote:
>> Thanks for the updated information. I'm sorry that the false alert make
>> you feel nervous.
>
>If you can help me find out whether those are real errors that I need to fix
>(and can't yet since there is no --repair), or whether they are not real
>problems, I can ignore them as long as the other check --mode normal runs
>clean (it does), we'll be good :) :)
>
>> >ERROR: root 3862 EXTENT_DATA[18170706 4096] interrupt
>> >ERROR: root 3862 EXTENT_DATA[18170706 16384] interrupt
>> >ERROR: root 3862 EXTENT_DATA[18170706 20480] interrupt
>> >ERROR: root 3862 EXTENT_DATA[18170706 135168] interrupt
>> >ERROR: root 3862 EXTENT_DATA[18170706 1048576] interrupt
>> >ERROR: errors found in fs roots
>>
>> However, this looks like another problem. Could you dump this file tree
>> by the following command?
>> # btrfs-debug-tree -t 3862 <dev> | grep -C 10 18170706
>

The output is abnormal: except for the relevant DIR_ITEM and DIR_INDEX, I
can't find the above-mentioned INODE_ITEM and EXTENT_DATA. I wonder if the
file system was online when this command was executed? If so, please
re-execute it offline; if not, could you apply my patches and re-check it?

>argamel:~# btrfs-debug-tree -t 3862 /dev/mapper/dshelf2 | grep -C 10 18170706 > transid 522 data_len 0 name_len 45 > name: 007b01c69a8d_9582a070_425a45d7@mindspring.com > item 89 key (835232 DIR_ITEM 1181375325) itemoff 10304 itemsize 64 > location key (877605 INODE_ITEM 0) type FILE > transid 530 data_len 0 name_len 34 > name: PO.5078346.7051988@codeweavers.com > item 90 key (835232 DIR_ITEM 1181489720) itemoff 10226 itemsize 78 > location key (873230 INODE_ITEM 0) type FILE > transid 529 data_len 0 name_len 48 > name: 8-8356087-5CruUNCbuG3Kg9zuO@mail1.fireflypro.com > item 91 key (835232 DIR_ITEM 1181707066) itemoff 10148 itemsize 78 > location key (869906 INODE_ITEM 0) type FILE > transid 528 data_len 0 name_len 48 > name: 7-7026088-1uLzZJuFzYD6h4rzV@max.firmalliance.biz > item 92 key (835232 DIR_ITEM 1181727135) itemoff 10084 itemsize 64 > location key (877380 INODE_ITEM 0) type FILE > transid 530 data_len 0 name_len 34 > name: NJ.5943286.7059518@codeweavers.com > item 93 key (835232 DIR_ITEM 1181873033) itemoff 10038 itemsize 46 > location key (859092 INODE_ITEM 0) type FILE > transid 526 data_len 0 name_len 16 > name: mdadm_detail.0 > item 83 key (2640780 DIR_ITEM 3316050734) itemoff 12739 itemsize 39 > location key (15689752 INODE_ITEM 0) type FILE > transid 8178 data_len 0 name_len 9 > name: sda3.dd.0 > item 84 key (2640780 DIR_ITEM 3349213389) itemoff 12697 itemsize 42 > location key (2667656 INODE_ITEM 0) type FILE > transid 885 data_len 0 name_len 12 > name: sdb2.dd.1.gz > item 85 key (2640780 DIR_ITEM 3351742419) itemoff 12663 itemsize 34 > location key (18170706 INODE_ITEM 0) type FILE > transid 37866 data_len 0 name_len 4 > name: dm-0 > item 86 key (2640780 DIR_ITEM 3354578455) itemoff 12624 itemsize 39 > location key (13847590 INODE_ITEM 0) type FILE > transid 2387 data_len 0 name_len 9 > name: sda7.3.gz > item 87 key (2640780 DIR_ITEM 3361267344) itemoff 12586 itemsize 38 > location key (2667594 INODE_ITEM 0) type FILE > transid 885 data_len 0 
name_len 8 > name: .profile >-- > name: sdc1.dd.4.gz > item 70 key (2640780 DIR_INDEX 1685) itemoff 13162 itemsize 42 > location key (17548883 INODE_ITEM 0) type FILE > transid 34469 data_len 0 name_len 12 > name: sdc1.dd.5.gz > item 71 key (2640780 DIR_INDEX 1687) itemoff 13120 itemsize 42 > location key (17548884 INODE_ITEM 0) type FILE > transid 34469 data_len 0 name_len 12 > name: sdc1.dd.6.gz > item 72 key (2640780 DIR_INDEX 2039) itemoff 13086 itemsize 34 > location key (18170706 INODE_ITEM 0) type FILE > transid 37866 data_len 0 name_len 4 > name: dm-0 > item 73 key (2640780 DIR_INDEX 2041) itemoff 13051 itemsize 35 > location key (18170707 INODE_ITEM 0) type FILE > transid 37866 data_len 0 name_len 5 > name: fdisk > item 74 key (2640780 DIR_INDEX 2043) itemoff 13007 itemsize 44 > location key (18170708 INODE_ITEM 0) type FILE > transid 37866 data_len 0 name_len 14 > name: mdadm_detail.0 > >Marc >-- >"A mouse is a device used to point at the xterm you want to type in" - A.S.R. >Microsoft is to operating systems .... > .... what McDonalds is to gourmet cooking >Home page: http://marc.merlins.org/ > > -- Thanks, Lu ^ permalink raw reply [flat|nested] 77+ messages in thread
* Re: How to fix errors that check --mode lomem finds, but --mode normal doesn't?
  2017-06-28  7:10 ` Lu Fengqi
@ 2017-06-28 14:43   ` Marc MERLIN
  2017-05-01 17:06     ` 4.11 relocate crash, null pointer Marc MERLIN
  2017-06-29 13:36     ` How to fix errors that check --mode lomem finds, but --mode normal doesn't? Lu Fengqi
  0 siblings, 2 replies; 77+ messages in thread
From: Marc MERLIN @ 2017-06-28 14:43 UTC (permalink / raw)
  To: Lu Fengqi; +Cc: Qu Wenruo, Btrfs BTRFS

[-- Attachment #1: Type: text/plain, Size: 740 bytes --]

[cc trimmed]

On Wed, Jun 28, 2017 at 03:10:27PM +0800, Lu Fengqi wrote:
> The output is abnormal: except for the relevant DIR_ITEM and DIR_INDEX, I
> can't find the above-mentioned INODE_ITEM and EXTENT_DATA. I wonder if the
> file system was online when this command was executed? If so, please
> re-execute it offline; if not, could you apply my patches and re-check it?

The filesystem was offline and I had those 2 patches applied.

Marc
--
"A mouse is a device used to point at the xterm you want to type in" - A.S.R.
Microsoft is to operating systems ....
                                      ....
what McDonalds is to gourmet cooking Home page: http://marc.merlins.org/ | PGP 1024R/763BE901 [-- Attachment #2: p1.patch --] [-- Type: text/x-diff, Size: 4038 bytes --] >From lufq.fnst@cn.fujitsu.com Mon Jun 26 03:37:46 2017 Received: from [59.151.112.132] (port=50126 helo=heian.cn.fujitsu.com) by mail1.merlins.org with esmtp (Exim 4.87 #1) id 1dPROn-0001kT-Ud for <marc@merlins.org>; Mon, 26 Jun 2017 03:37:46 -0700 X-IronPort-AV: E=Sophos;i="5.22,518,1449504000"; d="scan'208";a="20491849" Received: from unknown (HELO cn.fujitsu.com) ([10.167.33.5]) by heian.cn.fujitsu.com with ESMTP; 26 Jun 2017 18:37:30 +0800 Received: from G08CNEXCHPEKD02.g08.fujitsu.local (unknown [10.167.33.83]) by cn.fujitsu.com (Postfix) with ESMTP id 2694647E64CC; Mon, 26 Jun 2017 18:37:30 +0800 (CST) Received: from lufq.5F.lufq.5F (10.167.225.63) by G08CNEXCHPEKD02.g08.fujitsu.local (10.167.33.89) with Microsoft SMTP Server (TLS) id 14.3.319.2; Mon, 26 Jun 2017 18:37:31 +0800 From: Lu Fengqi <lufq.fnst@cn.fujitsu.com> To: <linux-btrfs@vger.kernel.org> CC: <marc@merlins.org> Date: Mon, 26 Jun 2017 18:37:24 +0800 Message-ID: <20170626103727.8945-1-lufq.fnst@cn.fujitsu.com> X-Mailer: git-send-email 2.13.1 MIME-Version: 1.0 Content-Type: text/plain X-Originating-IP: [10.167.225.63] X-yoursite-MailScanner-ID: 2694647E64CC.AB674 X-yoursite-MailScanner: Found to be clean X-yoursite-MailScanner-From: lufq.fnst@cn.fujitsu.com X-Broken-Reverse-DNS: no host name for IP address 59.151.112.132 X-SA-Exim-Connect-IP: 59.151.112.132 X-SA-Exim-Rcpt-To: marc@merlins.org X-SA-Exim-Mail-From: lufq.fnst@cn.fujitsu.com X-Spam-Checker-Version: SpamAssassin 3.4.1-mmrules_20121111 (2015-04-28) on magic.merlins.org X-Spam-Level: X-Spam-Status: No, score=-2.6 required=7.0 tests=BAYES_00,GREYLIST_ISWHITE, RDNS_NONE autolearn=ham autolearn_force=no version=3.4.1-mmrules_20121111 X-Spam-Report: * -1.9 BAYES_00 BODY: Bayes spam probability is 0 to 1% * [score: 0.0000] * 0.8 RDNS_NONE Delivered to internal network by a 
host with no rDNS * -1.5 GREYLIST_ISWHITE The incoming server has been whitelisted for this * receipient and sender Subject: [PATCH v3 1/4] btrfs-progs: lowmem check: Fix false alert about file extent interrupt X-SA-Exim-Version: 4.2.1 (built Tue, 02 Aug 2016 21:08:31 +0000) X-SA-Exim-Scanned: Yes (on mail1.merlins.org) Status: RO Content-Length: 1811 Lines: 52 As Qu mentioned in this thread (https://www.spinics.net/lists/linux-btrfs/msg64469.html), compression can cause regular extent to co-exist with inlined extent. This coexistence makes things confusing. Since it was permitted currently, so fix btrfsck to prevent a bunch of error logs that will make user feel panic. When check file extent, record the extent_end of regular extent to check if there is a gap between the regular extents. Normally there is only one inlined extent, so the extent_end of inlined extent is useless. However, if regular extent can co-exist with inlined extent, the extent_end of inlined extent also need to record. 
Reported-by: Marc MERLIN <marc@merlins.org> Signed-off-by: Lu Fengqi <lufq.fnst@cn.fujitsu.com> --- Changlog: v2: Just fix reported-by v3: Output verbose information when file extent interrupt cmds-check.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) diff --git a/cmds-check.c b/cmds-check.c index c052f66e..70d2b7f2 100644 --- a/cmds-check.c +++ b/cmds-check.c @@ -4782,6 +4782,7 @@ static int check_file_extent(struct btrfs_root *root, struct btrfs_key *fkey, extent_num_bytes, item_inline_len); err |= FILE_EXTENT_ERROR; } + *end += extent_num_bytes; *size += extent_num_bytes; return err; } @@ -4847,8 +4848,8 @@ static int check_file_extent(struct btrfs_root *root, struct btrfs_key *fkey, root->objectid, fkey->objectid, fkey->offset); } else if (!no_holes && *end != fkey->offset) { err |= FILE_EXTENT_ERROR; - error("root %llu EXTENT_DATA[%llu %llu] interrupt", - root->objectid, fkey->objectid, fkey->offset); + error("root %llu EXTENT_DATA[%llu %llu] interrupt, should start at %llu", + root->objectid, fkey->objectid, fkey->offset, *end); } *end += extent_num_bytes; -- 2.13.1 [-- Attachment #3: p2.patch --] [-- Type: text/x-diff, Size: 3267 bytes --] >From lufq.fnst@cn.fujitsu.com Mon Jun 26 03:37:41 2017 Received: from [59.151.112.132] (port=50126 helo=heian.cn.fujitsu.com) by mail1.merlins.org with esmtp (Exim 4.87 #1) id 1dPROj-0001kT-Tq for <marc@merlins.org>; Mon, 26 Jun 2017 03:37:41 -0700 X-IronPort-AV: E=Sophos;i="5.22,518,1449504000"; d="scan'208";a="20491848" Received: from unknown (HELO cn.fujitsu.com) ([10.167.33.5]) by heian.cn.fujitsu.com with ESMTP; 26 Jun 2017 18:37:30 +0800 Received: from G08CNEXCHPEKD02.g08.fujitsu.local (unknown [10.167.33.83]) by cn.fujitsu.com (Postfix) with ESMTP id B3C5047E64D5; Mon, 26 Jun 2017 18:37:30 +0800 (CST) Received: from lufq.5F.lufq.5F (10.167.225.63) by G08CNEXCHPEKD02.g08.fujitsu.local (10.167.33.89) with Microsoft SMTP Server (TLS) id 14.3.319.2; Mon, 26 Jun 2017 18:37:32 +0800 From: Lu Fengqi 
<lufq.fnst@cn.fujitsu.com> To: <linux-btrfs@vger.kernel.org> CC: <marc@merlins.org> Date: Mon, 26 Jun 2017 18:37:25 +0800 Message-ID: <20170626103727.8945-2-lufq.fnst@cn.fujitsu.com> X-Mailer: git-send-email 2.13.1 In-Reply-To: <20170626103727.8945-1-lufq.fnst@cn.fujitsu.com> References: <20170626103727.8945-1-lufq.fnst@cn.fujitsu.com> MIME-Version: 1.0 Content-Type: text/plain X-Originating-IP: [10.167.225.63] X-yoursite-MailScanner-ID: B3C5047E64D5.AC56F X-yoursite-MailScanner: Found to be clean X-yoursite-MailScanner-From: lufq.fnst@cn.fujitsu.com X-Broken-Reverse-DNS: no host name for IP address 59.151.112.132 X-SA-Exim-Connect-IP: 59.151.112.132 X-SA-Exim-Rcpt-To: marc@merlins.org X-SA-Exim-Mail-From: lufq.fnst@cn.fujitsu.com X-Spam-Checker-Version: SpamAssassin 3.4.1-mmrules_20121111 (2015-04-28) on magic.merlins.org X-Spam-Level: X-Spam-Status: No, score=-2.6 required=7.0 tests=BAYES_00,GREYLIST_ISWHITE, RDNS_NONE autolearn=ham autolearn_force=no version=3.4.1-mmrules_20121111 X-Spam-Report: * -1.9 BAYES_00 BODY: Bayes spam probability is 0 to 1% * [score: 0.0000] * 0.8 RDNS_NONE Delivered to internal network by a host with no rDNS * -1.5 GREYLIST_ISWHITE The incoming server has been whitelisted for this * receipient and sender Subject: [PATCH v3 2/4] btrfs-progs: lowmem check: Fix false alert about referencer count mismatch X-SA-Exim-Version: 4.2.1 (built Tue, 02 Aug 2016 21:08:31 +0000) X-SA-Exim-Scanned: Yes (on mail1.merlins.org) Status: O Content-Length: 915 Lines: 29 The normal back reference counting doesn't care about the extent referred by the extent data in the shared leaf. The check_extent_data_backref function need to skip the leaf that owner mismatch with the root_id. 
Reported-by: Marc MERLIN <marc@merlins.org> Signed-off-by: Lu Fengqi <lufq.fnst@cn.fujitsu.com> --- cmds-check.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/cmds-check.c b/cmds-check.c index 70d2b7f2..f42968cd 100644 --- a/cmds-check.c +++ b/cmds-check.c @@ -10692,7 +10692,8 @@ static int check_extent_data_backref(struct btrfs_fs_info *fs_info, leaf = path.nodes[0]; slot = path.slots[0]; - if (slot >= btrfs_header_nritems(leaf)) + if (slot >= btrfs_header_nritems(leaf) || + btrfs_header_owner(leaf) != root_id) goto next; btrfs_item_key_to_cpu(leaf, &key, slot); if (key.objectid != objectid || key.type != BTRFS_EXTENT_DATA_KEY) -- 2.13.1 ^ permalink raw reply related [flat|nested] 77+ messages in thread
* 4.11 relocate crash, null pointer
@ 2017-05-01 17:06 ` Marc MERLIN
  2017-05-01 18:08   ` 4.11 relocate crash, null pointer + rolling back a filesystem by X hours? Marc MERLIN
  0 siblings, 1 reply; 77+ messages in thread
From: Marc MERLIN @ 2017-05-01 17:06 UTC (permalink / raw)
  To: linux-btrfs

I have a filesystem that sadly got corrupted by a SAS card I just installed yesterday.

I don't think that, in a case like this, there is a way to roll back all
writes across all subvolumes in the last 24H, correct?

Is the best thing to go into each subvolume, delete the recent snapshots and
rename the one from 24H ago as the current one?

BTRFS warning (device dm-5): failed to load free space cache for block group 6746013696000, rebuilding it now
BTRFS warning (device dm-5): block group 6754603630592 has wrong amount of free space
BTRFS warning (device dm-5): failed to load free space cache for block group 6754603630592, rebuilding it now
BTRFS warning (device dm-5): block group 7125178777600 has wrong amount of free space
BTRFS warning (device dm-5): failed to load free space cache for block group 7125178777600, rebuilding it now
BTRFS error (device dm-5): bad tree block start 3981076597540270796 2899180224512
BTRFS error (device dm-5): bad tree block start 942082474969670243 2899180224512
BTRFS: error (device dm-5) in __btrfs_free_extent:6944: errno=-5 IO failure
BTRFS info (device dm-5): forced readonly
BTRFS: error (device dm-5) in btrfs_run_delayed_refs:2961: errno=-5 IO failure
BUG: unable to handle kernel NULL pointer dereference at (null)
IP: __del_reloc_root+0x3f/0xa6
PGD 189a0e067
PUD 189a0f067
PMD 0

Oops: 0000 [#1] PREEMPT SMP
Modules linked in: veth ip6table_filter ip6_tables ebtable_nat ebtables ppdev lp xt_addrtype br_netfilter bridge stp llc tun autofs4 softdog binfmt_misc ftdi_sio nfsd auth_rpcgss nfs_acl nfs lockd grace fscache sunrpc ipt_REJECT nf_reject_ipv4 xt_conntrack xt_mark xt_nat xt_tcpudp nf_log_ipv4 nf_log_common xt_LOG iptable_mangle iptable_filter lm85
hwmon_vid pl2303 dm_snapshot dm_bufio iptable_nat ip_tables nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_conntrack_ftp ipt_MASQUERADE nf_nat_masquerade_ipv4 nf_nat nf_conntrack x_tables sg st snd_pcm_oss snd_mixer_oss bcache kvm_intel kvm irqbypass snd_hda_codec_realtek snd_cmipci snd_hda_codec_generic snd_hda_intel snd_mpu401_uart snd_hda_codec snd_opl3_lib snd_rawmidi snd_hda_core snd_seq_device snd_hwdep eeepc_wmi snd_pcm asus_wmi rc_ati_x10 asix snd_timer ati_remote sparse_keymap usbnet rfkill snd hwmon soundcore rc_core evdev libphy tpm_infineon pcspkr i915 parport_pc i2c_i801 input_leds mei_me lpc_ich parport tpm_tis battery usbserial tpm_tis_core tpm wmi e1000e ptp pps_core fuse raid456 multipath mmc_block mmc_core lrw ablk_helper dm_crypt dm_mod async_raid6_recov async_pq async_xor async_memcpy async_tx crc32c_intel blowfish_x86_64 blowfish_common pcbc aesni_intel aes_x86_64 crypto_simd glue_helper cryptd xhci_pci ehci_pci sata_sil24 xhci_hcd mvsas ehci_hcd r8169 usbcore mii libsas scsi_transport_sas thermal fan [last unloaded: ftdi_sio] CPU: 0 PID: 9056 Comm: btrfs Tainted: G U 4.11.0-amd64-preempt-sysrq-20170406 #2 Hardware name: System manufacturer System Product Name/P8H67-M PRO, BIOS 3904 04/27/2013 task: ffff88374d2a60c0 task.stack: ffffa6f226424000 RIP: 0010:__del_reloc_root+0x3f/0xa6 RSP: 0018:ffffa6f226427a40 EFLAGS: 00210246 RAX: 0000000000000000 RBX: ffff8838ee256000 RCX: 00000000ffffffe2 RDX: 0000000000000001 RSI: ffffffff9f83b410 RDI: ffff8837992da568 RBP: ffffa6f226427a68 R08: 0000000000000000 R09: ffffffff9fd69480 R10: 0000000000000000 R11: 0000000000000000 R12: ffffa6f226427ab0 R13: ffff883768938000 R14: ffff8837992da568 R15: ffff8837992da570 FS: 00007facd18d28c0(0000) GS:ffff883a5e200000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000000 CR3: 0000000189a10000 CR4: 00000000001406f0 Call Trace: free_reloc_roots+0x4f/0x5d merge_reloc_roots+0x159/0x1ba relocate_block_group+0x410/0x492 
btrfs_relocate_block_group+0x12d/0x253 btrfs_relocate_chunk+0x3e/0xb1 btrfs_balance+0xd16/0xf36 btrfs_ioctl_balance+0x24f/0x2cd ? __alloc_pages_nodemask+0x134/0x1e0 btrfs_ioctl+0x1447/0x1e22 ? mem_cgroup_charge_statistics+0x1e/0x88 ? get_page+0x9/0x26 ? __lru_cache_add+0x2a/0x6c ? set_pte_at+0x9/0xd ? __handle_mm_fault+0x61d/0xa6f vfs_ioctl+0x21/0x38 ? vfs_ioctl+0x21/0x38 do_vfs_ioctl+0x4ef/0x537 ? current_kernel_time64+0x10/0x36 ? __audit_syscall_entry+0xc2/0xe6 ? syscall_trace_enter+0x1ac/0x20e SyS_ioctl+0x57/0x7b do_syscall_64+0x6b/0x7d entry_SYSCALL64_slow_path+0x25/0x25 RIP: 0033:0x7facd097ecc7 RSP: 002b:00007ffefd3c3128 EFLAGS: 00000206 ORIG_RAX: 0000000000000010 RAX: ffffffffffffffda RBX: 0000000000000001 RCX: 00007facd097ecc7 RDX: 00007ffefd3c31b8 RSI: 00000000c4009420 RDI: 0000000000000003 RBP: 00007ffefd3c31b8 R08: 0000000000000003 R09: 0000000000008040 R10: 0000000000000541 R11: 0000000000000206 R12: 0000000000000003 R13: 00007ffefd3c4cc9 R14: 0000000000000001 R15: 0000000000000001 Code: af f0 01 00 00 48 89 fb 4d 8b b5 10 0b 00 00 4d 8d be 70 05 00 00 49 81 c6 68 05 00 00 4c 89 ff e8 0f 44 43 00 48 8b 03 4c 89 f7 <48> 8b 30 e8 0e fc ff ff 48 85 c0 49 89 c4 74 0b 4c 89 f6 48 89 RIP: __del_reloc_root+0x3f/0xa6 RSP: ffffa6f226427a40 CR2: 0000000000000000 ---[ end trace 64c3fa4dc953d295 ]--- Kernel panic - not syncing: Fatal exception Kernel Offset: 0x1e000000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff) Rebooting in 20 seconds.. ACPI MEMORY or I/O RESET_REG. -- "A mouse is a device used to point at the xterm you want to type in" - A.S.R. Microsoft is to operating systems .... .... what McDonalds is to gourmet cooking Home page: http://marc.merlins.org/ ^ permalink raw reply [flat|nested] 77+ messages in thread
* Re: 4.11 relocate crash, null pointer + rolling back a filesystem by X hours?
  2017-05-01 17:06 ` 4.11 relocate crash, null pointer Marc MERLIN
@ 2017-05-01 18:08   ` Marc MERLIN
  2017-05-02  1:50     ` Chris Murphy
  2017-05-05  1:13     ` Qu Wenruo
  0 siblings, 2 replies; 77+ messages in thread
From: Marc MERLIN @ 2017-05-01 18:08 UTC (permalink / raw)
  To: linux-btrfs; +Cc: clm, bo.li.liu, fdmanana, jbacik, quwenruo, dsterba

So, I forgot to mention that it's my main media and backup server that got
corrupted. Yes, I do actually have a backup of a backup server, but it's
going to take days to recover due to the amount of data to copy back, not
counting lots of manual typing due to the number of subvolumes, btrfs
send/receive relationships and so forth.

Really, I should be able to roll back all writes from the last 24H, run a
check --repair/scrub on top just to be sure, and be back on track.

In the meantime, the good news is that the filesystem doesn't crash the
kernel (the posted crash below) now that I was able to cancel the btrfs
balance, but it goes read only at the drop of a hat, even when I'm trying to
delete recent snapshots and all data that was potentially written in the
last 24H.

On Mon, May 01, 2017 at 10:06:41AM -0700, Marc MERLIN wrote:
> I have a filesystem that sadly got corrupted by a SAS card I just installed yesterday.
>
> I don't think that, in a case like this, there is a way to roll back all
> writes across all subvolumes in the last 24H, correct?
>
> Is the best thing to go into each subvolume, delete the recent snapshots and
> rename the one from 24H ago as the current one?

Well, just like I expected, it's a pain in the rear and this can't even help
fix the top level mountpoint which doesn't have snapshots, so I can't roll
it back.
btrfs should really have an easy way to roll back X hours, or days, to
recover from garbage written after a known good point, given that it is COW
after all.

Is there a way to do this with check --repair maybe?

In the meantime, I got stuck while trying to delete snapshots:

Let's say I have this:
ID 428 gen 294021 top level 5 path backup
ID 2023 gen 294021 top level 5 path Soft
ID 3021 gen 294051 top level 428 path backup/debian32
ID 4400 gen 294018 top level 428 path backup/debian64
ID 4930 gen 294019 top level 428 path backup/ubuntu

I can easily
Delete subvolume (no-commit): '/mnt/btrfs_pool2/Soft'
and then:
gargamel:/mnt/btrfs_pool2# mv Soft_rw.20170430_01:50:22 Soft

But I can't delete backup, which actually is mostly only a directory
containing other things (in hindsight I shouldn't have made that a
subvolume)
Delete subvolume (no-commit): '/mnt/btrfs_pool2/backup'
ERROR: cannot delete '/mnt/btrfs_pool2/backup': Directory not empty

This is because backup has a lot of subvolumes due to btrfs send/receive
relationships.

Is it possible to recover there? Can you reparent subvolumes to a different
subvolume without doing a full copy via btrfs send/receive?

Thanks,
Marc

> BTRFS warning (device dm-5): failed to load free space cache for block group 6746013696000, rebuilding it now
> BTRFS warning (device dm-5): block group 6754603630592 has wrong amount of free space
> BTRFS warning (device dm-5): failed to load free space cache for block group 6754603630592, rebuilding it now
> BTRFS warning (device dm-5): block group 7125178777600 has wrong amount of free space
> BTRFS warning (device dm-5): failed to load free space cache for block group 7125178777600, rebuilding it now
> BTRFS error (device dm-5): bad tree block start 3981076597540270796 2899180224512
> BTRFS error (device dm-5): bad tree block start 942082474969670243 2899180224512
> BTRFS: error (device dm-5) in __btrfs_free_extent:6944: errno=-5 IO failure
> BTRFS info (device dm-5): forced readonly
> BTRFS: error (device dm-5) in btrfs_run_delayed_refs:2961: errno=-5 IO failure
> BUG: unable to handle kernel NULL pointer dereference at (null)
> IP: __del_reloc_root+0x3f/0xa6
> PGD 189a0e067
> PUD 189a0f067
> PMD 0
>
>
Oops: 0000 [#1] PREEMPT SMP > Modules linked in: veth ip6table_filter ip6_tables ebtable_nat ebtables ppdev lp xt_addrtype br_netfilter bridge stp llc tun autofs4 softdog binfmt_misc ftdi_sio nfsd auth_rpcgss nfs_acl nfs lockd grace fscache sunrpc ipt_REJECT nf_reject_ipv4 xt_conntrack xt_mark xt_nat xt_tcpudp nf_log_ipv4 nf_log_common xt_LOG iptable_mangle iptable_filter lm85 hwmon_vid pl2303 dm_snapshot dm_bufio iptable_nat ip_tables nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_conntrack_ftp ipt_MASQUERADE nf_nat_masquerade_ipv4 nf_nat nf_conntrack x_tables sg st snd_pcm_oss snd_mixer_oss bcache kvm_intel kvm irqbypass snd_hda_codec_realtek snd_cmipci snd_hda_codec_generic snd_hda_intel snd_mpu401_uart snd_hda_codec snd_opl3_lib snd_rawmidi snd_hda_core snd_seq_device snd_hwdep eeepc_wmi snd_pcm asus_wmi rc_ati_x10 > asix snd_timer ati_remote sparse_keymap usbnet rfkill snd hwmon soundcore rc_core evdev libphy tpm_infineon pcspkr i915 parport_pc i2c_i801 input_leds mei_me lpc_ich parport tpm_tis battery usbserial tpm_tis_core tpm wmi e1000e ptp pps_core fuse raid456 multipath mmc_block mmc_core lrw ablk_helper dm_crypt dm_mod async_raid6_recov async_pq async_xor async_memcpy async_tx crc32c_intel blowfish_x86_64 blowfish_common pcbc aesni_intel aes_x86_64 crypto_simd glue_helper cryptd xhci_pci ehci_pci sata_sil24 xhci_hcd mvsas ehci_hcd r8169 usbcore mii libsas scsi_transport_sas thermal fan [last unloaded: ftdi_sio] > CPU: 0 PID: 9056 Comm: btrfs Tainted: G U 4.11.0-amd64-preempt-sysrq-20170406 #2 > Hardware name: System manufacturer System Product Name/P8H67-M PRO, BIOS 3904 04/27/2013 > task: ffff88374d2a60c0 task.stack: ffffa6f226424000 > RIP: 0010:__del_reloc_root+0x3f/0xa6 > RSP: 0018:ffffa6f226427a40 EFLAGS: 00210246 > RAX: 0000000000000000 RBX: ffff8838ee256000 RCX: 00000000ffffffe2 > RDX: 0000000000000001 RSI: ffffffff9f83b410 RDI: ffff8837992da568 > RBP: ffffa6f226427a68 R08: 0000000000000000 R09: ffffffff9fd69480 > R10: 0000000000000000 R11: 
0000000000000000 R12: ffffa6f226427ab0 > R13: ffff883768938000 R14: ffff8837992da568 R15: ffff8837992da570 > FS: 00007facd18d28c0(0000) GS:ffff883a5e200000(0000) knlGS:0000000000000000 > CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 > CR2: 0000000000000000 CR3: 0000000189a10000 CR4: 00000000001406f0 > Call Trace: > free_reloc_roots+0x4f/0x5d > merge_reloc_roots+0x159/0x1ba > relocate_block_group+0x410/0x492 > btrfs_relocate_block_group+0x12d/0x253 > btrfs_relocate_chunk+0x3e/0xb1 > btrfs_balance+0xd16/0xf36 > btrfs_ioctl_balance+0x24f/0x2cd > ? __alloc_pages_nodemask+0x134/0x1e0 > btrfs_ioctl+0x1447/0x1e22 > ? mem_cgroup_charge_statistics+0x1e/0x88 > ? get_page+0x9/0x26 > ? __lru_cache_add+0x2a/0x6c > ? set_pte_at+0x9/0xd > ? __handle_mm_fault+0x61d/0xa6f > vfs_ioctl+0x21/0x38 > ? vfs_ioctl+0x21/0x38 > do_vfs_ioctl+0x4ef/0x537 > ? current_kernel_time64+0x10/0x36 > ? __audit_syscall_entry+0xc2/0xe6 > ? syscall_trace_enter+0x1ac/0x20e > SyS_ioctl+0x57/0x7b > do_syscall_64+0x6b/0x7d > entry_SYSCALL64_slow_path+0x25/0x25 > RIP: 0033:0x7facd097ecc7 > RSP: 002b:00007ffefd3c3128 EFLAGS: 00000206 ORIG_RAX: 0000000000000010 > RAX: ffffffffffffffda RBX: 0000000000000001 RCX: 00007facd097ecc7 > RDX: 00007ffefd3c31b8 RSI: 00000000c4009420 RDI: 0000000000000003 > RBP: 00007ffefd3c31b8 R08: 0000000000000003 R09: 0000000000008040 > R10: 0000000000000541 R11: 0000000000000206 R12: 0000000000000003 > R13: 00007ffefd3c4cc9 R14: 0000000000000001 R15: 0000000000000001 > Code: af f0 01 00 00 48 89 fb 4d 8b b5 10 0b 00 00 4d 8d be 70 05 00 00 49 81 c6 68 05 00 00 4c 89 ff e8 0f 44 43 00 48 8b 03 4c 89 f7 <48> 8b 30 e8 0e fc ff ff 48 85 c0 49 89 c4 74 0b 4c 89 f6 48 89 > RIP: __del_reloc_root+0x3f/0xa6 RSP: ffffa6f226427a40 > CR2: 0000000000000000 > ---[ end trace 64c3fa4dc953d295 ]--- > Kernel panic - not syncing: Fatal exception > Kernel Offset: 0x1e000000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff) > Rebooting in 20 seconds.. 
> ACPI MEMORY or I/O RESET_REG. > > -- > "A mouse is a device used to point at the xterm you want to type in" - A.S.R. > Microsoft is to operating systems .... > .... what McDonalds is to gourmet cooking > Home page: http://marc.merlins.org/ -- "A mouse is a device used to point at the xterm you want to type in" - A.S.R. Microsoft is to operating systems .... .... what McDonalds is to gourmet cooking Home page: http://marc.merlins.org/ ^ permalink raw reply [flat|nested] 77+ messages in thread
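A minimal sketch of the per-subvolume rollback described in the message above (delete the live subvolume, then rename a snapshot into its place). The pool path and snapshot name are the ones quoted in the message; the helper names `run` and `rollback_subvol` and the `DRY_RUN` switch are hypothetical and exist only for illustration. By default the sketch only prints the commands, and, as the message reports for `backup`, the delete step is expected to fail for a subvolume that still contains nested subvolumes.

```shell
#!/bin/sh
# Hypothetical helper: print a command by default, run it only when DRY_RUN=0.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# rollback_subvol POOL NAME SNAPSHOT  (hypothetical name, for illustration)
rollback_subvol() {
    pool=$1
    name=$2
    snap=$3
    # Drop the damaged live subvolume; stop if that fails
    # (for example because it still contains nested subvolumes).
    run btrfs subvolume delete "$pool/$name" || return 1
    # Rename the chosen snapshot so that it takes the old name.
    run mv "$pool/$snap" "$pool/$name"
}

# Names taken from the message above; nothing is executed in dry-run mode.
rollback_subvol /mnt/btrfs_pool2 Soft Soft_rw.20170430_01:50:22
```

Keeping the dry-run default means the sequence can be reviewed for every subvolume before anything is deleted.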
* Re: 4.11 relocate crash, null pointer + rolling back a filesystem by X hours?
  2017-05-01 18:08 ` 4.11 relocate crash, null pointer + rolling back a filesystem by X hours? Marc MERLIN
@ 2017-05-02  1:50   ` Chris Murphy
  2017-05-02  3:23     ` Marc MERLIN
  2017-05-05  1:13   ` Qu Wenruo
  1 sibling, 1 reply; 77+ messages in thread
From: Chris Murphy @ 2017-05-02 1:50 UTC (permalink / raw)
  To: Marc MERLIN
  Cc: Btrfs BTRFS, Chris Mason, bo.li.liu, fdmanana, Josef Bacik,
	Qu Wenruo, David Sterba

What about btrfs check (no repair), without and then also with --mode=lowmem?

In theory I like the idea of a 24 hour rollback; but in normal usage
Btrfs will eventually free up space containing stale and no longer
necessary metadata. Like the chunk tree, it's always changing, so you
get to a point, even with snapshots, that the old state of that tree
is just - gone. A snapshot of an fs tree does not make the chunk tree
frozen in time.

Doing what you want maybe isn't a ton of work if it could be based on
a variation of the existing btrfs seed device code. Call it a
"super snapshot".

I like the idea of triage, where bad parts of the file system can just
be cut off. Other filesystems would say this is hardware sabotage and
nothing can be done. Btrfs is a bit deceptive in that it sorta invites
the idea we can use hardware that isn't proven, and the fs can survive.

In any case, it's a big problem in my mind if no existing tools can
fix a file system of this size. So before making any more changes, make
sure you have a btrfs-image somewhere, even if it's huge. The offline
checker needs to be able to repair it; right now it's all we have for
such a case.

Chris Murphy

^ permalink raw reply	[flat|nested] 77+ messages in thread
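The steps suggested in the message above can be sketched as a short shell sequence: a check without repair in the default mode, the same check with `--mode=lowmem`, and a metadata image taken before any repair is attempted. This is only a sketch under stated assumptions: the device name is the one used elsewhere in the thread, the image path `/var/tmp/dshelf2.img`, the compression level and thread count, and the helper names `run` and `triage` are hypothetical, and the filesystem is assumed to be unmounted. Commands are printed rather than executed unless `DRY_RUN=0` is set.

```shell
#!/bin/sh
# Hypothetical helper: print a command by default, run it only when DRY_RUN=0.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# triage DEVICE IMAGE_FILE  (hypothetical name, for illustration)
triage() {
    dev=$1
    image=$2
    # Check without --repair, first in the default mode...
    run btrfs check "$dev"
    # ...then in low-memory mode, to compare what each mode reports.
    run btrfs check --mode=lowmem "$dev"
    # Metadata image to keep around before any repair is attempted
    # (-c: compression level, -t: number of threads; values are assumptions).
    run btrfs-image -c 9 -t 4 "$dev" "$image"
}

# Device name from the thread; the image path is an assumption.
triage /dev/mapper/dshelf2 /var/tmp/dshelf2.img
```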
* Re: 4.11 relocate crash, null pointer + rolling back a filesystem by X hours?
  2017-05-02  1:50 ` Chris Murphy
@ 2017-05-02  3:23   ` Marc MERLIN
  2017-05-02  4:56     ` Chris Murphy
                       ` (2 more replies)
  0 siblings, 3 replies; 77+ messages in thread
From: Marc MERLIN @ 2017-05-02 3:23 UTC (permalink / raw)
  To: Chris Murphy
  Cc: Btrfs BTRFS, Chris Mason, bo.li.liu, fdmanana, Josef Bacik,
	Qu Wenruo, David Sterba

Hi Chris,

Thanks for the reply, much appreciated.

On Mon, May 01, 2017 at 07:50:22PM -0600, Chris Murphy wrote:
> What about btrfs check (no repair), without and then also with --mode=lowmem?
>
> In theory I like the idea of a 24 hour rollback; but in normal usage
> Btrfs will eventually free up space containing stale and no longer
> necessary metadata. Like the chunk tree, it's always changing, so you
> get to a point, even with snapshots, that the old state of that tree
> is just - gone. A snapshot of an fs tree does not make the chunk tree
> frozen in time.

Right, of course, I was being way overoptimistic here. I kind of forgot
that metadata wasn't COW, my bad.

> In any case, it's a big problem in my mind if no existing tools can
> fix a file system of this size. So before making any more changes, make
> sure you have a btrfs-image somewhere, even if it's huge. The offline
> checker needs to be able to repair it, right now it's all we have for
> such a case.

The image will be huge, and take maybe 24H to make (last time it took
some silly amount of time like that), and honestly I'm not sure how
useful it'll be.
Outside of the kernel crashing if I do a btrfs balance, and hopefully
the crash report I gave is good enough, the state I'm in is not btrfs'
fault.

If I can't roll back to a reasonably working state, with data loss of a
known quantity that I can recover from backup, I'll have to destroy the
filesystem and recover from scratch, which will take multiple days.
Since I can't wait too long before getting back to a working state, I
think I'm going to try btrfs check --repair after a scrub to get a list
of all the pathnames/inodes that are known to be damaged, and work from
there.
Sounds reasonable?

Also, how is --mode=lowmem useful?

And for re-parenting a sub-subvolume, is that possible?
(I want to delete /sub1/ but I can't because I have /sub1/sub2 that's also a subvolume
and I'm not sure how to re-parent sub2 to somewhere else so that I can subvolume delete
sub1)

In the meantime, a simple check without repair looks like this. It will
likely take many hours to complete:
gargamel:/var/local/space# btrfs check /dev/mapper/dshelf2
Checking filesystem on /dev/mapper/dshelf2
UUID: 03e9a50c-1ae6-4782-ab9c-5f310a98e653
checking extents
checksum verify failed on 3096461459456 found 0E6B7980 wanted FBE5477A
checksum verify failed on 3096461459456 found 0E6B7980 wanted FBE5477A
checksum verify failed on 2899180224512 found 7A6D427F wanted 7E899EE5
checksum verify failed on 2899180224512 found 7A6D427F wanted 7E899EE5
checksum verify failed on 2899180224512 found ABBE39B0 wanted E0735D0E
checksum verify failed on 2899180224512 found 7A6D427F wanted 7E899EE5
bytenr mismatch, want=2899180224512, have=3981076597540270796
checksum verify failed on 1449488023552 found CECC36AF wanted 199FE6C5
checksum verify failed on 1449488023552 found CECC36AF wanted 199FE6C5
checksum verify failed on 1449544613888 found 895D691B wanted A0C64D2B
checksum verify failed on 1449544613888 found 895D691B wanted A0C64D2B
parent transid verify failed on 1671538819072 wanted 293964 found 293902
parent transid verify failed on 1671538819072 wanted 293964 found 293902
checksum verify failed on 1671603781632 found 18BC28D6 wanted 372655A0
checksum verify failed on 1671603781632 found 18BC28D6 wanted 372655A0
checksum verify failed on 1759425052672 found 843B59F1 wanted F0FF7D00
checksum verify failed on 1759425052672 found 843B59F1 wanted F0FF7D00
checksum verify failed on 2182657212416 found CD8EFC0C wanted 70847071
checksum verify failed on 2182657212416 found CD8EFC0C wanted 70847071
checksum verify failed on 2898779357184 found 96395131 wanted 433D6E09
checksum verify failed on 2898779357184 found 96395131 wanted 433D6E09
checksum verify failed on 2899180224512 found 7A6D427F wanted 7E899EE5
checksum verify failed on 2899180224512 found 7A6D427F wanted 7E899EE5
checksum verify failed on 2899180224512 found ABBE39B0 wanted E0735D0E
checksum verify failed on 2899180224512 found 7A6D427F wanted 7E899EE5
bytenr mismatch, want=2899180224512, have=3981076597540270796
checksum verify failed on 2182657212416 found CD8EFC0C wanted 70847071
checksum verify failed on 2182657212416 found CD8EFC0C wanted 70847071
checksum verify failed on 2182657212416 found CD8EFC0C wanted 70847071
checksum verify failed on 2182657212416 found CD8EFC0C wanted 70847071
checksum verify failed on 2182657212416 found CD8EFC0C wanted 70847071
(...)

Thanks,
Marc
--
"A mouse is a device used to point at the xterm you want to type in" - A.S.R.
Microsoft is to operating systems ....
                                      .... what McDonalds is to gourmet cooking
Home page: http://marc.merlins.org/ | PGP 1024R/763BE901

^ permalink raw reply	[flat|nested] 77+ messages in thread
* Re: 4.11 relocate crash, null pointer + rolling back a filesystem by X hours? 2017-05-02 3:23 ` Marc MERLIN @ 2017-05-02 4:56 ` Chris Murphy 2017-05-02 5:11 ` Marc MERLIN 2017-05-02 19:59 ` 4.11 relocate crash, null pointer + rolling back a filesystem by X hours? Kai Krakow 2017-05-02 5:01 ` Duncan 2017-05-05 1:19 ` Qu Wenruo 2 siblings, 2 replies; 77+ messages in thread From: Chris Murphy @ 2017-05-02 4:56 UTC (permalink / raw) To: Marc MERLIN Cc: Chris Murphy, Btrfs BTRFS, Chris Mason, bo.li.liu, fdmanana, Josef Bacik, Qu Wenruo, David Sterba On Mon, May 1, 2017 at 9:23 PM, Marc MERLIN <marc@merlins.org> wrote: > Hi Chris, > > Thanks for the reply, much appreciated. > > On Mon, May 01, 2017 at 07:50:22PM -0600, Chris Murphy wrote: >> What about btrfs check (no repair), without and then also with --mode=lowmem? >> >> In theory I like the idea of a 24 hour rollback; but in normal usage >> Btrfs will eventually free up space containing stale and no longer >> necessary metadata. Like the chunk tree, it's always changing, so you >> get to a point, even with snapshots, that the old state of that tree >> is just - gone. A snapshot of an fs tree does not make the chunk tree >> frozen in time. > > Right, of course, I was being way over optimistic here. I kind of forgot > that metadata wasn't COW, my bad. Well it is COW. But there's more to the file system than fs trees, and just because an fs tree gets snapshot doesn't mean all data is snapshot. So whether snapshot or not, there's metadata that becomes obsolete as the file system is updated and those areas get freed up and eventually overwritten. > >> In any case, it's a big problem in my mind if no existing tools can >> fix a file system of this size. So before making any more changes, make >> sure you have a btrfs-image somewhere, even if it's huge. The offline >> checker needs to be able to repair it, right now it's all we have for >> such a case. 
> > The image will be huge, and take maybe 24H to make (last time it took > some silly amount of time like that), and honestly I'm not sure how > useful it'll be. > Outside of the kernel crashing if I do a btrfs balance, and hopefully > the crash report I gave is good enough, the state I'm in is not btrfs' > fault. > > If I can't roll back to a reasonably working state, with data loss of a > known quantity that I can recover from backup, I'll have to destroy the > filesystem and recover from scratch, which will take multiple days. > Since I can't wait too long before getting back to a working state, I > think I'm going to try btrfs check --repair after a scrub to get a list > of all the pathnames/inodes that are known to be damaged, and work from > there. > Sounds reasonable? Yes. > > Also, how is --mode=lowmem useful? Testing. lowmem is a different implementation, so it might find different things from the regular check. > > And for re-parenting a sub-subvolume, is that possible? > (I want to delete /sub1/ but I can't because I have /sub1/sub2 that's also a subvolume > and I'm not sure how to re-parent sub2 to somewhere else so that I can subvolume delete > sub1) Well you can move sub2 out of sub1 just like a directory and then delete sub1. If it's read-only it can't be moved, but you can use btrfs property get/set ro true/false to temporarily make it not read-only, move it, then make it read-only again, and it's still fine to use with btrfs send receive. > > In the meantime, a simple check without repair looks like this. 
It will > likely take many hours to complete: > gargamel:/var/local/space# btrfs check /dev/mapper/dshelf2 > Checking filesystem on /dev/mapper/dshelf2 > UUID: 03e9a50c-1ae6-4782-ab9c-5f310a98e653 > checking extents > checksum verify failed on 3096461459456 found 0E6B7980 wanted FBE5477A > checksum verify failed on 3096461459456 found 0E6B7980 wanted FBE5477A > checksum verify failed on 2899180224512 found 7A6D427F wanted 7E899EE5 > checksum verify failed on 2899180224512 found 7A6D427F wanted 7E899EE5 > checksum verify failed on 2899180224512 found ABBE39B0 wanted E0735D0E > checksum verify failed on 2899180224512 found 7A6D427F wanted 7E899EE5 > bytenr mismatch, want=2899180224512, have=3981076597540270796 > checksum verify failed on 1449488023552 found CECC36AF wanted 199FE6C5 > checksum verify failed on 1449488023552 found CECC36AF wanted 199FE6C5 > checksum verify failed on 1449544613888 found 895D691B wanted A0C64D2B > checksum verify failed on 1449544613888 found 895D691B wanted A0C64D2B > parent transid verify failed on 1671538819072 wanted 293964 found 293902 > parent transid verify failed on 1671538819072 wanted 293964 found 293902 > checksum verify failed on 1671603781632 found 18BC28D6 wanted 372655A0 > checksum verify failed on 1671603781632 found 18BC28D6 wanted 372655A0 > checksum verify failed on 1759425052672 found 843B59F1 wanted F0FF7D00 > checksum verify failed on 1759425052672 found 843B59F1 wanted F0FF7D00 > checksum verify failed on 2182657212416 found CD8EFC0C wanted 70847071 > checksum verify failed on 2182657212416 found CD8EFC0C wanted 70847071 > checksum verify failed on 2898779357184 found 96395131 wanted 433D6E09 > checksum verify failed on 2898779357184 found 96395131 wanted 433D6E09 > checksum verify failed on 2899180224512 found 7A6D427F wanted 7E899EE5 > checksum verify failed on 2899180224512 found 7A6D427F wanted 7E899EE5 > checksum verify failed on 2899180224512 found ABBE39B0 wanted E0735D0E > checksum verify failed on 
2899180224512 found 7A6D427F wanted 7E899EE5 > bytenr mismatch, want=2899180224512, have=3981076597540270796 > checksum verify failed on 2182657212416 found CD8EFC0C wanted 70847071 > checksum verify failed on 2182657212416 found CD8EFC0C wanted 70847071 > checksum verify failed on 2182657212416 found CD8EFC0C wanted 70847071 > checksum verify failed on 2182657212416 found CD8EFC0C wanted 70847071 > checksum verify failed on 2182657212416 found CD8EFC0C wanted 70847071 Not understanding the problem, it's by definition naive for me to suggest it should go read-only sooner before hosing itself. But I'd like to think it's possible for Btrfs to look backward every once in a while for sanity checking, to limit damage should it be occurring even if the hardware isn't reporting any problems. -- Chris Murphy ^ permalink raw reply [flat|nested] 77+ messages in thread
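The subvolume move described in the message above (make the nested subvolume writable, mv it out, restore the read-only flag, then delete the old parent) could be sketched roughly as follows. This is only a dry-run helper that prints the commands for review and runs nothing. The function name reparent_plan and the paths are hypothetical, and it assumes source and destination are on the same btrfs filesystem.

```shell
#!/bin/sh
# Dry-run sketch: print (do not execute) the commands for moving a nested
# subvolume out of its parent so that the parent can then be deleted.
# reparent_plan and all paths below are hypothetical placeholders.
reparent_plan() {
    parent=$1   # subvolume to delete, e.g. /mnt/pool/sub1
    child=$2    # name of the nested subvolume, e.g. sub2
    dest=$3     # new parent directory on the same filesystem

    # Check the current read-only state first.
    echo "btrfs property get -ts $parent/$child ro"
    # A read-only subvolume cannot be moved, so clear the flag temporarily.
    echo "btrfs property set -ts $parent/$child ro false"
    # Within one filesystem, mv of a subvolume is a rename, not a data copy.
    echo "mv $parent/$child $dest/$child"
    # Restore the flag (only needed if it was read-only to begin with).
    echo "btrfs property set -ts $dest/$child ro true"
    # The old parent no longer contains a nested subvolume.
    echo "btrfs subvolume delete $parent"
}

reparent_plan /mnt/pool/sub1 sub2 /mnt/pool
```

Printing the plan instead of running it is a deliberate choice here: on a filesystem that is already damaged, it seems safer to read each step before executing it by hand.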
* Re: 4.11 relocate crash, null pointer + rolling back a filesystem by X hours? 2017-05-02 4:56 ` Chris Murphy @ 2017-05-02 5:11 ` Marc MERLIN 2017-05-02 18:47 ` btrfs check --repair: failed to repair damaged filesystem, aborting Marc MERLIN 2017-07-07 5:37 ` ctree.c:197: update_ref_for_cow: BUG_ON `ret` triggered, value -5 Marc MERLIN 2017-05-02 19:59 ` 4.11 relocate crash, null pointer + rolling back a filesystem by X hours? Kai Krakow 1 sibling, 2 replies; 77+ messages in thread From: Marc MERLIN @ 2017-05-02 5:11 UTC (permalink / raw) To: Chris Murphy Cc: Btrfs BTRFS, Chris Mason, bo.li.liu, fdmanana, Josef Bacik, Qu Wenruo, David Sterba On Mon, May 01, 2017 at 10:56:06PM -0600, Chris Murphy wrote: > > Right, of course, I was being way over optimistic here. I kind of forgot > > that metadata wasn't COW, my bad. > > Well it is COW. But there's more to the file system than fs trees, and > just because an fs tree gets snapshot doesn't mean all data is > snapshot. So whether snapshot or not, there's metadata that becomes > obsolete as the file system is updated and those areas get freed up > and eventually overwritten. Got it, thanks for explaining. > > Also, how is --mode=lowmem useful? > > Testing. lowmem is a different implementation, so it might find > different things from the regular check. I see. I've fired off a scrub -r and then a check to run overnight; I'll see if they finish, assuming the kernel doesn't crash again (yeah, just to make things simpler, I'm hitting another issue when I/O piles up on btrfs on top of dmcrypt on top of bcache: http://lkml.iu.edu/hypermail/linux/kernel/1705.0/00626.html https://pastebin.com/YqE4riw0 but that's not a bcache bug, just something else getting in the way). > > And for re-parenting a sub-subvolume, is that possible? 
> > (I want to delete /sub1/ but I can't because I have /sub1/sub2 that's also a subvolume > > and I'm not sure how to re-parent sub2 to somewhere else so that I can subvolume delete > > sub1) > > Well you can move sub2 out of sub1 just like a directory and then > delete sub1. If it's read-only it can't be moved, but you can use > btrfs property get/set ro true/false to temporarily make it not > read-only, move it, then make it read-only again, and it's still fine > to use with btrfs send receive. Ah, I didn't think mv would work from inside a subvolume to outside of a subvolume without copying data (it doesn't for files) but I guess it would work for subvolumes, good point. I'll try that, thanks. > Not understanding the problem, it's by definition naive for me to > suggest it should go read-only sooner before hosing itself. But I'd > like to think it's possible for Btrfs to look backward every once in a > while for sanity checking, to limit damage should it be occurring even > if the hardware isn't reporting any problems. Fair point. To be honest, maybe btrfs could indeed have detected problems earlier, but ultimately it's not really its fault if bad things happen when I'm having repeated storage errors underneath. For all I know, some data got written after getting corrupted and btrfs would not notice that right away. Now, I kind of naively thought I could simply unroll all writes done after a certain point. You pointed out (rightfully so) that it's not nearly as simple as I was hoping. So at this point, I think it's just a matter of me providing check/repair logs if they are useful, and someone looking into this balance causing a kernel crash, which is IMO the only real thing that btrfs should reasonably fix. I'll update the thread when I have more logs and have moved further on the recovery. Thanks, Marc -- "A mouse is a device used to point at the xterm you want to type in" - A.S.R. Microsoft is to operating systems .... .... 
what McDonalds is to gourmet cooking Home page: http://marc.merlins.org/ | PGP 1024R/763BE901 ^ permalink raw reply [flat|nested] 77+ messages in thread
* btrfs check --repair: failed to repair damaged filesystem, aborting 2017-05-02 5:11 ` Marc MERLIN @ 2017-05-02 18:47 ` Marc MERLIN 2017-05-03 6:00 ` Marc MERLIN 2017-07-07 5:37 ` ctree.c:197: update_ref_for_cow: BUG_ON `ret` triggered, value -5 Marc MERLIN 1 sibling, 1 reply; 77+ messages in thread From: Marc MERLIN @ 2017-05-02 18:47 UTC (permalink / raw) To: Chris Murphy; +Cc: Btrfs BTRFS, Chris Mason, Qu Wenruo, David Sterba (cc trimmed) The one in debian/unstable crashed: gargamel:~# btrfs --version btrfs-progs v4.7.3 gargamel:~# btrfs check --repair /dev/mapper/dshelf2 bytenr mismatch, want=2899180224512, have=3981076597540270796 extent-tree.c:2721: alloc_reserved_tree_block: Assertion `ret` failed. btrfs[0x43e418] btrfs[0x43e43f] btrfs[0x43f276] btrfs[0x43f46f] btrfs[0x4407ef] btrfs[0x440963] btrfs(btrfs_inc_extent_ref+0x513)[0x44107a] btrfs[0x420053] btrfs[0x4265eb] btrfs(cmd_check+0x1111)[0x427d6d] btrfs(main+0x12f)[0x40a341] /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf1)[0x7f6b632e82b1] btrfs(_start+0x2a)[0x40a37a] Ok, it's old, let's take git from today: gargamel:~# btrfs --version btrfs-progs v4.10.2 As a note, gargamel:~# btrfs check --mode=lowmem --repair /dev/mapper/dshelf2 enabling repair mode ERROR: low memory mode doesn't support repair yet As a note, a 32bit binary on a 64bit kernel: gargamel:~# btrfs check --repair /dev/mapper/dshelf2 enabling repair mode Checking filesystem on /dev/mapper/dshelf2 UUID: 03e9a50c-1ae6-4782-ab9c-5f310a98e653 checking extents checksum verify failed on 2899180224512 found 7A6D427F wanted 7E899EE5 checksum verify failed on 2899180224512 found 7A6D427F wanted 7E899EE5 checksum verify failed on 2899180224512 found ABBE39B0 wanted E0735D0E checksum verify failed on 2899180224512 found 7A6D427F wanted 7E899EE5 bytenr mismatch, want=2899180224512, have=3981076597540270796 checksum verify failed on 1449488023552 found CECC36AF wanted 199FE6C5 checksum verify failed on 1449488023552 found CECC36AF wanted 199FE6C5 
checksum verify failed on 1449544613888 found 895D691B wanted A0C64D2B checksum verify failed on 1449544613888 found 895D691B wanted A0C64D2B parent transid verify failed on 1671538819072 wanted 293964 found 293902 parent transid verify failed on 1671538819072 wanted 293964 found 293902 checksum verify failed on 1671603781632 found 18BC28D6 wanted 372655A0 checksum verify failed on 1671603781632 found 18BC28D6 wanted 372655A0 cmds-check.c:6291: add_data_backref: BUG_ON `!back` triggered, value 1 Aborted let's try again with a 64bit binary built from git: (...) Repaired extent references for 4227617038336 ref mismatch on [4227872751616 4096] extent item 1, found 0 Incorrect local backref count on 4227872751616 parent 3493071667200 owner 0 offset 0 found 0 wanted 1 back 0x56470b18e7f0 Backref disk bytenr does not match extent record, bytenr=4227872751616, ref bytenr=0 backpointer mismatch on [4227872751616 4096] owner ref check failed [4227872751616 4096] repair deleting extent record: key 4227872751616 168 4096 checksum verify failed on 2899180224512 found 7A6D427F wanted 7E899EE5 checksum verify failed on 2899180224512 found 7A6D427F wanted 7E899EE5 checksum verify failed on 2899180224512 found ABBE39B0 wanted E0735D0E checksum verify failed on 2899180224512 found 7A6D427F wanted 7E899EE5 bytenr mismatch, want=2899180224512, have=3981076597540270796 checksum verify failed on 2899180224512 found 7A6D427F wanted 7E899EE5 checksum verify failed on 2899180224512 found 7A6D427F wanted 7E899EE5 checksum verify failed on 2899180224512 found ABBE39B0 wanted E0735D0E checksum verify failed on 2899180224512 found 7A6D427F wanted 7E899EE5 bytenr mismatch, want=2899180224512, have=3981076597540270796 checksum verify failed on 2899180224512 found 7A6D427F wanted 7E899EE5 checksum verify failed on 2899180224512 found 7A6D427F wanted 7E899EE5 checksum verify failed on 2899180224512 found ABBE39B0 wanted E0735D0E checksum verify failed on 2899180224512 found 7A6D427F wanted 
7E899EE5 bytenr mismatch, want=2899180224512, have=3981076597540270796 Repaired extent references for 4227872751616 ref mismatch on [6674127745024 32768] extent item 0, found 1 Backref 6674127745024 parent 7566652473344 owner 0 offset 0 num_refs 0 not found in extent tree Incorrect local backref count on 6674127745024 parent 7566652473344 owner 0 offset 0 found 1 wanted 0 back 0x5648afda0f20 backpointer mismatch on [6674127745024 32768] checksum verify failed on 6983266418688 found 393B112A wanted 2B19CD5C checksum verify failed on 6983266418688 found 393B112A wanted 2B19CD5C checksum verify failed on 6983266418688 found BCBF9E15 wanted 785FF67E checksum verify failed on 6983266418688 found 393B112A wanted 2B19CD5C bytenr mismatch, want=6983266418688, have=13671317608077697645 checksum verify failed on 2899180224512 found 7A6D427F wanted 7E899EE5 checksum verify failed on 2899180224512 found 7A6D427F wanted 7E899EE5 checksum verify failed on 2899180224512 found ABBE39B0 wanted E0735D0E checksum verify failed on 2899180224512 found 7A6D427F wanted 7E899EE5 bytenr mismatch, want=2899180224512, have=3981076597540270796 checksum verify failed on 2899180224512 found 7A6D427F wanted 7E899EE5 checksum verify failed on 2899180224512 found 7A6D427F wanted 7E899EE5 checksum verify failed on 2899180224512 found ABBE39B0 wanted E0735D0E checksum verify failed on 2899180224512 found 7A6D427F wanted 7E899EE5 bytenr mismatch, want=2899180224512, have=3981076597540270796 checksum verify failed on 2899180224512 found 7A6D427F wanted 7E899EE5 checksum verify failed on 2899180224512 found 7A6D427F wanted 7E899EE5 checksum verify failed on 2899180224512 found ABBE39B0 wanted E0735D0E checksum verify failed on 2899180224512 found 7A6D427F wanted 7E899EE5 bytenr mismatch, want=2899180224512, have=3981076597540270796 checksum verify failed on 2899180224512 found 7A6D427F wanted 7E899EE5 checksum verify failed on 2899180224512 found 7A6D427F wanted 7E899EE5 checksum verify failed on 
2899180224512 found ABBE39B0 wanted E0735D0E checksum verify failed on 2899180224512 found 7A6D427F wanted 7E899EE5 bytenr mismatch, want=2899180224512, have=3981076597540270796 failed to repair damaged filesystem, aborting So, I'm out of luck now, full wipe and 3-5 day rebuild? Thanks, Marc -- "A mouse is a device used to point at the xterm you want to type in" - A.S.R. Microsoft is to operating systems .... .... what McDonalds is to gourmet cooking Home page: http://marc.merlins.org/ ^ permalink raw reply [flat|nested] 77+ messages in thread
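Since the same few block addresses repeat throughout the output above, a quick tally can show which ones dominate. The following is a minimal sketch, not part of btrfs-progs; the function name tally_bad_blocks is made up for illustration, and the sample lines are copied from the log above. Because it uses grep -o, it also works when the log has been flattened onto a single line.

```shell
#!/bin/sh
# Count how often each block address (bytenr) shows up in
# "checksum verify failed on <bytenr> ..." messages read from stdin.
# tally_bad_blocks is a hypothetical helper name.
tally_bad_blocks() {
    # -o prints only the matching part, one match per output line.
    grep -o 'checksum verify failed on [0-9][0-9]*' |
        # Field 5 is the bytenr; count occurrences of each.
        awk '{ count[$5]++ } END { for (b in count) print count[b], b }' |
        # Most frequent first.
        sort -rn
}

# Sample input taken from the check output above.
tally_bad_blocks <<'EOF'
checksum verify failed on 2899180224512 found 7A6D427F wanted 7E899EE5
checksum verify failed on 2899180224512 found ABBE39B0 wanted E0735D0E
checksum verify failed on 1449488023552 found CECC36AF wanted 199FE6C5
bytenr mismatch, want=2899180224512, have=3981076597540270796
EOF
```

For the sample above this prints "2 2899180224512" followed by "1 1449488023552". Run against a saved copy of the full check output, it gives a rough ranking of the blocks the checker keeps tripping over.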
* Re: btrfs check --repair: failed to repair damaged filesystem, aborting 2017-05-02 18:47 ` btrfs check --repair: failed to repair damaged filesystem, aborting Marc MERLIN @ 2017-05-03 6:00 ` Marc MERLIN 2017-05-03 6:17 ` Marc MERLIN 0 siblings, 1 reply; 77+ messages in thread From: Marc MERLIN @ 2017-05-03 6:00 UTC (permalink / raw) To: Chris Murphy; +Cc: Btrfs BTRFS, Chris Mason, Qu Wenruo, David Sterba David, I think you maintain btrfs-progs, but I'm not sure if you're in charge of check --repair. Could you comment on the bottom of the mail, namely: > failed to repair damaged filesystem, aborting > So, I'm out of luck now, full wipe and 3-5 day rebuild? Thanks, Marc Rest: On Tue, May 02, 2017 at 11:47:22AM -0700, Marc MERLIN wrote: > (cc trimmed) > (...) -- "A mouse is a device used to point at the xterm you want to type in" - A.S.R. Microsoft is to operating systems .... .... what McDonalds is to gourmet cooking Home page: http://marc.merlins.org/ | PGP 1024R/763BE901 ^ permalink raw reply [flat|nested] 77+ messages in thread
* Re: btrfs check --repair: failed to repair damaged filesystem, aborting 2017-05-03 6:00 ` Marc MERLIN @ 2017-05-03 6:17 ` Marc MERLIN 2017-05-03 6:32 ` Roman Mamedov 0 siblings, 1 reply; 77+ messages in thread From: Marc MERLIN @ 2017-05-03 6:17 UTC (permalink / raw) To: Btrfs BTRFS, Chris Mason, Qu Wenruo, David Sterba On Tue, May 02, 2017 at 11:00:08PM -0700, Marc MERLIN wrote: > David, > > I think you maintain btrfs-progs, but I'm not sure if you're in charge > of check --repair. > Could you comment on the bottom of the mail, namely: > > failed to repair damaged filesystem, aborting > > So, I'm out of luck now, full wipe and 3-5 day rebuild? Actually, another thought: Is there or should there be a way to repair around the bit that cannot be repaired? Separately, or not, can I locate which bits are causing the repair to fail and maybe get a pointer to the path/inode so that I can hopefully just delete those bad data structures (assuming deleting them is even possible and that the FS won't just go read only as I try to do that) Here is the full run if that helps: https://pastebin.com/STMFHty4 > Thanks, > Marc > > Rest: > On Tue, May 02, 2017 at 11:47:22AM -0700, Marc MERLIN wrote: > > (cc trimmed) > > (...) -- "A mouse is a device used to point at the xterm you want to type in" - A.S.R. Microsoft is to operating systems .... .... what McDonalds is to gourmet cooking Home page: http://marc.merlins.org/ | PGP 1024R/763BE901 ^ permalink raw reply [flat|nested] 77+ messages in thread
* Re: btrfs check --repair: failed to repair damaged filesystem, aborting 2017-05-03 6:17 ` Marc MERLIN @ 2017-05-03 6:32 ` Roman Mamedov 2017-05-03 20:40 ` Marc MERLIN 0 siblings, 1 reply; 77+ messages in thread From: Roman Mamedov @ 2017-05-03 6:32 UTC (permalink / raw) To: Marc MERLIN; +Cc: Btrfs BTRFS, Chris Mason, Qu Wenruo, David Sterba On Tue, 2 May 2017 23:17:11 -0700 Marc MERLIN <marc@merlins.org> wrote: > On Tue, May 02, 2017 at 11:00:08PM -0700, Marc MERLIN wrote: > > David, > > > > I think you maintain btrfs-progs, but I'm not sure if you're in charge > > of check --repair. > > Could you comment on the bottom of the mail, namely: > > > failed to repair damaged filesystem, aborting > > > So, I'm out of luck now, full wipe and 3-5 day rebuild? > > Actually, another thought: > Is there or should there be a way to repair around the bit that cannot > be repaired? > Separately, or not, can I locate which bits are causing the repair to > fail and maybe get a pointer to the path/inode so that I can hopefully > just delete those bad data structures (assuming deleting them is even > possible and that the FS won't just go read only as I try to do that) There is the "btrfs-corrupt-block" tool which helped me to kick Btrfsck further along its course in a similar "unrepairable" situation. https://www.spinics.net/lists/linux-btrfs/msg53061.html In your case it appears like the block 2899180224512 is giving it the most trouble, so you could start with killing that one. From what I can tell this tool zeroes out the entire block, so Btrfsck can simply delete the reference and forget it, rather than repeatedly trying to figure out solutions and bailing out with "failed to repair damaged filesystem, aborting". Depending on what was stored in it, you may have either no visible effect, or a complete filesystem failure, or anything in between. Hence if you want to experiment with this, find a way to work on writable overlay snapshots (also described in the linked message). 
> Here is the full run if that helps: > https://pastebin.com/STMFHty4 > > > Thanks, > > Marc > > > > Rest: > > On Tue, May 02, 2017 at 11:47:22AM -0700, Marc MERLIN wrote: > > > (cc trimmed) > > > > > > The one in debian/unstable crashed: > > > gargamel:~# btrfs --version > > > btrfs-progs v4.7.3 > > > gargamel:~# btrfs check --repair /dev/mapper/dshelf2 > > > bytenr mismatch, want=2899180224512, have=3981076597540270796 > > > extent-tree.c:2721: alloc_reserved_tree_block: Assertion `ret` failed. > > > btrfs[0x43e418] > > > btrfs[0x43e43f] > > > btrfs[0x43f276] > > > btrfs[0x43f46f] > > > btrfs[0x4407ef] > > > btrfs[0x440963] > > > btrfs(btrfs_inc_extent_ref+0x513)[0x44107a] > > > btrfs[0x420053] > > > btrfs[0x4265eb] > > > btrfs(cmd_check+0x1111)[0x427d6d] > > > btrfs(main+0x12f)[0x40a341] > > > /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf1)[0x7f6b632e82b1] > > > btrfs(_start+0x2a)[0x40a37a] > > > > > > Ok, it's old, let's take git from today: > > > gargamel:~# btrfs --version > > > btrfs-progs v4.10.2 > > > As a note, > > > gargamel:~# btrfs check --mode=lowmem --repair /dev/mapper/dshelf2 > > > enabling repair mode > > > ERROR: low memory mode doesn't support repair yet > > > > > > As a note, a 32bit binary on a 64bit kernel: > > > gargamel:~# btrfs check --repair /dev/mapper/dshelf2 > > > enabling repair mode > > > Checking filesystem on /dev/mapper/dshelf2 > > > UUID: 03e9a50c-1ae6-4782-ab9c-5f310a98e653 > > > checking extents > > > checksum verify failed on 2899180224512 found 7A6D427F wanted 7E899EE5 > > > checksum verify failed on 2899180224512 found 7A6D427F wanted 7E899EE5 > > > checksum verify failed on 2899180224512 found ABBE39B0 wanted E0735D0E > > > checksum verify failed on 2899180224512 found 7A6D427F wanted 7E899EE5 > > > bytenr mismatch, want=2899180224512, have=3981076597540270796 > > > checksum verify failed on 1449488023552 found CECC36AF wanted 199FE6C5 > > > checksum verify failed on 1449488023552 found CECC36AF wanted 199FE6C5 > > 
> checksum verify failed on 1449544613888 found 895D691B wanted A0C64D2B > > > checksum verify failed on 1449544613888 found 895D691B wanted A0C64D2B > > > parent transid verify failed on 1671538819072 wanted 293964 found 293902 > > > parent transid verify failed on 1671538819072 wanted 293964 found 293902 > > > checksum verify failed on 1671603781632 found 18BC28D6 wanted 372655A0 > > > checksum verify failed on 1671603781632 found 18BC28D6 wanted 372655A0 > > > cmds-check.c:6291: add_data_backref: BUG_ON `!back` triggered, value 1 > > > Aborted > > > > > > let's try again with a 64bit binary built from git: > > > (...) > > > Repaired extent references for 4227617038336 > > > ref mismatch on [4227872751616 4096] extent item 1, found 0 > > > Incorrect local backref count on 4227872751616 parent 3493071667200 owner 0 > > > offset 0 found 0 wanted 1 back 0x56470b18e7f0 > > > Backref disk bytenr does not match extent record, bytenr=4227872751616, ref > > > bytenr=0 > > > backpointer mismatch on [4227872751616 4096] > > > owner ref check failed [4227872751616 4096] > > > repair deleting extent record: key 4227872751616 168 4096 > > > checksum verify failed on 2899180224512 found 7A6D427F wanted 7E899EE5 > > > checksum verify failed on 2899180224512 found 7A6D427F wanted 7E899EE5 > > > checksum verify failed on 2899180224512 found ABBE39B0 wanted E0735D0E > > > checksum verify failed on 2899180224512 found 7A6D427F wanted 7E899EE5 > > > bytenr mismatch, want=2899180224512, have=3981076597540270796 > > > checksum verify failed on 2899180224512 found 7A6D427F wanted 7E899EE5 > > > checksum verify failed on 2899180224512 found 7A6D427F wanted 7E899EE5 > > > checksum verify failed on 2899180224512 found ABBE39B0 wanted E0735D0E > > > checksum verify failed on 2899180224512 found 7A6D427F wanted 7E899EE5 > > > bytenr mismatch, want=2899180224512, have=3981076597540270796 > > > checksum verify failed on 2899180224512 found 7A6D427F wanted 7E899EE5 > > > checksum verify failed 
on 2899180224512 found 7A6D427F wanted 7E899EE5 > > > checksum verify failed on 2899180224512 found ABBE39B0 wanted E0735D0E > > > checksum verify failed on 2899180224512 found 7A6D427F wanted 7E899EE5 > > > bytenr mismatch, want=2899180224512, have=3981076597540270796 > > > Repaired extent references for 4227872751616 > > > ref mismatch on [6674127745024 32768] extent item 0, found 1 > > > Backref 6674127745024 parent 7566652473344 owner 0 offset 0 num_refs 0 not > > > found in extent tree > > > Incorrect local backref count on 6674127745024 parent 7566652473344 owner 0 > > > offset 0 found 1 wanted 0 back 0x5648afda0f20 > > > backpointer mismatch on [6674127745024 32768] > > > checksum verify failed on 6983266418688 found 393B112A wanted 2B19CD5C > > > checksum verify failed on 6983266418688 found 393B112A wanted 2B19CD5C > > > checksum verify failed on 6983266418688 found BCBF9E15 wanted 785FF67E > > > checksum verify failed on 6983266418688 found 393B112A wanted 2B19CD5C > > > bytenr mismatch, want=6983266418688, have=13671317608077697645 > > > checksum verify failed on 2899180224512 found 7A6D427F wanted 7E899EE5 > > > checksum verify failed on 2899180224512 found 7A6D427F wanted 7E899EE5 > > > checksum verify failed on 2899180224512 found ABBE39B0 wanted E0735D0E > > > checksum verify failed on 2899180224512 found 7A6D427F wanted 7E899EE5 > > > bytenr mismatch, want=2899180224512, have=3981076597540270796 > > > checksum verify failed on 2899180224512 found 7A6D427F wanted 7E899EE5 > > > checksum verify failed on 2899180224512 found 7A6D427F wanted 7E899EE5 > > > checksum verify failed on 2899180224512 found ABBE39B0 wanted E0735D0E > > > checksum verify failed on 2899180224512 found 7A6D427F wanted 7E899EE5 > > > bytenr mismatch, want=2899180224512, have=3981076597540270796 > > > checksum verify failed on 2899180224512 found 7A6D427F wanted 7E899EE5 > > > checksum verify failed on 2899180224512 found 7A6D427F wanted 7E899EE5 > > > checksum verify failed on 
2899180224512 found ABBE39B0 wanted E0735D0E > > > checksum verify failed on 2899180224512 found 7A6D427F wanted 7E899EE5 > > > bytenr mismatch, want=2899180224512, have=3981076597540270796 > > > checksum verify failed on 2899180224512 found 7A6D427F wanted 7E899EE5 > > > checksum verify failed on 2899180224512 found 7A6D427F wanted 7E899EE5 > > > checksum verify failed on 2899180224512 found ABBE39B0 wanted E0735D0E > > > checksum verify failed on 2899180224512 found 7A6D427F wanted 7E899EE5 > > > bytenr mismatch, want=2899180224512, have=3981076597540270796 > > > failed to repair damaged filesystem, aborting > > > > > > > > > So, I'm out of luck now, full wipe and 3-5 day rebuild? > > > > > > Thanks, > > > Marc > > > -- > > > "A mouse is a device used to point at the xterm you want to type in" - A.S.R. > > > Microsoft is to operating systems .... > > > .... what McDonalds is to gourmet cooking > > > Home page: http://marc.merlins.org/ > > > -- > > > To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in > > > the body of a message to majordomo@vger.kernel.org > > > More majordomo info at http://vger.kernel.org/majordomo-info.html > > > > > > > -- > > "A mouse is a device used to point at the xterm you want to type in" - A.S.R. > > Microsoft is to operating systems .... > > .... what McDonalds is to gourmet cooking > > Home page: http://marc.merlins.org/ | PGP 1024R/763BE901 > > -- > > To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in > > the body of a message to majordomo@vger.kernel.org > > More majordomo info at http://vger.kernel.org/majordomo-info.html > > > -- With respect, Roman ^ permalink raw reply [flat|nested] 77+ messages in thread
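Roman's reply singles out block 2899180224512 as the one giving check the most trouble. A minimal Python sketch of how that could be confirmed from a saved `btrfs check` log, assuming the log lines have the same shape as the ones quoted in this thread. The helper name `tally_csum_failures` and the shortened sample are illustrative only and are not part of btrfs-progs.

```python
import re
from collections import Counter

# Matches lines such as:
#   checksum verify failed on 2899180224512 found 7A6D427F wanted 7E899EE5
CSUM_RE = re.compile(
    r"checksum verify failed on (\d+) "
    r"found ([0-9A-Fa-f]{8}) wanted ([0-9A-Fa-f]{8})"
)


def tally_csum_failures(log_text):
    """Count checksum failures per logical bytenr (hypothetical helper)."""
    return Counter(int(m.group(1)) for m in CSUM_RE.finditer(log_text))


# Shortened sample, copied from the check output quoted in this thread.
sample = """\
checksum verify failed on 2899180224512 found 7A6D427F wanted 7E899EE5
checksum verify failed on 2899180224512 found 7A6D427F wanted 7E899EE5
checksum verify failed on 2899180224512 found ABBE39B0 wanted E0735D0E
checksum verify failed on 2899180224512 found 7A6D427F wanted 7E899EE5
bytenr mismatch, want=2899180224512, have=3981076597540270796
checksum verify failed on 6983266418688 found 393B112A wanted 2B19CD5C
checksum verify failed on 6983266418688 found BCBF9E15 wanted 785FF67E
"""

counts = tally_csum_failures(sample)
# Print the blocks with the most failures first.
for bytenr, n in counts.most_common():
    print(bytenr, n)
```

Because the regular expression is applied with `finditer` over the whole text, it does not depend on each message being on its own line, which helps with logs pasted into mail.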
* Re: btrfs check --repair: failed to repair damaged filesystem, aborting 2017-05-03 6:32 ` Roman Mamedov @ 2017-05-03 20:40 ` Marc MERLIN 0 siblings, 0 replies; 77+ messages in thread From: Marc MERLIN @ 2017-05-03 20:40 UTC (permalink / raw) To: Roman Mamedov; +Cc: Btrfs BTRFS, Chris Mason, Qu Wenruo, David Sterba On Wed, May 03, 2017 at 11:32:26AM +0500, Roman Mamedov wrote: > > Actually, another thought: > > Is there or should there be a way to repair around the bit that cannot > > be repaired? > > Separately, or not, can I locate which bits are causing the repair to > > fail and maybe get a pointer to the path/inode so that I can hopefully > > just delete those bad data structures (assuming deleting them is even > > possible and that the FS won't just go read only as I try to do that) > > There is the "btrfs-corrupt-block" tool which helped me to kick Btrfsck > further along its course in a similar "unrepairable" situation. > https://www.spinics.net/lists/linux-btrfs/msg53061.html > > In your case it appears like the block 2899180224512 is giving it the most > trouble, so you could start with killing that one. From what I can tell this > tool zeroes out the entire block, so Btrfsck can simply delete the reference > and forget it, rather than repeatedly trying to figure out solutions and > bailing out with "failed to repair damaged filesystem, aborting". > > Depending on what was stored in it, you may have either no visible effect, or > a complete filesystem failure, or anything in between. Hence if you want to > experiment with this, find a way to work on writable overlay snapshots (also > described in the linked message). Thanks for the tip. This does not seem to have worked at all, though. Did I do something wrong? 
gargamel:/var/local/src/btrfs-progs# ./btrfs-corrupt-block -l 2899180224512 /dev/mapper/dshelf2 mirror 1 logical 2899180224512 physical 2814363009024 device /dev/mapper/dshelf2 corrupting 2899180224512 copy 1 mirror 2 logical 2899180224512 physical 2814899879936 device /dev/mapper/dshelf2 corrupting 2899180224512 copy 2 gargamel:/mnt/btrfs_pool1# btrfs check --repair /dev/mapper/dshelf2 (...) checksum verify failed on 2899180224512 found F25BEE55 wanted 00000000 checksum verify failed on 2899180224512 found F25BEE55 wanted 00000000 checksum verify failed on 2899180224512 found E5245DBD wanted 00000000 checksum verify failed on 2899180224512 found F25BEE55 wanted 00000000 bytenr mismatch, want=2899180224512, have=0 Repaired extent references for 3566695825408 ref mismatch on [6674127745024 32768] extent item 0, found 1 Backref 6674127745024 parent 7566652473344 owner 0 offset 0 num_refs 0 not found in extent tree Incorrect local backref count on 6674127745024 parent 7566652473344 owner 0 offset 0 found 1 wanted 0 back 0x555cb4e9ced0 backpointer mismatch on [6674127745024 32768] checksum verify failed on 6983266418688 found 393B112A wanted 2B19CD5C checksum verify failed on 6983266418688 found 393B112A wanted 2B19CD5C checksum verify failed on 6983266418688 found BCBF9E15 wanted 785FF67E checksum verify failed on 6983266418688 found 393B112A wanted 2B19CD5C bytenr mismatch, want=6983266418688, have=13671317608077697645 checksum verify failed on 2899180224512 found F25BEE55 wanted 00000000 checksum verify failed on 2899180224512 found F25BEE55 wanted 00000000 checksum verify failed on 2899180224512 found E5245DBD wanted 00000000 checksum verify failed on 2899180224512 found F25BEE55 wanted 00000000 bytenr mismatch, want=2899180224512, have=0 failed to repair damaged filesystem, aborting Marc -- "A mouse is a device used to point at the xterm you want to type in" - A.S.R. Microsoft is to operating systems .... .... 
what McDonalds is to gourmet cooking Home page: http://marc.merlins.org/ | PGP 1024R/763BE901 ^ permalink raw reply [flat|nested] 77+ messages in thread
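In the run above, the messages for block 2899180224512 changed from `wanted 7E899EE5` and `have=3981076597540270796` to `wanted 00000000` and `have=0`, which is consistent with the block header now reading back as zeros, yet check still aborts. A rough Python sketch for spotting such blocks in a saved log, assuming the message formats quoted in this thread; `zeroed_blocks` is a hypothetical helper, not a btrfs-progs feature.

```python
import re

CSUM_RE = re.compile(
    r"checksum verify failed on (\d+) "
    r"found [0-9A-Fa-f]{8} wanted ([0-9A-Fa-f]{8})"
)
MISMATCH_RE = re.compile(r"bytenr mismatch, want=(\d+), have=(\d+)")


def zeroed_blocks(log_text):
    """Return bytenrs whose stored checksum and stored bytenr both read as 0.

    Assumption: 'wanted' is the checksum stored in the block header and
    'have' is the bytenr stored there, as the messages suggest.
    """
    wanted = {}
    for m in CSUM_RE.finditer(log_text):
        wanted.setdefault(int(m.group(1)), set()).add(m.group(2))
    have_zero = {
        int(m.group(1))
        for m in MISMATCH_RE.finditer(log_text)
        if int(m.group(2)) == 0
    }
    return {b for b, w in wanted.items() if w == {"00000000"} and b in have_zero}


# Shortened sample, copied from the check output after btrfs-corrupt-block.
sample = """\
checksum verify failed on 2899180224512 found F25BEE55 wanted 00000000
checksum verify failed on 2899180224512 found E5245DBD wanted 00000000
bytenr mismatch, want=2899180224512, have=0
checksum verify failed on 6983266418688 found 393B112A wanted 2B19CD5C
checksum verify failed on 6983266418688 found BCBF9E15 wanted 785FF67E
bytenr mismatch, want=6983266418688, have=13671317608077697645
"""

print(sorted(zeroed_blocks(sample)))
```

On this sample only 2899180224512 is reported; 6983266418688 still carries non-zero header values, so it was not touched by the tool.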
* ctree.c:197: update_ref_for_cow: BUG_ON `ret` triggered, value -5 2017-05-02 5:11 ` Marc MERLIN 2017-05-02 18:47 ` btrfs check --repair: failed to repair damaged filesystem, aborting Marc MERLIN @ 2017-07-07 5:37 ` Marc MERLIN 2017-07-07 5:39 ` Marc MERLIN 1 sibling, 1 reply; 77+ messages in thread From: Marc MERLIN @ 2017-07-07 5:37 UTC (permalink / raw) To: Lu Fengqi; +Cc: Btrfs BTRFS, Josef Bacik, Qu Wenruo, David Sterba I'm still trying to fix my filesystem. It seems to work well enough since the damage is apparently localized, but I'd really want check --repair to actually bring it back to a working state, but now it's crashing This is btrfs tools from git from a few days ago Failed to find [4068943577088, 168, 16384] btrfs unable to find ref byte nr 4068943577088 parent 0 root 4 owner 1 offset 0 Failed to find [5905106075648, 168, 16384] btrfs unable to find ref byte nr 5906282119168 parent 0 root 4 owner 0 offset 1 Failed to find [21037056, 168, 16384] btrfs unable to find ref byte nr 21037056 parent 0 root 3 owner 1 offset 0 Failed to find [21053440, 168, 16384] btrfs unable to find ref byte nr 21053440 parent 0 root 3 owner 0 offset 1 Failed to find [21299200, 168, 16384] btrfs unable to find ref byte nr 21299200 parent 0 root 3 owner 0 offset 1 Failed to find [5523931971584, 168, 16384] btrfs unable to find ref byte nr 5524037566464 parent 0 root 3861 owner 3 offset 0 ctree.c:197: update_ref_for_cow: BUG_ON `ret` triggered, value -5 btrfs(+0x113cf)[0x5651e60443cf] btrfs(__btrfs_cow_block+0x576)[0x5651e6045848] btrfs(btrfs_cow_block+0xea)[0x5651e6045dc6] btrfs(btrfs_search_slot+0x11df)[0x5651e604969d] btrfs(+0x59184)[0x5651e608c184] btrfs(cmd_check+0x2bd4)[0x5651e60987b3] btrfs(main+0x85)[0x5651e60442c3] /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf1)[0x7f34f523d2b1] btrfs(_start+0x2a)[0x5651e6043e3a] Full log: enabling repair mode Checking filesystem on /dev/mapper/dshelf2 UUID: 85441c59-ad11-4b25-b1fe-974f9e4acede checking extents Fixed 0 
roots. checking free space cache cache and super generation don't match, space cache will be invalidated checking fs roots checksum verify failed on 3037243965440 found 179689AF wanted 82B97043 checksum verify failed on 3037243965440 found 179689AF wanted 82B97043 checksum verify failed on 3037243998208 found 60EA5C5B wanted 0CF5948F checksum verify failed on 3037243998208 found 60EA5C5B wanted 0CF5948F checksum verify failed on 3037244293120 found 38382803 wanted 39E4F85E checksum verify failed on 3037244293120 found 38382803 wanted 39E4F85E checksum verify failed on 3037244342272 found E84F1D8F wanted 472DA98C checksum verify failed on 3037244342272 found E84F1D8F wanted 472DA98C checksum verify failed on 3037244669952 found 2F6E4C0E wanted E00BBF09 checksum verify failed on 3037244669952 found 2F6E4C0E wanted E00BBF09 checksum verify failed on 3037248913408 found CE2E4AEE wanted EF22F9CA checksum verify failed on 3037248913408 found CE2E4AEE wanted EF22F9CA checksum verify failed on 3037248929792 found C989CB0E wanted E27527BC checksum verify failed on 3037248929792 found C989CB0E wanted E27527BC checksum verify failed on 3037247569920 found 05848C79 wanted EF3D5598 checksum verify failed on 3037247569920 found 05848C79 wanted EF3D5598 checksum verify failed on 3037247586304 found 9D1E4E39 wanted F1EC8135 checksum verify failed on 3037247586304 found 9D1E4E39 wanted F1EC8135 checksum verify failed on 3037247619072 found BFE40520 wanted 627DB20D checksum verify failed on 3037247619072 found BFE40520 wanted 627DB20D checksum verify failed on 3037249208320 found A6B5775F wanted B1E6C0FC checksum verify failed on 3037249208320 found A6B5775F wanted B1E6C0FC checksum verify failed on 3037252534272 found 207AD7DF wanted DE72BDF7 checksum verify failed on 3037252534272 found 207AD7DF wanted DE72BDF7 checksum verify failed on 3111569391616 found 3C623707 wanted D955D668 checksum verify failed on 3111569391616 found 3C623707 wanted D955D668 checksum verify failed on 
3111569768448 found 0C129F3C wanted C509003A checksum verify failed on 3111569768448 found 0C129F3C wanted C509003A checksum verify failed on 3111569735680 found E94C9D41 wanted 55836DD2 checksum verify failed on 3111569735680 found E94C9D41 wanted 55836DD2 checksum verify failed on 3037253435392 found 8E124EB5 wanted A3291C35 checksum verify failed on 3037253435392 found 8E124EB5 wanted A3291C35 checksum verify failed on 3037253746688 found 2B6A4DCD wanted 4323B339 checksum verify failed on 3037253746688 found 2B6A4DCD wanted 4323B339 checksum verify failed on 3111569702912 found 1048610C wanted 9856BB43 checksum verify failed on 3111569702912 found 1048610C wanted 9856BB43 checksum verify failed on 3111569801216 found CD7AAF82 wanted C1DA44DF checksum verify failed on 3111569801216 found CD7AAF82 wanted C1DA44DF checksum verify failed on 3037251878912 found 86FB02F3 wanted 728772CE checksum verify failed on 3037251878912 found 86FB02F3 wanted 728772CE checksum verify failed on 3037252861952 found CFD54426 wanted E91774C0 checksum verify failed on 3037252861952 found CFD54426 wanted E91774C0 checksum verify failed on 3037255974912 found E3655B7C wanted 8163FDDE checksum verify failed on 3037255974912 found E3655B7C wanted 8163FDDE checksum verify failed on 3037252927488 found E7AD88A3 wanted F6BA5B10 checksum verify failed on 3037252927488 found E7AD88A3 wanted F6BA5B10 checksum verify failed on 3037253500928 found 514A55B2 wanted 3611CD81 checksum verify failed on 3037253500928 found 514A55B2 wanted 3611CD81 checksum verify failed on 3037256105984 found 41ADA274 wanted 8F7F0A0B checksum verify failed on 3037256105984 found 41ADA274 wanted 8F7F0A0B Csum didn't match The following tree block(s) is corrupted in tree 3861: tree block bytenr: 1710573748224, level: 1, node key: (1073956, 12, 959325) Try to repair the btree for root 3861 Csum didn't match Csum didn't match Csum didn't match Csum didn't match Csum didn't match Csum didn't match Csum didn't match Csum 
didn't match Csum didn't match Csum didn't match Csum didn't match Csum didn't match Csum didn't match Csum didn't match Csum didn't match Csum didn't match Csum didn't match Csum didn't match Csum didn't match Csum didn't match Csum didn't match Csum didn't match Csum didn't match Csum didn't match Csum didn't match Csum didn't match Csum didn't match Csum didn't match Csum didn't match Csum didn't match Failed to find [4068943577088, 168, 16384] btrfs unable to find ref byte nr 4068943577088 parent 0 root 4 owner 1 offset 0 Failed to find [5905106075648, 168, 16384] btrfs unable to find ref byte nr 5906282119168 parent 0 root 4 owner 0 offset 1 Failed to find [21037056, 168, 16384] btrfs unable to find ref byte nr 21037056 parent 0 root 3 owner 1 offset 0 Failed to find [21053440, 168, 16384] btrfs unable to find ref byte nr 21053440 parent 0 root 3 owner 0 offset 1 Failed to find [21299200, 168, 16384] btrfs unable to find ref byte nr 21299200 parent 0 root 3 owner 0 offset 1 Failed to find [5523931971584, 168, 16384] btrfs unable to find ref byte nr 5524037566464 parent 0 root 3861 owner 3 offset 0 ctree.c:197: update_ref_for_cow: BUG_ON `ret` triggered, value -5 btrfs(+0x113cf)[0x5651e60443cf] btrfs(__btrfs_cow_block+0x576)[0x5651e6045848] btrfs(btrfs_cow_block+0xea)[0x5651e6045dc6] btrfs(btrfs_search_slot+0x11df)[0x5651e604969d] btrfs(+0x59184)[0x5651e608c184] btrfs(cmd_check+0x2bd4)[0x5651e60987b3] btrfs(main+0x85)[0x5651e60442c3] /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf1)[0x7f34f523d2b1] btrfs(_start+0x2a)[0x5651e6043e3a] Aborted gargamel:~# -- "A mouse is a device used to point at the xterm you want to type in" - A.S.R. Microsoft is to operating systems .... .... what McDonalds is to gourmet cooking Home page: http://marc.merlins.org/ ^ permalink raw reply [flat|nested] 77+ messages in thread
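The bracketed triples in the log above, such as `[4068943577088, 168, 16384]`, and the node key `(1073956, 12, 959325)` are btrfs keys of the form (objectid, type, offset). A small Python sketch for decoding the type field. The table is deliberately partial, the numeric values are the key type constants of the btrfs on-disk format as I understand them, and `describe_key` is a hypothetical helper rather than anything shipped with btrfs-progs.

```python
# Partial table of btrfs key types (assumed from the on-disk format headers).
KEY_TYPES = {
    1: "INODE_ITEM",
    12: "INODE_REF",
    84: "DIR_ITEM",
    96: "DIR_INDEX",
    108: "EXTENT_DATA",
    128: "EXTENT_CSUM",
    132: "ROOT_ITEM",
    168: "EXTENT_ITEM",
    169: "METADATA_ITEM",
    176: "TREE_BLOCK_REF",
    192: "BLOCK_GROUP_ITEM",
}


def describe_key(objectid, key_type, offset):
    """Render a key triple with a symbolic type name where one is known."""
    name = KEY_TYPES.get(key_type, "UNKNOWN(%d)" % key_type)
    if name == "EXTENT_ITEM":
        # For extent items the objectid is the logical start and the
        # offset is the extent length in bytes.
        return "EXTENT_ITEM bytenr=%d len=%d" % (objectid, offset)
    return "%s objectid=%d offset=%d" % (name, objectid, offset)


print(describe_key(4068943577088, 168, 16384))
print(describe_key(1073956, 12, 959325))
```

Read this way, the "Failed to find" lines refer to 16 KiB extent items that check could not locate, and the corrupted node in tree 3861 starts at an inode reference key.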
* Re: ctree.c:197: update_ref_for_cow: BUG_ON `ret` triggered, value -5 2017-07-07 5:37 ` ctree.c:197: update_ref_for_cow: BUG_ON `ret` triggered, value -5 Marc MERLIN @ 2017-07-07 5:39 ` Marc MERLIN 2017-07-07 9:33 ` Lu Fengqi 0 siblings, 1 reply; 77+ messages in thread From: Marc MERLIN @ 2017-07-07 5:39 UTC (permalink / raw) To: Lu Fengqi; +Cc: Btrfs BTRFS, Josef Bacik, Qu Wenruo, David Sterba On Thu, Jul 06, 2017 at 10:37:18PM -0700, Marc MERLIN wrote: > I'm still trying to fix my filesystem. > It seems to work well enough since the damage is apparently localized, but > I'd really want check --repair to actually bring it back to a working > state, but now it's crashing > > This is btrfs tools from git from a few days ago > > Failed to find [4068943577088, 168, 16384] > btrfs unable to find ref byte nr 4068943577088 parent 0 root 4 owner 1 offset 0 > Failed to find [5905106075648, 168, 16384] > btrfs unable to find ref byte nr 5906282119168 parent 0 root 4 owner 0 offset 1 > Failed to find [21037056, 168, 16384] > btrfs unable to find ref byte nr 21037056 parent 0 root 3 owner 1 offset 0 > Failed to find [21053440, 168, 16384] > btrfs unable to find ref byte nr 21053440 parent 0 root 3 owner 0 offset 1 > Failed to find [21299200, 168, 16384] > btrfs unable to find ref byte nr 21299200 parent 0 root 3 owner 0 offset 1 > Failed to find [5523931971584, 168, 16384] > btrfs unable to find ref byte nr 5524037566464 parent 0 root 3861 owner 3 offset 0 > ctree.c:197: update_ref_for_cow: BUG_ON `ret` triggered, value -5 > btrfs(+0x113cf)[0x5651e60443cf] > btrfs(__btrfs_cow_block+0x576)[0x5651e6045848] > btrfs(btrfs_cow_block+0xea)[0x5651e6045dc6] > btrfs(btrfs_search_slot+0x11df)[0x5651e604969d] > btrfs(+0x59184)[0x5651e608c184] > btrfs(cmd_check+0x2bd4)[0x5651e60987b3] > btrfs(main+0x85)[0x5651e60442c3] > /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf1)[0x7f34f523d2b1] > btrfs(_start+0x2a)[0x5651e6043e3a] Mmmh, never mind, it seems that the software raid suffered 
yet another double disk failure due to some undermined flakiness in the underlying block device cabling :-/ That would likely explain the failures here. > Full log: > enabling repair mode > Checking filesystem on /dev/mapper/dshelf2 > UUID: 85441c59-ad11-4b25-b1fe-974f9e4acede > checking extents > Fixed 0 roots. > checking free space cache > cache and super generation don't match, space cache will be invalidated > checking fs roots > checksum verify failed on 3037243965440 found 179689AF wanted 82B97043 > checksum verify failed on 3037243965440 found 179689AF wanted 82B97043 > checksum verify failed on 3037243998208 found 60EA5C5B wanted 0CF5948F > checksum verify failed on 3037243998208 found 60EA5C5B wanted 0CF5948F > checksum verify failed on 3037244293120 found 38382803 wanted 39E4F85E > checksum verify failed on 3037244293120 found 38382803 wanted 39E4F85E > checksum verify failed on 3037244342272 found E84F1D8F wanted 472DA98C > checksum verify failed on 3037244342272 found E84F1D8F wanted 472DA98C > checksum verify failed on 3037244669952 found 2F6E4C0E wanted E00BBF09 > checksum verify failed on 3037244669952 found 2F6E4C0E wanted E00BBF09 > checksum verify failed on 3037248913408 found CE2E4AEE wanted EF22F9CA > checksum verify failed on 3037248913408 found CE2E4AEE wanted EF22F9CA > checksum verify failed on 3037248929792 found C989CB0E wanted E27527BC > checksum verify failed on 3037248929792 found C989CB0E wanted E27527BC > checksum verify failed on 3037247569920 found 05848C79 wanted EF3D5598 > checksum verify failed on 3037247569920 found 05848C79 wanted EF3D5598 > checksum verify failed on 3037247586304 found 9D1E4E39 wanted F1EC8135 > checksum verify failed on 3037247586304 found 9D1E4E39 wanted F1EC8135 > checksum verify failed on 3037247619072 found BFE40520 wanted 627DB20D > checksum verify failed on 3037247619072 found BFE40520 wanted 627DB20D > checksum verify failed on 3037249208320 found A6B5775F wanted B1E6C0FC > checksum verify failed on 
3037249208320 found A6B5775F wanted B1E6C0FC > checksum verify failed on 3037252534272 found 207AD7DF wanted DE72BDF7 > checksum verify failed on 3037252534272 found 207AD7DF wanted DE72BDF7 > checksum verify failed on 3111569391616 found 3C623707 wanted D955D668 > checksum verify failed on 3111569391616 found 3C623707 wanted D955D668 > checksum verify failed on 3111569768448 found 0C129F3C wanted C509003A > checksum verify failed on 3111569768448 found 0C129F3C wanted C509003A > checksum verify failed on 3111569735680 found E94C9D41 wanted 55836DD2 > checksum verify failed on 3111569735680 found E94C9D41 wanted 55836DD2 > checksum verify failed on 3037253435392 found 8E124EB5 wanted A3291C35 > checksum verify failed on 3037253435392 found 8E124EB5 wanted A3291C35 > checksum verify failed on 3037253746688 found 2B6A4DCD wanted 4323B339 > checksum verify failed on 3037253746688 found 2B6A4DCD wanted 4323B339 > checksum verify failed on 3111569702912 found 1048610C wanted 9856BB43 > checksum verify failed on 3111569702912 found 1048610C wanted 9856BB43 > checksum verify failed on 3111569801216 found CD7AAF82 wanted C1DA44DF > checksum verify failed on 3111569801216 found CD7AAF82 wanted C1DA44DF > checksum verify failed on 3037251878912 found 86FB02F3 wanted 728772CE > checksum verify failed on 3037251878912 found 86FB02F3 wanted 728772CE > checksum verify failed on 3037252861952 found CFD54426 wanted E91774C0 > checksum verify failed on 3037252861952 found CFD54426 wanted E91774C0 > checksum verify failed on 3037255974912 found E3655B7C wanted 8163FDDE > checksum verify failed on 3037255974912 found E3655B7C wanted 8163FDDE > checksum verify failed on 3037252927488 found E7AD88A3 wanted F6BA5B10 > checksum verify failed on 3037252927488 found E7AD88A3 wanted F6BA5B10 > checksum verify failed on 3037253500928 found 514A55B2 wanted 3611CD81 > checksum verify failed on 3037253500928 found 514A55B2 wanted 3611CD81 > checksum verify failed on 3037256105984 found 41ADA274 
wanted 8F7F0A0B > checksum verify failed on 3037256105984 found 41ADA274 wanted 8F7F0A0B > Csum didn't match > The following tree block(s) is corrupted in tree 3861: > tree block bytenr: 1710573748224, level: 1, node key: (1073956, 12, 959325) > Try to repair the btree for root 3861 > Csum didn't match > Csum didn't match > Csum didn't match > Csum didn't match > Csum didn't match > Csum didn't match > Csum didn't match > Csum didn't match > Csum didn't match > Csum didn't match > Csum didn't match > Csum didn't match > Csum didn't match > Csum didn't match > Csum didn't match > Csum didn't match > Csum didn't match > Csum didn't match > Csum didn't match > Csum didn't match > Csum didn't match > Csum didn't match > Csum didn't match > Csum didn't match > Csum didn't match > Csum didn't match > Csum didn't match > Csum didn't match > Csum didn't match > Csum didn't match > Failed to find [4068943577088, 168, 16384] > btrfs unable to find ref byte nr 4068943577088 parent 0 root 4 owner 1 offset 0 > Failed to find [5905106075648, 168, 16384] > btrfs unable to find ref byte nr 5906282119168 parent 0 root 4 owner 0 offset 1 > Failed to find [21037056, 168, 16384] > btrfs unable to find ref byte nr 21037056 parent 0 root 3 owner 1 offset 0 > Failed to find [21053440, 168, 16384] > btrfs unable to find ref byte nr 21053440 parent 0 root 3 owner 0 offset 1 > Failed to find [21299200, 168, 16384] > btrfs unable to find ref byte nr 21299200 parent 0 root 3 owner 0 offset 1 > Failed to find [5523931971584, 168, 16384] > btrfs unable to find ref byte nr 5524037566464 parent 0 root 3861 owner 3 offset 0 > ctree.c:197: update_ref_for_cow: BUG_ON `ret` triggered, value -5 > btrfs(+0x113cf)[0x5651e60443cf] > btrfs(__btrfs_cow_block+0x576)[0x5651e6045848] > btrfs(btrfs_cow_block+0xea)[0x5651e6045dc6] > btrfs(btrfs_search_slot+0x11df)[0x5651e604969d] > btrfs(+0x59184)[0x5651e608c184] > btrfs(cmd_check+0x2bd4)[0x5651e60987b3] > btrfs(main+0x85)[0x5651e60442c3] > 
/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf1)[0x7f34f523d2b1] > btrfs(_start+0x2a)[0x5651e6043e3a] > Aborted > gargamel:~# > -- > "A mouse is a device used to point at the xterm you want to type in" - A.S.R. > Microsoft is to operating systems .... > .... what McDonalds is to gourmet cooking > Home page: http://marc.merlins.org/ -- "A mouse is a device used to point at the xterm you want to type in" - A.S.R. Microsoft is to operating systems .... .... what McDonalds is to gourmet cooking Home page: http://marc.merlins.org/ ^ permalink raw reply [flat|nested] 77+ messages in thread
* Re: ctree.c:197: update_ref_for_cow: BUG_ON `ret` triggered, value -5 2017-07-07 5:39 ` Marc MERLIN @ 2017-07-07 9:33 ` Lu Fengqi 2017-07-07 16:38 ` Marc MERLIN 0 siblings, 1 reply; 77+ messages in thread From: Lu Fengqi @ 2017-07-07 9:33 UTC (permalink / raw) To: Marc MERLIN; +Cc: Btrfs BTRFS, Josef Bacik, Qu Wenruo, David Sterba On Thu, Jul 06, 2017 at 10:39:53PM -0700, Marc MERLIN wrote: >On Thu, Jul 06, 2017 at 10:37:18PM -0700, Marc MERLIN wrote: >> I'm still trying to fix my filesystem. >> It seems to work well enough since the damage is apparently localized, but >> I'd really want check --repair to actually bring it back to a working >> state, but now it's crashing I apologise for my late reply. As a colleague left, I have to take over his work recently. >> >> This is btrfs tools from git from a few days ago >> >> Failed to find [4068943577088, 168, 16384] >> btrfs unable to find ref byte nr 4068943577088 parent 0 root 4 owner 1 offset 0 >> Failed to find [5905106075648, 168, 16384] >> btrfs unable to find ref byte nr 5906282119168 parent 0 root 4 owner 0 offset 1 >> Failed to find [21037056, 168, 16384] >> btrfs unable to find ref byte nr 21037056 parent 0 root 3 owner 1 offset 0 >> Failed to find [21053440, 168, 16384] >> btrfs unable to find ref byte nr 21053440 parent 0 root 3 owner 0 offset 1 >> Failed to find [21299200, 168, 16384] >> btrfs unable to find ref byte nr 21299200 parent 0 root 3 owner 0 offset 1 >> Failed to find [5523931971584, 168, 16384] >> btrfs unable to find ref byte nr 5524037566464 parent 0 root 3861 owner 3 offset 0 >> ctree.c:197: update_ref_for_cow: BUG_ON `ret` triggered, value -5 >> btrfs(+0x113cf)[0x5651e60443cf] >> btrfs(__btrfs_cow_block+0x576)[0x5651e6045848] >> btrfs(btrfs_cow_block+0xea)[0x5651e6045dc6] >> btrfs(btrfs_search_slot+0x11df)[0x5651e604969d] >> btrfs(+0x59184)[0x5651e608c184] >> btrfs(cmd_check+0x2bd4)[0x5651e60987b3] >> btrfs(main+0x85)[0x5651e60442c3] >> 
/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf1)[0x7f34f523d2b1] >> btrfs(_start+0x2a)[0x5651e6043e3a] > >Mmmh, never mind, it seems that the software raid suffered yet another >double disk failure due to some undermined flakiness in the underlying block >device cabling :-/ >That would likely explain the failures here. I'm sorry for hear this. Which raid level are you using? So could you recover from this double disk failure? -- Thanks, Lu ^ permalink raw reply [flat|nested] 77+ messages in thread
* Re: ctree.c:197: update_ref_for_cow: BUG_ON `ret` triggered, value -5 2017-07-07 9:33 ` Lu Fengqi @ 2017-07-07 16:38 ` Marc MERLIN 2017-07-09 4:34 ` 4.11.6 / more corruption / root 15455 has a root item with a more recent gen (33682) compared to the found root node (0) Marc MERLIN 0 siblings, 1 reply; 77+ messages in thread From: Marc MERLIN @ 2017-07-07 16:38 UTC (permalink / raw) To: Lu Fengqi; +Cc: Btrfs BTRFS, Josef Bacik, Qu Wenruo, David Sterba On Fri, Jul 07, 2017 at 05:33:20PM +0800, Lu Fengqi wrote: > I apologise for my late reply. As a colleague left, I have to take over his > work recently. no worries. > >Mmmh, never mind, it seems that the software raid suffered yet another > >double disk failure due to some undermined flakiness in the underlying block > >device cabling :-/ > >That would likely explain the failures here. > > I'm sorry for hear this. Which raid level are you using? So could you recover > from this double disk failure? The disks aren't failed, and the array wasn't being written to. It's just a matter of putting the disks back in the md raid5 array in the right order. Marc -- "A mouse is a device used to point at the xterm you want to type in" - A.S.R. Microsoft is to operating systems .... .... what McDonalds is to gourmet cooking Home page: http://marc.merlins.org/ ^ permalink raw reply [flat|nested] 77+ messages in thread
* 4.11.6 / more corruption / root 15455 has a root item with a more recent gen (33682) compared to the found root node (0) 2017-07-07 16:38 ` Marc MERLIN @ 2017-07-09 4:34 ` Marc MERLIN 2017-07-09 5:05 ` We really need a better/working btrfs check --repair Marc MERLIN ` (2 more replies) 0 siblings, 3 replies; 77+ messages in thread From: Marc MERLIN @ 2017-07-09 4:34 UTC (permalink / raw) To: Lu Fengqi; +Cc: Btrfs BTRFS, David Sterba Sigh, This is now the 3rd filesystem I have (on 3 different machines) that is getting corruption of some kind (on 4.11.6). This is starting to look suspicious :-/ Can I fix this filesystem in some other way? gargamel:/var/local/scr/host# btrfs check --repair /dev/mapper/crypt_bcache2 enabling repair mode Checking filesystem on /dev/mapper/crypt_bcache2 UUID: c4e6f9ca-e9a2-43d7-befa-763fc2cd5a57 checking extents ref mismatch on [14655689654272 16384] extent item 0, found 1 Backref 14655689654272 parent 15455 root 15455 not found in extent tree backpointer mismatch on [14655689654272 16384] owner ref check failed [14655689654272 16384] repair deleting extent record: key 14655689654272 169 1 adding new tree backref on start 14655689654272 len 16384 parent 0 root 15455 Repaired extent references for 14655689654272 root 15455 has a root item with a more recent gen (33682) compared to the found root node (0) ERROR: failed to repair root items: Invalid argument Recreating the filesystem is going to take me a week of work, a lot of if manual, and I'm not feeling very good with doing this since the backup server this is a backup of, is also seeing some hopefully minor) problems too. I really hope there isn't a new corruption problem in 4.11, because when I'm getting corruption on my laptop, my backup server, and the backup of my backup server, I'm starting to run out of redundant backups :( (and I'm not mentioning all the time this is costing me) Marc -- "A mouse is a device used to point at the xterm you want to type in" - A.S.R. 
Microsoft is to operating systems .... .... what McDonalds is to gourmet cooking Home page: http://marc.merlins.org/ ^ permalink raw reply [flat|nested] 77+ messages in thread
* We really need a better/working btrfs check --repair 2017-07-09 4:34 ` 4.11.6 / more corruption / root 15455 has a root item with a more recent gen (33682) compared to the found root node (0) Marc MERLIN @ 2017-07-09 5:05 ` Marc MERLIN 2017-07-09 6:34 ` 4.11.6 / more corruption / root 15455 has a root item with a more recent gen (33682) compared to the found root node (0) Marc MERLIN 2017-07-09 7:57 ` Martin Steigerwald 2 siblings, 0 replies; 77+ messages in thread From: Marc MERLIN @ 2017-07-09 5:05 UTC (permalink / raw) To: Lu Fengqi, Chris Mason; +Cc: Btrfs BTRFS, David Sterba +Chris On Sat, Jul 08, 2017 at 09:34:17PM -0700, Marc MERLIN wrote: > gargamel:/var/local/scr/host# btrfs check --repair /dev/mapper/crypt_bcache2 > enabling repair mode > Checking filesystem on /dev/mapper/crypt_bcache2 > UUID: c4e6f9ca-e9a2-43d7-befa-763fc2cd5a57 > checking extents > ref mismatch on [14655689654272 16384] extent item 0, found 1 > Backref 14655689654272 parent 15455 root 15455 not found in extent tree > backpointer mismatch on [14655689654272 16384] > owner ref check failed [14655689654272 16384] > repair deleting extent record: key 14655689654272 169 1 > adding new tree backref on start 14655689654272 len 16384 parent 0 root 15455 > Repaired extent references for 14655689654272 > root 15455 has a root item with a more recent gen (33682) compared to the found root node (0) > ERROR: failed to repair root items: Invalid argument On this note, getting hit 3 times on 3 different filesystems that are not badly damaged, where in none of those cases btrfs check --repair can put them back in a working state, is really bringing home the problem with the lack of a proper fsck. I understand that some errors are hard to fix without unknown data loss, but btrfs check --repair should just do what it takes to put the filesystem back into a consistent state, never mind what data is lost. Restoring 10 to 20TB of data is getting old and is not really an acceptable answer as the only way out.
I should not have to recreate a filesystem as the only way to bring it back to a working state. Before Duncan tells me my filesystem is too big, and I should keep to very small filesystems so that it's less work for each time btrfs gets corrupted again, and fails again to bring back the filesystem to a usable state after discarding some data, that's just not an acceptable answer long term, and by long term honestly I mean now. I just have data that doesn't segment well and the more small filesystems I make the more time I'm going to waste managing them all and dealing with which one gets full first :( So, whether 4.11 has a corruption problem, or not, please put some resources behind btrfs check --repair, be it the lowmem mode, or not. Thank you Marc -- "A mouse is a device used to point at the xterm you want to type in" - A.S.R. Microsoft is to operating systems .... .... what McDonalds is to gourmet cooking Home page: http://marc.merlins.org/ ^ permalink raw reply [flat|nested] 77+ messages in thread
* Re: 4.11.6 / more corruption / root 15455 has a root item with a more recent gen (33682) compared to the found root node (0) 2017-07-09 4:34 ` 4.11.6 / more corruption / root 15455 has a root item with a more recent gen (33682) compared to the found root node (0) Marc MERLIN 2017-07-09 5:05 ` We really need a better/working btrfs check --repair Marc MERLIN @ 2017-07-09 6:34 ` Marc MERLIN 2017-07-09 7:57 ` Martin Steigerwald 2 siblings, 0 replies; 77+ messages in thread From: Marc MERLIN @ 2017-07-09 6:34 UTC (permalink / raw) To: Lu Fengqi; +Cc: Btrfs BTRFS, David Sterba On Sat, Jul 08, 2017 at 09:34:17PM -0700, Marc MERLIN wrote: > Sigh, > > This is now the 3rd filesystem I have (on 3 different machines) that is > getting corruption of some kind (on 4.11.6). > This is starting to look suspicious :-/ > > Can I fix this filesystem in some other way? > gargamel:/var/local/scr/host# btrfs check --repair /dev/mapper/crypt_bcache2 > enabling repair mode > Checking filesystem on /dev/mapper/crypt_bcache2 > UUID: c4e6f9ca-e9a2-43d7-befa-763fc2cd5a57 > checking extents > ref mismatch on [14655689654272 16384] extent item 0, found 1 > Backref 14655689654272 parent 15455 root 15455 not found in extent tree > backpointer mismatch on [14655689654272 16384] > owner ref check failed [14655689654272 16384] > repair deleting extent record: key 14655689654272 169 1 > adding new tree backref on start 14655689654272 len 16384 parent 0 root 15455 > Repaired extent references for 14655689654272 > root 15455 has a root item with a more recent gen (33682) compared to the found root node (0) > ERROR: failed to repair root items: Invalid argument Mmmh, actually to be fair, this was the 2nd run, I didn't scroll back enough and missed the first run (doing too many recoveries at once, I'm getting mixed up). 
In this first run, it looks like a lot more happened: http://marc.merlins.org/tmp/btrfs_check_ds5.txt The number of things that went wrong here is very worrisome, given that there were no issues with those drives and that array has been working for over a year without problems, until I recently upgraded to 4.11 :( Now mind you, despite the 21MB of things that got fixed, I still kind of have the expectation that btrfs check --repair continues and fixes everything until the filesystem is clean again, just like e2fsck -f would, but I understand that this filesystem somehow got corrupted to a point that it's maybe not that simple to do so. Marc -- "A mouse is a device used to point at the xterm you want to type in" - A.S.R. Microsoft is to operating systems .... .... what McDonalds is to gourmet cooking Home page: http://marc.merlins.org/ | PGP 1024R/763BE901 ^ permalink raw reply [flat|nested] 77+ messages in thread
* Re: 4.11.6 / more corruption / root 15455 has a root item with a more recent gen (33682) compared to the found root node (0) 2017-07-09 4:34 ` 4.11.6 / more corruption / root 15455 has a root item with a more recent gen (33682) compared to the found root node (0) Marc MERLIN 2017-07-09 5:05 ` We really need a better/working btrfs check --repair Marc MERLIN 2017-07-09 6:34 ` 4.11.6 / more corruption / root 15455 has a root item with a more recent gen (33682) compared to the found root node (0) Marc MERLIN @ 2017-07-09 7:57 ` Martin Steigerwald 2017-07-09 9:16 ` Paul Jones 2017-07-31 21:07 ` Ivan Sizov 2 siblings, 2 replies; 77+ messages in thread From: Martin Steigerwald @ 2017-07-09 7:57 UTC (permalink / raw) To: Marc MERLIN; +Cc: Lu Fengqi, Btrfs BTRFS, David Sterba Hello Marc. Marc MERLIN - 08.07.17, 21:34: > Sigh, > > This is now the 3rd filesystem I have (on 3 different machines) that is > getting corruption of some kind (on 4.11.6). Anyone else getting corruptions with 4.11? I happily switch back to 4.10.17 or even 4.9 if that is the case. I may even do so just from your reports. Well, yes, I will do exactly that. I just switch back for 4.10 for now. Better be safe, than sorry. I know how you feel, Marc. I posted about a corruption on one of my backup harddisks here some time ago that btrfs check --repair wasn´t able to handle. I redid that disk from scratch and it took a long, long time. I agree with you that this has to stop. Before that I will never *ever* recommend this to a customer. Ideally no corruptions in stable kernels, especially when its a .6 at the end of the version number. But if so… then fixable. Other filesystems like Ext4 and XFS can do it… so this should be possible with BTRFS as well. Thanks, -- Martin ^ permalink raw reply [flat|nested] 77+ messages in thread
* RE: 4.11.6 / more corruption / root 15455 has a root item with a more recent gen (33682) compared to the found root node (0) 2017-07-09 7:57 ` Martin Steigerwald @ 2017-07-09 9:16 ` Paul Jones 2017-07-09 11:17 ` Duncan 2017-07-31 21:07 ` Ivan Sizov 1 sibling, 1 reply; 77+ messages in thread From: Paul Jones @ 2017-07-09 9:16 UTC (permalink / raw) To: Martin Steigerwald, Marc MERLIN; +Cc: Btrfs BTRFS > -----Original Message----- > From: linux-btrfs-owner@vger.kernel.org [mailto:linux-btrfs-owner@vger.kernel.org] On Behalf Of Martin Steigerwald > Sent: Sunday, 9 July 2017 5:58 PM > To: Marc MERLIN <marc@merlins.org> > Cc: Lu Fengqi <lufq.fnst@cn.fujitsu.com>; Btrfs BTRFS <linux-btrfs@vger.kernel.org>; David Sterba <dsterba@suse.cz> > Subject: Re: 4.11.6 / more corruption / root 15455 has a root item with a more > recent gen (33682) compared to the found root node (0) > > Hello Marc. > > Marc MERLIN - 08.07.17, 21:34: > > Sigh, > > > > This is now the 3rd filesystem I have (on 3 different machines) that > > is getting corruption of some kind (on 4.11.6). > > Anyone else getting corruptions with 4.11? > > I happily switch back to 4.10.17 or even 4.9 if that is the case. I may even do > so just from your reports. Well, yes, I will do exactly that. I just switch back > for 4.10 for now. Better be safe, than sorry. No corruption for me - I've been on 4.11 since about .2 and everything seems fine. Currently on 4.11.8 Paul. ^ permalink raw reply [flat|nested] 77+ messages in thread
* Re: 4.11.6 / more corruption / root 15455 has a root item with a more recent gen (33682) compared to the found root node (0) 2017-07-09 9:16 ` Paul Jones @ 2017-07-09 11:17 ` Duncan 2017-07-09 13:00 ` Martin Steigerwald 2017-07-29 19:29 ` Imran Geriskovan 0 siblings, 2 replies; 77+ messages in thread From: Duncan @ 2017-07-09 11:17 UTC (permalink / raw) To: linux-btrfs Paul Jones posted on Sun, 09 Jul 2017 09:16:36 +0000 as excerpted: >> Marc MERLIN - 08.07.17, 21:34: >> > >> > This is now the 3rd filesystem I have (on 3 different machines) that >> > is getting corruption of some kind (on 4.11.6). >> >> Anyone else getting corruptions with 4.11? >> >> I happily switch back to 4.10.17 or even 4.9 if that is the case. I may >> even do so just from your reports. Well, yes, I will do exactly that. I >> just switch back for 4.10 for now. Better be safe, than sorry. > > No corruption for me - I've been on 4.11 since about .2 and everything > seems fine. Currently on 4.11.8 No corruptions here either. 4.12.0 now, previously 4.12-rc5(ish, git), before that 4.11.0. I have however just upgraded to new ssds then wiped and setup the old ones as another backup set, so everything is on brand new filesystems on fast ssds, no possibility of old undetected corruption suddenly triggering problems. Also, all my btrfs are raid1 or dup for checksummed redundancy, and relatively small, the largest now 80 GiB per device, after the upgrade. And my use-case doesn't involve snapshots or subvolumes. So any bug that is most likely on older filesystems, say those without the no-holes feature, for instance, or that doesn't tend to hit raid1 or dup mode, or that is less likely on small filesystems on fast ssds, or that triggers most often with reflinks and thus on filesystems with snapshots, is unlikely to hit me. -- Duncan - List replies preferred. No HTML msgs. "Every nonfree program has a lord, a master -- and if you use the program, he is your master." 
Richard Stallman ^ permalink raw reply [flat|nested] 77+ messages in thread
* Re: 4.11.6 / more corruption / root 15455 has a root item with a more recent gen (33682) compared to the found root node (0) 2017-07-09 11:17 ` Duncan @ 2017-07-09 13:00 ` Martin Steigerwald 2017-07-29 19:29 ` Imran Geriskovan 1 sibling, 0 replies; 77+ messages in thread From: Martin Steigerwald @ 2017-07-09 13:00 UTC (permalink / raw) To: linux-btrfs Hello Duncan. Duncan - 09.07.17, 11:17: > Paul Jones posted on Sun, 09 Jul 2017 09:16:36 +0000 as excerpted: > >> Marc MERLIN - 08.07.17, 21:34: > >> > This is now the 3rd filesystem I have (on 3 different machines) that > >> > is getting corruption of some kind (on 4.11.6). > >> > >> Anyone else getting corruptions with 4.11? > >> > >> I happily switch back to 4.10.17 or even 4.9 if that is the case. I may > >> even do so just from your reports. Well, yes, I will do exactly that. I > >> just switch back for 4.10 for now. Better be safe, than sorry. > > > > No corruption for me - I've been on 4.11 since about .2 and everything > > seems fine. Currently on 4.11.8 > > No corruptions here either. 4.12.0 now, previously 4.12-rc5(ish, git), > before that 4.11.0. > > I have however just upgraded to new ssds then wiped and setup the old […] > Also, all my btrfs are raid1 or dup for checksummed redundancy, and > relatively small, the largest now 80 GiB per device, after the upgrade. > And my use-case doesn't involve snapshots or subvolumes. > > So any bug that is most likely on older filesystems, say those without > the no-holes feature, for instance, or that doesn't tend to hit raid1 or > dup mode, or that is less likely on small filesystems on fast ssds, or > that triggers most often with reflinks and thus on filesystems with > snapshots, is unlikely to hit me. Hmmm, the BTRFS filesystems on my laptop 3 to 5 or even more years old. I stick with 4.10 for now, I think. The older ones are RAID 1 across two SSDs, the newer one is single device, on one SSD. 
These filesystems didn´t fail me in years and since 4.5 or 4.6 even the "I search for free space" kernel hang (hung tasks and all that) is gone as well. Thanks, -- Martin ^ permalink raw reply [flat|nested] 77+ messages in thread
* Re: 4.11.6 / more corruption / root 15455 has a root item with a more recent gen (33682) compared to the found root node (0) 2017-07-09 11:17 ` Duncan 2017-07-09 13:00 ` Martin Steigerwald @ 2017-07-29 19:29 ` Imran Geriskovan 2017-07-29 23:38 ` Duncan 1 sibling, 1 reply; 77+ messages in thread From: Imran Geriskovan @ 2017-07-29 19:29 UTC (permalink / raw) To: Duncan; +Cc: linux-btrfs On 7/9/17, Duncan <1i5t5.duncan@cox.net> wrote: > I have however just upgraded to new ssds then wiped and setup the old > ones as another backup set, so everything is on brand new filesystems on > fast ssds, no possibility of old undetected corruption suddenly > triggering problems. > > Also, all my btrfs are raid1 or dup for checksummed redundancy Do you have any experience/advice/comment regarding dup data on ssds? ^ permalink raw reply [flat|nested] 77+ messages in thread
* Re: 4.11.6 / more corruption / root 15455 has a root item with a more recent gen (33682) compared to the found root node (0) 2017-07-29 19:29 ` Imran Geriskovan @ 2017-07-29 23:38 ` Duncan 2017-07-30 14:54 ` Imran Geriskovan 0 siblings, 1 reply; 77+ messages in thread From: Duncan @ 2017-07-29 23:38 UTC (permalink / raw) To: linux-btrfs Imran Geriskovan posted on Sat, 29 Jul 2017 21:29:46 +0200 as excerpted: > On 7/9/17, Duncan <1i5t5.duncan@cox.net> wrote: >> I have however just upgraded to new ssds then wiped and setup the old >> ones as another backup set, so everything is on brand new filesystems on >> fast ssds, no possibility of old undetected corruption suddenly >> triggering problems. >> >> Also, all my btrfs are raid1 or dup for checksummed redundancy > > Do you have any experience/advice/comment regarding > dup data on ssds? Very good question. =:^) Limited. Most of my btrfs are raid1, with dup only used on the device- respective /boot btrfs (of which there are four, one on each of the two ssds that otherwise form the btrfs raid1 pairs, for each of the working and backup copy pairs -- I can use BIOS to select any of the four to boot), and those are all sub-GiB mixed-bg mode. So all my dup experience is sub-GiB mixed-blockgroup mode. Within that limitation, my only btrfs problem has been that at my initially chosen size of 256 MiB, mkfs.btrfs at least used to create an initial data/metadata chunk of 64 MiB. Remember, this is dup mode, so there's two of them = 128 MiB. Because there's also a system chunk, that means the initial chunk cannot be balanced even with an entirely empty filesystem, because there's not enough space to write a second 64 MiB chunk duped to 128 MiB. 
Between that and the 256 MiB in dup mode size meaning under 128 MiB usable, and the fact that I routinely run and sometimes need to bisect pre-release kernels, I was routinely running out of space, then cleaning up, but not being able to do a full cleanup without a blow-away and new mkfs.btrfs, because I couldn't balance. When I recently purchased the second pair of (now larger) ssds in order to put everything, including the media and backups that were previously still on spinning rust, on ssd, I redid the layout and made the /boots 512 MiB, still mixed-bg dup mode. That seems to have solved the problem, and I can now rebalance the first mkfs.btrfs-created mixed-bg chunk, as it's now small enough that it's less than half the filesystem even when duped. Because it's now 512 MiB, however, I can't say for sure whether the previous problem is fixed or not. That problem was mkfs.btrfs creating an initial mixed-bg chunk of a quarter the 256 MiB filesystem size, so in dup mode it took half the total filesystem size, and with the system chunk partially using the other half, there was no space to write the balance destination chunks. What I can say is that the problem doesn't affect the new 512 MiB size, at least with btrfs-progs 4.11.x, which is what I used to mkfs.btrfs the new layout. -- Duncan - List replies preferred. No HTML msgs. "Every nonfree program has a lord, a master -- and if you use the program, he is your master." Richard Stallman ^ permalink raw reply [flat|nested] 77+ messages in thread
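The dup-mode arithmetic Duncan describes above can be sketched in a few lines of shell. This is only an illustration under stated assumptions: the function name `can_balance` is hypothetical, and `SYSTEM_MIB` is a placeholder, since the message says a system chunk exists but not how much space it takes. The model is that a balance needs room for a second duplicated copy of the chunk while the first is still allocated.

```shell
#!/bin/sh
# Sketch of the dup-mode space arithmetic. All sizes are in MiB.
# SYSTEM_MIB is an assumed placeholder for the space the system chunk
# occupies; the original message does not give a figure.
SYSTEM_MIB=8

# can_balance FS_MIB CHUNK_MIB  (hypothetical helper)
# Exit status 0 if a dup filesystem of FS_MIB has room to write a second
# duplicated copy of one CHUNK_MIB chunk next to the existing one.
can_balance() {
    fs_mib=$1
    chunk_mib=$2
    allocated=$(( 2 * chunk_mib + SYSTEM_MIB ))  # existing chunk (duped) + system chunk
    needed=$(( 2 * chunk_mib ))                  # balance destination chunk (duped)
    [ $(( allocated + needed )) -le "$fs_mib" ]
}

for fs in 256 512; do
    if can_balance "$fs" 64; then
        echo "$fs MiB: a 64 MiB chunk can be balanced"
    else
        echo "$fs MiB: no room to balance a 64 MiB chunk"
    fi
done
```

With any non-zero system chunk, the 256 MiB case fails (128 MiB allocated plus 128 MiB needed already fills the filesystem), while the 512 MiB case has room to spare, which matches the behaviour reported in the message.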
* Re: 4.11.6 / more corruption / root 15455 has a root item with a more recent gen (33682) compared to the found root node (0) 2017-07-29 23:38 ` Duncan @ 2017-07-30 14:54 ` Imran Geriskovan 2017-07-31 4:53 ` Duncan 0 siblings, 1 reply; 77+ messages in thread From: Imran Geriskovan @ 2017-07-30 14:54 UTC (permalink / raw) To: Duncan; +Cc: linux-btrfs On 7/30/17, Duncan <1i5t5.duncan@cox.net> wrote: >>> Also, all my btrfs are raid1 or dup for checksummed redundancy >> Do you have any experience/advice/comment regarding >> dup data on ssds? > Very good question. =:^) > Limited. Most of my btrfs are raid1, with dup only used on the device- > respective /boot btrfs (of which there are four, one on each of the two > ssds that otherwise form the btrfs raid1 pairs, for each of the working > and backup copy pairs -- I can use BIOS to select any of the four to > boot), and those are all sub-GiB mixed-bg mode. Is this a military or deep space device? ;) > So all my dup experience is sub-GiB mixed-blockgroup mode. > > Within that limitation, my only btrfs problem has been that at my > initially chosen size of 256 MiB, mkfs.btrfs at least used to create an > initial data/metadata chunk of 64 MiB. Remember, this is dup mode, so > there's two of them = 128 MiB. Because there's also a system chunk, that > means the initial chunk cannot be balanced even with an entirely empty > filesystem, because there's not enough space to write a second 64 MiB > chunk duped to 128 MiB. For /boot, I've also tried dup data. But because of combinations of constraints you've mentioned, I totally give-up trying to have a bullet proof /boot as my poor laptop is not mission critical as your device and as I do always have bootable backups and always carry some bootable sdcards. Perhaps that has something to do with me kicking out all systemd, inits, initramfs, mkinitcpio, dracut, etc, etc. Now the init on /boot is a "19 lines" shell script, including lines for keymap, hdparm, crytpsetup. 
And let's not forget this is possible thanks to a custom kernel and its reliable buddy syslinux. Interestingly, my search for reliability started with "dup data" and ended up here. :) ^ permalink raw reply [flat|nested] 77+ messages in thread
* Re: 4.11.6 / more corruption / root 15455 has a root item with a more recent gen (33682) compared to the found root node (0) 2017-07-30 14:54 ` Imran Geriskovan @ 2017-07-31 4:53 ` Duncan 2017-07-31 20:32 ` Imran Geriskovan 0 siblings, 1 reply; 77+ messages in thread From: Duncan @ 2017-07-31 4:53 UTC (permalink / raw) To: linux-btrfs Imran Geriskovan posted on Sun, 30 Jul 2017 16:54:25 +0200 as excerpted: > On 7/30/17, Duncan <1i5t5.duncan@cox.net> wrote: >>>> Also, all my btrfs are raid1 or dup for checksummed redundancy > >>> Do you have any experience/advice/comment regarding dup data on ssds? > >> Very good question. =:^) > >> Limited. Most of my btrfs are raid1, with dup only used on the device- >> respective /boot btrfs (of which there are four, one on each of the two >> ssds that otherwise form the btrfs raid1 pairs, for each of the working >> and backup copy pairs -- I can use BIOS to select any of the four to >> boot), and those are all sub-GiB mixed-bg mode. > > Is this a military or deep space device? ;) Just happens to have four physical ssds, two pairs, with everything but /boot being paired btrfs raid1. Because I wanted similar partition layout for ease of management, that's a /boot on each one, and because bios can only point to one at a time, that's four separate grub installs [1], each of which is configured to load its own /boot. While four is a bit much, three can certainly be very useful, because it allows a bad grub upgrade to be core-installed to one BIOS-boot partition, while allowing me to fat-finger point it to the wrong /boot on a second device destroying my ability to boot to it as well, and still have a third untouched to boot from. The fourth is simply bonus insurance on that, more by accident due to having two pairs than because I really needed it. A minimum of three /boots is also quite convenient for my kernel update routine, given I routinely test and sometimes bisect pre-release kernels.
The default/working /boot gets the prereleases with a release and stable fallback, the first backup the releases and a stable fallback, and the secondary backups get updated less frequently, generally when I'm doing a / backup cycle as well and there has been either a kernel config or system change substantial enough that I'm no longer confident the older kernels will work correctly with the updated system. Of course the same general testing/release/stable /boot system works well for other related updates, say to the grub menu (I use grub2's bash-like scripting language directly, not the high level stuff which I find too difficult to tweak to my liking) or the initrd, which I attach to the individual kernels at build-time, so a tested kernel selection is a tested initramfs selection as well. > For /boot, I've also tried dup data. > > But because of combinations of constraints you've mentioned, > I totally give-up trying to have a bullet proof /boot as my poor laptop > is not mission critical as your device and as I do always have bootable > backups and always carry some bootable sdcards. When I complained about the 64-MiB default mixed-bg mode chunk size on a 256 MiB filesystem being too big to allow balance in dup mode, a dev answered that in theory chunk sizes are supposed to be limited to 1/8 filesystem size (down to something like a 16 MiB minimum chunk size I think, but might be 8 or 32), but something about my setup, likely the mixed-bg mode as it's less tested, was short-circuiting that, thus the quarter-fs-size 64 MiB chunk sizes, which he agreed didn't make much sense on a 256 MiB filesystem in dup mode. He was able to duplicate the problem, and there seemed no disagreement it was a bug, but I'm not sure if mkfs.btrfs was ever patched to fix it, and of course now with the bigger half-gig filesystem the same 64-MiB initial chunk size is fine.
And my other quarter-gig btrfs, log, is raid1, quarter-gig per device, so I'd not see the problem there, mixed-mode or not. (As mentioned in the footnote below, at least in this go-round it's not... more by accident than intent.) Meanwhile, such bugs come with the territory when you're running what might be roughly compared at the commercial software level to late beta or rc level software, or even initial release, pre-service-release-1, level, which I'd argue is a more accurate btrfs comparison at this point. As long as you stay within the known stable areas the danger of it eating your data is relatively small now, but the full feature set isn't there yet, and some of the features that are there are significantly less mature and stable than others. > Perhaps that has something to do with me kicking out all systemd, inits, > initramfs, mkinitcpio, dracut, etc, etc. > > Now the init on /boot is a "19 lines" shell script, including lines for > keymap, hdparm, crytpsetup. And let's not forget this is possible by a > custom kernel, its reliable buddy syslinux. FWIW... I really like grub2, especially it's quite flexible bash-like scripting language (the higher level stuff intended for normal users just isn't flexible enough for me, so I need the scripting language anyway, and once I knew that, the higher level stuff only got in the way) and command line that allow all sorts of stuff like browsing for kernel commandline documentation at the boot prompt that I never imagined possible in a boot manager. And after holding off for awhile, I'm now a cautious adopter and supporter of systemd in general, tho I don't use its solutions for /everything/ and don't like its extremely aggressive feature expansion. 
And after resisting an initr* for years as unnecessary, I've been a reluctant adopter since a btrfs raid1 root effectively requires it (rootflags=device= doesn't seem to work, for whatever reason, or at least didn't when I initially converted to btrfs, so at least a limited initr* seems the only viable solution for a btrfs raid1 root). And I'm using dracut for that, tho quite cut down from its default, with a monolithic kernel and only installing necessary dracut modules. But particularly after the last dracut update pulled in kmod as a mandatory dep as it now links against its libs, despite my monolithic kernel built without module support, I've been considering similar initr* alternatives, including hand-rolling my own initr* build scripts. Because I'm still not happy having to run an initr* at all, especially since there's more "magic" there than I'm particularly comfortable with since I like to grok the boot and thus potential recovery process better than I do this, and dracut was just the most convenient option at the time. But kmod isn't a /huge/ dep, particularly with the executables and docs install-masked so it's only the library, headers and *.pc config file installed, and the current dracut solution works /reasonably/ well, so finding/creating an alternative isn't particularly high on my priority list, and I'll probably never do it unless dracut suddenly decides some of its other modules are going to need mandatory deps, or something else radically changes the current fragile balance and I really do need that currently lacking initr* grok. > Interestingly my seach for reliability started with "dup data" and ended > up here. :) =:^) --- [1] Grub and partition layout: I install grub-core (i386-pc) to a raw GPT legacy BIOS boot partition. 
While this only requires a partition size of about a third of a MiB, I use gdisk's default 1 MiB alignment and the first MiB is the GPT and the alignment gap, so this first BIOS boot partition starts at 1 MiB and must be a whole MiB unit in size. Because I wanted plenty of room, however, and wanted additional partitions a minimum of 4 MiB aligned, I configured a 3 MiB BIOS boot partition for grub to use, thus accomplishing that 4 MiB alignment for further partitions.

The second partition is a currently unused GPT EFI partition for forward compatibility, 252 MiB in size so further partitions are quarter-GiB aligned.

The third partition is the /boot partition we've been discussing, a half GiB in size, thus ending at 3/4 GiB. It's my only btrfs mixed-mode dup in the layout, so a half gig in size but a quarter gig usable. As mentioned, with four physical ssds that's a total of four /boots, each pointed at by the grub-core installation in the first partition on the corresponding ssd.

Partition 4 is the log partition, a quarter GiB in size as log rotation keeps typical usage under 50 MiB, but the quarter gig size means it ends on the 1 GiB boundary and further partitions are GiB aligned. In the last layout generation this was a half gig and /boot a quarter gig, but I decided /boot could use the extra quarter gig more than log so I traded sizes. This, like all further partitions, is btrfs raid1. I intended to make it mixed-bg mode, as it was in the previous generation layout, but forgot the mkfs.btrfs switch for that and it no longer defaults to mixed at under a gig, so I got standard mode. Nevertheless, with raid1 instead of dup, and low normal usage, the chunk size is small enough that balance shouldn't be an issue, and if it is I can always blow it away and recreate in mixed mode.

All further partitions are gig-aligned btrfs raid1 pair-device, three copies, working/0 and backups 1 and 2, on two separate pairs of ssds.
The older pair is 256GB/238GiB with the backup/1 copy, the newer pair is 1TB/931GiB with working/0 and backup/2. The partition size and layout is identical on all four thru the sub-GiB and first copy, with the second copy on the larger pair being a same-sequence same-size repeat of the first, beyond the non-duplicated sub-GiB, of course. So as long as the GPT on one of the four remains intact and bootable, I can easily recreate the other three. -- Duncan - List replies preferred. No HTML msgs. "Every nonfree program has a lord, a master -- and if you use the program, he is your master." Richard Stallman ^ permalink raw reply [flat|nested] 77+ messages in thread
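
The alignment arithmetic in the footnote above can be checked with a few lines of shell. This is only a sketch: the partition names (gap, biosboot, efi, boot, log) and the helper name layout_ends are made up for illustration, while the sizes in MiB are the ones given in the text, with the leading 1 MiB standing for the GPT plus alignment gap.

```sh
#!/bin/sh
# Print the cumulative end offset of each partition, in MiB.
# Arguments are name:size_in_MiB pairs in on-disk order.
# The names are illustrative labels, not real partition labels.
layout_ends() {
    end=0
    for part in "$@"; do
        name=${part%%:*}      # text before the colon
        size=${part##*:}      # text after the colon
        end=$((end + size))
        printf '%-10s size=%4d MiB  ends at %4d MiB\n' "$name" "$size" "$end"
    done
}

# Sizes as described in the footnote: 1 MiB GPT/gap, 3 MiB BIOS boot,
# 252 MiB EFI, 512 MiB /boot, 256 MiB log.
layout_ends gap:1 biosboot:3 efi:252 boot:512 log:256
```

The running totals land on 4 MiB, 256 MiB, 768 MiB (3/4 GiB) and 1024 MiB (1 GiB), which are the boundaries the footnote describes.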
* Re: 4.11.6 / more corruption / root 15455 has a root item with a more recent gen (33682) compared to the found root node (0)
  2017-07-31 4:53 ` Duncan
@ 2017-07-31 20:32 ` Imran Geriskovan
  2017-08-01 1:36 ` Duncan
  0 siblings, 1 reply; 77+ messages in thread
From: Imran Geriskovan @ 2017-07-31 20:32 UTC (permalink / raw)
  To: Duncan; +Cc: linux-btrfs

>>>> Do you have any experience/advice/comment regarding dup data on ssds?
>>> Very good question. =:^)
>> Now the init on /boot is a "19 lines" shell script, including lines for
>> keymap, hdparm, cryptsetup. And let's not forget this is possible by a
>> custom kernel and its reliable buddy syslinux.
>
> FWIW...
> And I'm using dracut for that, tho quite cut down from its default, with
> a monolithic kernel and only installing necessary dracut modules.

Just create minimal bootable /boot for running below init.
(Your initramfs/rd is a bloated and packaged version of this anyway.)
Kick the rest. Since you have your own kernel you are not far away from it.

#!/bin/sh
# This is actually busybox ash or hush. Can't remember now.
# You may compile/customize your busybox as well. Easy.

mount proc /proc -t proc
mount sys /sys -t sysfs
mount run /run -t tmpfs

mkdir /dev/pts /dev/shm /run/lock
mount devpts /dev/pts -t devpts &
mount shm /dev/shm -t tmpfs &
mount -o remount,rw,noatime / &
# '&' is for backgrounding/parallel_execution.
# Use responsibly double checking its side effects
# depending on your setup.

hdparm -B 254 /dev/sda &
loadkmap < /boot/trq.bkmap

cryptsetup -T 10 luksOpen /dev/sdXX sdXX
mount /dev/mapper/sdXX /mnt/new_root -t btrfs -o noatime,compress=lzo

cd /mnt/new_root
mount --move /dev ./dev
mount --move /proc ./proc
mount --move /sys ./sys
mount --move /run ./run

pivot_root . boot
exec chroot . busybox init
# Jump to your real root's init. Whatever it may be.

> But particularly after the last dracut update pulled in kmod as a
> mandatory dep as it now links against its libs, despite my monolithic
> kernel built without module support, I've been considering similar initr*
> alternatives, including hand-rolling my own initr* build scripts.
>
> Because I'm still not happy having to run an initr* at all, especially
> since there's more "magic" there than I'm particularly comfortable with
> since I like to grok the boot and thus potential recovery process better
> than I do this, and dracut was just the most convenient option at the
> time.

>> Interestingly my search for reliability started with "dup data" and ended
>> up here. :)
> =:^)

^ permalink raw reply [flat|nested] 77+ messages in thread
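
Before relying on a hand-rolled init like the one above, it may be worth confirming that every tool it calls is actually present in the boot environment. The following is a rough sketch, not part of the original script: missing_tools is a hypothetical helper, and it only checks PATH, so busybox applets that are built in but not symlinked would be reported as missing even though "busybox <applet>" works.

```sh
#!/bin/sh
# Print the names of the given commands that cannot be found on PATH.
# Prints an empty line if all of them are available.
missing_tools() {
    missing=""
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || missing="$missing $tool"
    done
    # Strip the single leading space left by the accumulation above.
    printf '%s\n' "${missing# }"
}

# Commands the init script quoted above depends on.
missing_tools mount mkdir hdparm loadkmap cryptsetup pivot_root chroot busybox
```

Run from inside the minimal /boot environment (or a chroot of it), an empty result suggests the script has what it needs; any names printed are the ones to add.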
* Re: 4.11.6 / more corruption / root 15455 has a root item with a more recent gen (33682) compared to the found root node (0)
  2017-07-31 20:32 ` Imran Geriskovan
@ 2017-08-01 1:36 ` Duncan
  2017-08-01 15:18 ` Imran Geriskovan
  0 siblings, 1 reply; 77+ messages in thread
From: Duncan @ 2017-08-01 1:36 UTC (permalink / raw)
  To: linux-btrfs

Imran Geriskovan posted on Mon, 31 Jul 2017 22:32:39 +0200 as excerpted:

>>> Now the init on /boot is a "19 lines" shell script, including lines
>>> for keymap, hdparm, cryptsetup. And let's not forget this is possible
>>> by a custom kernel and its reliable buddy syslinux.
>>
>> FWIW...
>> And I'm using dracut for that, tho quite cut down from its default,
>> with a monolithic kernel and only installing necessary dracut modules.
>
> Just create minimal bootable /boot for running below init.
> (Your initramfs/rd is a bloated and packaged version of this anyway.)
> Kick the rest. Since you have your own kernel you are not far away
> from it.

Thanks. You just solved my primary problem of needing to take the time to actually research all the steps and in what order I needed to do them, for a hand-rolled script. =:^)

Unfortunately, while I've been laid-up the last ~5 days due to a twisted knee and have been spending more time on the lists, etc, and would have loved to spend a day or so testing and setting this up, I'm back to work tomorrow, so I've no idea when I'll actually get to play with this.

But meanwhile, I'm saving your message for reference when the time comes. It should be /very/ useful! =:^)

--
Duncan - List replies preferred. No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master." Richard Stallman

^ permalink raw reply [flat|nested] 77+ messages in thread
* Re: 4.11.6 / more corruption / root 15455 has a root item with a more recent gen (33682) compared to the found root node (0)
  2017-08-01 1:36 ` Duncan
@ 2017-08-01 15:18 ` Imran Geriskovan
  0 siblings, 0 replies; 77+ messages in thread
From: Imran Geriskovan @ 2017-08-01 15:18 UTC (permalink / raw)
  To: Duncan; +Cc: linux-btrfs

On 8/1/17, Duncan <1i5t5.duncan@cox.net> wrote:
> Imran Geriskovan posted on Mon, 31 Jul 2017 22:32:39 +0200 as excerpted:
>>>> Now the init on /boot is a "19 lines" shell script, including lines
>>>> for keymap, hdparm, cryptsetup. And let's not forget this is possible
>>>> by a custom kernel and its reliable buddy syslinux.

>>> And I'm using dracut for that, tho quite cut down from its default,
>>> with a monolithic kernel and only installing necessary dracut modules.

>> Just create minimal bootable /boot for running below init.
>> (Your initramfs/rd is a bloated and packaged version of this anyway.)
>> Kick the rest. Since you have your own kernel you are not far away
>> from it.

> Thanks. You just solved my primary problem of needing to take the time
> to actually research all the steps and in what order I needed to do them,
> for a hand-rolled script. =:^)

It's just a minimal one. But it is a good start. For possible extensions, extract your initramfs and explore it. Dracut is bloated. Try mkinitcpio.

Once you have your self-hosting bootmng, kernel, modules, /boot, init, etc chain, you'll be shocked to realize you have been spending so much time on that bullshit while trying to keep them up. Get to this point in the shortest possible time. Save your precious time. And reclaim your system's reliability.

For X, you'll still need udev or eudev.

^ permalink raw reply [flat|nested] 77+ messages in thread
* Re: 4.11.6 / more corruption / root 15455 has a root item with a more recent gen (33682) compared to the found root node (0)
  2017-07-09 7:57 ` Martin Steigerwald
  2017-07-09 9:16 ` Paul Jones
@ 2017-07-31 21:07 ` Ivan Sizov
  2017-07-31 21:17 ` Marc MERLIN
  1 sibling, 1 reply; 77+ messages in thread
From: Ivan Sizov @ 2017-07-31 21:07 UTC (permalink / raw)
  To: Martin Steigerwald
  Cc: Marc MERLIN, Lu Fengqi, Btrfs BTRFS, David Sterba, Duncan

2017-07-09 10:57 GMT+03:00 Martin Steigerwald <martin@lichtvoll.de>:
> Hello Marc.
>
> Marc MERLIN - 08.07.17, 21:34:
>> Sigh,
>>
>> This is now the 3rd filesystem I have (on 3 different machines) that is
>> getting corruption of some kind (on 4.11.6).
>
> Anyone else getting corruptions with 4.11?

Yes, a lot. There are at least 3 cases, and I've probably missed some.
https://www.spinics.net/lists/linux-btrfs/msg67177.html
https://www.spinics.net/lists/linux-btrfs/msg67681.html
https://unix.stackexchange.com/questions/369133/dealing-with-btrfs-ref-backpointer-mismatches-backref-missing/369275

If additional debug info is needed, I'm ready to provide it.

--
Ivan Sizov

^ permalink raw reply [flat|nested] 77+ messages in thread
* Re: 4.11.6 / more corruption / root 15455 has a root item with a more recent gen (33682) compared to the found root node (0) 2017-07-31 21:07 ` Ivan Sizov @ 2017-07-31 21:17 ` Marc MERLIN 2017-07-31 21:39 ` Ivan Sizov 2017-07-31 22:00 ` Justin Maggard 0 siblings, 2 replies; 77+ messages in thread From: Marc MERLIN @ 2017-07-31 21:17 UTC (permalink / raw) To: Ivan Sizov Cc: Martin Steigerwald, Lu Fengqi, Btrfs BTRFS, David Sterba, Duncan On Tue, Aug 01, 2017 at 12:07:14AM +0300, Ivan Sizov wrote: > 2017-07-09 10:57 GMT+03:00 Martin Steigerwald <martin@lichtvoll.de>: > > Hello Marc. > > > > Marc MERLIN - 08.07.17, 21:34: > >> Sigh, > >> > >> This is now the 3rd filesystem I have (on 3 different machines) that is > >> getting corruption of some kind (on 4.11.6). > > > > Anyone else getting corruptions with 4.11? > Yes, a lot. There are at least 3 cases, probably I've missed something. > https://www.spinics.net/lists/linux-btrfs/msg67177.html > https://www.spinics.net/lists/linux-btrfs/msg67681.html > https://unix.stackexchange.com/questions/369133/dealing-with-btrfs-ref-backpointer-mismatches-backref-missing/369275 Indeed. My main server is happy back on 4.9.36 and while my laptop is stuck on 4.11 due to other kernel issues that prevent me from going back to 4.9, it only corrupted a single filesystem so far, and no other ones that I've noticed yet. Hopefully that will hold :-/ Marc -- "A mouse is a device used to point at the xterm you want to type in" - A.S.R. Microsoft is to operating systems .... .... what McDonalds is to gourmet cooking Home page: http://marc.merlins.org/ | PGP 1024R/763BE901 ^ permalink raw reply [flat|nested] 77+ messages in thread
* Re: 4.11.6 / more corruption / root 15455 has a root item with a more recent gen (33682) compared to the found root node (0)
  2017-07-31 21:17 ` Marc MERLIN
@ 2017-07-31 21:39 ` Ivan Sizov
  2017-08-01 16:41 ` Ivan Sizov
  2017-07-31 22:00 ` Justin Maggard
  1 sibling, 1 reply; 77+ messages in thread
From: Ivan Sizov @ 2017-07-31 21:39 UTC (permalink / raw)
  To: Marc MERLIN
  Cc: Martin Steigerwald, Lu Fengqi, Btrfs BTRFS, David Sterba, Duncan

2017-08-01 0:17 GMT+03:00 Marc MERLIN <marc@merlins.org>:
> On Tue, Aug 01, 2017 at 12:07:14AM +0300, Ivan Sizov wrote:
>> 2017-07-09 10:57 GMT+03:00 Martin Steigerwald <martin@lichtvoll.de>:
>> > Hello Marc.
>> >
>> > Marc MERLIN - 08.07.17, 21:34:
>> >> Sigh,
>> >>
>> >> This is now the 3rd filesystem I have (on 3 different machines) that is
>> >> getting corruption of some kind (on 4.11.6).
>> >
>> > Anyone else getting corruptions with 4.11?
>> Yes, a lot. There are at least 3 cases, probably I've missed something.
>> https://www.spinics.net/lists/linux-btrfs/msg67177.html
>> https://www.spinics.net/lists/linux-btrfs/msg67681.html
>> https://unix.stackexchange.com/questions/369133/dealing-with-btrfs-ref-backpointer-mismatches-backref-missing/369275
>
> Indeed. My main server is happy back on 4.9.36 and while my laptop is
> stuck on 4.11 due to other kernel issues that prevent me from going back
> to 4.9, it only corrupted a single filesystem so far, and no other ones
> that I've noticed yet.
> Hopefully that will hold :-/
>
> Marc
> --
> "A mouse is a device used to point at the xterm you want to type in" - A.S.R.
> Microsoft is to operating systems ....
> .... what McDonalds is to gourmet cooking
> Home page: http://marc.merlins.org/ | PGP 1024R/763BE901

I want to try mounting and checking the FS under live images with different kernels tomorrow. Today's Fedora Rawhide image seems to be built incorrectly. Can you advise me where to get a fresh live image with a 4.12 kernel (it's not important which distro that will be)?

--
Ivan Sizov

^ permalink raw reply [flat|nested] 77+ messages in thread
* Re: 4.11.6 / more corruption / root 15455 has a root item with a more recent gen (33682) compared to the found root node (0)
  2017-07-31 21:39 ` Ivan Sizov
@ 2017-08-01 16:41 ` Ivan Sizov
  0 siblings, 0 replies; 77+ messages in thread
From: Ivan Sizov @ 2017-08-01 16:41 UTC (permalink / raw)
  To: Marc MERLIN
  Cc: Martin Steigerwald, Lu Fengqi, Btrfs BTRFS, David Sterba, Duncan

2017-08-01 0:39 GMT+03:00 Ivan Sizov <sivan606@gmail.com>:
> 2017-08-01 0:17 GMT+03:00 Marc MERLIN <marc@merlins.org>:
>> On Tue, Aug 01, 2017 at 12:07:14AM +0300, Ivan Sizov wrote:
>>> 2017-07-09 10:57 GMT+03:00 Martin Steigerwald <martin@lichtvoll.de>:
>>> > Hello Marc.
>>> >
>>> > Marc MERLIN - 08.07.17, 21:34:
>>> >> Sigh,
>>> >>
>>> >> This is now the 3rd filesystem I have (on 3 different machines) that is
>>> >> getting corruption of some kind (on 4.11.6).
>>> >
>>> > Anyone else getting corruptions with 4.11?
>>> Yes, a lot. There are at least 3 cases, probably I've missed something.
>>> https://www.spinics.net/lists/linux-btrfs/msg67177.html
>>> https://www.spinics.net/lists/linux-btrfs/msg67681.html
>>> https://unix.stackexchange.com/questions/369133/dealing-with-btrfs-ref-backpointer-mismatches-backref-missing/369275
>>
>> Indeed. My main server is happy back on 4.9.36 and while my laptop is
>> stuck on 4.11 due to other kernel issues that prevent me from going back
>> to 4.9, it only corrupted a single filesystem so far, and no other ones
>> that I've noticed yet.
>> Hopefully that will hold :-/
>>
>> Marc
>> --
>> "A mouse is a device used to point at the xterm you want to type in" - A.S.R.
>> Microsoft is to operating systems ....
>> .... what McDonalds is to gourmet cooking
>> Home page: http://marc.merlins.org/ | PGP 1024R/763BE901
>
> I want to try mounting and checking FS under Live images with
> different kernels tomorrow. Today's Fedora Rawhide image seems to be
> built incorrectly. Can you advise me where to get a fresh live image
> with 4.12 kernel (it's not important which distro that will be)?
>
> --
> Ivan Sizov

Mounting problem persists:
on 4.13.0 with btrfs-progs v4.11.1 (latest Fedora Rawhide Live)
on 4.10.0 with btrfs-progs v4.9.1 (Ubuntu 17.04 Live)
on 4.9.0 with btrfs-progs v4.7.3 (Debian 9 Stretch Live)

"btrfs check --readonly" also gives the same output on 4.11, 4.10 and 4.9.
Marc, how did you roll back and fix those errors?

--
Ivan Sizov

^ permalink raw reply [flat|nested] 77+ messages in thread
* Re: 4.11.6 / more corruption / root 15455 has a root item with a more recent gen (33682) compared to the found root node (0) 2017-07-31 21:17 ` Marc MERLIN 2017-07-31 21:39 ` Ivan Sizov @ 2017-07-31 22:00 ` Justin Maggard 2017-08-01 6:38 ` Marc MERLIN 1 sibling, 1 reply; 77+ messages in thread From: Justin Maggard @ 2017-07-31 22:00 UTC (permalink / raw) To: Marc MERLIN Cc: Ivan Sizov, Martin Steigerwald, Lu Fengqi, Btrfs BTRFS, David Sterba, Duncan On Mon, Jul 31, 2017 at 2:17 PM, Marc MERLIN <marc@merlins.org> wrote: > On Tue, Aug 01, 2017 at 12:07:14AM +0300, Ivan Sizov wrote: >> 2017-07-09 10:57 GMT+03:00 Martin Steigerwald <martin@lichtvoll.de>: >> > Hello Marc. >> > >> > Marc MERLIN - 08.07.17, 21:34: >> >> Sigh, >> >> >> >> This is now the 3rd filesystem I have (on 3 different machines) that is >> >> getting corruption of some kind (on 4.11.6). >> > >> > Anyone else getting corruptions with 4.11? >> Yes, a lot. There are at least 3 cases, probably I've missed something. >> https://www.spinics.net/lists/linux-btrfs/msg67177.html >> https://www.spinics.net/lists/linux-btrfs/msg67681.html >> https://unix.stackexchange.com/questions/369133/dealing-with-btrfs-ref-backpointer-mismatches-backref-missing/369275 > > Indeed. My main server is happy back on 4.9.36 and while my laptop is > stuck on 4.11 due to other kernel issues that prevent me from going back > to 4.9, it only corrupted a single filesystem so far, and no other ones > that I've noticed yet. > Hopefully that will hold :-/ > Marc, do you have quotas enabled? IIRC, you're a send/receive user. The combination of quotas and btrfs receive can corrupt your filesystem, as shown by the xfstest I sent to the list a little while ago. -Justin ^ permalink raw reply [flat|nested] 77+ messages in thread
* Re: 4.11.6 / more corruption / root 15455 has a root item with a more recent gen (33682) compared to the found root node (0) 2017-07-31 22:00 ` Justin Maggard @ 2017-08-01 6:38 ` Marc MERLIN 0 siblings, 0 replies; 77+ messages in thread From: Marc MERLIN @ 2017-08-01 6:38 UTC (permalink / raw) To: Justin Maggard Cc: Ivan Sizov, Martin Steigerwald, Lu Fengqi, Btrfs BTRFS, David Sterba, Duncan On Mon, Jul 31, 2017 at 03:00:53PM -0700, Justin Maggard wrote: > Marc, do you have quotas enabled? IIRC, you're a send/receive user. > The combination of quotas and btrfs receive can corrupt your > filesystem, as shown by the xfstest I sent to the list a little while > ago. Thanks for checking. I do not use quota given the problems I had with them early on over 2y ago. Marc -- "A mouse is a device used to point at the xterm you want to type in" - A.S.R. Microsoft is to operating systems .... .... what McDonalds is to gourmet cooking Home page: http://marc.merlins.org/ | PGP 1024R/763BE901 ^ permalink raw reply [flat|nested] 77+ messages in thread
* Re: 4.11 relocate crash, null pointer + rolling back a filesystem by X hours?
  2017-05-02 4:56 ` Chris Murphy
  2017-05-02 5:11 ` Marc MERLIN
@ 2017-05-02 19:59 ` Kai Krakow
  1 sibling, 0 replies; 77+ messages in thread
From: Kai Krakow @ 2017-05-02 19:59 UTC (permalink / raw)
  To: linux-btrfs

On Mon, 1 May 2017 22:56:06 -0600, Chris Murphy <lists@colorremedies.com> wrote:

> On Mon, May 1, 2017 at 9:23 PM, Marc MERLIN <marc@merlins.org> wrote:
> > Hi Chris,
> >
> > Thanks for the reply, much appreciated.
> >
> > On Mon, May 01, 2017 at 07:50:22PM -0600, Chris Murphy wrote:
> >> What about btrfs check (no repair), without and then also with
> >> --mode=lowmem?
> >>
> >> In theory I like the idea of a 24 hour rollback; but in normal
> >> usage Btrfs will eventually free up space containing stale and no
> >> longer necessary metadata. Like the chunk tree, it's always
> >> changing, so you get to a point, even with snapshots, that the old
> >> state of that tree is just - gone. A snapshot of an fs tree does
> >> not make the chunk tree frozen in time.
> >
> > Right, of course, I was being way over optimistic here. I kind of
> > forgot that metadata wasn't COW, my bad.
>
> Well it is COW. But there's more to the file system than fs trees, and
> just because an fs tree gets snapshot doesn't mean all data is
> snapshot. So whether snapshot or not, there's metadata that becomes
> obsolete as the file system is updated and those areas get freed up
> and eventually overwritten.
>
> >> In any case, it's a big problem in my mind if no existing tools can
> >> fix a file system of this size. So before making any more changes,
> >> make sure you have a btrfs-image somewhere, even if it's huge. The
> >> offline checker needs to be able to repair it, right now it's all
> >> we have for such a case.
> >
> > The image will be huge, and take maybe 24H to make (last time it
> > took some silly amount of time like that), and honestly I'm not
> > sure how useful it'll be.
> > Outside of the kernel crashing if I do a btrfs balance, and
> > hopefully the crash report I gave is good enough, the state I'm in
> > is not btrfs' fault.
> >
> > If I can't roll back to a reasonably working state, with data loss
> > of a known quantity that I can recover from backup, I'll have to
> > destroy the filesystem and recover from scratch, which will take
> > multiple days. Since I can't wait too long before getting back to a
> > working state, I think I'm going to try btrfs check --repair after
> > a scrub to get a list of all the pathnames/inodes that are known to
> > be damaged, and work from there.
> > Sounds reasonable?
>
> Yes.
>
> > Also, how is --mode=lowmem being useful?
>
> Testing. lowmem is a different implementation, so it might find
> different things from the regular check.
>
> > And for re-parenting a sub-subvolume, is that possible?
> > (I want to delete /sub1/ but I can't because I have /sub1/sub2
> > that's also a subvolume and I'm not sure how to re-parent sub2 to
> > somewhere else so that I can subvolume delete sub1)
>
> Well you can move sub2 out of sub1 just like a directory and then
> delete sub1. If it's read-only it can't be moved, but you can use
> btrfs property get/set ro true/false to temporarily make it not
> read-only, move it, then make it read-only again, and it's still fine
> to use with btrfs send receive.
>
> > In the meantime, a simple check without repair looks like this. It
> > will likely take many hours to complete:
> > gargamel:/var/local/space# btrfs check /dev/mapper/dshelf2
> > Checking filesystem on /dev/mapper/dshelf2
> > UUID: 03e9a50c-1ae6-4782-ab9c-5f310a98e653
> > checking extents
> > checksum verify failed on 3096461459456 found 0E6B7980 wanted FBE5477A
> > checksum verify failed on 3096461459456 found 0E6B7980 wanted FBE5477A
> > checksum verify failed on 2899180224512 found 7A6D427F wanted 7E899EE5
> > checksum verify failed on 2899180224512 found 7A6D427F wanted 7E899EE5
> > checksum verify failed on 2899180224512 found ABBE39B0 wanted E0735D0E
> > checksum verify failed on 2899180224512 found 7A6D427F wanted 7E899EE5
> > bytenr mismatch, want=2899180224512, have=3981076597540270796
> > checksum verify failed on 1449488023552 found CECC36AF wanted 199FE6C5
> > checksum verify failed on 1449488023552 found CECC36AF wanted 199FE6C5
> > checksum verify failed on 1449544613888 found 895D691B wanted A0C64D2B
> > checksum verify failed on 1449544613888 found 895D691B wanted A0C64D2B
> > parent transid verify failed on 1671538819072 wanted 293964 found 293902
> > parent transid verify failed on 1671538819072 wanted 293964 found 293902
> > checksum verify failed on 1671603781632 found 18BC28D6 wanted 372655A0
> > checksum verify failed on 1671603781632 found 18BC28D6 wanted 372655A0
> > checksum verify failed on 1759425052672 found 843B59F1 wanted F0FF7D00
> > checksum verify failed on 1759425052672 found 843B59F1 wanted F0FF7D00
> > checksum verify failed on 2182657212416 found CD8EFC0C wanted 70847071
> > checksum verify failed on 2182657212416 found CD8EFC0C wanted 70847071
> > checksum verify failed on 2898779357184 found 96395131 wanted 433D6E09
> > checksum verify failed on 2898779357184 found 96395131 wanted 433D6E09
> > checksum verify failed on 2899180224512 found 7A6D427F wanted 7E899EE5
> > checksum verify failed on 2899180224512 found 7A6D427F wanted 7E899EE5
> > checksum verify failed on 2899180224512 found ABBE39B0 wanted E0735D0E
> > checksum verify failed on 2899180224512 found 7A6D427F wanted 7E899EE5
> > bytenr mismatch, want=2899180224512, have=3981076597540270796
> > checksum verify failed on 2182657212416 found CD8EFC0C wanted 70847071
> > checksum verify failed on 2182657212416 found CD8EFC0C wanted 70847071
> > checksum verify failed on 2182657212416 found CD8EFC0C wanted 70847071
> > checksum verify failed on 2182657212416 found CD8EFC0C wanted 70847071
> > checksum verify failed on 2182657212416 found CD8EFC0C wanted 70847071
>
> Not understanding the problem, it's by definition naive for me to
> suggest it should go read-only sooner before hosing itself. But I'd
> like to think it's possible for Btrfs to look backward every once in a
> while for sanity checking, to limit damage should it be occurring even
> if the hardware isn't reporting any problems.

Would it be possible to make btrfs avoid using parts of the filesystem it detected corruptions in? Then a still-in-theory online repair tool could check these parts, maybe repair them (or destroy them upon request), and make those parts of the fs available again...

Such a repair tool (scanning only known corrupted parts) would probably also need less memory and time to run.

--
Regards,
Kai

Replies to list-only preferred.

^ permalink raw reply [flat|nested] 77+ messages in thread
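
The check output quoted above repeats the same handful of block addresses many times, so a per-address count can make it easier to read. The following is a small sketch using only grep, awk and sort; count_csum_failures is a made-up helper name, and the sample input is copied from the quoted output rather than produced by running btrfs check here.

```sh
#!/bin/sh
# Read btrfs check output on stdin and print "<count> <bytenr>" for each
# block address that appears in a "checksum verify failed on" message,
# sorted by address. Other message types are ignored.
count_csum_failures() {
    grep -o 'checksum verify failed on [0-9][0-9]*' |
        awk '{ count[$5]++ } END { for (b in count) print count[b], b }' |
        sort -k2,2n
}

# Sample lines taken from the quoted check output.
count_csum_failures <<'EOF'
checksum verify failed on 3096461459456 found 0E6B7980 wanted FBE5477A
checksum verify failed on 3096461459456 found 0E6B7980 wanted FBE5477A
checksum verify failed on 1449488023552 found CECC36AF wanted 199FE6C5
parent transid verify failed on 1671538819072 wanted 293964 found 293902
EOF
```

On a real run the saved output of "btrfs check" could be piped through the same function; the resulting list of distinct addresses is the sort of "known corrupted parts" a targeted repair tool would need to visit.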
* Re: 4.11 relocate crash, null pointer + rolling back a filesystem by X hours?
  2017-05-02 3:23 ` Marc MERLIN
  2017-05-02 4:56 ` Chris Murphy
@ 2017-05-02 5:01 ` Duncan
  2017-05-02 19:53 ` Kai Krakow
  2017-05-23 16:58 ` Marc MERLIN
  2017-05-05 1:19 ` Qu Wenruo
  2 siblings, 2 replies; 77+ messages in thread
From: Duncan @ 2017-05-02 5:01 UTC (permalink / raw)
  To: linux-btrfs

Marc MERLIN posted on Mon, 01 May 2017 20:23:46 -0700 as excerpted:

> Also, how is --mode=lowmem being useful?

FWIW, I just watched your talk that's linked from the wiki, and wondered what you were doing these days as I hadn't seen any posts from you here for awhile.

Well, that you're asking that question confirms you've not been following the list too closely... Of course that's understandable as people have other stuff to do, but just sayin'.

The answer is... btrfs check in lowmem mode isn't simply lowmem, it's also effectively a very nearly entirely rewritten second implementation, which has already demonstrated its worth as it has already allowed finding and fixing a number of bugs in normal mode check. Of course normal mode check has returned the favor a few times as well, so it is now reasonably standard list troubleshooting practice to ask for the output from both modes to see what and where they differ, especially if it's not something known to be directly fixable by normal mode, which of course remains the more mature default.

So even if neither one can actually fix the problem ATM, any differences in output both lend important clues to the real problem, and potentially help developers to find and fix bugs in one or the other implementation.

Tho it's worth noting that lowmem mode can be expected to take longer, as it favors lower memory usage over speed, just as the mode title suggests it will. On a filesystem as big as yours... it may unfortunately not be entirely practical, especially if as you hint there's at least some time pressure here, tho it's not extreme.

Of course on-list I'm somewhat known for my arguments propounding the notion that any filesystem that's too big to be practically maintained (including time necessary to restore from backups, should that be necessary for whatever reason) is... too big... and should ideally be broken along logical and functional boundaries into a number of individual smaller filesystems until such point as each one is found to be practically maintainable within a reasonably practical time frame. Don't put all the eggs in one basket, and when the bottom of one of those baskets inevitably falls out, most of your eggs will be safe in other baskets. =:^)

But as someone else (pg, IIRC) on-list is fond of saying, lots of other people "know better" (TM). Whatever. It's your data, your systems and your time, not mine. I just know what I've found (sometimes finding it the hard way!) to work best for me, and TBs on TBs of data on a single filesystem, even if it's a backup and is itself backed up, isn't something I'd be putting my own faith in, as the time even for a simple restore from backups is simply too high for me to consider it at all practical. =:^)

> And for re-parenting a sub-subvolume, is that possible?
> (I want to delete /sub1/ but I can't because I have /sub1/sub2 that's
> also a subvolume and I'm not sure how to re-parent sub2 to somewhere
> else so that I can subvolume delete sub1)

As I believe you know my own use-case doesn't deal with subvolumes and snapshots, so this may be of limited practicality, but FWIW, the sysadmin's guide discussion of snapshot management and special cases seems apropos as a first stop, before going further:

https://btrfs.wiki.kernel.org/index.php/SysadminGuide#Managing_Snapshots

Note that toward the bottom of "management" it discusses moving subvolumes (which will obviously reparent them), but then below that in special cases it says that read-only subvolumes (and thus snapshots) cannot be moved, explaining why.

*BUT*, and here's the "go further" part, keep in mind that subvolume-read-only is a property, gettable and settable by btrfs property. So you should be able to unset the read-only property of a subvolume or snapshot, move it, then if desired, set it again.

Of course I wouldn't expect send -p to work with such a snapshot, but send -c /might/ still work, I'm not actually sure but I'd consider it worth trying. (I'd try -p as well, but expect it to fail...)

--
Duncan - List replies preferred. No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master." Richard Stallman

^ permalink raw reply [flat|nested] 77+ messages in thread
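
The unset-move-reset sequence described above might look roughly like the following. It is a sketch under assumptions, not a tested recipe: the paths under /mnt/pool are placeholders, run and reparent_subvol are hypothetical helpers, and the script defaults to a dry run that only prints the commands. The "btrfs property set -ts <path> ro <value>" form is the usual way to toggle a subvolume's read-only flag, but btrfs-property(8) for the installed btrfs-progs version is the authority, and whether send/receive still accepts the snapshot afterwards is the open question raised in the message itself.

```sh
#!/bin/sh
# Move a read-only nested subvolume out of its parent so the parent can
# be deleted. With DRY_RUN=1 (the default) commands are printed instead
# of executed; set DRY_RUN=0 only after reviewing the printed commands.
DRY_RUN=${DRY_RUN:-1}

run() {
    if [ "$DRY_RUN" = 1 ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

reparent_subvol() {
    src=$1   # nested read-only subvolume, e.g. /mnt/pool/sub1/sub2
    dst=$2   # new location on the same filesystem, e.g. /mnt/pool/sub2
    run btrfs property set -ts "$src" ro false   # make it writable
    run mv "$src" "$dst"                         # a rename reparents it
    run btrfs property set -ts "$dst" ro true    # restore read-only
}

# Placeholder paths standing in for /sub1 and /sub1/sub2 from the question.
reparent_subvol /mnt/pool/sub1/sub2 /mnt/pool/sub2
run btrfs subvolume delete /mnt/pool/sub1
```

Printing first keeps the destructive final step (the subvolume delete) visible before anything is changed.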
* Re: 4.11 relocate crash, null pointer + rolling back a filesystem by X hours? 2017-05-02 5:01 ` Duncan @ 2017-05-02 19:53 ` Kai Krakow 2017-05-23 16:58 ` Marc MERLIN 1 sibling, 0 replies; 77+ messages in thread From: Kai Krakow @ 2017-05-02 19:53 UTC (permalink / raw) To: linux-btrfs Am Tue, 2 May 2017 05:01:02 +0000 (UTC) schrieb Duncan <1i5t5.duncan@cox.net>: > Of course on-list I'm somewhat known for my arguments propounding the > notion that any filesystem that's too big to be practically > maintained (including time necessary to restore from backups, should > that be necessary for whatever reason) is... too big... and should > ideally be broken along logical and functional boundaries into a > number of individual smaller filesystems until such point as each one > is found to be practically maintainable within a reasonably practical > time frame. Don't put all the eggs in one basket, and when the bottom > of one of those baskets inevitably falls out, most of your eggs will > be safe in other baskets. =:^) Hehe... Yes, you're a fan of small filesystems. I'm more from the opposite camp, preferring one big filesystem to not mess around with size constraints of small filesystems fighting for the same volume space. It also gives such filesystems better chances for data locality of data staying in totally different parts across your fs mounts and can reduce head movement. Of course, much of this is not true if you use different devices per filesystem, or use SSDs, or SAN where you have no real control over the physical placement of image stripes anyway. But well... In an ideal world, subvolumes of btrfs would be totally independent of each other, just only share the same volume and dynamically allocating chunks of space from it. If one is broken, it is simply not usable and it should be destroyable. A garbage collector would grab the leftover chunks from the subvolume and free them, and you could recreate this subvolume from backup. 
In reality, shared extents will cross subvolume borders so it is probably not how things could work anytime in the near of far future. This idea is more like having thinly provisioned LVM volumes which allocate space as the filesystems on top need them, much like doing thinly provisioned images with a VM host system. The problem here is, unlike subvolumes, those chunks of space could never be given back to the host as it doesn't know if it is still in use. Of course, there's implementations available which allow thinning the images by passing through TRIM from the guest to the host (or by other means of communication channels between host and guest), but that is usually not giving good performance, if even supported. I tried once to exploit this in VirtualBox and hoped it would translate guest discards into hole punching requests on the host, and it's even documented to work that way... But (a) it was horrible slow, and (b) it was incredibly unstable to the point of being useless. OTOH, it's not announced as a stable feature and has to be enabled by manually editing the XML config files. But I still like the idea: Is it possible to make btrfs still work if one subvolume gets corrupted? Of course it should have ways of telling the user which other subvolumes are interconnected through shared extents so those would be also discarded upon corruption cleanup - at least if those extents couldn't be made any sense of any longer. Since corruption is an issue mostly of subvolumes being written to, snapshots should be mostly safe. Such a feature would also only make sense if btrfs had an online repair tool. BTW, are there plans for having an online repair tool in the future? Maybe one that only scans and fixes part of the filesystems (for obvious performance reasons, wrt Duncans idea of handling filesystems), i.e. those parts that the kernel discovered having corruptions? 
If I could then just delete and restore affected files, this would be even better than having independent subvolumes like above. -- Regards, Kai Replies to list-only preferred.
* Re: 4.11 relocate crash, null pointer + rolling back a filesystem by X hours? 2017-05-02 5:01 ` Duncan 2017-05-02 19:53 ` Kai Krakow @ 2017-05-23 16:58 ` Marc MERLIN 2017-05-24 10:16 ` Duncan 1 sibling, 1 reply; 77+ messages in thread From: Marc MERLIN @ 2017-05-23 16:58 UTC (permalink / raw) To: Duncan; +Cc: linux-btrfs On Tue, May 02, 2017 at 05:01:02AM +0000, Duncan wrote: > Marc MERLIN posted on Mon, 01 May 2017 20:23:46 -0700 as excerpted: > > > Also, how is --mode=lowmem being useful? > > FWIW, I just watched your talk that's linked from the wiki, and wondered > what you were doing these days as I hadn't seen any posts from you here > for awhile. First, sorry for the late reply. Because you didn't Cc me in the answer, it went to a different folder where I only saw your replies now. Off topic, but basically I'm not dead or anything, I have btrfs working well enough to not mess with it further because I have many other hobbies :) that is unless I put a new SAS card in my server, hit some corruption bugs, and now I'm back spending days fixing the system. > Well, that you're asking that question confirms you've not been following > the list too closely... Of course that's understandable as people have > other stuff to do, but just sayin'. That's exactly right. I'm subscribed to way too many lists on way too many topics to be up to date with all, sadly :( > Of course on-list I'm somewhat known for my arguments propounding the > notion that any filesystem that's too big to be practically maintained > (including time necessary to restore from backups, should that be > necessary for whatever reason) is... too big... and should ideally be > broken along logical and functional boundaries into a number of > individual smaller filesystems until such point as each one is found to > be practically maintainable within a reasonably practical time frame. 
> Don't put all the eggs in one basket, and when the bottom of one of those > baskets inevitably falls out, most of your eggs will be safe in other > baskets. =:^) That's a valid point, and in my case, I can back it up/restore, it just takes a bit of time, but most of the time is manually babysitting all those subvolumes that I need to recreate by hand with btrfs send/receive relationships, which all get lost during backup/restore. This is the most painful part. What's too big? I've only ever used a filesystem that fits on a raid of 4 data drives. That value has increased over time, but I don't have a crazy array of 20+ drives as a single filesystem, or anything. Since drives have gotten bigger, but not that much faster, I use bcache to make things more acceptable in speed. > *BUT*, and here's the "go further" part, keep in mind that subvolume-read- > only is a property, gettable and settable by btrfs property. > > So you should be able to unset the read-only property of a subvolume or > snapshot, move it, then if desired, set it again. > > Of course I wouldn't expect send -p to work with such a snapshot, but > send -c /might/ still work, I'm not actually sure but I'd consider it > worth trying. (I'd try -p as well, but expect it to fail...) That's an interesting point, thanks for making it. In that case, I did have to destroy and recreate the filesystem since btrfs check --repair was unable to fix it, but knowing how to reparent read only subvolumes may be handy in the future, thanks. Marc -- "A mouse is a device used to point at the xterm you want to type in" - A.S.R. Microsoft is to operating systems .... .... what McDonalds is to gourmet cooking Home page: http://marc.merlins.org/ | PGP 1024R/763BE901
* Re: 4.11 relocate crash, null pointer + rolling back a filesystem by X hours? 2017-05-23 16:58 ` Marc MERLIN @ 2017-05-24 10:16 ` Duncan 0 siblings, 0 replies; 77+ messages in thread From: Duncan @ 2017-05-24 10:16 UTC (permalink / raw) To: linux-btrfs Marc MERLIN posted on Tue, 23 May 2017 09:58:47 -0700 as excerpted: > That's a valid point, and in my case, I can back it up/restore, it just > takes a bit of time, but most of the time is manually babysitting all > those subvolumes that I need to recreate by hand with btrfs send/restore > relationships, which all get lost during backup/restore. > This is the most painful part. > What's too big? I've only ever used a filesystem that fits on on a raid > of 4 data drives. That value has increased over time, but I don't have a > a crazy array of 20+ drives as a single filesystem, or anything. > Since drives have gotten bigger, but not that much faster, I use bcache > to make things more acceptable in speed. What's too big? That depends on your tolerance for pain, but given the subvolumes manually recreated by hand with send/receive scenario, I'd probably try to break it down so while there's the same number of snapshots to restore, the number of subvolumes the snapshots are taken against are limited. My own rule of thumb is if it's taking so long that it's a barrier to doing it, I really need to either break things down further, or upgrade to faster storage. The latter is why I'm actually looking at upgrading my media and second backup set, on spinning rust, to ssd. Because while I used to do backups spinning rust to spinning rust of that size all the time, ssds have spoiled me, and now I dread doing the spinning rust backups... or restores. Tho in my case the spinning rust is only a half- TB, so a pair of half-TB to 1 TB ssds for an upgrade is still cost effective. It's not like I'm going multi-TB, which would still be cost prohibitive on SSD, particularly since I want raid1, so doubling the number of SSDs. 
Meanwhile, what I'd do with that raid of four drives (and /did/ do with my 4-drive raid back a few storage generations ago, when 300 GB spinning- rust disks were still quite big, and what I do with my paired SSDs with btrfs now) is partition them up and do raids of partitions on each drive. One thing that's nice about that is that you can actually do a set of backups on a second set of partitions on the same physical devices, because the physical device redundancy of the raids covers loss of a device, and the separate partitions and raids (btrfs raid1 now) cover the fat-finger or simple loss of filesystem risk. A second set of backups to separate devices can then be made just in case, and depending on the need, swapped out to off-premises or uploaded to the cloud or whatever, but you always have the primary backup at hand to boot to or mount if the working copy fails, by simply pointing to the backup partitions and filesystem instead of the normal working copy. For root, I even have a grub menu item that switches to the backup copy, and for fstab, I have a set of stubs that are assembled via script into three copies of fstab that swap working and backup copies as necessary, with /etc/fstab itself being a symlink to the working copy one, that I simply switch to point to the one that loads the backup copies as working, on the backup. Or I can mount the root filesystem for maintenance from the initramfs, and switch the fstab symlink from there, before exiting maintenance and booting the main system. I learned this "split it up" method the hard way back before mdraid had write-intent bitmaps, and I had only two much larger raids, working and backup, where if one device dropped out and I brought it back in, I had to wait way too long for the huge working raid to resync. 
When I split things up by function into multiple raids, most of the time only some of them were active and only one or two of the active ones would actually have been being written at the time so were out of sync, and syncing them was fast as they were much smaller than the larger full system raids I had been using previously. >> *BUT*, and here's the "go further" part, keep in mind that >> subvolume-read- >> only is a property, gettable and settable by btrfs property. >> >> So you should be able to unset the read-only property of a subvolume or >> snapshot, move it, then if desired, set it again. >> >> Of course I wouldn't expect send -p to work with such a snapshot, but >> send -c /might/ still work, I'm not actually sure but I'd consider it >> worth trying. (I'd try -p as well, but expect it to fail...) > > That's an interesting point, thanks for making it. > In that case, I did have to destroy and recreate the filesystem since > btrfs check --repair was unable to fix it, but knowing how to reparent > read only subvolumes may be handy in the future, thanks. Hopefully you won't end up testing it any time soon, but if you do, please confirm whether my suspicions that send -p won't work after toggling and reparenting, but send -c still will, are correct. (For those who read this out of thread context where I believe I already stated it, my own use-case involves neither snapshots nor send-receive. But it'd be useful information to confirm, both for others, and in case I suddenly find myself with a different use-case for some reason or other.) -- Duncan - List replies preferred. No HTML msgs. "Every nonfree program has a lord, a master -- and if you use the program, he is your master." Richard Stallman ^ permalink raw reply [flat|nested] 77+ messages in thread
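Duncan's suggestion above (clear the read-only property, move the subvolume, set the property again) could be sketched roughly as follows. This is only an illustration that prints the commands for review and runs nothing. The paths are hypothetical, and whether incremental send still works afterwards is exactly the open question raised in the thread.

```python
import shlex


def reparent_commands(src, dst):
    """Return the shell commands to move a read-only subvolume from src to dst.

    Nothing is executed here; the list is meant to be read and checked first.
    The paths passed in are hypothetical examples, not paths from the thread.
    """
    q = shlex.quote
    return [
        f"btrfs property get -ts {q(src)} ro",        # show the current flag
        f"btrfs property set -ts {q(src)} ro false",  # make the subvolume writable
        f"mv {q(src)} {q(dst)}",                      # move it within the same filesystem
        f"btrfs property set -ts {q(dst)} ro true",   # restore the read-only flag
    ]


if __name__ == "__main__":
    for command in reparent_commands("/mnt/pool/sub1/sub2", "/mnt/pool/sub2"):
        print(command)
```

Printing the sequence instead of running it keeps the sketch safe to try on a machine without a btrfs filesystem, and leaves the decision to execute each step with the administrator.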
* Re: 4.11 relocate crash, null pointer + rolling back a filesystem by X hours? 2017-05-02 3:23 ` Marc MERLIN 2017-05-02 4:56 ` Chris Murphy 2017-05-02 5:01 ` Duncan @ 2017-05-05 1:19 ` Qu Wenruo 2017-05-05 2:10 ` Qu Wenruo 2017-05-05 2:40 ` Marc MERLIN 2 siblings, 2 replies; 77+ messages in thread From: Qu Wenruo @ 2017-05-05 1:19 UTC (permalink / raw) To: Marc MERLIN, Chris Murphy Cc: Btrfs BTRFS, Chris Mason, bo.li.liu, fdmanana, Josef Bacik, David Sterba At 05/02/2017 11:23 AM, Marc MERLIN wrote: > Hi Chris, > > Thanks for the reply, much appreciated. > > On Mon, May 01, 2017 at 07:50:22PM -0600, Chris Murphy wrote: >> What about btfs check (no repair), without and then also with --mode=lowmem? >> >> In theory I like the idea of a 24 hour rollback; but in normal usage >> Btrfs will eventually free up space containing stale and no longer >> necessary metadata. Like the chunk tree, it's always changing, so you >> get to a point, even with snapshots, that the old state of that tree >> is just - gone. A snapshot of an fs tree does not make the chunk tree >> frozen in time. > > Right, of course, I was being way over optimistic here. I kind of forgot > that metadata wasn't COW, my bad. > >> In any case, it's a big problem in my mind if no existing tools can >> fix a file system of this size. So before making anymore changes, make >> sure you have a btrfs-image somewhere, even if it's huge. The offline >> checker needs to be able to repair it, right now it's all we have for >> such a case. > > The image will be huge, and take maybe 24H to make (last time it took > some silly amount of time like that), and honestly I'm not sure how > useful it'll be. > Outside of the kernel crashing if I do a btrfs balance, and hopefully > the crash report I gave is good enough, the state I'm in is not btrfs' > fault. 
> > If I can't roll back to a reasonably working state, with data loss of a > known quantity that I can recover from backup, I'll have to destroy and > filesystem and recover from scratch, which will take multiple days. > Since I can't wait too long before getting back to a working state, I > think I'm going to try btrfs check --repair after a scrub to get a list > of all the pathanmes/inodes that are known to be damaged, and work from > there. > Sounds reasonable? > > Also, how is --mode=lowmem being useful? > > And for re-parenting a sub-subvolume, is that possible? > (I want to delete /sub1/ but I can't because I have /sub1/sub2 that's also a subvolume > and I'm not sure how to re-parent sub2 to somewhere else so that I can subvolume delete > sub1) > > In the meantime, a simple check without repair looks like this. It will > likely take many hours to complete: > gargamel:/var/local/space# btrfs check /dev/mapper/dshelf2 > Checking filesystem on /dev/mapper/dshelf2 > UUID: 03e9a50c-1ae6-4782-ab9c-5f310a98e653 > checking extents > checksum verify failed on 3096461459456 found 0E6B7980 wanted FBE5477A > checksum verify failed on 3096461459456 found 0E6B7980 wanted FBE5477A > checksum verify failed on 2899180224512 found 7A6D427F wanted 7E899EE5 > checksum verify failed on 2899180224512 found 7A6D427F wanted 7E899EE5 > checksum verify failed on 2899180224512 found ABBE39B0 wanted E0735D0E > checksum verify failed on 2899180224512 found 7A6D427F wanted 7E899EE5 > bytenr mismatch, want=2899180224512, have=3981076597540270796 > checksum verify failed on 1449488023552 found CECC36AF wanted 199FE6C5 > checksum verify failed on 1449488023552 found CECC36AF wanted 199FE6C5 > checksum verify failed on 1449544613888 found 895D691B wanted A0C64D2B > checksum verify failed on 1449544613888 found 895D691B wanted A0C64D2B > parent transid verify failed on 1671538819072 wanted 293964 found 293902 > parent transid verify failed on 1671538819072 wanted 293964 found 293902 > 
checksum verify failed on 1671603781632 found 18BC28D6 wanted 372655A0 > checksum verify failed on 1671603781632 found 18BC28D6 wanted 372655A0 > checksum verify failed on 1759425052672 found 843B59F1 wanted F0FF7D00 > checksum verify failed on 1759425052672 found 843B59F1 wanted F0FF7D00 > checksum verify failed on 2182657212416 found CD8EFC0C wanted 70847071 > checksum verify failed on 2182657212416 found CD8EFC0C wanted 70847071 > checksum verify failed on 2898779357184 found 96395131 wanted 433D6E09 > checksum verify failed on 2898779357184 found 96395131 wanted 433D6E09 > checksum verify failed on 2899180224512 found 7A6D427F wanted 7E899EE5 > checksum verify failed on 2899180224512 found 7A6D427F wanted 7E899EE5 > checksum verify failed on 2899180224512 found ABBE39B0 wanted E0735D0E > checksum verify failed on 2899180224512 found 7A6D427F wanted 7E899EE5 > bytenr mismatch, want=2899180224512, have=3981076597540270796 > checksum verify failed on 2182657212416 found CD8EFC0C wanted 70847071 > checksum verify failed on 2182657212416 found CD8EFC0C wanted 70847071 > checksum verify failed on 2182657212416 found CD8EFC0C wanted 70847071 > checksum verify failed on 2182657212416 found CD8EFC0C wanted 70847071 > checksum verify failed on 2182657212416 found CD8EFC0C wanted 70847071 > (...) Full output please. I know it will be long, but the point here is, full output could help us to at least locate where the most corruption are. If most corruption are only in extent tree, the chance to recover will increase hugely. As extent tree is just a backref for all allocated extents, it's not really important if recovery (read) is the primary goal. But if other tree (fs or subvolume tree important for you) also get corrupted, I'm afraid your last chance will be "btrfs restore" then. Thanks, Qu > > Thanks, > Marc > ^ permalink raw reply [flat|nested] 77+ messages in thread
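Qu's request for the full output is about seeing where the corruption concentrates. One way to summarise a long btrfs check log, as a rough sketch: count the three message shapes that appear in the excerpt above, per kind and per block address. The sample lines are copied from that excerpt; the function name and the kind labels are made up for illustration, and other message formats are simply ignored.

```python
import re
from collections import Counter

# A few lines copied from the check output quoted in this thread.
SAMPLE = """\
checksum verify failed on 3096461459456 found 0E6B7980 wanted FBE5477A
checksum verify failed on 2899180224512 found 7A6D427F wanted 7E899EE5
checksum verify failed on 2899180224512 found ABBE39B0 wanted E0735D0E
bytenr mismatch, want=2899180224512, have=3981076597540270796
parent transid verify failed on 1671538819072 wanted 293964 found 293902
checksum verify failed on 2182657212416 found CD8EFC0C wanted 70847071
checksum verify failed on 2182657212416 found CD8EFC0C wanted 70847071
"""

# Only the three message shapes seen in the excerpt are recognised.
PATTERNS = {
    "csum": re.compile(r"^checksum verify failed on (\d+) "),
    "transid": re.compile(r"^parent transid verify failed on (\d+) "),
    "bytenr": re.compile(r"^bytenr mismatch, want=(\d+),"),
}


def tally(text):
    """Count errors per kind and per block address (bytenr)."""
    by_kind = Counter()
    by_block = Counter()
    for line in text.splitlines():
        for kind, pattern in PATTERNS.items():
            match = pattern.match(line.strip())
            if match:
                by_kind[kind] += 1
                by_block[int(match.group(1))] += 1
                break
    return by_kind, by_block


if __name__ == "__main__":
    kinds, blocks = tally(SAMPLE)
    print(dict(kinds))
    for bytenr, count in blocks.most_common(3):
        print(bytenr, count)
```

On a real log the same function could be fed the saved check output as a string; the block addresses that come up most often are the ones worth looking at first.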
* Re: 4.11 relocate crash, null pointer + rolling back a filesystem by X hours? 2017-05-05 1:19 ` Qu Wenruo @ 2017-05-05 2:10 ` Qu Wenruo 2017-05-05 2:40 ` Marc MERLIN 1 sibling, 0 replies; 77+ messages in thread From: Qu Wenruo @ 2017-05-05 2:10 UTC (permalink / raw) To: Marc MERLIN, Chris Murphy Cc: Btrfs BTRFS, Chris Mason, bo.li.liu, fdmanana, Josef Bacik, David Sterba At 05/05/2017 09:19 AM, Qu Wenruo wrote: > > > At 05/02/2017 11:23 AM, Marc MERLIN wrote: >> Hi Chris, >> >> Thanks for the reply, much appreciated. >> >> On Mon, May 01, 2017 at 07:50:22PM -0600, Chris Murphy wrote: >>> What about btfs check (no repair), without and then also with >>> --mode=lowmem? >>> >>> In theory I like the idea of a 24 hour rollback; but in normal usage >>> Btrfs will eventually free up space containing stale and no longer >>> necessary metadata. Like the chunk tree, it's always changing, so you >>> get to a point, even with snapshots, that the old state of that tree >>> is just - gone. A snapshot of an fs tree does not make the chunk tree >>> frozen in time. >> Right, of course, I was being way over optimistic here. I kind of forgot >> that metadata wasn't COW, my bad. >> >>> In any case, it's a big problem in my mind if no existing tools can >>> fix a file system of this size. So before making anymore changes, make >>> sure you have a btrfs-image somewhere, even if it's huge. The offline >>> checker needs to be able to repair it, right now it's all we have for >>> such a case. >> >> The image will be huge, and take maybe 24H to make (last time it took >> some silly amount of time like that), and honestly I'm not sure how >> useful it'll be. >> Outside of the kernel crashing if I do a btrfs balance, and hopefully >> the crash report I gave is good enough, the state I'm in is not btrfs' >> fault. 
>> >> If I can't roll back to a reasonably working state, with data loss of a >> known quantity that I can recover from backup, I'll have to destroy and >> filesystem and recover from scratch, which will take multiple days. >> Since I can't wait too long before getting back to a working state, I >> think I'm going to try btrfs check --repair after a scrub to get a list >> of all the pathanmes/inodes that are known to be damaged, and work from >> there. >> Sounds reasonable? >> >> Also, how is --mode=lowmem being useful? >> >> And for re-parenting a sub-subvolume, is that possible? >> (I want to delete /sub1/ but I can't because I have /sub1/sub2 that's >> also a subvolume >> and I'm not sure how to re-parent sub2 to somewhere else so that I can >> subvolume delete >> sub1) >> >> In the meantime, a simple check without repair looks like this. It will >> likely take many hours to complete: >> gargamel:/var/local/space# btrfs check /dev/mapper/dshelf2 >> Checking filesystem on /dev/mapper/dshelf2 >> UUID: 03e9a50c-1ae6-4782-ab9c-5f310a98e653 >> checking extents >> checksum verify failed on 3096461459456 found 0E6B7980 wanted FBE5477A >> checksum verify failed on 3096461459456 found 0E6B7980 wanted FBE5477A >> checksum verify failed on 2899180224512 found 7A6D427F wanted 7E899EE5 >> checksum verify failed on 2899180224512 found 7A6D427F wanted 7E899EE5 >> checksum verify failed on 2899180224512 found ABBE39B0 wanted E0735D0E >> checksum verify failed on 2899180224512 found 7A6D427F wanted 7E899EE5 >> bytenr mismatch, want=2899180224512, have=3981076597540270796 >> checksum verify failed on 1449488023552 found CECC36AF wanted 199FE6C5 >> checksum verify failed on 1449488023552 found CECC36AF wanted 199FE6C5 >> checksum verify failed on 1449544613888 found 895D691B wanted A0C64D2B >> checksum verify failed on 1449544613888 found 895D691B wanted A0C64D2B >> parent transid verify failed on 1671538819072 wanted 293964 found 293902 >> parent transid verify failed on 
1671538819072 wanted 293964 found 293902 >> checksum verify failed on 1671603781632 found 18BC28D6 wanted 372655A0 >> checksum verify failed on 1671603781632 found 18BC28D6 wanted 372655A0 >> checksum verify failed on 1759425052672 found 843B59F1 wanted F0FF7D00 >> checksum verify failed on 1759425052672 found 843B59F1 wanted F0FF7D00 >> checksum verify failed on 2182657212416 found CD8EFC0C wanted 70847071 >> checksum verify failed on 2182657212416 found CD8EFC0C wanted 70847071 >> checksum verify failed on 2898779357184 found 96395131 wanted 433D6E09 >> checksum verify failed on 2898779357184 found 96395131 wanted 433D6E09 >> checksum verify failed on 2899180224512 found 7A6D427F wanted 7E899EE5 >> checksum verify failed on 2899180224512 found 7A6D427F wanted 7E899EE5 >> checksum verify failed on 2899180224512 found ABBE39B0 wanted E0735D0E >> checksum verify failed on 2899180224512 found 7A6D427F wanted 7E899EE5 >> bytenr mismatch, want=2899180224512, have=3981076597540270796 >> checksum verify failed on 2182657212416 found CD8EFC0C wanted 70847071 >> checksum verify failed on 2182657212416 found CD8EFC0C wanted 70847071 >> checksum verify failed on 2182657212416 found CD8EFC0C wanted 70847071 >> checksum verify failed on 2182657212416 found CD8EFC0C wanted 70847071 >> checksum verify failed on 2182657212416 found CD8EFC0C wanted 70847071 >> (...) > > Full output please. Sorry for not noticing the link. [Conclusion] After checking the full result, some of fs/subvolume trees are corrupted. [Details] Some example here: --- ref mismatch on [6674127745024 32768] extent item 0, found 1 Backref 6674127745024 parent 7566652473344 owner 0 offset 0 num_refs 0 not found in extent tree Incorrect local backref count on 6674127745024 parent 7566652473344 owner 0 offset 0 found 1 wanted 0 back 0x5648afda0f20 backpointer mismatch on [6674127745024 32768] --- The extent at 6674127745024 seems to be an *DATA* extent. 
The current default nodesize is 16K, and the ancient default nodesize was 4K. Unless you specified -n 32K at mkfs time, it's a DATA extent. Furthermore, it's a shared data backref, so it uses its parent tree block to do the backref walk. And its parent tree block is 7566652473344. Such a bytenr can't be found anywhere (including the csum error output), which is to say that either we can't find that tree block or we can't reach the tree root for it. Considering it's a data extent, its owner is either the root tree or an fs/subvolume tree. Such cases are everywhere, as I found other extents sized from 4K to 44K, so I'm pretty sure some fs/subvolume tree must be corrupted. (A data extent in the root tree is seldom 4K sized.) So unfortunately, your fs/subvolume trees are also corrupted, and there is almost no chance of a graceful recovery. [Alternatives] I would recommend using "btrfs restore -f <subvolid>" to restore a specified subvolume. What we can do is try to dump the tree of a subvolume, manually gather what's still there, and put it somewhere else. That's what btrfs-restore does. The only good news is that your chunk tree seems to be good, so btrfs-restore shouldn't encounter too many problems. Good luck. Thanks, Qu > > I know it will be long, but the point here is, full output could help us > to at least locate where the most corruption are. > > If most corruption are only in extent tree, the chance to recover will > increase hugely. > > As extent tree is just a backref for all allocated extents, it's not > really important if recovery (read) is the primary goal. > > But if other tree (fs or subvolume tree important for you) also get > corrupted, I'm afraid your last chance will be "btrfs restore" then. > > Thanks, > Qu > >> >> Thanks, >> Marc >>
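The size argument Qu uses can be written down as a small check. This is a sketch of the reasoning only, assuming (as his message implies) that a metadata extent is exactly one tree block and therefore has length equal to the nodesize. The function names are invented for illustration, and the sample line is copied from the check output he quotes.

```python
import re

DEFAULT_NODESIZE = 16 * 1024  # current default, per Qu's message
OLD_NODESIZE = 4 * 1024       # the older default he mentions

REF_MISMATCH_RE = re.compile(r"ref mismatch on \[(\d+) (\d+)\]")


def parse_ref_mismatch(line):
    """Return (bytenr, length) from a 'ref mismatch on [bytenr length]' line."""
    match = REF_MISMATCH_RE.search(line)
    if match is None:
        raise ValueError("not a ref mismatch line")
    return int(match.group(1)), int(match.group(2))


def could_be_tree_block(extent_len, nodesize=DEFAULT_NODESIZE):
    """Assumption: a metadata extent is one tree block, so its length is nodesize."""
    return extent_len == nodesize


if __name__ == "__main__":
    line = "ref mismatch on [6674127745024 32768] extent item 0, found 1"
    bytenr, length = parse_ref_mismatch(line)
    for nodesize in (DEFAULT_NODESIZE, OLD_NODESIZE):
        print(bytenr, length, nodesize, could_be_tree_block(length, nodesize))
```

With either default nodesize the 32768-byte extent cannot be a tree block, which is why Qu treats it as data unless the filesystem was created with -n 32K.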
* Re: 4.11 relocate crash, null pointer + rolling back a filesystem by X hours? 2017-05-05 1:19 ` Qu Wenruo 2017-05-05 2:10 ` Qu Wenruo @ 2017-05-05 2:40 ` Marc MERLIN 2017-05-05 5:03 ` Qu Wenruo 1 sibling, 1 reply; 77+ messages in thread From: Marc MERLIN @ 2017-05-05 2:40 UTC (permalink / raw) To: Qu Wenruo Cc: Chris Murphy, Btrfs BTRFS, Chris Mason, bo.li.liu, fdmanana, Josef Bacik, David Sterba On Fri, May 05, 2017 at 09:19:29AM +0800, Qu Wenruo wrote: > Sorry for not noticing the link. no problem, it was only one line amongst many :) Thanks much for having had a look. > [Conclusion] > After checking the full result, some of fs/subvolume trees are corrupted. > > [Details] > Some example here: > > --- > ref mismatch on [6674127745024 32768] extent item 0, found 1 > Backref 6674127745024 parent 7566652473344 owner 0 offset 0 num_refs 0 not > found in extent tree > Incorrect local backref count on 6674127745024 parent 7566652473344 owner 0 > offset 0 found 1 wanted 0 back 0x5648afda0f20 > backpointer mismatch on [6674127745024 32768] > --- > > The extent at 6674127745024 seems to be an *DATA* extent. > While current default nodesize is 16K and ancient default node is 4K. > > Unless you specified -n 32K at mkfs time, it's a DATA extent. I did not, so you must be right about DATA, which should be good, right, I don't mind losing data as long as the underlying metadata is correct. 
I should have given more data on the FS: gargamel:/var/local/src/btrfs-progs# btrfs fi df /mnt/btrfs_pool2/ Data, single: total=6.28TiB, used=6.12TiB System, DUP: total=32.00MiB, used=720.00KiB Metadata, DUP: total=97.00GiB, used=94.39GiB GlobalReserve, single: total=512.00MiB, used=0.00B gargamel:/var/local/src/btrfs-progs# btrfs fi usage /mnt/btrfs_pool2 Overall: Device size: 7.28TiB Device allocated: 6.47TiB Device unallocated: 824.48GiB Device missing: 0.00B Used: 6.30TiB Free (estimated): 994.45GiB (min: 582.21GiB) Data ratio: 1.00 Metadata ratio: 2.00 Global reserve: 512.00MiB (used: 0.00B) Data,single: Size:6.28TiB, Used:6.12TiB /dev/mapper/dshelf2 6.28TiB Metadata,DUP: Size:97.00GiB, Used:94.39GiB /dev/mapper/dshelf2 194.00GiB System,DUP: Size:32.00MiB, Used:720.00KiB /dev/mapper/dshelf2 64.00MiB Unallocated: /dev/mapper/dshelf2 824.48GiB > Further more, it's a shared data backref, it's using its parent tree block > to do backref walk. > > And its parent tree block is 7566652473344. > While such bytenr can't be found anywhere (including csum error output), > that's to say either we can't find that tree block nor can't reach the tree > root for it. > > Considering it's data extent, its owner is either root or fs/subvolume tree. > > > Such cases are everywhere, as I found other extent sized from 4K to 44K, so > I'm pretty sure there must be some fs/subvolume tree corrupted. > (Data extent in root tree is seldom 4K sized) > > So unfortunately, your fs/subvolume trees are also corrupted. > And almost no chance to do a graceful recovery. So I'm confused here. You're saying my metadata is not corrupted (and in my case, I have DUP, so I should have 2 copies), but with data blocks (which are not duped) corrupted, it's also possible to lose the filesystem in a way that it can't be taken back to a clean state, even by deleting some corrupted data? > [Alternatives] > I would recommend to use "btrfs restore -f <subvolid>" to restore specified > subvolume. 
I don't need to restore data, the data is a backup. It will just take many days to recreate (plus many hours of typing from me because the backup updates are automated, but recreating everything, is not automated) So if I understand correctly, my metadata is fine (and I guess I have 2 copies, so it would have been unlucky to get both copies corrupted), but enough data blocks got corrupted that btrfs cannot recover, even by deleting the corrupted data blocks. Correct? And is it not possible to clear the corrupted blocks like this? ./btrfs-corrupt-block -l 2899180224512 /dev/mapper/dshelf2 and just accept the lost data but get btrfs check repair to deal with the deleted blocks and bring the rest back to a clean state? Thanks, Marc -- "A mouse is a device used to point at the xterm you want to type in" - A.S.R. Microsoft is to operating systems .... .... what McDonalds is to gourmet cooking Home page: http://marc.merlins.org/ | PGP 1024R/763BE901 ^ permalink raw reply [flat|nested] 77+ messages in thread
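The btrfs fi usage figures quoted in this message can be cross-checked with a little arithmetic. The sketch below only uses numbers printed in the output above. The last step is an observation that the "min" figure is consistent with dividing the unallocated space by the largest ratio shown (2.00); it is not a statement about how btrfs-progs computes it.

```python
GIB_PER_TIB = 1024
MIB_PER_GIB = 1024

# Logical sizes as printed by "btrfs fi usage" in the message above.
data_tib = 6.28          # Data,single
metadata_gib = 97.00     # Metadata,DUP
system_mib = 32.00       # System,DUP
dup_ratio = 2            # DUP keeps two copies on the same device

unallocated_gib = 824.48
free_estimated_gib = 994.45

# DUP doubles the raw space consumed on the device.
metadata_raw_gib = metadata_gib * dup_ratio
system_raw_mib = system_mib * dup_ratio

# "Device allocated" should be the sum of the raw chunk sizes.
allocated_tib = (
    data_tib
    + metadata_raw_gib / GIB_PER_TIB
    + system_raw_mib / MIB_PER_GIB / GIB_PER_TIB
)

# Observation: the printed minimum matches the estimate with the unallocated
# space counted at half value, i.e. as if it were all allocated as DUP.
min_free_gib = free_estimated_gib - unallocated_gib + unallocated_gib / dup_ratio

if __name__ == "__main__":
    print(f"metadata raw: {metadata_raw_gib:.2f} GiB")
    print(f"system raw:   {system_raw_mib:.2f} MiB")
    print(f"allocated:    {allocated_tib:.2f} TiB")
    print(f"min free:     {min_free_gib:.2f} GiB")
```

The raw metadata and system sizes come out as the 194.00GiB and 64.00MiB shown per device, which is the two-copies-on-one-disk situation Marc refers to when he says he has DUP.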
* Re: 4.11 relocate crash, null pointer + rolling back a filesystem by X hours? 2017-05-05 2:40 ` Marc MERLIN @ 2017-05-05 5:03 ` Qu Wenruo 2017-05-05 15:43 ` Marc MERLIN 0 siblings, 1 reply; 77+ messages in thread From: Qu Wenruo @ 2017-05-05 5:03 UTC (permalink / raw) To: Marc MERLIN Cc: Chris Murphy, Btrfs BTRFS, Chris Mason, bo.li.liu, fdmanana, Josef Bacik, David Sterba At 05/05/2017 10:40 AM, Marc MERLIN wrote: > On Fri, May 05, 2017 at 09:19:29AM +0800, Qu Wenruo wrote: >> Sorry for not noticing the link. > > no problem, it was only one line amongst many :) > Thanks much for having had a look. > >> [Conclusion] >> After checking the full result, some of fs/subvolume trees are corrupted. >> >> [Details] >> Some example here: >> >> --- >> ref mismatch on [6674127745024 32768] extent item 0, found 1 >> Backref 6674127745024 parent 7566652473344 owner 0 offset 0 num_refs 0 not >> found in extent tree >> Incorrect local backref count on 6674127745024 parent 7566652473344 owner 0 >> offset 0 found 1 wanted 0 back 0x5648afda0f20 >> backpointer mismatch on [6674127745024 32768] >> --- >> >> The extent at 6674127745024 seems to be an *DATA* extent. >> While current default nodesize is 16K and ancient default node is 4K. >> >> Unless you specified -n 32K at mkfs time, it's a DATA extent. > > I did not, so you must be right about DATA, which should be good, right, > I don't mind losing data as long as the underlying metadata is correct. > > I should have given more data on the FS: > > gargamel:/var/local/src/btrfs-progs# btrfs fi df /mnt/btrfs_pool2/ > Data, single: total=6.28TiB, used=6.12TiB > System, DUP: total=32.00MiB, used=720.00KiB > Metadata, DUP: total=97.00GiB, used=94.39GiB Tons of metadata since the fs is so large. 
> GlobalReserve, single: total=512.00MiB, used=0.00B > > gargamel:/var/local/src/btrfs-progs# btrfs fi usage /mnt/btrfs_pool2 > Overall: > Device size: 7.28TiB > Device allocated: 6.47TiB > Device unallocated: 824.48GiB > Device missing: 0.00B > Used: 6.30TiB > Free (estimated): 994.45GiB (min: 582.21GiB) > Data ratio: 1.00 > Metadata ratio: 2.00 > Global reserve: 512.00MiB (used: 0.00B) > > Data,single: Size:6.28TiB, Used:6.12TiB > /dev/mapper/dshelf2 6.28TiB > > Metadata,DUP: Size:97.00GiB, Used:94.39GiB > /dev/mapper/dshelf2 194.00GiB > > System,DUP: Size:32.00MiB, Used:720.00KiB > /dev/mapper/dshelf2 64.00MiB > > Unallocated: > /dev/mapper/dshelf2 824.48GiB > > >> Further more, it's a shared data backref, it's using its parent tree block >> to do backref walk. >> >> And its parent tree block is 7566652473344. >> While such bytenr can't be found anywhere (including csum error output), >> that's to say either we can't find that tree block nor can't reach the tree >> root for it. >> >> Considering it's data extent, its owner is either root or fs/subvolume tree. >> >> >> Such cases are everywhere, as I found other extent sized from 4K to 44K, so >> I'm pretty sure there must be some fs/subvolume tree corrupted. >> (Data extent in root tree is seldom 4K sized) >> >> So unfortunately, your fs/subvolume trees are also corrupted. >> And almost no chance to do a graceful recovery. > > So I'm confused here. You're saying my metadata is not corrupted (and in > my case, I have DUP, so I should have 2 copies), Nope, here I'm all talking about metadata (tree blocks). Difference is the owner, either extent tree or fs/subvolume tree. The fsck doesn't check data blocks. > but with data blocks > (which are not duped) corrupted, it's also possible to lose the > filesystem in a way that it can't be taken back to a clean state, even > by deleting some corrupted data? No, it can't be repaired by deleting data. 
The problem is that the tree blocks (metadata) that refer to these data blocks are corrupted. And they are corrupted in such a way that both the extent tree (the tree that contains extent allocation info) and the fs tree (the tree that contains the real fs info, like inodes and data locations) are corrupted. So graceful recovery is not possible now. > >> [Alternatives] >> I would recommend to use "btrfs restore -f <subvolid>" to restore specified >> subvolume. > > I don't need to restore data, the data is a backup. It will just take > many days to recreate (plus many hours of typing from me because the > backup updates are automated, but recreating everything, is not > automated) > > So if I understand correctly, my metadata is fine (and I guess I have 2 > copies, so it would have been unlucky to get both copies corrupted), but > enough data blocks got corrupted that btrfs cannot recover, even by > deleting the corrupted data blocks. Correct? Unfortunately, no. Even though you have 2 copies, so many tree blocks are corrupted that neither copy matches its checksum. Take the following tree block, where both copies have the wrong checksum. --- checksum verify failed on 2899180224512 found ABBE39B0 wanted E0735D0E checksum verify failed on 2899180224512 found 7A6D427F wanted 7E899EE5 --- > > And is it not possible to clear the corrupted blocks like this? > ./btrfs-corrupt-block -l 2899180224512 /dev/mapper/dshelf2 > and just accept the lost data but get btrfs check repair to deal with > the deleted blocks and bring the rest back to a clean state? No, that won't help. Corrupted blocks are corrupted; that command would just corrupt them again. It won't do the black magic of adjusting tree blocks to avoid them. That's done in btrfs check (with --repair). Btrfs check will just skip corrupted tree blocks and continue, while btrfs check --repair will try to rebuild the tree and avoid corrupted blocks. But as you can see, btrfs check can't handle it, due to the complicated combination of corruption. 
So I'm afraid there is no good method to recover.

Thanks,
Qu

>
> Thanks,
> Marc
>
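Qu's point that both DUP copies of a tree block failed verification can be checked mechanically from the check output. The following is a minimal sketch, assuming the log lines have exactly the `checksum verify failed on <bytenr> found <hex> wanted <hex>` shape quoted in this message. The helper name `blocks_with_no_good_copy` and the `copies=2` default (for DUP) are illustrative and not part of btrfs-progs.

```python
import re
from collections import defaultdict

# Matches lines such as:
#   checksum verify failed on 2899180224512 found ABBE39B0 wanted E0735D0E
CSUM_RE = re.compile(
    r"checksum verify failed on (\d+) found ([0-9A-Fa-f]+) wanted ([0-9A-Fa-f]+)"
)


def blocks_with_no_good_copy(log_text, copies=2):
    """Return bytenrs that logged at least `copies` distinct checksum failures.

    Assumption: each failing copy of a tree block produces one log line with
    a different "found" value, so `copies` distinct failures for the same
    bytenr suggest that no mirror passed verification.
    """
    failures = defaultdict(set)
    for bytenr, found, wanted in CSUM_RE.findall(log_text):
        failures[int(bytenr)].add((found.upper(), wanted.upper()))
    return sorted(b for b, seen in failures.items() if len(seen) >= copies)


sample = """\
checksum verify failed on 2899180224512 found ABBE39B0 wanted E0735D0E
checksum verify failed on 2899180224512 found 7A6D427F wanted 7E899EE5
"""
print(blocks_with_no_good_copy(sample))
```

With the two lines quoted above, the only block reported is 2899180224512, which matches the "both copies have a wrong checksum" reading.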
* Re: 4.11 relocate crash, null pointer + rolling back a filesystem by X hours? 2017-05-05 5:03 ` Qu Wenruo @ 2017-05-05 15:43 ` Marc MERLIN 2017-05-17 18:23 ` Kai Krakow 0 siblings, 1 reply; 77+ messages in thread From: Marc MERLIN @ 2017-05-05 15:43 UTC (permalink / raw) To: Qu Wenruo, hurikhan77 Cc: Chris Murphy, Btrfs BTRFS, Chris Mason, bo.li.liu, fdmanana, Josef Bacik, David Sterba Thanks again for your answer. Obviously even if my filesystem is toast, it's useful to learn from what happened. On Fri, May 05, 2017 at 01:03:02PM +0800, Qu Wenruo wrote: > > > So unfortunately, your fs/subvolume trees are also corrupted. > > > And almost no chance to do a graceful recovery. > > So I'm confused here. You're saying my metadata is not corrupted (and in > > my case, I have DUP, so I should have 2 copies), > > Nope, here I'm all talking about metadata (tree blocks). > Difference is the owner, either extent tree or fs/subvolume tree. I see. I didn't realize that my filesystem managed to corrupt both copies of its metadata. > The fsck doesn't check data blocks. Right, that's what scrub does, fair enough. > The problem is, tree blocks (metadata) that refers these data blocks are > corrupted. > > And they are corrupted in such a way that both extent tree (tree contains > extent allocation info) and fs tree (tree contains real fs info, like inode > and data location) are corrupted. > > So graceful recovery is not possible now. I see, thanks for explaining. > Unfortunately, no, even you have 2 copies, a lot of tree blocks are > corrupted that neither copy matches checksum. Thanks for confirming. I guess if I'm having corruption due to a bad card, it makes sense that both get updated after one another and both got corrupted for the same reason. > Corrupted blocks are corrupted, that command is just trying to corrupt it > again. > It won't do the black magic to adjust tree blocks to avoid them. I see. 
You may have seen the earlier message from Kai Krakow, who was able to recover his FS by trying this trick, but I understand it can't work in all cases.

Thanks again for your answers.

Marc
-- 
"A mouse is a device used to point at the xterm you want to type in" - A.S.R.
Microsoft is to operating systems ....
                                      .... what McDonalds is to gourmet cooking
Home page: http://marc.merlins.org/ | PGP 1024R/763BE901
* Re: 4.11 relocate crash, null pointer + rolling back a filesystem by X hours?
  2017-05-05 15:43 ` Marc MERLIN
@ 2017-05-17 18:23 ` Kai Krakow
  0 siblings, 0 replies; 77+ messages in thread
From: Kai Krakow @ 2017-05-17 18:23 UTC
To: Marc MERLIN
Cc: Qu Wenruo, Chris Murphy, Btrfs BTRFS, Chris Mason, bo.li.liu, fdmanana, Josef Bacik, David Sterba

On Fri, 5 May 2017 08:43:23 -0700, Marc MERLIN <marc@merlins.org> wrote:

[missing quote of the command]
> > Corrupted blocks are corrupted, that command is just trying to
> > corrupt it again.
> > It won't do the black magic to adjust tree blocks to avoid them.
>
> I see. You may have seen the earlier message from Kai Krakow who was
> able to recover his FS by trying this trick, but I understand it
> can't work in all cases.

Huh, what trick? I don't take credit for it... ;-)

The corrupt-block trick must've been someone else...

-- 
Regards,
Kai

Replies to list-only preferred.
* Re: 4.11 relocate crash, null pointer + rolling back a filesystem by X hours? 2017-05-01 18:08 ` 4.11 relocate crash, null pointer + rolling back a filesystem by X hours? Marc MERLIN 2017-05-02 1:50 ` Chris Murphy @ 2017-05-05 1:13 ` Qu Wenruo 1 sibling, 0 replies; 77+ messages in thread From: Qu Wenruo @ 2017-05-05 1:13 UTC (permalink / raw) To: Marc MERLIN, linux-btrfs; +Cc: clm, bo.li.liu, fdmanana, jbacik, dsterba At 05/02/2017 02:08 AM, Marc MERLIN wrote: > So, I forgot to mention that it's my main media and backup server that got > corrupted. Yes, I do actually have a backup of a backup server, but it's > going to take days to recover due to the amount of data to copy back, not > counting lots of manual typing due to the number of subvolumes, btrfs > send/receive relationships and so forth. > > Really, I should be able to roll back all writes from the last 24H, run a > check --repair/scrub on top just to be sure, and be back on track. > > In the meantime, the good news is that the filesystem doesn't crash the > kernel (the poasted crash below) now that I was able to cancel the btrfs balance, > but it goes read only at the drop of a hat, even when I'm trying to delete > recent snapshots and all data that was potentially written in the last 24H > > On Mon, May 01, 2017 at 10:06:41AM -0700, Marc MERLIN wrote: >> I have a filesystem that sadly got corrupted by a SAS card I just installed yesterday. >> >> I don't think in a case like this, there is there a way to roll back all >> writes across all subvolumes in the last 24H, correct? Sorry for the late reply. I thought the case is already finished as I see little chance to recover. :( No, no way to roll back unless you're completely sure there is only 1 transaction commit happened in last 24H. (Well, not really possible in real world) Btrfs is only capable to rollback to *previous* commit. That's ensure by forced metadata CoW. But beyond previous commit, only god knows. 
If all metadata CoW write is done in some place never used by any previous metadata, then there is the chance to recover. But mostly the possibility is very low, some mount option like ssd will change the extent allocator behavior to improve the possibility, but still need a lot of luck. More detailed comment will be replied to btrfs check mail. Thanks, Qu >> >> Is the best thing to go in each subvolume, delete the recent snapshots and >> rename the one from 24H as the current one? > > Well, just like I expected, it's a pain in the rear and this can't even help > fix the top level mountpoint which doesn't have snapshots, so I can't roll > it back. > btrfs should really have an easy way to roll back X hours, or days to > recover from garbage written after a good known point, given that it is COW > afterall. > > Is there a way do this with check --repair maybe? > > In the meantime, I got stuck while trying to delete snapshots: > > Let's say I have this: > ID 428 gen 294021 top level 5 path backup > ID 2023 gen 294021 top level 5 path Soft > ID 3021 gen 294051 top level 428 path backup/debian32 > ID 4400 gen 294018 top level 428 path backup/debian64 > ID 4930 gen 294019 top level 428 path backup/ubuntu > > I can easily > Delete subvolume (no-commit): '/mnt/btrfs_pool2/Soft' > and then: > gargamel:/mnt/btrfs_pool2# mv Soft_rw.20170430_01:50:22 Soft > > But I can't delete backup, which actually is mostly only a directory > containing other things (in hindsight I shouldn't have made that a > subvolume) > Delete subvolume (no-commit): '/mnt/btrfs_pool2/backup' > ERROR: cannot delete '/mnt/btrfs_pool2/backup': Directory not empty > > This is because backup has a lot of subvolumes due to btrfs send/receive > relationships. > > Is it possible to recover there? Can you reparent subvolumes to a different > subvolume without doing a full copy via btrfs send/receive? 
> > Thanks, > Marc > >> BTRFS warning (device dm-5): failed to load free space cache for block group 6746013696000, rebuilding it now >> BTRFS warning (device dm-5): block group 6754603630592 has wrong amount of free space >> BTRFS warning (device dm-5): failed to load free space cache for block group 6754603630592, rebuilding it now >> BTRFS warning (device dm-5): block group 7125178777600 has wrong amount of free space >> BTRFS warning (device dm-5): failed to load free space cache for block group 7125178777600, rebuilding it now >> BTRFS error (device dm-5): bad tree block start 3981076597540270796 2899180224512 >> BTRFS error (device dm-5): bad tree block start 942082474969670243 2899180224512 >> BTRFS: error (device dm-5) in __btrfs_free_extent:6944: errno=-5 IO failure >> BTRFS info (device dm-5): forced readonly >> BTRFS: error (device dm-5) in btrfs_run_delayed_refs:2961: errno=-5 IO failure >> BUG: unable to handle kernel NULL pointer dereference at (null) >> IP: __del_reloc_root+0x3f/0xa6 >> PGD 189a0e067 >> PUD 189a0f067 >> PMD 0 >> >> Oops: 0000 [#1] PREEMPT SMP >> Modules linked in: veth ip6table_filter ip6_tables ebtable_nat ebtables ppdev lp xt_addrtype br_netfilter bridge stp llc tun autofs4 softdog binfmt_misc ftdi_sio nfsd auth_rpcgss nfs_acl nfs lockd grace fscache sunrpc ipt_REJECT nf_reject_ipv4 xt_conntrack xt_mark xt_nat xt_tcpudp nf_log_ipv4 nf_log_common xt_LOG iptable_mangle iptable_filter lm85 hwmon_vid pl2303 dm_snapshot dm_bufio iptable_nat ip_tables nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_conntrack_ftp ipt_MASQUERADE nf_nat_masquerade_ipv4 nf_nat nf_conntrack x_tables sg st snd_pcm_oss snd_mixer_oss bcache kvm_intel kvm irqbypass snd_hda_codec_realtek snd_cmipci snd_hda_codec_generic snd_hda_intel snd_mpu401_uart snd_hda_codec snd_opl3_lib snd_rawmidi snd_hda_core snd_seq_device snd_hwdep eeepc_wmi snd_pcm asus_wmi rc_ati_x10 >> asix snd_timer ati_remote sparse_keymap usbnet rfkill snd hwmon soundcore rc_core evdev libphy 
tpm_infineon pcspkr i915 parport_pc i2c_i801 input_leds mei_me lpc_ich parport tpm_tis battery usbserial tpm_tis_core tpm wmi e1000e ptp pps_core fuse raid456 multipath mmc_block mmc_core lrw ablk_helper dm_crypt dm_mod async_raid6_recov async_pq async_xor async_memcpy async_tx crc32c_intel blowfish_x86_64 blowfish_common pcbc aesni_intel aes_x86_64 crypto_simd glue_helper cryptd xhci_pci ehci_pci sata_sil24 xhci_hcd mvsas ehci_hcd r8169 usbcore mii libsas scsi_transport_sas thermal fan [last unloaded: ftdi_sio] >> CPU: 0 PID: 9056 Comm: btrfs Tainted: G U 4.11.0-amd64-preempt-sysrq-20170406 #2 >> Hardware name: System manufacturer System Product Name/P8H67-M PRO, BIOS 3904 04/27/2013 >> task: ffff88374d2a60c0 task.stack: ffffa6f226424000 >> RIP: 0010:__del_reloc_root+0x3f/0xa6 >> RSP: 0018:ffffa6f226427a40 EFLAGS: 00210246 >> RAX: 0000000000000000 RBX: ffff8838ee256000 RCX: 00000000ffffffe2 >> RDX: 0000000000000001 RSI: ffffffff9f83b410 RDI: ffff8837992da568 >> RBP: ffffa6f226427a68 R08: 0000000000000000 R09: ffffffff9fd69480 >> R10: 0000000000000000 R11: 0000000000000000 R12: ffffa6f226427ab0 >> R13: ffff883768938000 R14: ffff8837992da568 R15: ffff8837992da570 >> FS: 00007facd18d28c0(0000) GS:ffff883a5e200000(0000) knlGS:0000000000000000 >> CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 >> CR2: 0000000000000000 CR3: 0000000189a10000 CR4: 00000000001406f0 >> Call Trace: >> free_reloc_roots+0x4f/0x5d >> merge_reloc_roots+0x159/0x1ba >> relocate_block_group+0x410/0x492 >> btrfs_relocate_block_group+0x12d/0x253 >> btrfs_relocate_chunk+0x3e/0xb1 >> btrfs_balance+0xd16/0xf36 >> btrfs_ioctl_balance+0x24f/0x2cd >> ? __alloc_pages_nodemask+0x134/0x1e0 >> btrfs_ioctl+0x1447/0x1e22 >> ? mem_cgroup_charge_statistics+0x1e/0x88 >> ? get_page+0x9/0x26 >> ? __lru_cache_add+0x2a/0x6c >> ? set_pte_at+0x9/0xd >> ? __handle_mm_fault+0x61d/0xa6f >> vfs_ioctl+0x21/0x38 >> ? vfs_ioctl+0x21/0x38 >> do_vfs_ioctl+0x4ef/0x537 >> ? current_kernel_time64+0x10/0x36 >> ? 
__audit_syscall_entry+0xc2/0xe6 >> ? syscall_trace_enter+0x1ac/0x20e >> SyS_ioctl+0x57/0x7b >> do_syscall_64+0x6b/0x7d >> entry_SYSCALL64_slow_path+0x25/0x25 >> RIP: 0033:0x7facd097ecc7 >> RSP: 002b:00007ffefd3c3128 EFLAGS: 00000206 ORIG_RAX: 0000000000000010 >> RAX: ffffffffffffffda RBX: 0000000000000001 RCX: 00007facd097ecc7 >> RDX: 00007ffefd3c31b8 RSI: 00000000c4009420 RDI: 0000000000000003 >> RBP: 00007ffefd3c31b8 R08: 0000000000000003 R09: 0000000000008040 >> R10: 0000000000000541 R11: 0000000000000206 R12: 0000000000000003 >> R13: 00007ffefd3c4cc9 R14: 0000000000000001 R15: 0000000000000001 >> Code: af f0 01 00 00 48 89 fb 4d 8b b5 10 0b 00 00 4d 8d be 70 05 00 00 49 81 c6 68 05 00 00 4c 89 ff e8 0f 44 43 00 48 8b 03 4c 89 f7 <48> 8b 30 e8 0e fc ff ff 48 85 c0 49 89 c4 74 0b 4c 89 f6 48 89 >> RIP: __del_reloc_root+0x3f/0xa6 RSP: ffffa6f226427a40 >> CR2: 0000000000000000 >> ---[ end trace 64c3fa4dc953d295 ]--- >> Kernel panic - not syncing: Fatal exception >> Kernel Offset: 0x1e000000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff) >> Rebooting in 20 seconds.. >> ACPI MEMORY or I/O RESET_REG. >> >> -- >> "A mouse is a device used to point at the xterm you want to type in" - A.S.R. >> Microsoft is to operating systems .... >> .... what McDonalds is to gourmet cooking >> Home page: http://marc.merlins.org/ > ^ permalink raw reply [flat|nested] 77+ messages in thread
* Re: How to fix errors that check --mode lomem finds, but --mode normal doesn't?
  2017-06-28 14:43 ` Marc MERLIN
  2017-05-01 17:06 ` 4.11 relocate crash, null pointer Marc MERLIN
@ 2017-06-29 13:36 ` Lu Fengqi
  2017-06-29 15:30 ` Marc MERLIN
  1 sibling, 1 reply; 77+ messages in thread
From: Lu Fengqi @ 2017-06-29 13:36 UTC
To: Marc MERLIN; +Cc: Qu Wenruo, Btrfs BTRFS

On Wed, Jun 28, 2017 at 07:43:48AM -0700, Marc MERLIN wrote:
>[cc trimmed]
>
>On Wed, Jun 28, 2017 at 03:10:27PM +0800, Lu Fengqi wrote:
>> Because the output is abnormal, except for the relevant DIR_ITEM and
>> DIR_INDEX, I can't find the above mentioned INODE_ITEM and EXTENT_DATA.
>> I wonder if the file system is online when this command is executed? If
>> so, please re-execute it offline again; if not, could you apply my
>> patches and re-check it again?
>
>The filesystem was offline and I had those 2 patches applied.

I am afraid I don't know why the inode item disappears. Besides, if
btrfs-debug-tree can't find the inode item, btrfs check shouldn't report
this inode item's extent data interrupt. Could you check the disk
again? The error output may have changed.

>
>Marc
>-- 
>"A mouse is a device used to point at the xterm you want to type in" - A.S.R.
>Microsoft is to operating systems ....
>                                      .... what McDonalds is to gourmet cooking
>Home page: http://marc.merlins.org/ | PGP 1024R/763BE901

-- 
Thanks,
Lu
* Re: How to fix errors that check --mode lomem finds, but --mode normal doesn't? 2017-06-29 13:36 ` How to fix errors that check --mode lomem finds, but --mode normal doesn't? Lu Fengqi @ 2017-06-29 15:30 ` Marc MERLIN 2017-06-30 14:59 ` Lu Fengqi 0 siblings, 1 reply; 77+ messages in thread From: Marc MERLIN @ 2017-06-29 15:30 UTC (permalink / raw) To: Lu Fengqi; +Cc: Qu Wenruo, Btrfs BTRFS On Thu, Jun 29, 2017 at 09:36:15PM +0800, Lu Fengqi wrote: > On Wed, Jun 28, 2017 at 07:43:48AM -0700, Marc MERLIN wrote: > >[cc trimmed] > > > >On Wed, Jun 28, 2017 at 03:10:27PM +0800, Lu Fengqi wrote: > >> Because the output is abnormal, except for the relevant DIR_ITEM and > >> DIR_INDEX, I can't find the above mentiond INODE_ITEM and EXTENT_DATA. > >> I wonder if the file system is online when this command is executed? If > >> so, please re-execute it offline again; if not, could you apply my > >> patches re-check it again? > > > >The filesystem was offline and I had those 2 patches applied. > > I am afraid I don't know why the inode item disappers. Besides, if > btrfs-debug-tree can't find the inode item, btrfs check shouldn't report > this inode item's extent data interrupt. Could you check the disk > again? The error output may have changed. I just did but it takes 24H. 
I just have the results now:

gargamel:~# btrfs check --mode lowmem /dev/mapper/dshelf2
Checking filesystem on /dev/mapper/dshelf2
UUID: 85441c59-ad11-4b25-b1fe-974f9e4acede
checking extents
checking free space cache
cache and super generation don't match, space cache will be invalidated
checking fs roots
ERROR: root 3862 EXTENT_DATA[18170706 4096] interrupt
ERROR: root 3862 EXTENT_DATA[18170706 16384] interrupt
ERROR: root 3862 EXTENT_DATA[18170706 20480] interrupt
ERROR: root 3862 EXTENT_DATA[18170706 135168] interrupt
ERROR: root 3862 EXTENT_DATA[18170706 1048576] interrupt
ERROR: errors found in fs roots
found 5544779124736 bytes used, error(s) found
total csum bytes: 5344523140
total tree bytes: 71323058176
total fs tree bytes: 59288403968
total extent tree bytes: 5378277376
btree space waste bytes: 10912183048
file data blocks allocated: 7830914256896
 referenced 6244104495104

This is looking better, but not 0.
Can I ignore these or should we look into them still?

Marc
-- 
"A mouse is a device used to point at the xterm you want to type in" - A.S.R.
Microsoft is to operating systems ....
                                      .... what McDonalds is to gourmet cooking
Home page: http://marc.merlins.org/ | PGP 1024R/763BE901
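To see at a glance which inodes are affected, the `EXTENT_DATA[...] interrupt` lines from the check output can be grouped by root and inode. This is a small sketch, assuming the two numbers in brackets are the inode number and the file offset (which is how Qu reads them elsewhere in this thread). `group_interrupts` is a made-up helper name, not a btrfs-progs feature.

```python
import re
from collections import defaultdict

# Matches lines such as:
#   ERROR: root 3862 EXTENT_DATA[18170706 4096] interrupt
ERR_RE = re.compile(r"ERROR: root (\d+) EXTENT_DATA\[(\d+) (\d+)\] interrupt")


def group_interrupts(check_output):
    """Map (root, inode) -> sorted list of file offsets reported as 'interrupt'."""
    grouped = defaultdict(list)
    for root, inode, offset in ERR_RE.findall(check_output):
        grouped[(int(root), int(inode))].append(int(offset))
    return {key: sorted(offsets) for key, offsets in grouped.items()}


sample = """\
ERROR: root 3862 EXTENT_DATA[18170706 4096] interrupt
ERROR: root 3862 EXTENT_DATA[18170706 16384] interrupt
ERROR: root 3862 EXTENT_DATA[18170706 20480] interrupt
ERROR: root 3862 EXTENT_DATA[18170706 135168] interrupt
ERROR: root 3862 EXTENT_DATA[18170706 1048576] interrupt
ERROR: errors found in fs roots
"""
for (root, inode), offsets in group_interrupts(sample).items():
    print(f"root {root} inode {inode}: {len(offsets)} reports at offsets {offsets}")
```

For this run, all five reports collapse onto a single inode (18170706) in root 3862.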
* Re: How to fix errors that check --mode lomem finds, but --mode normal doesn't? 2017-06-29 15:30 ` Marc MERLIN @ 2017-06-30 14:59 ` Lu Fengqi 0 siblings, 0 replies; 77+ messages in thread From: Lu Fengqi @ 2017-06-30 14:59 UTC (permalink / raw) To: Marc MERLIN; +Cc: Qu Wenruo, Btrfs BTRFS On Thu, Jun 29, 2017 at 08:30:35AM -0700, Marc MERLIN wrote: >On Thu, Jun 29, 2017 at 09:36:15PM +0800, Lu Fengqi wrote: >> On Wed, Jun 28, 2017 at 07:43:48AM -0700, Marc MERLIN wrote: >> >[cc trimmed] >> > >> >On Wed, Jun 28, 2017 at 03:10:27PM +0800, Lu Fengqi wrote: >> >> Because the output is abnormal, except for the relevant DIR_ITEM and >> >> DIR_INDEX, I can't find the above mentiond INODE_ITEM and EXTENT_DATA. >> >> I wonder if the file system is online when this command is executed? If >> >> so, please re-execute it offline again; if not, could you apply my >> >> patches re-check it again? >> > >> >The filesystem was offline and I had those 2 patches applied. >> >> I am afraid I don't know why the inode item disappers. Besides, if >> btrfs-debug-tree can't find the inode item, btrfs check shouldn't report >> this inode item's extent data interrupt. Could you check the disk >> again? The error output may have changed. > >I just did but it takes 24H. 
I just have the results now:
>gargamel:~# btrfs check --mode lowmem /dev/mapper/dshelf2
>Checking filesystem on /dev/mapper/dshelf2
>UUID: 85441c59-ad11-4b25-b1fe-974f9e4acede
>checking extents
>checking free space cache
>cache and super generation don't match, space cache will be invalidated
>checking fs roots
>ERROR: root 3862 EXTENT_DATA[18170706 4096] interrupt
>ERROR: root 3862 EXTENT_DATA[18170706 16384] interrupt
>ERROR: root 3862 EXTENT_DATA[18170706 20480] interrupt
>ERROR: root 3862 EXTENT_DATA[18170706 135168] interrupt
>ERROR: root 3862 EXTENT_DATA[18170706 1048576] interrupt
>ERROR: errors found in fs roots
>found 5544779124736 bytes used, error(s) found
>total csum bytes: 5344523140
>total tree bytes: 71323058176
>total fs tree bytes: 59288403968
>total extent tree bytes: 5378277376
>btree space waste bytes: 10912183048
>file data blocks allocated: 7830914256896
> referenced 6244104495104
>
>
>This is looking better, but not 0.
>Can I ignore these or should we look into them still?
>
>Marc
>-- 
>"A mouse is a device used to point at the xterm you want to type in" - A.S.R.
>Microsoft is to operating systems ....
>                                      .... what McDonalds is to gourmet cooking
>Home page: http://marc.merlins.org/ | PGP 1024R/763BE901
>
>

Personally, I think that since normal mode didn't report any error related to this inode, these errors may be caused by a bug in lowmem mode and btrfs-debug-tree.

At your convenience, would you please give me all the items for this inode? I think they can provide some clues regarding the disappearance of the inode and the extent interrupt. They can be dumped with the following command:

# btrfs-debug-tree /dev/mapper/dshelf2 | grep -C 10 18170706

Please note that this dump may contain filenames; feel free to mask them.

Thank you for your assistance.

-- 
Thanks,
Lu
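Masking the filenames before posting such a dump can be scripted. The sketch below assumes that names appear in the dump as the rest of a line after the literal text `name: ` (as in `inode ref` and `DIR_ITEM` entries); that layout, the helper name `mask_names`, and the sample line are assumptions for illustration and should be checked against the real output before relying on it.

```python
import hashlib
import re

# Assumed layout: "... namelen 10 name: secret.txt" (name runs to end of line).
NAME_RE = re.compile(r"(\bname: )(.*)$")


def mask_names(dump_text):
    """Replace everything after 'name: ' on each line with a short stable hash.

    The same name always maps to the same token, so DIR_ITEM, DIR_INDEX and
    INODE_REF entries for one file can still be correlated after masking.
    """
    def _mask(match):
        digest = hashlib.sha256(match.group(2).encode("utf-8")).hexdigest()[:12]
        return f"{match.group(1)}masked-{digest}"

    return "\n".join(NAME_RE.sub(_mask, line) for line in dump_text.splitlines())


# Hypothetical dump fragment, not taken from the real filesystem.
sample = (
    "\titem 5 key (18170706 INODE_REF 256) itemoff 15000 itemsize 20\n"
    "\t\tinode ref index 2 namelen 10 name: secret.txt"
)
print(mask_names(sample))
```

A truncated hash of a guessable filename is not strong anonymization; it only keeps names out of casual view on a public list.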
* Re: How to fix errors that check --mode lomem finds, but --mode normal doesn't? 2017-06-22 2:53 ` Marc MERLIN 2017-06-22 4:08 ` Qu Wenruo @ 2017-06-22 4:08 ` Qu Wenruo 1 sibling, 0 replies; 77+ messages in thread From: Qu Wenruo @ 2017-06-22 4:08 UTC (permalink / raw) To: Marc MERLIN; +Cc: Chris Murphy, Hugo Mills, Btrfs BTRFS At 06/22/2017 10:53 AM, Marc MERLIN wrote: > Ok, first it finished (almost 24H) > > (...) > ERROR: root 3862 EXTENT_DATA[18170706 135168] interrupt > ERROR: root 3862 EXTENT_DATA[18170706 1048576] interrupt > ERROR: root 3864 EXTENT_DATA[109336 4096] interrupt > ERROR: errors found in fs roots > found 5544779108352 bytes used, error(s) found > total csum bytes: 5344523140 > total tree bytes: 71323041792 > total fs tree bytes: 59288403968 > total extent tree bytes: 5378260992 > btree space waste bytes: 10912166856 > file data blocks allocated: 7830914256896 > referenced 6244104495104 > > Thanks for your reply Qu > > On Thu, Jun 22, 2017 at 10:22:57AM +0800, Qu Wenruo wrote: >>> gargamel:~# btrfs check -p --mode lowmem /dev/mapper/dshelf2 >>> Checking filesystem on /dev/mapper/dshelf2 >>> UUID: 85441c59-ad11-4b25-b1fe-974f9e4acede >>> ERROR: extent[3886187384832, 81920] referencer count mismatch (root: >>> 11930, owner: 375444, offset: 1851654144) wanted: 1, have: 4 >> >> This means that in extent tree, btrfs says there is only one referring >> to this extent, but lowmem mode find 4. >> >> It would provide great help if you could dump extent tree for it. 
>> # btrfs-debug-tree <dev> | grep -C 10 3886187384832 > > extent data backref root 11712 objectid 375444 offset 1851572224 count 1 > extent data backref root 11276 objectid 375444 offset 1851572224 count 1 > extent data backref root 11058 objectid 375444 offset 1851572224 count 1 > extent data backref root 11494 objectid 375444 offset 1851572224 count 1 > item 37 key (3886187352064 EXTENT_ITEM 32768) itemoff 11381 itemsize 140 > extent refs 4 gen 32382 flags DATA > extent data backref root 11712 objectid 375444 offset 1851596800 count 1 > extent data backref root 11276 objectid 375444 offset 1851596800 count 1 > extent data backref root 11058 objectid 375444 offset 1851596800 count 1 > extent data backref root 11494 objectid 375444 offset 1851596800 count 1 > item 38 key (3886187384832 EXTENT_ITEM 81920) itemoff 11212 itemsize 169 > extent refs 16 gen 32382 flags DATA > extent data backref root 11712 objectid 375444 offset 1851654144 count 4 > extent data backref root 11276 objectid 375444 offset 1851654144 count 4 > extent data backref root 11058 objectid 375444 offset 1851654144 count 3 > extent data backref root 11494 objectid 375444 offset 1851654144 count 4 > extent data backref root 11930 objectid 375444 offset 1851654144 count 1 > item 39 key (3886187466752 EXTENT_ITEM 16384) itemoff 11043 itemsize 169 > extent refs 5 gen 32382 flags DATA > extent data backref root 11712 objectid 375444 offset 1851744256 count 1 > extent data backref root 11276 objectid 375444 offset 1851744256 count 1 Well, there is only the output from extent tree. I was also expecting output from subvolue (11930) tree. It could be done by # btrfs-debug-tree -t 11930 | grep -C 10 3886187384832 But please pay attention that, this dump may contain filenames, feel free to mask the filenames. 
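As a quick sanity check on the extent tree dump quoted here, the backref counts listed under an extent item can be added up and compared with its `extent refs` value. This is a sketch that assumes the `btrfs-debug-tree` text layout shown in this message and that all backrefs for the item are inline in the dumped text. `check_extent_refs` is a hypothetical helper, not a btrfs-progs tool.

```python
import re
from collections import Counter

REFS_RE = re.compile(r"extent refs (\d+)")
BACKREF_RE = re.compile(
    r"extent data backref root (\d+) objectid \d+ offset \d+ count (\d+)"
)


def check_extent_refs(item_text):
    """Return (declared_refs, {root: count}, declared == sum of counts)."""
    declared = int(REFS_RE.search(item_text).group(1))
    per_root = Counter()
    for root, count in BACKREF_RE.findall(item_text):
        per_root[int(root)] += int(count)
    return declared, dict(per_root), declared == sum(per_root.values())


# Item 38 from the dump quoted in this message.
sample = """\
item 38 key (3886187384832 EXTENT_ITEM 81920) itemoff 11212 itemsize 169
    extent refs 16 gen 32382 flags DATA
    extent data backref root 11712 objectid 375444 offset 1851654144 count 4
    extent data backref root 11276 objectid 375444 offset 1851654144 count 4
    extent data backref root 11058 objectid 375444 offset 1851654144 count 3
    extent data backref root 11494 objectid 375444 offset 1851654144 count 4
    extent data backref root 11930 objectid 375444 offset 1851654144 count 1
"""
declared, per_root, consistent = check_extent_refs(sample)
print(declared, per_root[11930], consistent)
```

For this item the extent tree is internally consistent (4 + 4 + 3 + 4 + 1 = 16) and records a count of 1 for root 11930, which is the "wanted: 1" side of the lowmem message. The "have: 4" side comes from walking subvolume 11930 itself, hence the request for that tree's dump.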
> > >>> ERROR: errors found in extent allocation tree or chunk allocation >>> cache and super generation don't match, space cache will be invalidated >>> ERROR: root 3857 EXTENT_DATA[108864 4096] interrupt >> >> This means that, for root 3857, inode 108864, file offset 4096, there is >> a gap before that extent. >> In NO_HOLES mode it's allowed, but if NO_HOLES incompat flag is not set, >> this should be a problem. >> >> I wonder if this is a problem caused by inlined compressed file extent. >> >> This can also be dumped by the following command. >> # btrfs-debug-tree -t 3857 <dev> | grep -C 10 108864 > > This one is much bigger (192KB), I've bzipped and attached it. Thanks for this one. And it is caused by inlined compressed extent. Lu Fengqi will send patch fixing it. Thanks, Qu > > Thanks for having a look, I appreciate it. > > Marc > ^ permalink raw reply [flat|nested] 77+ messages in thread
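Qu's description of an "interrupt" (a gap before a file extent when the NO_HOLES flag is not set) can be sketched as a simple scan over an inode's extents sorted by offset. This is a toy model and not the lowmem checker's actual code; `find_gaps` and the `(offset, length)` tuples are assumptions made for illustration.

```python
def find_gaps(extents, start=0):
    """Return offsets of extents that begin after the end of the previous one.

    `extents` is an iterable of (file_offset, length) pairs for one inode.
    Without NO_HOLES, holes are expected to be described by explicit
    EXTENT_DATA items, so a jump between consecutive items is suspicious.
    """
    gaps = []
    expected = start
    for offset, length in sorted(extents):
        if offset > expected:
            gaps.append(offset)
        expected = max(expected, offset + length)
    return gaps


# Hypothetical inode: a 2000-byte first extent, then an extent at 4096.
print(find_gaps([(0, 2000), (4096, 4096)]))
# Hypothetical inode whose extents are contiguous.
print(find_gaps([(0, 4096), (4096, 4096)]))
```

The first call reports a gap at offset 4096 and the second reports none. The thread attributes the false reports to inlined compressed extents; how exactly the checker miscounted is not stated here, so the model only illustrates the general gap test.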
* Re: 4.11.3: BTRFS critical (device dm-1): unable to add free space :-17 => btrfs check --repair runs clean 2017-06-20 23:12 ` Marc MERLIN 2017-06-20 23:58 ` Marc MERLIN 2017-06-21 3:31 ` Chris Murphy @ 2017-06-21 12:04 ` Duncan 2 siblings, 0 replies; 77+ messages in thread From: Duncan @ 2017-06-21 12:04 UTC (permalink / raw) To: linux-btrfs Marc MERLIN posted on Tue, 20 Jun 2017 16:12:03 -0700 as excerpted: > On Tue, Jun 20, 2017 at 08:44:29AM -0700, Marc MERLIN wrote: >> On Tue, Jun 20, 2017 at 03:36:01PM +0000, Hugo Mills wrote: >> >>>> "space cache will be invalidated " => doesn't that mean that my >>>> cache was already cleared by check --repair, or are you saying I >>>> need to clear it again? >>> >>> I'm never quite sure about that one. :) >>> >>> It can't hurt to clear it manually as well. >> >> Sounds good, done. > > Except it didn't help :( > It worked for a while, and failed again. > > It looks like I'm hitting a persistent bug :( [Omitted free space cache dmesg errors] > Given that check --repair ran clean when I ran it yesterday after this > first happened, and I then ran mount -o clear_cache , the cache got > rebuilt, and I got the problem again, this is not looking good, seems > like a persistent bug :-/ Keep in mind this quote from a recent (I'm quoting -progs 4.11) btrfs- check manpage (reformatted for posting): >>>>> --clear-space-cache v1|v2 completely wipe all free space cache of given type For free space cache v1, the clear_cache kernel mount option only rebuilds the free space cache for block groups that are modified while the filesystem is mounted with that option. Thus, using this option with v1 makes it possible to actually clear the entire free space cache. For free space cache v2, the clear_cache kernel mount option does destroy the entire free space cache. This option with v2 provides an alternative method of clearing the free space cache that doesn’t require mounting the filesystem. 
<<<<< Given the dmesg, seems you're still running the space cache, not the v2/ tree (which is fine, I'm conservative enough not to have switched yet either). So try the check option instead of the mount option. The mount option might simply have not caught all the badness while it was active. -- Duncan - List replies preferred. No HTML msgs. "Every nonfree program has a lord, a master -- and if you use the program, he is your master." Richard Stallman ^ permalink raw reply [flat|nested] 77+ messages in thread
* Re: 4.11.3: BTRFS critical (device dm-1): unable to add free space :-17 => btrfs check --repair runs clean 2017-06-20 15:44 ` Marc MERLIN 2017-06-20 23:12 ` Marc MERLIN @ 2017-06-21 3:26 ` Chris Murphy 2017-06-21 4:06 ` Marc MERLIN 1 sibling, 1 reply; 77+ messages in thread From: Chris Murphy @ 2017-06-21 3:26 UTC (permalink / raw) To: Marc MERLIN; +Cc: Hugo Mills, Btrfs BTRFS On Tue, Jun 20, 2017 at 9:44 AM, Marc MERLIN <marc@merlins.org> wrote: > In the meantime, I ran into this again: > https://bugzilla.kernel.org/show_bug.cgi?id=195863 > btrfs check of a big filesystem kills the kernel due to OOM (but btrfs userspace is not OOM killed) > > Is it achievable at all for btrfs check to realize that it's taking all the > available RAM in kernel space, is about to crash the system, and cancel the > check before the system crashes? > I've already confirmed that it doesn't use swap. I've just had to order new > RAM to upgrade my machine from 24GB to 32GB, but 32GB is max for that > hardware, so hopefully the lowmem repair stuff will work before I hit the > 32GB limit next time. Right now Btrfs isn't scalable if you have to repair it because large volumes run into this problem; one of the reasons for the lowmem mode. It's a separate bug that it OOMs even with swap, I don't know why it won't use that, it should be up to kernel memory management to deal with this; I know this works with xfs_repair. I don't know if the idea is that normal mode will go away, in favor of lowmem mode, or if there are fixes still planned for normal mode. If it's going to stick around, it needs to be able to use swap, same for lowmem mode. Just running into a total inability to --repair isn't OK. -- Chris Murphy ^ permalink raw reply [flat|nested] 77+ messages in thread
* Re: 4.11.3: BTRFS critical (device dm-1): unable to add free space :-17 => btrfs check --repair runs clean
  2017-06-21  3:26 ` Chris Murphy
@ 2017-06-21  4:06 ` Marc MERLIN
  0 siblings, 0 replies; 77+ messages in thread
From: Marc MERLIN @ 2017-06-21 4:06 UTC
To: Chris Murphy; +Cc: Hugo Mills, Btrfs BTRFS

On Tue, Jun 20, 2017 at 09:26:27PM -0600, Chris Murphy wrote:
> Right now Btrfs isn't scalable if you have to repair it because large
> volumes run into this problem; one of the reasons for the lowmem mode.
>
> It's a separate bug that it OOMs even with swap, I don't know why it
> won't use that, it should be up to kernel memory management to deal

The thing is that it doesn't even get OOM'ed. I didn't look at the code, but
I'm assuming it must be using kernel RAM instead of user space RAM, which is
why it can't be OOM'ed and why it gets the kernel to deadlock.

If that is the case, then the user space code should monitor kernel space
usage and cancel the check if it's about to run out of usable RAM (better
than deadlocking the system).

Marc
-- 
"A mouse is a device used to point at the xterm you want to type in" - A.S.R.
Microsoft is to operating systems ....
                                      .... what McDonalds is to gourmet cooking
Home page: http://marc.merlins.org/
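The idea of bailing out before memory runs out could look roughly like this in user space. It is a sketch only: it assumes a Linux `/proc/meminfo` with a `MemAvailable` field (present since kernel 3.14), and the 512 MiB threshold and the function names are arbitrary choices. Whether `MemAvailable` reflects the kind of memory that btrfs check was exhausting here is not established in this thread.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into {field: value in kB}."""
    info = {}
    for line in text.splitlines():
        name, _, rest = line.partition(":")
        parts = rest.split()
        if parts and parts[0].isdigit():
            info[name.strip()] = int(parts[0])
    return info


def should_abort(meminfo_text, min_available_kib=512 * 1024):
    """True when MemAvailable has dropped below the chosen safety margin."""
    available = parse_meminfo(meminfo_text).get("MemAvailable")
    return available is not None and available < min_available_kib


# Made-up numbers for illustration, not measurements from gargamel.
sample = "MemTotal:       24576000 kB\nMemAvailable:     300000 kB\n"
print(should_abort(sample))

# On a live system, a long-running tool would poll periodically, e.g.:
#   with open("/proc/meminfo") as f:
#       if should_abort(f.read()):
#           raise SystemExit("low memory, cancelling check")
```

Polling from user space only helps if the process still gets scheduled; it would not prevent a deadlock that happens between two polls.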