* btrfs balance did not progress after 12H
@ 2018-06-18 13:00 Marc MERLIN
  2018-06-19 15:47 ` Marc MERLIN
  0 siblings, 1 reply; 10+ messages in thread
From: Marc MERLIN @ 2018-06-18 13:00 UTC (permalink / raw)
  To: Linux fs Btrfs

So, I ran this:
gargamel:/mnt/btrfs_pool2# btrfs balance start -dusage=60 -v .  &
[1] 24450
Dumping filters: flags 0x1, state 0x0, force is off
  DATA (flags 0x2): balancing, usage=60
gargamel:/mnt/btrfs_pool2# while :; do btrfs balance status .; sleep 60; done
0 out of about 0 chunks balanced (0 considered), -nan% left
Balance on '.' is running
0 out of about 73 chunks balanced (2 considered), 100% left
Balance on '.' is running

After about 20min, it changed to this:
1 out of about 73 chunks balanced (6724 considered),  99% left
Balance on '.' is running

Now, 12H later, it's still there, only 1 out of 73.

gargamel:/mnt/btrfs_pool2# btrfs fi show .
Label: 'dshelf2'  uuid: 0f1a0c9f-4e54-4fa7-8736-fd50818ff73d
        Total devices 1 FS bytes used 12.72TiB
        devid    1 size 14.55TiB used 13.81TiB path /dev/mapper/dshelf2

gargamel:/mnt/btrfs_pool2# btrfs fi df .
Data, single: total=13.57TiB, used=12.60TiB
System, DUP: total=32.00MiB, used=1.55MiB
Metadata, DUP: total=121.50GiB, used=116.53GiB
GlobalReserve, single: total=512.00MiB, used=848.00KiB
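
(In other words, 13.81TiB of the 14.55TiB device is allocated to chunks,
and the data chunks alone are carrying 13.57 - 12.60 = ~1TiB of slack,
some of which this balance should be able to hand back to unallocated.)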

kernel: 4.16.8

Is that expected? Should I be ready to wait days possibly for this
balance to finish?

Thanks,
Marc
-- 
"A mouse is a device used to point at the xterm you want to type in" - A.S.R.
Microsoft is to operating systems ....
                                      .... what McDonalds is to gourmet cooking
Home page: http://marc.merlins.org/                       | PGP 7F55D5F27AAF9D08

* Re: btrfs balance did not progress after 12H
  2018-06-18 13:00 btrfs balance did not progress after 12H Marc MERLIN
@ 2018-06-19 15:47 ` Marc MERLIN
  2018-06-19 16:30   ` james harvey
  0 siblings, 1 reply; 10+ messages in thread
From: Marc MERLIN @ 2018-06-19 15:47 UTC (permalink / raw)
  To: Linux fs Btrfs

On Mon, Jun 18, 2018 at 06:00:55AM -0700, Marc MERLIN wrote:
> So, I ran this:
> gargamel:/mnt/btrfs_pool2# btrfs balance start -dusage=60 -v .  &
> [1] 24450
> Dumping filters: flags 0x1, state 0x0, force is off
>   DATA (flags 0x2): balancing, usage=60
> gargamel:/mnt/btrfs_pool2# while :; do btrfs balance status .; sleep 60; done
> 0 out of about 0 chunks balanced (0 considered), -nan% left
> Balance on '.' is running
> 0 out of about 73 chunks balanced (2 considered), 100% left
> Balance on '.' is running
> 
> After about 20min, it changed to this:
> 1 out of about 73 chunks balanced (6724 considered),  99% left
> Balance on '.' is running
> 
> Now, 12H later, it's still there, only 1 out of 73.
> 
> gargamel:/mnt/btrfs_pool2# btrfs fi show .
> Label: 'dshelf2'  uuid: 0f1a0c9f-4e54-4fa7-8736-fd50818ff73d
>         Total devices 1 FS bytes used 12.72TiB
>         devid    1 size 14.55TiB used 13.81TiB path /dev/mapper/dshelf2
> 
> gargamel:/mnt/btrfs_pool2# btrfs fi df .
> Data, single: total=13.57TiB, used=12.60TiB
> System, DUP: total=32.00MiB, used=1.55MiB
> Metadata, DUP: total=121.50GiB, used=116.53GiB
> GlobalReserve, single: total=512.00MiB, used=848.00KiB
> 
> kernel: 4.16.8
> 
> Is that expected? Should I be ready to wait days possibly for this
> balance to finish?
 
It's now been 2 days, and it's still stuck at 1%:
1 out of about 73 chunks balanced (6724 considered),  99% left

Marc
-- 
"A mouse is a device used to point at the xterm you want to type in" - A.S.R.
Microsoft is to operating systems ....
                                      .... what McDonalds is to gourmet cooking
Home page: http://marc.merlins.org/                       | PGP 7F55D5F27AAF9D08

* Re: btrfs balance did not progress after 12H
  2018-06-19 15:47 ` Marc MERLIN
@ 2018-06-19 16:30   ` james harvey
  2018-06-19 16:58     ` Austin S. Hemmelgarn
  0 siblings, 1 reply; 10+ messages in thread
From: james harvey @ 2018-06-19 16:30 UTC (permalink / raw)
  To: Marc MERLIN; +Cc: Linux fs Btrfs

On Tue, Jun 19, 2018 at 11:47 AM, Marc MERLIN <marc@merlins.org> wrote:
> On Mon, Jun 18, 2018 at 06:00:55AM -0700, Marc MERLIN wrote:
>> So, I ran this:
>> gargamel:/mnt/btrfs_pool2# btrfs balance start -dusage=60 -v .  &
>> [1] 24450
>> Dumping filters: flags 0x1, state 0x0, force is off
>>   DATA (flags 0x2): balancing, usage=60
>> gargamel:/mnt/btrfs_pool2# while :; do btrfs balance status .; sleep 60; done
>> 0 out of about 0 chunks balanced (0 considered), -nan% left

This (0/0/0, -nan%) seems alarming.  I had this output once when the
system spontaneously rebooted during a balance.  I didn't have any bad
effects afterward.

>> Balance on '.' is running
>> 0 out of about 73 chunks balanced (2 considered), 100% left
>> Balance on '.' is running
>>
>> After about 20min, it changed to this:
>> 1 out of about 73 chunks balanced (6724 considered),  99% left

This seems alarming.  I wouldn't think # considered should ever exceed
# chunks.  It does say "about", so maybe it can by a little, but I
wouldn't expect it to be off by this much.

>> Balance on '.' is running
>>
>> Now, 12H later, it's still there, only 1 out of 73.
>>
>> gargamel:/mnt/btrfs_pool2# btrfs fi show .
>> Label: 'dshelf2'  uuid: 0f1a0c9f-4e54-4fa7-8736-fd50818ff73d
>>         Total devices 1 FS bytes used 12.72TiB
>>         devid    1 size 14.55TiB used 13.81TiB path /dev/mapper/dshelf2
>>
>> gargamel:/mnt/btrfs_pool2# btrfs fi df .
>> Data, single: total=13.57TiB, used=12.60TiB
>> System, DUP: total=32.00MiB, used=1.55MiB
>> Metadata, DUP: total=121.50GiB, used=116.53GiB
>> GlobalReserve, single: total=512.00MiB, used=848.00KiB
>>
>> kernel: 4.16.8
>>
>> Is that expected? Should I be ready to wait days possibly for this
>> balance to finish?
>
> It's now been 2 days, and it's still stuck at 1%:
> 1 out of about 73 chunks balanced (6724 considered),  99% left

First, my disclaimer.  I'm not a btrfs developer, and although I've
run balance many times, I haven't really studied its output beyond the
% left.  I don't know why it says "about", and I don't know if it
should ever be that far off.

In your situation, I would run "btrfs balance pause <path>", wait to hear from
a btrfs developer, and not use the volume whatsoever in the meantime.
I can make some guesses where to go from here, but won't, as I don't
want to screw things up for you.

What version of btrfs-progs do you have?

* Re: btrfs balance did not progress after 12H
  2018-06-19 16:30   ` james harvey
@ 2018-06-19 16:58     ` Austin S. Hemmelgarn
  2018-06-20  8:55       ` Duncan
  2018-06-25 16:07       ` btrfs balance did not progress after 12H, hang on reboot, btrfs check --repair kills the system still Marc MERLIN
  0 siblings, 2 replies; 10+ messages in thread
From: Austin S. Hemmelgarn @ 2018-06-19 16:58 UTC (permalink / raw)
  To: james harvey, Marc MERLIN; +Cc: Linux fs Btrfs

On 2018-06-19 12:30, james harvey wrote:
> On Tue, Jun 19, 2018 at 11:47 AM, Marc MERLIN <marc@merlins.org> wrote:
>> On Mon, Jun 18, 2018 at 06:00:55AM -0700, Marc MERLIN wrote:
>>> So, I ran this:
>>> gargamel:/mnt/btrfs_pool2# btrfs balance start -dusage=60 -v .  &
>>> [1] 24450
>>> Dumping filters: flags 0x1, state 0x0, force is off
>>>    DATA (flags 0x2): balancing, usage=60
>>> gargamel:/mnt/btrfs_pool2# while :; do btrfs balance status .; sleep 60; done
>>> 0 out of about 0 chunks balanced (0 considered), -nan% left
> 
> This (0/0/0, -nan%) seems alarming.  I had this output once when the
> system spontaneously rebooted during a balance.  I didn't have any bad
> effects afterward.
> 
>>> Balance on '.' is running
>>> 0 out of about 73 chunks balanced (2 considered), 100% left
>>> Balance on '.' is running
>>>
>>> After about 20min, it changed to this:
>>> 1 out of about 73 chunks balanced (6724 considered),  99% left
> 
> This seems alarming.  I wouldn't think # considered should ever exceed
> # chunks.  It does say "about", so maybe it can by a little, but I
> wouldn't expect it to be off by this much.
Actually, output like this is not unusual.  In the above line, the 1 is
how many chunks have actually been processed, the 73 is how many the
command expects to process (that is, the count of chunks that fit the
filtering requirements, in this case ones which are 60% or less full),
and the 6724 is how many it has checked against the filtering
requirements.  So if you've got a very large number of chunks and are
selecting a small number with filters, the considered value is likely
to be significantly higher than the first two.
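
As a rough sanity check, assuming the usual 1GiB size for data chunks
(not guaranteed, but typical), your 13.57TiB of data works out to:

    $ echo '13.57 * 1024' | bc
    13895.68

i.e. on the order of 13,900 data chunks for the scan to walk, so 6724
considered next to 73 expected isn't, by itself, a sign of trouble.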
> 
>>> Balance on '.' is running
>>>
>>> Now, 12H later, it's still there, only 1 out of 73.
>>>
>>> gargamel:/mnt/btrfs_pool2# btrfs fi show .
>>> Label: 'dshelf2'  uuid: 0f1a0c9f-4e54-4fa7-8736-fd50818ff73d
>>>          Total devices 1 FS bytes used 12.72TiB
>>>          devid    1 size 14.55TiB used 13.81TiB path /dev/mapper/dshelf2
>>>
>>> gargamel:/mnt/btrfs_pool2# btrfs fi df .
>>> Data, single: total=13.57TiB, used=12.60TiB
>>> System, DUP: total=32.00MiB, used=1.55MiB
>>> Metadata, DUP: total=121.50GiB, used=116.53GiB
>>> GlobalReserve, single: total=512.00MiB, used=848.00KiB
>>>
>>> kernel: 4.16.8
>>>
>>> Is that expected? Should I be ready to wait days possibly for this
>>> balance to finish?
>>
>> It's now been 2 days, and it's still stuck at 1%:
>> 1 out of about 73 chunks balanced (6724 considered),  99% left
> 
> First, my disclaimer.  I'm not a btrfs developer, and although I've
> run balance many times, I haven't really studied its output beyond the
> % left.  I don't know why it says "about", and I don't know if it
> should ever be that far off.
> 
> In your situation, I would run "btrfs balance pause <path>", wait to hear from
> a btrfs developer, and not use the volume whatsoever in the meantime.
I would say this is probably good advice.  I don't really know what's
going on here myself, but it looks like the balance got stuck: the
output hasn't changed for over 36 hours, and unless you've got an
insanely slow storage array that's extremely unusual (it should only be
moving at most 3GB of data per chunk).

That said, I would question the value of repacking chunks that are
already more than half full.  Anything above a 50% usage filter
generally takes a long time, and has limited value in most cases (higher
values are less likely to reduce the total number of allocated chunks).
With `-dusage=50` or less, you're guaranteed to reduce the number of
chunks if at least two match, and it isn't very time consuming for the
allocator, because you can pack at least two matching chunks into
one 'new' chunk (new in quotes because it may re-pack them into existing
slack space on the FS).  Additionally, `-dusage=50` is usually
sufficient to mitigate the typical ENOSPC issues that regular balancing
is supposed to help with.
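
If the goal is a deeper repack without one monolithic pass, one option
(just a sketch, using the mountpoint from upthread) is to step the
usage filter up gradually:

    for u in 10 25 40 50; do
        btrfs balance start -dusage=$u /mnt/btrfs_pool2
    done

Each pass only has to touch chunks the previous pass left behind, and
you get a natural stopping point if a pass starts taking too long.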

* Re: btrfs balance did not progress after 12H
  2018-06-19 16:58     ` Austin S. Hemmelgarn
@ 2018-06-20  8:55       ` Duncan
  2018-06-25 16:07       ` btrfs balance did not progress after 12H, hang on reboot, btrfs check --repair kills the system still Marc MERLIN
  1 sibling, 0 replies; 10+ messages in thread
From: Duncan @ 2018-06-20  8:55 UTC (permalink / raw)
  To: linux-btrfs

Austin S. Hemmelgarn posted on Tue, 19 Jun 2018 12:58:44 -0400 as
excerpted:

> That said, I would question the value of repacking chunks that are
> already more than half full.  Anything above a 50% usage filter
> generally takes a long time, and has limited value in most cases (higher
> values are less likely to reduce the total number of allocated chunks).
> With `-dusage=50` or less, you're guaranteed to reduce the number of
> chunks if at least two match, and it isn't very time consuming for the
> allocator, because you can pack at least two matching chunks into
> one 'new' chunk (new in quotes because it may re-pack them into existing
> slack space on the FS).  Additionally, `-dusage=50` is usually sufficient
> to mitigate the typical ENOSPC issues that regular balancing is supposed
> to help with.

I used to agree: 50% for best efficiency, perhaps 66 or 70% if you're
really pressed for space.  But now that the allocator can repack into
existing chunks more efficiently than it used to (at least in ssd mode,
which all my storage is now), I've seen higher values result in
practical/noticeable recovery of space to unallocated as well.

In fact, I routinely use usage=70 these days, and sometimes go higher,
to 99 or even 100%[1].  But of course I'm on ssd so it's far faster, and
I partition it up with the biggest partitions being under 100 GiB, so
even full unfiltered balances are normally under 10 minutes and normal
filtered balances under a minute, to the point that I usually issue the
balance command and actually wait for completion.  That's a far
different ball game than issuing a balance command on a multi-TB hard
drive and expecting it to take hours or even days.  In that case, yeah,
a 50% cap arguably makes sense, tho he was using 60, which still
shouldn't (sans bugs like we seem to have here) be /too/ bad.

---
[1] usage=100: -musage=1..100 is the only way I've found to balance
metadata without rebalancing system as well.  The unfortunate penalty
for rebalancing system on small filesystems is an increase of the
system chunk size from its original 8 MB mkfs.btrfs size to 32 MB...
with only a few KiB used! =:^(

-- 
Duncan - List replies preferred.   No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master."  Richard Stallman


* Re: btrfs balance did not progress after 12H, hang on reboot, btrfs check --repair kills the system still
  2018-06-19 16:58     ` Austin S. Hemmelgarn
  2018-06-20  8:55       ` Duncan
@ 2018-06-25 16:07       ` Marc MERLIN
  2018-06-25 16:24         ` Hans van Kranenburg
  2018-06-25 17:07         ` Austin S. Hemmelgarn
  1 sibling, 2 replies; 10+ messages in thread
From: Marc MERLIN @ 2018-06-25 16:07 UTC (permalink / raw)
  To: Austin S. Hemmelgarn; +Cc: james harvey, Linux fs Btrfs

On Tue, Jun 19, 2018 at 12:58:44PM -0400, Austin S. Hemmelgarn wrote:
> > In your situation, I would run "btrfs balance pause <path>", wait to hear from
> > a btrfs developer, and not use the volume whatsoever in the meantime.
> I would say this is probably good advice.  I don't really know what's going
> on here myself, but it looks like the balance got stuck: the output hasn't
> changed for over 36 hours, and unless you've got an insanely slow storage
> array that's extremely unusual (it should only be moving at most 3GB of
> data per chunk).

I didn't hear from any developer, so I had to continue.
- btrfs scrub cancel did not work (hang)
- at reboot mounting the filesystem hung, even with 4.17, which is
  disappointing (it should not hang)
- mount -o recovery still hung
- mount -o ro did not hang though

Sigh, why is my FS corrupted again?
Anyway, back to
btrfs check --repair
and it took all my 32GB of RAM on a system I can't add more RAM to, so
I'm hosed. I'll note in passing (and it's not ok at all) that check
--repair, after a 20 to 30min pause, takes all the kernel RAM more
quickly than the system can OOM or log anything, and just deadlocks it.
This is repeatable and totally not ok :(

I'm now left with btrfs-progs git master and lowmem mode, which finally
does a bit of repair.
So far:
gargamel:~# btrfs check --mode=lowmem --repair -p /dev/mapper/dshelf2  
enabling repair mode  
WARNING: low-memory mode repair support is only partial  
Checking filesystem on /dev/mapper/dshelf2  
UUID: 0f1a0c9f-4e54-4fa7-8736-fd50818ff73d  
Fixed 0 roots.  
ERROR: extent[84302495744, 69632] referencer count mismatch (root: 21872, owner: 374857, offset: 3407872) wanted: 3, have: 4
Created new chunk [18457780224000 1073741824]
Delete backref in extent [84302495744 69632]
ERROR: extent[84302495744, 69632] referencer count mismatch (root: 22911, owner: 374857, offset: 3407872) wanted: 3, have: 4
Delete backref in extent [84302495744 69632]
ERROR: extent[125712527360, 12214272] referencer count mismatch (root: 21872, owner: 374857, offset: 114540544) wanted: 181, have: 240
Delete backref in extent [125712527360 12214272]
ERROR: extent[125730848768, 5111808] referencer count mismatch (root: 21872, owner: 374857, offset: 126754816) wanted: 68, have: 115
Delete backref in extent [125730848768 5111808]
ERROR: extent[125730848768, 5111808] referencer count mismatch (root: 22911, owner: 374857, offset: 126754816) wanted: 68, have: 115
Delete backref in extent [125730848768 5111808]
ERROR: extent[125736914944, 6037504] referencer count mismatch (root: 21872, owner: 374857, offset: 131866624) wanted: 115, have: 143
Delete backref in extent [125736914944 6037504]
ERROR: extent[125736914944, 6037504] referencer count mismatch (root: 22911, owner: 374857, offset: 131866624) wanted: 115, have: 143
Delete backref in extent [125736914944 6037504]
ERROR: extent[129952120832, 20242432] referencer count mismatch (root: 21872, owner: 374857, offset: 148234240) wanted: 302, have: 431
Delete backref in extent [129952120832 20242432]
ERROR: extent[129952120832, 20242432] referencer count mismatch (root: 22911, owner: 374857, offset: 148234240) wanted: 356, have: 433
Delete backref in extent [129952120832 20242432]
ERROR: extent[134925357056, 11829248] referencer count mismatch (root: 21872, owner: 374857, offset: 180371456) wanted: 161, have: 240
Delete backref in extent [134925357056 11829248]
ERROR: extent[134925357056, 11829248] referencer count mismatch (root: 22911, owner: 374857, offset: 180371456) wanted: 162, have: 240
Delete backref in extent [134925357056 11829248]
ERROR: extent[147895111680, 12345344] referencer count mismatch (root: 21872, owner: 374857, offset: 192200704) wanted: 170, have: 249
Delete backref in extent [147895111680 12345344]
ERROR: extent[147895111680, 12345344] referencer count mismatch (root: 22911, owner: 374857, offset: 192200704) wanted: 172, have: 251
Delete backref in extent [147895111680 12345344]
ERROR: extent[150850146304, 17522688] referencer count mismatch (root: 21872, owner: 374857, offset: 217653248) wanted: 348, have: 418
Delete backref in extent [150850146304 17522688]
ERROR: extent[156909494272, 55320576] referencer count mismatch (root: 22911, owner: 374857, offset: 235175936) wanted: 555, have: 1449
Deleted root 2 item[156909494272, 178, 5476627808561673095]
ERROR: extent[156909494272, 55320576] referencer count mismatch (root: 21872, owner: 374857, offset: 235175936) wanted: 556, have: 1452
Deleted root 2 item[156909494272, 178, 7338474132555182983]

At the rate it's going, it'll probably take days; it's already been 36H.

Marc
-- 
"A mouse is a device used to point at the xterm you want to type in" - A.S.R.
Microsoft is to operating systems ....
                                      .... what McDonalds is to gourmet cooking
Home page: http://marc.merlins.org/                       | PGP 7F55D5F27AAF9D08

* Re: btrfs balance did not progress after 12H, hang on reboot, btrfs check --repair kills the system still
  2018-06-25 16:07       ` btrfs balance did not progress after 12H, hang on reboot, btrfs check --repair kills the system still Marc MERLIN
@ 2018-06-25 16:24         ` Hans van Kranenburg
  2018-06-25 16:46           ` Marc MERLIN
  2018-06-25 17:07         ` Austin S. Hemmelgarn
  1 sibling, 1 reply; 10+ messages in thread
From: Hans van Kranenburg @ 2018-06-25 16:24 UTC (permalink / raw)
  To: Marc MERLIN; +Cc: Linux fs Btrfs

On 06/25/2018 06:07 PM, Marc MERLIN wrote:
> On Tue, Jun 19, 2018 at 12:58:44PM -0400, Austin S. Hemmelgarn wrote:
>>> In your situation, I would run "btrfs balance pause <path>", wait to hear from
>>> a btrfs developer, and not use the volume whatsoever in the meantime.
>> I would say this is probably good advice.  I don't really know what's going
>> on here myself, but it looks like the balance got stuck: the output hasn't
>> changed for over 36 hours, and unless you've got an insanely slow storage
>> array that's extremely unusual (it should only be moving at most 3GB of
>> data per chunk).
> 
> I didn't hear from any developer, so I had to continue.
> - btrfs scrub cancel did not work (hang)

Did you mean balance cancel? It waits until the current block group is
finished.

> - at reboot mounting the filesystem hung, even with 4.17, which is
>   disappointing (it should not hang)
> - mount -o recovery still hung
> - mount -o ro did not hang though
> 
> Sigh, why is my FS corrupted again?

Again? Do you think balance is corrupting the filesystem? Or have there
been previous btrfs check --repair operations which made smaller
problems bigger in the past?

> Anyway, back to
> btrfs check --repair
> and it took all my 32GB of RAM on a system I can't add more RAM to, so
> I'm hosed. I'll note in passing (and it's not ok at all) that check
> --repair, after a 20 to 30min pause, takes all the kernel RAM more
> quickly than the system can OOM or log anything, and just deadlocks it.
> This is repeatable and totally not ok :(
> 
> I'm now left with btrfs-progs git master and lowmem mode, which finally
> does a bit of repair.
> So far:
> gargamel:~# btrfs check --mode=lowmem --repair -p /dev/mapper/dshelf2  
> enabling repair mode  
> WARNING: low-memory mode repair support is only partial  
> Checking filesystem on /dev/mapper/dshelf2  
> UUID: 0f1a0c9f-4e54-4fa7-8736-fd50818ff73d  
> Fixed 0 roots.  

Am I right in interpreting the messages below to mean that you have
extents that are referenced hundreds of times?

Is there heavy snapshotting or deduping going on in this filesystem? If
so, it's not surprising balance has a hard time moving extents around,
since it has to update all of the metadata for each extent again in
hundreds of places.
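
(To put a number on it: the extent at 156909494272 quoted below is
55320576 bytes, about 53MiB, and check found on the order of 1450
references to it, so relocating just that one extent means updating
metadata in ~1450 places.)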

Did you investigate what balance was doing while it was taking so long?
Is it using cpu all the time, is it reading from disk slowly (random
reads), or is it writing to disk at full speed all the time?
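
For next time, something along these lines would show it (a sketch:
assumes sysstat is installed and a root shell, with the balance started
from the CLI as in your first mail):

    iostat -x 5                           # device utilization, read/write pattern
    top                                   # cpu time of the btrfs worker threads
    cat /proc/$(pgrep -o -x btrfs)/stack  # where the btrfs command is blocked in the kernel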

K

> ERROR: extent[84302495744, 69632] referencer count mismatch (root: 21872, owner: 374857, offset: 3407872) wanted: 3, have: 4
> Created new chunk [18457780224000 1073741824]
> Delete backref in extent [84302495744 69632]
> ERROR: extent[84302495744, 69632] referencer count mismatch (root: 22911, owner: 374857, offset: 3407872) wanted: 3, have: 4
> Delete backref in extent [84302495744 69632]
> ERROR: extent[125712527360, 12214272] referencer count mismatch (root: 21872, owner: 374857, offset: 114540544) wanted: 181, have: 240
> Delete backref in extent [125712527360 12214272]
> ERROR: extent[125730848768, 5111808] referencer count mismatch (root: 21872, owner: 374857, offset: 126754816) wanted: 68, have: 115
> Delete backref in extent [125730848768 5111808]
> ERROR: extent[125730848768, 5111808] referencer count mismatch (root: 22911, owner: 374857, offset: 126754816) wanted: 68, have: 115
> Delete backref in extent [125730848768 5111808]
> ERROR: extent[125736914944, 6037504] referencer count mismatch (root: 21872, owner: 374857, offset: 131866624) wanted: 115, have: 143
> Delete backref in extent [125736914944 6037504]
> ERROR: extent[125736914944, 6037504] referencer count mismatch (root: 22911, owner: 374857, offset: 131866624) wanted: 115, have: 143
> Delete backref in extent [125736914944 6037504]
> ERROR: extent[129952120832, 20242432] referencer count mismatch (root: 21872, owner: 374857, offset: 148234240) wanted: 302, have: 431
> Delete backref in extent [129952120832 20242432]
> ERROR: extent[129952120832, 20242432] referencer count mismatch (root: 22911, owner: 374857, offset: 148234240) wanted: 356, have: 433
> Delete backref in extent [129952120832 20242432]
> ERROR: extent[134925357056, 11829248] referencer count mismatch (root: 21872, owner: 374857, offset: 180371456) wanted: 161, have: 240
> Delete backref in extent [134925357056 11829248]
> ERROR: extent[134925357056, 11829248] referencer count mismatch (root: 22911, owner: 374857, offset: 180371456) wanted: 162, have: 240
> Delete backref in extent [134925357056 11829248]
> ERROR: extent[147895111680, 12345344] referencer count mismatch (root: 21872, owner: 374857, offset: 192200704) wanted: 170, have: 249
> Delete backref in extent [147895111680 12345344]
> ERROR: extent[147895111680, 12345344] referencer count mismatch (root: 22911, owner: 374857, offset: 192200704) wanted: 172, have: 251
> Delete backref in extent [147895111680 12345344]
> ERROR: extent[150850146304, 17522688] referencer count mismatch (root: 21872, owner: 374857, offset: 217653248) wanted: 348, have: 418
> Delete backref in extent [150850146304 17522688]
> ERROR: extent[156909494272, 55320576] referencer count mismatch (root: 22911, owner: 374857, offset: 235175936) wanted: 555, have: 1449
> Deleted root 2 item[156909494272, 178, 5476627808561673095]
> ERROR: extent[156909494272, 55320576] referencer count mismatch (root: 21872, owner: 374857, offset: 235175936) wanted: 556, have: 1452
> Deleted root 2 item[156909494272, 178, 7338474132555182983]
> 
> At the rate it's going, it'll probably take days; it's already been 36H.

-- 
Hans van Kranenburg

* Re: btrfs balance did not progress after 12H, hang on reboot, btrfs check --repair kills the system still
  2018-06-25 16:24         ` Hans van Kranenburg
@ 2018-06-25 16:46           ` Marc MERLIN
  0 siblings, 0 replies; 10+ messages in thread
From: Marc MERLIN @ 2018-06-25 16:46 UTC (permalink / raw)
  To: Hans van Kranenburg; +Cc: Linux fs Btrfs

On Mon, Jun 25, 2018 at 06:24:37PM +0200, Hans van Kranenburg wrote:
> >> changed for over 36 hours, and unless you've got an insanely slow storage
> >> array that's extremely unusual (it should only be moving at most 3GB of
> >> data per chunk).
> > 
> > I didn't hear from any developer, so I had to continue.
> > - btrfs scrub cancel did not work (hang)
> 
> Did you mean balance cancel? It waits until the current block group is
> finished.
 
Yes, I meant that, thanks for correcting me.  And you're correct that
because it was hung, cancel wasn't going to go anywhere.
At least my filesystem was still working at the time (as in, IO was
going on just fine).

> > - at reboot mounting the filesystem hung, even with 4.17, which is
> >   disappointing (it should not hang)
> > - mount -o recovery still hung
> > - mount -o ro did not hang though
> > 
> > Sigh, why is my FS corrupted again?
> 
> Again? Do you think balance is corrupting the filesystem? Or have there
> been previous btrfs check --repair operations which made smaller
> problems bigger in the past?

Honestly, I don't fully remember at this point; I keep notes, but not
detailed enough, and it's been a little while.
I know I've had to delete/recreate this filesystem twice already over
the last few years, but I'm not fully certain when this one was last
wiped.
Yes, I do run balance along with scrub once a month:

    btrfs balance start -musage=0 -v $mountpoint 2>&1 | grep -Ev "$FILTER"
    # After metadata, let's do data:
    btrfs balance start -dusage=0 -v $mountpoint 2>&1 | grep -Ev "$FILTER"
    btrfs balance start -dusage=20 -v $mountpoint 2>&1 | grep -Ev "$FILTER"
    echo btrfs scrub start -Bd $mountpoint
    ionice -c 3 nice -10 btrfs scrub start -Bd $mountpoint

Hard to say if balance has damaged my filesystem over time, but it's
definitely possible.

> Am I right in interpreting the messages below to mean that you have
> extents that are referenced hundreds of times?
 
I'm not certain, but it's a backup server with many blocks that are the
same, so it could be some COW stuff, even if I didn't run any dedupe
commands myself.

> Is there heavy snapshotting or deduping going on in this filesystem? If
> so, it's not surprising balance has a hard time moving extents around,
> since it has to update all of the metadata for each extent again in
> hundreds of places.

There is some snapshotting, but maybe around 20 or so per subvolume, not hundreds.

> Did you investigate what balance was doing while it was taking so long?
> Is it using cpu all the time, is it reading from disk slowly (random
> reads), or is it writing to disk at full speed all the time?

I couldn't see what it was doing, but it's running in the kernel, is it
not? (or can you just strace the user space command?)
Either way, it's too late for that now, and given that it didn't
progress by a single block in 36H, I'm assuming it was well and truly
deadlocked.

Thanks for the reply.
Marc
-- 
"A mouse is a device used to point at the xterm you want to type in" - A.S.R.
Microsoft is to operating systems ....
                                      .... what McDonalds is to gourmet cooking
Home page: http://marc.merlins.org/                       | PGP 7F55D5F27AAF9D08

* Re: btrfs balance did not progress after 12H, hang on reboot, btrfs check --repair kills the system still
  2018-06-25 16:07       ` btrfs balance did not progress after 12H, hang on reboot, btrfs check --repair kills the system still Marc MERLIN
  2018-06-25 16:24         ` Hans van Kranenburg
@ 2018-06-25 17:07         ` Austin S. Hemmelgarn
  2018-06-25 17:34           ` Marc MERLIN
  1 sibling, 1 reply; 10+ messages in thread
From: Austin S. Hemmelgarn @ 2018-06-25 17:07 UTC (permalink / raw)
  To: Marc MERLIN; +Cc: james harvey, Linux fs Btrfs

On 2018-06-25 12:07, Marc MERLIN wrote:
> On Tue, Jun 19, 2018 at 12:58:44PM -0400, Austin S. Hemmelgarn wrote:
>>> In your situation, I would run "btrfs balance pause <path>", wait to hear from
>>> a btrfs developer, and not use the volume whatsoever in the meantime.
>> I would say this is probably good advice.  I don't really know what's going
>> on here myself, but it looks like the balance got stuck: the output hasn't
>> changed for over 36 hours, and unless you've got an insanely slow storage
>> array that's extremely unusual (it should only be moving at most 3GB of
>> data per chunk).
> 
> I didn't hear from any developer, so I had to continue.
> - btrfs scrub cancel did not work (hang)
> - at reboot mounting the filesystem hung, even with 4.17, which is
>    disappointing (it should not hang)
> - mount -o recovery still hung
> - mount -o ro did not hang though
One tip here specifically: if you had to reboot during a balance and the
FS hangs when it mounts, try mounting with `-o skip_balance`.  That
should pause the balance instead of resuming it on mount, at which point
you should also be able to cancel it without it hanging.
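
In other words, something like this (a sketch, with your device and
mountpoint from upthread):

    mount -o skip_balance /dev/mapper/dshelf2 /mnt/btrfs_pool2
    btrfs balance cancel /mnt/btrfs_pool2

skip_balance only skips the automatic resume; the paused balance state
stays on disk until you cancel or resume it.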
* Re: btrfs balance did not progress after 12H, hang on reboot, btrfs check --repair kills the system still
  2018-06-25 17:07         ` Austin S. Hemmelgarn
@ 2018-06-25 17:34           ` Marc MERLIN
  0 siblings, 0 replies; 10+ messages in thread
From: Marc MERLIN @ 2018-06-25 17:34 UTC (permalink / raw)
  To: Austin S. Hemmelgarn; +Cc: james harvey, Linux fs Btrfs

On Mon, Jun 25, 2018 at 01:07:10PM -0400, Austin S. Hemmelgarn wrote:
> > - mount -o recovery still hung
> > - mount -o ro did not hang though
> One tip here specifically, if you had to reboot during a balance and the FS
> hangs when it mounts, try mounting with `-o skip_balance`.  That should
> pause the balance instead of resuming it on mount, at which point you should
> also be able to cancel it without it hanging.

Very good tip, I have this in all my mountpoints :)

#LABEL=dshelf2 /mnt/btrfs_pool2 btrfs defaults,compress=lzo,skip_balance,noatime

Marc
-- 
"A mouse is a device used to point at the xterm you want to type in" - A.S.R.
Microsoft is to operating systems ....
                                      .... what McDonalds is to gourmet cooking
Home page: http://marc.merlins.org/                       | PGP 7F55D5F27AAF9D08
