linux-btrfs.vger.kernel.org archive mirror
* 3.16 Managed to ENOSPC with <80% used
@ 2014-09-24 20:43 Dan Merillat
  2014-09-24 22:23 ` Holger Hoffstätte
  0 siblings, 1 reply; 6+ messages in thread
From: Dan Merillat @ 2014-09-24 20:43 UTC (permalink / raw)
  To: BTRFS

Any idea how to recover?  I can't cut-paste but it's
Total devices 1 FS bytes used 176.22GiB
size 233.59GiB used 233.59GiB
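
(That's roughly the btrfs filesystem show output; "btrfs filesystem df
<mountpoint>" would additionally show how the allocation splits between
data and metadata.)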

Basically it's been data allocation happy, since I haven't deleted
53GB at any point.  Unfortunately, none of the chunks are at 0% usage
so a balance -dusage=0 finds nothing to drop.
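
(Concretely, what I ran was along the lines of
"btrfs balance start -dusage=0 /mountpoint", which only touches
completely empty data chunks and therefore finds nothing here.)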

Attempting a balance with -dusage=25 instantly dies with ENOSPC, since
100% of space is allocated.

Is this recoverable, or do I need to copy to another disk and back?
This is a really unfortunate failure mode for BTRFS.  Usually I catch
it before I get exactly 100% used and can use a balance to get it back
into shape.

What causes it to keep allocating datablocks when it's got so much
free space?  The workload is pretty standard (for devs, at least): git
and kernel builds, and git and android builds.


* Re: 3.16 Managed to ENOSPC with <80% used
  2014-09-24 20:43 3.16 Managed to ENOSPC with <80% used Dan Merillat
@ 2014-09-24 22:23 ` Holger Hoffstätte
  2014-09-25 21:05   ` Dan Merillat
  0 siblings, 1 reply; 6+ messages in thread
From: Holger Hoffstätte @ 2014-09-24 22:23 UTC (permalink / raw)
  To: linux-btrfs

On Wed, 24 Sep 2014 16:43:43 -0400, Dan Merillat wrote:

> Any idea how to recover?  I can't cut-paste but it's
> Total devices 1 FS bytes used 176.22GiB
> size 233.59GiB used 233.59GiB

The notorious -EBLOAT. But don't despair just yet.

> Basically it's been data allocation happy, since I haven't deleted
> 53GB at any point.  Unfortunately, none of the chunks are at 0% usage
> so a balance -dusage=0 finds nothing to drop.

Also try -musage=0..10, just for fun.
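
Something along the lines of

  btrfs balance start -musage=5 <mountpoint>

i.e. only relocate metadata chunks that are (nearly) empty; bump the
number up in small steps if it finds nothing to do.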

> Is this recoverable, or do I need to copy to another disk and back?

- if you have plenty of RAM, make a big tmpfs (a couple of GB);
  otherwise find some other storage to attach; any external drive is fine.

- then, on the tmpfs or spare fs:
  - fallocate --length <n>GiB <image>,
    where n: whatever you can spare and image: an arbitrary filename
  - losetup /dev/loop0 </path/to/image>

- btrfs device add /dev/loop0 <your hosed mountpoint>
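
Spelled out, with an 8GiB file and /dev/loop0 purely as examples
(adjust size and paths to what you actually have):

  mkdir -p /mnt/scratch
  mount -t tmpfs -o size=9g tmpfs /mnt/scratch
  fallocate --length 8GiB /mnt/scratch/spill.img
  losetup /dev/loop0 /mnt/scratch/spill.img
  btrfs device add /dev/loop0 <your hosed mountpoint>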

You now have more unused space to fill up. \o/

Delete snapshots, run make clean in (or tar.gz up) some of those build
trees, and run balance with -dusage/-musage again.
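
E.g. (the snapshot path is just an example):

  btrfs subvolume delete <mountpoint>/snapshots/<old-snapshot>
  btrfs balance start -dusage=10 -musage=10 <mountpoint>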

Another neat trick that will free up space is to convert to single
metadata: -mconvert=single -f (to force). A subsequent balance
with -musage=0..10 will likely free up quite some space.
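
Roughly:

  btrfs balance start -mconvert=single -f <mountpoint>
  btrfs balance start -musage=10 <mountpoint>

Just keep in mind that single metadata means only one copy, so you
probably want to convert back (-mconvert=dup) once the crisis is over.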

When you're back in calmer waters you can btrfs device remove whatever
you added. That should fill your main drive right back up. :)
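
In commands that would be roughly:

  btrfs device remove /dev/loop0 <your mountpoint>
  losetup -d /dev/loop0

The remove migrates any chunks that ended up on the loop device back
onto the main drive, which is why it fills right back up.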

> This is a really unfortunate failure mode for BTRFS.  Usually I catch
> it before I get exactly 100% used and can use a balance to get it back
> into shape.

For now all we can do is run balance -d/-m regularly.
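
Something as simple as a weekly cron job running, say,

  btrfs balance start -dusage=50 -musage=50 <mountpoint>

(thresholds are only a suggestion, tune to taste) keeps the allocation
from creeping back up to 100% in the first place.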

> What causes it to keep allocating datablocks when it's got so much
> free space?  The workload is pretty standard (for devs, at least): git
> and kernel builds, and git and android builds.

That particular workload seems to cause the block allocator to go
on a spending spree; you're not the first to see this.

Good luck!

-h



* Re: 3.16 Managed to ENOSPC with <80% used
  2014-09-24 22:23 ` Holger Hoffstätte
@ 2014-09-25 21:05   ` Dan Merillat
  2014-09-25 21:21     ` Holger Hoffstätte
  0 siblings, 1 reply; 6+ messages in thread
From: Dan Merillat @ 2014-09-25 21:05 UTC (permalink / raw)
  To: Holger Hoffstätte; +Cc: BTRFS

On Wed, Sep 24, 2014 at 6:23 PM, Holger Hoffstätte
<holger.hoffstaette@googlemail.com> wrote:

>> Basically it's been data allocation happy, since I haven't deleted
>> 53GB at any point.  Unfortunately, none of the chunks are at 0% usage
>> so a balance -dusage=0 finds nothing to drop.
>
> Also try -musage=0..10, just for fun.

Tried a few of them.  When it's completely wedged, balance with any
usage above zero won't work, because it needs at least one allocatable
block group to move data into.  I'm not sure whether it needed a new
data chunk to merge partials into, or thought it needed more metadata
space to write out the changes.  (Metadata was also only 75% used.)

>> Is this recoverable, or do I need to copy to another disk and back?
>
> Another neat trick that will free up space is to convert to single
> metadata: -mconvert=single -f (to force). A subsequent balance
> with -musage=0..10 will likely free up quite some space.

Deleting files or dropping snapshots is difficult when it's wedged as
well: lots of disk activity (journal thrash?) but no persistent
progress - a reboot brings the deleted files back.  I eventually
managed to empty a single data chunk, and after that it was a trivial
recovery.

> That particular workload seems to cause the block allocator to go
> on a spending spree; you're not the first to see this.

I could see normal-user usage patterns getting ignored, but these are
the usage patterns of the people working on BTRFS themselves.  Maybe
they need to remove their balance cronjobs for a while. :)


* Re: 3.16 Managed to ENOSPC with <80% used
  2014-09-25 21:05   ` Dan Merillat
@ 2014-09-25 21:21     ` Holger Hoffstätte
  2014-09-26 14:18       ` Rich Freeman
  0 siblings, 1 reply; 6+ messages in thread
From: Holger Hoffstätte @ 2014-09-25 21:21 UTC (permalink / raw)
  To: linux-btrfs

On Thu, 25 Sep 2014 17:05:11 -0400, Dan Merillat wrote:

> Deleting files or dropping snapshots is difficult when it's wedged as
> well: lots of disk activity (journal thrash?) but no persistent
> progress - a reboot brings the deleted files back.  I eventually
> managed to empty a single data chunk, and after that it was a trivial
> recovery.

That's why I mentioned adding a second device - that will immediately
allow cleanup with headroom. An additional 8GB tmpfs volume can work
wonders.

>> That particular workload seems to cause the block allocator to go
>> on a spending spree; you're not the first to see this.
> 
> I could see normal-user usage patterns getting ignored, but these are
> the usage patterns of the people working on BTRFS themselves.  Maybe
> they need to remove their balance cronjobs for a while. :)

Well..some remedy is coming:
https://git.kernel.org/cgit/linux/kernel/git/mason/linux-btrfs.git/commit/fs/btrfs?h=integration&id=47ab2a6c689913db23ccae38349714edf8365e0a

Not sure how many preliminaries that patch needs, but if you are
comfortable with patching/building your kernel you could give that
a try. I'm sure Josef would love some feedback.

Glad you're back on track. :)

Holger



* Re: 3.16 Managed to ENOSPC with <80% used
  2014-09-25 21:21     ` Holger Hoffstätte
@ 2014-09-26 14:18       ` Rich Freeman
  2014-09-27  2:36         ` Duncan
  0 siblings, 1 reply; 6+ messages in thread
From: Rich Freeman @ 2014-09-26 14:18 UTC (permalink / raw)
  To: Holger Hoffstätte; +Cc: Btrfs BTRFS

On Thu, Sep 25, 2014 at 5:21 PM, Holger Hoffstätte
<holger.hoffstaette@googlemail.com> wrote:
> That's why I mentioned adding a second device - that will immediately
> allow cleanup with headroom. An additional 8GB tmpfs volume can work
> wonders.
>

If you add a single 8GB tmpfs to a RAID1 btrfs array, is it safe to
assume that you'll still always have a redundant copy of everything on
a disk somewhere during the recovery?  Would only a single tmpfs
volume actually help in this case?  I get a bit nervous about doing a
cleanup that involves moving metadata to tmpfs of all places, since
some kind of deadlock/etc could result in unrecoverable data loss.

Doing the same thing with an actual hard drive would concern me less.

--
Rich


* Re: 3.16 Managed to ENOSPC with <80% used
  2014-09-26 14:18       ` Rich Freeman
@ 2014-09-27  2:36         ` Duncan
  0 siblings, 0 replies; 6+ messages in thread
From: Duncan @ 2014-09-27  2:36 UTC (permalink / raw)
  To: linux-btrfs

Rich Freeman posted on Fri, 26 Sep 2014 10:18:37 -0400 as excerpted:

> On Thu, Sep 25, 2014 at 5:21 PM, Holger Hoffstätte
> <holger.hoffstaette@googlemail.com> wrote:
>> That's why I mentioned adding a second device - that will immediately
>> allow cleanup with headroom. An additional 8GB tmpfs volume can work
>> wonders.
>>
>>
> If you add a single 8GB tmpfs to a RAID1 btrfs array, is it safe to
> assume that you'll still always have a redundant copy of everything on a
> disk somewhere during the recovery?  Would only a single tmpfs volume
> actually help in this case?  I get a bit nervous about doing a cleanup
> that involves moving metadata to tmpfs of all places, since some kind of
> deadlock/etc could result in unrecoverable data loss.
> 
> Doing the same thing with an actual hard drive would concern me less.

That has been my concern too, and why I'd be leery about using a loopback 
on tmpfs, even for the few minutes (more like seconds since I'm on SSD 
and we /are/ talking memory-backed tmpfs) it'd take to free a minimal 
number of chunks (say usage=2% or 5% or whatever, the smallest number 
that actually frees anything).

With SSD and with backups I'd probably do it, but it's not something I 
could recommend, and I'm not sure I'd do it on slower spinning rust, just 
because the time is longer.

I'd probably use a thumb drive or the like, instead, and would certainly 
recommend that to others, altho if they're comfortable with it and want 
to risk it, a loopback file on tmpfs should work fine, provided the power 
doesn't go out in the middle or something.

-- 
Duncan - List replies preferred.   No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master."  Richard Stallman


