* Out of space on small-ish partition, clobber and other methods haven't worked
@ 2016-01-20 21:22 Jerry Steinhauer
  2016-01-21  2:28 ` Chris Murphy
  0 siblings, 1 reply; 7+ messages in thread
From: Jerry Steinhauer @ 2016-01-20 21:22 UTC (permalink / raw)
  To: linux-btrfs


Hello,

We are deploying BTRFS as part of an embedded system.  We're very
pleased with it overall.  In our case, the entire partition is
~500 MB.  The problem I encounter is that once I run the partition
out of space, I begin receiving out-of-space errors that I can't
clear.  I've looked at earlier list posts, and tried the following:
 a) Clobbering the file. cat /dev/null > a.file yields out of space.
 b) Unlinking the file.  rm a.file yields out of space.

% cat /dev/zero > a.file
cat: write error: No space left on device
% ls -la a.file
-rw-r--r-- 1 root root 479723520 Jan 20 20:59 a.file

So far, so good.

% rm a.file
rm: cannot remove 'a.file': No space left on device
% cat /dev/null > a.file
-sh: a.file: No space left on device
% btrfs fi df /data
System, single: total=32.00MiB, used=4.00KiB
Data+Metadata, single: total=506.00MiB, used=500.39MiB
GlobalReserve, single: total=12.00MiB, used=6.45MiB

We've also tried rebalancing at other times, with no success.

We encountered this on 3.1, so we upgraded as far as our distro
(Yocto) would take us, to 4.1.  The same issue persists.

Questions:
* Are there any other ways to exit this state?
* What debug tools can we use to determine where btrfs is running out
of space on the unlink?
* Failing these, is there someone we can work with to determine root
cause?  We'd like to work with you all to fix the issue, if we can.

Background information such as version:

% btrfs --version
btrfs-progs v4.1.2-dirty
% btrfs fi df /data
System, single: total=32.00MiB, used=4.00KiB
Data+Metadata, single: total=200.00MiB, used=41.99MiB
GlobalReserve, single: total=4.00MiB, used=0.00B
% btrfs fi show
Label: 'd_1.2.0_p1021'  uuid: e4a1f9f1-0e34-4ed1-b788-8a54f735d296
        Total devices 1 FS bytes used 42.00MiB
        devid    1 size 539.00MiB used 232.00MiB path /dev/sda8

btrfs-progs v4.1.2-dirty

dmesg and kernel config are attached.

We notice that when trying this same flow against a Debian 8 build
(using the exact same disks as we use here), the resulting file is
smaller by ~70 MB.  The Debian kernel is able to clear this issue (rm
works, clobber works).  We theorize that btrfs is calculating reserves
differently in our case.

We've tried 3.1 (Yocto), 3.14 (Debian), and 4.1 (Yocto).  The
command-line output above is from 4.1.

If we can provide other detail to help track this down, please let us know.

 - Jerry

[-- Attachment #2: config.gz --]
[-- Type: application/x-gzip, Size: 26821 bytes --]

[-- Attachment #3: dmesg.log.gz --]
[-- Type: application/x-gzip, Size: 16498 bytes --]


* Re: Out of space on small-ish partition, clobber and other methods haven't worked
  2016-01-20 21:22 Out of space on small-ish partition, clobber and other methods haven't worked Jerry Steinhauer
@ 2016-01-21  2:28 ` Chris Murphy
  2016-01-21 10:41   ` Duncan
       [not found]   ` <CAHRikPHt7SmFhzQsZ-XKLYSbwCAgCeccEFXbw+YXBobJx8w1Ew@mail.gmail.com>
  0 siblings, 2 replies; 7+ messages in thread
From: Chris Murphy @ 2016-01-21  2:28 UTC (permalink / raw)
  To: Jerry Steinhauer; +Cc: Btrfs BTRFS

On Wed, Jan 20, 2016 at 2:22 PM, Jerry Steinhauer
<jerry.steinhauer@singlewire.com> wrote:

> % rm a.file
> rm: cannot remove 'a.file': No space left on device
> % cat /dev/null > a.file
> -sh: a.file: No space left on device
> % btrfs fi df /data
> System, single: total=32.00MiB, used=4.00KiB
> Data+Metadata, single: total=506.00MiB, used=500.39MiB
> GlobalReserve, single: total=12.00MiB, used=6.45MiB


I see somewhere between 6MiB and 12MiB that should be available for
file removal. Since delete is cow, the fs still needs free space. But
I'd think rm would need very little cow metadata space to work and
then free things up.

I can't say for sure, but it sounds like a bug, or at least unintended behavior.


> We encountered this on 3.1, so we upgraded as far as our distro would
> take us (yocto) to 4.1.  Same issue persists.

I suggest two things, neither of which fixes the problem:

1. Remount with -o enospc_debug, and reproduce the problem.
2. See if you can go to 4.1.15. There are quite a few Btrfs backports
from the 4.1.8 you're using up until 4.1.15.
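
For 1), something along these lines should do it (a sketch; adjust the
mount point and carry along any existing mount options your setup needs):

% mount -o remount,enospc_debug /data
% mount | grep /data      # confirm enospc_debug now shows in the options
% dmesg | tail            # look for the extra btrfs output after reproducing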



-- 
Chris Murphy


* Re: Out of space on small-ish partition, clobber and other methods haven't worked
  2016-01-21  2:28 ` Chris Murphy
@ 2016-01-21 10:41   ` Duncan
  2016-01-21 17:40     ` Chris Murphy
  2016-01-22 12:11     ` Jerry Steinhauer
       [not found]   ` <CAHRikPHt7SmFhzQsZ-XKLYSbwCAgCeccEFXbw+YXBobJx8w1Ew@mail.gmail.com>
  1 sibling, 2 replies; 7+ messages in thread
From: Duncan @ 2016-01-21 10:41 UTC (permalink / raw)
  To: linux-btrfs

Chris Murphy posted on Wed, 20 Jan 2016 19:28:35 -0700 as excerpted:

> On Wed, Jan 20, 2016 at 2:22 PM, Jerry Steinhauer
> <jerry.steinhauer@singlewire.com> wrote:
> 
>> % rm a.file
>> rm: cannot remove 'a.file': No space left on device
>> % cat /dev/null > a.file
>> -sh: a.file: No space left on device

>> % btrfs fi df /data
>> System, single: total=32.00MiB, used=4.00KiB
>> Data+Metadata, single: total=506.00MiB, used=500.39MiB
>> GlobalReserve, single: total=12.00MiB, used=6.45MiB
> 
> 
> I see somewhere between 6MiB and 12MiB that should be available for file
> removal.

I don't.  See that global reserve?  6.45 MiB into its emergency reserve, 
so effectively -6.45 MiB of space available for file removal.

First of all, any time global reserve is used at all the filesystem is in 
very dire straits, and he's 6.45 MiB into the 12.00 MiB global reserve, 
so that alone tells us "we're not in Kansas any more!" =8^0

Second, the btrfs fi show (which you didn't quote) says 539 MiB capacity.

System		 32 MiB total, can't be used for anything else
Data+Metadata	506 MiB total, shared data/metadata as it's a small 
filesystem  (See why I didn't list global reserve here, below.)


Total		538 MiB chunked out.  While that's 1 MiB short of the
reported 539 MiB capacity, I don't believe system includes the reserved space
(for boot loader, etc) at the beginning of the partition.  Between that 
and the limits of the chunk-allocator, he's likely all chunked-out, no 
possibility of allocating further chunks.

Global reserve is normally reserved from metadata, which of course is 
shared data/metadata here, due to the size of the filesystem (which makes 
shared a practical necessity, the problems would be much worse if data 
and metadata chunks were separate!).

So of the 506 MiB in data/metadata, 12 MiB are global reserve.  Which 
means there's only 494 MiB of normal data/metadata space, plus 12 MiB of 
global reserve.

But the DF shows 500.39 MiB of data/metadata used, which means we're 
roughly 6.4 MiB past normal data/metadata usage into the emergency use 
only global reserve, which is indeed (roughly) what global reserve shows, 
6.45 MiB used.

So as I said, that btrfs is in pretty severely dire straits!  Not only is 
all the available data/metadata space used, but we're well past half way 
into the emergency global reserve as well.  No WONDER there's no space 
left even to delete a file (which because btrfs is COW, copy-on-write, 
requires metadata space even to delete a file, as the metadata block 
containing the original data cannot be rewritten in place and must be 
written elsewhere... thus answering the question of why btrfs needs space 
even on the unlink).


As for solutions, there's still a couple things (plus one already tried) 
to try to get out of the situation:

0) Try clobbering the file, reducing it to zero size, but you did and 
that didn't work.  It might have if the btrfs wasn't already so far into 
global reserve.

1) As CMurphy says (with two Chris Ms on the list that isn't clear, so 
CMurphy it is), try a later kernel, either 4.1.x or 4.4.  AFAIK there 
were a few patches having to do with ENOSPC errors and allowing file 
deletes to take from global reserve, as the result should be more room 
afterward and that's exactly the sort of thing global reserve is supposed 
to be there for.  Tho it's just a try, no guarantees.

2) This could be difficult on embedded, but the other option is 
temporarily adding a second device (btrfs device add), to give the 
filesystem a bit of room to work with.  That takes space as well, but luckily, I 
believe it's system-chunk space, and there's plenty of room there, so it 
should be possible.

The idea is to get enough metadata space to work with to get out of the 
fix by deleting a file or the like (normally, a balance could help as 
well, but that primarily helps to reclaim empty chunks from say data, so 
they can be reassigned to metadata, and since this is shared data/
metadata, that's unlikely to help).

Then when the filesystem is back to usable and enough has been deleted so 
what's on the temporary second device will fit back on the first device 
again, btrfs device delete the second one.

I'm unfamiliar with how small an added device can be and still be useful 
at that level, or more precisely, how the system chunk shrinks with total 
device size, but the one small data point I have here is a 256 MiB /boot, 
which has a 16 MiB system chunk, so I'm guessing it should shrink at 
least that far.

So let's say 16 MiB system, and it's into global reserve by ~6.5 MiB, so 
we want to give it at least that much more, plus something to work with.

So I'd suggest a 24 MiB or if it's available, 32 MiB, second device, at 
minimum.  Smaller can be tried, with the hope that the system chunk 
shrinks to say 8 MiB or smaller if the device is small enough, but I'm 
not sure it will.

As for actually making available a device on embedded, if there's no USB 
port available and thus the "simple" solution of plugging in a thumb 
drive is out of the question... maybe there's enough memory to create a 
tmpfs and do a loopback file on it, then add that loopback file as the 
temporary second device.  Of course if the power dies or the system 
otherwise crashes when part of the filesystem's on that tmpfs... not good 
news.  And obviously in that case it /better/ be temporary, because you 
can't reboot without losing that tmpfs and with it the loopback.  But if 
there's no other way to get access to a suitable device and the system 
and power is stable enough...
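
A rough sketch of that loopback dance, assuming ~50 MiB of spare RAM and
that /data is the mount point (paths, sizes, and the loop device number
are all illustrative; truncate and losetup must be available on the box):

% mkdir -p /tmp/rescue
% mount -t tmpfs -o size=48M tmpfs /tmp/rescue
% truncate -s 32M /tmp/rescue/extra.img
% losetup /dev/loop0 /tmp/rescue/extra.img
% btrfs device add /dev/loop0 /data
... delete enough files that everything fits on /dev/sda8 again ...
% btrfs device delete /dev/loop0 /data
% losetup -d /dev/loop0
% umount /tmp/rescue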


So that answers the question of what to do to exit that state, and in a 
parenthetical I answered the question of why it's requiring space to 
unlink -- btrfs is cow, copy-on-write, so even unlinking a file requires 
space to copy the metadata block containing the information about that 
file for the write.  And it makes the third question moot, as we have the 
root cause already -- the cow nature of btrfs.


Meanwhile, one more thing to address.  Despite what various distros may 
claim, here on this list, btrfs is considered "stabilizING, but not yet 
fully stable or mature."  Production usage, particularly without backups, 
isn't recommended, occasional bugs can be expected, and the standard 
recommendation is using no older than the last two of either current 
kernels or LTS kernels.   With the just released 4.4 being an LTS kernel, 
that makes 4.1 the previous one back and the oldest recommended kernel, 
tho with 4.4 being so new, still being on the LTS before that, 3.18, 
would still be somewhat acceptable if you're already working on updating 
to 4.1.  For anything older than that, while we'll try to support as best 
we can, chances are very good that one of the first requests will be to 
update to something not so ancient.

Under those conditions, honestly, it may be that btrfs isn't yet stable 
enough to be the right choice, particularly for embedded projects that 
are supposed to be field-usable without backups and without available 
technical maintenance for some years.  As I said, we're stabilizing, and 
actually, I'm not sure about the devs (I'm a list regular and btrfs user, 
not a dev) and other list regulars, but it may be that with LTS 4.4 we'll 
extend the informal support scope to three LTS series and thus support 
3.18 awhile longer.  But btrfs is definitely not something /I'd/ recommend 
for embedded designs meant to be field-usable without ready backups or 
direct tech supervision, just yet.

OTOH, if it's embedded but with backups and direct tech supervision, then 
btrfs may be just fine, if you're willing to put up with the occasional 
bug and accept that you must be prepared to actually have to use those 
backups, should one of those occasional bugs require it, and if keeping 
generally to the last two LTS (or current) kernel series is acceptable.

-- 
Duncan - List replies preferred.   No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master."  Richard Stallman



* Re: Out of space on small-ish partition, clobber and other methods haven't worked
  2016-01-21 10:41   ` Duncan
@ 2016-01-21 17:40     ` Chris Murphy
  2016-01-22 12:08       ` Duncan
  2016-01-22 12:11     ` Jerry Steinhauer
  1 sibling, 1 reply; 7+ messages in thread
From: Chris Murphy @ 2016-01-21 17:40 UTC (permalink / raw)
  To: Duncan; +Cc: Btrfs BTRFS

On Thu, Jan 21, 2016 at 3:41 AM, Duncan <1i5t5.duncan@cox.net> wrote:
> Chris Murphy posted on Wed, 20 Jan 2016 19:28:35 -0700 as excerpted:
>
>> On Wed, Jan 20, 2016 at 2:22 PM, Jerry Steinhauer
>> <jerry.steinhauer@singlewire.com> wrote:
>>
>>> % rm a.file
>>> rm: cannot remove 'a.file': No space left on device
>>> % cat /dev/null > a.file
>>> -sh: a.file: No space left on device
>
>>> % btrfs fi df /data
>>> System, single: total=32.00MiB, used=4.00KiB
>>> Data+Metadata, single: total=506.00MiB, used=500.39MiB
>>> GlobalReserve, single: total=12.00MiB, used=6.45MiB
>>
>>
>> I see somewhere between 6MiB and 12MiB that should be available for file
>> removal.
>
> I don't.  See that global reserve?  6.45 MiB into its emergency reserve,
> so effectively -6.45 MiB of space available for file removal.

Data+Metadata, single: total=506.00MiB, used=500.39MiB -> 5.61MiB unused
GlobalReserve, single: total=12.00MiB, used=6.45MiB -> 5.55MiB unused

5.61 + 5.55 = 11.16 MiB

I see somewhere between 5.61MiB and 11.16MiB "not used".  As usual,
Btrfs is coy about what's actually available for writing; it just
tells us a total and what's used, leaving it up to us to guess what's
not used.

Qu has previously said Global Reserve is actually baked into Metadata,
which is why I'm giving it a range, which excludes GlobalReserve at
the small end. So in any case there is ~5MiB that is not used either
in Metadata or in GlobalReserve, depending on your point of view.

And in any case, a single rm command should not take 5MiB to cow. If
there are no snapshots or reflinks then it's probably less than 100K
of writes.

But even if that's wrong, then the reserve is inadequate.  There should
be no such thing as a filesystem, even a cow one, wedging itself into a
situation where ordinary write commands are allowed to fill it up to the
point that no files can be deleted.  A non-privileged user could do that
easily and totally nerf the file system; don't you think that's a
security risk?





>
> First of all, any time global reserve is used at all the filesystem is in
> very dire straits, and he's 6.45 MiB into the 12.00 MiB global reserve,
> so that alone tells us "we're not in Kansas any more!" =8^0
>
> Second, the btrfs fi show (which you didn't quote) says 540 MiB capacity.
>
> System           32 MiB total, can't be used for anything else
> Data+Metadata   506 MiB total, shared data/metadata as it's a small
> filesystem  (See why I didn't list global reserve here, below.)
>
>
> Total           538 MiB chunked out.  While that's 2 MiB from the
> reported 540 capacity, I don't believe system includes the reserved space
> (for boot loader, etc) at the beginning of the partition.  Between that
> and the limits of the chunk-allocator, he's likely all chunked-out, no
> possibility of allocating further chunks.




>
> Global reserve is normally reserved from metadata, which of course is
> shared data/metadata here, due to the size of the filesystem (which makes
> shared a practical necessity, the problems would be much worse if data
> and metadata chunks were separate!).
>
> So of the 506 MiB in data/metadata, 12 MiB are global reserve.  Which
> means there's only 494 MiB of normal data/metadata space, plus 12 MiB of
> global reserve.
>
> But the DF shows 500.39 MiB of data/metadata used, which means we're
> roughly 6.4 MiB past normal data/metadata usage into the emergency use
> only global reserve, which is indeed (roughly) what global reserve shows,
> 6.45 MiB used.

OK that's a fine explanation, but the UI is not explaining this at
all. In fact it's completely misleading *away* from the explanation
you give. It's suggesting there's free space by the fact "used" is
less than "total".


So no matter what, there's more than one bug here.  First and foremost,
the file system shouldn't get itself into this situation; it should have
started returning ENOSPC before it could no longer delete files.  And
the user-space tools shouldn't mislead the user about how much free
space there is, or require a bunch of side calculations to discover that
the output means something other than what it appears to say.


-- 
Chris Murphy


* Re: Out of space on small-ish partition, clobber and other methods haven't worked
  2016-01-21 17:40     ` Chris Murphy
@ 2016-01-22 12:08       ` Duncan
  0 siblings, 0 replies; 7+ messages in thread
From: Duncan @ 2016-01-22 12:08 UTC (permalink / raw)
  To: linux-btrfs

Chris Murphy posted on Thu, 21 Jan 2016 10:40:51 -0700 as excerpted:

>> Global reserve is normally reserved from metadata, which of course is
>> shared data/metadata here, due to the size of the filesystem (which
>> makes shared a practical necessity, the problems would be much worse if
>> data and metadata chunks were separate!).
>>
>> So of the 506 MiB in data/metadata, 12 MiB are global reserve.  Which
>> means there's only 494 MiB of normal data/metadata space, plus 12 MiB
>> of global reserve.
>>
>> But the DF shows 500.39 MiB of data/metadata used, which means we're
>> roughly 6.4 MiB past normal data/metadata usage into the emergency use
>> only global reserve, which is indeed (roughly) what global reserve
>> shows,
>> 6.45 MiB used.
> 
> OK that's a fine explanation, but the UI is not explaining this at all.
> In fact it's completely misleading *away* from the explanation you give.
> It's suggesting there's free space by the fact "used" is less than
> "total".

... Which is why there's already a proposal (and patch, IIRC) to indent 
global reserve under metadata, increasing the "used" metadata value by 
the "total" value of global reserve, tho there's an alternate viewpoint 
in the discussion saying that global reserve shouldn't be shown at all, 
just included in the metadata "used" figure, unless an option is given to 
show it.

David, IIRC, offered another viewpoint: fi df is fine as-is, as it's 
intended for devs and advanced users who know the score, while fi usage 
should be recommended for normal users.
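
For reference, that newer command is just:

% btrfs fi usage /data

tho as explained below, it only handles mixed-mode filesystems properly 
from -progs 4.4 on.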

But as I pointed out (my viewpoint), until David's patches that made it 
into -progs 4.4, usage didn't work in some cases, namely --mixed mode: it 
printed a warning at the top and often returned 8 EiB (on 64-bit, anyway), 
because something was going negative due to a screwy formula (thus the 
unsupported warning) and then getting reported as a ridiculously high 
unsigned value.  Between that and the fact that fi usage is itself 
relatively new compared to the old fi show and fi df recommendation, it's 
simply impractical to recommend fi usage in the generic case without a 
big hairy explanation of why it might not even be there (too-old 
userspace) or might be reporting absolutely crazy values (--mixed mode 
before it was properly supported in the still very new 4.4 userspace).  So 
recommending that users post their btrfs fi show and btrfs fi df is going 
to remain the only practical option for another year or two, and 
interpreting the global reserve line in btrfs fi df is as much a 
challenge for those not "in the know" as interpreting the global total 
vs. device total in btrfs fi show.  In both cases, interpretation is 
definitely a form of art, only doable by those who know the inside tricks.

So yeah, interpretation of btrfs fi df, particularly the global reserve 
as part of metadata, is tricky, requiring internal knowledge to do 
correctly, but so is interpreting btrfs fi show, and we've gotten so used 
to dealing with that, that it's hardly ever remarked on any more, except 
perhaps in explaining to newbies that the simplest thing to do there is 
to ignore the global total line and only look at the individual device 
lines.  (The numbers in show's global total have a value, but they're not 
what people intuitively think they are, and it's often easier to simply 
tell people to pretend that line doesn't exist than to explain where the 
number actually comes from, doing the required math[1] along the way.)

Meanwhile, simply adding the global reserve total to metadata used tends 
to work pretty well, and if that sum is more than the metadata total, the 
difference shows up as global reserve used, which, if non-zero, really 
does indicate a filesystem in pretty dire straits.
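
With the fi df from this thread, that check works out to:

500.39 MiB used + 12.00 MiB global reserve total = 512.39 MiB
512.39 MiB - 506.00 MiB data+metadata total = 6.39 MiB

... which is (roughly) the 6.45 MiB the GlobalReserve line shows as used.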

And yes, it's a bug if btrfs gets into a jam so tight it can't even 
delete a file to make more room, but the reporter wasn't on the latest 
kernel either, and there's a reason the latest is recommended.  As I 
said, I think I saw some patches go by that I believe should be in 4.4, 
that may very well allow btrfs to use the global reserve for file 
deletions.  And if so, that bug is not only known, but already fixed.

---
[1] Doing the required math: Heh, I just typoed that as doing the 
required meth... which might explain some things about these "gotta know 
the inside story to interpret" lines in both sub-commands!  =;^p

-- 
Duncan - List replies preferred.   No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master."  Richard Stallman



* Re: Out of space on small-ish partition, clobber and other methods haven't worked
  2016-01-21 10:41   ` Duncan
  2016-01-21 17:40     ` Chris Murphy
@ 2016-01-22 12:11     ` Jerry Steinhauer
  1 sibling, 0 replies; 7+ messages in thread
From: Jerry Steinhauer @ 2016-01-22 12:11 UTC (permalink / raw)
  To: Duncan; +Cc: Btrfs BTRFS

Thanks, Duncan, for the thorough and thoughtful reply.  We appreciate
that btrfs is stabilizing (emphasis understood).  In our case, we think
the benefits of cow for running on compact flash will outweigh the
potential for a btrfs-specific hiccup.  In this particular case, these
systems do not contain unique data.  The background you give strengthens
our resolve to have contingency plans in place in case of a hiccup.

We've moved to 4.1 in our production release, and before asking people
here for support in the future, we'll try our symptom against the latest
kernel rev available.

 - Jerry




On Thu, Jan 21, 2016 at 4:41 AM, Duncan <1i5t5.duncan@cox.net> wrote:
> Chris Murphy posted on Wed, 20 Jan 2016 19:28:35 -0700 as excerpted:
>
>> On Wed, Jan 20, 2016 at 2:22 PM, Jerry Steinhauer
>> <jerry.steinhauer@singlewire.com> wrote:
>>
>>> % rm a.file
>>> rm: cannot remove 'a.file': No space left on device
>>> % cat /dev/null > a.file
>>> -sh: a.file: No space left on device
>
>>> % btrfs fi df /data
>>> System, single: total=32.00MiB, used=4.00KiB
>>> Data+Metadata, single: total=506.00MiB, used=500.39MiB
>>> GlobalReserve, single: total=12.00MiB, used=6.45MiB
>>
>>
>> I see somewhere between 6MiB and 12MiB that should be available for file
>> removal.
>
> I don't.  See that global reserve?  6.45 MiB into its emergency reserve,
> so effectively -6.45 MiB of space available for file removal.
>
> First of all, any time global reserve is used at all the filesystem is in
> very dire straits, and he's 6.45 MiB into the 12.00 MiB global reserve,
> so that alone tells us "we're not in Kansas any more!" =8^0
>
> Second, the btrfs fi show (which you didn't quote) says 540 MiB capacity.
>
> System           32 MiB total, can't be used for anything else
> Data+Metadata   506 MiB total, shared data/metadata as it's a small
> filesystem  (See why I didn't list global reserve here, below.)
>
>
> Total           538 MiB chunked out.  While that's 2 MiB from the
> reported 540 capacity, I don't believe system includes the reserved space
> (for boot loader, etc) at the beginning of the partition.  Between that
> and the limits of the chunk-allocator, he's likely all chunked-out, no
> possibility of allocating further chunks.
>
> Global reserve is normally reserved from metadata, which of course is
> shared data/metadata here, due to the size of the filesystem (which makes
> shared a practical necessity, the problems would be much worse if data
> and metadata chunks were separate!).
>
> So of the 506 MiB in data/metadata, 12 MiB are global reserve.  Which
> means there's only 494 MiB of normal data/metadata space, plus 12 MiB of
> global reserve.
>
> But the DF shows 500.39 MiB of data/metadata used, which means we're
> roughly 6.4 MiB past normal data/metadata usage into the emergency use
> only global reserve, which is indeed (roughly) what global reserve shows,
> 6.45 MiB used.
>
> So as I said, that btrfs is in pretty severely dire straits!  Not only is
> all the available data/metadata space used, but we're well past half way
> into the emergency global reserve as well.  No WONDER there's no space
> left even to delete a file (which because btrfs is COW, copy-on-write,
> requires metadata space even to delete a file, as the metadata block
> containing the original data cannot be rewritten in place and must be
> written elsewhere... thus answering the question of why btrfs needs space
> even on the unlink).
>
>
> As for solutions, there's still a couple things (plus one already tried)
> to try to get out of the situation:
>
> 0) Try clobbering the file, reducing it to zero size, but you did and
> that didn't work.  It might have if the btrfs wasn't already so far into
> global reserve.
>
> 1) As CMurphy says (with two Chris Ms on the list that isn't clear, so
> CMurphy it is), try a later kernel, either 4.1.x or 4.4.  AFAIK there
> were a few patches having to do with ENOSPC errors and allowing file
> deletes to take from global reserve, as the result should be more room
> afterward and that's exactly the sort of thing global reserve is supposed
> to be there for.  Tho it's just a try, no guarantees.
>
> 2) This could be difficult on embedded, but the other option is
> temporarily adding a second device (btrfs device add), to give the
> filesystem a bit of work with.  That takes space as well, but luckily, I
> believe it's system-chunk space, and there's plenty of room there, so it
> should be possible.
>
> The idea is to get enough metadata space to work with to get out of the
> fix by deleting a file or the like (normally, a balance could help as
> well, but that primarily helps to reclaim empty chunks from say data, so
> they can be reassigned to metadata, and since this is shared data/
> metadata, that's unlikely to help).
>
> Then when the filesystem is back to usable and enough has been deleted so
> what's on the temporary second device will fit back on the first device
> again, btrfs device delete the second one.
>
> I'm unfamiliar with how small an added device can be and still be useful
> at that level, or more precisely, how the system chunk shrinks with total
> device size, but the one small data point I have here is a 256 MiB /boot,
> which has a 16 MiB system chunk, so I'm guessing it should shrink at
> least that far.
>
> So let's say 16 MiB system, and it's into global reserve by ~6.5 MiB, so
> we want to give it at least that much more, plus something to work with.
>
> So I'd suggest a 24 MiB or if it's available, 32 MiB, second device, at
> minimum.  Smaller can be tried, with the hope that the system chunk
> shrinks to say 8 MiB or smaller if the device is small enough, but I'm
> not sure it will.
>
> As for actually making available a device on embedded, if there's no USB
> port available and thus the "simple" solution of plugging in a thumb
> drive is out of the question... maybe there's enough memory to create a
> tmpfs and do a loopback file on it, then add that loopback file as the
> temporary second device.  Of course if the power dies or the system
> otherwise crashes when part of the filesystem's on that tmpfs... not good
> news.  And obviously in that case it /better/ be temporary, because you
> can't reboot without losing that tmpfs and with it the loopback.  But if
> there's no other way to get access to a suitable device and the system
> and power is stable enough...
>
>
> So that answers the what to do to exit that state question, and in a
> parenthetical  I answered the question of why it's requiring space to
> unlink -- btrfs is cow, copy-on-write, so even unlinking a file requires
> space to copy the metadata block containing the information about that
> file for the write.  And it makes the third question moot, as we have the
> root cause already -- the cow nature of btrfs.
>
>
> Meanwhile, one more thing to address.  Despite what various distros may
> claim, here on this list, btrfs is considered "stablizING, but not yet
> fully stable or mature."  Production usage, particularly without backups,
> isn't recommended, occasional bugs can be expected, and the standard
> recommendation is using no older than the last two of either current
> kernels or LTS kernels.   With the just released 4.4 being an LTS kernel,
> that makes 4.1 the previous one back and the oldest recommended kernel,
> tho with 4.4 being so new, still being on the LTS before that, 3.18,
> would still be somewhat acceptable if you're already working on updating
> to 4.1.  But before that, while we'll try to support as best we can,
> chances are very good among the first requests is going to be to update
> to something not so ancient.
>
> Under those conditions, honestly, it may be that btrfs isn't yet stable
> enough to be the right choice, particularly for embedded projects that
> are supposed to be field-usable without backups and without available
> technical maintenance for some years.  As I said, we're stabilizing, and
> actually, I'm not sure about the devs (I'm a list regular and btrfs user,
> not a dev) and other list regulars, but it may be that with LTS 4.4 we'll
> extend the informal support scope to three LTS series and thus support
> 3.18 awhile longer, but btrfs is definitely not where /I'd/ recommend
> using btrfs on designed to be field usable without ready backups or
> direct tech supervision embedded, just yet.
>
> OTOH, if it's embedded but with backups and direct tech supervision, then
> btrfs may be just fine, if you're willing to put up with the occasional
> bug and accept that you must be prepared to actually have to use those
> backups, should one of those occasional bugs require it, and if keeping
> generally to the last two LTS (or current) kernel series is acceptable.
>
> --
> Duncan - List replies preferred.   No HTML msgs.
> "Every nonfree program has a lord, a master --
> and if you use the program, he is your master."  Richard Stallman
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html


* Re: Out of space on small-ish partition, clobber and other methods haven't worked
       [not found]   ` <CAHRikPHt7SmFhzQsZ-XKLYSbwCAgCeccEFXbw+YXBobJx8w1Ew@mail.gmail.com>
@ 2016-01-23 15:23     ` Jerry Steinhauer
  0 siblings, 0 replies; 7+ messages in thread
From: Jerry Steinhauer @ 2016-01-23 15:23 UTC (permalink / raw)
  To: Chris Murphy; +Cc: Btrfs BTRFS

OK, I have a call stack.  Does this help?  If I can provide anything
else to help narrow this down, please let me know.

% echo 8 > /proc/sys/kernel/printk

% btrfs fi df /data
System, single: total=32.00MiB, used=4.00KiB
Data+Metadata, single: total=200.00MiB, used=44.64MiB
GlobalReserve, single: total=4.00MiB, used=0.00B

% mount -o remount,enospc_debug,relatime,rw,space_cache /data
% mount | grep /data
/dev/sda8 on /data type btrfs (rw,relatime,space_cache,enospc_debug)
% cat /dev/zero > a.file
cat: write error: No space left on device
% rm a.file
rm: cannot remove 'a.file': No space left on device
% btrfs fi df /data
System, single: total=32.00MiB, used=4.00KiB
Data+Metadata, single: total=506.00MiB, used=499.62MiB
GlobalReserve, single: total=12.00MiB, used=5.69MiB

[67333.553850] ------------[ cut here ]------------
[67333.553931] WARNING: CPU: 0 PID: 14400 at
/home/builder/workspace/BuildAppImage/poky/build/tmp/work-shared/qemux86/kernel-source/fs/btrfs/extent-tree.c:7539
btrfs_alloc_tree_block+0xd2/0x41b()
[67333.553997] BTRFS: block rsv returned -28
[67333.554010] Modules linked in: nf_log_ipv4 nf_log_common xt_limit
xt_tcpudp nf_conntrack_ipv4 nf_defrag_ipv4 xt_hashlimit xt_conntrack
nf_conntrack iptable_mangle iptable_filter ip_tables xt_LOG x_tables
vmw_balloon pcspkr floppy parport_pc parport vmw_vmci i2c_piix4
[67333.554206] CPU: 0 PID: 14400 Comm: kworker/u2:5 Not tainted
4.1.15-yocto-standard #1
[67333.554234] Hardware name: VMware, Inc. VMware Virtual
Platform/440BX Desktop Reference Platform, BIOS 6.00 07/02/2015
[67333.554289] Workqueue: btrfs-endio-write btrfs_endio_write_helper
[67333.554312]  00000000 de5d7c04 de5d7bdc c16a0203 de5d7bf4 c10394ed
c11fa00d deef90c0
[67333.554344]  00000000 ffffffe4 de5d7c0c c103952f 00000009 de5d7c04
c18ca448 de5d7c20
[67333.554375]  de5d7c74 c11fa00d c18c9f0d 00001d73 c18ca448 ffffffe4
00000145 de5d7c8c
[67333.554407] Call Trace:
[67333.554445]  [<c16a0203>] dump_stack+0x16/0x18
[67333.554477]  [<c10394ed>] warn_slowpath_common+0x7c/0x93
[67333.554496]  [<c11fa00d>] ? btrfs_alloc_tree_block+0xd2/0x41b
[67333.554516]  [<c103952f>] warn_slowpath_fmt+0x2b/0x2f
[67333.554533]  [<c11fa00d>] btrfs_alloc_tree_block+0xd2/0x41b
[67333.554560]  [<c124bf99>] ? btrfs_add_delayed_tree_ref+0xf8/0x132
[67333.554652]  [<c11e858b>] __btrfs_cow_block+0x11f/0x457
[67333.554701]  [<c11e8a50>] btrfs_cow_block+0x12e/0x194
[67333.554756]  [<c11ea50e>] push_leaf_right+0x97/0x12d
[67333.554798]  [<c11eac50>] split_leaf+0xb9/0x529
[67333.554839]  [<c11eb5ba>] btrfs_search_slot+0x4fa/0x6bb
[67333.554883]  [<c120073d>] btrfs_csum_file_blocks+0x1db/0x5a9
[67333.554929]  [<c120cab9>] add_pending_csums.isra.5+0x40/0x56
[67333.554974]  [<c12120ae>] btrfs_finish_ordered_io+0x384/0x4fa
[67333.555020]  [<c121243c>] finish_ordered_fn+0x12/0x14
[67333.555062]  [<c12340dc>] btrfs_scrubnc_helper+0xf6/0x2ec
[67333.555106]  [<c1234348>] btrfs_endio_write_helper+0xd/0xf
[67333.555151]  [<c10491f5>] process_one_work+0x17a/0x2e9
[67333.555193]  [<c1049b63>] worker_thread+0x267/0x346
[67333.555235]  [<c10498fc>] ? rescuer_thread+0x281/0x281
[67333.555278]  [<c104ceb3>] kthread+0xa3/0xa8
[67333.555318]  [<c16a5380>] ret_from_kernel_thread+0x20/0x30
[67333.555362]  [<c104ce10>] ? kthread_worker_fn+0x132/0x132
[67333.555406] ---[ end trace df2e60be34a6df48 ]---

 - Jerry




On Fri, Jan 22, 2016 at 5:54 AM, Jerry Steinhauer
<jerry.steinhauer@singlewire.com> wrote:
> Thanks, Chris.
>
> 1) Re: enospc_debug, I've rebuilt with CONFIG_BTRFS_DEBUG:
>
> root@singlewire:~# zcat /proc/config.gz | grep BTRFS
> CONFIG_BTRFS_FS=y
> CONFIG_BTRFS_FS_POSIX_ACL=y
> # CONFIG_BTRFS_FS_CHECK_INTEGRITY is not set
> # CONFIG_BTRFS_FS_RUN_SANITY_TESTS is not set
> CONFIG_BTRFS_DEBUG=y
> # CONFIG_BTRFS_ASSERT is not set
>
> I've also remounted the partition with enospc_debug:
>
> root@singlewire:~# mount | grep /data
> /dev/sda8 on /data type btrfs (rw,relatime,space_cache,enospc_debug)
>
> But neither dmesg nor /var/log/messages has output from btrfs when I run
> the system out of space.  My C is rusty, but it looks like a macro called
> DEBUG needs to be defined.  What do I need to do in the kernel config to
> turn that on?  (google searches are not helpful with this particular term :) )
>
> 2) We went to 4.1.15 yesterday.  No change in symptom.
>
>
>  - Jerry
>
>
>
> On Wed, Jan 20, 2016 at 8:28 PM, Chris Murphy <lists@colorremedies.com>
> wrote:
>>
>> On Wed, Jan 20, 2016 at 2:22 PM, Jerry Steinhauer
>> <jerry.steinhauer@singlewire.com> wrote:
>>
>> > % rm a.file
>> > rm: cannot remove 'a.file': No space left on device
>> > % cat /dev/null > a.file
>> > -sh: a.file: No space left on device
>> > % btrfs fi df /data
>> > System, single: total=32.00MiB, used=4.00KiB
>> > Data+Metadata, single: total=506.00MiB, used=500.39MiB
>> > GlobalReserve, single: total=12.00MiB, used=6.45MiB
>>
>>
>> I see somewhere between 6MiB and 12MiB that should be available for
>> file removal. Since delete is cow, the fs still needs free space. But
>> I'd think rm would need very little cow metadata space to work and
>> then free things up.
>>
>> I can't say for sure but sounds like a bug, or at least an unintended
>> behavior.
>>
>>
>> > We encountered this on 3.1, so we upgraded as far as our distro would
>> > take us (yocto) to 4.1.  Same issue persists.
>>
>> I suggest two things, neither of which fixes the problem:
>>
>> 1. Remount with -o enospc_debug, and reproduce the problem.
>> 2. See if you can go to 4.1.15. There are quite a few Btrfs backports
>> from the 4.1.8 you're using up until 4.1.15.
>>
>>
>>
>> --
>> Chris Murphy
>
>


