linux-btrfs.vger.kernel.org archive mirror
* file system full on a single disk?
@ 2020-01-13 22:18 Christian Kujau
  2020-01-13 22:55 ` Chris Murphy
  0 siblings, 1 reply; 9+ messages in thread
From: Christian Kujau @ 2020-01-13 22:18 UTC (permalink / raw)
  To: linux-btrfs

Hi,

I realize that this comes up every now and then but always for slightly 
more complicated setups, or so I thought:


============================================================
# df -h /
Filesystem             Size  Used Avail Use% Mounted on
/dev/mapper/luks-root  825G  389G     0 100% /

# btrfs filesystem show /
Label: 'root'  uuid: 75a6d93a-5a5c-48e0-a237-007b2e812477
        Total devices 1 FS bytes used 388.00GiB
        devid    1 size 824.40GiB used 395.02GiB path /dev/mapper/luks-root

# blockdev --getsize64 /dev/mapper/luks-root | awk '{print $1/1024^3, "GB"}'
824.398 GB

# btrfs filesystem df /
Data, single: total=388.01GiB, used=387.44GiB
System, single: total=4.00MiB, used=64.00KiB
Metadata, single: total=2.01GiB, used=1.57GiB
GlobalReserve, single: total=512.00MiB, used=80.00KiB
============================================================


This is on a Fedora 31 (5.4.8-200.fc31.x86_64) workstation. Where did the 
other 436 GB go? Or, why are only 395 GB allocated from the 824 GB device?

I'm running a --full-balance now and it's progressing, slowly. I've seen 
tricks on the interwebs to temporarily add a ramdisk, run another balance, 
remove the ramdisk again - but that seems hackish.
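
For the record, the filtered variant I keep seeing recommended instead of 
the ramdisk trick looks roughly like this; the usage threshold is just a 
guess on my part, not something I've verified:

============================================================
# rewrite only mostly-empty data block groups instead of everything
btrfs balance start -dusage=50 /

# check on it / stop it if needed
btrfs balance status /
btrfs balance cancel /
============================================================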

Isn't there a way to prevent this from happening? (Apart from better 
monitoring, so I can run the balance at an earlier stage next time).
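
The sort of monitoring I have in mind is nothing fancier than a cron job 
along these lines, by the way; the 5 GiB threshold is made up and it 
assumes both figures are reported in GiB:

============================================================
#!/bin/sh
# warn when the data block groups are nearly full, since that's the point
# where ENOSPC can hit even though the device still has unallocated space
btrfs filesystem df / | awk -F'[=,]' '/^Data/ {
    if ($3 - $5 < 5) print "btrfs: data block groups nearly full on /"
}'
============================================================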


Thanks,
Christian.


# btrfs filesystem usage -T /
Overall:
    Device size:                 824.40GiB
    Device allocated:            395.02GiB
    Device unallocated:          429.38GiB
    Device missing:                  0.00B
    Used:                        388.00GiB
    Free (estimated):            435.94GiB      (min: 435.94GiB)
    Data ratio:                       1.00
    Metadata ratio:                   1.00
    Global reserve:              512.00MiB      (used: 0.00B)

                         Data      Metadata System              
Id Path                  single    single   single   Unallocated
-- --------------------- --------- -------- -------- -----------
 1 /dev/mapper/luks-root 393.01GiB  2.01GiB  4.00MiB   429.38GiB
-- --------------------- --------- -------- -------- -----------
   Total                 393.01GiB  2.01GiB  4.00MiB   429.38GiB
   Used                  386.45GiB  1.55GiB 64.00KiB            


-- 
BOFH excuse #326:

We need a licensed electrician to replace the light bulbs in the computer room.


* Re: file system full on a single disk?
  2020-01-13 22:18 file system full on a single disk? Christian Kujau
@ 2020-01-13 22:55 ` Chris Murphy
  2020-01-13 23:21   ` Christian Kujau
  0 siblings, 1 reply; 9+ messages in thread
From: Chris Murphy @ 2020-01-13 22:55 UTC (permalink / raw)
  To: Christian Kujau; +Cc: Btrfs BTRFS

On Mon, Jan 13, 2020 at 3:28 PM Christian Kujau <lists@nerdbynature.de> wrote:
>
> Hi,
>
> I realize that this comes up every now and then but always for slightly
> more complicated setups, or so I thought:
>
>
> ============================================================
> # df -h /
> Filesystem             Size  Used Avail Use% Mounted on
> /dev/mapper/luks-root  825G  389G     0 100% /
>
> # btrfs filesystem show /
> Label: 'root'  uuid: 75a6d93a-5a5c-48e0-a237-007b2e812477
>         Total devices 1 FS bytes used 388.00GiB
>         devid    1 size 824.40GiB used 395.02GiB path /dev/mapper/luks-root
>
> # blockdev --getsize64 /dev/mapper/luks-root | awk '{print $1/1024^3, "GB"}'
> 824.398 GB
>
> # btrfs filesystem df /
> Data, single: total=388.01GiB, used=387.44GiB
> System, single: total=4.00MiB, used=64.00KiB
> Metadata, single: total=2.01GiB, used=1.57GiB
> GlobalReserve, single: total=512.00MiB, used=80.00KiB
> ============================================================
>
>
> This is on a Fedora 31 (5.4.8-200.fc31.x86_64) workstation. Where did the
> other 436 GB go? Or, why are only 395 GB allocated from the 824 GB device?

It's a reporting bug. File system is fine.


> I'm running a --full-balance now and it's progressing, slowly. I've seen
> tricks on the interwebs to temporarily add a ramdisk, run another balance,
> remove the ramdisk again - but that seems hackish.

I'd stop the balance. Balancing metadata in particular appears to make
the problem more common. And you're right, it's hackish, it's not a
great work around for anything these days, and if it is, good chance
it's a bug.


> Isn't there a way to prevent this from happening? (Apart from better
> monitoring, so I can run the balance at an earlier stage next time).

In theory it should be enough to unmount then remount the file system;
of course for sysroot that'd be a reboot. There may be certain
workloads that encourage it, that could be worked around temporarily
using mount option metadata_ratio=1.
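
Something like this, though I haven't needed it myself, so treat it as
untested:

    mount -o remount,metadata_ratio=1 /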

-- 
Chris Murphy


* Re: file system full on a single disk?
  2020-01-13 22:55 ` Chris Murphy
@ 2020-01-13 23:21   ` Christian Kujau
  2020-01-13 23:29     ` Chris Murphy
  0 siblings, 1 reply; 9+ messages in thread
From: Christian Kujau @ 2020-01-13 23:21 UTC (permalink / raw)
  To: Chris Murphy; +Cc: Btrfs BTRFS

On Mon, 13 Jan 2020, Chris Murphy wrote:
> It's a reporting bug. File system is fine.

Well, I received some ENOSPC notifications from various apps, so it was a 
real problem.

> > I'm running a --full-balance now and it's progressing, slowly. I've seen
> > tricks on the interwebs to temporarily add a ramdisk, run another balance,
> > remove the ramdisk again - but that seems hackish.
> 
> I'd stop the balance. Balancing metadata in particular appears to make
> the problem more common. And you're right, it's hackish, it's not a
> great work around for anything these days, and if it is, good chance
> it's a bug.

For now, the balancing "helped", but the fs still shows only 391 GB 
allocated from the 824 GB device:

=======================================================================
# btrfs filesystem show /
Label: 'root'  uuid: 75a6d93a-5a5c-48e0-a237-007b2e812477
        Total devices 1 FS bytes used 388.00GiB
        devid    1 size 824.40GiB used 391.03GiB path /dev/mapper/luks-root

# df -h /
Filesystem             Size  Used Avail Use% Mounted on
/dev/mapper/luks-root  825G  390G  433G  48% /
=======================================================================

> In theory it should be enough to unmount then remount the file system;
> of course for sysroot that'd be a reboot.

OK, I'll try a reboot next time.

> There may be certain workloads that encourage it, that could be worked 
> around temporarily using mount option metadata_ratio=1.

I'll do that after it happens again, to see if this was a one-off or 
happens regularly. The file system is rather new (created Dec 14) and 
apart from spinning up some libvirt VMs (but no snapshots involved) the 
workload is a mix of web browsing and compiling things, nothing too 
fancy.

Thanks for your input, and thanks for taking the time to respond.

Christian.
-- 
BOFH excuse #69:

knot in cables caused data stream to become twisted and kinked


* Re: file system full on a single disk?
  2020-01-13 23:21   ` Christian Kujau
@ 2020-01-13 23:29     ` Chris Murphy
  2020-01-13 23:38       ` Christian Kujau
  2020-01-13 23:41       ` Chris Murphy
  0 siblings, 2 replies; 9+ messages in thread
From: Chris Murphy @ 2020-01-13 23:29 UTC (permalink / raw)
  To: Christian Kujau; +Cc: Btrfs BTRFS

On Mon, Jan 13, 2020 at 4:21 PM Christian Kujau <lists@nerdbynature.de> wrote:
>
> On Mon, 13 Jan 2020, Chris Murphy wrote:
> > It's a reporting bug. File system is fine.
>
> Well, I received some ENOSPC notifications from various apps, so it was a
> real problem.

Oh it's a real problem and a real bug. But the file system itself is OK.

>
> > > I'm running a --full-balance now and it's progressing, slowly. I've seen
> > > tricks on the interwebs to temporarily add a ramdisk, run another balance,
> > > remove the ramdisk again - but that seems hackish.
> >
> > I'd stop the balance. Balancing metadata in particular appears to make
> > the problem more common. And you're right, it's hackish, it's not a
> > great work around for anything these days, and if it is, good chance
> > it's a bug.
>
> For now, the balancing "helped", but the fs still shows only 391 GB
> allocated from the 824 GB device:
>
> =======================================================================
> # btrfs filesystem show /
> Label: 'root'  uuid: 75a6d93a-5a5c-48e0-a237-007b2e812477
>         Total devices 1 FS bytes used 388.00GiB
>         devid    1 size 824.40GiB used 391.03GiB path /dev/mapper/luks-root
>
> # df -h /
> Filesystem             Size  Used Avail Use% Mounted on
> /dev/mapper/luks-root  825G  390G  433G  48% /
> =======================================================================
>
> > In theory it should be enough to unmount then remount the file system;
> > of course for sysroot that'd be a reboot.
>
> OK, I'll try a reboot next time.
>
> > There may be certain workloads that encourage it, that could be worked
> > around temporarily using mount option metadata_ratio=1.
>
> I'll do that after it happens again, to see if this was a one-off or
> happens regularly. The file system is rather new (created Dec 14) and
> apart from spinning up some libvirt VMs (but no snapshots involved) the
> workload is a mix of web browsing and compiling things, nothing too
> fancy.

A less janky option is to use 5.3.18, or grab 5.5.0-rc6 from koji.
I've been using 5.5.0 for a while for other reasons (i915 gotchas),
and the one Btrfs bug I ran into related to compression has been fixed
as of rc5.

https://koji.fedoraproject.org/koji/buildinfo?buildID=1428886
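
If it helps, pulling a build from koji usually looks something like this
(substitute the exact NVR from that page; the wildcard install is just my
habit):

    koji download-build --arch=$(uname -m) kernel-<nvr-from-that-page>
    sudo dnf install ./kernel-*.rpm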


-- 
Chris Murphy


* Re: file system full on a single disk?
  2020-01-13 23:29     ` Chris Murphy
@ 2020-01-13 23:38       ` Christian Kujau
  2020-01-13 23:51         ` Chris Murphy
  2020-01-13 23:41       ` Chris Murphy
  1 sibling, 1 reply; 9+ messages in thread
From: Christian Kujau @ 2020-01-13 23:38 UTC (permalink / raw)
  To: Chris Murphy; +Cc: Btrfs BTRFS

On Mon, 13 Jan 2020, Chris Murphy wrote:
> > Well, I received some ENOSPC notifications from various apps, so it was a
> > real problem.
> 
> Oh it's a real problem and a real bug. But the file system itself is OK.

Ah, OK. Good to know.

> > For now, the balancing "helped", but the fs still shows only 391 GB
> > allocated from the 824 GB device:

The first "balance start --full-balance /" finshed, with the following 
message, of course:

   ERROR: error during balancing '/': No space left on device

But afterwards at least "df" was happy and reported 48% usage again. While 
writing the last email I started another "balance start --full-balance /" 
to balance the extents that could not be balanced before because the file 
system was at 100%. But this failed with the same message and now I'm back 
to square one:

=============================================================
# btrfs filesystem df -h /
Data, single: total=391.00GiB, used=386.38GiB
System, single: total=32.00MiB, used=80.00KiB
Metadata, single: total=2.00GiB, used=1.55GiB
GlobalReserve, single: total=512.00MiB, used=0.00B

# df -h /
Filesystem             Size  Used Avail Use% Mounted on
/dev/mapper/luks-root  825G  389G     0 100% /
=============================================================

Sigh. I can't reboot right now, will do later on and will try another 
balance now.

> A less janky option is to use 5.3.18, or grab 5.5.0-rc6 from koji.
> I've been using 5.5.0 for a while for other reasons (i915 gotchas),
> and the one Btrfs bug I ran into related to compression has been fixed
> as of rc5.
> 
> https://koji.fedoraproject.org/koji/buildinfo?buildID=1428886

OK, thanks for the hint, I'll do that in a few hours when I'm able to 
reboot.

Thanks,
Christian.
-- 
BOFH excuse #69:

knot in cables caused data stream to become twisted and kinked


* Re: file system full on a single disk?
  2020-01-13 23:29     ` Chris Murphy
  2020-01-13 23:38       ` Christian Kujau
@ 2020-01-13 23:41       ` Chris Murphy
  2020-01-14  2:16         ` Christian Kujau
  1 sibling, 1 reply; 9+ messages in thread
From: Chris Murphy @ 2020-01-13 23:41 UTC (permalink / raw)
  To: Chris Murphy; +Cc: Christian Kujau, Btrfs BTRFS

On Mon, Jan 13, 2020 at 4:29 PM Chris Murphy <lists@colorremedies.com> wrote:
>
> On Mon, Jan 13, 2020 at 4:21 PM Christian Kujau <lists@nerdbynature.de> wrote:
> >
> > On Mon, 13 Jan 2020, Chris Murphy wrote:
> > > It's a reporting bug. File system is fine.
> >
> > Well, I received some ENOSPC notifications from various apps, so it was a
> > real problem.
>
> Oh it's a real problem and a real bug. But the file system itself is OK.
>
> >
> > > > I'm running a --full-balance now and it's progressing, slowly. I've seen
> > > > tricks on the interwebs to temporarily add a ramdisk, run another balance,
> > > > remove the ramdisk again - but that seems hackish.
> > >
> > > I'd stop the balance. Balancing metadata in particular appears to make
> > > the problem more common. And you're right, it's hackish, it's not a
> > > great work around for anything these days, and if it is, good chance
> > > it's a bug.
> >
> > For now, the balancing "helped", but the fs still shows only 391 GB
> > allocated from the 824 GB device:
> >
> > =======================================================================
> > # btrfs filesystem show /
> > Label: 'root'  uuid: 75a6d93a-5a5c-48e0-a237-007b2e812477
> >         Total devices 1 FS bytes used 388.00GiB
> >         devid    1 size 824.40GiB used 391.03GiB path /dev/mapper/luks-root
> >
> > # df -h /
> > Filesystem             Size  Used Avail Use% Mounted on
> > /dev/mapper/luks-root  825G  390G  433G  48% /
> > =======================================================================
> >
> > > In theory it should be enough to unmount then remount the file system;
> > > of course for sysroot that'd be a reboot.
> >
> > OK, I'll try a reboot next time.
> >
> > > There may be certain workloads that encourage it, that could be worked
> > > around temporarily using mount option metadata_ratio=1.
> >
> > I'll do that after it happens again, to see if this was a one-off or
> > happens regularly. The file system is rather new (created Dec 14) and
> > apart from spinning up some libvirt VMs (but no snapshots involved) the
> > workload is a mix of web browsing and compiling things, nothing too
> > fancy.
>
> A less janky option is to use 5.3.18, or grab 5.5.0-rc6 from koji.
> I've been using 5.5.0 for a while for other reasons (i915 gotchas),
> and the one Btrfs bug I ran into related to compression has been fixed
> as of rc5.
>
> https://koji.fedoraproject.org/koji/buildinfo?buildID=1428886
>

This is the latest patchset as of about a week ago, and actually I'm
not seeing it in 5.5rc6. A tested fix may not be ready yet.
https://patchwork.kernel.org/project/linux-btrfs/list/?series=223921

Your best bet is likely to stick with 5.4.10 and just use mount option
metadata_ratio=1. This won't cause some other weird thing to happen.
It'll just ask Btrfs to allocate a metadata block group each time a
data block group is created, or approximately 256M metadata BG for
each 1G data BG. And also it's useful to know if that doesn't help. I
myself haven't run into this bug or I'd try it.
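
A quick way to see whether it's kicking in is to watch the Metadata total
grow alongside Data, with the same command you already used:

    watch -n 60 'btrfs filesystem usage -T /'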


-- 
Chris Murphy


* Re: file system full on a single disk?
  2020-01-13 23:38       ` Christian Kujau
@ 2020-01-13 23:51         ` Chris Murphy
  0 siblings, 0 replies; 9+ messages in thread
From: Chris Murphy @ 2020-01-13 23:51 UTC (permalink / raw)
  To: Christian Kujau; +Cc: Chris Murphy, Btrfs BTRFS

On Mon, Jan 13, 2020 at 4:38 PM Christian Kujau <lists@nerdbynature.de> wrote:
>
> On Mon, 13 Jan 2020, Chris Murphy wrote:
> > > Well, I received some ENOSPC notifications from various apps, so it was a
> > > real problem.
> >
> > Oh it's a real problem and a real bug. But the file system itself is OK.
>
> Ah, OK. Good to know.
>
> > > For now, the balancing "helped", but the fs still shows only 391 GB
> > > allocated from the 824 GB device:
>
> The first "balance start --full-balance /" finshed, with the following
> message, of course:
>
>    ERROR: error during balancing '/': No space left on device
>
> But afterwards at least "df" was happy and reported 48% usage again. While
> writing the last email I started another "balance start --full-balance /"
> to balance the extents that could not be balanced before because the file
> system was at 100%. But this failed with the same message and now I'm back
> to square one:


That's why I suggested cancelling the balance.


>
> =============================================================
> # btrfs filesystem df -h /
> Data, single: total=391.00GiB, used=386.38GiB
> System, single: total=32.00MiB, used=80.00KiB
> Metadata, single: total=2.00GiB, used=1.55GiB
> GlobalReserve, single: total=512.00MiB, used=0.00B
>
> # df -h /
> Filesystem             Size  Used Avail Use% Mounted on
> /dev/mapper/luks-root  825G  389G     0 100% /
> =============================================================
>
> Sigh. I can't reboot right now, will do later on and will try another
> balance now.

While it won't make things worse, it won't make it better either.

Use mount option metadata_ratio=1 instead; see man 5 btrfs if you want to
know more about what it's doing.
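
For the sysroot that would be a line roughly like this in /etc/fstab
(adjust it to match your existing entry; the UUID here is a placeholder):

    UUID=<your-fs-uuid>  /  btrfs  defaults,metadata_ratio=1  0 0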

The bug is a consequence of a series of older bugs that got exposed in
5.4 by a change in how metadata is overcommitted. Those older bugs will
now get fixed, but in the meantime the problem is more likely to be
triggered if you have recently balanced metadata block groups.


-- 
Chris Murphy


* Re: file system full on a single disk?
  2020-01-13 23:41       ` Chris Murphy
@ 2020-01-14  2:16         ` Christian Kujau
  2020-01-14  3:39           ` Chris Murphy
  0 siblings, 1 reply; 9+ messages in thread
From: Christian Kujau @ 2020-01-14  2:16 UTC (permalink / raw)
  To: Chris Murphy; +Cc: Btrfs BTRFS

On Mon, 13 Jan 2020, Chris Murphy wrote:
> This is the latest patchset as of about a week ago, and actually I'm
> not seeing it in 5.5rc6. A tested fix may not be ready yet.
> https://patchwork.kernel.org/project/linux-btrfs/list/?series=223921
> 
> Your best bet is likely to stick with 5.4.10 and just use mount option
> metadata_ratio=1. This won't cause some other weird thing to happen.

I remounted the file system with that option set while it was still 
running, but it didn't do anything (as expected, I assume); usage was 
still at 100 percent.

Now I had a chance to reboot the system (with that option set), but usage 
was still at 100%, so I ran "btrfs balance" once more (although you 
recommended against it :)) and a reboot later everything seems "normal" 
again.

Thanks for the explanations and hints. I must admit it's kinda surprising 
to me that these ENOSPC errors are still happening with btrfs; I somehow 
assumed that these kinks had been ironed out by now. But as you said, this 
may have re-appeared with 5.4 and it's not a big deal for me right now, so 
I can live with the mount option set and wait for 5.5 to be released :-)

Thanks again,
Christian.
-- 
BOFH excuse #123:

user to computer ratio too high.


* Re: file system full on a single disk?
  2020-01-14  2:16         ` Christian Kujau
@ 2020-01-14  3:39           ` Chris Murphy
  0 siblings, 0 replies; 9+ messages in thread
From: Chris Murphy @ 2020-01-14  3:39 UTC (permalink / raw)
  To: Christian Kujau; +Cc: Btrfs BTRFS

On Mon, Jan 13, 2020 at 7:16 PM Christian Kujau <lists@nerdbynature.de> wrote:
>
> On Mon, 13 Jan 2020, Chris Murphy wrote:
> > This is the latest patchset as of about a week ago, and actually I'm
> > not seeing it in 5.5rc6. A tested fix may not be ready yet.
> > https://patchwork.kernel.org/project/linux-btrfs/list/?series=223921
> >
> > Your best bet is likely to stick with 5.4.10 and just use mount option
> > metadata_ratio=1. This won't cause some other weird thing to happen.
>
> I remounted the file system with that option set while it was still
> running, but it didn't do anything (as expected, I assume); usage was
> still at 100 percent.
>
> Now I had a chance to reboot the system (with that option set), but usage
> was still at 100%, so I ran "btrfs balance" once more (although you
> recommended against it :)) and a reboot later everything seems "normal"
> again.
>
> Thanks for the explanations and hints. I must admit it's kinda surprising
> to me that these ENOSPC errors are still happening with btrfs; I somehow
> assumed that these kinks had been ironed out by now. But as you said, this
> may have re-appeared with 5.4 and it's not a big deal for me right now, so
> I can live with the mount option set and wait for 5.5 to be released :-)

If you have a bugzilla account, file a bug. You can put me on the cc,
use bugzilla@ instead of lists@, and then you'll get a notification
when this is fixed in a future Fedora kernel. You can just wipe out
the whole template and copy/paste your first email in it. That's
enough info I think.
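
If you want to pad it out a little, the usual basics are plenty, nothing
exotic:

    uname -r
    btrfs --version
    btrfs filesystem usage -T /
    dmesg | grep -i btrfs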




-- 
Chris Murphy

