From: Tomasz Chmielewski <mangoo@wpkg.org>
To: E V <eliventer@gmail.com>
Cc: linux-btrfs@vger.kernel.org
Subject: Re: how to run balance successfully (No space left on device)?
Date: Fri, 10 Nov 2017 16:42:42 +0900
Message-ID: <011ae8c4281f0f8799d48189f540a302@wpkg.org>
In-Reply-To: <CAJtFHUQ34uyt-iAQKuQ-WqXMrCqxsPeqFc5LvYmZHrz+Rxs66A@mail.gmail.com>

On 2017-11-07 23:49, E V wrote:

> Hmm, I used to see these phantom no space issues quite a bit on older
> 4.x kernels, and haven't seen them since switching to space_cache=v2.
> So it could be space cache corruption. You might try either clearing
> your space cache, or mounting with nospace_cache, or try converting to
> space_cache=v2 after reading up on its caveats.

We have space_cache=v2.

Unfortunately, yet another system running 4.14-rc8 hits "No space left" 
during balance:


[68443.535664] BTRFS info (device sdb3): relocating block group 591771009024 flags data|raid1
[68463.203330] BTRFS info (device sdb3): found 8578 extents
[68492.238676] BTRFS info (device sdb3): found 8559 extents
[68500.751792] BTRFS info (device sdb3): 1 enospc errors during balance


# btrfs balance start /var/lib/lxd
WARNING:

         Full balance without filters requested. This operation is very
         intense and takes potentially very long. It is recommended to
         use the balance filters to narrow down the balanced data.
         Use 'btrfs balance start --full-balance' option to skip this
         warning. The operation will start in 10 seconds.
         Use Ctrl-C to stop it.
10 9 8 7 6 5 4 3 2 1
Starting balance without any filters.
ERROR: error during balancing '/var/lib/lxd': No space left on device
There may be more info in syslog - try dmesg | tail
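
The warning above recommends balance filters instead of a full balance. As a rough sketch only, the snippet below builds (but does not run) a series of filtered balance commands using the usage filters -dusage and -musage, starting with nearly empty block groups. The function name and the thresholds are hypothetical, and nothing in this report shows whether a filtered balance avoids the ENOSPC error seen here.

```python
import shlex


def filtered_balance_commands(mount_point, thresholds=(10, 25, 50)):
    """Return one `btrfs balance start` command per usage threshold.

    The thresholds are illustrative assumptions, not recommended values.
    Each command only touches block groups whose usage is at or below
    the given percentage, so early passes relocate little data.
    """
    return [
        ["btrfs", "balance", "start",
         f"-dusage={pct}", f"-musage={pct}", mount_point]
        for pct in thresholds
    ]


# Print the commands for review; running them is left to the operator.
for cmd in filtered_balance_commands("/var/lib/lxd"):
    print(shlex.join(cmd))
```

Printing rather than executing keeps the sketch harmless; each command would still need root and a mounted btrfs filesystem.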


# btrfs fi usage /var/lib/lxd
Overall:
     Device size:                 846.26GiB
     Device allocated:            622.27GiB
     Device unallocated:          223.99GiB
     Device missing:                  0.00B
     Used:                        606.40GiB
     Free (estimated):            116.68GiB      (min: 116.68GiB)
     Data ratio:                       2.00
     Metadata ratio:                   2.00
     Global reserve:              512.00MiB      (used: 0.00B)

Data,RAID1: Size:306.00GiB, Used:301.31GiB
    /dev/sda3     306.00GiB
    /dev/sdb3     306.00GiB

Metadata,RAID1: Size:5.10GiB, Used:1.89GiB
    /dev/sda3       5.10GiB
    /dev/sdb3       5.10GiB

System,RAID1: Size:32.00MiB, Used:80.00KiB
    /dev/sda3      32.00MiB
    /dev/sdb3      32.00MiB

Unallocated:
    /dev/sda3     112.00GiB
    /dev/sdb3     112.00GiB


# btrfs fi show /var/lib/lxd
Label: 'btrfs'  uuid: 6340f5de-f635-4d09-bbb2-1e03b1e1b160
         Total devices 2 FS bytes used 303.20GiB
         devid    1 size 423.13GiB used 311.13GiB path /dev/sda3
         devid    2 size 423.13GiB used 311.13GiB path /dev/sdb3


# btrfs fi df /var/lib/lxd
Data, RAID1: total=306.00GiB, used=301.32GiB
System, RAID1: total=32.00MiB, used=80.00KiB
Metadata, RAID1: total=5.10GiB, used=1.89GiB
GlobalReserve, single: total=512.00MiB, used=0.00B
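
To make the numbers above easier to check, here is a small arithmetic sketch. It assumes the "Free (estimated)" figure is the slack inside the allocated data chunks plus the unallocated raw space divided by the data ratio; that formula is an assumption on my part, not taken from the btrfs-progs source. All figures are copied from the `btrfs fi usage` and `btrfs fi show` output.

```python
# Figures copied from the output above, in GiB.
device_unallocated = 223.99   # "Device unallocated" (raw, both devices)
data_ratio = 2.00             # RAID1 stores every data block twice
data_size = 306.00            # "Data,RAID1: Size"
data_used = 301.31            # "Data,RAID1: Used"


def estimated_free(unallocated, ratio, size, used):
    """Assumed estimate: free room in already allocated data chunks
    plus the unallocated raw space divided by the data ratio."""
    return (size - used) + unallocated / ratio


free = estimated_free(device_unallocated, data_ratio, data_size, data_used)

# Per-device unallocated space, from `btrfs fi show`: size minus used.
per_device_unallocated = 423.13 - 311.13

print(f"estimated free:         {free:.2f} GiB")
print(f"unallocated per device: {per_device_unallocated:.2f} GiB")
```

Under that assumption the result agrees with the reported 116.68GiB to within rounding, and each device still has about 112GiB unallocated, so the balance fails even though plenty of raw space remains for new block groups.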



So far, of all the systems that were giving us "No space left on device" 
with 4.13.x, all but one still give us "No space left on device" 
during balance with 4.14-rc7 and later.
We've seen it on a mix of servers with SSDs or HDDs, with 
filesystems ranging from 0.5 TB to 20 TB and usage from 30% to 90%.

Combined with evidence that "No space left on device" during balance can 
lead to file corruption (we've witnessed it with MySQL), I'd say 
btrfs balance is a dangerous operation, and the decision to use it should be 
considered very carefully.


Shouldn't "Balance" be marked as "mostly OK" or "Unstable" here? Giving 
it "OK" status is misleading.

https://btrfs.wiki.kernel.org/index.php/Status


Tomasz Chmielewski
https://lxadm.com
