linux-btrfs.vger.kernel.org archive mirror
* No space left after add device and balance
@ 2020-09-09  8:15 Miloslav Hůla
  2020-09-10 19:18 ` Chris Murphy
  0 siblings, 1 reply; 5+ messages in thread
From: Miloslav Hůla @ 2020-09-09  8:15 UTC (permalink / raw)
  To: linux-btrfs

Hello,

we are using btrfs RAID-10 (/data, 4.7 TB) on a physical Supermicro
server with an Intel(R) Xeon(R) CPU E5-2620 v4 @ 2.10GHz and 125 GB of RAM.

We added two more drives:

btrfs device add /dev/sds /data
btrfs device add /dev/sdt /data

and then we ran a balance:

btrfs filesystem balance /data
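
In hindsight, we should have captured the per-device allocation right
after adding the devices and before starting the balance, e.g. with the
standard reporting commands (we did not actually save this output at
the time):

btrfs filesystem show /data
btrfs filesystem usage /data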


Everything ran fine for about 15 hours. Then we got a "No space left"
error (dmesg output below) and /data was remounted read-only.

So we unmounted /data and mounted it again; everything looked OK in
read-write. Then we ran a full balance again:

btrfs filesystem balance /data
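
Progress of a running balance can also be checked from a second
terminal with the standard subcommand (mentioned here just for
completeness):

btrfs balance status /data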

After ~15.5 hours it finished successfully. Unfortunately, I have no
exact free space report from before the first balance, but it looked
roughly like this:

Label: 'DATA'  uuid: 5b285a46-e55d-4191-924f-0884fa06edd8
         Total devices 16 FS bytes used 3.49TiB
         devid    1 size 558.41GiB used 448.66GiB path /dev/sda
         devid    2 size 558.41GiB used 448.66GiB path /dev/sdb
         devid    4 size 558.41GiB used 448.66GiB path /dev/sdd
         devid    5 size 558.41GiB used 448.66GiB path /dev/sde
         devid    7 size 558.41GiB used 448.66GiB path /dev/sdg
         devid    8 size 558.41GiB used 448.66GiB path /dev/sdh
         devid    9 size 558.41GiB used 448.66GiB path /dev/sdf
         devid   10 size 558.41GiB used 448.66GiB path /dev/sdi
         devid   11 size 558.41GiB used 448.66GiB path /dev/sdj
         devid   13 size 558.41GiB used 448.66GiB path /dev/sdk
         devid   14 size 558.41GiB used 448.66GiB path /dev/sdc
         devid   15 size 558.41GiB used 448.66GiB path /dev/sdl
         devid   16 size 558.41GiB used 448.66GiB path /dev/sdm
         devid   17 size 558.41GiB used 448.66GiB path /dev/sdn
         devid   18 size 837.84GiB used 448.66GiB path /dev/sdr
         devid   19 size 837.84GiB used 448.66GiB path /dev/sdq
         devid   20 size 837.84GiB used   0.00GiB path /dev/sds
         devid   21 size 837.84GiB used   0.00GiB path /dev/sdt


Are we doing something wrong? I found posts describing problems with
balancing a full filesystem, where the recommendation is to "add an
empty device, run balance, remove the device".

Are there some free space requirements for a balance even if you add a
new device?

Maybe it is "fixed" in a newer kernel?


Thank you. With kind regards
Milo



# uname -a
Linux imap 4.9.0-12-amd64 #1 SMP Debian 4.9.210-1 (2020-01-20) x86_64 
GNU/Linux


# btrfs --version
btrfs-progs v4.7.3


# btrfs fi show (after successful balance)
Label: 'DATA'  uuid: 5b285a46-e55d-4191-924f-0884fa06edd8
         Total devices 18 FS bytes used 3.49TiB
         devid    1 size 558.41GiB used 398.27GiB path /dev/sda
         devid    2 size 558.41GiB used 398.27GiB path /dev/sdb
         devid    4 size 558.41GiB used 398.27GiB path /dev/sdd
         devid    5 size 558.41GiB used 398.27GiB path /dev/sde
         devid    7 size 558.41GiB used 398.27GiB path /dev/sdg
         devid    8 size 558.41GiB used 398.27GiB path /dev/sdh
         devid    9 size 558.41GiB used 398.27GiB path /dev/sdf
         devid   10 size 558.41GiB used 398.27GiB path /dev/sdi
         devid   11 size 558.41GiB used 398.27GiB path /dev/sdj
         devid   13 size 558.41GiB used 398.27GiB path /dev/sdk
         devid   14 size 558.41GiB used 398.27GiB path /dev/sdc
         devid   15 size 558.41GiB used 398.27GiB path /dev/sdl
         devid   16 size 558.41GiB used 398.27GiB path /dev/sdm
         devid   17 size 558.41GiB used 398.27GiB path /dev/sdn
         devid   18 size 837.84GiB used 398.27GiB path /dev/sdr
         devid   19 size 837.84GiB used 398.27GiB path /dev/sdq
         devid   20 size 837.84GiB used 398.27GiB path /dev/sds
         devid   21 size 837.84GiB used 398.27GiB path /dev/sdt


# btrfs fi df /data/ (after successful balance)
Data, RAID10: total=3.48TiB, used=3.48TiB
System, RAID10: total=144.00MiB, used=304.00KiB
Metadata, RAID10: total=20.25GiB, used=17.96GiB
GlobalReserve, single: total=512.00MiB, used=0.00B


# dmesg
[Tue Sep  8 06:45:36 2020] BTRFS info (device sda): relocating block 
group 33991231012864 flags 68
[Tue Sep  8 06:47:56 2020] BTRFS info (device sda): found 63581 extents
[Tue Sep  8 06:47:58 2020] BTRFS info (device sda): relocating block 
group 33990157271040 flags 68
[Tue Sep  8 06:48:34 2020] ------------[ cut here ]------------
[Tue Sep  8 06:48:34 2020] WARNING: CPU: 21 PID: 1342 at 
/build/linux-xZ5nrU/linux-4.9.210/fs/btrfs/extent-tree.c:2967 
btrfs_run_delayed_refs+0x27e/0x2b0 [btrfs]
[Tue Sep  8 06:48:34 2020] BTRFS: Transaction aborted (error -28)
[Tue Sep  8 06:48:34 2020] Modules linked in: fuse ufs qnx4 hfsplus hfs 
minix ntfs vfat msdos fat jfs xfs dm_mod ipt_REJECT nf_reject_ipv4 nfsv3 
nfs_acl nfs lockd grace fscache xt_multiport iptable_filter intel_rapl 
sb_edac edac_core x86_pkg_temp_thermal intel_powerclamp coretemp 
kvm_intel kvm irqbypass crct10dif_pclmul crc32_pclmul 
ghash_clmulni_intel intel_cstate iTCO_wdt iTCO_vendor_support mxm_wmi 
intel_uncore ast intel_rapl_perf pcspkr ttm drm_kms_helper drm mei_me 
lpc_ich joydev sg mfd_core mei evdev ioatdma shpchp ipmi_si 
ipmi_msghandler wmi acpi_power_meter acpi_pad button sunrpc ip_tables 
x_tables autofs4 ext4 crc16 jbd2 fscrypto ecb mbcache btrfs raid10 
raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor 
raid6_pq libcrc32c crc32c_generic raid0 multipath linear raid1 md_mod 
ses enclosure
[Tue Sep  8 06:48:34 2020]  scsi_transport_sas sd_mod hid_generic usbhid 
hid crc32c_intel aesni_intel aes_x86_64 glue_helper lrw gf128mul 
ablk_helper cryptd ahci xhci_pci libahci ehci_pci xhci_hcd ehci_hcd igb 
i2c_i801 libata megaraid_sas i2c_smbus i2c_algo_bit usbcore dca ptp 
usb_common pps_core scsi_mod
[Tue Sep  8 06:48:34 2020] CPU: 21 PID: 1342 Comm: btrfs-transacti Not 
tainted 4.9.0-12-amd64 #1 Debian 4.9.210-1
[Tue Sep  8 06:48:34 2020] Hardware name: Supermicro X10DRi/X10DRi, BIOS 
2.1 09/13/2016
[Tue Sep  8 06:48:34 2020]  0000000000000000 ffffffff97936bfe 
ffffb0d54d8a7d68 0000000000000000
[Tue Sep  8 06:48:34 2020]  ffffffff9767b98b ffffffffc04cbc28 
ffffb0d54d8a7dc0 ffff8f09538faa48
[Tue Sep  8 06:48:34 2020]  ffff8f1c38030000 ffff8f1c36f830c0 
000000000023d11a ffffffff9767ba0f
[Tue Sep  8 06:48:34 2020] Call Trace:
[Tue Sep  8 06:48:34 2020]  [<ffffffff97936bfe>] ? dump_stack+0x66/0x88
[Tue Sep  8 06:48:34 2020]  [<ffffffff9767b98b>] ? __warn+0xcb/0xf0
[Tue Sep  8 06:48:34 2020]  [<ffffffff9767ba0f>] ? 
warn_slowpath_fmt+0x5f/0x80
[Tue Sep  8 06:48:34 2020]  [<ffffffffc04295fe>] ? 
btrfs_run_delayed_refs+0x27e/0x2b0 [btrfs]
[Tue Sep  8 06:48:34 2020]  [<ffffffffc043f823>] ? 
btrfs_commit_transaction+0x53/0xa10 [btrfs]
[Tue Sep  8 06:48:34 2020]  [<ffffffffc0440276>] ? 
start_transaction+0x96/0x480 [btrfs]
[Tue Sep  8 06:48:34 2020]  [<ffffffffc043a67c>] ? 
transaction_kthread+0x1dc/0x200 [btrfs]
[Tue Sep  8 06:48:34 2020]  [<ffffffffc043a4a0>] ? 
btrfs_cleanup_transaction+0x590/0x590 [btrfs]
[Tue Sep  8 06:48:34 2020]  [<ffffffff9769bdc9>] ? kthread+0xd9/0xf0
[Tue Sep  8 06:48:34 2020]  [<ffffffff97c1e2f1>] ? __switch_to_asm+0x41/0x70
[Tue Sep  8 06:48:34 2020]  [<ffffffff9769bcf0>] ? kthread_park+0x60/0x60
[Tue Sep  8 06:48:34 2020]  [<ffffffff97c1e377>] ? ret_from_fork+0x57/0x70
[Tue Sep  8 06:48:34 2020] ---[ end trace 7be8e0e64ac370b3 ]---
[Tue Sep  8 06:48:34 2020] BTRFS: error (device sda) in 
btrfs_run_delayed_refs:2967: errno=-28 No space left
[Tue Sep  8 06:48:34 2020] BTRFS info (device sda): forced readonly
[Tue Sep  8 07:31:16 2020] BTRFS info (device sda): disk space caching 
is enabled
[Tue Sep  8 07:31:16 2020] BTRFS error (device sda): Remounting 
read-write after error is not allowed
[Tue Sep  8 07:32:50 2020] BTRFS error (device sda): cleaner transaction 
attach returned -30
[Tue Sep  8 07:33:01 2020] BTRFS info (device sdt): disk space caching 
is enabled
[Tue Sep  8 07:33:01 2020] BTRFS info (device sdt): has skinny extents
[Tue Sep  8 07:35:38 2020] BTRFS info (device sdt): checking UUID tree
[Tue Sep  8 07:35:38 2020] BTRFS info (device sdt): continuing balance
[Tue Sep  8 07:35:38 2020] BTRFS info (device sdt): relocating block 
group 40004587880448 flags 68
[Tue Sep  8 07:38:49 2020] BTRFS info (device sdt): found 32320 extents
[Tue Sep  8 07:38:50 2020] BTRFS info (device sdt): relocating block 
group 39945397862400 flags 68
[Tue Sep  8 07:42:07 2020] BTRFS info (device sdt): found 53227 extents
[Tue Sep  8 07:42:08 2020] BTRFS info (device sdt): relocating block 
group 39944189902848 flags 68
[Tue Sep  8 07:44:29 2020] BTRFS info (device sdt): found 57961 extents
[Tue Sep  8 07:44:30 2020] BTRFS info (device sdt): relocating block 
group 33985728086016 flags 66
[Tue Sep  8 07:44:30 2020] BTRFS info (device sdt): found 20 extents


* Re: No space left after add device and balance
  2020-09-09  8:15 No space left after add device and balance Miloslav Hůla
@ 2020-09-10 19:18 ` Chris Murphy
  2020-09-11 10:29   ` Miloslav Hůla
  0 siblings, 1 reply; 5+ messages in thread
From: Chris Murphy @ 2020-09-10 19:18 UTC (permalink / raw)
  To: Miloslav Hůla; +Cc: Btrfs BTRFS

On Wed, Sep 9, 2020 at 2:15 AM Miloslav Hůla <miloslav.hula@gmail.com> wrote:

> After ~15.5 hours it finished successfully. Unfortunately, I have no
> exact free space report from before the first balance, but it looked
> roughly like this:
>
> Label: 'DATA'  uuid: 5b285a46-e55d-4191-924f-0884fa06edd8
>          Total devices 16 FS bytes used 3.49TiB
>          devid    1 size 558.41GiB used 448.66GiB path /dev/sda
>          devid    2 size 558.41GiB used 448.66GiB path /dev/sdb
>          devid    4 size 558.41GiB used 448.66GiB path /dev/sdd
>          devid    5 size 558.41GiB used 448.66GiB path /dev/sde
>          devid    7 size 558.41GiB used 448.66GiB path /dev/sdg
>          devid    8 size 558.41GiB used 448.66GiB path /dev/sdh
>          devid    9 size 558.41GiB used 448.66GiB path /dev/sdf
>          devid   10 size 558.41GiB used 448.66GiB path /dev/sdi
>          devid   11 size 558.41GiB used 448.66GiB path /dev/sdj
>          devid   13 size 558.41GiB used 448.66GiB path /dev/sdk
>          devid   14 size 558.41GiB used 448.66GiB path /dev/sdc
>          devid   15 size 558.41GiB used 448.66GiB path /dev/sdl
>          devid   16 size 558.41GiB used 448.66GiB path /dev/sdm
>          devid   17 size 558.41GiB used 448.66GiB path /dev/sdn
>          devid   18 size 837.84GiB used 448.66GiB path /dev/sdr
>          devid   19 size 837.84GiB used 448.66GiB path /dev/sdq
>          devid   20 size 837.84GiB used   0.00GiB path /dev/sds
>          devid   21 size 837.84GiB used   0.00GiB path /dev/sdt
>
>
> Are we doing something wrong? I found posts describing problems with
> balancing a full filesystem, where the recommendation is to "add an
> empty device, run balance, remove the device".

It's raid10, so in this case, you probably need to add 4 devices. It's
not required they be equal sizes but it's ideal.

> Are there some free space requirements for a balance even if you add a
> new device?

The free space is reported upon adding the two devices; but the raid10
profile requires more than just free space, it needs free space on
four devices.
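
A quick way to check whether that condition holds is to look at the
per-device unallocated figures with the standard reporting command
(shown here only as an illustration):

btrfs filesystem usage /data

The last section of its output lists unallocated space for every
device; for raid10 you want a reasonable amount on at least four of
them.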

> [Tue Sep  8 06:48:34 2020] BTRFS info (device sda): forced readonly
> [Tue Sep  8 07:31:16 2020] BTRFS info (device sda): disk space caching
> is enabled
> [Tue Sep  8 07:31:16 2020] BTRFS error (device sda): Remounting
> read-write after error is not allowed

Yeah it's confused. You need to reboot.
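
Also note that an interrupted balance resumes automatically on the
next mount. If you want to inspect things first, one possible sequence
after the reboot (skip_balance and the resume/cancel subcommands are
standard btrfs features; device and mount point are taken from your
report):

mount -o skip_balance /dev/sda /data
btrfs balance status /data
btrfs balance resume /data    # or: btrfs balance cancel /data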


-- 
Chris Murphy


* Re: No space left after add device and balance
  2020-09-10 19:18 ` Chris Murphy
@ 2020-09-11 10:29   ` Miloslav Hůla
  2020-09-11 15:33     ` Zygo Blaxell
  0 siblings, 1 reply; 5+ messages in thread
From: Miloslav Hůla @ 2020-09-11 10:29 UTC (permalink / raw)
  To: Chris Murphy; +Cc: Btrfs BTRFS

On 10.09.2020 at 21:18, Chris Murphy wrote:
> On Wed, Sep 9, 2020 at 2:15 AM Miloslav Hůla <miloslav.hula@gmail.com> wrote:
> 
>> After ~15.5 hours it finished successfully. Unfortunately, I have no
>> exact free space report from before the first balance, but it looked
>> roughly like this:
>>
>> Label: 'DATA'  uuid: 5b285a46-e55d-4191-924f-0884fa06edd8
>>           Total devices 16 FS bytes used 3.49TiB
>>           devid    1 size 558.41GiB used 448.66GiB path /dev/sda
>>           devid    2 size 558.41GiB used 448.66GiB path /dev/sdb
>>           devid    4 size 558.41GiB used 448.66GiB path /dev/sdd
>>           devid    5 size 558.41GiB used 448.66GiB path /dev/sde
>>           devid    7 size 558.41GiB used 448.66GiB path /dev/sdg
>>           devid    8 size 558.41GiB used 448.66GiB path /dev/sdh
>>           devid    9 size 558.41GiB used 448.66GiB path /dev/sdf
>>           devid   10 size 558.41GiB used 448.66GiB path /dev/sdi
>>           devid   11 size 558.41GiB used 448.66GiB path /dev/sdj
>>           devid   13 size 558.41GiB used 448.66GiB path /dev/sdk
>>           devid   14 size 558.41GiB used 448.66GiB path /dev/sdc
>>           devid   15 size 558.41GiB used 448.66GiB path /dev/sdl
>>           devid   16 size 558.41GiB used 448.66GiB path /dev/sdm
>>           devid   17 size 558.41GiB used 448.66GiB path /dev/sdn
>>           devid   18 size 837.84GiB used 448.66GiB path /dev/sdr
>>           devid   19 size 837.84GiB used 448.66GiB path /dev/sdq
>>           devid   20 size 837.84GiB used   0.00GiB path /dev/sds
>>           devid   21 size 837.84GiB used   0.00GiB path /dev/sdt
>>
>>
>> Are we doing something wrong? I found posts describing problems with
>> balancing a full filesystem, where the recommendation is to "add an
>> empty device, run balance, remove the device".
> 
> It's raid10, so in this case, you probably need to add 4 devices. It's
> not required they be equal sizes but it's ideal.
> 
>> Are there some free space requirements for a balance even if you add a
>> new device?
> 
> The free space is reported upon adding the two devices; but the raid10
> profile requires more than just free space, it needs free space on
> four devices.

I didn't realize that. It makes sense to me now. So we are probably
"wrong" with 18 devices. A multiple of 4 would be better, I guess.

Thank you!


* Re: No space left after add device and balance
  2020-09-11 10:29   ` Miloslav Hůla
@ 2020-09-11 15:33     ` Zygo Blaxell
  2020-09-15 10:08       ` Miloslav Hůla
  0 siblings, 1 reply; 5+ messages in thread
From: Zygo Blaxell @ 2020-09-11 15:33 UTC (permalink / raw)
  To: Miloslav Hůla; +Cc: Chris Murphy, Btrfs BTRFS

On Fri, Sep 11, 2020 at 12:29:32PM +0200, Miloslav Hůla wrote:
> On 10.09.2020 at 21:18, Chris Murphy wrote:
> > On Wed, Sep 9, 2020 at 2:15 AM Miloslav Hůla <miloslav.hula@gmail.com> wrote:
> > 
> > > After ~15.5 hours it finished successfully. Unfortunately, I have no
> > > exact free space report from before the first balance, but it looked
> > > roughly like this:
> > > 
> > > Label: 'DATA'  uuid: 5b285a46-e55d-4191-924f-0884fa06edd8
> > >           Total devices 16 FS bytes used 3.49TiB
> > >           devid    1 size 558.41GiB used 448.66GiB path /dev/sda
> > >           devid    2 size 558.41GiB used 448.66GiB path /dev/sdb
> > >           devid    4 size 558.41GiB used 448.66GiB path /dev/sdd
> > >           devid    5 size 558.41GiB used 448.66GiB path /dev/sde
> > >           devid    7 size 558.41GiB used 448.66GiB path /dev/sdg
> > >           devid    8 size 558.41GiB used 448.66GiB path /dev/sdh
> > >           devid    9 size 558.41GiB used 448.66GiB path /dev/sdf
> > >           devid   10 size 558.41GiB used 448.66GiB path /dev/sdi
> > >           devid   11 size 558.41GiB used 448.66GiB path /dev/sdj
> > >           devid   13 size 558.41GiB used 448.66GiB path /dev/sdk
> > >           devid   14 size 558.41GiB used 448.66GiB path /dev/sdc
> > >           devid   15 size 558.41GiB used 448.66GiB path /dev/sdl
> > >           devid   16 size 558.41GiB used 448.66GiB path /dev/sdm
> > >           devid   17 size 558.41GiB used 448.66GiB path /dev/sdn
> > >           devid   18 size 837.84GiB used 448.66GiB path /dev/sdr
> > >           devid   19 size 837.84GiB used 448.66GiB path /dev/sdq
> > >           devid   20 size 837.84GiB used   0.00GiB path /dev/sds
> > >           devid   21 size 837.84GiB used   0.00GiB path /dev/sdt
> > > 
> > > 
> > > Are we doing something wrong? I found posts describing problems with
> > > balancing a full filesystem, where the recommendation is to "add an
> > > empty device, run balance, remove the device".
> > 
> > It's raid10, so in this case, you probably need to add 4 devices. It's
> > not required they be equal sizes but it's ideal.

Something is wrong there.  Each new balanced chunk will free space
on the first 16 drives (each new chunk is 9GB, each old one is 8GB,
so the number of chunks on each of the old disks required to hold the
same data decreases by 1/8th each time a chunk is relocated in balance).
Every drive had at least 100GB of unallocated space at the start, so the
first 9GB chunk should have been allocated without issue.  Assuming nobody
was aggressively writing to the disk during the balance to consume all
available space, it should not have run out of space in a full balance.

You might have hit a metadata space reservation bug, especially on an
older kernel.  It's hard to know what happened without a log of the
'btrfs fi usage' data over time and a stack trace of the ENOSPC event,
but whatever it was, it was probably fixed some time in the last 4
years.
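
If it happens again, a trivial logger started before the balance is
usually enough to capture that history (a sketch only; the interval and
log path are arbitrary):

while sleep 300; do
        date
        btrfs filesystem usage /data
done >> /var/log/btrfs-usage.log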

There's a bug (up to at least 5.4) where scrub locks chunks and triggers
aggressive metadata overallocations, which can lead to this result.
It forms a feedback loop where scrub keeps locking the metadata chunk that
balance is using, so balance allocates another metadata chunk, then scrub
moves on and locks that new metadata chunk, repeat until out of space,
abort transaction, all the allocations get rolled back and disappear.
Only relevant if you were running a scrub at the same time as balance.
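
That case is easy to rule out at the time with the standard status
subcommands (listed here just for completeness):

btrfs scrub status /data
btrfs balance status /data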

> > > Are there some free space requirements for a balance even if you add a
> > > new device?
> > 
> > The free space is reported upon adding the two devices; but the raid10
> > profile requires more than just free space, it needs free space on
> > four devices.
> 
> I didn't realize that. It makes sense to me now. So we are probably
> "wrong" with 18 devices. A multiple of 4 would be better, I guess.

Not really.  The btrfs "raid10" profile distributes stripes over pairs of
mirrored drives.  It will allocate raid10 chunks on 4 + N * 2 drives at a
time, but each chunk can use different drives so all the space is filled.
Any number of disks above 4 is OK (including odd numbers), but sequential
read performance will only increase when the number of drives increases
to the next even number so the next highest stripe width can be used.

i.e. 5 drives will provide more space than 4, but 5 drives will not be
significantly faster than 4.  6 is faster and larger than 4 or 5.
7 will be larger, 8 will be larger and faster, 9 will be larger, etc.
(assuming all the drives are identical)

At some point there's a chunk size limit that kicks in and limits the
number of drives in a chunk, but I'm not sure what the limit is (the
kernel code does math on struct and block sizes and the result isn't
obvious to me).  I think it ends up being 10 of something, but not sure
if the unit is drives (in which case your 16 drives were already over
the limit) or gigabytes of logical chunk size (for raid10 that means
20 drives).  Once that limit is reached, adding more drives only increases
space and does not improve sequential read performance any further.
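
One way to see how many devices each chunk actually spans is to dump
the chunk tree and look at the num_stripes values (needs a reasonably
recent btrfs-progs; read-only, but the output can be large):

btrfs inspect-internal dump-tree -t chunk /dev/sda | grep num_stripes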

> Thank you!


* Re: No space left after add device and balance
  2020-09-11 15:33     ` Zygo Blaxell
@ 2020-09-15 10:08       ` Miloslav Hůla
  0 siblings, 0 replies; 5+ messages in thread
From: Miloslav Hůla @ 2020-09-15 10:08 UTC (permalink / raw)
  To: Zygo Blaxell; +Cc: Chris Murphy, Btrfs BTRFS

On 11.09.2020 at 17:33, Zygo Blaxell wrote:
> On Fri, Sep 11, 2020 at 12:29:32PM +0200, Miloslav Hůla wrote:
>> On 10.09.2020 at 21:18, Chris Murphy wrote:
>>> On Wed, Sep 9, 2020 at 2:15 AM Miloslav Hůla <miloslav.hula@gmail.com> wrote:
>>>
>>>> After ~15.5 hours it finished successfully. Unfortunately, I have no
>>>> exact free space report from before the first balance, but it looked
>>>> roughly like this:
>>>>
>>>> Label: 'DATA'  uuid: 5b285a46-e55d-4191-924f-0884fa06edd8
>>>>            Total devices 16 FS bytes used 3.49TiB
>>>>            devid    1 size 558.41GiB used 448.66GiB path /dev/sda
>>>>            devid    2 size 558.41GiB used 448.66GiB path /dev/sdb
>>>>            devid    4 size 558.41GiB used 448.66GiB path /dev/sdd
>>>>            devid    5 size 558.41GiB used 448.66GiB path /dev/sde
>>>>            devid    7 size 558.41GiB used 448.66GiB path /dev/sdg
>>>>            devid    8 size 558.41GiB used 448.66GiB path /dev/sdh
>>>>            devid    9 size 558.41GiB used 448.66GiB path /dev/sdf
>>>>            devid   10 size 558.41GiB used 448.66GiB path /dev/sdi
>>>>            devid   11 size 558.41GiB used 448.66GiB path /dev/sdj
>>>>            devid   13 size 558.41GiB used 448.66GiB path /dev/sdk
>>>>            devid   14 size 558.41GiB used 448.66GiB path /dev/sdc
>>>>            devid   15 size 558.41GiB used 448.66GiB path /dev/sdl
>>>>            devid   16 size 558.41GiB used 448.66GiB path /dev/sdm
>>>>            devid   17 size 558.41GiB used 448.66GiB path /dev/sdn
>>>>            devid   18 size 837.84GiB used 448.66GiB path /dev/sdr
>>>>            devid   19 size 837.84GiB used 448.66GiB path /dev/sdq
>>>>            devid   20 size 837.84GiB used   0.00GiB path /dev/sds
>>>>            devid   21 size 837.84GiB used   0.00GiB path /dev/sdt
>>>>
>>>>
>>>> Are we doing something wrong? I found posts describing problems with
>>>> balancing a full filesystem, where the recommendation is to "add an
>>>> empty device, run balance, remove the device".
>>>
>>> It's raid10, so in this case, you probably need to add 4 devices. It's
>>> not required they be equal sizes but it's ideal.
> 
> Something is wrong there.  Each new balanced chunk will free space
> on the first 16 drives (each new chunk is 9GB, each old one is 8GB,
> so the number of chunks on each of the old disks required to hold the
> same data decreases by 1/8th each time a chunk is relocated in balance).
> Every drive had at least 100GB of unallocated space at the start, so the
> first 9GB chunk should have been allocated without issue.  Assuming nobody
> was aggressively writing to the disk during the balance to consume all
> available space, it should not have run out of space in a full balance.
> 
> You might have hit a metadata space reservation bug, especially on an
> older kernel.  It's hard to know what happened without a log of the
> 'btrfs fi usage' data over time and a stack trace of the ENOSPC event,
> but whatever it was, it was probably fixed some time in the last 4
> years.

It happened with 4.9.210-1 (it's ~4 years old); we are now running
4.19.132-1 after the Debian Buster upgrade.

> There's a bug (up to at least 5.4) where scrub locks chunks and triggers
> aggressive metadata overallocations, which can lead to this result.
> It forms a feedback loop where scrub keeps locking the metadata chunk that
> balance is using, so balance allocates another metadata chunk, then scrub
> moves on and locks that new metadata chunk, repeat until out of space,
> abort transaction, all the allocations get rolled back and disappear.
> Only relevant if you were running a scrub at the same time as balance.

We were not running a scrub at the time of the crash. But we were
running an rsync of the whole btrfs to a NetApp via NFS at the same
time.

>>>> Are there some free space requirements for a balance even if you add a
>>>> new device?
>>>
>>> The free space is reported upon adding the two devices; but the raid10
>>> profile requires more than just free space, it needs free space on
>>> four devices.
>>
>> I didn't realize that. It makes sense to me now. So we are probably
>> "wrong" with 18 devices. A multiple of 4 would be better, I guess.
> 
> Not really.  The btrfs "raid10" profile distributes stripes over pairs of
> mirrored drives.  It will allocate raid10 chunks on 4 + N * 2 drives at a
> time, but each chunk can use different drives so all the space is filled.
> Any number of disks above 4 is OK (including odd numbers), but sequential
> read performance will only increase when the number of drives increases
> to the next even number so the next highest stripe width can be used.
> 
> i.e. 5 drives will provide more space than 4, but 5 drives will not be
> significantly faster than 4.  6 is faster and larger than 4 or 5.
> 7 will be larger, 8 will be larger and faster, 9 will be larger, etc.
> (assuming all the drives are identical)

Thanks for the clarification. That is what I used to think; it was one
of the pros in our decision for btrfs RAID-10 a few years ago. I'm glad
that's right.

