* ENOSPC with mkdir and rename
@ 2014-08-02 23:35 Peter Waller
  2014-08-03  0:28 ` Mitch Harder
                   ` (2 more replies)
  0 siblings, 3 replies; 44+ messages in thread
From: Peter Waller @ 2014-08-02 23:35 UTC (permalink / raw)
  To: linux-btrfs

Hi All,

My TL;DR questions are at the bottom, before the stack trace.

I'm running Ubuntu 14.04. I wonder if this problem is related to the
thread titled "Machine lockup due to btrfs-transaction on AWS EC2
Ubuntu 14.04" which I started on the 29th of July:

> http://thread.gmane.org/gmane.comp.file-systems.btrfs/37224

Kernel: 3.15.7-031507-generic

I'm on a single block device system, i.e., no RAID.

I was observing ENOSPC from `mkdir` and `rename` on this system, with
a good amount of free disk space (df -h reports 62 GB remain). I added
enospc_debug (full umount/mount, not just mount -o remount), but this
had no apparent effect when receiving ENOSPC from userland.

$ sudo btrfs fi df /path/to/volume
Data, single: total=489.97GiB, used=427.75GiB
System, DUP: total=8.00MiB, used=60.00KiB
System, single: total=4.00MiB, used=0.00
Metadata, DUP: total=5.00GiB, used=4.50GiB
Metadata, single: total=8.00MiB, used=0.00
unknown, single: total=512.00MiB, used=820.00KiB

After a thorough search of the internet for "ENOSPC BTRFS" I found
various resources and came to understand a little bit more. One thing
which severely broke my intuition: I expected that as long as there was
a large number of free GiB, things would continue to work.

In this case, for example, metadata has 0.5GiB free ("sounds like
plenty for metadata for one mkdir to me"). Data has 62GiB free. Why
would I get ENOSPC for a file rename?

I expected that if metadata needed more space, it would just eat it
from the 'data'. Now I believe this not to be the case and that it
wanted to allocate > 0.5GiB, and this is why I was getting ENOSPC.
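
To make the numbers above concrete, here is a minimal Python sketch that parses the `btrfs fi df` listing and computes total minus used for each line. The function names are made up for illustration, and the regular expression only assumes the output format shown in this message.

```python
import re

# Output copied from the `btrfs fi df` listing earlier in this message.
FI_DF = """\
Data, single: total=489.97GiB, used=427.75GiB
System, DUP: total=8.00MiB, used=60.00KiB
System, single: total=4.00MiB, used=0.00
Metadata, DUP: total=5.00GiB, used=4.50GiB
Metadata, single: total=8.00MiB, used=0.00
unknown, single: total=512.00MiB, used=820.00KiB
"""

UNITS = {"": 1, "B": 1, "KiB": 2**10, "MiB": 2**20, "GiB": 2**30, "TiB": 2**40}
LINE = re.compile(r"^(\w+), (\w+): total=([\d.]+)(\w*), used=([\d.]+)(\w*)$")


def parse_fi_df(text):
    """Return one dict per line with sizes converted to bytes."""
    rows = []
    for line in text.splitlines():
        match = LINE.match(line.strip())
        if not match:
            continue  # skip anything that is not in the expected format
        kind, profile, total, total_unit, used, used_unit = match.groups()
        rows.append({
            "kind": kind,
            "profile": profile,
            "total": float(total) * UNITS[total_unit],
            "used": float(used) * UNITS[used_unit],
        })
    return rows


def free_gib(row):
    """Allocated-but-unused space of one line, in GiB."""
    return (row["total"] - row["used"]) / 2**30


for row in parse_fi_df(FI_DF):
    print(f'{row["kind"]:>8} {row["profile"]:<6} free={free_gib(row):.2f} GiB')
```

The per-line figures are independent pools: free space on the Data line is not available to Metadata, which is why the two have to be read separately.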

I tried a rebalance with btrfs balance start -dusage=10 and tried
increasing the value until I saw reallocations in dmesg.

This spat out a large number of messages in dmesg, of this form:

> [376096.546353] BTRFS info (device dm-0): relocating block group 530457821184 flags 1
> [376010.736879] BTRFS info (device dm-0): 40 enospc errors during balance

(and a full stack trace at the end of this message).

The rebalance printed:

> ERROR: error during balancing '/path/to/volume' - No space left on device
> There may be more info in syslog - try dmesg | tail

Eventually, not knowing what else to do I had to take my escape hatch
and enlarge the volume. When I did this, metadata grew by 1GiB:

> Data, single: total=490.97GiB, used=427.75GiB
> System, DUP: total=8.00MiB, used=60.00KiB
> System, single: total=4.00MiB, used=0.00
> Metadata, DUP: total=5.50GiB, used=4.50GiB
> Metadata, single: total=8.00MiB, used=0.00
> unknown, single: total=512.00MiB, used=0.00

A few questions:

* Why didn't the metadata grow before enlarging the disk?
* Why didn't the rebalance enable the metadata to grow?
* Why is it necessary to rebalance? Can't it automatically take some
free space from 'data'?
* Are my machine lockups related to the fact I was low on space?
* Can we improve the documentation/FAQ for this? I was scratching my
head in particular because my notion of free space definitely does not
match up with BTRFS', and I didn't find the FAQ very helpful for
getting out of this mess.
* It isn't documented on the wiki what enospc_debug is supposed to do,
so I couldn't tell whether I should have expected it to tell me
anything in my circumstances.
* What is the best course of action to take (other than enlarging the
disk or deleting files) if I encounter this situation again?

Thanks in advance,

- Peter

[376007.681938] ------------[ cut here ]------------
[376007.681957] WARNING: CPU: 1 PID: 27021 at /home/apw/COD/linux/fs/btrfs/extent-tree.c:6946 use_block_rsv+0xfd/0x1a0 [btrfs]()
[376007.681958] BTRFS: block rsv returned -28
[376007.681959] Modules linked in: softdog tcp_diag inet_diag dm_crypt ppdev xen_fbfront fb_sys_fops syscopyarea sysfillrect sysimgblt i2c_piix4 serio_raw parport_pc parport mac_hid isofs xt_tcpudp iptable_filter xt_owner ip_tables x_tables btrfs xor raid6_pq crct10dif_pclmul crc32_pclmul ghash_clmulni_intel aesni_intel aes_x86_64 lrw gf128mul glue_helper ablk_helper cryptd floppy psmouse
[376007.681980] CPU: 1 PID: 27021 Comm: pam_script_ses_ Tainted: G    W     3.15.7-031507-generic #201407281235
[376007.681981] Hardware name: Xen HVM domU, BIOS 4.2.amazon 05/23/2014
[376007.681983]  0000000000001b22 ffff8800acca39d8 ffffffff8176f115 0000000000000007
[376007.681986]  ffff8800acca3a28 ffff8800acca3a18 ffffffff8106ceac ffff8801efc37870
[376007.681989]  ffff88017db0ff00 ffff8801aedcd800 0000000000001000 ffff88001c987000
[376007.681992] Call Trace:
[376007.682000]  [<ffffffff8176f115>] dump_stack+0x46/0x58
[376007.682005]  [<ffffffff8106ceac>] warn_slowpath_common+0x8c/0xc0
[376007.682008]  [<ffffffff8106cf96>] warn_slowpath_fmt+0x46/0x50
[376007.682016]  [<ffffffffa00d9d1d>] use_block_rsv+0xfd/0x1a0 [btrfs]
[376007.682024]  [<ffffffffa00de687>] btrfs_alloc_free_block+0x57/0x220 [btrfs]
[376007.682027]  [<ffffffff8178033c>] ? __do_page_fault+0x28c/0x550
[376007.682031]  [<ffffffff8119749f>] ? page_add_file_rmap+0x6f/0xb0
[376007.682037]  [<ffffffffa00c8a3c>] btrfs_copy_root+0xfc/0x2b0 [btrfs]
[376007.682041]  [<ffffffff811c60b9>] ? memcg_check_events+0x29/0x50
[376007.682051]  [<ffffffffa013a583>] ? create_reloc_root+0x33/0x2c0 [btrfs]
[376007.682061]  [<ffffffffa013a743>] create_reloc_root+0x1f3/0x2c0 [btrfs]
[376007.682064]  [<ffffffff811dd073>] ? generic_permission+0xf3/0x120
[376007.682073]  [<ffffffffa0140eb8>] btrfs_init_reloc_root+0xb8/0xd0 [btrfs]
[376007.682082]  [<ffffffffa00ee967>] record_root_in_trans.part.30+0x97/0x100 [btrfs]
[376007.682090]  [<ffffffffa00ee9f4>] record_root_in_trans+0x24/0x30 [btrfs]
[376007.682098]  [<ffffffffa00efeb1>] btrfs_record_root_in_trans+0x51/0x80 [btrfs]
[376007.682106]  [<ffffffffa00f13d6>] start_transaction.part.35+0x86/0x560 [btrfs]
[376007.682109]  [<ffffffff8132c197>] ? apparmor_capable+0x27/0x80
[376007.682117]  [<ffffffffa00f18d9>] start_transaction+0x29/0x30 [btrfs]
[376007.682125]  [<ffffffffa00f19a7>] btrfs_join_transaction+0x17/0x20 [btrfs]
[376007.682133]  [<ffffffffa00f7fa8>] btrfs_dirty_inode+0x58/0xe0 [btrfs]
[376007.682141]  [<ffffffffa00fcaf2>] btrfs_setattr+0xa2/0xf0 [btrfs]
[376007.682144]  [<ffffffff811eec74>] notify_change+0x1c4/0x3b0
[376007.682146]  [<ffffffff811dde96>] ? final_putname+0x26/0x50
[376007.682149]  [<ffffffff811d088d>] chown_common+0x16d/0x1a0
[376007.682153]  [<ffffffff811f2b08>] ? __mnt_want_write+0x58/0x70
[376007.682156]  [<ffffffff811d1a8f>] SyS_fchownat+0xbf/0x100
[376007.682159]  [<ffffffff811d1aed>] SyS_chown+0x1d/0x20
[376007.682163]  [<ffffffff817858bf>] tracesys+0xe1/0xe6
[376007.682165] ---[ end trace 1853311c87a5cd94 ]---


* Re: ENOSPC with mkdir and rename
  2014-08-02 23:35 ENOSPC with mkdir and rename Peter Waller
@ 2014-08-03  0:28 ` Mitch Harder
  2014-08-03  1:52   ` Nick Krause
  2014-08-03  2:39 ` Russell Coker
  2014-08-04  1:38 ` Qu Wenruo
  2 siblings, 1 reply; 44+ messages in thread
From: Mitch Harder @ 2014-08-03  0:28 UTC (permalink / raw)
  To: Peter Waller; +Cc: linux-btrfs

On Sat, Aug 2, 2014 at 6:35 PM, Peter Waller <peter@scraperwiki.com> wrote:
> Hi All,
>
> My TL;DR questions are at the bottom, before the stack trace.
>
> I'm running Ubuntu 14.04. I wonder if this problem is related to the
> thread titled "Machine lockup due to btrfs-transaction on AWS EC2
> Ubuntu 14.04" which I started on the 29th of July:
>
>> http://thread.gmane.org/gmane.comp.file-systems.btrfs/37224
>
> Kernel: 3.15.7-031507-generic
>
> I'm on a single block device system, i.e, no RAID.
>
> I was observing ENOSPC from `mkdir` and `rename` on this system, with
> a good amount of free disk space (df -h reports 62 GB remain). I added
> enospc_debug (full umount/mount, not just mount -o remount), but this
> had no apparent effect when receiving ENOSPC from userland.
>
> $ sudo btrfs fi df /path/to/volume
> Data, single: total=489.97GiB, used=427.75GiB
> System, DUP: total=8.00MiB, used=60.00KiB
> System, single: total=4.00MiB, used=0.00
> Metadata, DUP: total=5.00GiB, used=4.50GiB
> Metadata, single: total=8.00MiB, used=0.00
> unknown, single: total=512.00MiB, used=820.00KiB
>
> After a thorough search of the internet for ENOSPC BTRFS I found
> various resources and came to understand a little bit more. One thing
> which broke my intuition severely is that I expected if there is a
> large number of free GiB, I should expect things to continue to work.
>
> In this case, for example, metadata has 0.5GiB free ("sounds like
> plenty for metadata for one mkdir to me"). Data has 62GiB free. Why
> would I get ENOSPC for a file rename?
>
> I expected that if metadata needed more space, it would just eat it
> from the 'data'. Now I believe this not to be the case and that it
> wanted to allocate > 0.5GiB, and this is why I was getting ENOSPC.
>
> I tried a rebalance with btrfs balance start -dusage=10 and tried
> increasing the value until I saw reallocations in dmesg.
>
> This spat out a large number of messages in dmesg, of this form:
>
>> [376096.546353] BTRFS info (device dm-0): relocating block group 530457821184 flags 1
>> [376010.736879] BTRFS info (device dm-0): 40 enospc errors during balance
>
> (and a full stack trace at the end of this message).
>
> The rebalance printed:
>
>> ERROR: error during balancing '/path/to/volume' - No space left on device
>> There may be more info in syslog - try dmesg | tail
>
> Eventually, not knowing what else to do I had to take my escape hatch
> and enlarge the volume. When I did this, metadata grew by 1GiB:
>
>> Data, single: total=490.97GiB, used=427.75GiB
>> System, DUP: total=8.00MiB, used=60.00KiB
>> System, single: total=4.00MiB, used=0.00
>> Metadata, DUP: total=5.50GiB, used=4.50GiB
>> Metadata, single: total=8.00MiB, used=0.00
>> unknown, single: total=512.00MiB, used=0.00
>
> A few questions:
>
> * Why didn't the metadata grow before enlarging the disk?
> * Why didn't the rebalance enable the metadata to grow?
> * Why is it necessary to rebalance? Can't it automatically take some
> free space from 'data'?
> * Are my machine lockups related to the fact I was low on space?
> * Can we improve the documentation/FAQ for this? I was scratching my
> head in particular because my notion of free space definitely does not
> match up with BTRFS', and I didn't find the FAQ very helpful for
> getting out of this mess.
> * It isn't documented on the wiki what enospc_debug is supposed to do,
> so I couldn't tell whether I should have expected it to tell me
> anything in my circumstances.
> * What is the best course of action to take (other than enlarging the
> disk or deleting files) if I encounter this situation again?
>

Looking at this line:

> Data, single: total=489.97GiB, used=427.75GiB

I see that btrfs has allocated almost the entire disk to Data, and it
appears you are starved for Metadata room.

Once btrfs allocates space for either Data or Metadata, there are
currently no built-in kernel mechanisms to re-allocate that space.  We
have to use the userland balance tools.
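
A rough way to see the starvation is to add up what is already allocated on the device. The Python sketch below does that with the figures from the quoted `btrfs fi df` output. Two things here are assumptions rather than facts from this thread: the 500 GiB device size (it is never stated, only suggested by the totals) and the decision to leave out the "unknown" line on the guess that it is a reservation rather than a separate allocation. DUP lines are counted twice because `btrfs fi df` reports logical sizes.

```python
# Logical allocations in MiB, from the quoted `btrfs fi df` output.
ALLOCATED_MIB = [
    ("Data", "single", 489.97 * 1024),
    ("System", "DUP", 8.0),
    ("System", "single", 4.0),
    ("Metadata", "DUP", 5.00 * 1024),
    ("Metadata", "single", 8.0),
]

# Number of on-disk copies per profile (single device case only).
COPIES = {"single": 1, "DUP": 2}

# ASSUMPTION: the device size is not given in the thread; 500 GiB is a guess.
ASSUMED_DEVICE_GIB = 500.0


def raw_allocated_gib(chunks):
    """Sum the raw (on-disk) space taken by all allocations, in GiB."""
    return sum(size * COPIES[profile] for _, profile, size in chunks) / 1024


def unallocated_gib(device_gib, chunks):
    """Space not yet assigned to Data, Metadata or System."""
    return device_gib - raw_allocated_gib(chunks)


print(f"raw allocated: {raw_allocated_gib(ALLOCATED_MIB):.3f} GiB")
print(f"unallocated:   {unallocated_gib(ASSUMED_DEVICE_GIB, ALLOCATED_MIB):.3f} GiB")
```

Under those assumptions almost nothing is left unallocated, so there is nowhere to place a new Metadata allocation even though the Data allocation itself is far from full.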

I agree that this behavior can become a "gotcha".  Btrfs has the
capability to run in a mode where Data and Metadata are combined, but
there is a speed penalty running in Mixed Data/Metadata mode.

The btrfs balance tools have the ability to use filters to run a
quicker pass on just the mostly-empty blocks, skipping a full balance.

https://btrfs.wiki.kernel.org/index.php/Balance_Filters

I would suggest this as the next step.


* Re: ENOSPC with mkdir and rename
  2014-08-03  0:28 ` Mitch Harder
@ 2014-08-03  1:52   ` Nick Krause
  0 siblings, 0 replies; 44+ messages in thread
From: Nick Krause @ 2014-08-03  1:52 UTC (permalink / raw)
  To: Mitch Harder; +Cc: Peter Waller, linux-btrfs

On Sat, Aug 2, 2014 at 8:28 PM, Mitch Harder
<mitch.harder@sabayonlinux.org> wrote:
> Looking at this line:
>
>> Data, single: total=489.97GiB, used=427.75GiB
>
> I see that btrfs has allocated almost the entire disk to Data, and it
> appears you are starved for Metadata room.
>
> Once btrfs allocates space for either Data or Metadata, there are
> currently no build-in kernel mechanisms re-allocate that space.  We
> have to use the userland balance tools.
>
> I agree that this behavior can become a "gotcha".  Btrfs has the
> capability to run in a mode where Data and Metadata are combined, but
> there is a speed penalty running in Mixed Data/Metadata mode.
>
> The btrfs balance tools have to ability to use filters to run a
> quicker pass on just the mostly-empty blocks, skipping a full balance.
>
> https://btrfs.wiki.kernel.org/index.php/Balance_Filters
>
> I would suggest this as the next step.
Mitch,
I have run into this error too, and it seems to be a rather big issue, as ext4
seems never to run out of metadata room, at least in my testing. I feel strongly
that this part of btrfs needs to be improved and moved into a function or set
of functions for rebalancing metadata in the kernel itself.
Regards, Nick


* Re: ENOSPC with mkdir and rename
  2014-08-02 23:35 ENOSPC with mkdir and rename Peter Waller
  2014-08-03  0:28 ` Mitch Harder
@ 2014-08-03  2:39 ` Russell Coker
  2014-08-03  2:59   ` Nick Krause
  2014-08-04  1:38 ` Qu Wenruo
  2 siblings, 1 reply; 44+ messages in thread
From: Russell Coker @ 2014-08-03  2:39 UTC (permalink / raw)
  To: Peter Waller; +Cc: linux-btrfs

On Sun, 3 Aug 2014 00:35:28 Peter Waller wrote:
> I'm running Ubuntu 14.04. I wonder if this problem is related to the
> thread titled "Machine lockup due to btrfs-transaction on AWS EC2
> 
> Ubuntu 14.04" which I started on the 29th of July:
> > http://thread.gmane.org/gmane.comp.file-systems.btrfs/37224
> 
> Kernel: 3.15.7-031507-generic

As an aside, I'm still on 3.14 kernels for my systems and have no immediate 
plans to use 3.15.  There has been discussion here about a number of problems 
with 3.15, so I don't think that any testing I do with 3.15 will help the 
developers and it will just take more of my time.

> $ sudo btrfs fi df /path/to/volume
> Data, single: total=489.97GiB, used=427.75GiB
> Metadata, DUP: total=5.00GiB, used=4.50GiB

As has been noted, you are using all the space in 1G data chunks and the system
can't allocate more 256M metadata chunks (which are allocated in pairs because
it's "DUP", so 512M is allocated at a time).
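
As a sketch of that arithmetic in Python: the 256M chunk size and the pairing for DUP are the figures given above, while the function names and the sample unallocated values are invented for illustration.

```python
# Metadata chunk size as stated above (MiB).
METADATA_CHUNK_MIB = 256

# On-disk copies per profile; DUP writes every chunk twice.
COPIES = {"single": 1, "DUP": 2}


def raw_needed_mib(profile="DUP"):
    """Raw space one new metadata chunk needs for the given profile."""
    return METADATA_CHUNK_MIB * COPIES[profile]


def can_grow_metadata(unallocated_mib, profile="DUP"):
    """True if enough unallocated space remains for one more metadata chunk."""
    return unallocated_mib >= raw_needed_mib(profile)


# Hypothetical values: a few MiB left versus 2 GiB left after enlarging.
for unallocated in (3, 2048):
    print(unallocated, "MiB unallocated ->", can_grow_metadata(unallocated))
```

The check depends only on unallocated space, not on how empty the existing data chunks are.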

> In this case, for example, metadata has 0.5GiB free ("sounds like
> plenty for metadata for one mkdir to me"). Data has 62GiB free. Why
> would I get ENOSPC for a file rename?

Some space is always reserved.  Due to the way BTRFS works, changes to a file 
require writing a new copy of the tree.  So the amount of metadata space 
required for an operation that is conceptually simple can be significant.

One thing that can sometimes solve that problem is to delete a subvol.  But 
note that it can take a considerable amount of time to free the space, 
particularly if you are running out of metadata space.  So you could delete a 
couple of subvols, run "sync" a couple of times, and have a coffee break.

If possible avoid rebooting as that can make things much worse.  This was a 
particular problem with kernels 3.13 and earlier which could enter a CPU loop 
requiring a reboot and then you would have big problems.

> I tried a rebalance with btrfs balance start -dusage=10 and tried
> increasing the value until I saw reallocations in dmesg.

/sbin/btrfs fi balance start -dusage=30 -musage=10 /

It's a good idea to have a cron job running a rebalance.  Above is what I use 
on some of my systems; it will free data chunks that are up to 30% used and 
metadata chunks that are only 10% used.  It almost never frees metadata chunks 
and regularly frees data chunks, which is what I want.
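
For anyone who prefers to wrap this in a script before handing it to cron, a minimal Python sketch follows. It uses the `btrfs balance start` form with the same `-dusage`/`-musage` filters; the function names and the dry-run default are illustrative choices, not part of btrfs-progs.

```python
import shlex
import subprocess


def balance_command(mountpoint, data_usage=30, metadata_usage=10):
    """Build the argument list for a filtered balance."""
    return [
        "btrfs", "balance", "start",
        f"-dusage={data_usage}",
        f"-musage={metadata_usage}",
        mountpoint,
    ]


def run_balance(mountpoint, dry_run=True, **filters):
    """Print the command (dry run) or execute it and return its exit code."""
    cmd = balance_command(mountpoint, **filters)
    if dry_run:
        print(shlex.join(cmd))
        return 0
    # Needs root; a non-zero exit code is returned rather than raised.
    return subprocess.run(cmd, check=False).returncode


if __name__ == "__main__":
    run_balance("/")
```

Keeping the dry run as the default makes it safe to test the cron entry itself before letting it touch the filesystem.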

> and enlarge the volume. When I did this, metadata grew by 1GiB:
> > Data, single: total=490.97GiB, used=427.75GiB
> > System, DUP: total=8.00MiB, used=60.00KiB
> > System, single: total=4.00MiB, used=0.00
> > Metadata, DUP: total=5.50GiB, used=4.50GiB
> > Metadata, single: total=8.00MiB, used=0.00
> > unknown, single: total=512.00MiB, used=0.00

Now that you have solved that problem you could balance the filesystem 
(deallocating ~60 data chunks) and then shrink it.  In the past I've added a 
USB flash disk to a filesystem to give it enough space to allow a balance and 
then removed it (NB you have to do a btrfs device delete before removing the 
USB stick).

> * Why didn't the metadata grow before enlarging the disk?
> * Why didn't the rebalance enable the metadata to grow?
> * Why is it necessary to rebalance? Can't it automatically take some
> free space from 'data'?

It would be nice if it could automatically rebalance.  It's theoretically 
possible as the btrfs program just asks the kernel to do it.  But there's 
nothing stopping you from having a regular cron job to do it.  You could even 
write a daemon to poll the status of a btrfs filesystem and run balance when 
appropriate if you were keen enough.
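
The decision part of such a daemon could be as small as the Python sketch below. The thresholds are arbitrary placeholders and the function name is hypothetical; the inputs are the total/used figures that `btrfs fi df` prints.

```python
def should_balance(data_total_gib, data_used_gib,
                   meta_total_gib, meta_used_gib,
                   min_data_slack_gib=10.0, max_meta_free_gib=1.0):
    """Suggest a balance when data chunks hold a lot of slack while
    metadata is close to full. Thresholds are placeholders, not advice."""
    data_slack = data_total_gib - data_used_gib
    meta_free = meta_total_gib - meta_used_gib
    return data_slack >= min_data_slack_gib and meta_free <= max_meta_free_gib


# Figures from the original report: lots of data slack, little metadata room.
print(should_balance(489.97, 427.75, 5.00, 4.50))
```

A real poller would also need to avoid starting a balance while one is already running, which this sketch does not attempt.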

> * What is the best course of action to take (other than enlarging the
> disk or deleting files) if I encounter this situation again?

Have a cron job run a balance regularly.

On Sat, 2 Aug 2014 21:52:36 Nick Krause wrote:
> I have run into this error to and this seems to be a rather big issue as
> ext4 seems to never run of metadata room at least from my testing. I feel
> greatly that this part of btrfs needs be improved and moved into a function
> or set of functions for re balancing metadata in the kernel itself.

Ext4 has fixed size Inode tables that are assigned at mkfs time.  If you run 
out of Inodes then you can't create new files.  If you have too big Inode 
tables then you waste disk space and have a longer fsck time (at least before 
uninit_bg).

The other metadata for Ext4 is allocated from data blocks so it will run out 
when data space runs out (EG if mkdir fails due to lack of space on ext4 then 
you can delete a file to make it work).
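
On a filesystem with fixed inode tables the remaining inodes can be checked from userland. A small Python sketch using `os.statvfs` (a standard-library call on Unix) is below; the function name is made up, and filesystems that allocate inodes dynamically may simply report zero here.

```python
import os


def inode_usage(path="/"):
    """Return total and free inode counts as reported by statvfs."""
    st = os.statvfs(path)
    return {"total": st.f_files, "free": st.f_ffree}


usage = inode_usage("/")
print(f'inodes: {usage["free"]} free of {usage["total"]}')
```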

But really BTRFS is just a totally different filesystem.  Ext4 lacks the 
features such as full data checksums and subvolume support that make these 
things difficult.

I always found the CP/M filesystem to be easier.  It was when they added 
support for directories that things started getting difficult.  :-#

-- 
My Main Blog         http://etbe.coker.com.au/
My Documents Blog    http://doc.coker.com.au/





* Re: ENOSPC with mkdir and rename
  2014-08-03  2:39 ` Russell Coker
@ 2014-08-03  2:59   ` Nick Krause
  0 siblings, 0 replies; 44+ messages in thread
From: Nick Krause @ 2014-08-03  2:59 UTC (permalink / raw)
  To: russell; +Cc: Peter Waller, linux-btrfs@vger.kernel.org SYSTEM list:BTRFS FILE

On Sat, Aug 2, 2014 at 10:39 PM, Russell Coker <russell@coker.com.au> wrote:
> On Sun, 3 Aug 2014 00:35:28 Peter Waller wrote:
>> I'm running Ubuntu 14.04. I wonder if this problem is related to the
>> thread titled "Machine lockup due to btrfs-transaction on AWS EC2
>>
>> Ubuntu 14.04" which I started on the 29th of July:
>> > http://thread.gmane.org/gmane.comp.file-systems.btrfs/37224
>>
>> Kernel: 3.15.7-031507-generic
>
> As an aside, I'm still on 3.14 kernels for my systems and have no immediate
> plans to use 3.15.  There has been discussion here about a number of problems
> with 3.15, so I don't think that any testing I do with 3.15 will help the
> developers and it will just take more of my time.
>
>> $ sudo btrfs fi df /path/to/volume
>> Data, single: total=489.97GiB, used=427.75GiB
>> Metadata, DUP: total=5.00GiB, used=4.50GiB
>
> As has been noted you are using all the space in 1G data chunks and the system
> can't allocate more 256M metadata chunks (which are allocated in pairs because
> it's "DUP" so allocating 512M at a time.
>
>> In this case, for example, metadata has 0.5GiB free ("sounds like
>> plenty for metadata for one mkdir to me"). Data has 62GiB free. Why
>> would I get ENOSPC for a file rename?
>
> Some space is always reserved.  Due to the way BTRFS works changes to a file
> requires writing a new copy of the tree.  So the amount of metadata space
> required for an operation that is conceptually simple can be significant.
>
> One thing that can sometimes solve that problem is to delete a subvol.  But
> note that it can take a considerable amount of time to free the space,
> particularly if you are running out of metadata space.  So you could delete a
> couple of subvols, run "sync" a couple of times, and have a coffee break.
>
> If possible avoid rebooting as that can make things much worse.  This was a
> particular problem with kernels 3.13 and earlier which could enter a CPU loop
> requiring a reboot and then you would have big problems.
>
>> I tried a rebalance with btrfs balance start -dusage=10 and tried
>> increasing the value until I saw reallocations in dmesg.
>
> /sbin/btrfs fi balance start -dusage=30 -musage=10 /
>
> It's a good idea to have a cron job running a rebalance.  Above is what I use
> on some of my systems, it will free data chunks that are up to 30% used and
> metadata chunks that are only 10% used.  It almost never frees metadata chunks
> and regularly frees data chunks which is what I want.
>
>> and enlarge the volume. When I did this, metadata grew by 1GiB:
>> > Data, single: total=490.97GiB, used=427.75GiB
>> > System, DUP: total=8.00MiB, used=60.00KiB
>> > System, single: total=4.00MiB, used=0.00
>> > Metadata, DUP: total=5.50GiB, used=4.50GiB
>> > Metadata, single: total=8.00MiB, used=0.00
>> > unknown, single: total=512.00MiB, used=0.00
>
> Now that you have solved that problem you could balance the filesystem
> (deallocating ~60 data chunks) and then shrink it.  In the past I've added a
> USB flash disk to a filesystem to give it enough space to allow a balance and
> then removed it (NB you have to do a btrfs remove before removing the USB
> stick).
>
>> * Why didn't the metadata grow before enlarging the disk?
>> * Why didn't the rebalance enable the metadata to grow?
>> * Why is it necessary to rebalance? Can't it automatically take some
>> free space from 'data'?
>
> It would be nice if it could automatically rebalance.  It's theoretically
> possible as the btrfs program just asks the kernel to do it.  But there's
> nothing stopping you from having a regular cron job to do it.  You could even
> write a daemon to poll the status of a btrfs filesystem and run balance when
> appropriate if you were keen enough.
>
>> * What is the best course of action to take (other than enlarging the
>> disk or deleting files) if I encounter this situation again?
>
> Have a cron job run a balance regularly.
>
> On Sat, 2 Aug 2014 21:52:36 Nick Krause wrote:
>> I have run into this error to and this seems to be a rather big issue as
>> ext4 seems to never run of metadata room at least from my testing. I feel
>> greatly that this part of btrfs needs be improved and moved into a function
>> or set of functions for re balancing metadata in the kernel itself.
>
> Ext4 has fixed size Inode tables that are assigned at mkfs time.  If you run
> out of Inodes then you can't create new files.  If you have too big Inode
> tables then you waste disk space and have a longer fsck time (at least before
> uninit_bg).
>
> The other metadata for Ext4 is allocated from data blocks so it will run out
> when data space runs out (EG if mkdir fails due to lack of space on ext4 then
> you can delete a file to make it work).
>
> But really BTRFS is just a totally different filesystem.  Ext4 lacks the
> features such as full data checksums and subvolume support that make these
> things difficult.
>
> I always found the CP/M filesystem to be easier.  It was when they added
> support for directories that things started getting difficult.  :-#
>
> --
> My Main Blog         http://etbe.coker.com.au/
> My Documents Blog    http://doc.coker.com.au/
>
>
>

No, that's fine; it seems valid now that I've read this message. Thanks again, Russell.
Regards, Nick


* Re: ENOSPC with mkdir and rename
  2014-08-02 23:35 ENOSPC with mkdir and rename Peter Waller
  2014-08-03  0:28 ` Mitch Harder
  2014-08-03  2:39 ` Russell Coker
@ 2014-08-04  1:38 ` Qu Wenruo
  2014-08-04  8:14   ` Peter Waller
  2 siblings, 1 reply; 44+ messages in thread
From: Qu Wenruo @ 2014-08-04  1:38 UTC (permalink / raw)
  To: Peter Waller, linux-btrfs

Hi, Peter

Some explanation inline below.
-------- Original Message --------
Subject: ENOSPC with mkdir and rename
From: Peter Waller <peter@scraperwiki.com>
To: <linux-btrfs@vger.kernel.org>
Date: 2014-08-03 07:35
> Hi All,
>
> My TL;DR questions are at the bottom, before the stack trace.
>
> I'm running Ubuntu 14.04. I wonder if this problem is related to the
> thread titled "Machine lockup due to btrfs-transaction on AWS EC2
> Ubuntu 14.04" which I started on the 29th of July:
>
>> http://thread.gmane.org/gmane.comp.file-systems.btrfs/37224
> Kernel: 3.15.7-031507-generic
>
> I'm on a single block device system, i.e, no RAID.
>
> I was observing ENOSPC from `mkdir` and `rename` on this system, with
> a good amount of free disk space (df -h reports 62 GB remain). I added
> enospc_debug (full umount/mount, not just mount -o remount), but this
> had no apparent effect when receiving ENOSPC from userland.
>
> $ sudo btrfs fi df /path/to/volume
> Data, single: total=489.97GiB, used=427.75GiB
> System, DUP: total=8.00MiB, used=60.00KiB
> System, single: total=4.00MiB, used=0.00
> Metadata, DUP: total=5.00GiB, used=4.50GiB
In fact, all your metadata is used.
It seems strange, since there should be 500MB (to be precise, 512MiB) 
free, but I'll explain it below.
> Metadata, single: total=8.00MiB, used=0.00
> unknown, single: total=512.00MiB, used=820.00KiB
Here the "unknown" is in fact the "global data reserve", reserved for COW 
tree writes (except the FS tree and subvolume trees, if I'm right).
If you use the latest btrfs-progs, it will show "GlobalReserve" instead of 
"unknown". It should not be used in most cases, but here it is 
used, which really shows the shortage of space.

So sadly, there is really no space for metadata for mkdir and rename(*).

*: rename modifies metadata, btrfs does COW for the metadata tree, and 
rename/mkdir will not use space from the global reserve, so ENOSPC is 
normal here.
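
The arithmetic behind "all your metadata is used" can be sketched in Python as below. It assumes the reserve is carved out of the Metadata total, which is how the explanation above reads; the function name is invented for illustration.

```python
def effective_metadata_free_mib(meta_total_mib, meta_used_mib, global_reserve_mib):
    """Metadata space left for ordinary operations once the reserve is set aside.

    ASSUMPTION: the global reserve is taken out of the Metadata allocation.
    """
    return meta_total_mib - meta_used_mib - global_reserve_mib


# Before enlarging: Metadata total=5.00GiB, used=4.50GiB, "unknown" total=512MiB.
before = effective_metadata_free_mib(5.00 * 1024, 4.50 * 1024, 512)

# After enlarging: Metadata total=5.50GiB, used=4.50GiB, "unknown" total=512MiB.
after = effective_metadata_free_mib(5.50 * 1024, 4.50 * 1024, 512)

print(f"before: {before:.0f} MiB, after: {after:.0f} MiB")
```

Under that assumption the apparent 512MiB of free metadata is exactly the reserve, so nothing is left for mkdir or rename until the metadata total grows.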

The good thing is that rm will steal space from the global reserve, so you 
should be OK to remove files and hopefully free
enough metadata space.
Or you can try adding another device to this btrfs.

Thanks,
Qu
>
> After a thorough search of the internet for ENOSPC BTRFS I found
> various resources and came to understand a little bit more. One thing
> which broke my intuition severely is that I expected if there is a
> large number of free GiB, I should expect things to continue to work.
>
> In this case, for example, metadata has 0.5GiB free ("sounds like
> plenty for metadata for one mkdir to me"). Data has 62GiB free. Why
> would I get ENOSPC for a file rename?
>
> I expected that if metadata needed more space, it would just eat it
> from the 'data'. Now I believe this not to be the case and that it
> wanted to allocate > 0.5GiB, and this is why I was getting ENOSPC.
>
> I tried a rebalance with btrfs balance start -dusage=10 and tried
> increasing the value until I saw reallocations in dmesg.
>
> This spat out a large number of messages in dmesg, of this form:
>
>> [376096.546353] BTRFS info (device dm-0): relocating block group 530457821184 flags 1
>> [376010.736879] BTRFS info (device dm-0): 40 enospc errors during balance
> (and a full stack trace at the end of this message).
>
> The rebalance printed:
>
>> ERROR: error during balancing '/path/to/volume' - No space left on device
>> There may be more info in syslog - try dmesg | tail
> Eventually, not knowing what else to do I had to take my escape hatch
> and enlarge the volume. When I did this, metadata grew by 1GiB:
>
>> Data, single: total=490.97GiB, used=427.75GiB
>> System, DUP: total=8.00MiB, used=60.00KiB
>> System, single: total=4.00MiB, used=0.00
>> Metadata, DUP: total=5.50GiB, used=4.50GiB
>> Metadata, single: total=8.00MiB, used=0.00
>> unknown, single: total=512.00MiB, used=0.00
> A few questions:
>
> * Why didn't the metadata grow before enlarging the disk?
> * Why didn't the rebalance enable the metadata to grow?
> * Why is it necessary to rebalance? Can't it automatically take some
> free space from 'data'?
> * Are my machine lockups related to the fact I was low on space?
> * Can we improve the documentation/FAQ for this? I was scratching my
> head in particular because my notion of free space definitely does not
> match up with BTRFS', and I didn't find the FAQ very helpful for
> getting out of this mess.
> * It isn't documented on the wiki what enospc_debug is supposed to do,
> so I couldn't tell whether I should have expected it to tell me
> anything in my circumstances.
> * What is the best course of action to take (other than enlarging the
> disk or deleting files) if I encounter this situation again?
>
> Thanks in advance,
>
> - Peter
>
> [376007.681938] ------------[ cut here ]------------
> [376007.681957] WARNING: CPU: 1 PID: 27021 at
> /home/apw/COD/linux/fs/btrfs/extent-tree.c:6946
> use_block_rsv+0xfd/0x1a0 [btrfs]()
> [376007.681958] BTRFS: block rsv returned -28
> [376007.681959] Modules linked in: softdog tcp_diag inet_diag dm_crypt
> ppdev xen_fbfront fb_sys_fops syscopyarea sysfillrect sysimgblt
> i2c_piix4 serio_raw parport_pc parport mac_hid isofs xt_tcpudp
> iptable_filter xt_owner ip_tables x_tables btrfs xor raid6_pq
> crct10dif_pclmul crc32_pclmul ghash_clmulni_intel aesni_intel
> aes_x86_64 lrw gf128mul glue_helper ablk_helper cryptd floppy psmouse
> [376007.681980] CPU: 1 PID: 27021 Comm: pam_script_ses_ Tainted: G
>     W     3.15.7-031507-generic #201407281235
> [376007.681981] Hardware name: Xen HVM domU, BIOS 4.2.amazon 05/23/2014
> [376007.681983]  0000000000001b22 ffff8800acca39d8 ffffffff8176f115
> 0000000000000007
> [376007.681986]  ffff8800acca3a28 ffff8800acca3a18 ffffffff8106ceac
> ffff8801efc37870
> [376007.681989]  ffff88017db0ff00 ffff8801aedcd800 0000000000001000
> ffff88001c987000
> [376007.681992] Call Trace:
> [376007.682000]  [<ffffffff8176f115>] dump_stack+0x46/0x58
> [376007.682005]  [<ffffffff8106ceac>] warn_slowpath_common+0x8c/0xc0
> [376007.682008]  [<ffffffff8106cf96>] warn_slowpath_fmt+0x46/0x50
> [376007.682016]  [<ffffffffa00d9d1d>] use_block_rsv+0xfd/0x1a0 [btrfs]
> [376007.682024]  [<ffffffffa00de687>] btrfs_alloc_free_block+0x57/0x220 [btrfs]
> [376007.682027]  [<ffffffff8178033c>] ? __do_page_fault+0x28c/0x550
> [376007.682031]  [<ffffffff8119749f>] ? page_add_file_rmap+0x6f/0xb0
> [376007.682037]  [<ffffffffa00c8a3c>] btrfs_copy_root+0xfc/0x2b0 [btrfs]
> [376007.682041]  [<ffffffff811c60b9>] ? memcg_check_events+0x29/0x50
> [376007.682051]  [<ffffffffa013a583>] ? create_reloc_root+0x33/0x2c0 [btrfs]
> [376007.682061]  [<ffffffffa013a743>] create_reloc_root+0x1f3/0x2c0 [btrfs]
> [376007.682064]  [<ffffffff811dd073>] ? generic_permission+0xf3/0x120
> [376007.682073]  [<ffffffffa0140eb8>] btrfs_init_reloc_root+0xb8/0xd0 [btrfs]
> [376007.682082]  [<ffffffffa00ee967>]
> record_root_in_trans.part.30+0x97/0x100 [btrfs]
> [376007.682090]  [<ffffffffa00ee9f4>] record_root_in_trans+0x24/0x30 [btrfs]
> [376007.682098]  [<ffffffffa00efeb1>]
> btrfs_record_root_in_trans+0x51/0x80 [btrfs]
> [376007.682106]  [<ffffffffa00f13d6>]
> start_transaction.part.35+0x86/0x560 [btrfs]
> [376007.682109]  [<ffffffff8132c197>] ? apparmor_capable+0x27/0x80
> [376007.682117]  [<ffffffffa00f18d9>] start_transaction+0x29/0x30 [btrfs]
> [376007.682125]  [<ffffffffa00f19a7>] btrfs_join_transaction+0x17/0x20 [btrfs]
> [376007.682133]  [<ffffffffa00f7fa8>] btrfs_dirty_inode+0x58/0xe0 [btrfs]
> [376007.682141]  [<ffffffffa00fcaf2>] btrfs_setattr+0xa2/0xf0 [btrfs]
> [376007.682144]  [<ffffffff811eec74>] notify_change+0x1c4/0x3b0
> [376007.682146]  [<ffffffff811dde96>] ? final_putname+0x26/0x50
> [376007.682149]  [<ffffffff811d088d>] chown_common+0x16d/0x1a0
> [376007.682153]  [<ffffffff811f2b08>] ? __mnt_want_write+0x58/0x70
> [376007.682156]  [<ffffffff811d1a8f>] SyS_fchownat+0xbf/0x100
> [376007.682159]  [<ffffffff811d1aed>] SyS_chown+0x1d/0x20
> [376007.682163]  [<ffffffff817858bf>] tracesys+0xe1/0xe6
> [376007.682165] ---[ end trace 1853311c87a5cd94 ]---
> --
> To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html


^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: ENOSPC with mkdir and rename
  2014-08-04  1:38 ` Qu Wenruo
@ 2014-08-04  8:14   ` Peter Waller
  2014-08-04  9:22     ` Clemens Eisserer
                       ` (2 more replies)
  0 siblings, 3 replies; 44+ messages in thread
From: Peter Waller @ 2014-08-04  8:14 UTC (permalink / raw)
  To: Qu Wenruo; +Cc: linux-btrfs

Thanks for the responses.

All of this is *very* surprising. I'm not new to BTRFS, I've been
using it on my own machines for multiple years. I didn't realise there
was an un-holstered footgun on my lap at this point. How can it be
made clear to me and other sysadmins how to avoid the ENOSPC problem?
Or, preferably, how can it be made not to exist as a problem at all?

One thing which continues to puzzle me is "How do I make an alarm to
warn of an impending ENOSPC condition on BTRFS?". ENOSPC is a bad
place to be.

All of the standard monitoring tools warn on the output of `df`.

My first thought was to make a graph and put a threshold in `metadata
total - used`. However, I was fortunate enough in this case to know
about `btrfs fi df`. When I looked at "metadata free" I concluded that
there is plenty free, not knowing that it was allocated in blocks
larger than the amount presented as free (total - used = 0.5GiB). So
these numbers were quite misleading in this case. If I had seen
total=used, or available=0, the problem would have been much clearer.

Why present space as available when it can't be used?

In the end, it seems that metadata should be able to steal space from
"data" on demand. That would make the output of "df" more informative,
since you wouldn't see "60 GB free" and get ENOSPC, which is an
utterly confusing situation and harmful to production.

Is there something fundamental preventing that from happening, or is it
just that no-one has gotten around to it yet?

Thanks,

- Peter


On 4 August 2014 02:38, Qu Wenruo <quwenruo@cn.fujitsu.com> wrote:
> Hi, Peter
>
> Some explain below inline.
>
> -------- Original Message --------
> Subject: ENOSPC with mkdir and rename
> From: Peter Waller <peter@scraperwiki.com>
> To: <linux-btrfs@vger.kernel.org>
> Date: 2014-08-03 07:35
>>
>> Hi All,
>>
>> My TL;DR questions are at the bottom, before the stack trace.
>>
>> I'm running Ubuntu 14.04. I wonder if this problem is related to the
>> thread titled "Machine lockup due to btrfs-transaction on AWS EC2
>> Ubuntu 14.04" which I started on the 29th of July:
>>
>>> http://thread.gmane.org/gmane.comp.file-systems.btrfs/37224
>>
>> Kernel: 3.15.7-031507-generic
>>
>> I'm on a single block device system, i.e, no RAID.
>>
>> I was observing ENOSPC from `mkdir` and `rename` on this system, with
>> a good amount of free disk space (df -h reports 62 GB remain). I added
>> enospc_debug (full umount/mount, not just mount -o remount), but this
>> had no apparent effect when receiving ENOSPC from userland.
>>
>> $ sudo btrfs fi df /path/to/volume
>> Data, single: total=489.97GiB, used=427.75GiB
>> System, DUP: total=8.00MiB, used=60.00KiB
>> System, single: total=4.00MiB, used=0.00
>> Metadata, DUP: total=5.00GiB, used=4.50GiB
>
> In fact, all of your metadata is used.
> That seems strange, since there should be 500MB (to be precise, 512MiB) free,
> but I'll explain below.
>
>> Metadata, single: total=8.00MiB, used=0.00
>> unknown, single: total=512.00MiB, used=820.00KiB
>
> Here "unknown" is in fact the global block reserve, which is set aside for
> COW writes to the trees (except the FS tree and subvolume trees, if I'm right).
> With the latest btrfs-progs it is shown as "GlobalReserve" rather than
> "unknown". It should not be touched in most cases, but here some of it is
> in use, which really shows how short of space you are.
>
> So sadly, there is really no space left for the metadata that mkdir and
> rename need(*).
>
> *: rename modifies metadata, and btrfs does COW on its metadata trees, so
> the change needs new metadata space. Since rename/mkdir will not use space
> from the global reserve, ENOSPC is expected here.
>
> The good thing is that rm will steal space from the global reserve, so you
> should be able to remove files and hopefully free enough metadata space.
> Or you can try adding another device to this btrfs.
>
> Thanks,
> Qu
>>
>>
>> After a thorough search of the internet for ENOSPC BTRFS I found
>> various resources and came to understand a little bit more. One thing
>> which broke my intuition severely is that I expected if there is a
>> large number of free GiB, I should expect things to continue to work.
>>
>> In this case, for example, metadata has 0.5GiB free ("sounds like
>> plenty for metadata for one mkdir to me"). Data has 62GiB free. Why
>> would I get ENOSPC for a file rename?
>>
>> I expected that if metadata needed more space, it would just eat it
>> from the 'data'. Now I believe this not to be the case and that it
>> wanted to allocate > 0.5GiB, and this is why I was getting ENOSPC.
>>
>> I tried a rebalance with btrfs balance start -dusage=10 and tried
>> increasing the value until I saw reallocations in dmesg.
>>
>> This spat out a large number of messages in dmesg, of this form:
>>
>>> [376096.546353] BTRFS info (device dm-0): relocating block group
>>> 530457821184 flags 1
>>> [376010.736879] BTRFS info (device dm-0): 40 enospc errors during balance
>>
>> (and a full stack trace at the end of this message).
>>
>> The rebalance printed:
>>
>>> ERROR: error during balancing '/path/to/volume' - No space left on device
>>> There may be more info in syslog - try dmesg | tail
>>
>> Eventually, not knowing what else to do I had to take my escape hatch
>> and enlarge the volume. When I did this, metadata grew by 1GiB:
>>
>>> Data, single: total=490.97GiB, used=427.75GiB
>>> System, DUP: total=8.00MiB, used=60.00KiB
>>> System, single: total=4.00MiB, used=0.00
>>> Metadata, DUP: total=5.50GiB, used=4.50GiB
>>> Metadata, single: total=8.00MiB, used=0.00
>>> unknown, single: total=512.00MiB, used=0.00
>>
>> A few questions:
>>
>> * Why didn't the metadata grow before enlarging the disk?
>> * Why didn't the rebalance enable the metadata to grow?
>> * Why is it necessary to rebalance? Can't it automatically take some
>> free space from 'data'?
>> * Are my machine lockups related to the fact I was low on space?
>> * Can we improve the documentation/FAQ for this? I was scratching my
>> head in particular because my notion of free space definitely does not
>> match up with BTRFS', and I didn't find the FAQ very helpful for
>> getting out of this mess.
>> * It isn't documented on the wiki what enospc_debug is supposed to do,
>> so I couldn't tell whether I should have expected it to tell me
>> anything in my circumstances.
>> * What is the best course of action to take (other than enlarging the
>> disk or deleting files) if I encounter this situation again?
>>
>> Thanks in advance,
>>
>> - Peter
>>
>> [376007.681938] ------------[ cut here ]------------
>> [376007.681957] WARNING: CPU: 1 PID: 27021 at
>> /home/apw/COD/linux/fs/btrfs/extent-tree.c:6946
>> use_block_rsv+0xfd/0x1a0 [btrfs]()
>> [376007.681958] BTRFS: block rsv returned -28
>> [376007.681959] Modules linked in: softdog tcp_diag inet_diag dm_crypt
>> ppdev xen_fbfront fb_sys_fops syscopyarea sysfillrect sysimgblt
>> i2c_piix4 serio_raw parport_pc parport mac_hid isofs xt_tcpudp
>> iptable_filter xt_owner ip_tables x_tables btrfs xor raid6_pq
>> crct10dif_pclmul crc32_pclmul ghash_clmulni_intel aesni_intel
>> aes_x86_64 lrw gf128mul glue_helper ablk_helper cryptd floppy psmouse
>> [376007.681980] CPU: 1 PID: 27021 Comm: pam_script_ses_ Tainted: G
>>     W     3.15.7-031507-generic #201407281235
>> [376007.681981] Hardware name: Xen HVM domU, BIOS 4.2.amazon 05/23/2014
>> [376007.681983]  0000000000001b22 ffff8800acca39d8 ffffffff8176f115
>> 0000000000000007
>> [376007.681986]  ffff8800acca3a28 ffff8800acca3a18 ffffffff8106ceac
>> ffff8801efc37870
>> [376007.681989]  ffff88017db0ff00 ffff8801aedcd800 0000000000001000
>> ffff88001c987000
>> [376007.681992] Call Trace:
>> [376007.682000]  [<ffffffff8176f115>] dump_stack+0x46/0x58
>> [376007.682005]  [<ffffffff8106ceac>] warn_slowpath_common+0x8c/0xc0
>> [376007.682008]  [<ffffffff8106cf96>] warn_slowpath_fmt+0x46/0x50
>> [376007.682016]  [<ffffffffa00d9d1d>] use_block_rsv+0xfd/0x1a0 [btrfs]
>> [376007.682024]  [<ffffffffa00de687>] btrfs_alloc_free_block+0x57/0x220
>> [btrfs]
>> [376007.682027]  [<ffffffff8178033c>] ? __do_page_fault+0x28c/0x550
>> [376007.682031]  [<ffffffff8119749f>] ? page_add_file_rmap+0x6f/0xb0
>> [376007.682037]  [<ffffffffa00c8a3c>] btrfs_copy_root+0xfc/0x2b0 [btrfs]
>> [376007.682041]  [<ffffffff811c60b9>] ? memcg_check_events+0x29/0x50
>> [376007.682051]  [<ffffffffa013a583>] ? create_reloc_root+0x33/0x2c0
>> [btrfs]
>> [376007.682061]  [<ffffffffa013a743>] create_reloc_root+0x1f3/0x2c0
>> [btrfs]
>> [376007.682064]  [<ffffffff811dd073>] ? generic_permission+0xf3/0x120
>> [376007.682073]  [<ffffffffa0140eb8>] btrfs_init_reloc_root+0xb8/0xd0
>> [btrfs]
>> [376007.682082]  [<ffffffffa00ee967>]
>> record_root_in_trans.part.30+0x97/0x100 [btrfs]
>> [376007.682090]  [<ffffffffa00ee9f4>] record_root_in_trans+0x24/0x30
>> [btrfs]
>> [376007.682098]  [<ffffffffa00efeb1>]
>> btrfs_record_root_in_trans+0x51/0x80 [btrfs]
>> [376007.682106]  [<ffffffffa00f13d6>]
>> start_transaction.part.35+0x86/0x560 [btrfs]
>> [376007.682109]  [<ffffffff8132c197>] ? apparmor_capable+0x27/0x80
>> [376007.682117]  [<ffffffffa00f18d9>] start_transaction+0x29/0x30 [btrfs]
>> [376007.682125]  [<ffffffffa00f19a7>] btrfs_join_transaction+0x17/0x20
>> [btrfs]
>> [376007.682133]  [<ffffffffa00f7fa8>] btrfs_dirty_inode+0x58/0xe0 [btrfs]
>> [376007.682141]  [<ffffffffa00fcaf2>] btrfs_setattr+0xa2/0xf0 [btrfs]
>> [376007.682144]  [<ffffffff811eec74>] notify_change+0x1c4/0x3b0
>> [376007.682146]  [<ffffffff811dde96>] ? final_putname+0x26/0x50
>> [376007.682149]  [<ffffffff811d088d>] chown_common+0x16d/0x1a0
>> [376007.682153]  [<ffffffff811f2b08>] ? __mnt_want_write+0x58/0x70
>> [376007.682156]  [<ffffffff811d1a8f>] SyS_fchownat+0xbf/0x100
>> [376007.682159]  [<ffffffff811d1aed>] SyS_chown+0x1d/0x20
>> [376007.682163]  [<ffffffff817858bf>] tracesys+0xe1/0xe6
>> [376007.682165] ---[ end trace 1853311c87a5cd94 ]---
>
>

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: ENOSPC with mkdir and rename
  2014-08-04  8:14   ` Peter Waller
@ 2014-08-04  9:22     ` Clemens Eisserer
  2014-08-04  9:39     ` Chris Samuel
  2014-08-05  8:51     ` Qu Wenruo
  2 siblings, 0 replies; 44+ messages in thread
From: Clemens Eisserer @ 2014-08-04  9:22 UTC (permalink / raw)
  To: linux-btrfs

Hi Peter,

> All of this is *very* surprising. I'm not new to BTRFS, I've been
> using it on my own machines for multiple years. I didn't realise there
> was an un-holstered footgun on my lap at this point. How can it be
> made clear how to avoid the ENOSPC problem to myself and other
> sysadmins? Or preferably not exist as a problem?

I've also found the fixed metadata/data split to be an uncomfortable
implementation detail, and some more flexible approach would be very
welcome from my side.

So far I've used BTRFS' mixed mode mentioned in the mkfs.btrfs man page:

> -M|--mixed
>     Mix data and metadata chunks together for more efficient space utilization.
>     This feature incurs a performance penalty in larger filesystems.
>     It is recommended for use with filesystems of 1 GiB or smaller.

However I didn't find any information on how large the mentioned
overhead is, or where it originates from.

Best regards, Clemens

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: ENOSPC with mkdir and rename
  2014-08-04  8:14   ` Peter Waller
  2014-08-04  9:22     ` Clemens Eisserer
@ 2014-08-04  9:39     ` Chris Samuel
  2014-08-04  9:56       ` Clemens Eisserer
  2014-08-04 10:09       ` Peter Waller
  2014-08-05  8:51     ` Qu Wenruo
  2 siblings, 2 replies; 44+ messages in thread
From: Chris Samuel @ 2014-08-04  9:39 UTC (permalink / raw)
  To: linux-btrfs

On Mon, 4 Aug 2014 09:14:19 AM Peter Waller wrote:

> All of this is *very* surprising.

Hmm, it shouldn't be, the ENOSPC issues are well known and have been discussed 
here for years.

cheers,
Chris
-- 
 Chris Samuel  :  http://www.csamuel.org/  :  Melbourne, VIC



^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: ENOSPC with mkdir and rename
  2014-08-04  9:39     ` Chris Samuel
@ 2014-08-04  9:56       ` Clemens Eisserer
  2014-08-04 10:24         ` Chris Samuel
  2014-08-04 10:09       ` Peter Waller
  1 sibling, 1 reply; 44+ messages in thread
From: Clemens Eisserer @ 2014-08-04  9:56 UTC (permalink / raw)
  To: linux-btrfs

Hi Chris,

> Hmm, it shouldn't be, the ENOSPC issues are well known and have been discussed
> here for years.

Which doesn't protect the *average* user from running into issues like this.
Just because it has been discussed, doesn't mean nothing can/should be
done about it ;)

However, as I am only a user too and can't contribute in terms of
code, I remain patient and observe how btrfs is evolving.
One day or another, the ENOSPC issues will get fixed or worked around.

Regards, Clemens

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: ENOSPC with mkdir and rename
  2014-08-04  9:39     ` Chris Samuel
  2014-08-04  9:56       ` Clemens Eisserer
@ 2014-08-04 10:09       ` Peter Waller
  2014-08-04 10:22         ` Hugo Mills
                           ` (2 more replies)
  1 sibling, 3 replies; 44+ messages in thread
From: Peter Waller @ 2014-08-04 10:09 UTC (permalink / raw)
  To: Chris Samuel; +Cc: linux-btrfs

On 4 August 2014 10:39, Chris Samuel <chris@csamuel.org> wrote:
> On Mon, 4 Aug 2014 09:14:19 AM Peter Waller wrote:
>> All of this is *very* surprising.
>
> Hmm, it shouldn't be, the ENOSPC issues are well known and have been discussed
> here for years.

I accept that. It's all very well if you read the BTRFS list and/or
are a BTRFS developer. But if you're trying to work it out in the heat
of battle, as we have sysadmins who would have to, there is a
combination of things here that makes it unreasonable and harmful for
production.

I was in a situation where I was getting sporadic ENOSPC and none of
the instructions I could find helped. I did a thorough search of the
wiki and mailing list - I found a plethora of similar sounding
problems and none of the advice given helped.

Our usage is a simple case: no RAID, no subvolumes, no snapshots. We
had >60GiB free and apparently some metadata free.

I still can't find a clear answer to the question "How do I make an
alarm to warn of an impending ENOSPC condition on BTRFS?"

Is that because there is no clear answer?

The nature of "running out of disk space" as a problem means you won't
hit it until you've been using it for a long while, which makes this
problem of the form "a ticking time bomb". Is there no way to make
this operationally easier? Or should only BTRFS developers use BTRFS?

I'm breaking the rest out below, in case you are interested in
understanding more about the problems I was having.

Thanks,

- Peter

More thoughts to illustrate the problems with the existing documentation:

Getting started contains no warning of what's different about free
space compared with other filesystems one might be familiar with:

  https://btrfs.wiki.kernel.org/index.php/Getting_started

The sysadmin guide doesn't appear to mention free space at all:

  https://btrfs.wiki.kernel.org/index.php/SysadminGuide

The FAQ has a question:

  https://btrfs.wiki.kernel.org/index.php/FAQ#Help.21_Btrfs_claims_I.27m_out_of_space.2C_but_it_looks_like_I_should_have_lots_left.21

Which starts out "Free space is a tricky concept in Btrfs" but then
doesn't explain it very well. None of the advice given there helped in
my case. There is talk about a mixed mode, but not how to move an
existing filesystem to it. I'm yet to find an explanation of
rebalancing which isn't focussed on what it means for RAID, and it
still isn't crystal clear to me what rebalancing means for
metadata/data on one disk. Rebalancing didn't work in my case. Must I
construct an image of the underlying BTRFS datastructures in my head?
I'm fine if I have to do that, but nowhere makes it clear what mental
tools I need to tackle this.

This link is mentioned by the above but not directly linked to by it
(and has "are" and "is" changed compared with the above text):

https://btrfs.wiki.kernel.org/index.php/FAQ#Why_are_there_so_many_ways_to_check_the_amount_of_free_space.3F

This link would have helped a bit but wasn't cross referenced by any
of the other materials which I did find, so I couldn't find it in the
heat of battle:

https://btrfs.wiki.kernel.org/index.php/Problem_FAQ#I_get_.22No_space_left_on_device.22_errors.2C_but_df_says_I.27ve_got_lots_of_space

One problem is that it isn't clear what "chunks" are. Does an operator
of a BTRFS filesystem need to understand this in the simple case of no
snapshots, no RAID?

How did the whole disk come to be allocated to data given that we
hadn't used all of it? Is it because the data is using chunks
inefficiently? How does this come to be in the simple case?

The documentation could use some illustrations to make this clear.

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: ENOSPC with mkdir and rename
  2014-08-04 10:09       ` Peter Waller
@ 2014-08-04 10:22         ` Hugo Mills
  2014-08-04 10:31           ` Peter Waller
  2014-08-04 11:04           ` Clemens Eisserer
  2014-08-04 10:50         ` Chris Samuel
  2014-08-10 17:26         ` Martin Steigerwald
  2 siblings, 2 replies; 44+ messages in thread
From: Hugo Mills @ 2014-08-04 10:22 UTC (permalink / raw)
  To: Peter Waller; +Cc: Chris Samuel, linux-btrfs

[-- Attachment #1: Type: text/plain, Size: 5495 bytes --]

On Mon, Aug 04, 2014 at 11:09:23AM +0100, Peter Waller wrote:
> On 4 August 2014 10:39, Chris Samuel <chris@csamuel.org> wrote:
> > On Mon, 4 Aug 2014 09:14:19 AM Peter Waller wrote:
> >> All of this is *very* surprising.
> >
> > Hmm, it shouldn't be, the ENOSPC issues are well known and have been discussed
> > here for years.
> 
> I accept that. It's all very well if you read the BTRFS list and/or
> are a BTRFS developer. But if you're trying to work it out in the heat
> of battle, as we have sysadmins who would have to, there is a
> combination of things here that makes it unreasonable and harmful for
> production.
> 
> I was in a situation where I was getting sporadic ENOSPC and none of
> the instructions I could find helped. I did a thorough search of the
> wiki and mailing list - I found a plethora of similar sounding
> problems and none of the advice given helped.
> 
> Our usage is a simple case: no RAID, no subvolumes, no snapshots. We
> had >60GiB free and apparently some metadata free.
> 
> I still can't find a clear answer to the question "How do I make an
> alarm to warn of an impending ENOSPC condition on BTRFS?"

   On the 3.15+ kernels, the block reserve is split out of metadata
and reported separately. This helps with the following process:

 * btrfs fi show
    - look at the total and used values. If used < total, you're OK.
      If used == total, then you could potentially hit ENOSPC.

 * btrfs fi df
    - look at metadata used vs total. If the gap between them is
      close to zero (on 3.15+) or close to 512 MiB (on <3.15), then
      you are in danger of ENOSPC.

    - look at data used vs total. If the used is much smaller than
      total, you can reclaim some of the allocation with a filtered
      balance (btrfs balance start -dusage=5), which will then give
      you unallocated space again (see the btrfs fi show test).
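
The `btrfs fi df` part of the checks above could be automated roughly
like this. Treat it as a sketch rather than a finished monitoring
tool: it parses the text format shown earlier in this thread (other
btrfs-progs versions may print it differently), it subtracts the
"unknown"/"GlobalReserve" total from the metadata gap as explained
earlier in the thread, and the 256 MiB alarm threshold and all function
names are arbitrary choices for the example. The `btrfs fi show` check
is not covered because its output is not shown here.

```python
import re

# Sample `btrfs fi df` output, copied from the first message of this thread.
SAMPLE = """\
Data, single: total=489.97GiB, used=427.75GiB
System, DUP: total=8.00MiB, used=60.00KiB
System, single: total=4.00MiB, used=0.00
Metadata, DUP: total=5.00GiB, used=4.50GiB
Metadata, single: total=8.00MiB, used=0.00
unknown, single: total=512.00MiB, used=820.00KiB
"""

UNITS = {"": 1, "B": 1, "KiB": 2**10, "MiB": 2**20, "GiB": 2**30, "TiB": 2**40}
LINE = re.compile(
    r"^([\w+]+), (\w+): total=([\d.]+)([A-Za-z]*), used=([\d.]+)([A-Za-z]*)$"
)
RESERVE_NAMES = ("unknown", "GlobalReserve")


def parse_fi_df(text):
    """Return {(type, profile): (total_bytes, used_bytes)}."""
    groups = {}
    for line in text.splitlines():
        m = LINE.match(line.strip())
        if m:
            kind, profile, total, t_unit, used, u_unit = m.groups()
            groups[(kind, profile)] = (
                float(total) * UNITS[t_unit],
                float(used) * UNITS[u_unit],
            )
    return groups


def metadata_headroom(groups):
    """Metadata total - used, minus the global reserve if it is listed."""
    total = sum(t for (kind, _), (t, _u) in groups.items() if kind == "Metadata")
    used = sum(u for (kind, _), (_t, u) in groups.items() if kind == "Metadata")
    reserve = sum(t for (kind, _), (t, _u) in groups.items() if kind in RESERVE_NAMES)
    return total - used - reserve


def data_slack(groups):
    """Space allocated to data chunks but not used by data."""
    return sum(t - u for (kind, _), (t, u) in groups.items() if kind == "Data")


def report(groups, min_headroom=256 * 2**20):
    """Summarise the numbers; min_headroom is an arbitrary example threshold."""
    headroom = metadata_headroom(groups)
    reserve_used = sum(u for (kind, _), (_t, u) in groups.items() if kind in RESERVE_NAMES)
    return {
        "metadata_headroom_mib": headroom / 2**20,
        "data_slack_gib": round(data_slack(groups) / 2**30, 2),
        "reserve_in_use": reserve_used > 0,
        "enospc_risk": headroom < min_headroom,
    }


print(report(parse_fi_df(SAMPLE)))
```

For the sample above this flags a risk: only the small single-profile
metadata chunk is left as headroom, even though data has tens of GiB
of slack that a filtered balance might reclaim.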

> Is that because there is no clear answer?
> 
> The nature of "running out of disk space" as a problem means you won't
> hit it until you've been using it for a long while, which makes this
> problem of the form "a ticking time bomb". Is there no way to make
> this operationally easier? or should only BTRFS developers use BTRFS?
>
> I'm breaking the rest out below if you are interested to try and
> understand more the problems I was having.
> 
> Thanks,
> 
> - Peter
> 
> More thoughts to illustrate the problems with the existing documentation:
> 
> Getting started contains no warning of what's different about free
> space compared with other filesystems one might be familiar with:
> 
>   https://btrfs.wiki.kernel.org/index.php/Getting_started
> 
> The sysadmin guide doesn't appear to mention free space at all:
> 
>   https://btrfs.wiki.kernel.org/index.php/SysadminGuide
> 
> The FAQ has a question:
> 
>   https://btrfs.wiki.kernel.org/index.php/FAQ#Help.21_Btrfs_claims_I.27m_out_of_space.2C_but_it_looks_like_I_should_have_lots_left.21
> 
> Which starts out "Free space is a tricky concept in Btrfs" but then
> doesn't explain it very well. None of the advice given there helped in
> my case. There is talk about a mixed mode, but not how to move an
> existing filesystem to it. I'm yet to find an explanation of
> rebalancing which isn't focussed on what it means for RAID, and it
> still isn't crystal clear to me what rebalancing means for
> metadata/data on one disk. Rebalancing didn't work in my case. Must I
> construct an image of the underlying BTRFS datastructures in my head?
> I'm fine if I have to do that, but nowhere makes it clear what mental
> tools I need to tackle this.

   This FAQ entry is pretty horrible, I'm afraid. I actually started
rewriting it here to try to make it clearer what's going on. I'll try
to work on it a bit more this week and put out a better version for
the wiki.

> This link is mentioned by the above but not directly linked to by it
> (and has "are" and "is" changed compared with the above text):
>
> https://btrfs.wiki.kernel.org/index.php/FAQ#Why_are_there_so_many_ways_to_check_the_amount_of_free_space.3F
> 
> This link would have helped a bit but wasn't cross referenced by any
> of the other materials which I did find, so I couldn't find it in the
> heat of battle:
> 
> https://btrfs.wiki.kernel.org/index.php/Problem_FAQ#I_get_.22No_space_left_on_device.22_errors.2C_but_df_says_I.27ve_got_lots_of_space
> 
> One problem is that it isn't clear what "chunks" are. Does an operator
> of a BTRFS filesystem need to understand this in the simple case of no
> snapshots, no RAID?
> 
> How did the whole disk come to be allocated to data given that we
> hadn't used all of it? Is it because the data is using chunks
> inefficiently? How does this come to be in the simple case?

   Two ways: Write lots of data, then delete it again. (This can also
happen with snapshots.) Alternatively, kernels earlier than about 3.10
had a bug that massively overallocated data chunks when there was no
need to.
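
The "write lots, then delete" case, and what a usage-filtered balance
does about it, can be illustrated with a toy model. This is a
deliberately simplified sketch and not how the kernel allocator works:
it assumes fixed 1 GiB data chunks, ignores metadata and DUP, and lets
the balance always succeed (a real balance needs some free space to
work with and can itself fail with ENOSPC). The class and method names
are invented for the example.

```python
CHUNK_MIB = 1024  # toy assumption: every data chunk is 1 GiB


class ToyFs:
    def __init__(self, device_chunks):
        self.device_chunks = device_chunks
        self.data = []  # MiB used in each allocated data chunk

    def unallocated(self):
        """Chunks not yet handed out to data (what metadata could grow into)."""
        return self.device_chunks - len(self.data)

    def write(self, mib):
        """Fill free space in existing chunks first, then allocate new ones."""
        for i, used in enumerate(self.data):
            take = min(CHUNK_MIB - used, mib)
            self.data[i] += take
            mib -= take
        while mib > 0:
            if self.unallocated() == 0:
                raise OSError(28, "No space left on device")
            take = min(CHUNK_MIB, mib)
            self.data.append(take)
            mib -= take

    def keep_only(self, mib_per_chunk):
        """Delete data so each chunk keeps at most mib_per_chunk; chunks stay allocated."""
        self.data = [min(u, mib_per_chunk) for u in self.data]

    def balance(self, usage_pct):
        """Toy version of a usage filter: repack chunks that are below usage_pct full."""
        limit = CHUNK_MIB * usage_pct / 100
        movers = [u for u in self.data if u < limit]
        self.data = [u for u in self.data if u >= limit]
        for u in movers:
            self.write(u)


fs = ToyFs(device_chunks=10)
fs.write(10 * CHUNK_MIB)       # fill the device: every chunk goes to data
fs.keep_only(100)              # delete most of it again
print("used MiB:", sum(fs.data), "unallocated chunks:", fs.unallocated())

fs.balance(5)                  # chunks are ~10% full, so a 5% filter moves nothing
print("after usage=5: ", fs.unallocated(), "unallocated")

fs.balance(20)                 # a 20% filter picks them all up and repacks them
print("after usage=20:", fs.unallocated(), "unallocated")
```

In this model the percentage is simply a cut-off on how full a chunk
may be and still be selected for repacking, which is why raising it
step by step eventually finds chunks worth moving.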

   Please do feel free to add more crosslinks or text to the wiki to
make it clearer where to look. The "pretty horrible" FAQ entry
mentioned above is the canonical location for dealing with early
ENOSPC problems, so other things should probably point at that.

   Hugo.

-- 
=== Hugo Mills: hugo@... carfax.org.uk | darksatanic.net | lug.org.uk ===
  PGP key: 65E74AC0 from wwwkeys.eu.pgp.net or http://www.carfax.org.uk
   --- You stay in the theatre because you're afraid of having no ---    
                         money? There's irony...                         

[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 811 bytes --]

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: ENOSPC with mkdir and rename
  2014-08-04  9:56       ` Clemens Eisserer
@ 2014-08-04 10:24         ` Chris Samuel
  2014-08-05  8:06           ` Duncan
  0 siblings, 1 reply; 44+ messages in thread
From: Chris Samuel @ 2014-08-04 10:24 UTC (permalink / raw)
  To: linux-btrfs

On Mon, 4 Aug 2014 11:56:46 AM Clemens Eisserer wrote:

> Which doesn't protect the *average* user from running into issues like this.

No, but they need to be aware of it.

> Just because it has been discussed, doesn't mean nothing can/should be done
> about it

Indeed, and a lot of work has been done over the years on it and it's a lot 
better than it used to be. :-)

cheers,
Chris
-- 
 Chris Samuel  :  http://www.csamuel.org/  :  Melbourne, VIC



^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: ENOSPC with mkdir and rename
  2014-08-04 10:22         ` Hugo Mills
@ 2014-08-04 10:31           ` Peter Waller
  2014-08-04 10:39             ` Hugo Mills
  2014-08-04 17:09             ` Austin S Hemmelgarn
  2014-08-04 11:04           ` Clemens Eisserer
  1 sibling, 2 replies; 44+ messages in thread
From: Peter Waller @ 2014-08-04 10:31 UTC (permalink / raw)
  To: Hugo Mills, Peter Waller, Chris Samuel, linux-btrfs

Thanks Hugo, this is the most informative e-mail yet! (more inline)

On 4 August 2014 11:22, Hugo Mills <hugo@carfax.org.uk> wrote:
>
>  * btrfs fi show
>     - look at the total and used values. If used < total, you're OK.
>       If used == total, then you could potentially hit ENOSPC.

Another thing which is unclear, and undocumented anywhere I can find, is
what the output of `btrfs fi show` means.

I'm sure it is totally obvious if you are a developer or if you have
used it for long enough. But it isn't covered in the manpage, nor in
the oracle documentation, nor anywhere on the wiki that I could find.

When I looked at it in my problematic situation, it said "500 GiB /
500 GiB". That sounded fine to me because I interpreted the output as
what fraction of which RAID devices BTRFS was using. In other words, I
thought "Oh, BTRFS will just make use of the whole device that's
available to it.". I thought that `btrfs fi df` was the source of
information for how much space was free inside of that.

>  * btrfs fi df
>     - look at metadata used vs total. If these are close to zero (on
>       3.15+) or close to 512 MiB (on <3.15), then you are in danger of
>       ENOSPC.

Hmm. It's unfortunate that this could indicate an amount of space
which is free when it actually isn't.

>     - look at data used vs total. If the used is much smaller than
>       total, you can reclaim some of the allocation with a filtered
>       balance (btrfs balance start -dusage=5), which will then give
>       you unallocated space again (see the btrfs fi show test).

So the filtered balance didn't help in my situation. I understand it's
something to do with the "5" parameter. But I do not understand what
the impact of changing this parameter is. It is something to do with a
fraction of something, but those things are still not present in my
mental model despite a large amount of reading. Is there an
illustration which could clear this up?

Among other things I also got the kernel stack trace I pasted at the
bottom of the first e-mail to this thread when I did the rebalance.

>    This FAQ entry is pretty horrible, I'm afraid. I actually started
> rewriting it here to try to make it clearer what's going on. I'll try
> to work on it a bit more this week and put out a better version for
> the wiki.

This is great to hear! :)

Thanks for your response Hugo, that really cleared up a lot of mental
model problems. I hope the documentation can be improved so that
others can learn from my mistakes.


* Re: ENOSPC with mkdir and rename
  2014-08-04 10:31           ` Peter Waller
@ 2014-08-04 10:39             ` Hugo Mills
  2014-08-04 10:48               ` Peter Waller
  2014-08-04 17:09             ` Austin S Hemmelgarn
  1 sibling, 1 reply; 44+ messages in thread
From: Hugo Mills @ 2014-08-04 10:39 UTC (permalink / raw)
  To: Peter Waller; +Cc: Chris Samuel, linux-btrfs


On Mon, Aug 04, 2014 at 11:31:57AM +0100, Peter Waller wrote:
> Thanks Hugo, this is the most informative e-mail yet! (more inline)
> 
> On 4 August 2014 11:22, Hugo Mills <hugo@carfax.org.uk> wrote:
> >
> >  * btrfs fi show
> >     - look at the total and used values. If used < total, you're OK.
> >       If used == total, then you could potentially hit ENOSPC.
> 
> Another thing which is unclear and undocumented anywhere I can find is
> what the meaning of `btrfs fi show` is.
> 
> I'm sure it is totally obvious if you are a developer or if you have
> used it for long enough. But it isn't covered in the manpage, nor in
> the oracle documentation, nor anywhere on the wiki that I could find.
> 
> When I looked at it in my problematic situation, it said "500 GiB /
> 500 GiB". That sounded fine to me because I interpreted the output as
> what fraction of which RAID devices BTRFS was using. In other words, I
> thought "Oh, BTRFS will just make use of the whole device that's
> available to it.". I thought that `btrfs fi df` was the source of
> information for how much space was free inside of that.

   That's actually pretty much accurate. The problem is that btrfs
distinguishes between "space available for data" and "space available
for metadata", and doesn't trade off one for the other once they've
been allocated. The balance operation frees up some of the allocation,
allowing the newly-freed space to be allocated again for something
else.

   All of the information about the data/metadata split, and what's
used out of that, is revealed by btrfs fi df.
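
   To make that concrete, here is a rough Python sketch (not part of
btrfs-progs) that reads `btrfs fi df` text and reports, per block type,
how much space is allocated but unused. The sample is the output posted
at the start of this thread; the function names and the regular
expression are assumptions made for this illustration and may need
adjusting for other btrfs-progs versions.

```python
import re

# Sample `btrfs fi df` output, copied from the first message in this thread.
SAMPLE = """\
Data, single: total=489.97GiB, used=427.75GiB
System, DUP: total=8.00MiB, used=60.00KiB
System, single: total=4.00MiB, used=0.00
Metadata, DUP: total=5.00GiB, used=4.50GiB
Metadata, single: total=8.00MiB, used=0.00
unknown, single: total=512.00MiB, used=820.00KiB
"""

# Multipliers for the size suffixes; a bare "0.00" carries no suffix.
UNITS = {"": 1, "B": 1, "KiB": 2**10, "MiB": 2**20, "GiB": 2**30, "TiB": 2**40}

# Assumed line shape: "<type>, <profile>: total=<n><unit>, used=<n><unit>"
LINE = re.compile(r"^(\w+), (\w+): total=([\d.]+)(\w*), used=([\d.]+)(\w*)$")


def parse_fi_df(text):
    """Return a list of (type, profile, total_bytes, used_bytes) tuples."""
    rows = []
    for line in text.splitlines():
        match = LINE.match(line.strip())
        if not match:
            continue  # skip anything that does not look like a report line
        kind, profile, total, total_unit, used, used_unit = match.groups()
        rows.append((kind, profile,
                     float(total) * UNITS[total_unit],
                     float(used) * UNITS[used_unit]))
    return rows


def free_by_type(rows):
    """Sum (total - used) per block type, across all profiles."""
    free = {}
    for kind, _profile, total, used in rows:
        free[kind] = free.get(kind, 0.0) + (total - used)
    return free


if __name__ == "__main__":
    for kind, nbytes in free_by_type(parse_fi_df(SAMPLE)).items():
        print(f"{kind:10s} {nbytes / 2**30:8.2f} GiB allocated but unused")
```

   On the sample this reports roughly 62 GiB unused under Data but only
about half a GiB under Metadata, and space already allocated to one type
is not handed over to the other.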

> >  * btrfs fi df
> >     - look at metadata used vs total. If these are close to zero (on
> >       3.15+) or close to 512 MiB (on <3.15), then you are in danger of
> >       ENOSPC.
> 
> Hmm. It's unfortunate that this could indicate an amount of space
> which is free when it actually isn't.

   That's why the 512 MiB block reserve was split out of metadata --
so that you don't look at metadata and say "oh, I've got half a gig
free, that's OK".

> >     - look at data used vs total. If the used is much smaller than
> >       total, you can reclaim some of the allocation with a filtered
> >       balance (btrfs balance start -dusage=5), which will then give
> >       you unallocated space again (see the btrfs fi show test).
> 
> So the filtered balance didn't help in my situation. I understand it's
> something to do with the "5" parameter. But I do not understand what
> the impact of changing this parameter is. It is something to do with a
> fraction of something, but those things are still not present in my
> mental model despite a large amount of reading. Is there an
> illustration which could clear this up?

   The 5 is 5%. So, it'll only look at chunks which are less than 5%
full. David Sterba published a patch that would balance the
(approximately N) least-used chunks, which is a considerably more
usable approach, but I don't know what happened to that one.
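
   As a small illustration of that filter, here is a Python sketch
of which chunks a given threshold would pick up, assuming the rule as
described above (a chunk is considered only if it is less than N% full).
The per-chunk usage figures are invented, and treating 0 as "completely
empty chunks only" is an assumption of this sketch rather than something
stated in this thread.

```python
def chunks_selected(chunk_usages, threshold_percent):
    """Return the usage values of chunks a usage=<threshold> filter
    would consider, modelling "less than N% full" as described above.

    Assumption: a threshold of 0 matches only completely empty chunks.
    """
    if threshold_percent == 0:
        return [u for u in chunk_usages if u == 0]
    return [u for u in chunk_usages if u < threshold_percent]


# Hypothetical usage (percent full) of seven data chunks.
usages = [0, 3, 12, 40, 87, 95, 100]

if __name__ == "__main__":
    for threshold in (0, 5, 50):
        picked = chunks_selected(usages, threshold)
        print(f"-dusage={threshold}: {len(picked)} chunk(s) considered")
```

   If every chunk happens to be fuller than 5%, a -dusage=5 balance
selects nothing and frees nothing, which may be why it didn't help in
this case.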

> Among other things I also got the kernel stack trace I pasted at the
> bottom of the first e-mail to this thread when I did the rebalance.

   OK, I'll go back and read that. You probably shouldn't have had it,
though. :)

> >    This FAQ entry is pretty horrible, I'm afraid. I actually started
> > rewriting it here to try to make it clearer what's going on. I'll try
> > to work on it a bit more this week and put out a better version for
> > the wiki.
> 
> This is great to hear! :)
> 
> Thanks for your response Hugo, that really cleared up a lot of mental
> model problems. I hope the documentation can be improved so that
> others can learn from my mistakes.

   I do try to work on it every so often. Note to self: win lottery,
or get cloned.

   Hugo.

-- 
=== Hugo Mills: hugo@... carfax.org.uk | darksatanic.net | lug.org.uk ===
  PGP key: 65E74AC0 from wwwkeys.eu.pgp.net or http://www.carfax.org.uk
   --- You stay in the theatre because you're afraid of having no ---    
                         money? There's irony...                         


* Re: ENOSPC with mkdir and rename
  2014-08-04 10:39             ` Hugo Mills
@ 2014-08-04 10:48               ` Peter Waller
  2014-08-04 11:29                 ` Hugo Mills
  0 siblings, 1 reply; 44+ messages in thread
From: Peter Waller @ 2014-08-04 10:48 UTC (permalink / raw)
  To: Hugo Mills, Peter Waller, Chris Samuel, linux-btrfs

On 4 August 2014 11:39, Hugo Mills <hugo@carfax.org.uk> wrote:
>> >  * btrfs fi df
>> >     - look at metadata used vs total. If these are close to zero (on
>> >       3.15+) or close to 512 MiB (on <3.15), then you are in danger of
>> >       ENOSPC.
>>
>> Hmm. It's unfortunate that this could indicate an amount of space
>> which is free when it actually isn't.
>
>    That's why the 512 MiB block reserve was split out of metadata --
> so that you don't look at metadata and say "oh, I've got half a gig
> free, that's OK".

I don't quite follow this. Is it a recent development I missed? When
was it "split out"? More recently than the software I'm using?
Otherwise I'm having difficulty parsing this.


* Re: ENOSPC with mkdir and rename
  2014-08-04 10:09       ` Peter Waller
  2014-08-04 10:22         ` Hugo Mills
@ 2014-08-04 10:50         ` Chris Samuel
  2014-08-04 10:59           ` Peter Waller
  2014-08-10 17:26         ` Martin Steigerwald
  2 siblings, 1 reply; 44+ messages in thread
From: Chris Samuel @ 2014-08-04 10:50 UTC (permalink / raw)
  To: linux-btrfs

On Mon, 4 Aug 2014 11:09:23 AM Peter Waller wrote:

> I accept that. It's all very well if you read the BTRFS list and/or
> are a BTRFS developer. But if you're trying to work it out in the heat
> of battle, as we have sysadmins who would have to, there is a
> combination of things here that makes it unreasonable and harmful for
> production.

To be honest, I'm not sure I'd suggest btrfs for production use at all at
present. It was only recently unmarked as experimental, and I feel that
was premature. :-(

All the best,
Chris
-- 
 Chris Samuel  :  http://www.csamuel.org/  :  Melbourne, VIC




* Re: ENOSPC with mkdir and rename
  2014-08-04 10:50         ` Chris Samuel
@ 2014-08-04 10:59           ` Peter Waller
  2014-08-04 21:27             ` Chris Samuel
  0 siblings, 1 reply; 44+ messages in thread
From: Peter Waller @ 2014-08-04 10:59 UTC (permalink / raw)
  To: Chris Samuel; +Cc: linux-btrfs

On 4 August 2014 11:50, Chris Samuel <chris@csamuel.org> wrote:
> To be honest I'm not sure I'd suggest btrfs for production use at all at
> present, it's only recently been unmarked as experimental and to be honest I
> feel that was premature. :-(

Thanks for the honest answer.

There are very positive signals out there which I had perhaps taken
too literally. I'd love to see it become ready, there are a lot of things
about BTRFS which appeal greatly. So I hope I'm helping by trying
to make it clear the problems that I encountered.


* Re: ENOSPC with mkdir and rename
  2014-08-04 10:22         ` Hugo Mills
  2014-08-04 10:31           ` Peter Waller
@ 2014-08-04 11:04           ` Clemens Eisserer
  2014-08-04 11:32             ` Hugo Mills
  1 sibling, 1 reply; 44+ messages in thread
From: Clemens Eisserer @ 2014-08-04 11:04 UTC (permalink / raw)
  To: linux-btrfs

Hi Hugo,

>    On the 3.15+ kernels, the block reserve is split out of metadata
> and reported separately. This helps with the following process:

Thanks a lot for pointing this out, I hadn't noticed this change until now.

One thing I didn't find any information about is the overhead
introduced by mixed-mode.
It would be great if you could explain it in a few sentences.

Thank you in advance, Clemens


* Re: ENOSPC with mkdir and rename
  2014-08-04 10:48               ` Peter Waller
@ 2014-08-04 11:29                 ` Hugo Mills
  0 siblings, 0 replies; 44+ messages in thread
From: Hugo Mills @ 2014-08-04 11:29 UTC (permalink / raw)
  To: Peter Waller; +Cc: Chris Samuel, linux-btrfs


On Mon, Aug 04, 2014 at 11:48:17AM +0100, Peter Waller wrote:
> On 4 August 2014 11:39, Hugo Mills <hugo@carfax.org.uk> wrote:
> >> >  * btrfs fi df
> >> >     - look at metadata used vs total. If these are close to zero (on
> >> >       3.15+) or close to 512 MiB (on <3.15), then you are in danger of
> >> >       ENOSPC.
> >>
> >> Hmm. It's unfortunate that this could indicate an amount of space
> >> which is free when it actually isn't.
> >
> >    That's why the 512 MiB block reserve was split out of metadata --
> > so that you don't look at metadata and say "oh, I've got half a gig
> > free, that's OK".
> 
> I don't quite follow this. Is it a recent development I missed? When
> was it "split out"? More recently than the software I'm using?
> Otherwise I'm having difficulty parsing this.

   It's purely a change in the way that the kernel reports this info.
Before 3.15, the block reserve was included in the "Metadata" report
in btrfs fi df. After 3.15, the kernel reports the block reserve as
its own separate item in btrfs fi df (either as "BlockRsv", or
"unknown", depending on how old your userspace is). The theory is, the
change is made to make it clearer how much is used/reserved/free and
thus to make this kind of calculation simpler in the long run.

   Hugo.

-- 
=== Hugo Mills: hugo@... carfax.org.uk | darksatanic.net | lug.org.uk ===
  PGP key: 65E74AC0 from wwwkeys.eu.pgp.net or http://www.carfax.org.uk
  --- Reading Mein Kampf won't make you a Nazi. Reading Das Kapital ---  
         won't make you a communist. But most trolls started out         
                    with a copy of Lord of the Rings.                    


* Re: ENOSPC with mkdir and rename
  2014-08-04 11:04           ` Clemens Eisserer
@ 2014-08-04 11:32             ` Hugo Mills
  2014-08-04 13:17               ` Peter Waller
  0 siblings, 1 reply; 44+ messages in thread
From: Hugo Mills @ 2014-08-04 11:32 UTC (permalink / raw)
  To: Clemens Eisserer; +Cc: linux-btrfs


On Mon, Aug 04, 2014 at 01:04:25PM +0200, Clemens Eisserer wrote:
> Hi Hugo,
> 
> >    On the 3.15+ kernels, the block reserve is split out of metadata
> > and reported separately. This helps with the following process:
> 
> Thanks a lot for pointing this out, I hadn't noticed this change until now.
> 
> One thing I didn't find any information about is the overhead
> introduced by mixed-mode.
> It would be great if you could explain it in a few sentences.

   I don't know, I'm afraid. I don't think we've got any benchmarks on
the scale of the slowdown.

   Hugo.

-- 
=== Hugo Mills: hugo@... carfax.org.uk | darksatanic.net | lug.org.uk ===
  PGP key: 65E74AC0 from wwwkeys.eu.pgp.net or http://www.carfax.org.uk
  --- Reading Mein Kampf won't make you a Nazi. Reading Das Kapital ---  
         won't make you a communist. But most trolls started out         
                    with a copy of Lord of the Rings.                    


* Re: ENOSPC with mkdir and rename
  2014-08-04 11:32             ` Hugo Mills
@ 2014-08-04 13:17               ` Peter Waller
  2014-08-04 13:35                 ` Hugo Mills
                                   ` (2 more replies)
  0 siblings, 3 replies; 44+ messages in thread
From: Peter Waller @ 2014-08-04 13:17 UTC (permalink / raw)
  To: linux-btrfs

For anyone else having this problem, this article is fairly useful for
understanding disk full problems and rebalance:

http://marc.merlins.org/perso/btrfs/post_2014-05-04_Fixing-Btrfs-Filesystem-Full-Problems.html

It actually covers the problem that I had, which is that a rebalance
can't take place because it is full.

I still am unsure what is really wrong with this whole situation. Is
it that I wasn't careful to do a rebalance when I should have been
doing? Is it that BTRFS doesn't do a rebalance automatically when it
could in principle?

It's pretty bad to end up in a situation (with spare space) where the
only way out is to add more storage, which may be impractical,
difficult or expensive.

The other thing that I still don't understand I've seen repeated in a
few places, from the above article:

"because the filesystem is only 55% full, I can ask balance to rewrite
all chunks that are more than 55% full"

Then he uses `btrfs balance start -dusage=55 /mnt/btrfs_pool1`. I
don't understand the relationship between "the FS is 55% full" and
"chunks more than 55% full". What's going on here?

I conclude that now since I have added more storage, the rebalance
won't fail and if I keep rebalancing from a cron job I won't hit this
problem again (unless the filesystem fills up very fast! what then?).
I don't know however what value to assign to `-dusage` in general for
the cron rebalance. Any hints?


* Re: ENOSPC with mkdir and rename
  2014-08-04 13:17               ` Peter Waller
@ 2014-08-04 13:35                 ` Hugo Mills
  2014-08-04 14:02                 ` Austin S Hemmelgarn
  2014-08-04 14:47                 ` Russell Coker
  2 siblings, 0 replies; 44+ messages in thread
From: Hugo Mills @ 2014-08-04 13:35 UTC (permalink / raw)
  To: Peter Waller; +Cc: linux-btrfs


On Mon, Aug 04, 2014 at 02:17:02PM +0100, Peter Waller wrote:
> For anyone else having this problem, this article is fairly useful for
> understanding disk full problems and rebalance:
> 
> http://marc.merlins.org/perso/btrfs/post_2014-05-04_Fixing-Btrfs-Filesystem-Full-Problems.html
> 
> It actually covers the problem that I had, which is that a rebalance
> can't take place because it is full.
> 
> I still am unsure what is really wrong with this whole situation. Is
> it that I wasn't careful to do a rebalance when I should have been
> doing? Is it that BTRFS doesn't do a rebalance automatically when it
> could in principle?

   This latter one.

   Well, actually two things: the FS should be capable of autonomously
rebalancing at low bandwidth to prevent this problem, but nobody's got
round to implementing it yet. Secondly, it should not be possible to
get into a state where you can't run the balance -- Josef spent about
three kernel revisions fixing the block reserve code to that end.
However, since about 3.14, more cases like yours have been showing up,
so I think there's been a regression. It's not very common, though. I
think we've had maybe a dozen reported instances in the last 6 months.
Someone on IRC had it just now, though, and captured a metadata image,
so at least we've got some (meta)data to work with now.

> It's pretty bad to end up in a situation (with spare space) where the
> only way out is to add more storage, which may be impractical,
> difficult or expensive.
> 
> The other thing that I still don't understand I've seen repeated in a
> few places, from the above article:
> 
> "because the filesystem is only 55% full, I can ask balance to rewrite
> all chunks that are more than 55% full"
> 
> Then he uses `btrfs balance start -dusage=55 /mnt/btrfs_pool1`. I
> don't understand the relationship between "the FS is 55% full" and
> "chunks more than 55% full". What's going on here?

   Pigeonhole principle -- if the FS is 55% full, there must be at
least one chunk <= 55% full.
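
   A quick way to see this is that the least-full chunk can never be
fuller than the average. The Python sketch below checks that on some
invented numbers, assuming equal-sized chunks and taking "the FS is X%
full" to mean used space divided by allocated space; the function names
are made up for this illustration.

```python
def overall_usage(chunk_usages):
    """Average fill (percent) across equal-sized chunks."""
    return sum(chunk_usages) / len(chunk_usages)


def least_full_chunk(chunk_usages):
    """Fill (percent) of the emptiest chunk."""
    return min(chunk_usages)


# Hypothetical per-chunk usage in percent; the average works out to 55.
usages = [100, 90, 55, 20, 10]

if __name__ == "__main__":
    print("overall:", overall_usage(usages))
    print("least-full chunk:", least_full_chunk(usages))
    # The minimum of a set of numbers is never above their mean.
    assert least_full_chunk(usages) <= overall_usage(usages)
```

   So a threshold at the overall fill level is enough to catch at least
one chunk, unless every chunk sits at exactly that level.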

> I conclude that now since I have added more storage, the rebalance
> won't fail and if I keep rebalancing from a cron job I won't hit this
> problem again (unless the filesystem fills up very fast! what then?).
> I don't know however what value to assign to `-dusage` in general for
> the cron rebalance. Any hints?

   Try with increasing values until you've moved as many chunks as you
want to. This is what David's "balance at least N chunks" patch did.
I'd suggest start with 5, and go up in increments of 5, if you're
making it an automatic process. Stop when you reach some threshold
(like, say, 80), or when it reports that it's actually moved some
chunks.

   Doing it manually, I usually recommend 5, 10, 20, 50, 80.
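
   For a cron job, that loop could look roughly like the Python sketch
below. It only uses the `btrfs balance start -dusage=N <mountpoint>`
form quoted in this thread. The function names are made up, and the
assumption that the command's summary contains "relocate N out of M
chunks" should be checked against your btrfs-progs version before
relying on it.

```python
import re
import subprocess

MANUAL_STEPS = (5, 10, 20, 50, 80)        # the manual sequence above
AUTO_STEPS = tuple(range(5, 81, 5))       # 5, 10, ... 80 for unattended runs


def balance_command(mount_point, usage):
    """Build the filtered-balance command quoted in this thread."""
    return ["btrfs", "balance", "start", f"-dusage={usage}", mount_point]


def run_and_count(cmd):
    """Run the command and return how many chunks it says it relocated.

    Assumption: the summary line contains "relocate N out of M chunks".
    Returns 0 if no such line is found.
    """
    out = subprocess.run(cmd, capture_output=True, text=True, check=True).stdout
    match = re.search(r"relocate (\d+) out of (\d+) chunks", out)
    return int(match.group(1)) if match else 0


def sweep(mount_point, relocated=run_and_count, thresholds=AUTO_STEPS):
    """Try increasing thresholds; stop at the first one that moves chunks.

    `relocated` is any callable taking the command list and returning the
    number of chunks moved, so the loop can be exercised without btrfs.
    Returns (threshold, chunks_moved), or None if nothing was moved.
    """
    for usage in thresholds:
        moved = relocated(balance_command(mount_point, usage))
        if moved > 0:
            return usage, moved
    return None
```

   Calling sweep("/path/to/volume") would run the real command (as
root); passing a stub for `relocated` lets you dry-run the logic first.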

   Hugo.


-- 
=== Hugo Mills: hugo@... carfax.org.uk | darksatanic.net | lug.org.uk ===
  PGP key: 65E74AC0 from wwwkeys.eu.pgp.net or http://www.carfax.org.uk
   --- Well, you don't get to be a kernel hacker simply by looking ---   
                    good in Speedos. -- Rusty Russell                    


* Re: ENOSPC with mkdir and rename
  2014-08-04 13:17               ` Peter Waller
  2014-08-04 13:35                 ` Hugo Mills
@ 2014-08-04 14:02                 ` Austin S Hemmelgarn
  2014-08-04 14:11                   ` Peter Waller
  2014-08-04 14:47                 ` Russell Coker
  2 siblings, 1 reply; 44+ messages in thread
From: Austin S Hemmelgarn @ 2014-08-04 14:02 UTC (permalink / raw)
  To: Peter Waller, linux-btrfs


On 2014-08-04 09:17, Peter Waller wrote:
> For anyone else having this problem, this article is fairly useful for
> understanding disk full problems and rebalance:
> 
> http://marc.merlins.org/perso/btrfs/post_2014-05-04_Fixing-Btrfs-Filesystem-Full-Problems.html
> 
> It actually covers the problem that I had, which is that a rebalance
> can't take place because it is full.
> 
> I still am unsure what is really wrong with this whole situation. Is
> it that I wasn't careful to do a rebalance when I should have been
> doing? Is it that BTRFS doesn't do a rebalance automatically when it
> could in principle?
> 
> It's pretty bad to end up in a situation (with spare space) where the
> only way out is to add more storage, which may be impractical,
> difficult or expensive.
I really disagree with the statement that adding more storage is
difficult or expensive, all you need to do is plug in a 2G USB flash
drive, or allocate a ramdisk, and add the device to the filesystem only
long enough to do a full balance.
> 
> The other thing that I still don't understand I've seen repeated in a
> few places, from the above article:
> 
> "because the filesystem is only 55% full, I can ask balance to rewrite
> all chunks that are more than 55% full"
> 
> Then he uses `btrfs balance start -dusage=55 /mnt/btrfs_pool1`. I
> don't understand the relationship between "the FS is 55% full" and
> "chunks more than 55% full". What's going on here?
To understand this, you have to understand that BTRFS uses a two-level
allocation scheme.  At the top level, you have chunks, which are
contiguous regions of the disk that get used for storing a specific
block type.  For data chunks, these default to 1G in size; for metadata,
they default to 256M in size.  When a filesystem is created, you get the
minimum number of chunks of each type based on the replication profiles
chosen for each chunk type; with no extra options, this means 1 data
chunk and 2 metadata chunks for a single disk filesystem.  Within each
chunk, BTRFS then allocates and frees individual blocks on demand; these
blocks are the analogue of blocks in most other filesystems.  When there
are no free blocks in any chunks of a given type, BTRFS then allocates
new chunks of that type based on the replication profile.  Unlike blocks,
however, chunks aren't freed automatically (there are good reasons for
this behavior, but they are kind of long to explain here).  This is where
balance comes in: it takes all of the blocks in the filesystem and
sends them back through the block allocator.  This usually causes all of
the free blocks to end up in a single chunk, and frees the unneeded chunks.
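
The two-level scheme can be sketched as a toy model in Python. This is
only an illustration under simplifying assumptions: it ignores
replication profiles such as DUP, uses the default chunk sizes quoted
above, and every name in it (Device, NoSpace, write, free_in) is made up
for the example rather than taken from btrfs.

```python
class NoSpace(Exception):
    """Stands in for ENOSPC in this toy model."""


# Default chunk sizes quoted above, in MiB.
CHUNK_MIB = {"data": 1024, "metadata": 256}


class Device:
    def __init__(self, size_mib):
        self.unallocated = size_mib                  # space not yet in any chunk
        self.chunks = {"data": [], "metadata": []}   # MiB used in each chunk

    def write(self, kind, mib):
        """Store `mib` MiB of blocks of `kind`, allocating chunks on demand."""
        size = CHUNK_MIB[kind]
        while mib > 0:
            fills = self.chunks[kind]
            # Lower level: look for an existing chunk of this type with room.
            idx = next((i for i, used in enumerate(fills) if used < size), None)
            if idx is None:
                # Upper level: no room, so a whole new chunk is needed.
                if self.unallocated < size:
                    raise NoSpace(f"no unallocated space for a {kind} chunk")
                self.unallocated -= size
                fills.append(0)
                idx = len(fills) - 1
            step = min(mib, size - fills[idx])
            fills[idx] += step
            mib -= step

    def free_in(self, kind):
        """MiB allocated to `kind` chunks but not used by blocks."""
        return sum(CHUNK_MIB[kind] - used for used in self.chunks[kind])


if __name__ == "__main__":
    dev = Device(2304)            # room for 2 data chunks + 1 metadata chunk
    dev.write("metadata", 256)    # fills the only metadata chunk
    dev.write("data", 1025)       # spills 1 MiB into a second data chunk
    print("unallocated:", dev.unallocated, "data free:", dev.free_in("data"))
    try:
        dev.write("metadata", 1)
    except NoSpace as err:
        print("ENOSPC:", err)
```

In this model the last metadata write fails even though the second data
chunk is almost empty, because that space is already allocated to data.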

When someone talks about a chunk being x% full, they mean that x% of the
space in that chunk is used by allocated blocks.  Talking about how full
the filesystem is can get tricky because of the replication profiles,
but the usual consensus is to treat that as the percentage of the
filesystem that contains blocks that are being used.

It should say LESS than 55% full in the various articles, as the
-dusage=x option tells balance to only consider chunks that are less
than x% full for balancing.  In general, if your filesystem is totally
full, you should use numbers starting with 0, and working your way up
from there.  You may even get lucky, and using -dusage=0 -musage=0 may
free up enough chunks that you don't need to add more storage.
> 
> I conclude that now since I have added more storage, the rebalance
> won't fail and if I keep rebalancing from a cron job I won't hit this
> problem again (unless the filesystem fills up very fast! what then?).
> I don't know however what value to assign to `-dusage` in general for
> the cron rebalance. Any hints?
I've found that something between 25 and 50 tends to do well, much
outside of that range and you start to get diminishing returns.  The
exact value tends to be more personal preference, I use 25 on most of my
systems, because I don't like saturating the disks with I/O for very
long.  Do make sure however to add -musage=x as well, metadata also
should be balanced (especially if you have very large numbers of small
files).




* Re: ENOSPC with mkdir and rename
  2014-08-04 14:02                 ` Austin S Hemmelgarn
@ 2014-08-04 14:11                   ` Peter Waller
  2014-08-04 14:26                     ` Austin S Hemmelgarn
  0 siblings, 1 reply; 44+ messages in thread
From: Peter Waller @ 2014-08-04 14:11 UTC (permalink / raw)
  To: Austin S Hemmelgarn; +Cc: linux-btrfs

On 4 August 2014 15:02, Austin S Hemmelgarn <ahferroin7@gmail.com> wrote:
> I really disagree with the statement that adding more storage is
> difficult or expensive, all you need to do is plug in a 2G USB flash
> drive, or allocate a ramdisk, and add the device to the filesystem only
> long enough to do a full balance.

What if the machine is a server in a datacenter you don't have
physical access to and the problem is an emergency preventing your
users from being able to get work done?

What happens if you use a RAM disk and there is a power failure?

Thanks for the other explanations and advice also,

- Peter


* Re: ENOSPC with mkdir and rename
  2014-08-04 14:11                   ` Peter Waller
@ 2014-08-04 14:26                     ` Austin S Hemmelgarn
  0 siblings, 0 replies; 44+ messages in thread
From: Austin S Hemmelgarn @ 2014-08-04 14:26 UTC (permalink / raw)
  To: Peter Waller; +Cc: linux-btrfs


On 2014-08-04 10:11, Peter Waller wrote:
> On 4 August 2014 15:02, Austin S Hemmelgarn <ahferroin7@gmail.com> wrote:
>> I really disagree with the statement that adding more storage is
>> difficult or expensive, all you need to do is plug in a 2G USB flash
>> drive, or allocate a ramdisk, and add the device to the filesystem only
>> long enough to do a full balance.
> 
> What if the machine is a server in a datacenter you don't have
> physical access to and the problem is an emergency preventing your
> users from being able to get work done?
> 
> What happens if you use a RAM disk and there is a power failure?
> 
I'm not saying that either option is a perfect solution.  In fact, the
only reason that I even mentioned the ramdisk is because I have had good
success with that on my laptop, but then laptops essentially have a
built-in UPS.  I personally wouldn't use a ramdisk except as a last
resort if you don't have some sort of UPS or redundancy in the PSU.




* Re: ENOSPC with mkdir and rename
  2014-08-04 13:17               ` Peter Waller
  2014-08-04 13:35                 ` Hugo Mills
  2014-08-04 14:02                 ` Austin S Hemmelgarn
@ 2014-08-04 14:47                 ` Russell Coker
  2014-08-04 15:19                   ` Mitch Harder
  2 siblings, 1 reply; 44+ messages in thread
From: Russell Coker @ 2014-08-04 14:47 UTC (permalink / raw)
  To: Peter Waller; +Cc: linux-btrfs

On Mon, 4 Aug 2014 14:17:02 Peter Waller wrote:
> For anyone else having this problem, this article is fairly useful for
> understanding disk full problems and rebalance:
> 
> http://marc.merlins.org/perso/btrfs/post_2014-05-04_Fixing-Btrfs-Filesystem-Full-Problems.html
> 
> It actually covers the problem that I had, which is that a rebalance
> can't take place because it is full.
> 
> I still am unsure what is really wrong with this whole situation. Is
> it that I wasn't careful to do a rebalance when I should have been
> doing? Is it that BTRFS doesn't do a rebalance automatically when it
> could in principle?

Yes and yes.  The fact that BTRFS can't avoid getting into such situations and 
can't recover when it does are both bugs in BTRFS.  The fact that you didn't 
run a balance to prevent this is due to not being careful enough with a 
filesystem that's still in a development stage.

> It's pretty bad to end up in a situation (with spare space) where the
> only way out is to add more storage, which may be impractical,
> difficult or expensive.

Absolutely.

> I conclude that now since I have added more storage, the rebalance
> won't fail and if I keep rebalancing from a cron job I won't hit this
> problem again

Yes.

> (unless the filesystem fills up very fast! what then?).
> I don't know however what value to assign to `-dusage` in general for
> the cron rebalance. Any hints?

If you regularly run a scrub with options such as "-dusage=50 -musage=10" then 
the amount of free space in metadata chunks will tend to be a lot greater than 
that in data chunks.

Another option I've considered is to write a program that creates millions of 
files with 1000 byte random file names.  After creating a filesystem I could 
run that program to cause a sufficient number of metadata chunks to be 
allocated and then remove the subvol containing all those files (which 
incidentally is a lot faster than "rm -rf").

Another thing I've considered is making a filesystem for a file server with a 
RAID-1 array of SSDs and running the above program to allocate all chunks for 
metadata.  Then when the SSDs are totally assigned to metadata I would add a 
pair of SATA disks for data.  A filesystem with all metadata on SSD and all 
data on SATA disks should give great performance as well as having lots of 
space.

-- 
My Main Blog         http://etbe.coker.com.au/
My Documents Blog    http://doc.coker.com.au/



* Re: ENOSPC with mkdir and rename
  2014-08-04 14:47                 ` Russell Coker
@ 2014-08-04 15:19                   ` Mitch Harder
  0 siblings, 0 replies; 44+ messages in thread
From: Mitch Harder @ 2014-08-04 15:19 UTC (permalink / raw)
  To: russell; +Cc: Peter Waller, linux-btrfs

On Mon, Aug 4, 2014 at 9:47 AM, Russell Coker <russell@coker.com.au> wrote:
> If you regularly run a scrub with options such as "-dusage=50 -musage=10" then
> the amount of free space in metadata chunks will tend to be a lot greater than
> that in data chunks.
>

Just to clarify for posterity, I'm pretty sure you meant 'balance'
with "-dusage=50 -musage=10" instead of 'scrub'.


* Re: ENOSPC with mkdir and rename
  2014-08-04 10:31           ` Peter Waller
  2014-08-04 10:39             ` Hugo Mills
@ 2014-08-04 17:09             ` Austin S Hemmelgarn
  2014-08-05  8:20               ` Duncan
  1 sibling, 1 reply; 44+ messages in thread
From: Austin S Hemmelgarn @ 2014-08-04 17:09 UTC (permalink / raw)
  To: Peter Waller; +Cc: Hugo Mills, Chris Samuel, linux-btrfs


On 2014-08-04 06:31, Peter Waller wrote:
> Thanks Hugo, this is the most informative e-mail yet! (more inline)
> 
> On 4 August 2014 11:22, Hugo Mills <hugo@carfax.org.uk> wrote:
>>
>>  * btrfs fi show
>>     - look at the total and used values. If used < total, you're OK.
>>       If used == total, then you could potentially hit ENOSPC.
> 
> Another thing which is unclear and undocumented anywhere I can find is
> what the meaning of `btrfs fi show` is.
> 
> I'm sure it is totally obvious if you are a developer or if you have
> used it for long enough. But it isn't covered in the manpage, nor in
> the oracle documentation, nor anywhere on the wiki that I could find.
> 
You didn't look very hard then, because there is information in the
manpage (oh wait, you mentioned Oracle, you're probably using RHEL or
CentOS, which are the last thing you should be using if you want to use
stuff like BTRFS that is under heavy development), and it is documented
on the wiki.
> When I looked at it in my problematic situation, it said "500 GiB /
> 500 GiB". That sounded fine to me because I interpreted the output as
> what fraction of which RAID devices BTRFS was using. In other words, I
> thought "Oh, BTRFS will just make use of the whole device that's
> available to it.". I thought that `btrfs fi df` was the source of
> information for how much space was free inside of that.
> 
>>  * btrfs fi df
>>     - look at metadata used vs total. If these are close to zero (on
>>       3.15+) or close to 512 MiB (on <3.15), then you are in danger of
>>       ENOSPC.
> 
> Hmm. It's unfortunate that this could indicate an amount of space
> which is free when it actually isn't.
That depends on what you mean by 'free'.
> 
>>     - look at data used vs total. If the used is much smaller than
>>       total, you can reclaim some of the allocation with a filtered
>>       balance (btrfs balance start -dusage=5), which will then give
>>       you unallocated space again (see the btrfs fi show test).
> 
> So the filtered balance didn't help in my situation. I understand it's
> something to do with the "5" parameter. But I do not understand what
> the impact of changing this parameter is. It is something to do with a
> fraction of something, but those things are still not present in my
> mental model despite a large amount of reading. Is there an
> illustration which could clear this up?
> 
Think of each chunk like a box, and each block as a block, and that you
have two different types of block (data and metadata) and two different
types of box (also data and metadata). The data boxes are four times the
size of the metadata boxes, and they all have to fit in one really big
container (the device itself).  You can only put data blocks in the data
boxes, and you can only put metadata blocks in metadata boxes.  Say that
in total, you can fit 128 data boxes in the large container, or you can
replace one data box with up to four metadata boxes.  Even though you
may only have a few blocks in a given box, the box still takes up the
same amount of space in the larger container.  Thus, it's possible to
have only a few blocks stored, but not be able to add any more boxes to
the larger container.  A balance operation is essentially the equivalent
of taking all of the blocks of a given type, and fitting them into the
smallest number of boxes possible.
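
As a rough illustration of the analogy above, here is a minimal Python
sketch.  It is a toy model, not how btrfs is implemented: the function
name `balance`, the list-of-fill-levels representation and the example
numbers are all made up for illustration.  The `usage_percent` argument
is meant to mimic the number in `-dusage=5`: only boxes filled below
that percentage are emptied and repacked, and everything else is left
alone.

```python
def balance(boxes, capacity, usage_percent=100):
    """Toy model of a filtered balance (an illustration, not btrfs code).

    boxes:         number of blocks currently stored in each box (chunk).
    capacity:      how many blocks fit into one box.
    usage_percent: only boxes filled below this percentage are repacked.
    Returns the fill levels of the boxes that exist afterwards.
    """
    # Boxes at or above the threshold are not touched at all.
    kept = [used for used in boxes if used * 100 >= usage_percent * capacity]
    # Blocks from the remaining boxes are taken out...
    loose = sum(used for used in boxes if used * 100 < usage_percent * capacity)
    # ...and packed into as few fresh boxes as possible.
    full, remainder = divmod(loose, capacity)
    repacked = [capacity] * full + ([remainder] if remainder else [])
    return kept + repacked


before = [3, 4, 90, 2, 100]   # five boxes, three of them nearly empty
after = balance(before, capacity=100, usage_percent=5)
print(len(before), "boxes before,", len(after), "boxes after:", after)
```

In this example the three nearly empty boxes are merged into one, so
two boxes' worth of space goes back to the big container.  With a
threshold of 3 instead of 5, only the box holding 2 blocks qualifies
and the number of boxes does not change, which is one way a filtered
balance can run without freeing anything.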
> Among other things I also got the kernel stack trace I pasted at the
> bottom of the first e-mail to this thread when I did the rebalance.
> 
>>    This FAQ entry is pretty horrible, I'm afraid. I actually started
>> rewriting it here to try to make it clearer what's going on. I'll try
>> to work on it a bit more this week and put out a better version for
>> the wiki.
> 
> This is great to hear! :)
> 
> Thanks for your response Hugo, that really cleared up a lot of mental
> model problems. I hope the documentation can be improved so that
> others can learn from my mistakes.
> --
> To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 




^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: ENOSPC with mkdir and rename
  2014-08-04 10:59           ` Peter Waller
@ 2014-08-04 21:27             ` Chris Samuel
  0 siblings, 0 replies; 44+ messages in thread
From: Chris Samuel @ 2014-08-04 21:27 UTC (permalink / raw)
  To: linux-btrfs

Hi Peter,

On Mon, 4 Aug 2014 11:59:19 AM Peter Waller wrote:

> On 4 August 2014 11:50, Chris Samuel <chris@csamuel.org> wrote:
>
> > To be honest I'm not sure I'd suggest btrfs for production use at all at
> > present, it's only recently been unmarked as experimental and to be honest
> > I feel that was premature.
> 
> Thanks for the honest answer.

That's OK, I am enthusiastic about btrfs (being a pre-mainline merge user), 
but I don't think it serves it well to signal that it's more ready than it 
actually is.

> There are very positive signals out there which I had perhaps taken
> too literally. I'd love to see it become ready, there are a lot of things
> about BTRFS which appeal greatly. So I hope I'm helping by trying
> to make it clear the problems that I encountered.

Oh indeed, please don't take my brief replies as being anything other than 
brief due to preparing to do some astronomy! :-)

All the best,
Chris
-- 
 Chris Samuel  :  http://www.csamuel.org/  :  Melbourne, VIC



^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: ENOSPC with mkdir and rename
  2014-08-04 10:24         ` Chris Samuel
@ 2014-08-05  8:06           ` Duncan
  2014-08-05 12:20             ` Russell Coker
  0 siblings, 1 reply; 44+ messages in thread
From: Duncan @ 2014-08-05  8:06 UTC (permalink / raw)
  To: linux-btrfs

Chris Samuel posted on Mon, 04 Aug 2014 20:24:46 +1000 as excerpted:

> On Mon, 4 Aug 2014 11:56:46 AM Clemens Eisserer wrote:
> 
>> Which doesn't protect the *average* user from running into issues like
>> this.
> 
> No, but they need to be aware of it.

Actually, an ordinary user/admin /should/ have no more need to be aware 
of it than they do on any other filesystem.  Since that issue doesn't 
occur on ext* or reiserfs, to pick two examples I'm familiar with, they 
shouldn't need to worry about it on btrfs either.  But then, just such an 
"ordinary admin" shouldn't yet be running btrfs on their system, as it's 
simply not to that point of readiness and maturity yet.

Which is why I'm not particularly happy with seeing all the "btrfs is 
still not stable, use at your own risk" warnings disappearing.  With them 
there, people who chose to run btrfs /could/ be expected to have done 
their research and have btrfs specific knowledge such as this, because 
btrfs was clearly marked as /not/ ready for "ordinary users" not prepared 
to do such research on their own.

But now that those warnings are all being removed, btrfs should "just 
work" for all those "ordinary users".

But it doesn't.  Btrfs is still special and requires btrfs-domain 
specific knowledge to properly administer, as the fixes that would remove 
that requirement, in this case perhaps a background thread that would 
check for data/metadata imbalance and at least log a warning suggesting a 
rebalance, if not triggering that rebalance on its own, simply aren't 
there yet.

IMO, without those fixes, btrfs is still experimental, or at least not 
entirely stable yet and requiring btrfs-domain-specific knowledge, and 
should keep the warnings saying exactly that.  Unfortunately...

-- 
Duncan - List replies preferred.   No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master."  Richard Stallman


^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: ENOSPC with mkdir and rename
  2014-08-04 17:09             ` Austin S Hemmelgarn
@ 2014-08-05  8:20               ` Duncan
  2014-08-05 11:31                 ` Austin S Hemmelgarn
  0 siblings, 1 reply; 44+ messages in thread
From: Duncan @ 2014-08-05  8:20 UTC (permalink / raw)
  To: linux-btrfs

Austin S Hemmelgarn posted on Mon, 04 Aug 2014 13:09:23 -0400 as
excerpted:

> Think of each chunk like a box, and each block as a block, and that you
> have two different types of block (data and metadata) and two different
> types of box (also data and metadata). The data boxes are four times the
> size of the metadata boxes, and they all have to fit in one really big
> container (the device itself).  You can only put data blocks in the data
> boxes, and you can only put metadata blocks in metadata boxes.  Say that
> in total, you can fit 128 data boxes in the large container, or you can
> replace one data box with up to four metadata boxes.  Even though you
> may only have a few blocks in a given box, the box still takes up the
> same amount of space in the larger container.  Thus, it's possible to
> have only a few blocks stored, but not be able to add any more boxes to
> the larger container.  A balance operation is essentially the equivalent
> of taking all of the blocks of a given type, and fitting them into the
> smallest number of boxes possible.

FWIW, that's a great analogy to stick up on the wiki somewhere, probably 
somewhere in the FAQ related to ENOSPC.  Please consider doing so.

(Someone took one of my explanations from the list and stuck it in the 
wiki, virtually word-for-word, with a link to the list post in the 
archives for more.  I was glad, as for some reason I just seem to work 
best on the lists, and seem to treat web pages as read-only, even if 
they're on a wiki I in theory have or can get write-privs on.  I'm 
suggesting someone, doesn't have to be you tho great if it is, do the 
same with this.)

-- 
Duncan - List replies preferred.   No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master."  Richard Stallman


^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: ENOSPC with mkdir and rename
  2014-08-04  8:14   ` Peter Waller
  2014-08-04  9:22     ` Clemens Eisserer
  2014-08-04  9:39     ` Chris Samuel
@ 2014-08-05  8:51     ` Qu Wenruo
  2014-08-05 12:17       ` Russell Coker
  2 siblings, 1 reply; 44+ messages in thread
From: Qu Wenruo @ 2014-08-05  8:51 UTC (permalink / raw)
  To: Peter Waller; +Cc: linux-btrfs


-------- Original Message --------
Subject: Re: ENOSPC with mkdir and rename
From: Peter Waller <peter@scraperwiki.com>
To: Qu Wenruo <quwenruo@cn.fujitsu.com>
Date: 2014-08-04 16:14
> Thanks for responses.
>
> All of this is *very* surprising. I'm not new to BTRFS, I've been
> using it on my own machines for multiple years. I didn't realise there
> was an un-holstered footgun on my lap at this point. How can it be
> made clear how to avoid the ENOSPC problem to myself and other
> sysadmins? Or preferably not exist as a problem?
[snip]

In fact such a "defect" (or whatever) is not really a btrfs-only problem.
In ext*, there is similar behavior: ext* has an upper limit on the
number of inodes, which is fixed at mkfs time.
(When you run mkfs.ext*, you are shown the upper limit of inodes.)
However, other metadata in ext* is stored together with data, so there
is no ENOSPC problem like the one in btrfs.

Btrfs only makes ENOSPC more likely by completely splitting data and
metadata, and by reserving extra space for metadata.

If you like the ext* way, as already mentioned you can mkfs.btrfs with
the -M flag.

But IMO, some tuning of the btrfs chunk allocation algorithm may help.
For example, say we have a 20G disk, and 14G of space is allocated to
data/metadata chunks.
In that situation, if btrfs needs a new data chunk, it will allocate
up to 10% of the disk, which is 2G.
But when it comes to metadata, it will only allocate a metadata chunk
of up to 256M.
This makes it very easy for all of the remaining space to end up
allocated to data chunks.
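
To make the arithmetic in the example above concrete, here is a small
Python sketch.  It is only a toy simulation under the numbers given in
this message (20G disk, 14G already allocated, data chunks of up to 10%
of the disk, metadata chunks of up to 256M); the function name
`simulate` and the alternating request pattern are assumptions made for
illustration and do not reflect the real allocator.

```python
MIB_PER_GIB = 1024


def simulate(disk_mib, allocated_mib, requests, meta_chunk_mib=256):
    """Hand out the unallocated space chunk by chunk (toy model).

    requests: sequence of "data" / "metadata" chunk requests, in order.
    Returns how many MiB of the unallocated space each type received.
    """
    data_chunk_mib = disk_mib // 10          # "up to 10% of disk"
    unallocated = disk_mib - allocated_mib
    handed_out = {"data": 0, "metadata": 0}
    for kind in requests:
        if unallocated == 0:
            break                            # nothing left: next request is ENOSPC
        wanted = data_chunk_mib if kind == "data" else meta_chunk_mib
        granted = min(wanted, unallocated)   # last chunk may be smaller
        handed_out[kind] += granted
        unallocated -= granted
    return handed_out


# Data and metadata ask for a new chunk equally often.
requests = ["data", "metadata"] * 10
print(simulate(20 * MIB_PER_GIB, 14 * MIB_PER_GIB, requests))
```

Even with data and metadata requesting chunks equally often, data ends
up with 5632 MiB of the remaining 6 GiB and metadata with only 512 MiB,
after which no unallocated space is left for either.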

But if btrfs used the free space more carefully when space is running
short, metadata and data usage should be better balanced and ENOSPC
should occur less often.

If nobody dislikes the idea, I'd like to try to implement this later.

Thanks,
Qu


^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: ENOSPC with mkdir and rename
  2014-08-05  8:20               ` Duncan
@ 2014-08-05 11:31                 ` Austin S Hemmelgarn
  0 siblings, 0 replies; 44+ messages in thread
From: Austin S Hemmelgarn @ 2014-08-05 11:31 UTC (permalink / raw)
  To: Duncan, linux-btrfs

On 2014-08-05 04:20, Duncan wrote:
> Austin S Hemmelgarn posted on Mon, 04 Aug 2014 13:09:23 -0400 as
> excerpted:
> 
>> Think of each chunk like a box, and each block as a block, and that you
>> have two different types of block (data and metadata) and two different
>> types of box (also data and metadata). The data boxes are four times the
>> size of the metadata boxes, and they all have to fit in one really big
>> container (the device itself).  You can only put data blocks in the data
>> boxes, and you can only put metadata blocks in metadata boxes.  Say that
>> in total, you can fit 128 data boxes in the large container, or you can
>> replace one data box with up to four metadata boxes.  Even though you
>> may only have a few blocks in a given box, the box still takes up the
>> same amount of space in the larger container.  Thus, it's possible to
>> have only a few blocks stored, but not be able to add any more boxes to
>> the larger container.  A balance operation is essentially the equivalent
>> of taking all of the blocks of a given type, and fitting them into the
>> smallest number of boxes possible.
> 
> FWIW, that's a great analogy to stick up on the wiki somewhere, probably 
> somewhere in the FAQ related to ENOSPC.  Please consider doing so.
> 
> (Someone took one of my explanations from the list and stuck it in the 
> wiki, virtually word-for-word, with a link to the list post in the 
> archives for more.  I was glad, as for some reason I just seem to work 
> best on the lists, and seem to treat web pages as read-only, even if 
> they're on a wiki I in theory have or can get write-privs on.  I'm 
> suggesting someone, doesn't have to be you tho great if it is, do the 
> same with this.)
> 
I would love to have it up on the wiki, but don't have an account or
write privileges.  FWIW, I consider anything I post on a mailing list
that isn't marked otherwise (except patches) to be public domain, so
everyone feel free to use it however you want.



^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: ENOSPC with mkdir and rename
  2014-08-05  8:51     ` Qu Wenruo
@ 2014-08-05 12:17       ` Russell Coker
  0 siblings, 0 replies; 44+ messages in thread
From: Russell Coker @ 2014-08-05 12:17 UTC (permalink / raw)
  To: Qu Wenruo, linux-btrfs

On Tue, 5 Aug 2014 16:51:44 Qu Wenruo wrote:
> In fact such a "defect" (or whatever) is not really a btrfs-only problem.
> In ext*, there is similar behavior: ext* has an upper limit on the
> number of inodes, which is fixed at mkfs time.
> (When you run mkfs.ext*, you are shown the upper limit of inodes.)
> However, other metadata in ext* is stored together with data, so there
> is no ENOSPC problem like the one in btrfs.

There is a huge difference between BTRFS and Ext* in this regard.

The way that Ext* has always worked is that if you delete one file, pipe or 
socket that isn't hard-linked, or one sym-link or directory then you free up 1 
Inode.  1 free Inode allows you to create 1 file, pipe, socket, sym-link, or 
directory.

Deleting a file or directory on BTRFS takes MORE metadata space (at least 
temporarily) because it  writes a new copy of the tree.  So not only will 
deleting files not immediately solve a lack of metadata space on BTRFS but it 
might even make things worse.

-- 
My Main Blog         http://etbe.coker.com.au/
My Documents Blog    http://doc.coker.com.au/


^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: ENOSPC with mkdir and rename
  2014-08-05  8:06           ` Duncan
@ 2014-08-05 12:20             ` Russell Coker
  2014-08-05 12:58               ` Clemens Eisserer
                                 ` (3 more replies)
  0 siblings, 4 replies; 44+ messages in thread
From: Russell Coker @ 2014-08-05 12:20 UTC (permalink / raw)
  To: Duncan; +Cc: linux-btrfs

On Tue, 5 Aug 2014 08:06:12 Duncan wrote:
> Which is why I'm not particularly happy with seeing all the "btrfs is 
> still not stable, use at your own risk" warnings disappearing.  With them 
> there, people who chose to run btrfs /could/ be expected to have done 
> their research and have btrfs specific knowledge such as this, because 
> btrfs was clearly marked as /not/ ready for "ordinary users" not prepared 
> to do such research on their own.
> 
> But now that those warnings are all being removed, btrfs should "just 
> work" for all those "ordinary users".
> 
> But it doesn't.  Btrfs is still special and requires btrfs-domain 
> specific knowledge to properly administer, as the fixes that would remove 
> that requirement, in this case perhaps a background thread that would 
> check for data/metadata imbalance and at least log a warning suggesting a 
> rebalance, if not triggering that rebalance on its own, simply aren't 
> there yet.

Currently the Debian/Jessie freeze is approaching.  The Debian kernel team 
have chosen 3.16 and don't have any plans for significant back-ports from 
later kernels.

Based on what I've read on this list it seems that BTRFS is less stable in 
3.15 than in 3.14.  Even 3.14 isn't something I'd recommend to random people 
who want something to just work.

The Debian installer has BTRFS in a list of filesystems to choose with no 
special notice about it.  I'm thinking of filing a Debian bug requesting that 
they put a warning against it.

What do people here think?

-- 
My Main Blog         http://etbe.coker.com.au/
My Documents Blog    http://doc.coker.com.au/


^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: ENOSPC with mkdir and rename
  2014-08-05 12:20             ` Russell Coker
@ 2014-08-05 12:58               ` Clemens Eisserer
  2014-08-05 13:02                 ` Peter Waller
  2014-08-10 17:21                 ` Martin Steigerwald
  2014-08-05 13:36               ` Chris Samuel
                                 ` (2 subsequent siblings)
  3 siblings, 2 replies; 44+ messages in thread
From: Clemens Eisserer @ 2014-08-05 12:58 UTC (permalink / raw)
  To: linux-btrfs

Hi Russell,

> The Debian installer has BTRFS in a list of filesystems to choose with no
> special notice about it.  I'm thinking of filing a Debian bug requesting that
> they put a warning against it.

As long as it is not selected as the default filesystem, I think it is fine.
Other distributions have been offering btrfs for some time now, too.

Regards, Clemens

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: ENOSPC with mkdir and rename
  2014-08-05 12:58               ` Clemens Eisserer
@ 2014-08-05 13:02                 ` Peter Waller
  2014-08-10 17:21                 ` Martin Steigerwald
  1 sibling, 0 replies; 44+ messages in thread
From: Peter Waller @ 2014-08-05 13:02 UTC (permalink / raw)
  To: Clemens Eisserer; +Cc: linux-btrfs

On 5 August 2014 13:58, Clemens Eisserer <linuxhippy@gmail.com> wrote:
> As long as it is not selected as the default filesystem, I think it is fine.
> Other distributions have been offering btrfs for some time now, too.

How do you warn non-BTRFS-developers in this case that they need to
run a regular rebalance or they may end up in a
difficult/expensive/impossible to fix ENOSPC condition at an
inconvenient moment?

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: ENOSPC with mkdir and rename
  2014-08-05 12:20             ` Russell Coker
  2014-08-05 12:58               ` Clemens Eisserer
@ 2014-08-05 13:36               ` Chris Samuel
  2014-08-06  0:04               ` Duncan
  2014-08-06  0:38               ` ronnie sahlberg
  3 siblings, 0 replies; 44+ messages in thread
From: Chris Samuel @ 2014-08-05 13:36 UTC (permalink / raw)
  To: linux-btrfs

On Tue, 5 Aug 2014 10:20:33 PM Russell Coker wrote:

> The Debian installer has BTRFS in a list of filesystems to choose with no 
> special notice about it.  I'm thinking of filing a Debian bug requesting
> that  they put a warning against it.

I think it's a good plan.  People should be aware of the risks they are 
running.

cheers,
Chris
-- 
 Chris Samuel  :  http://www.csamuel.org/  :  Melbourne, VIC



^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: ENOSPC with mkdir and rename
  2014-08-05 12:20             ` Russell Coker
  2014-08-05 12:58               ` Clemens Eisserer
  2014-08-05 13:36               ` Chris Samuel
@ 2014-08-06  0:04               ` Duncan
  2014-08-06  0:38               ` ronnie sahlberg
  3 siblings, 0 replies; 44+ messages in thread
From: Duncan @ 2014-08-06  0:04 UTC (permalink / raw)
  To: linux-btrfs

Russell Coker posted on Tue, 05 Aug 2014 22:20:33 +1000 as excerpted:

> The Debian installer has BTRFS in a list of filesystems to choose with
> no special notice about it.  I'm thinking of filing a Debian bug
> requesting that they put a warning against it.
> 
> What do people here think?

You already have my general feeling, a warning is still appropriate.

For Debian, I believe it's fair to characterize people running stable as 
relatively conservative.  As such, a warning may be appropriate, but if 
they're actually /that/ conservative, is it needed, or will user's 
natural inclinations to filesystem conservatism be enough, and a btrfs 
warning thus look more serious than it is?

I'd say warn, unless that warning /will/ be seen as "eat your babies" 
level, even when it's more "appropriate care and a regular backup program 
recommended."

-- 
Duncan - List replies preferred.   No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master."  Richard Stallman


^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: ENOSPC with mkdir and rename
  2014-08-05 12:20             ` Russell Coker
                                 ` (2 preceding siblings ...)
  2014-08-06  0:04               ` Duncan
@ 2014-08-06  0:38               ` ronnie sahlberg
  2014-08-06  1:18                 ` Nick Krause
  3 siblings, 1 reply; 44+ messages in thread
From: ronnie sahlberg @ 2014-08-06  0:38 UTC (permalink / raw)
  To: russell; +Cc: Duncan, Btrfs BTRFS

On Tue, Aug 5, 2014 at 5:20 AM, Russell Coker <russell@coker.com.au> wrote:

>
> Based on what I've read on this list it seems that BTRFS is less stable in
> 3.15 than in 3.14.  Even 3.14 isn't something I'd recommend to random people
> who want something to just work.
>
> The Debian installer has BTRFS in a list of filesystems to choose with no
> special notice about it.  I'm thinking of filing a Debian bug requesting that
> they put a warning against it.
>
> What do people here think?

+1 for a warning.

btrfs is still a young filesystem and not as stable as say ext4.
I think it would be very prudent to have a small warning.

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: ENOSPC with mkdir and rename
  2014-08-06  0:38               ` ronnie sahlberg
@ 2014-08-06  1:18                 ` Nick Krause
  0 siblings, 0 replies; 44+ messages in thread
From: Nick Krause @ 2014-08-06  1:18 UTC (permalink / raw)
  To: ronnie sahlberg; +Cc: Russell Coker, Duncan, Btrfs BTRFS

On Tue, Aug 5, 2014 at 8:38 PM, ronnie sahlberg
<ronniesahlberg@gmail.com> wrote:
> On Tue, Aug 5, 2014 at 5:20 AM, Russell Coker <russell@coker.com.au> wrote:
>
>>
>> Based on what I've read on this list it seems that BTRFS is less stable in
>> 3.15 than in 3.14.  Even 3.14 isn't something I'd recommend to random people
>> who want something to just work.
>>
>> The Debian installer has BTRFS in a list of filesystems to choose with no
>> special notice about it.  I'm thinking of filing a Debian bug requesting that
>> they put a warning against it.
>>
>> What do people here think?
>
> +1 for a warning.
>
> btrfs is still a young filesystem and not as stable as say ext4.
> I think it would be very prudent to have a small warning.
I agree here and feel this is very important. +1
Nick

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: ENOSPC with mkdir and rename
  2014-08-05 12:58               ` Clemens Eisserer
  2014-08-05 13:02                 ` Peter Waller
@ 2014-08-10 17:21                 ` Martin Steigerwald
  1 sibling, 0 replies; 44+ messages in thread
From: Martin Steigerwald @ 2014-08-10 17:21 UTC (permalink / raw)
  To: Clemens Eisserer; +Cc: linux-btrfs

Am Dienstag, 5. August 2014, 14:58:34 schrieb Clemens Eisserer:
> Hi Russell,
> 
> > The Debian installer has BTRFS in a list of filesystems to choose with no
> > special notice about it.  I'm thinking of filing a Debian bug requesting
> > that they put a warning against it.
> 
> As long as it is not selected as the default filesystem, I think it is fine.
> Other distributions have been offering btrfs for some time now, too.

For example SLES 11 SP 2. I had a Linux training VM image that I was using 
to develop some slides about implementing an OpenLDAP server in SLES, and on 
the next day it was totally broken:

- no space left on device
- snapper created tons of snapshots
- yet in df -h still 2 GB free
- rm on a logfile returned no space left on device
- btrfs subvol delete returned no space left on device
- I think I also tried btrfs balance, which also returned no space left on 
device, but I am not 100% sure


At that time I just created a snapshot of the broken state and returned to a 
previous snapshot to have the VM fixed.

And note: This is on a distro that has enterprise support for using BTRFS on 
root filesystem – while using a 3.0 kernel, with hopefully some… but apparently 
not enough backports.

Granted, it would still be nice to add a warning to Debian.

Better still would be just to have stability fixes for the hangs go into 3.16-
stable and thus also into Debian Jessie.

-- 
Martin 'Helios' Steigerwald - http://www.Lichtvoll.de
GPG: 03B0 0D6C 0040 0710 4AFA  B82F 991B EAAC A599 84C7

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: ENOSPC with mkdir and rename
  2014-08-04 10:09       ` Peter Waller
  2014-08-04 10:22         ` Hugo Mills
  2014-08-04 10:50         ` Chris Samuel
@ 2014-08-10 17:26         ` Martin Steigerwald
  2 siblings, 0 replies; 44+ messages in thread
From: Martin Steigerwald @ 2014-08-10 17:26 UTC (permalink / raw)
  To: Peter Waller; +Cc: Chris Samuel, linux-btrfs

Am Montag, 4. August 2014, 11:09:23 schrieb Peter Waller:
> On 4 August 2014 10:39, Chris Samuel <chris@csamuel.org> wrote:
> > On Mon, 4 Aug 2014 09:14:19 AM Peter Waller wrote:
> >> All of this is *very* surprising.
> > 
> > Hmm, it shouldn't be, the ENOSPC issues are well known and have been
> > discussed here for years.
> 
> I accept that. It's all very well if you read the BTRFS list and/or
> are a BTRFS developer. But if you're trying to work it out in the heat
> of battle, as we have sysadmins who would have to, there is a
> combination of things here that makes it unreasonable and harmful for
> production.

Well, maybe, just maybe… BTRFS is not yet ready for production use.

I installed it on a server recently, my own VM. And I expect that I may 
need to fix up things there. And I did it with a *huge* free space margin. And 
I am still running Debian backport kernel 3.14. I won't change it to 3.16 
until I have seen that it runs nicely on my laptop again.

Test it on non critical servers – yes. Use it on critical production servers? 
My answer is a no for this. Despite partial support in SLES 11 SP 2 and 
support (partial?) for it in Oracle Unbreakable Linux.

BTRFS is just not yet there from my current experiences with it.

One thing that will usually keep you covered: make the filesystem five times 
as large as the data you put on it, and try to monitor for situations where 
you had better rebalance in advance.
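
As one possible starting point for such monitoring, here is a minimal
Python sketch that parses `btrfs fi df` output in the format pasted in
the first mail of this thread and reports how much allocated metadata
space is still unused.  The function names, the regular expression and
the idea of feeding it captured text are assumptions for illustration;
it does not look at unallocated space, so it covers only the
`btrfs fi df` half of the checks discussed earlier in the thread, and
any alert threshold is left to the caller.

```python
import re

UNIT_BYTES = {"": 1, "B": 1, "KiB": 1024, "MiB": 1024**2,
              "GiB": 1024**3, "TiB": 1024**4, "PiB": 1024**5}

# Matches lines such as "Metadata, DUP: total=5.00GiB, used=4.50GiB".
LINE_RE = re.compile(
    r"^\s*(?P<kind>\w+), (?P<profile>\w+): "
    r"total=(?P<total>[\d.]+)(?P<total_unit>[A-Za-z]*), "
    r"used=(?P<used>[\d.]+)(?P<used_unit>[A-Za-z]*)\s*$"
)


def parse_fi_df(text):
    """Return a list of (kind, profile, total_bytes, used_bytes) tuples."""
    rows = []
    for line in text.splitlines():
        m = LINE_RE.match(line)
        if m:
            total = float(m["total"]) * UNIT_BYTES[m["total_unit"]]
            used = float(m["used"]) * UNIT_BYTES[m["used_unit"]]
            rows.append((m["kind"], m["profile"], total, used))
    return rows


def metadata_slack(text):
    """Bytes of allocated-but-unused metadata space, summed over profiles."""
    return sum(total - used
               for kind, _profile, total, used in parse_fi_df(text)
               if kind == "Metadata")


# Output quoted from the first message of this thread.
SAMPLE = """\
Data, single: total=489.97GiB, used=427.75GiB
System, DUP: total=8.00MiB, used=60.00KiB
System, single: total=4.00MiB, used=0.00
Metadata, DUP: total=5.00GiB, used=4.50GiB
Metadata, single: total=8.00MiB, used=0.00
unknown, single: total=512.00MiB, used=820.00KiB
"""

print(metadata_slack(SAMPLE) / 1024**2, "MiB of metadata slack")
```

For the output from the original report this comes to 520 MiB, which
sits right at the roughly 512 MiB danger mark mentioned earlier in the
thread for kernels before 3.15.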

-- 
Martin 'Helios' Steigerwald - http://www.Lichtvoll.de
GPG: 03B0 0D6C 0040 0710 4AFA  B82F 991B EAAC A599 84C7

^ permalink raw reply	[flat|nested] 44+ messages in thread

end of thread, other threads:[~2014-08-10 17:26 UTC | newest]

Thread overview: 44+ messages
2014-08-02 23:35 ENOSPC with mkdir and rename Peter Waller
2014-08-03  0:28 ` Mitch Harder
2014-08-03  1:52   ` Nick Krause
2014-08-03  2:39 ` Russell Coker
2014-08-03  2:59   ` Nick Krause
2014-08-04  1:38 ` Qu Wenruo
2014-08-04  8:14   ` Peter Waller
2014-08-04  9:22     ` Clemens Eisserer
2014-08-04  9:39     ` Chris Samuel
2014-08-04  9:56       ` Clemens Eisserer
2014-08-04 10:24         ` Chris Samuel
2014-08-05  8:06           ` Duncan
2014-08-05 12:20             ` Russell Coker
2014-08-05 12:58               ` Clemens Eisserer
2014-08-05 13:02                 ` Peter Waller
2014-08-10 17:21                 ` Martin Steigerwald
2014-08-05 13:36               ` Chris Samuel
2014-08-06  0:04               ` Duncan
2014-08-06  0:38               ` ronnie sahlberg
2014-08-06  1:18                 ` Nick Krause
2014-08-04 10:09       ` Peter Waller
2014-08-04 10:22         ` Hugo Mills
2014-08-04 10:31           ` Peter Waller
2014-08-04 10:39             ` Hugo Mills
2014-08-04 10:48               ` Peter Waller
2014-08-04 11:29                 ` Hugo Mills
2014-08-04 17:09             ` Austin S Hemmelgarn
2014-08-05  8:20               ` Duncan
2014-08-05 11:31                 ` Austin S Hemmelgarn
2014-08-04 11:04           ` Clemens Eisserer
2014-08-04 11:32             ` Hugo Mills
2014-08-04 13:17               ` Peter Waller
2014-08-04 13:35                 ` Hugo Mills
2014-08-04 14:02                 ` Austin S Hemmelgarn
2014-08-04 14:11                   ` Peter Waller
2014-08-04 14:26                     ` Austin S Hemmelgarn
2014-08-04 14:47                 ` Russell Coker
2014-08-04 15:19                   ` Mitch Harder
2014-08-04 10:50         ` Chris Samuel
2014-08-04 10:59           ` Peter Waller
2014-08-04 21:27             ` Chris Samuel
2014-08-10 17:26         ` Martin Steigerwald
2014-08-05  8:51     ` Qu Wenruo
2014-08-05 12:17       ` Russell Coker
