Linux-BTRFS Archive on lore.kernel.org
 help / color / Atom feed
From: Chris Mason <chris.mason@oracle.com>
To: Edward Shishkin <edward.shishkin@gmail.com>
Cc: Mat <jackdachef@gmail.com>, LKML <linux-kernel@vger.kernel.org>,
	linux-fsdevel@vger.kernel.org, Ric Wheeler <rwheeler@redhat.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	Linus Torvalds <torvalds@linux-foundation.org>,
	The development of BTRFS <linux-btrfs@vger.kernel.org>
Subject: Re: Btrfs: broken file system design (was Unbound(?) internal fragmentation in Btrfs)
Date: Fri, 18 Jun 2010 11:10:17 -0400
Message-ID: <20100618151017.GN27466@think> (raw)
In-Reply-To: <4C1B8B4A.9060308@gmail.com>

On Fri, Jun 18, 2010 at 05:05:46PM +0200, Edward Shishkin wrote:
> Chris Mason wrote:
> >On Fri, Jun 18, 2010 at 03:32:16PM +0200, Edward Shishkin wrote:
> >>Mat wrote:
> >>>On Thu, Jun 3, 2010 at 4:58 PM, Edward Shishkin <edward@redhat.com> wrote:
> >>>>Hello everyone.
> >>>>
> >>>>I was asked to review/evaluate Btrfs for using in enterprise
> >>>>systems and the below are my first impressions (linux-2.6.33).
> >>>>
> >>>>The first test I have made was filling an empty 659M (/dev/sdb2)
> >>>>btrfs partition (mounted to /mnt) with 2K files:
> >>>>
> >>>># for i in $(seq 1000000); \
> >>>>do dd if=/dev/zero of=/mnt/file_$i bs=2048 count=1; done
> >>>>(terminated after getting "No space left on device" reports).
> >>>>
> >>>># ls /mnt | wc -l
> >>>>59480
> >>>>
> >>>>So, I got the "dirty" utilization 59480*2048 / (659*1024*1024) = 0.17,
> >>>>and the first obvious question is "hey, where are other 83% of my
> >>>>disk space???" I looked at the btrfs storage tree (fs_tree) and was
> >>>>shocked with the situation on the leaf level. The Appendix B shows
> >>>>5 adjacent btrfs leafs, which have the same parent.
> >>>>
> >>>>For example, look at the leaf 29425664: "items 1 free space 3892"
> >>>>(of 4096!!). Note, that this "free" space (3892) is _dead_: any
> >>>>attempts to write to the file system will result in "No space left
> >>>>on device".
> >
> >There are two easy ways to fix this problem.  Turn off the inline
> >extents (max_inline=0) or allow splitting of the inline extents.  I
> >didn't put in the splitting simply because the complexity was high while
> >the benefits were low (in comparison with just turning off the inline
> >extents).
> 
> Hello, Chris. Thanks for response!
> I afraid that both ways won't fix the problem. Look at this leaf:
> 
> [...]
> leaf 29425664 items 1 free space 3892 generation 8 owner 5
> fs uuid 50268d9d-2a53-4f4d-b3a3-4fbff74dd956
> chunk uuid 963ba49a-bb2b-48a3-9b35-520d857aade6
>        item 0 key (320 XATTR_ITEM 3817753667) itemoff 3917 itemsize 78
>                location key (0 UNKNOWN 0) type 8
>                namelen 16 datalen 32 name: security.selinux
> [...]
> 
> There is no inline extents, and what are you going to split here?
> All leafs must be at least a half filled, otherwise we loose all
> boundaries, which provides non-zero utilization..

Right, there is no inline extent because we require them to fit entirely
in the leaf.  So we end up with mostly empty leaves because the inline
item is large enough to make it difficult to push around but not large
enough to fill the leaf.

-chris

  parent reply index

Thread overview: 41+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2010-06-03 14:58 Unbound(?) internal fragmentation in Btrfs Edward Shishkin
     [not found] ` <AANLkTilKw2onQkdNlZjg7WVnPu2dsNpDSvoxrO_FA2z_@mail.gmail.com>
2010-06-18  8:03   ` Christian Stroetmann
2010-06-18 13:32   ` Btrfs: broken file system design (was Unbound(?) internal fragmentation in Btrfs) Edward Shishkin
2010-06-18 13:45     ` Daniel J Blueman
2010-06-18 16:50       ` Edward Shishkin
2010-06-23 23:40         ` Jamie Lokier
2010-06-24  3:43           ` Daniel Taylor
2010-06-24  4:51             ` Mike Fedyk
2010-06-24 22:06               ` Daniel Taylor
2010-06-25  9:15                 ` Btrfs: broken file system design Andi Kleen
2010-06-25 18:58                 ` Btrfs: broken file system design (was Unbound(?) internal fragmentation in Btrfs) Ric Wheeler
2010-06-26  5:18                   ` Michael Tokarev
2010-06-26 11:55                     ` Ric Wheeler
     [not found]                     ` <57784.2001:5c0:82dc::2.1277555665.squirrel@www.tofubar.com>
2010-06-26 13:47                       ` Ric Wheeler
2010-06-24  9:50             ` David Woodhouse
2010-06-18 18:15       ` Christian Stroetmann
2010-06-18 13:47     ` Chris Mason
2010-06-18 15:05       ` Edward Shishkin
     [not found]       ` <4C1B8B4A.9060308@gmail.com>
2010-06-18 15:10         ` Chris Mason [this message]
2010-06-18 16:22           ` Edward Shishkin
     [not found]           ` <4C1B9D4F.6010008@gmail.com>
2010-06-18 18:10             ` Chris Mason
2010-06-18 15:21       ` Christian Stroetmann
2010-06-18 15:22         ` Chris Mason
2010-06-18 15:56     ` Jamie Lokier
2010-06-18 19:25       ` Christian Stroetmann
2010-06-18 19:29       ` Edward Shishkin
2010-06-18 19:35         ` Chris Mason
2010-06-18 22:04           ` Balancing leaves when walking from top to down (was Btrfs:...) Edward Shishkin
     [not found]           ` <4C1BED56.9010300@redhat.com>
2010-06-18 22:16             ` Ric Wheeler
2010-06-19  0:03               ` Edward Shishkin
2010-06-21 13:15             ` Chris Mason
     [not found]               ` <20100621180013.GD17979@think>
2010-06-22 14:12                 ` Edward Shishkin
2010-06-22 14:20                   ` Chris Mason
2010-06-23 13:46                     ` Edward Shishkin
     [not found]                     ` <4C221049.501@gmail.com>
2010-06-23 23:37                       ` Jamie Lokier
2010-06-24 13:06                         ` Chris Mason
2010-06-30 20:05                           ` Edward Shishkin
     [not found]                           ` <4C2BA381.7040808@redhat.com>
2010-06-30 21:12                             ` Chris Mason
2010-07-09  4:16                 ` Chris Samuel
2010-07-09 20:30                   ` Chris Mason
2010-06-23 23:57         ` Btrfs: broken file system design (was Unbound(?) internal fragmentation in Btrfs) Jamie Lokier

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20100618151017.GN27466@think \
    --to=chris.mason@oracle.com \
    --cc=akpm@linux-foundation.org \
    --cc=edward.shishkin@gmail.com \
    --cc=jackdachef@gmail.com \
    --cc=linux-btrfs@vger.kernel.org \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=rwheeler@redhat.com \
    --cc=torvalds@linux-foundation.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Linux-BTRFS Archive on lore.kernel.org

Archives are clonable:
	git clone --mirror https://lore.kernel.org/linux-btrfs/0 linux-btrfs/git/0.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 linux-btrfs linux-btrfs/ https://lore.kernel.org/linux-btrfs \
		linux-btrfs@vger.kernel.org
	public-inbox-index linux-btrfs

Example config snippet for mirrors

Newsgroup available over NNTP:
	nntp://nntp.lore.kernel.org/org.kernel.vger.linux-btrfs


AGPL code for this site: git clone https://public-inbox.org/public-inbox.git