Linux-BTRFS Archive on lore.kernel.org
 help / color / Atom feed
From: Chris Mason <chris.mason@oracle.com>
To: Edward Shishkin <edward@redhat.com>
Cc: Jamie Lokier <jamie@shareable.org>,
	Edward Shishkin <edward.shishkin@gmail.com>,
	Mat <jackdachef@gmail.com>, LKML <linux-kernel@vger.kernel.org>,
	linux-fsdevel@vger.kernel.org, Ric Wheeler <rwheeler@redhat.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	Linus Torvalds <torvalds@linux-foundation.org>,
	The development of BTRFS <linux-btrfs@vger.kernel.org>
Subject: Re: Btrfs: broken file system design (was Unbound(?) internal fragmentation in Btrfs)
Date: Fri, 18 Jun 2010 15:35:55 -0400
Message-ID: <20100618193555.GV27466@think> (raw)
In-Reply-To: <4C1BC924.30604@redhat.com>

On Fri, Jun 18, 2010 at 09:29:40PM +0200, Edward Shishkin wrote:
> Jamie Lokier wrote:
> >Edward Shishkin wrote:
> >>If you decide to base your file system on some algorithms then please
> >>use the original ones from proper academic papers. DO NOT modify the
> >>algorithms in solitude: this is very fragile thing! All such
> >>modifications must be reviewed by specialists in the theory of
> >>algorithms. Such review can be done in various scientific magazines of
> >>proper level.
> >>
> >>Personally I don't see any way to improve the situation with Btrfs
> >>except full redesigning the last one. If you want to base your file
> >>system on the paper of Ohad Rodeh, then please, use *exactly* the
> >>Bayer's B-trees that he refers to. That said, make sure that all
> >>records you put to the tree has equal length and all non-root nodes of
> >>your tree are at least half filled.
> >
> >First, thanks Edward for identifying a specific problem with the
> >current btrfs implementation.
> 
> Hello Jamie.
> 
> >I've studied modified B-trees quite a lot and know enough to be sure
> >that they are quite robust when you modify them in all sorts of ways.
> 
> Which property is robust?
> 
> >Moreover, you are incorrect to say there's an intrinsic algorithmic
> >problem with variable-length records.  It is not true; if Knuth said
> >so, Knuth was mistaken.
> 
> I didn't say about intrinsic algorithmic problems :)
> I just repeat (after Knuth et al) that B-trees with variable-length
> records don't
> have any sane boundary for internal fragmentation. The common idea
> is that if we
> don't want Btrfs to be in infinite development stage, then we should
> choose some
> *sane* strategy (for example the paper of Ohad Rodeh) and strictly
> adhere this in
> future.

Again, other than the inline file data, what exactly do you believe
needs to change?  Top down balancing vs balancing on insertion doesn't
impact our ability to maintain full leaves.  The current code is clearly
choosing not to merge two leaves that it should have merged, which is
just a plain old bug.

-chris

  reply index

Thread overview: 41+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2010-06-03 14:58 Unbound(?) internal fragmentation in Btrfs Edward Shishkin
     [not found] ` <AANLkTilKw2onQkdNlZjg7WVnPu2dsNpDSvoxrO_FA2z_@mail.gmail.com>
2010-06-18  8:03   ` Christian Stroetmann
2010-06-18 13:32   ` Btrfs: broken file system design (was Unbound(?) internal fragmentation in Btrfs) Edward Shishkin
2010-06-18 13:45     ` Daniel J Blueman
2010-06-18 16:50       ` Edward Shishkin
2010-06-23 23:40         ` Jamie Lokier
2010-06-24  3:43           ` Daniel Taylor
2010-06-24  4:51             ` Mike Fedyk
2010-06-24 22:06               ` Daniel Taylor
2010-06-25  9:15                 ` Btrfs: broken file system design Andi Kleen
2010-06-25 18:58                 ` Btrfs: broken file system design (was Unbound(?) internal fragmentation in Btrfs) Ric Wheeler
2010-06-26  5:18                   ` Michael Tokarev
2010-06-26 11:55                     ` Ric Wheeler
     [not found]                     ` <57784.2001:5c0:82dc::2.1277555665.squirrel@www.tofubar.com>
2010-06-26 13:47                       ` Ric Wheeler
2010-06-24  9:50             ` David Woodhouse
2010-06-18 18:15       ` Christian Stroetmann
2010-06-18 13:47     ` Chris Mason
2010-06-18 15:05       ` Edward Shishkin
     [not found]       ` <4C1B8B4A.9060308@gmail.com>
2010-06-18 15:10         ` Chris Mason
2010-06-18 16:22           ` Edward Shishkin
     [not found]           ` <4C1B9D4F.6010008@gmail.com>
2010-06-18 18:10             ` Chris Mason
2010-06-18 15:21       ` Christian Stroetmann
2010-06-18 15:22         ` Chris Mason
2010-06-18 15:56     ` Jamie Lokier
2010-06-18 19:25       ` Christian Stroetmann
2010-06-18 19:29       ` Edward Shishkin
2010-06-18 19:35         ` Chris Mason [this message]
2010-06-18 22:04           ` Balancing leaves when walking from top to down (was Btrfs:...) Edward Shishkin
     [not found]           ` <4C1BED56.9010300@redhat.com>
2010-06-18 22:16             ` Ric Wheeler
2010-06-19  0:03               ` Edward Shishkin
2010-06-21 13:15             ` Chris Mason
     [not found]               ` <20100621180013.GD17979@think>
2010-06-22 14:12                 ` Edward Shishkin
2010-06-22 14:20                   ` Chris Mason
2010-06-23 13:46                     ` Edward Shishkin
     [not found]                     ` <4C221049.501@gmail.com>
2010-06-23 23:37                       ` Jamie Lokier
2010-06-24 13:06                         ` Chris Mason
2010-06-30 20:05                           ` Edward Shishkin
     [not found]                           ` <4C2BA381.7040808@redhat.com>
2010-06-30 21:12                             ` Chris Mason
2010-07-09  4:16                 ` Chris Samuel
2010-07-09 20:30                   ` Chris Mason
2010-06-23 23:57         ` Btrfs: broken file system design (was Unbound(?) internal fragmentation in Btrfs) Jamie Lokier

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20100618193555.GV27466@think \
    --to=chris.mason@oracle.com \
    --cc=akpm@linux-foundation.org \
    --cc=edward.shishkin@gmail.com \
    --cc=edward@redhat.com \
    --cc=jackdachef@gmail.com \
    --cc=jamie@shareable.org \
    --cc=linux-btrfs@vger.kernel.org \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=rwheeler@redhat.com \
    --cc=torvalds@linux-foundation.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Linux-BTRFS Archive on lore.kernel.org

Archives are clonable:
	git clone --mirror https://lore.kernel.org/linux-btrfs/0 linux-btrfs/git/0.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 linux-btrfs linux-btrfs/ https://lore.kernel.org/linux-btrfs \
		linux-btrfs@vger.kernel.org
	public-inbox-index linux-btrfs

Example config snippet for mirrors

Newsgroup available over NNTP:
	nntp://nntp.lore.kernel.org/org.kernel.vger.linux-btrfs


AGPL code for this site: git clone https://public-inbox.org/public-inbox.git