Linux-BTRFS Archive on lore.kernel.org
From: "Daniel Taylor" <Daniel.Taylor@wdc.com>
To: "Mike Fedyk" <mfedyk@mikefedyk.com>
Cc: "Daniel J Blueman" <daniel.blueman@gmail.com>,
	"Mat" <jackdachef@gmail.com>,
	"LKML" <linux-kernel@vger.kernel.org>,
	<linux-fsdevel@vger.kernel.org>,
	"Chris Mason" <chris.mason@oracle.com>,
	"Ric Wheeler" <rwheeler@redhat.com>,
	"Andrew Morton" <akpm@linux-foundation.org>,
	"Linus Torvalds" <torvalds@linux-foundation.org>,
	"The development of BTRFS" <linux-btrfs@vger.kernel.org>
Subject: RE: Btrfs: broken file system design (was Unbound(?) internal fragmentation in Btrfs)
Date: Thu, 24 Jun 2010 15:06:03 -0700
Message-ID: <469D2D911E4BF043BFC8AD32E8E30F5B24AEBB@wdscexbe07.sc.wdc.com>
In-Reply-To: <AANLkTil5Gl0-rWClRsLZby_c37bQu5RB_tCgHsxTFshO@mail.gmail.com>


> -----Original Message-----
> From: mikefedyk@gmail.com [mailto:mikefedyk@gmail.com] On
> Behalf Of Mike Fedyk
> Sent: Wednesday, June 23, 2010 9:51 PM
> To: Daniel Taylor
> Cc: Daniel J Blueman; Mat; LKML;
> linux-fsdevel@vger.kernel.org; Chris Mason; Ric Wheeler;
> Andrew Morton; Linus Torvalds; The development of BTRFS
> Subject: Re: Btrfs: broken file system design (was Unbound(?)
> internal fragmentation in Btrfs)
>
> On Wed, Jun 23, 2010 at 8:43 PM, Daniel Taylor
> <Daniel.Taylor@wdc.com> wrote:
> > Just an FYI reminder. The original test (2K files) is utterly
> > pathological for disk drives with 4K physical sectors, such as
> > those now shipping from WD, Seagate, and others. Some of the
> > SSDs have larger (16K) or smaller blocks (2K). There is also
> > the issue of btrfs over RAID (which I know is not entirely
> > sensible, but which will happen).
> >
> > The absolute minimum allocation size for data should be the same
> > as, and aligned with, the underlying disk block size. If that
> > results in underutilization, I think that's a good thing for
> > performance, compared to read-modify-write cycles to update
> > partial disk blocks.
>
> Block size = 4k
>
> Btrfs packs smaller objects into the blocks in certain cases.
>

As long as no object smaller than the disk block size is ever
flushed to media, and all flushed objects are aligned to the disk
blocks, there should be no real performance hit from that.
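The condition above can be sketched as a small check. This is only an
illustration, not btrfs or block-layer code: the function names, the
byte-offset interface, and the 4096-byte default block size are all
assumptions made for the example.

```python
def is_aligned_write(offset: int, length: int, block_size: int = 4096) -> bool:
    """Return True if the write starts on a block boundary and covers
    only whole blocks (hypothetical helper, offsets in bytes)."""
    if block_size <= 0 or length <= 0 or offset < 0:
        raise ValueError("offset must be >= 0; length and block_size must be > 0")
    return offset % block_size == 0 and length % block_size == 0


def blocks_needing_rmw(offset: int, length: int, block_size: int = 4096) -> int:
    """Count the physical blocks that the write covers only partially.

    Each such block would have to be read, modified, and written back
    by the device (or by a layer above it).
    """
    if block_size <= 0 or length <= 0 or offset < 0:
        raise ValueError("offset must be >= 0; length and block_size must be > 0")
    end = offset + length                      # first byte past the write
    head_partial = offset % block_size != 0    # write starts mid-block
    tail_partial = end % block_size != 0       # write ends mid-block
    first_block = offset // block_size
    last_block = (end - 1) // block_size
    if first_block == last_block:
        # Head and tail fall in the same block: at most one RMW.
        return 1 if (head_partial or tail_partial) else 0
    return int(head_partial) + int(tail_partial)
```

Under these assumptions, a 1K update at byte offset 1024 touches one
4K block partially, while a 4K write at offset 0 touches none.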

Otherwise we end up with the damage seen in the ext[234] family, where
the file blocks can be aligned, but the 1K inode updates cause
read-modify-write (RMW) cycles and cost a >10% performance
hit for creation/update of large numbers of files.

An RMW cycle costs at least a full rotation (11 msec on a 5400 RPM
drive), which is painful.
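
The 11 msec figure follows from the rotation rate: one revolution
takes 60,000 msec divided by the RPM. A minimal sketch of that
arithmetic follows. The function names are made up for the example,
and the total assumes, as a rough lower bound, that every partial
update pays exactly one full rotation and that none are merged.

```python
def full_rotation_ms(rpm: float) -> float:
    """Time for one full platter revolution, in milliseconds."""
    if rpm <= 0:
        raise ValueError("rpm must be positive")
    return 60_000.0 / rpm


def min_rmw_penalty_s(n_updates: int, rpm: float) -> float:
    """Lower bound on added latency, in seconds, if each of n_updates
    partial-block writes costs one extra rotation (an assumption)."""
    if n_updates < 0:
        raise ValueError("n_updates must be non-negative")
    return n_updates * full_rotation_ms(rpm) / 1000.0
```

For a 5400 RPM drive this gives about 11.1 msec per rotation, so 1000
unmerged partial-block updates would add at least roughly 11 seconds.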


Thread overview: 41+ messages
2010-06-03 14:58 Unbound(?) internal fragmentation in Btrfs Edward Shishkin
     [not found] ` <AANLkTilKw2onQkdNlZjg7WVnPu2dsNpDSvoxrO_FA2z_@mail.gmail.com>
2010-06-18  8:03   ` Christian Stroetmann
2010-06-18 13:32   ` Btrfs: broken file system design (was Unbound(?) internal fragmentation in Btrfs) Edward Shishkin
2010-06-18 13:45     ` Daniel J Blueman
2010-06-18 16:50       ` Edward Shishkin
2010-06-23 23:40         ` Jamie Lokier
2010-06-24  3:43           ` Daniel Taylor
2010-06-24  4:51             ` Mike Fedyk
2010-06-24 22:06               ` Daniel Taylor [this message]
2010-06-25  9:15                 ` Btrfs: broken file system design Andi Kleen
2010-06-25 18:58                 ` Btrfs: broken file system design (was Unbound(?) internal fragmentation in Btrfs) Ric Wheeler
2010-06-26  5:18                   ` Michael Tokarev
2010-06-26 11:55                     ` Ric Wheeler
     [not found]                     ` <57784.2001:5c0:82dc::2.1277555665.squirrel@www.tofubar.com>
2010-06-26 13:47                       ` Ric Wheeler
2010-06-24  9:50             ` David Woodhouse
2010-06-18 18:15       ` Christian Stroetmann
2010-06-18 13:47     ` Chris Mason
2010-06-18 15:05       ` Edward Shishkin
     [not found]       ` <4C1B8B4A.9060308@gmail.com>
2010-06-18 15:10         ` Chris Mason
2010-06-18 16:22           ` Edward Shishkin
     [not found]           ` <4C1B9D4F.6010008@gmail.com>
2010-06-18 18:10             ` Chris Mason
2010-06-18 15:21       ` Christian Stroetmann
2010-06-18 15:22         ` Chris Mason
2010-06-18 15:56     ` Jamie Lokier
2010-06-18 19:25       ` Christian Stroetmann
2010-06-18 19:29       ` Edward Shishkin
2010-06-18 19:35         ` Chris Mason
2010-06-18 22:04           ` Balancing leaves when walking from top to down (was Btrfs:...) Edward Shishkin
     [not found]           ` <4C1BED56.9010300@redhat.com>
2010-06-18 22:16             ` Ric Wheeler
2010-06-19  0:03               ` Edward Shishkin
2010-06-21 13:15             ` Chris Mason
     [not found]               ` <20100621180013.GD17979@think>
2010-06-22 14:12                 ` Edward Shishkin
2010-06-22 14:20                   ` Chris Mason
2010-06-23 13:46                     ` Edward Shishkin
     [not found]                     ` <4C221049.501@gmail.com>
2010-06-23 23:37                       ` Jamie Lokier
2010-06-24 13:06                         ` Chris Mason
2010-06-30 20:05                           ` Edward Shishkin
     [not found]                           ` <4C2BA381.7040808@redhat.com>
2010-06-30 21:12                             ` Chris Mason
2010-07-09  4:16                 ` Chris Samuel
2010-07-09 20:30                   ` Chris Mason
2010-06-23 23:57         ` Btrfs: broken file system design (was Unbound(?) internal fragmentation in Btrfs) Jamie Lokier
