From: pg@btrfs.list.sabi.co.UK (Peter Grandi)
To: Linux fs Btrfs <linux-btrfs@vger.kernel.org>
Subject: Re: Shrinking a device - performance?
Date: Sat, 1 Apr 2017 12:30:11 +0100
Message-ID: <22751.36675.942455.136171@tree.ty.sabi.co.uk>
In-Reply-To: <CAP8EXU0TWqZZrDe6J0nRLEhgO7w2iUvSPh-H86JNPeYQqtA9fw@mail.gmail.com>
[ ... ]
>>> $ D='btrfs f2fs gfs2 hfsplus jfs nilfs2 reiserfs udf xfs'
>>> $ find $D -name '*.ko' | xargs size | sed 's/^ *//;s/ .*\t//g'
>>> text filename
>>> 832719 btrfs/btrfs.ko
>>> 237952 f2fs/f2fs.ko
>>> 251805 gfs2/gfs2.ko
>>> 72731 hfsplus/hfsplus.ko
>>> 171623 jfs/jfs.ko
>>> 173540 nilfs2/nilfs2.ko
>>> 214655 reiserfs/reiserfs.ko
>>> 81628 udf/udf.ko
>>> 658637 xfs/xfs.ko
That was Linux AMD64.
> udf is 637K on Mac OS 10.6
> exfat is 75K on Mac OS 10.9
> msdosfs is 79K on Mac OS 10.9
> ntfs is 394K (That must be Paragon's ntfs for Mac)
...
> zfs is 1.7M (10.9)
> spl is 247K (10.9)
Similar on Linux AMD64 but smaller:
$ size updates/dkms/*.ko | sed 's/^ *//;s/ .*\t//g'
text filename
62005 updates/dkms/spl.ko
184370 updates/dkms/splat.ko
3879 updates/dkms/zavl.ko
22688 updates/dkms/zcommon.ko
1012212 updates/dkms/zfs.ko
39874 updates/dkms/znvpair.ko
18321 updates/dkms/zpios.ko
319224 updates/dkms/zunicode.ko
> If they are somehow comparable even with the differences, 833K
> is not bad for btrfs compared to zfs. I did not look at the
> format of the file; it must be binary, but compression may be
> optional for third party kexts. So the kernel module sizes are
> large for both btrfs and zfs. Given the feature sets of both,
> is that surprising?
Not surprising and indeed I agree with the statement that
appeared earlier that "there are use cases that actually need
them". There are also use cases that need realtime translation
of file content from Chinese to Spanish, and one could add to
ZFS or Btrfs an extension to detect the language of text files
and invoke Google Translate via HTTP, for example with option
"translate=chinese-spanish" at mount time; or less flexibly
there are many use cases where B-Tree lookup of records in files
is useful, and it would be possible to add that to Btrfs or ZFS,
so that for example 'lseek(4,"Jane Smith",SEEK_KEY)' would be
possible, as in the ancient TSS/370 filesystem design.
But the question is about engineering: where best to implement
those "feature sets", in the kernel or at higher levels. There is
no doubt for me that realtime language translation and seeking
by key can be added to a filesystem kernel module, and would
"work". The issue is a crudely technical one: "works" for an
engineer is not a binary state, but a statistical property over
a wide spectrum of cost/benefit tradeoffs.
Adding "feature sets" because "there are use cases that actually
need them" is fine; adding their implementation to the kernel
driver of a filesystem is quite a different proposition, which
may have downsides, as the implementations of those feature sets
may make code more complex and harder to understand and test,
never mind debug, even for the base features. But of course lots
of people know better :-).
But there is more; look again at some compiled code sizes as a
crude proxy for complexity, divided into two groups, both of
robust, full-featured designs:
1012212 updates/dkms/zfs.ko
832719 btrfs/btrfs.ko
658637 xfs/xfs.ko

237952 f2fs/f2fs.ko
173540 nilfs2/nilfs2.ko
171623 jfs/jfs.ko
81628 udf/udf.ko
The code size for JFS or NILFS2 is roughly 1/4 the code size
for XFS, and for UDF roughly 1/8, yet there is little difference
in functionality.
Compared to ZFS as to base functionality JFS lacks checksums and
snapshots (in theory it has subvolumes, but they are disabled),
but NILFS2 has snapshots and checksums (but does not verify them
on ordinary reads), and yet the code size is 1/6 that of ZFS.
ZFS also has RAID, but looking at the code size of the Linux MD
RAID modules I see rather smaller numbers. Even so ZFS has a
good reputation for reliability despite its amazing complexity,
but that is also because Sun invested heavily in massive release
engineering for it, and similarly for XFS.
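As a rough check of the size ratios quoted above, here is a minimal Python sketch. The numbers are the text-segment sizes copied from the listings in this message; the names TEXT_SIZE and ratio() are made up for illustration and are not part of any tool.

```python
# Text-segment sizes in bytes, copied from the 'size' listings in this message.
TEXT_SIZE = {
    "zfs": 1012212,
    "btrfs": 832719,
    "xfs": 658637,
    "f2fs": 237952,
    "nilfs2": 173540,
    "jfs": 171623,
    "udf": 81628,
}


def ratio(small: str, large: str) -> float:
    """Return size(small) / size(large) for two modules in TEXT_SIZE."""
    return TEXT_SIZE[small] / TEXT_SIZE[large]


if __name__ == "__main__":
    # Each of the smaller modules compared with XFS.
    for name in ("jfs", "nilfs2", "udf"):
        print(f"{name}/xfs = {ratio(name, 'xfs'):.2f}")
    # NILFS2 compared with ZFS (zfs.ko alone, not the other dkms modules).
    print(f"nilfs2/zfs = {ratio('nilfs2', 'zfs'):.2f}")
```

With these figures JFS and NILFS2 each come out at about 0.26 of XFS (roughly 1/4), UDF at about 0.12 (closer to 1/8), and NILFS2 at about 0.17 of zfs.ko (roughly 1/6).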
Therefore my impression is that the filesystems in the first
group have a lot of cool features like compression or dedup
etc. that could have been implemented at user level, and having
them in the kernel is good "for 'marketing' purposes, to win
box-ticking competitions".