From: Kent Overstreet <kent.overstreet@gmail.com>
To: Demi Marie Obenour <demi@invisiblethingslab.com>
Cc: linux-bcachefs@vger.kernel.org
Subject: Re: Comparison to ZFS and BTRFS
Date: Fri, 15 Apr 2022 15:11:40 -0400
Message-ID: <20220415191140.2xyni3kusht6wear@moria.home.lan>
In-Reply-To: <Yk05Sk4ztxLMpgrt@itl-email>

On Wed, Apr 06, 2022 at 02:55:04AM -0400, Demi Marie Obenour wrote:
> How does bcachefs manage to outperform ZFS and BTRFS?  Obviously being
> licensed under GPL-compatible terms is an advantage for inclusion in
> Linux, but I am more interested in the technical aspects.
> 
> - How does bcachefs avoid the nasty performance pitfalls that plague
>   BTRFS?  Are VM disks and databases on bcachefs fast?

Clean modular design (the result of years of slow incremental work), and a
_blazingly_ fast B+ tree implementation.

We're not fast in every situation yet. We don't have a nocow (non copy-on-write)
mode, and small random reads can be slow because checksum granularity is at the
extent level (which is a good tradeoff in most situations, but we need an option
for smaller checksum granularity at some point).
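
To give a rough picture of the tradeoff - purely an illustrative sketch with
made-up names, not our actual read path - with extent-granularity checksums a
small read inside a large extent has to read and checksum the whole extent
before anything can be returned:

#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

struct extent {
	uint64_t disk_offset;	/* where the extent lives on disk */
	size_t   len;		/* checksummed as one unit */
	uint32_t csum;		/* a single checksum covering the whole extent */
};

/* hypothetical helpers, assumed to exist for the sketch */
extern void     read_disk(uint64_t off, void *buf, size_t len);
extern uint32_t checksum(const void *buf, size_t len);

static bool read_4k_within_extent(const struct extent *e, size_t off_in_extent,
				  void *out, void *scratch)
{
	/* we have to read and checksum e->len bytes, even for a 4k request */
	read_disk(e->disk_offset, scratch, e->len);
	if (checksum(scratch, e->len) != e->csum)
		return false;		/* checksum mismatch */
	memcpy(out, (char *)scratch + off_in_extent, 4096);
	return true;
}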

> - How does bcachefs avoid the dreaded RAID write hole? 

We're copy-on-write - and this extends to our erasure coding implementation: we
don't update existing stripes in place, we create new stripes as needed,
reusing buckets from existing stripes that still have data.
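
Very roughly - an illustrative sketch, not the real stripe code - the idea
looks like this:

#include <stddef.h>
#include <stdint.h>

#define NR_DATA   4
#define NR_PARITY 2

struct bucket {
	unsigned dev;
	uint64_t offset;
};

struct stripe {
	struct bucket data[NR_DATA];
	struct bucket parity[NR_PARITY];
};

/* hypothetical helpers, assumed for the sketch */
extern struct bucket alloc_bucket(void);
extern void write_parity(const struct stripe *s);

/*
 * Build a new stripe: reuse buckets from an old stripe that still hold live
 * data, fill the remaining slots with freshly allocated buckets, then compute
 * and write parity for the new stripe.  Nothing in the old stripe is modified
 * in place, so there's no window where parity and data disagree.
 */
static struct stripe make_new_stripe(const struct bucket *live, size_t nr_live)
{
	struct stripe s;
	size_t i;

	for (i = 0; i < NR_DATA; i++)
		s.data[i] = i < nr_live ? live[i] : alloc_bucket();

	for (i = 0; i < NR_PARITY; i++)
		s.parity[i] = alloc_bucket();

	write_parity(&s);
	return s;
}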

> - How does an O_DIRECT loop device on bcachefs compare to a zvol on ZFS?

I'd have to benchmark/profile it. According to xfstests, there appear to be
some bugs in the way the loop driver in O_DIRECT mode interacts with bcachefs,
and the loopback driver is implemented in a more heavyweight way than it needs
to be - there's room for improvement.

> - Is there a good description of the bcachefs on-disk format anywhere?

Try this: https://bcachefs.org/Architecture/

> - What are the internal abstraction layers used in bcachefs?  Is it a
>   key-value store with a filesystem on top of it, the way ZFS is?

It's just a key-value store with a filesystem on top - even more so than ZFS
is, from what I understand of ZFS.
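
As a rough mental model - illustrative types only, not the actual on-disk
format - directories, inodes and file extents are all just key types in one
ordered key-value store:

#include <stdint.h>

/* a key says which object a value belongs to and where within it */
struct kv_key {
	uint64_t inode;		/* which file or directory */
	uint64_t offset;	/* file offset, dirent hash, etc. */
};

/* an inode's metadata is one kind of value... */
struct inode_val {
	uint32_t mode, uid, gid;
	uint64_t size;
};

/* ...a directory entry is another... */
struct dirent_val {
	uint64_t target_inode;
	char	 name[256];
};

/* ...and an extent, mapping a range of a file to disk, is another */
struct extent_val {
	uint64_t disk_offset;
	uint32_t len;
};

/*
 * "Read 4k at offset X of inode N" becomes an ordered key-value lookup:
 * find the extent key that covers (N, X).
 */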

> - Is it possible to shrink a bcachefs filesystem?

Not yet, but it won't take much work to add.

> Does bcachefs have
>   any restrictions regarding the size of disks in a pool, or can I just
>   throw a bunch of varying-size disks at bcachefs and have it spread the
>   data around automatically to provide the level of redundancy I want?

No restrictions: the allocator stripes across available devices, but biases in
favor of devices with more free space.
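
The idea, as a sketch (not the actual allocator code): pick the next device to
write to with probability weighted by free space, so emptier devices fill
faster until everything evens out:

#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

struct device {
	uint64_t free_sectors;
};

/* pick a device with probability proportional to its free space */
static size_t pick_device(const struct device *devs, size_t nr)
{
	uint64_t total = 0, r;
	size_t i;

	for (i = 0; i < nr; i++)
		total += devs[i].free_sectors;
	if (!total)
		return 0;		/* nothing free anywhere */

	r = (uint64_t)rand() % total;	/* good enough for a sketch */

	for (i = 0; i < nr; i++) {
		if (r < devs[i].free_sectors)
			return i;
		r -= devs[i].free_sectors;
	}
	return nr - 1;
}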

> - Can bcachefs use faster storage as a cache for slower storage, or
>   otherwise move data around based on usage patterns?

Yes.

> - Can bcachefs saturate your typical NVMe drive on realistic workloads?
>   Can it do so with encryption enabled?

This sounds like a question for someone interested in benchmarking :)

> - Is support for swap files on bcachefs planned?  That would require
>   being able to perform O_DIRECT asynchronous writes without any memory
>   allocations.

Yes, it's planned; the IO path already has the necessary support.
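
The general trick, sketched below with made-up names (not necessarily exactly
how our IO path does it), is to take everything a write submission needs from
structures allocated up front, so submitting swap IO never calls an allocator:

#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define POOL_SIZE 64

struct io_req {
	uint64_t sector;
	void	*buf;
	bool	 in_use;
};

/* preallocated at init time; nothing below ever calls malloc */
static struct io_req pool[POOL_SIZE];

/* hypothetical submission helper, assumed for the sketch */
extern void submit_write(struct io_req *req);

/* grab a preallocated slot and submit (not thread-safe; illustrative only) */
static struct io_req *submit_swap_write(uint64_t sector, void *buf)
{
	size_t i;

	for (i = 0; i < POOL_SIZE; i++) {
		if (!pool[i].in_use) {
			pool[i].in_use = true;
			pool[i].sector = sector;
			pool[i].buf    = buf;
			submit_write(&pool[i]);
			return &pool[i];
		}
	}
	return NULL;	/* pool exhausted; caller waits for completions */
}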

> - Is bcachefs being used in production anywhere?

Yes.
