linux-bcachefs.vger.kernel.org archive mirror
* [PATCH 00/10] bcachefs - semvar, forward compatibility
@ 2023-07-09 17:15 Kent Overstreet
  2023-07-09 17:15 ` [PATCH 01/10] bcachefs: Allow for unknown btree IDs Kent Overstreet
                   ` (9 more replies)
  0 siblings, 10 replies; 17+ messages in thread
From: Kent Overstreet @ 2023-07-09 17:15 UTC
  To: linux-bcachefs; +Cc: Kent Overstreet, bfoster, sandeen

So - in the upstreaming discussion, Brian mentioned code review, so now
seems like a good time to start making sure bcachefs patches hit the
list.

In the last cabal meeting, we started talking about on disk
compatibility issues for mainlining. Since (IIRC) the first release,
we've maintained backwards compatibility (support for very old versions
has been dropped, but there's always been an upgrade path) - but we
generally haven't been addressing forwards compatibility yet - we've
been doing a lot of forced incompatible version upgrades, where the old
version is no longer able to mount the filesystem after it's been
mounted by the new version.

Obviously, we can't do that anymore after we're in mainline and out of
EXPERIMENTAL. There were two main issues to address:

Major/minor version numbers
---------------------------

bcachefs started out with the traditional compatible/incompatible
feature bits in the superblock, but they are no longer my preferred
approach.

The problem with feature bits is that there was an ordering in which new
on disk format features were released, and feature bits lose that
ordering: they make it possible for users to create filesystems where x,
y, and z modern feature bits are enabled, but not feature bit a from 5
years ago - and the code was never written to expect that and you
certainly never tested that configuration, so things break in incredibly
fun ways.

Assigning every new on disk feature a distinct version number instead of
a feature bit preserves this ordering and makes it impossible for users
to create or use filesystems with features selected that historically
should not have existed. This has been the practice in bcachefs for
a while now, and I've been quite happy with it.
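
As a rough illustration of the ordering argument (a sketch only; the
feature names and version numbers below are made up, not taken from
bcachefs), a version-number scheme turns "does this filesystem have
feature X" into a single comparison:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical feature list, for illustration only: each on disk feature
 * is identified by the version number at which it was introduced.
 */
enum example_version {
	EXAMPLE_VERSION_feature_a = 10,
	EXAMPLE_VERSION_feature_x = 11,
	EXAMPLE_VERSION_feature_y = 12,
	EXAMPLE_VERSION_feature_z = 13,
};

/* A filesystem at version v has every feature introduced at or before v. */
static bool example_has_feature(uint16_t fs_version, enum example_version feature)
{
	return fs_version >= feature;
}

/* Usage: the "x, y and z but not a" combination cannot be expressed. */
static void example_check_ordering(void)
{
	/* A version 12 filesystem necessarily has the older feature a ... */
	assert(example_has_feature(12, EXAMPLE_VERSION_feature_a));
	/* ... and cannot have feature z, which was introduced later. */
	assert(!example_has_feature(12, EXAMPLE_VERSION_feature_z));
}
```

With independent feature bits there would be 2^4 possible combinations
of these four features; with a single version number there are only
the five that actually existed historically.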

The missing bit that this patch series adds is to split the version
number field into major and minor versions. Incrementing the minor
version number corresponds to adding a new compat feature flag, if we
were using feature bits; incrementing the major version number
corresponds to adding a new incompatible feature bit.

IOW, we'll allow mounting a filesystem with a version number greater
than the currently supported version as long as the major version number
is the same. As with compat feature bits, if you do so the filesystem
will be downgraded to the currently supported version, indicating those
new on disk structures may now be inconsistent.
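
The mount-time rule described above could be sketched roughly as
follows. This is a minimal sketch, not the code from the series: the
10-bit minor field and all of the ex_* names are assumptions made for
the example.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption for illustration: the low 10 bits hold the minor version. */
#define EX_MINOR_BITS	10U
#define EX_MINOR_MASK	((1U << EX_MINOR_BITS) - 1)

static uint16_t ex_version(unsigned major, unsigned minor)
{
	assert(major <= (UINT16_MAX >> EX_MINOR_BITS) && minor <= EX_MINOR_MASK);
	return (uint16_t) ((major << EX_MINOR_BITS) | minor);
}

static unsigned ex_major(uint16_t v) { return v >> EX_MINOR_BITS; }
static unsigned ex_minor(uint16_t v) { return v & EX_MINOR_MASK; }

enum ex_mount_result {
	EX_MOUNT_OK,		/* on disk version is known to this code */
	EX_MOUNT_DOWNGRADE,	/* newer minor version: mount, mark as downgraded */
	EX_MOUNT_REFUSE,	/* newer major version: incompatible */
};

static enum ex_mount_result ex_check_version(uint16_t on_disk, uint16_t supported)
{
	if (on_disk <= supported)
		return EX_MOUNT_OK;
	if (ex_major(on_disk) == ex_major(supported))
		return EX_MOUNT_DOWNGRADE;
	return EX_MOUNT_REFUSE;
}
```

For example, code supporting 1.5 would get EX_MOUNT_DOWNGRADE for a 1.7
filesystem and EX_MOUNT_REFUSE for a 2.0 filesystem.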

Forwards compatibility of on disk structures:
---------------------------------------------

We need to be able to roll out new on disk data structures without
causing problems for old versions - old versions should just ignore
metadata they don't understand. This had been planned for in the past
and most of the work was done, so there wasn't much left.

Specifically, we need to be able to roll out new
 - Superblock sections: already handled, audited and cleaned up a bit
 - Journal entry types: already handled, audited
 - Btrees: addressed by this patchset
 - Bkey types: addressed by this patchset
 - New fields for existing bkeys: addressed by the patch series that
   introduced "bch2_bkey_get_val_typed()", but this is the trickiest to
   handle and likely more work will be required
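
To make "old versions should just ignore metadata they don't
understand" concrete, here is a rough sketch of a reader for
self-describing records. The record layout and names are hypothetical
and much simpler than the real superblock section or journal entry
formats; the point is only that a size in the header lets old code step
over types it has never heard of.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical record header: every record states its own total size. */
struct ex_record {
	uint32_t	u64s;	/* record size in 8-byte units, header included */
	uint32_t	type;
};

enum { EX_TYPE_NR_KNOWN = 2 };	/* types 0 and 1 are known to this code */

/*
 * Walk a buffer of records. Returns the number of records with a known
 * type, or -1 if a record is malformed. Unknown types are stepped over.
 */
static int ex_walk_records(const uint64_t *buf, size_t nr_u64s)
{
	size_t pos = 0;
	int known = 0;

	assert(buf || !nr_u64s);

	while (pos < nr_u64s) {
		struct ex_record r;

		memcpy(&r, &buf[pos], sizeof(r));
		if (!r.u64s || r.u64s > nr_u64s - pos)
			return -1;	/* zero-sized or overlong: corrupt */
		if (r.type < EX_TYPE_NR_KNOWN)
			known++;	/* a real reader would dispatch on type here */
		/* else: written by a newer version, ignore it */
		pos += r.u64s;
	}
	return known;
}
```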

With all this in place, we'll be able to roll out most of the new
features we want that require new on disk data structures as forwards
compatible changes, including everything currently in the pipeline. That
includes

 - Snapshot nodes are gaining skiplist entries soon: this will fix O(n)
   issues with bch2_snapshot_is_ancestor()

 - rebalance_work btree: Rebalance is the last operation that happens
   during normal operation that requires metadata scanning - soon I'll
   be adding a rebalance_work btree that references extents that
   rebalance will have work to do on in the future (e.g. for the
   background_compression or background_target io options). 

 - inodes_deleted btree: After unclean shutdown we still have to scan
   the entire inodes btree for deleted inodes; I'll be adding another
   bitset btree to address this, and also adding a tmpdir feature.

Things that will require incompatible changes:

 - New key types that replace existing key types, or in general new data
   structures that replace existing data structures

   Where we can maintain both the old and new data structures this isn't
   a problem - e.g. we can roll out a new bch_sb_members_v2 superblock
   section and just also keep writing out bch_sb_members for old
   versions to use; but we won't be able to roll out e.g. a new extent
   key type without an incompatible change.

 - New btree node header/journal entry headers - we'd like bigger
   nonces, so this will need to happen eventually

 - New extent_entry types: this one is a bit unfortunate, because
   extents contain a list of variable size fields (e.g. ptrs, different
   sized crc entries) and the entries themselves don't specify their
   size - the code that's reading it has to know how big every extent
   entry type is.

   This just came up with rebalance_work - rebalance_work needs a new
   extent entry type, so I rolled that out ahead of time so we can roll
   out the rest of the functionality as a compatible change.
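
   A simplified sketch of why this is a problem (the entry types, sizes
   and type encoding below are invented for the example and do not
   match the real extent format): when the size comes from a table
   keyed by type rather than from the entry itself, a reader that hits
   an unknown type has no way to find the next entry.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical entry types, for illustration only. */
enum ex_entry_type {
	EX_ENTRY_ptr,
	EX_ENTRY_crc32,
	EX_ENTRY_crc64,
	EX_ENTRY_NR,
};

/* Entry sizes in 8-byte units: known to the reader, not stored on disk. */
static const unsigned ex_entry_u64s[EX_ENTRY_NR] = {
	[EX_ENTRY_ptr]	 = 1,
	[EX_ENTRY_crc32] = 1,
	[EX_ENTRY_crc64] = 2,
};

/*
 * Count the entries in a value. In this sketch the low byte of the first
 * word of each entry holds its type. Returns -1 on an unknown type or a
 * truncated entry: without a size we cannot skip ahead.
 */
static int ex_count_entries(const uint64_t *val, size_t nr_u64s)
{
	size_t pos = 0;
	int nr = 0;

	assert(val || !nr_u64s);

	while (pos < nr_u64s) {
		unsigned type = val[pos] & 0xff;

		if (type >= EX_ENTRY_NR)
			return -1;	/* unknown type: size unknown, give up */
		if (ex_entry_u64s[type] > nr_u64s - pos)
			return -1;	/* entry runs past the end of the value */
		pos += ex_entry_u64s[type];
		nr++;
	}
	return nr;
}
```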

Forced version upgrades:
------------------------

Going forward, we will still be doing forced version upgrades for a while
- but only to forwards-compatible versions. After the next incompatible
(version 2.0) release, we likely won't be doing forced version upgrades
at all anymore.

Currently, version upgrades generally require a fsck. Another thing this
patchset addresses is enumerating all our recovery passes (including
version upgrade and fsck passes); this will let us specify "upgrading to
this version only requires this pass to run".

Kent Overstreet (10):
  bcachefs: Allow for unknown btree IDs
  bcachefs: Allow for unknown key types
  bcachefs: Refactor bch_sb_field_ops handling
  bcachefs: Change check for invalid key types
  bcachefs: BCH_SB_VERSION_UPGRADE_COMPLETE()
  bcachefs: version_upgrade is now an enum
  bcachefs: Kill bch2_bucket_gens_read()
  bcachefs: Stash journal replay params in bch_fs
  bcachefs: Enumerate recovery passes
  bcachefs: bcachefs_metadata_version_major_minor

 fs/bcachefs/alloc_background.c      | 129 +++++-----
 fs/bcachefs/alloc_background.h      |  18 +-
 fs/bcachefs/alloc_foreground.c      |   9 +-
 fs/bcachefs/backpointers.c          |  23 +-
 fs/bcachefs/backpointers.h          |   2 +-
 fs/bcachefs/bcachefs.h              |  62 ++++-
 fs/bcachefs/bcachefs_format.h       |  63 +++--
 fs/bcachefs/bkey_methods.c          |  81 ++++---
 fs/bcachefs/bkey_methods.h          |  20 +-
 fs/bcachefs/btree_cache.c           |  23 +-
 fs/bcachefs/btree_cache.h           |  22 +-
 fs/bcachefs/btree_gc.c              |  26 +-
 fs/bcachefs/btree_io.c              |   9 +-
 fs/bcachefs/btree_iter.c            |   4 +-
 fs/bcachefs/btree_update_interior.c |  18 +-
 fs/bcachefs/btree_update_leaf.c     |  17 +-
 fs/bcachefs/dirent.c                |   3 +-
 fs/bcachefs/dirent.h                |   4 +-
 fs/bcachefs/ec.c                    |   3 +-
 fs/bcachefs/ec.h                    |   4 +-
 fs/bcachefs/extents.c               |  12 +-
 fs/bcachefs/extents.h               |   9 +-
 fs/bcachefs/fsck.c                  |  77 +-----
 fs/bcachefs/fsck.h                  |  10 +-
 fs/bcachefs/inode.c                 |  12 +-
 fs/bcachefs/inode.h                 |  12 +-
 fs/bcachefs/journal_io.c            |  15 +-
 fs/bcachefs/lru.c                   |   3 +-
 fs/bcachefs/lru.h                   |   3 +-
 fs/bcachefs/move.c                  |  10 +-
 fs/bcachefs/opts.c                  |   5 +
 fs/bcachefs/opts.h                  |   5 +-
 fs/bcachefs/quota.c                 |   3 +-
 fs/bcachefs/quota.h                 |   4 +-
 fs/bcachefs/recovery.c              | 353 ++++++++++++++--------------
 fs/bcachefs/reflink.c               |   9 +-
 fs/bcachefs/reflink.h               |   8 +-
 fs/bcachefs/subvolume.c             |  16 +-
 fs/bcachefs/subvolume.h             |  14 +-
 fs/bcachefs/super-io.c              |  91 +++++--
 fs/bcachefs/super-io.h              |   3 +-
 fs/bcachefs/super.c                 |   1 +
 fs/bcachefs/xattr.c                 |   3 +-
 fs/bcachefs/xattr.h                 |   3 +-
 44 files changed, 700 insertions(+), 521 deletions(-)

-- 
2.40.1



end of thread, other threads:[~2023-07-13 15:33 UTC | newest]

Thread overview: 17+ messages (download: mbox.gz / follow: Atom feed)
2023-07-09 17:15 [PATCH 00/10] bcachefs - semvar, forward compatibility Kent Overstreet
2023-07-09 17:15 ` [PATCH 01/10] bcachefs: Allow for unknown btree IDs Kent Overstreet
2023-07-09 17:15 ` [PATCH 02/10] bcachefs: Allow for unknown key types Kent Overstreet
2023-07-09 17:15 ` [PATCH 03/10] bcachefs: Refactor bch_sb_field_ops handling Kent Overstreet
2023-07-09 17:15 ` [PATCH 04/10] bcachefs: Change check for invalid key types Kent Overstreet
2023-07-09 17:15 ` [PATCH 05/10] bcachefs: BCH_SB_VERSION_UPGRADE_COMPLETE() Kent Overstreet
2023-07-13 13:42   ` Brian Foster
2023-07-13 15:31     ` Kent Overstreet
2023-07-09 17:15 ` [PATCH 06/10] bcachefs: version_upgrade is now an enum Kent Overstreet
2023-07-09 17:15 ` [PATCH 07/10] bcachefs: Kill bch2_bucket_gens_read() Kent Overstreet
2023-07-09 17:15 ` [PATCH 08/10] bcachefs: Stash journal replay params in bch_fs Kent Overstreet
2023-07-09 17:15 ` [PATCH 09/10] bcachefs: Enumerate recovery passes Kent Overstreet
2023-07-09 17:15 ` [PATCH 10/10] bcachefs: bcachefs_metadata_version_major_minor Kent Overstreet
2023-07-09 17:49   ` Thomas Weißschuh
2023-07-09 18:31     ` Kent Overstreet
2023-07-09 19:29       ` Thomas Weißschuh
2023-07-09 20:08         ` Kent Overstreet
