linux-bcachefs.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 0/6] [RFC WIP] bcachefs: online fsck
@ 2023-12-06 20:33 Kent Overstreet
  2023-12-06 20:33 ` [PATCH 1/6] bcachefs: thread_with_file Kent Overstreet
                   ` (5 more replies)
  0 siblings, 6 replies; 13+ messages in thread
From: Kent Overstreet @ 2023-12-06 20:33 UTC (permalink / raw)
  To: linux-bcachefs; +Cc: Kent Overstreet, djwong

Spent a couple days hacking code with Darrick, and online fsck seems to
be coming _way_ sooner than I expected/realized - various threads coming
together nicely.

What works - most of checking alloc info:

All but the original btree_gc.c code already runs while we're RW, and
various internal processes are actively modifying the filesystem:
copygc, rebalance, the do_discard worker. This means that in particular
the check alloc info passes are already well tested in RW mode with
active use.

There's one known concurrency bug in that part of fsck, which is that we
process gaps in the alloc btree all at once, which is incorrect since we
don't have any way of taking range locks on the key cache, but that bug
doesn't affect online fsck any more than regular fsck.

Still debating how to fix that one, just dropping the "process a gap all
at once" and going back to key-by-key is a serious regression on mkfs
performance on huge filesystems, and fsck performance when the fs is
still mostly empty.

Todo: btree_gc.c

The old btree_gc code used to run while the fs was in use and RW (before
bcachefs was a fs); it was well tested and relied upon then, but the
traversal order meant that it had to block btree topology changes,
effectively any interior btree node update. As filesystems keep getting
bigger that is definitely no longer feasible, so that code will all need
to be reworked and modernized (and possibly split into multiple passes,
i.e. splitting out checking of indirect extent refcounts).

Todo: fsck.c

The passes that check filesystem-level structure were written assuming
the filesystem was not in use and being modified; they use the normal
btree transaction API that handles locking w.r.t. other transactions,
but not "correctly"; they do things like build up state (walking a btree
path, tracking where extents end in different snapshots), but the btree
transactions (trans_begin() -> trans_commit()) are smaller than they
should be to protect that state as they're walking things.

The main reason for this is that currently, there's a fixed limit on the
number of btree_paths (it's a fixed array) in the btree_trans object -
64. I've started looking at lifting that limit and making the array of
paths dynamically growable, and I think that's going to be the way to go
- the alternative would be introducing extra outside locking to all the
filesystem paths that touch that stuff, and we don't want to do that.

Fortunately other developments have happened that make growing the max
btree_paths look a lot more reasonable than when last I was considering
it, and fsck should be a good place for testing that out and seeing what
growing pains we encounter.

other misc todo:
 - the thread_with_file stuff can probably use further cleanup and
   refactoring; also the f_ops we implement needs a .poll() so userspace
   doesn't spin when using O_NONBLOCK

 - ???

 - profit!

Kent Overstreet (6):
  bcachefs: thread_with_file
  bcachefs: Add ability to redirect log output
  bcachefs: Mark recovery passses that are safe to run online
  bcachefs: bch2_run_online_recovery_passes()
  bcachefs: BCH_IOCTL_FSCK_OFFLINE
  bcachefs: BCH_IOCTL_FSCK_ONLINE

 fs/bcachefs/bcachefs.h       |  61 ++++--
 fs/bcachefs/bcachefs_ioctl.h |  24 +++
 fs/bcachefs/chardev.c        | 376 ++++++++++++++++++++++++++++++-----
 fs/bcachefs/opts.h           |   5 +
 fs/bcachefs/recovery.c       |  51 +++--
 fs/bcachefs/recovery.h       |   1 +
 fs/bcachefs/recovery_types.h |  73 +++----
 fs/bcachefs/super.c          |  28 +++
 8 files changed, 506 insertions(+), 113 deletions(-)

-- 
2.42.0


^ permalink raw reply	[flat|nested] 13+ messages in thread

* [PATCH 1/6] bcachefs: thread_with_file
  2023-12-06 20:33 [PATCH 0/6] [RFC WIP] bcachefs: online fsck Kent Overstreet
@ 2023-12-06 20:33 ` Kent Overstreet
  2023-12-06 20:33 ` [PATCH 2/6] bcachefs: Add ability to redirect log output Kent Overstreet
                   ` (4 subsequent siblings)
  5 siblings, 0 replies; 13+ messages in thread
From: Kent Overstreet @ 2023-12-06 20:33 UTC (permalink / raw)
  To: linux-bcachefs; +Cc: Kent Overstreet, djwong

Abstract out a new helper from the data job code, for connecting a
kthread to a file descriptor.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
---
 fs/bcachefs/chardev.c | 113 ++++++++++++++++++++++++++----------------
 1 file changed, 69 insertions(+), 44 deletions(-)

diff --git a/fs/bcachefs/chardev.c b/fs/bcachefs/chardev.c
index ba0436ae6b05..8e3ac2d32298 100644
--- a/fs/bcachefs/chardev.c
+++ b/fs/bcachefs/chardev.c
@@ -29,6 +29,61 @@ static int copy_to_user_errcode(void __user *to, const void *from, unsigned long
 	return copy_to_user(to, from, n) ? -EFAULT : 0;
 }
 
+struct thread_with_file {
+	struct task_struct	*task;
+	int			ret;
+};
+
+static void thread_with_file_exit(struct thread_with_file *thr)
+{
+	kthread_stop(thr->task);
+	put_task_struct(thr->task);
+}
+
+static int run_thread_with_file(struct thread_with_file *thr,
+				const struct file_operations *fops,
+				int (*fn)(void *), const char *fmt, ...)
+{
+	va_list args;
+	struct file *file = NULL;
+	int ret, fd = -1;
+	struct printbuf name = PRINTBUF;
+	unsigned fd_flags = O_RDONLY|O_CLOEXEC|O_NONBLOCK;
+
+	va_start(args, fmt);
+	prt_vprintf(&name, fmt, args);
+	va_end(args);
+
+	thr->ret = 0;
+	thr->task = kthread_create(fn, thr, name.buf);
+	ret = PTR_ERR_OR_ZERO(thr->task);
+	if (ret)
+		goto err;
+
+	ret = get_unused_fd_flags(fd_flags);
+	if (ret < 0)
+		goto err_stop_task;
+	fd = ret;
+
+	file = anon_inode_getfile(name.buf, fops, thr, fd_flags);
+	ret = PTR_ERR_OR_ZERO(file);
+	if (ret)
+		goto err_put_fd;
+
+	fd_install(fd, file);
+	get_task_struct(thr->task);
+	wake_up_process(thr->task);
+	printbuf_exit(&name);
+	return fd;
+err_put_fd:
+	put_unused_fd(fd);
+err_stop_task:
+	kthread_stop(thr->task);
+err:
+	printbuf_exit(&name);
+	return ret;
+}
+
 /* returns with ref on ca->ref */
 static struct bch_dev *bch2_device_lookup(struct bch_fs *c, u64 dev,
 					  unsigned flags)
@@ -299,31 +354,27 @@ static long bch2_ioctl_disk_set_state(struct bch_fs *c,
 }
 
 struct bch_data_ctx {
+	struct thread_with_file		thr;
+
 	struct bch_fs			*c;
 	struct bch_ioctl_data		arg;
 	struct bch_move_stats		stats;
-
-	int				ret;
-
-	struct task_struct		*thread;
 };
 
 static int bch2_data_thread(void *arg)
 {
-	struct bch_data_ctx *ctx = arg;
-
-	ctx->ret = bch2_data_job(ctx->c, &ctx->stats, ctx->arg);
+	struct bch_data_ctx *ctx = container_of(arg, struct bch_data_ctx, thr);
 
+	ctx->thr.ret = bch2_data_job(ctx->c, &ctx->stats, ctx->arg);
 	ctx->stats.data_type = U8_MAX;
 	return 0;
 }
 
 static int bch2_data_job_release(struct inode *inode, struct file *file)
 {
-	struct bch_data_ctx *ctx = file->private_data;
+	struct bch_data_ctx *ctx = container_of(file->private_data, struct bch_data_ctx, thr);
 
-	kthread_stop(ctx->thread);
-	put_task_struct(ctx->thread);
+	thread_with_file_exit(&ctx->thr);
 	kfree(ctx);
 	return 0;
 }
@@ -331,7 +382,7 @@ static int bch2_data_job_release(struct inode *inode, struct file *file)
 static ssize_t bch2_data_job_read(struct file *file, char __user *buf,
 				  size_t len, loff_t *ppos)
 {
-	struct bch_data_ctx *ctx = file->private_data;
+	struct bch_data_ctx *ctx = container_of(file->private_data, struct bch_data_ctx, thr);
 	struct bch_fs *c = ctx->c;
 	struct bch_ioctl_data_event e = {
 		.type			= BCH_DATA_EVENT_PROGRESS,
@@ -357,10 +408,8 @@ static const struct file_operations bcachefs_data_ops = {
 static long bch2_ioctl_data(struct bch_fs *c,
 			    struct bch_ioctl_data arg)
 {
-	struct bch_data_ctx *ctx = NULL;
-	struct file *file = NULL;
-	unsigned flags = O_RDONLY|O_CLOEXEC|O_NONBLOCK;
-	int ret, fd = -1;
+	struct bch_data_ctx *ctx;
+	int ret;
 
 	if (!capable(CAP_SYS_ADMIN))
 		return -EPERM;
@@ -375,36 +424,12 @@ static long bch2_ioctl_data(struct bch_fs *c,
 	ctx->c = c;
 	ctx->arg = arg;
 
-	ctx->thread = kthread_create(bch2_data_thread, ctx,
-				     "bch-data/%s", c->name);
-	if (IS_ERR(ctx->thread)) {
-		ret = PTR_ERR(ctx->thread);
-		goto err;
-	}
-
-	ret = get_unused_fd_flags(flags);
+	ret = run_thread_with_file(&ctx->thr,
+				   &bcachefs_data_ops,
+				   bch2_data_thread,
+				   "bch-data/%s", c->name);
 	if (ret < 0)
-		goto err;
-	fd = ret;
-
-	file = anon_inode_getfile("[bcachefs]", &bcachefs_data_ops, ctx, flags);
-	if (IS_ERR(file)) {
-		ret = PTR_ERR(file);
-		goto err;
-	}
-
-	fd_install(fd, file);
-
-	get_task_struct(ctx->thread);
-	wake_up_process(ctx->thread);
-
-	return fd;
-err:
-	if (fd >= 0)
-		put_unused_fd(fd);
-	if (!IS_ERR_OR_NULL(ctx->thread))
-		kthread_stop(ctx->thread);
-	kfree(ctx);
+		kfree(ctx);
 	return ret;
 }
 
-- 
2.42.0


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH 2/6] bcachefs: Add ability to redirect log output
  2023-12-06 20:33 [PATCH 0/6] [RFC WIP] bcachefs: online fsck Kent Overstreet
  2023-12-06 20:33 ` [PATCH 1/6] bcachefs: thread_with_file Kent Overstreet
@ 2023-12-06 20:33 ` Kent Overstreet
  2023-12-08 20:24   ` Brian Foster
  2023-12-06 20:33 ` [PATCH 3/6] bcachefs: Mark recovery passses that are safe to run online Kent Overstreet
                   ` (3 subsequent siblings)
  5 siblings, 1 reply; 13+ messages in thread
From: Kent Overstreet @ 2023-12-06 20:33 UTC (permalink / raw)
  To: linux-bcachefs; +Cc: Kent Overstreet, djwong

Upcoming patches are going to add two new ioctls for running fsck in the
kernel, but pretending that we're running our normal userspace fsck.

This patch adds some plumbing for redirecting our normal log messages
away from the dmesg log to a thread_with_file file descriptor - via a
struct log_output, which will be consumed by the fsck f_op's read method.

The new ioctls will allow for running fsck in the kernel against an
offline filesystem (without mounting it), and an online filesystem. For
an offline filesystem we need a way to pass in a pointer to the
log_output, which is done via a new hidden opts.h option.

For online fsck, we can set c->output directly, but only want to
redirect log messages from the thread running fsck - hence the new
c->output_filter method.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
---
 fs/bcachefs/bcachefs.h | 54 +++++++++++++++++++++++++++++++-----------
 fs/bcachefs/opts.h     |  5 ++++
 fs/bcachefs/super.c    | 28 ++++++++++++++++++++++
 3 files changed, 73 insertions(+), 14 deletions(-)

diff --git a/fs/bcachefs/bcachefs.h b/fs/bcachefs/bcachefs.h
index bb2a0cc43f83..4689a6d6233b 100644
--- a/fs/bcachefs/bcachefs.h
+++ b/fs/bcachefs/bcachefs.h
@@ -264,36 +264,54 @@ do {									\
 
 #define bch2_fmt(_c, fmt)		bch2_log_msg(_c, fmt "\n")
 
+void __bch2_print(struct bch_fs *c, const char *fmt, ...);
+
+#define maybe_dev_to_fs(_c)	_Generic((_c),				\
+	struct bch_dev *:	((struct bch_dev *) (_c))->fs,		\
+	struct bch_fs *:	(_c))
+
+#define bch2_print(_c, ...) __bch2_print(maybe_dev_to_fs(_c), __VA_ARGS__)
+
+#define bch2_print_ratelimited(_c, ...)					\
+do {									\
+	static DEFINE_RATELIMIT_STATE(_rs,				\
+				      DEFAULT_RATELIMIT_INTERVAL,	\
+				      DEFAULT_RATELIMIT_BURST);		\
+									\
+	if (__ratelimit(&_rs))						\
+		bch2_print(_c, __VA_ARGS__);				\
+} while (0)
+
 #define bch_info(c, fmt, ...) \
-	printk(KERN_INFO bch2_fmt(c, fmt), ##__VA_ARGS__)
+	bch2_print(c, KERN_INFO bch2_fmt(c, fmt), ##__VA_ARGS__)
 #define bch_notice(c, fmt, ...) \
-	printk(KERN_NOTICE bch2_fmt(c, fmt), ##__VA_ARGS__)
+	bch2_print(c, KERN_NOTICE bch2_fmt(c, fmt), ##__VA_ARGS__)
 #define bch_warn(c, fmt, ...) \
-	printk(KERN_WARNING bch2_fmt(c, fmt), ##__VA_ARGS__)
+	bch2_print(c, KERN_WARNING bch2_fmt(c, fmt), ##__VA_ARGS__)
 #define bch_warn_ratelimited(c, fmt, ...) \
-	printk_ratelimited(KERN_WARNING bch2_fmt(c, fmt), ##__VA_ARGS__)
+	bch2_print_ratelimited(c, KERN_WARNING bch2_fmt(c, fmt), ##__VA_ARGS__)
 
 #define bch_err(c, fmt, ...) \
-	printk(KERN_ERR bch2_fmt(c, fmt), ##__VA_ARGS__)
+	bch2_print(c, KERN_ERR bch2_fmt(c, fmt), ##__VA_ARGS__)
 #define bch_err_dev(ca, fmt, ...) \
-	printk(KERN_ERR bch2_fmt_dev(ca, fmt), ##__VA_ARGS__)
+	bch2_print(c, KERN_ERR bch2_fmt_dev(ca, fmt), ##__VA_ARGS__)
 #define bch_err_dev_offset(ca, _offset, fmt, ...) \
-	printk(KERN_ERR bch2_fmt_dev_offset(ca, _offset, fmt), ##__VA_ARGS__)
+	bch2_print(c, KERN_ERR bch2_fmt_dev_offset(ca, _offset, fmt), ##__VA_ARGS__)
 #define bch_err_inum(c, _inum, fmt, ...) \
-	printk(KERN_ERR bch2_fmt_inum(c, _inum, fmt), ##__VA_ARGS__)
+	bch2_print(c, KERN_ERR bch2_fmt_inum(c, _inum, fmt), ##__VA_ARGS__)
 #define bch_err_inum_offset(c, _inum, _offset, fmt, ...) \
-	printk(KERN_ERR bch2_fmt_inum_offset(c, _inum, _offset, fmt), ##__VA_ARGS__)
+	bch2_print(c, KERN_ERR bch2_fmt_inum_offset(c, _inum, _offset, fmt), ##__VA_ARGS__)
 
 #define bch_err_ratelimited(c, fmt, ...) \
-	printk_ratelimited(KERN_ERR bch2_fmt(c, fmt), ##__VA_ARGS__)
+	bch2_print_ratelimited(c, KERN_ERR bch2_fmt(c, fmt), ##__VA_ARGS__)
 #define bch_err_dev_ratelimited(ca, fmt, ...) \
-	printk_ratelimited(KERN_ERR bch2_fmt_dev(ca, fmt), ##__VA_ARGS__)
+	bch2_print_ratelimited(ca, KERN_ERR bch2_fmt_dev(ca, fmt), ##__VA_ARGS__)
 #define bch_err_dev_offset_ratelimited(ca, _offset, fmt, ...) \
-	printk_ratelimited(KERN_ERR bch2_fmt_dev_offset(ca, _offset, fmt), ##__VA_ARGS__)
+	bch2_print_ratelimited(ca, KERN_ERR bch2_fmt_dev_offset(ca, _offset, fmt), ##__VA_ARGS__)
 #define bch_err_inum_ratelimited(c, _inum, fmt, ...) \
-	printk_ratelimited(KERN_ERR bch2_fmt_inum(c, _inum, fmt), ##__VA_ARGS__)
+	bch2_print_ratelimited(c, KERN_ERR bch2_fmt_inum(c, _inum, fmt), ##__VA_ARGS__)
 #define bch_err_inum_offset_ratelimited(c, _inum, _offset, fmt, ...) \
-	printk_ratelimited(KERN_ERR bch2_fmt_inum_offset(c, _inum, _offset, fmt), ##__VA_ARGS__)
+	bch2_print_ratelimited(c, KERN_ERR bch2_fmt_inum_offset(c, _inum, _offset, fmt), ##__VA_ARGS__)
 
 #define bch_err_fn(_c, _ret)						\
 do {									\
@@ -446,6 +464,12 @@ enum bch_time_stats {
 
 struct btree;
 
+struct log_output {
+	spinlock_t		lock;
+	wait_queue_head_t	wait;
+	struct printbuf		buf;
+};
+
 enum gc_phase {
 	GC_PHASE_NOT_RUNNING,
 	GC_PHASE_START,
@@ -700,6 +724,8 @@ struct bch_fs {
 	struct super_block	*vfs_sb;
 	dev_t			dev;
 	char			name[40];
+	struct log_output	*output;
+	struct task_struct	*output_filter;
 
 	/* ro/rw, add/remove/resize devices: */
 	struct rw_semaphore	state_lock;
diff --git a/fs/bcachefs/opts.h b/fs/bcachefs/opts.h
index 8526f177450a..91026dfb8c83 100644
--- a/fs/bcachefs/opts.h
+++ b/fs/bcachefs/opts.h
@@ -419,6 +419,11 @@ enum fsck_err_opts {
 	  OPT_BOOL(),							\
 	  BCH2_NO_SB_OPT,		false,				\
 	  NULL,		"Allocate the buckets_nouse bitmap")		\
+	x(log_output,			u64,				\
+	  0,								\
+	  OPT_UINT(0, S64_MAX),						\
+	  BCH2_NO_SB_OPT,		false,				\
+	  NULL,		"Allocate the buckets_nouse bitmap")		\
 	x(project,			u8,				\
 	  OPT_INODE,							\
 	  OPT_BOOL(),							\
diff --git a/fs/bcachefs/super.c b/fs/bcachefs/super.c
index e7f186b45df1..6c320defd593 100644
--- a/fs/bcachefs/super.c
+++ b/fs/bcachefs/super.c
@@ -80,6 +80,32 @@ const char * const bch2_fs_flag_strs[] = {
 	NULL
 };
 
+void __bch2_print(struct bch_fs *c, const char *fmt, ...)
+{
+	struct log_output *output = c->output;
+	va_list args;
+
+	if (c->output_filter && c->output_filter != current)
+		output = NULL;
+
+	va_start(args, fmt);
+	if (likely(!output)) {
+		vprintk(fmt, args);
+	} else {
+		unsigned long flags;
+
+		if (fmt[0] == KERN_SOH[0])
+			fmt += 2;
+
+		spin_lock_irqsave(&output->lock, flags);
+		prt_vprintf(&output->buf, fmt, args);
+		spin_unlock_irqrestore(&output->lock, flags);
+
+		wake_up(&output->wait);
+	}
+	va_end(args);
+}
+
 #define KTYPE(type)							\
 static const struct attribute_group type ## _group = {			\
 	.attrs = type ## _files						\
@@ -703,6 +729,8 @@ static struct bch_fs *bch2_fs_alloc(struct bch_sb *sb, struct bch_opts opts)
 		goto out;
 	}
 
+	c->output = (void *)(unsigned long) opts.log_output;
+
 	__module_get(THIS_MODULE);
 
 	closure_init(&c->cl, NULL);
-- 
2.42.0


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH 3/6] bcachefs: Mark recovery passses that are safe to run online
  2023-12-06 20:33 [PATCH 0/6] [RFC WIP] bcachefs: online fsck Kent Overstreet
  2023-12-06 20:33 ` [PATCH 1/6] bcachefs: thread_with_file Kent Overstreet
  2023-12-06 20:33 ` [PATCH 2/6] bcachefs: Add ability to redirect log output Kent Overstreet
@ 2023-12-06 20:33 ` Kent Overstreet
  2023-12-06 20:33 ` [PATCH 4/6] bcachefs: bch2_run_online_recovery_passes() Kent Overstreet
                   ` (2 subsequent siblings)
  5 siblings, 0 replies; 13+ messages in thread
From: Kent Overstreet @ 2023-12-06 20:33 UTC (permalink / raw)
  To: linux-bcachefs; +Cc: Kent Overstreet, djwong

Online fsck is coming, and many of our recovery/fsck passes are already
safe to run while the filesystem is in use - mark which ones.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
---
 fs/bcachefs/recovery_types.h | 73 ++++++++++++++++++------------------
 1 file changed, 37 insertions(+), 36 deletions(-)

diff --git a/fs/bcachefs/recovery_types.h b/fs/bcachefs/recovery_types.h
index 515e3d62c2ac..3af29a2762c6 100644
--- a/fs/bcachefs/recovery_types.h
+++ b/fs/bcachefs/recovery_types.h
@@ -6,43 +6,44 @@
 #define PASS_FSCK		BIT(1)
 #define PASS_UNCLEAN		BIT(2)
 #define PASS_ALWAYS		BIT(3)
+#define PASS_ONLINE		BIT(4)
 
-#define BCH_RECOVERY_PASSES()									\
-	x(alloc_read,			PASS_ALWAYS)						\
-	x(stripes_read,			PASS_ALWAYS)						\
-	x(initialize_subvolumes,	0)							\
-	x(snapshots_read,		PASS_ALWAYS)						\
-	x(check_topology,		0)							\
-	x(check_allocations,		PASS_FSCK)						\
-	x(trans_mark_dev_sbs,		PASS_ALWAYS|PASS_SILENT)				\
-	x(fs_journal_alloc,		PASS_ALWAYS|PASS_SILENT)				\
-	x(set_may_go_rw,		PASS_ALWAYS|PASS_SILENT)				\
-	x(journal_replay,		PASS_ALWAYS)						\
-	x(check_alloc_info,		PASS_FSCK)						\
-	x(check_lrus,			PASS_FSCK)						\
-	x(check_btree_backpointers,	PASS_FSCK)						\
-	x(check_backpointers_to_extents,PASS_FSCK)						\
-	x(check_extents_to_backpointers,PASS_FSCK)						\
-	x(check_alloc_to_lru_refs,	PASS_FSCK)						\
-	x(fs_freespace_init,		PASS_ALWAYS|PASS_SILENT)				\
-	x(bucket_gens_init,		0)							\
-	x(check_snapshot_trees,		PASS_FSCK)						\
-	x(check_snapshots,		PASS_FSCK)						\
-	x(check_subvols,		PASS_FSCK)						\
-	x(delete_dead_snapshots,	PASS_FSCK)						\
-	x(fs_upgrade_for_subvolumes,	0)							\
-	x(resume_logged_ops,		PASS_ALWAYS)						\
-	x(check_inodes,			PASS_FSCK)						\
-	x(check_extents,		PASS_FSCK)						\
-	x(check_indirect_extents,	PASS_FSCK)						\
-	x(check_dirents,		PASS_FSCK)						\
-	x(check_xattrs,			PASS_FSCK)						\
-	x(check_root,			PASS_FSCK)						\
-	x(check_directory_structure,	PASS_FSCK)						\
-	x(check_nlinks,			PASS_FSCK)						\
-	x(delete_dead_inodes,		PASS_FSCK|PASS_UNCLEAN)					\
-	x(fix_reflink_p,		0)							\
-	x(set_fs_needs_rebalance,	0)							\
+#define BCH_RECOVERY_PASSES()							\
+	x(alloc_read,				PASS_ALWAYS)			\
+	x(stripes_read,				PASS_ALWAYS)			\
+	x(initialize_subvolumes,		0)				\
+	x(snapshots_read,			PASS_ALWAYS)			\
+	x(check_topology,			0)				\
+	x(check_allocations,			PASS_FSCK)			\
+	x(trans_mark_dev_sbs,			PASS_ALWAYS|PASS_SILENT)	\
+	x(fs_journal_alloc,			PASS_ALWAYS|PASS_SILENT)	\
+	x(set_may_go_rw,			PASS_ALWAYS|PASS_SILENT)	\
+	x(journal_replay,			PASS_ALWAYS)			\
+	x(check_alloc_info,			PASS_ONLINE|PASS_FSCK)		\
+	x(check_lrus,				PASS_ONLINE|PASS_FSCK)		\
+	x(check_btree_backpointers,		PASS_ONLINE|PASS_FSCK)		\
+	x(check_backpointers_to_extents,	PASS_ONLINE|PASS_FSCK)		\
+	x(check_extents_to_backpointers,	PASS_ONLINE|PASS_FSCK)		\
+	x(check_alloc_to_lru_refs,		PASS_ONLINE|PASS_FSCK)		\
+	x(fs_freespace_init,			PASS_ALWAYS|PASS_SILENT)	\
+	x(bucket_gens_init,			0)				\
+	x(check_snapshot_trees,			PASS_ONLINE|PASS_FSCK)		\
+	x(check_snapshots,			PASS_ONLINE|PASS_FSCK)		\
+	x(check_subvols,			PASS_ONLINE|PASS_FSCK)		\
+	x(delete_dead_snapshots,		PASS_ONLINE|PASS_FSCK)		\
+	x(fs_upgrade_for_subvolumes,		0)				\
+	x(resume_logged_ops,			PASS_ALWAYS)			\
+	x(check_inodes,				PASS_FSCK)			\
+	x(check_extents,			PASS_FSCK)			\
+	x(check_indirect_extents,		PASS_FSCK)			\
+	x(check_dirents,			PASS_FSCK)			\
+	x(check_xattrs,				PASS_FSCK)			\
+	x(check_root,				PASS_FSCK)			\
+	x(check_directory_structure,		PASS_FSCK)			\
+	x(check_nlinks,				PASS_FSCK)			\
+	x(delete_dead_inodes,			PASS_FSCK|PASS_UNCLEAN)		\
+	x(fix_reflink_p,			0)				\
+	x(set_fs_needs_rebalance,		0)				\
 
 enum bch_recovery_pass {
 #define x(n, when)	BCH_RECOVERY_PASS_##n,
-- 
2.42.0


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH 4/6] bcachefs: bch2_run_online_recovery_passes()
  2023-12-06 20:33 [PATCH 0/6] [RFC WIP] bcachefs: online fsck Kent Overstreet
                   ` (2 preceding siblings ...)
  2023-12-06 20:33 ` [PATCH 3/6] bcachefs: Mark recovery passses that are safe to run online Kent Overstreet
@ 2023-12-06 20:33 ` Kent Overstreet
  2023-12-08 20:25   ` Brian Foster
  2023-12-06 20:33 ` [PATCH 5/6] bcachefs: BCH_IOCTL_FSCK_OFFLINE Kent Overstreet
  2023-12-06 20:33 ` [PATCH 6/6] bcachefs: BCH_IOCTL_FSCK_ONLINE Kent Overstreet
  5 siblings, 1 reply; 13+ messages in thread
From: Kent Overstreet @ 2023-12-06 20:33 UTC (permalink / raw)
  To: linux-bcachefs; +Cc: Kent Overstreet, djwong

Add a new helper for running online recovery passes - i.e. online fsck.
This is a subset of our normal recovery passes, and does not - for now -
use or follow c->curr_recovery_pass.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
---
 fs/bcachefs/bcachefs.h |  7 ++++++
 fs/bcachefs/recovery.c | 51 ++++++++++++++++++++++++++++--------------
 fs/bcachefs/recovery.h |  1 +
 3 files changed, 42 insertions(+), 17 deletions(-)

diff --git a/fs/bcachefs/bcachefs.h b/fs/bcachefs/bcachefs.h
index 4689a6d6233b..83ef5d29c226 100644
--- a/fs/bcachefs/bcachefs.h
+++ b/fs/bcachefs/bcachefs.h
@@ -1038,6 +1038,13 @@ struct bch_fs {
 	/* RECOVERY */
 	u64			journal_replay_seq_start;
 	u64			journal_replay_seq_end;
+	/*
+	 * Two different uses:
+	 * "Has this fsck pass?" - i.e. should this type of error be an
+	 * emergency read-only
+	 * And, in certain situations fsck will rewind to an earlier pass: used
+	 * for signaling to the toplevel code which pass we want to run now.
+	 */
 	enum bch_recovery_pass	curr_recovery_pass;
 	/* bitmap of explicitly enabled recovery passes: */
 	u64			recovery_passes_explicit;
diff --git a/fs/bcachefs/recovery.c b/fs/bcachefs/recovery.c
index 585b4928964f..262c923b2f1a 100644
--- a/fs/bcachefs/recovery.c
+++ b/fs/bcachefs/recovery.c
@@ -634,7 +634,7 @@ u64 bch2_fsck_recovery_passes(void)
 
 static bool should_run_recovery_pass(struct bch_fs *c, enum bch_recovery_pass pass)
 {
-	struct recovery_pass_fn *p = recovery_pass_fns + c->curr_recovery_pass;
+	struct recovery_pass_fn *p = recovery_pass_fns + pass;
 
 	if (c->opts.norecovery && pass > BCH_RECOVERY_PASS_snapshots_read)
 		return false;
@@ -651,39 +651,56 @@ static bool should_run_recovery_pass(struct bch_fs *c, enum bch_recovery_pass pa
 
 static int bch2_run_recovery_pass(struct bch_fs *c, enum bch_recovery_pass pass)
 {
+	struct recovery_pass_fn *p = recovery_pass_fns + pass;
 	int ret;
 
-	c->curr_recovery_pass = pass;
+	if (!(p->when & PASS_SILENT))
+		printk(KERN_INFO bch2_log_msg(c, "%s..."),
+		       bch2_recovery_passes[pass]);
+	ret = p->fn(c);
+	if (ret)
+		return ret;
+	if (!(p->when & PASS_SILENT))
+		printk(KERN_CONT " done\n");
 
-	if (should_run_recovery_pass(c, pass)) {
-		struct recovery_pass_fn *p = recovery_pass_fns + pass;
+	return 0;
+}
 
-		if (!(p->when & PASS_SILENT))
-			printk(KERN_INFO bch2_log_msg(c, "%s..."),
-			       bch2_recovery_passes[pass]);
-		ret = p->fn(c);
-		if (ret)
-			return ret;
-		if (!(p->when & PASS_SILENT))
-			printk(KERN_CONT " done\n");
+static int bch2_run_recovery_passes(struct bch_fs *c)
+{
+	int ret = 0;
 
-		c->recovery_passes_complete |= BIT_ULL(pass);
+	while (c->curr_recovery_pass < ARRAY_SIZE(recovery_pass_fns)) {
+		if (should_run_recovery_pass(c, c->curr_recovery_pass)) {
+			ret = bch2_run_recovery_pass(c, c->curr_recovery_pass);
+			if (bch2_err_matches(ret, BCH_ERR_restart_recovery))
+				continue;
+			if (ret)
+				break;
+
+			c->recovery_passes_complete |= BIT_ULL(c->curr_recovery_pass);
+		}
+		c->curr_recovery_pass++;
 	}
 
-	return 0;
+	return ret;
 }
 
-static int bch2_run_recovery_passes(struct bch_fs *c)
+int bch2_run_online_recovery_passes(struct bch_fs *c)
 {
 	int ret = 0;
 
-	while (c->curr_recovery_pass < ARRAY_SIZE(recovery_pass_fns)) {
+	for (unsigned i = 0; i < ARRAY_SIZE(recovery_pass_fns); i++) {
+		struct recovery_pass_fn *p = recovery_pass_fns + c->curr_recovery_pass;
+
+		if (!(p->when & PASS_ONLINE))
+			continue;
+
 		ret = bch2_run_recovery_pass(c, c->curr_recovery_pass);
 		if (bch2_err_matches(ret, BCH_ERR_restart_recovery))
 			continue;
 		if (ret)
 			break;
-		c->curr_recovery_pass++;
 	}
 
 	return ret;
diff --git a/fs/bcachefs/recovery.h b/fs/bcachefs/recovery.h
index 852d30567da9..447590f60609 100644
--- a/fs/bcachefs/recovery.h
+++ b/fs/bcachefs/recovery.h
@@ -25,6 +25,7 @@ static inline int bch2_run_explicit_recovery_pass(struct bch_fs *c,
 	}
 }
 
+int bch2_run_online_recovery_passes(struct bch_fs *);
 u64 bch2_fsck_recovery_passes(void);
 
 int bch2_fs_recovery(struct bch_fs *);
-- 
2.42.0


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH 5/6] bcachefs: BCH_IOCTL_FSCK_OFFLINE
  2023-12-06 20:33 [PATCH 0/6] [RFC WIP] bcachefs: online fsck Kent Overstreet
                   ` (3 preceding siblings ...)
  2023-12-06 20:33 ` [PATCH 4/6] bcachefs: bch2_run_online_recovery_passes() Kent Overstreet
@ 2023-12-06 20:33 ` Kent Overstreet
  2023-12-08 20:26   ` Brian Foster
  2023-12-06 20:33 ` [PATCH 6/6] bcachefs: BCH_IOCTL_FSCK_ONLINE Kent Overstreet
  5 siblings, 1 reply; 13+ messages in thread
From: Kent Overstreet @ 2023-12-06 20:33 UTC (permalink / raw)
  To: linux-bcachefs; +Cc: Kent Overstreet, djwong

This adds a new ioctl for running fsck on a list of devices.

Normally, if we wish to use the kernel's implementation of fsck we'd run
it at mount time with -o fsck. This ioctl lets us run fsck without
mounting, so that userspace bcachefs-tools can transparently switch to
the kernel's implementation of fsck when appropriate - primarily if the
kernel version of bcachefs better matches the filesystem on disk.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
---
 fs/bcachefs/bcachefs_ioctl.h |  13 +++
 fs/bcachefs/chardev.c        | 206 ++++++++++++++++++++++++++++++++++-
 fs/bcachefs/recovery.c       |   6 +-
 3 files changed, 219 insertions(+), 6 deletions(-)

diff --git a/fs/bcachefs/bcachefs_ioctl.h b/fs/bcachefs/bcachefs_ioctl.h
index 43822c17297c..07c490851742 100644
--- a/fs/bcachefs/bcachefs_ioctl.h
+++ b/fs/bcachefs/bcachefs_ioctl.h
@@ -83,6 +83,8 @@ struct bch_ioctl_incremental {
 
 #define BCH_IOCTL_DEV_USAGE_V2	_IOWR(0xbc,	18, struct bch_ioctl_dev_usage_v2)
 
+#define BCH_IOCTL_FSCK_OFFLINE		_IOW(0xbc,	19,  struct bch_ioctl_fsck_offline)
+
 /* ioctl below act on a particular file, not the filesystem as a whole: */
 
 #define BCHFS_IOC_REINHERIT_ATTRS	_IOR(0xbc, 64, const char __user *)
@@ -386,4 +388,15 @@ struct bch_ioctl_subvolume {
 #define BCH_SUBVOL_SNAPSHOT_CREATE	(1U << 0)
 #define BCH_SUBVOL_SNAPSHOT_RO		(1U << 1)
 
+/*
+ * BCH_IOCTL_FSCK_OFFLINE: run fsck from the 'bcachefs fsck' userspace command,
+ * but with the kernel's implementation of fsck:
+ */
+struct bch_ioctl_fsck_offline {
+	__u64			flags;
+	__u64			opts;		/* string */
+	__u64			nr_devs;
+	__u64			devs[0];
+};
+
 #endif /* _BCACHEFS_IOCTL_H */
diff --git a/fs/bcachefs/chardev.c b/fs/bcachefs/chardev.c
index 8e3ac2d32298..03082a001036 100644
--- a/fs/bcachefs/chardev.c
+++ b/fs/bcachefs/chardev.c
@@ -32,12 +32,15 @@ static int copy_to_user_errcode(void __user *to, const void *from, unsigned long
 struct thread_with_file {
 	struct task_struct	*task;
 	int			ret;
+	bool			done;
 };
 
 static void thread_with_file_exit(struct thread_with_file *thr)
 {
-	kthread_stop(thr->task);
-	put_task_struct(thr->task);
+	if (thr->task) {
+		kthread_stop(thr->task);
+		put_task_struct(thr->task);
+	}
 }
 
 static int run_thread_with_file(struct thread_with_file *thr,
@@ -193,8 +196,196 @@ static long bch2_ioctl_incremental(struct bch_ioctl_incremental __user *user_arg
 }
 #endif
 
+struct fsck_thread {
+	struct thread_with_file	thr;
+	struct printbuf		buf;
+	char			**devs;
+	size_t			nr_devs;
+	struct bch_opts		opts;
+
+	struct log_output	output;
+	DARRAY(char)		output2;
+};
+
+static void bch2_fsck_thread_free(struct fsck_thread *thr)
+{
+	thread_with_file_exit(&thr->thr);
+	if (thr->devs)
+		for (size_t i = 0; i < thr->nr_devs; i++)
+			kfree(thr->devs[i]);
+	darray_exit(&thr->output2);
+	printbuf_exit(&thr->output.buf);
+	kfree(thr->devs);
+	kfree(thr);
+}
+
+static int bch2_fsck_thread_release(struct inode *inode, struct file *file)
+{
+	struct fsck_thread *thr = container_of(file->private_data, struct fsck_thread, thr);
+
+	bch2_fsck_thread_free(thr);
+	return 0;
+}
+
+static bool fsck_thread_ready(struct fsck_thread *thr)
+{
+	return thr->output.buf.pos ||
+		thr->output2.nr ||
+		thr->thr.done;
+}
+
+static ssize_t bch2_fsck_thread_read(struct file *file, char __user *buf,
+				     size_t len, loff_t *ppos)
+{
+	struct fsck_thread *thr = container_of(file->private_data, struct fsck_thread, thr);
+	size_t copied = 0, b;
+	int ret = 0;
+
+	if ((file->f_flags & O_NONBLOCK) &&
+	    !fsck_thread_ready(thr))
+		return -EAGAIN;
+
+	ret = wait_event_interruptible(thr->output.wait,
+			fsck_thread_ready(thr));
+	if (ret)
+		return ret;
+
+	if (thr->thr.done)
+		return 0;
+
+	while (len) {
+		ret = darray_make_room(&thr->output2, thr->output.buf.pos);
+		if (ret)
+			break;
+
+		spin_lock_irq(&thr->output.lock);
+		b = min_t(size_t, darray_room(thr->output2), thr->output.buf.pos);
+
+		memcpy(&darray_top(thr->output2), thr->output.buf.buf, b);
+		memmove(thr->output.buf.buf,
+			thr->output.buf.buf + b,
+			thr->output.buf.pos - b);
+
+		thr->output2.nr += b;
+		thr->output.buf.pos -= b;
+		spin_unlock_irq(&thr->output.lock);
+
+		b = min(len, thr->output2.nr);
+		if (!b)
+			break;
+
+		b -= copy_to_user(buf, thr->output2.data, b);
+		if (!b) {
+			ret = -EFAULT;
+			break;
+		}
+
+		copied	+= b;
+		buf	+= b;
+		len	-= b;
+
+		memmove(thr->output2.data,
+			thr->output2.data + b,
+			thr->output2.nr - b);
+		thr->output2.nr -= b;
+	}
+
+	return copied ?: ret;
+}
+
+static const struct file_operations fsck_thread_ops = {
+	.release	= bch2_fsck_thread_release,
+	.read		= bch2_fsck_thread_read,
+	.llseek		= no_llseek,
+};
+
+static int bch2_fsck_offline_thread_fn(void *arg)
+{
+	struct fsck_thread *thr = container_of(arg, struct fsck_thread, thr);
+	struct bch_fs *c = bch2_fs_open(thr->devs, thr->nr_devs, thr->opts);
+
+	thr->thr.ret = PTR_ERR_OR_ZERO(c);
+	if (!thr->thr.ret)
+		bch2_fs_stop(c);
+
+	thr->thr.done = true;
+	wake_up(&thr->output.wait);
+	return 0;
+}
+
+static long bch2_ioctl_fsck_offline(struct bch_ioctl_fsck_offline __user *user_arg)
+{
+	struct bch_ioctl_fsck_offline arg;
+	struct fsck_thread *thr = NULL;
+	u64 *devs = NULL;
+	long ret = 0;
+
+	if (copy_from_user(&arg, user_arg, sizeof(arg)))
+		return -EFAULT;
+
+	if (arg.flags)
+		return -EINVAL;
+
+	if (!capable(CAP_SYS_ADMIN))
+		return -EPERM;
+
+	if (!(devs = kcalloc(arg.nr_devs, sizeof(*devs), GFP_KERNEL)) ||
+	    !(thr = kzalloc(sizeof(*thr), GFP_KERNEL)) ||
+	    !(thr->devs = kcalloc(arg.nr_devs, sizeof(*thr->devs), GFP_KERNEL))) {
+		ret = -ENOMEM;
+		goto err;
+	}
+
+	thr->nr_devs = arg.nr_devs;
+	thr->output.buf	= PRINTBUF;
+	thr->output.buf.atomic++;
+	spin_lock_init(&thr->output.lock);
+	init_waitqueue_head(&thr->output.wait);
+	darray_init(&thr->output2);
+
+	if (copy_from_user(devs, &user_arg->devs[0], sizeof(user_arg->devs[0]) * arg.nr_devs)) {
+		ret = -EINVAL;
+		goto err;
+	}
+
+	for (size_t i = 0; i < arg.nr_devs; i++) {
+		thr->devs[i] = strndup_user((char __user *)(unsigned long) devs[i], PATH_MAX);
+		ret = PTR_ERR_OR_ZERO(thr->devs[i]);
+		if (ret)
+			goto err;
+	}
+
+	if (arg.opts) {
+		char *optstr = strndup_user((char __user *)(unsigned long) arg.opts, 1 << 16);
+
+		ret =   PTR_ERR_OR_ZERO(optstr) ?:
+			bch2_parse_mount_opts(NULL, &thr->opts, optstr);
+		kfree(optstr);
+
+		if (ret)
+			goto err;
+	}
+
+	opt_set(thr->opts, log_output, (u64)(unsigned long)&thr->output);
+
+	ret = run_thread_with_file(&thr->thr,
+				   &fsck_thread_ops,
+				   bch2_fsck_offline_thread_fn,
+				   "bch-fsck");
+err:
+	if (ret < 0) {
+		if (thr)
+			bch2_fsck_thread_free(thr);
+		pr_err("ret %s", bch2_err_str(ret));
+	}
+	kfree(devs);
+	return ret;
+}
+
 static long bch2_global_ioctl(unsigned cmd, void __user *arg)
 {
+	long ret;
+
 	switch (cmd) {
 #if 0
 	case BCH_IOCTL_ASSEMBLE:
@@ -202,9 +393,18 @@ static long bch2_global_ioctl(unsigned cmd, void __user *arg)
 	case BCH_IOCTL_INCREMENTAL:
 		return bch2_ioctl_incremental(arg);
 #endif
+	case BCH_IOCTL_FSCK_OFFLINE: {
+		ret = bch2_ioctl_fsck_offline(arg);
+		break;
+	}
 	default:
-		return -ENOTTY;
+		ret = -ENOTTY;
+		break;
 	}
+
+	if (ret < 0)
+		ret = bch2_err_class(ret);
+	return ret;
 }
 
 static long bch2_ioctl_query_uuid(struct bch_fs *c,
diff --git a/fs/bcachefs/recovery.c b/fs/bcachefs/recovery.c
index 262c923b2f1a..2f5daecfbcf7 100644
--- a/fs/bcachefs/recovery.c
+++ b/fs/bcachefs/recovery.c
@@ -655,13 +655,13 @@ static int bch2_run_recovery_pass(struct bch_fs *c, enum bch_recovery_pass pass)
 	int ret;
 
 	if (!(p->when & PASS_SILENT))
-		printk(KERN_INFO bch2_log_msg(c, "%s..."),
-		       bch2_recovery_passes[pass]);
+		bch2_print(c, KERN_INFO bch2_log_msg(c, "%s..."),
+			   bch2_recovery_passes[pass]);
 	ret = p->fn(c);
 	if (ret)
 		return ret;
 	if (!(p->when & PASS_SILENT))
-		printk(KERN_CONT " done\n");
+		bch2_print(c, KERN_CONT " done\n");
 
 	return 0;
 }
-- 
2.42.0


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH 6/6] bcachefs: BCH_IOCTL_FSCK_ONLINE
  2023-12-06 20:33 [PATCH 0/6] [RFC WIP] bcachefs: online fsck Kent Overstreet
                   ` (4 preceding siblings ...)
  2023-12-06 20:33 ` [PATCH 5/6] bcachefs: BCH_IOCTL_FSCK_OFFLINE Kent Overstreet
@ 2023-12-06 20:33 ` Kent Overstreet
  5 siblings, 0 replies; 13+ messages in thread
From: Kent Overstreet @ 2023-12-06 20:33 UTC (permalink / raw)
  To: linux-bcachefs; +Cc: Kent Overstreet, djwong

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
---
 fs/bcachefs/bcachefs_ioctl.h | 11 +++++++
 fs/bcachefs/chardev.c        | 61 +++++++++++++++++++++++++++++++++++-
 2 files changed, 71 insertions(+), 1 deletion(-)

diff --git a/fs/bcachefs/bcachefs_ioctl.h b/fs/bcachefs/bcachefs_ioctl.h
index 07c490851742..2ac6272c8ef5 100644
--- a/fs/bcachefs/bcachefs_ioctl.h
+++ b/fs/bcachefs/bcachefs_ioctl.h
@@ -85,6 +85,8 @@ struct bch_ioctl_incremental {
 
 #define BCH_IOCTL_FSCK_OFFLINE		_IOW(0xbc,	19,  struct bch_ioctl_fsck_offline)
 
+#define BCH_IOCTL_FSCK_ONLINE		_IOW(0xbc,	20,  struct bch_ioctl_fsck_online)
+
 /* ioctl below act on a particular file, not the filesystem as a whole: */
 
 #define BCHFS_IOC_REINHERIT_ATTRS	_IOR(0xbc, 64, const char __user *)
@@ -399,4 +401,13 @@ struct bch_ioctl_fsck_offline {
 	__u64			devs[0];
 };
 
+/*
+ * BCH_IOCTL_FSCK_ONLINE: run fsck from the 'bcachefs fsck' userspace command,
+ * but with the kernel's implementation of fsck:
+ */
+struct bch_ioctl_fsck_online {
+	__u64			flags;
+	__u64			opts;		/* string */
+};
+
 #endif /* _BCACHEFS_IOCTL_H */
diff --git a/fs/bcachefs/chardev.c b/fs/bcachefs/chardev.c
index 03082a001036..c572f3b7aa1f 100644
--- a/fs/bcachefs/chardev.c
+++ b/fs/bcachefs/chardev.c
@@ -7,6 +7,7 @@
 #include "chardev.h"
 #include "journal.h"
 #include "move.h"
+#include "recovery.h"
 #include "replicas.h"
 #include "super.h"
 #include "super-io.h"
@@ -199,6 +200,7 @@ static long bch2_ioctl_incremental(struct bch_ioctl_incremental __user *user_arg
 struct fsck_thread {
 	struct thread_with_file	thr;
 	struct printbuf		buf;
+	struct bch_fs		*c;
 	char			**devs;
 	size_t			nr_devs;
 	struct bch_opts		opts;
@@ -915,6 +917,62 @@ static long bch2_ioctl_disk_resize_journal(struct bch_fs *c,
 	return ret;
 }
 
+static int bch2_fsck_online_thread_fn(void *arg)
+{
+	struct fsck_thread *thr = container_of(arg, struct fsck_thread, thr);
+	struct bch_fs *c = thr->c;
+
+	c->output_filter = current;
+	c->output = &thr->output;
+
+	c->opts.fsck = true;
+	c->curr_recovery_pass = BCH_RECOVERY_PASS_check_alloc_info;
+	bch2_run_online_recovery_passes(c);
+
+	c->output = NULL;
+	c->output_filter = NULL;
+
+	thr->thr.done = true;
+	wake_up(&thr->output.wait);
+	return 0;
+}
+
+static long bch2_ioctl_fsck_online(struct bch_fs *c,
+				   struct bch_ioctl_fsck_online arg)
+{
+	struct fsck_thread *thr = NULL;
+	long ret = 0;
+
+	/* XXX what refcount are we taking on the filesystem? */
+
+	if (arg.flags)
+		return -EINVAL;
+
+	if (!capable(CAP_SYS_ADMIN))
+		return -EPERM;
+
+	thr = kzalloc(sizeof(*thr), GFP_KERNEL);
+	if (!thr)
+		return -ENOMEM;
+
+	thr->c = c;
+	thr->output.buf	= PRINTBUF;
+	thr->output.buf.atomic++;
+	spin_lock_init(&thr->output.lock);
+	init_waitqueue_head(&thr->output.wait);
+	darray_init(&thr->output2);
+
+	ret = run_thread_with_file(&thr->thr,
+				   &fsck_thread_ops,
+				   bch2_fsck_online_thread_fn,
+				   "bch-fsck");
+	if (ret < 0) {
+		bch_err_fn(c, ret);
+		bch2_fsck_thread_free(thr);
+	}
+	return ret;
+}
+
 #define BCH_IOCTL(_name, _argtype)					\
 do {									\
 	_argtype i;							\
@@ -970,7 +1028,8 @@ long bch2_fs_ioctl(struct bch_fs *c, unsigned cmd, void __user *arg)
 		BCH_IOCTL(disk_resize, struct bch_ioctl_disk_resize);
 	case BCH_IOCTL_DISK_RESIZE_JOURNAL:
 		BCH_IOCTL(disk_resize_journal, struct bch_ioctl_disk_resize_journal);
-
+	case BCH_IOCTL_FSCK_ONLINE:
+		BCH_IOCTL(fsck_online, struct bch_ioctl_fsck_online);
 	default:
 		return -ENOTTY;
 	}
-- 
2.42.0


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* Re: [PATCH 2/6] bcachefs: Add ability to redirect log output
  2023-12-06 20:33 ` [PATCH 2/6] bcachefs: Add ability to redirect log output Kent Overstreet
@ 2023-12-08 20:24   ` Brian Foster
  2023-12-08 20:35     ` Kent Overstreet
  0 siblings, 1 reply; 13+ messages in thread
From: Brian Foster @ 2023-12-08 20:24 UTC (permalink / raw)
  To: Kent Overstreet; +Cc: linux-bcachefs, djwong

On Wed, Dec 06, 2023 at 03:33:06PM -0500, Kent Overstreet wrote:
> Upcoming patches are going to add two new ioctls for running fsck in the
> kernel, but pretending that we're running our normal userspace fsck.
> 
> This patch adds some plumbing for redirecting our normal log messages
> away from the dmesg log to a thread_with_file file descriptor - via a
> struct log_output, which will be consumed by the fsck f_op's read method.
> 
> The new ioctls will allow for running fsck in the kernel against an
> offline filesystem (without mounting it), and an online filesystem. For
> an offline filesystem we need a way to pass in a pointer to the
> log_output, which is done via a new hidden opts.h option.
> 
> For online fsck, we can set c->output directly, but only want to
> redirect log messages from the thread running fsck - hence the new
> c->output_filter method.
> 
> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
> ---
>  fs/bcachefs/bcachefs.h | 54 +++++++++++++++++++++++++++++++-----------
>  fs/bcachefs/opts.h     |  5 ++++
>  fs/bcachefs/super.c    | 28 ++++++++++++++++++++++
>  3 files changed, 73 insertions(+), 14 deletions(-)
> 
...
> diff --git a/fs/bcachefs/opts.h b/fs/bcachefs/opts.h
> index 8526f177450a..91026dfb8c83 100644
> --- a/fs/bcachefs/opts.h
> +++ b/fs/bcachefs/opts.h
> @@ -419,6 +419,11 @@ enum fsck_err_opts {
>  	  OPT_BOOL(),							\
>  	  BCH2_NO_SB_OPT,		false,				\
>  	  NULL,		"Allocate the buckets_nouse bitmap")		\
> +	x(log_output,			u64,				\
> +	  0,								\
> +	  OPT_UINT(0, S64_MAX),						\
> +	  BCH2_NO_SB_OPT,		false,				\
> +	  NULL,		"Allocate the buckets_nouse bitmap")		\

I assume you want to update the string to something more related. :)

Brian

>  	x(project,			u8,				\
>  	  OPT_INODE,							\
>  	  OPT_BOOL(),							\
> diff --git a/fs/bcachefs/super.c b/fs/bcachefs/super.c
> index e7f186b45df1..6c320defd593 100644
> --- a/fs/bcachefs/super.c
> +++ b/fs/bcachefs/super.c
> @@ -80,6 +80,32 @@ const char * const bch2_fs_flag_strs[] = {
>  	NULL
>  };
>  
> +void __bch2_print(struct bch_fs *c, const char *fmt, ...)
> +{
> +	struct log_output *output = c->output;
> +	va_list args;
> +
> +	if (c->output_filter && c->output_filter != current)
> +		output = NULL;
> +
> +	va_start(args, fmt);
> +	if (likely(!output)) {
> +		vprintk(fmt, args);
> +	} else {
> +		unsigned long flags;
> +
> +		if (fmt[0] == KERN_SOH[0])
> +			fmt += 2;
> +
> +		spin_lock_irqsave(&output->lock, flags);
> +		prt_vprintf(&output->buf, fmt, args);
> +		spin_unlock_irqrestore(&output->lock, flags);
> +
> +		wake_up(&output->wait);
> +	}
> +	va_end(args);
> +}
> +
>  #define KTYPE(type)							\
>  static const struct attribute_group type ## _group = {			\
>  	.attrs = type ## _files						\
> @@ -703,6 +729,8 @@ static struct bch_fs *bch2_fs_alloc(struct bch_sb *sb, struct bch_opts opts)
>  		goto out;
>  	}
>  
> +	c->output = (void *)(unsigned long) opts.log_output;
> +
>  	__module_get(THIS_MODULE);
>  
>  	closure_init(&c->cl, NULL);
> -- 
> 2.42.0
> 
> 


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 4/6] bcachefs: bch2_run_online_recovery_passes()
  2023-12-06 20:33 ` [PATCH 4/6] bcachefs: bch2_run_online_recovery_passes() Kent Overstreet
@ 2023-12-08 20:25   ` Brian Foster
  2023-12-08 20:34     ` Kent Overstreet
  0 siblings, 1 reply; 13+ messages in thread
From: Brian Foster @ 2023-12-08 20:25 UTC (permalink / raw)
  To: Kent Overstreet; +Cc: linux-bcachefs, djwong

On Wed, Dec 06, 2023 at 03:33:08PM -0500, Kent Overstreet wrote:
> Add a new helper for running online recovery passes - i.e. online fsck.
> This is a subset of our normal recovery passes, and does not - for now -
> use or follow c->curr_recovery_pass.
> 
> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
> ---
>  fs/bcachefs/bcachefs.h |  7 ++++++
>  fs/bcachefs/recovery.c | 51 ++++++++++++++++++++++++++++--------------
>  fs/bcachefs/recovery.h |  1 +
>  3 files changed, 42 insertions(+), 17 deletions(-)
> 
...
> diff --git a/fs/bcachefs/recovery.c b/fs/bcachefs/recovery.c
> index 585b4928964f..262c923b2f1a 100644
> --- a/fs/bcachefs/recovery.c
> +++ b/fs/bcachefs/recovery.c
...
> @@ -651,39 +651,56 @@ static bool should_run_recovery_pass(struct bch_fs *c, enum bch_recovery_pass pa
>  
>  static int bch2_run_recovery_pass(struct bch_fs *c, enum bch_recovery_pass pass)
>  {
> +	struct recovery_pass_fn *p = recovery_pass_fns + pass;
>  	int ret;
>  
> -	c->curr_recovery_pass = pass;
> +	if (!(p->when & PASS_SILENT))
> +		printk(KERN_INFO bch2_log_msg(c, "%s..."),
> +		       bch2_recovery_passes[pass]);
> +	ret = p->fn(c);
> +	if (ret)
> +		return ret;
> +	if (!(p->when & PASS_SILENT))
> +		printk(KERN_CONT " done\n");
>  
> -	if (should_run_recovery_pass(c, pass)) {
> -		struct recovery_pass_fn *p = recovery_pass_fns + pass;
> +	return 0;
> +}
>  
> -		if (!(p->when & PASS_SILENT))
> -			printk(KERN_INFO bch2_log_msg(c, "%s..."),
> -			       bch2_recovery_passes[pass]);
> -		ret = p->fn(c);
> -		if (ret)
> -			return ret;
> -		if (!(p->when & PASS_SILENT))
> -			printk(KERN_CONT " done\n");
> +static int bch2_run_recovery_passes(struct bch_fs *c)
> +{
> +	int ret = 0;
>  
> -		c->recovery_passes_complete |= BIT_ULL(pass);
> +	while (c->curr_recovery_pass < ARRAY_SIZE(recovery_pass_fns)) {
> +		if (should_run_recovery_pass(c, c->curr_recovery_pass)) {
> +			ret = bch2_run_recovery_pass(c, c->curr_recovery_pass);
> +			if (bch2_err_matches(ret, BCH_ERR_restart_recovery))
> +				continue;
> +			if (ret)
> +				break;
> +
> +			c->recovery_passes_complete |= BIT_ULL(c->curr_recovery_pass);
> +		}
> +		c->curr_recovery_pass++;
>  	}
>  
> -	return 0;
> +	return ret;
>  }
>  
> -static int bch2_run_recovery_passes(struct bch_fs *c)
> +int bch2_run_online_recovery_passes(struct bch_fs *c)
>  {
>  	int ret = 0;
>  
> -	while (c->curr_recovery_pass < ARRAY_SIZE(recovery_pass_fns)) {
> +	for (unsigned i = 0; i < ARRAY_SIZE(recovery_pass_fns); i++) {
> +		struct recovery_pass_fn *p = recovery_pass_fns + c->curr_recovery_pass;

Maybe I'm confused, but should this be using i?

Brian

> +
> +		if (!(p->when & PASS_ONLINE))
> +			continue;
> +
>  		ret = bch2_run_recovery_pass(c, c->curr_recovery_pass);
>  		if (bch2_err_matches(ret, BCH_ERR_restart_recovery))
>  			continue;
>  		if (ret)
>  			break;
> -		c->curr_recovery_pass++;
>  	}
>  
>  	return ret;
> diff --git a/fs/bcachefs/recovery.h b/fs/bcachefs/recovery.h
> index 852d30567da9..447590f60609 100644
> --- a/fs/bcachefs/recovery.h
> +++ b/fs/bcachefs/recovery.h
> @@ -25,6 +25,7 @@ static inline int bch2_run_explicit_recovery_pass(struct bch_fs *c,
>  	}
>  }
>  
> +int bch2_run_online_recovery_passes(struct bch_fs *);
>  u64 bch2_fsck_recovery_passes(void);
>  
>  int bch2_fs_recovery(struct bch_fs *);
> -- 
> 2.42.0
> 
> 


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 5/6] bcachefs: BCH_IOCTL_FSCK_OFFLINE
  2023-12-06 20:33 ` [PATCH 5/6] bcachefs: BCH_IOCTL_FSCK_OFFLINE Kent Overstreet
@ 2023-12-08 20:26   ` Brian Foster
  2023-12-08 20:33     ` Kent Overstreet
  0 siblings, 1 reply; 13+ messages in thread
From: Brian Foster @ 2023-12-08 20:26 UTC (permalink / raw)
  To: Kent Overstreet; +Cc: linux-bcachefs, djwong

On Wed, Dec 06, 2023 at 03:33:09PM -0500, Kent Overstreet wrote:
> This adds a new ioctl for running fsck on a list of devices.
> 
> Normally, if we wish to use the kernel's implementation of fsck we'd run
> it at mount time with -o fsck. This ioctl lets us run fsck without
> mounting, so that userspace bcachefs-tools can transparently switch to
> the kernel's implementation of fsck when appropriate - primarily if the
> kernel version of bcachefs better matches the filesystem on disk.
> 
> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
> ---
>  fs/bcachefs/bcachefs_ioctl.h |  13 +++
>  fs/bcachefs/chardev.c        | 206 ++++++++++++++++++++++++++++++++++-
>  fs/bcachefs/recovery.c       |   6 +-
>  3 files changed, 219 insertions(+), 6 deletions(-)
> 
...
> diff --git a/fs/bcachefs/chardev.c b/fs/bcachefs/chardev.c
> index 8e3ac2d32298..03082a001036 100644
> --- a/fs/bcachefs/chardev.c
> +++ b/fs/bcachefs/chardev.c
...
> @@ -193,8 +196,196 @@ static long bch2_ioctl_incremental(struct bch_ioctl_incremental __user *user_arg
>  }
>  #endif
>  
...
> +
> +static const struct file_operations fsck_thread_ops = {
> +	.release	= bch2_fsck_thread_release,
> +	.read		= bch2_fsck_thread_read,
> +	.llseek		= no_llseek,
> +};
> +
> +static int bch2_fsck_offline_thread_fn(void *arg)
> +{
> +	struct fsck_thread *thr = container_of(arg, struct fsck_thread, thr);
> +	struct bch_fs *c = bch2_fs_open(thr->devs, thr->nr_devs, thr->opts);
> +
> +	thr->thr.ret = PTR_ERR_OR_ZERO(c);
> +	if (!thr->thr.ret)
> +		bch2_fs_stop(c);
> +

Looks pretty neat. Mainly just curious if I'm following this right...
Basically it looks like we wire up the log output redirect to c->output
via options, then start/stop the fs which runs "mount time" fsck (?),
but feeding the message output to the redirect (and then fed to
userspace via the fd/read handler just above). Hm?

Brian

> +	thr->thr.done = true;
> +	wake_up(&thr->output.wait);
> +	return 0;
> +}
> +
> +static long bch2_ioctl_fsck_offline(struct bch_ioctl_fsck_offline __user *user_arg)
> +{
> +	struct bch_ioctl_fsck_offline arg;
> +	struct fsck_thread *thr = NULL;
> +	u64 *devs = NULL;
> +	long ret = 0;
> +
> +	if (copy_from_user(&arg, user_arg, sizeof(arg)))
> +		return -EFAULT;
> +
> +	if (arg.flags)
> +		return -EINVAL;
> +
> +	if (!capable(CAP_SYS_ADMIN))
> +		return -EPERM;
> +
> +	if (!(devs = kcalloc(arg.nr_devs, sizeof(*devs), GFP_KERNEL)) ||
> +	    !(thr = kzalloc(sizeof(*thr), GFP_KERNEL)) ||
> +	    !(thr->devs = kcalloc(arg.nr_devs, sizeof(*thr->devs), GFP_KERNEL))) {
> +		ret = -ENOMEM;
> +		goto err;
> +	}
> +
> +	thr->nr_devs = arg.nr_devs;
> +	thr->output.buf	= PRINTBUF;
> +	thr->output.buf.atomic++;
> +	spin_lock_init(&thr->output.lock);
> +	init_waitqueue_head(&thr->output.wait);
> +	darray_init(&thr->output2);
> +
> +	if (copy_from_user(devs, &user_arg->devs[0], sizeof(user_arg->devs[0]) * arg.nr_devs)) {
> +		ret = -EINVAL;
> +		goto err;
> +	}
> +
> +	for (size_t i = 0; i < arg.nr_devs; i++) {
> +		thr->devs[i] = strndup_user((char __user *)(unsigned long) devs[i], PATH_MAX);
> +		ret = PTR_ERR_OR_ZERO(thr->devs[i]);
> +		if (ret)
> +			goto err;
> +	}
> +
> +	if (arg.opts) {
> +		char *optstr = strndup_user((char __user *)(unsigned long) arg.opts, 1 << 16);
> +
> +		ret =   PTR_ERR_OR_ZERO(optstr) ?:
> +			bch2_parse_mount_opts(NULL, &thr->opts, optstr);
> +		kfree(optstr);
> +
> +		if (ret)
> +			goto err;
> +	}
> +
> +	opt_set(thr->opts, log_output, (u64)(unsigned long)&thr->output);
> +
> +	ret = run_thread_with_file(&thr->thr,
> +				   &fsck_thread_ops,
> +				   bch2_fsck_offline_thread_fn,
> +				   "bch-fsck");
> +err:
> +	if (ret < 0) {
> +		if (thr)
> +			bch2_fsck_thread_free(thr);
> +		pr_err("ret %s", bch2_err_str(ret));
> +	}
> +	kfree(devs);
> +	return ret;
> +}
> +
>  static long bch2_global_ioctl(unsigned cmd, void __user *arg)
>  {
> +	long ret;
> +
>  	switch (cmd) {
>  #if 0
>  	case BCH_IOCTL_ASSEMBLE:
> @@ -202,9 +393,18 @@ static long bch2_global_ioctl(unsigned cmd, void __user *arg)
>  	case BCH_IOCTL_INCREMENTAL:
>  		return bch2_ioctl_incremental(arg);
>  #endif
> +	case BCH_IOCTL_FSCK_OFFLINE: {
> +		ret = bch2_ioctl_fsck_offline(arg);
> +		break;
> +	}
>  	default:
> -		return -ENOTTY;
> +		ret = -ENOTTY;
> +		break;
>  	}
> +
> +	if (ret < 0)
> +		ret = bch2_err_class(ret);
> +	return ret;
>  }
>  
>  static long bch2_ioctl_query_uuid(struct bch_fs *c,
> diff --git a/fs/bcachefs/recovery.c b/fs/bcachefs/recovery.c
> index 262c923b2f1a..2f5daecfbcf7 100644
> --- a/fs/bcachefs/recovery.c
> +++ b/fs/bcachefs/recovery.c
> @@ -655,13 +655,13 @@ static int bch2_run_recovery_pass(struct bch_fs *c, enum bch_recovery_pass pass)
>  	int ret;
>  
>  	if (!(p->when & PASS_SILENT))
> -		printk(KERN_INFO bch2_log_msg(c, "%s..."),
> -		       bch2_recovery_passes[pass]);
> +		bch2_print(c, KERN_INFO bch2_log_msg(c, "%s..."),
> +			   bch2_recovery_passes[pass]);
>  	ret = p->fn(c);
>  	if (ret)
>  		return ret;
>  	if (!(p->when & PASS_SILENT))
> -		printk(KERN_CONT " done\n");
> +		bch2_print(c, KERN_CONT " done\n");
>  
>  	return 0;
>  }
> -- 
> 2.42.0
> 
> 


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 5/6] bcachefs: BCH_IOCTL_FSCK_OFFLINE
  2023-12-08 20:26   ` Brian Foster
@ 2023-12-08 20:33     ` Kent Overstreet
  0 siblings, 0 replies; 13+ messages in thread
From: Kent Overstreet @ 2023-12-08 20:33 UTC (permalink / raw)
  To: Brian Foster; +Cc: linux-bcachefs, djwong

On Fri, Dec 08, 2023 at 03:26:53PM -0500, Brian Foster wrote:
> On Wed, Dec 06, 2023 at 03:33:09PM -0500, Kent Overstreet wrote:
> > This adds a new ioctl for running fsck on a list of devices.
> > 
> > Normally, if we wish to use the kernel's implementation of fsck we'd run
> > it at mount time with -o fsck. This ioctl lets us run fsck without
> > mounting, so that userspace bcachefs-tools can transparently switch to
> > the kernel's implementation of fsck when appropriate - primarily if the
> > kernel version of bcachefs better matches the filesystem on disk.
> > 
> > Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
> > ---
> >  fs/bcachefs/bcachefs_ioctl.h |  13 +++
> >  fs/bcachefs/chardev.c        | 206 ++++++++++++++++++++++++++++++++++-
> >  fs/bcachefs/recovery.c       |   6 +-
> >  3 files changed, 219 insertions(+), 6 deletions(-)
> > 
> ...
> > diff --git a/fs/bcachefs/chardev.c b/fs/bcachefs/chardev.c
> > index 8e3ac2d32298..03082a001036 100644
> > --- a/fs/bcachefs/chardev.c
> > +++ b/fs/bcachefs/chardev.c
> ...
> > @@ -193,8 +196,196 @@ static long bch2_ioctl_incremental(struct bch_ioctl_incremental __user *user_arg
> >  }
> >  #endif
> >  
> ...
> > +
> > +static const struct file_operations fsck_thread_ops = {
> > +	.release	= bch2_fsck_thread_release,
> > +	.read		= bch2_fsck_thread_read,
> > +	.llseek		= no_llseek,
> > +};
> > +
> > +static int bch2_fsck_offline_thread_fn(void *arg)
> > +{
> > +	struct fsck_thread *thr = container_of(arg, struct fsck_thread, thr);
> > +	struct bch_fs *c = bch2_fs_open(thr->devs, thr->nr_devs, thr->opts);
> > +
> > +	thr->thr.ret = PTR_ERR_OR_ZERO(c);
> > +	if (!thr->thr.ret)
> > +		bch2_fs_stop(c);
> > +
> 
> Looks pretty neat. Mainly just curious if I'm following this right...
> Basically it looks like we wire up the log output redirect to c->output
> via options, then start/stop the fs which runs "mount time" fsck (?),
> but feeding the message output to the redirect (and then fed to
> userspace via the fd/read handler just above). Hm?

Yup, just that. I still need to wire up input so that the ask_yn()
responses can get passed in, but I'm going to leave that later since I'm
focusing more on online fsck now.

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 4/6] bcachefs: bch2_run_online_recovery_passes()
  2023-12-08 20:25   ` Brian Foster
@ 2023-12-08 20:34     ` Kent Overstreet
  0 siblings, 0 replies; 13+ messages in thread
From: Kent Overstreet @ 2023-12-08 20:34 UTC (permalink / raw)
  To: Brian Foster; +Cc: linux-bcachefs, djwong

On Fri, Dec 08, 2023 at 03:25:02PM -0500, Brian Foster wrote:
> On Wed, Dec 06, 2023 at 03:33:08PM -0500, Kent Overstreet wrote:
> > Add a new helper for running online recovery passes - i.e. online fsck.
> > This is a subset of our normal recovery passes, and does not - for now -
> > use or follow c->curr_recovery_pass.
> > 
> > Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
> > ---
> >  fs/bcachefs/bcachefs.h |  7 ++++++
> >  fs/bcachefs/recovery.c | 51 ++++++++++++++++++++++++++++--------------
> >  fs/bcachefs/recovery.h |  1 +
> >  3 files changed, 42 insertions(+), 17 deletions(-)
> > 
> ...
> > diff --git a/fs/bcachefs/recovery.c b/fs/bcachefs/recovery.c
> > index 585b4928964f..262c923b2f1a 100644
> > --- a/fs/bcachefs/recovery.c
> > +++ b/fs/bcachefs/recovery.c
> ...
> > @@ -651,39 +651,56 @@ static bool should_run_recovery_pass(struct bch_fs *c, enum bch_recovery_pass pa
> >  
> >  static int bch2_run_recovery_pass(struct bch_fs *c, enum bch_recovery_pass pass)
> >  {
> > +	struct recovery_pass_fn *p = recovery_pass_fns + pass;
> >  	int ret;
> >  
> > -	c->curr_recovery_pass = pass;
> > +	if (!(p->when & PASS_SILENT))
> > +		printk(KERN_INFO bch2_log_msg(c, "%s..."),
> > +		       bch2_recovery_passes[pass]);
> > +	ret = p->fn(c);
> > +	if (ret)
> > +		return ret;
> > +	if (!(p->when & PASS_SILENT))
> > +		printk(KERN_CONT " done\n");
> >  
> > -	if (should_run_recovery_pass(c, pass)) {
> > -		struct recovery_pass_fn *p = recovery_pass_fns + pass;
> > +	return 0;
> > +}
> >  
> > -		if (!(p->when & PASS_SILENT))
> > -			printk(KERN_INFO bch2_log_msg(c, "%s..."),
> > -			       bch2_recovery_passes[pass]);
> > -		ret = p->fn(c);
> > -		if (ret)
> > -			return ret;
> > -		if (!(p->when & PASS_SILENT))
> > -			printk(KERN_CONT " done\n");
> > +static int bch2_run_recovery_passes(struct bch_fs *c)
> > +{
> > +	int ret = 0;
> >  
> > -		c->recovery_passes_complete |= BIT_ULL(pass);
> > +	while (c->curr_recovery_pass < ARRAY_SIZE(recovery_pass_fns)) {
> > +		if (should_run_recovery_pass(c, c->curr_recovery_pass)) {
> > +			ret = bch2_run_recovery_pass(c, c->curr_recovery_pass);
> > +			if (bch2_err_matches(ret, BCH_ERR_restart_recovery))
> > +				continue;
> > +			if (ret)
> > +				break;
> > +
> > +			c->recovery_passes_complete |= BIT_ULL(c->curr_recovery_pass);
> > +		}
> > +		c->curr_recovery_pass++;
> >  	}
> >  
> > -	return 0;
> > +	return ret;
> >  }
> >  
> > -static int bch2_run_recovery_passes(struct bch_fs *c)
> > +int bch2_run_online_recovery_passes(struct bch_fs *c)
> >  {
> >  	int ret = 0;
> >  
> > -	while (c->curr_recovery_pass < ARRAY_SIZE(recovery_pass_fns)) {
> > +	for (unsigned i = 0; i < ARRAY_SIZE(recovery_pass_fns); i++) {
> > +		struct recovery_pass_fn *p = recovery_pass_fns + c->curr_recovery_pass;
> 
> Maybe I'm confused, but should this be using i?

Yeah, this was my early rough draft, it's fixed now :)

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 2/6] bcachefs: Add ability to redirect log output
  2023-12-08 20:24   ` Brian Foster
@ 2023-12-08 20:35     ` Kent Overstreet
  0 siblings, 0 replies; 13+ messages in thread
From: Kent Overstreet @ 2023-12-08 20:35 UTC (permalink / raw)
  To: Brian Foster; +Cc: linux-bcachefs, djwong

On Fri, Dec 08, 2023 at 03:24:13PM -0500, Brian Foster wrote:
> On Wed, Dec 06, 2023 at 03:33:06PM -0500, Kent Overstreet wrote:
> > Upcoming patches are going to add two new ioctls for running fsck in the
> > kernel, but pretending that we're running our normal userspace fsck.
> > 
> > This patch adds some plumbing for redirecting our normal log messages
> > away from the dmesg log to a thread_with_file file descriptor - via a
> > struct log_output, which will be consumed by the fsck f_op's read method.
> > 
> > The new ioctls will allow for running fsck in the kernel against an
> > offline filesystem (without mounting it), and an online filesystem. For
> > an offline filesystem we need a way to pass in a pointer to the
> > log_output, which is done via a new hidden opts.h option.
> > 
> > For online fsck, we can set c->output directly, but only want to
> > redirect log messages from the thread running fsck - hence the new
> > c->output_filter method.
> > 
> > Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
> > ---
> >  fs/bcachefs/bcachefs.h | 54 +++++++++++++++++++++++++++++++-----------
> >  fs/bcachefs/opts.h     |  5 ++++
> >  fs/bcachefs/super.c    | 28 ++++++++++++++++++++++
> >  3 files changed, 73 insertions(+), 14 deletions(-)
> > 
> ...
> > diff --git a/fs/bcachefs/opts.h b/fs/bcachefs/opts.h
> > index 8526f177450a..91026dfb8c83 100644
> > --- a/fs/bcachefs/opts.h
> > +++ b/fs/bcachefs/opts.h
> > @@ -419,6 +419,11 @@ enum fsck_err_opts {
> >  	  OPT_BOOL(),							\
> >  	  BCH2_NO_SB_OPT,		false,				\
> >  	  NULL,		"Allocate the buckets_nouse bitmap")		\
> > +	x(log_output,			u64,				\
> > +	  0,								\
> > +	  OPT_UINT(0, S64_MAX),						\
> > +	  BCH2_NO_SB_OPT,		false,				\
> > +	  NULL,		"Allocate the buckets_nouse bitmap")		\
> 
> I assume you want to update the string to something more related. :)

Whoops yes :) this is an internal, completely hidden option so nothing
should ever see that, fortunately :)

^ permalink raw reply	[flat|nested] 13+ messages in thread

end of thread, other threads:[~2023-12-08 20:35 UTC | newest]

Thread overview: 13+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-12-06 20:33 [PATCH 0/6] [RFC WIP] bcachefs: online fsck Kent Overstreet
2023-12-06 20:33 ` [PATCH 1/6] bcachefs: thread_with_file Kent Overstreet
2023-12-06 20:33 ` [PATCH 2/6] bcachefs: Add ability to redirect log output Kent Overstreet
2023-12-08 20:24   ` Brian Foster
2023-12-08 20:35     ` Kent Overstreet
2023-12-06 20:33 ` [PATCH 3/6] bcachefs: Mark recovery passses that are safe to run online Kent Overstreet
2023-12-06 20:33 ` [PATCH 4/6] bcachefs: bch2_run_online_recovery_passes() Kent Overstreet
2023-12-08 20:25   ` Brian Foster
2023-12-08 20:34     ` Kent Overstreet
2023-12-06 20:33 ` [PATCH 5/6] bcachefs: BCH_IOCTL_FSCK_OFFLINE Kent Overstreet
2023-12-08 20:26   ` Brian Foster
2023-12-08 20:33     ` Kent Overstreet
2023-12-06 20:33 ` [PATCH 6/6] bcachefs: BCH_IOCTL_FSCK_ONLINE Kent Overstreet

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).