* [RFC v3 00/24] vfs: provide automatic kernel freeze / resume
@ 2023-01-14  0:33 ` Luis Chamberlain
  0 siblings, 0 replies; 74+ messages in thread
From: Luis Chamberlain @ 2023-01-14  0:33 UTC (permalink / raw)
  To: hch, djwong, song, rafael, gregkh, viro, jack, bvanassche, ebiederm
  Cc: mchehab, keescook, p.raghav, linux-fsdevel, kernel, kexec,
	linux-kernel, Luis Chamberlain

Darrick J. Wong poked me about the status of the fs freeze work, and he's
right, it's been too long since the last spin. The last attempt, v2, was
posted in April 2021 [0]; this series takes the feedback from Christoph and
spins it again. I've only done basic build tests on x86_64 and haven't
run-time tested any of it yet, but given the size of this set it's better
to get review early before getting stuck on details. So this is what I've
ended up with so far.

Please help me paint the bike shed, and help me figure out anything I may
not have considered yet. The locking is really the important thing here.

I'd like to reiterate that many areas of the kernel use the kthread
freezer for things that probably have no reason to use it, so once we
remove it from the filesystems, it should be easy to start trimming it
from other parts of the kernel. The kthread freezer was originally put in
place to stop in-flight IO for filesystems. Other parts of the kernel
should have no business using it once all this work is done.

[0] https://lore.kernel.org/all/20210417001026.23858-1-mcgrof@kernel.org/

Changes since v2:
  * instead of having different semantics for locked / unlocked freeze
    and thaw calls, this unifies the semantics by requiring the caller
    to hold the lock prior to freeze / thaw
  * uses get_active_super() now in all places which need to freeze or
    thaw, including filesystems. This matches the locking requirements
    and avoids adding new heuristics for deciding whether the
    superblock is in a good state for freeze / thaw.
  * drops SB_FREEZE_COMPLETE_AUTO in favor of just checking a flag to
    determine whether userspace initiated the freeze or whether it is
    automatic (done by the kernel pm)
  * folded the pm calls for the VFS so that instead of one call which
    has a one-liner with two routines, we use the same one-liner on the
    pm side of things.
  * split the FS changes by using a new temporary flag, to enable
    easier review of the FS changes
  * more filesystems use the freezer API now so this also converts them
    over
  * adjusted the coccinelle rule to use the new flag; the last patch
    then removes that flag
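
As a rough illustration of the "user initiated vs. kernel initiated" item
above, here is a small user-space model. It is not kernel code: struct
toy_sb, the frozen_by_user field and every toy_* function are hypothetical
names invented for this sketch, and the skip rules are one plausible reading
of the changelog rather than what the series necessarily does.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a superblock's freeze state (hypothetical). */
enum toy_state { TOY_UNFROZEN, TOY_FROZEN };

struct toy_sb {
	enum toy_state state;
	bool frozen_by_user;	/* hypothetical flag: set for userspace freezes */
};

/* Freeze one superblock and record who asked for it. */
static int toy_freeze(struct toy_sb *sb, bool by_user)
{
	if (sb->state != TOY_UNFROZEN)
		return -1;	/* already frozen: keep the existing owner */
	sb->state = TOY_FROZEN;
	sb->frozen_by_user = by_user;
	return 0;
}

/* Suspend path: freeze everything that is not frozen yet. */
static size_t toy_pm_freeze_all(struct toy_sb *sbs, size_t n)
{
	size_t frozen = 0;

	for (size_t i = 0; i < n; i++)
		if (toy_freeze(&sbs[i], false) == 0)
			frozen++;
	return frozen;
}

/* Resume path: thaw only what the kernel froze, leave user freezes alone. */
static size_t toy_pm_thaw_all(struct toy_sb *sbs, size_t n)
{
	size_t thawed = 0;

	for (size_t i = 0; i < n; i++) {
		if (sbs[i].state != TOY_FROZEN || sbs[i].frozen_by_user)
			continue;
		sbs[i].state = TOY_UNFROZEN;
		thawed++;
	}
	return thawed;
}
```

The point of keeping a flag rather than a separate frozen state is that the
resume path can tell the two cases apart without adding another value to
the freeze state machine.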

This is all here too:

https://git.kernel.org/pub/scm/linux/kernel/git/mcgrof/linux-next.git/log/?h=20231010-fs-freeze-v5

Luis Chamberlain (24):
  fs: unify locking semantics for fs freeze / thaw
  fs: add frozen sb state helpers
  fs: distinguish between user initiated freeze and kernel initiated
    freeze
  fs: add iterate_supers_excl() and iterate_supers_reverse_excl()
  fs: add automatic kernel fs freeze / thaw and remove kthread freezing
  xfs: replace kthread freezing with auto fs freezing
  btrfs: replace kthread freezing with auto fs freezing
  ext4: replace kthread freezing with auto fs freezing
  f2fs: replace kthread freezing with auto fs freezing
  cifs: replace kthread freezing with auto fs freezing
  gfs2: replace kthread freezing with auto fs freezing
  jfs: replace kthread freezing with auto fs freezing
  nilfs2: replace kthread freezing with auto fs freezing
  nfs: replace kthread freezing with auto fs freezing
  nfsd: replace kthread freezing with auto fs freezing
  ubifs: replace kthread freezing with auto fs freezing
  ksmbd: replace kthread freezing with auto fs freezing
  jffs2: replace kthread freezing with auto fs freezing
  jbd2: replace kthread freezing with auto fs freezing
  coredump: drop freezer usage
  ecryptfs: replace kthread freezing with auto fs freezing
  fscache: replace kthread freezing with auto fs freezing
  lockd: replace kthread freezing with auto fs freezing
  fs: remove FS_AUTOFREEZE

 block/bdev.c             |   9 +-
 fs/btrfs/disk-io.c       |   4 +-
 fs/btrfs/scrub.c         |   2 +-
 fs/cifs/cifsfs.c         |  10 +-
 fs/cifs/connect.c        |   8 --
 fs/cifs/dfs_cache.c      |   2 +-
 fs/coredump.c            |   2 +-
 fs/ecryptfs/kthread.c    |   1 -
 fs/ext4/ext4_jbd2.c      |   2 +-
 fs/ext4/super.c          |   3 -
 fs/f2fs/gc.c             |  12 +-
 fs/f2fs/segment.c        |   6 +-
 fs/fscache/main.c        |   2 +-
 fs/gfs2/glock.c          |   6 +-
 fs/gfs2/glops.c          |   2 +-
 fs/gfs2/log.c            |   2 -
 fs/gfs2/main.c           |   4 +-
 fs/gfs2/quota.c          |   2 -
 fs/gfs2/super.c          |  11 +-
 fs/gfs2/sys.c            |  12 +-
 fs/gfs2/util.c           |   7 +-
 fs/ioctl.c               |  14 ++-
 fs/jbd2/journal.c        |  54 ++++-----
 fs/jffs2/background.c    |   3 +-
 fs/jfs/jfs_logmgr.c      |  11 +-
 fs/jfs/jfs_txnmgr.c      |  31 ++----
 fs/ksmbd/connection.c    |   3 -
 fs/ksmbd/transport_tcp.c |   2 -
 fs/lockd/clntproc.c      |   1 -
 fs/lockd/svc.c           |   3 -
 fs/nfs/callback.c        |   4 -
 fs/nfsd/nfssvc.c         |   2 -
 fs/nilfs2/segment.c      |  48 ++++----
 fs/quota/quota.c         |   4 +-
 fs/super.c               | 232 ++++++++++++++++++++++++++++++++-------
 fs/ubifs/commit.c        |   4 -
 fs/xfs/xfs_log.c         |   3 +-
 fs/xfs/xfs_log_cil.c     |   2 +-
 fs/xfs/xfs_mru_cache.c   |   2 +-
 fs/xfs/xfs_pwork.c       |   2 +-
 fs/xfs/xfs_super.c       |  14 +--
 fs/xfs/xfs_trans.c       |   3 +-
 fs/xfs/xfs_trans_ail.c   |   3 -
 include/linux/fs.h       |  53 ++++++++-
 kernel/power/process.c   |  15 ++-
 45 files changed, 393 insertions(+), 229 deletions(-)

-- 
2.35.1


* [RFC v3 01/24] fs: unify locking semantics for fs freeze / thaw
  2023-01-14  0:33 ` Luis Chamberlain
@ 2023-01-14  0:33   ` Luis Chamberlain
  -1 siblings, 0 replies; 74+ messages in thread
From: Luis Chamberlain @ 2023-01-14  0:33 UTC (permalink / raw)
  To: hch, djwong, song, rafael, gregkh, viro, jack, bvanassche, ebiederm
  Cc: mchehab, keescook, p.raghav, linux-fsdevel, kernel, kexec,
	linux-kernel, Luis Chamberlain

Right now freeze_super() and thaw_super() are called with
different locking contexts. Expanding on this is messy, so
unify the requirement: callers must grab an active reference
and keep the superblock locked across the call.
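
The calling convention this describes can be sketched with a small
user-space model. This is only an illustration under stated assumptions:
struct toy_sb and the toy_* functions are hypothetical stand-ins, the
"lock" is a plain boolean rather than sb->s_umount, and the real
get_active_super() looks the superblock up from a block device instead of
taking it as an argument.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

enum toy_frozen { TOY_UNFROZEN, TOY_FREEZE_COMPLETE };

/* Toy stand-in for struct super_block (hypothetical). */
struct toy_sb {
	int active;		/* models sb->s_active */
	bool umount_held;	/* models holding sb->s_umount for write */
	enum toy_frozen frozen;
};

/* Models get_active_super() after this patch: returns with the lock held. */
static struct toy_sb *toy_get_active_super(struct toy_sb *sb)
{
	if (sb == NULL || sb->active == 0)
		return NULL;
	sb->active++;
	sb->umount_held = true;
	return sb;
}

/* Models deactivate_locked_super(): drops the lock and the reference. */
static void toy_deactivate_locked_super(struct toy_sb *sb)
{
	sb->umount_held = false;
	sb->active--;
}

/* Models freeze_super() after this patch: no locking inside. */
static int toy_freeze_super(struct toy_sb *sb)
{
	if (!sb->umount_held)
		return -EINVAL;	/* model-only check of the caller contract */
	if (sb->frozen != TOY_UNFROZEN)
		return -EBUSY;
	sb->frozen = TOY_FREEZE_COMPLETE;
	return 0;
}

/* Models thaw_super() after this patch: same contract as freeze. */
static int toy_thaw_super(struct toy_sb *sb)
{
	if (!sb->umount_held || sb->frozen != TOY_FREEZE_COMPLETE)
		return -EINVAL;
	sb->frozen = TOY_UNFROZEN;
	return 0;
}

/* Caller pattern: grab reference and lock, act, then release both. */
static int toy_freeze_caller(struct toy_sb *sb)
{
	int ret;

	if (!toy_get_active_super(sb))
		return -ENOTTY;
	ret = toy_freeze_super(sb);
	toy_deactivate_locked_super(sb);
	return ret;
}

static int toy_thaw_caller(struct toy_sb *sb)
{
	int ret;

	if (!toy_get_active_super(sb))
		return -ENOTTY;
	ret = toy_thaw_super(sb);
	toy_deactivate_locked_super(sb);
	return ret;
}
```

With both operations following the same contract, every caller has the
same shape, and the reference count and lock state are back where they
started on every return path.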

Suggested-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
---
 block/bdev.c    |  5 ++++-
 fs/f2fs/gc.c    |  5 +++++
 fs/gfs2/super.c |  9 +++++++--
 fs/gfs2/sys.c   |  6 ++++++
 fs/gfs2/util.c  |  5 +++++
 fs/ioctl.c      | 12 ++++++++++--
 fs/super.c      | 51 ++++++++++++++-----------------------------------
 7 files changed, 51 insertions(+), 42 deletions(-)

diff --git a/block/bdev.c b/block/bdev.c
index edc110d90df4..8fd3a7991c02 100644
--- a/block/bdev.c
+++ b/block/bdev.c
@@ -251,7 +251,7 @@ int freeze_bdev(struct block_device *bdev)
 		error = sb->s_op->freeze_super(sb);
 	else
 		error = freeze_super(sb);
-	deactivate_super(sb);
+	deactivate_locked_super(sb);
 
 	if (error) {
 		bdev->bd_fsfreeze_count--;
@@ -289,6 +289,8 @@ int thaw_bdev(struct block_device *bdev)
 	sb = bdev->bd_fsfreeze_sb;
 	if (!sb)
 		goto out;
+	if (!get_active_super(bdev))
+		goto out;
 
 	if (sb->s_op->thaw_super)
 		error = sb->s_op->thaw_super(sb);
@@ -298,6 +300,7 @@ int thaw_bdev(struct block_device *bdev)
 		bdev->bd_fsfreeze_count++;
 	else
 		bdev->bd_fsfreeze_sb = NULL;
+	deactivate_locked_super(sb);
 out:
 	mutex_unlock(&bdev->bd_fsfreeze_mutex);
 	return error;
diff --git a/fs/f2fs/gc.c b/fs/f2fs/gc.c
index 7444c392eab1..4c681fe487ee 100644
--- a/fs/f2fs/gc.c
+++ b/fs/f2fs/gc.c
@@ -2139,7 +2139,10 @@ int f2fs_resize_fs(struct f2fs_sb_info *sbi, __u64 block_count)
 	if (err)
 		return err;
 
+	if (!get_active_super(sbi->sb->s_bdev))
+		return -ENOTTY;
 	freeze_super(sbi->sb);
+
 	f2fs_down_write(&sbi->gc_lock);
 	f2fs_down_write(&sbi->cp_global_sem);
 
@@ -2190,6 +2193,8 @@ int f2fs_resize_fs(struct f2fs_sb_info *sbi, __u64 block_count)
 out_err:
 	f2fs_up_write(&sbi->cp_global_sem);
 	f2fs_up_write(&sbi->gc_lock);
+	/* We use the same active reference from freeze */
 	thaw_super(sbi->sb);
+	deactivate_locked_super(sbi->sb);
 	return err;
 }
diff --git a/fs/gfs2/super.c b/fs/gfs2/super.c
index 999cc146d708..48df7b276b64 100644
--- a/fs/gfs2/super.c
+++ b/fs/gfs2/super.c
@@ -661,7 +661,12 @@ void gfs2_freeze_func(struct work_struct *work)
 	struct gfs2_sbd *sdp = container_of(work, struct gfs2_sbd, sd_freeze_work);
 	struct super_block *sb = sdp->sd_vfs;
 
-	atomic_inc(&sb->s_active);
+	if (!get_active_super(sb->s_bdev)) {
+		fs_info(sdp, "GFS2: couldn't grab super for thaw for filesystem\n");
+		gfs2_assert_withdraw(sdp, 0);
+		return;
+	}
+
 	error = gfs2_freeze_lock(sdp, &freeze_gh, 0);
 	if (error) {
 		gfs2_assert_withdraw(sdp, 0);
@@ -675,7 +680,7 @@ void gfs2_freeze_func(struct work_struct *work)
 		}
 		gfs2_freeze_unlock(&freeze_gh);
 	}
-	deactivate_super(sb);
+	deactivate_locked_super(sb);
 	clear_bit_unlock(SDF_FS_FROZEN, &sdp->sd_flags);
 	wake_up_bit(&sdp->sd_flags, SDF_FS_FROZEN);
 	return;
diff --git a/fs/gfs2/sys.c b/fs/gfs2/sys.c
index d87ea98cf535..d0b80552a678 100644
--- a/fs/gfs2/sys.c
+++ b/fs/gfs2/sys.c
@@ -162,6 +162,9 @@ static ssize_t freeze_store(struct gfs2_sbd *sdp, const char *buf, size_t len)
 	if (!capable(CAP_SYS_ADMIN))
 		return -EPERM;
 
+	if (!get_active_super(sb->s_bdev))
+		return -ENOTTY;
+
 	switch (n) {
 	case 0:
 		error = thaw_super(sdp->sd_vfs);
@@ -170,9 +173,12 @@ static ssize_t freeze_store(struct gfs2_sbd *sdp, const char *buf, size_t len)
 		error = freeze_super(sdp->sd_vfs);
 		break;
 	default:
+		deactivate_locked_super(sb);
 		return -EINVAL;
 	}
 
+	deactivate_locked_super(sb);
+
 	if (error) {
 		fs_warn(sdp, "freeze %d error %d\n", n, error);
 		return error;
diff --git a/fs/gfs2/util.c b/fs/gfs2/util.c
index 7a6aeffcdf5c..3a0cd5e9ad84 100644
--- a/fs/gfs2/util.c
+++ b/fs/gfs2/util.c
@@ -345,10 +345,15 @@ int gfs2_withdraw(struct gfs2_sbd *sdp)
 	set_bit(SDF_WITHDRAW_IN_PROG, &sdp->sd_flags);
 
 	if (sdp->sd_args.ar_errors == GFS2_ERRORS_WITHDRAW) {
+		if (!get_active_super(sb->s_bdev)) {
+			fs_err(sdp, "could not grab super on withdraw for file system\n");
+			return -1;
+		}
 		fs_err(sdp, "about to withdraw this file system\n");
 		BUG_ON(sdp->sd_args.ar_debug);
 
 		signal_our_withdraw(sdp);
+		deactivate_locked_super(sb);
 
 		kobject_uevent(&sdp->sd_kobj, KOBJ_OFFLINE);
 
diff --git a/fs/ioctl.c b/fs/ioctl.c
index 80ac36aea913..3d2536e1ea58 100644
--- a/fs/ioctl.c
+++ b/fs/ioctl.c
@@ -386,6 +386,7 @@ static int ioctl_fioasync(unsigned int fd, struct file *filp,
 static int ioctl_fsfreeze(struct file *filp)
 {
 	struct super_block *sb = file_inode(filp)->i_sb;
+	int ret;
 
 	if (!ns_capable(sb->s_user_ns, CAP_SYS_ADMIN))
 		return -EPERM;
@@ -394,10 +395,17 @@ static int ioctl_fsfreeze(struct file *filp)
 	if (sb->s_op->freeze_fs == NULL && sb->s_op->freeze_super == NULL)
 		return -EOPNOTSUPP;
 
+	if (!get_active_super(sb->s_bdev))
+		return -ENOTTY;
+
 	/* Freeze */
 	if (sb->s_op->freeze_super)
-		return sb->s_op->freeze_super(sb);
-	return freeze_super(sb);
+		ret = sb->s_op->freeze_super(sb);
+	ret = freeze_super(sb);
+
+	deactivate_locked_super(sb);
+
+	return ret;
 }
 
 static int ioctl_fsthaw(struct file *filp)
diff --git a/fs/super.c b/fs/super.c
index 12c08cb20405..a31a41b313f3 100644
--- a/fs/super.c
+++ b/fs/super.c
@@ -39,8 +39,6 @@
 #include <uapi/linux/mount.h>
 #include "internal.h"
 
-static int thaw_super_locked(struct super_block *sb);
-
 static LIST_HEAD(super_blocks);
 static DEFINE_SPINLOCK(sb_lock);
 
@@ -830,7 +828,6 @@ struct super_block *get_active_super(struct block_device *bdev)
 		if (sb->s_bdev == bdev) {
 			if (!grab_super(sb))
 				goto restart;
-			up_write(&sb->s_umount);
 			return sb;
 		}
 	}
@@ -1003,13 +1000,13 @@ void emergency_remount(void)
 
 static void do_thaw_all_callback(struct super_block *sb)
 {
-	down_write(&sb->s_umount);
+	if (!get_active_super(sb->s_bdev))
+		return;
 	if (sb->s_root && sb->s_flags & SB_BORN) {
 		emergency_thaw_bdev(sb);
-		thaw_super_locked(sb);
-	} else {
-		up_write(&sb->s_umount);
+		thaw_super(sb);
 	}
+	deactivate_locked_super(sb);
 }
 
 static void do_thaw_all(struct work_struct *work)
@@ -1651,22 +1648,15 @@ int freeze_super(struct super_block *sb)
 {
 	int ret;
 
-	atomic_inc(&sb->s_active);
-	down_write(&sb->s_umount);
-	if (sb->s_writers.frozen != SB_UNFROZEN) {
-		deactivate_locked_super(sb);
+	if (sb->s_writers.frozen != SB_UNFROZEN)
 		return -EBUSY;
-	}
 
-	if (!(sb->s_flags & SB_BORN)) {
-		up_write(&sb->s_umount);
+	if (!(sb->s_flags & SB_BORN))
 		return 0;	/* sic - it's "nothing to do" */
-	}
 
 	if (sb_rdonly(sb)) {
 		/* Nothing to do really... */
 		sb->s_writers.frozen = SB_FREEZE_COMPLETE;
-		up_write(&sb->s_umount);
 		return 0;
 	}
 
@@ -1686,7 +1676,6 @@ int freeze_super(struct super_block *sb)
 		sb->s_writers.frozen = SB_UNFROZEN;
 		sb_freeze_unlock(sb, SB_FREEZE_PAGEFAULT);
 		wake_up(&sb->s_writers.wait_unfrozen);
-		deactivate_locked_super(sb);
 		return ret;
 	}
 
@@ -1702,7 +1691,6 @@ int freeze_super(struct super_block *sb)
 			sb->s_writers.frozen = SB_UNFROZEN;
 			sb_freeze_unlock(sb, SB_FREEZE_FS);
 			wake_up(&sb->s_writers.wait_unfrozen);
-			deactivate_locked_super(sb);
 			return ret;
 		}
 	}
@@ -1712,19 +1700,22 @@ int freeze_super(struct super_block *sb)
 	 */
 	sb->s_writers.frozen = SB_FREEZE_COMPLETE;
 	lockdep_sb_freeze_release(sb);
-	up_write(&sb->s_umount);
 	return 0;
 }
 EXPORT_SYMBOL(freeze_super);
 
-static int thaw_super_locked(struct super_block *sb)
+/**
+ * thaw_super -- unlock filesystem
+ * @sb: the super to thaw
+ *
+ * Unlocks the filesystem and marks it writeable again after freeze_super().
+ */
+int thaw_super(struct super_block *sb)
 {
 	int error;
 
-	if (sb->s_writers.frozen != SB_FREEZE_COMPLETE) {
-		up_write(&sb->s_umount);
+	if (sb->s_writers.frozen != SB_FREEZE_COMPLETE)
 		return -EINVAL;
-	}
 
 	if (sb_rdonly(sb)) {
 		sb->s_writers.frozen = SB_UNFROZEN;
@@ -1739,7 +1730,6 @@ static int thaw_super_locked(struct super_block *sb)
 			printk(KERN_ERR
 				"VFS:Filesystem thaw failed\n");
 			lockdep_sb_freeze_release(sb);
-			up_write(&sb->s_umount);
 			return error;
 		}
 	}
@@ -1748,19 +1738,6 @@ static int thaw_super_locked(struct super_block *sb)
 	sb_freeze_unlock(sb, SB_FREEZE_FS);
 out:
 	wake_up(&sb->s_writers.wait_unfrozen);
-	deactivate_locked_super(sb);
 	return 0;
 }
-
-/**
- * thaw_super -- unlock filesystem
- * @sb: the super to thaw
- *
- * Unlocks the filesystem and marks it writeable again after freeze_super().
- */
-int thaw_super(struct super_block *sb)
-{
-	down_write(&sb->s_umount);
-	return thaw_super_locked(sb);
-}
 EXPORT_SYMBOL(thaw_super);
-- 
2.35.1


* [RFC v3 02/24] fs: add frozen sb state helpers
  2023-01-14  0:33 ` Luis Chamberlain
@ 2023-01-14  0:33   ` Luis Chamberlain
  -1 siblings, 0 replies; 74+ messages in thread
From: Luis Chamberlain @ 2023-01-14  0:33 UTC (permalink / raw)
  To: hch, djwong, song, rafael, gregkh, viro, jack, bvanassche, ebiederm
  Cc: mchehab, keescook, p.raghav, linux-fsdevel, kernel, kexec,
	linux-kernel, Luis Chamberlain

Provide helpers to check a superblock's frozen state. This
makes subsequent changes easier to read and makes no
functional changes.
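
Judging from the open-coded checks that the conversions replace, the two
helpers plausibly reduce to the comparisons below. This is a user-space
sketch, not the actual include/linux/fs.h hunk: the *_model struct names
are hypothetical, and the helper bodies are an assumption inferred from
the call sites.

```c
#include <stdbool.h>

/* Freeze levels, with the values used in include/linux/fs.h. */
enum {
	SB_UNFROZEN = 0,
	SB_FREEZE_WRITE = 1,
	SB_FREEZE_PAGEFAULT = 2,
	SB_FREEZE_FS = 3,
	SB_FREEZE_COMPLETE = 4,
};

/* Minimal stand-ins for the fields the helpers read (hypothetical). */
struct sb_writers_model {
	int frozen;
};

struct super_block_model {
	struct sb_writers_model s_writers;
};

/* Assumed body: true only once the freeze has fully completed. */
static inline bool sb_is_frozen(const struct super_block_model *sb)
{
	return sb->s_writers.frozen == SB_FREEZE_COMPLETE;
}

/* Assumed body: true only when no freeze is in progress at all. */
static inline bool sb_is_unfrozen(const struct super_block_model *sb)
{
	return sb->s_writers.frozen == SB_UNFROZEN;
}
```

Note that the two helpers are not each other's negation: while a freeze is
in progress (for example at SB_FREEZE_WRITE) both return false, so a
"!= SB_UNFROZEN" check converts to !sb_is_unfrozen(), not to
sb_is_frozen().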

Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
---
 fs/ext4/ext4_jbd2.c |  2 +-
 fs/gfs2/sys.c       |  2 +-
 fs/quota/quota.c    |  4 ++--
 fs/super.c          |  4 ++--
 fs/xfs/xfs_trans.c  |  3 +--
 include/linux/fs.h  | 22 ++++++++++++++++++++++
 6 files changed, 29 insertions(+), 8 deletions(-)

diff --git a/fs/ext4/ext4_jbd2.c b/fs/ext4/ext4_jbd2.c
index 77f318ec8abb..ef441f15053b 100644
--- a/fs/ext4/ext4_jbd2.c
+++ b/fs/ext4/ext4_jbd2.c
@@ -72,7 +72,7 @@ static int ext4_journal_check_start(struct super_block *sb)
 
 	if (sb_rdonly(sb))
 		return -EROFS;
-	WARN_ON(sb->s_writers.frozen == SB_FREEZE_COMPLETE);
+	WARN_ON(sb_is_frozen(sb));
 	journal = EXT4_SB(sb)->s_journal;
 	/*
 	 * Special case here: if the journal has aborted behind our
diff --git a/fs/gfs2/sys.c b/fs/gfs2/sys.c
index d0b80552a678..b98be03d0d1e 100644
--- a/fs/gfs2/sys.c
+++ b/fs/gfs2/sys.c
@@ -146,7 +146,7 @@ static ssize_t uuid_show(struct gfs2_sbd *sdp, char *buf)
 static ssize_t freeze_show(struct gfs2_sbd *sdp, char *buf)
 {
 	struct super_block *sb = sdp->sd_vfs;
-	int frozen = (sb->s_writers.frozen == SB_UNFROZEN) ? 0 : 1;
+	int frozen = sb_is_unfrozen(sb) ? 0 : 1;
 
 	return snprintf(buf, PAGE_SIZE, "%d\n", frozen);
 }
diff --git a/fs/quota/quota.c b/fs/quota/quota.c
index 052f143e2e0e..d8147c21bf03 100644
--- a/fs/quota/quota.c
+++ b/fs/quota/quota.c
@@ -890,13 +890,13 @@ static struct super_block *quotactl_block(const char __user *special, int cmd)
 	sb = user_get_super(dev, excl);
 	if (!sb)
 		return ERR_PTR(-ENODEV);
-	if (thawed && sb->s_writers.frozen != SB_UNFROZEN) {
+	if (thawed && !sb_is_unfrozen(sb)) {
 		if (excl)
 			up_write(&sb->s_umount);
 		else
 			up_read(&sb->s_umount);
 		wait_event(sb->s_writers.wait_unfrozen,
-			   sb->s_writers.frozen == SB_UNFROZEN);
+			   sb_is_unfrozen(sb));
 		put_super(sb);
 		goto retry;
 	}
diff --git a/fs/super.c b/fs/super.c
index a31a41b313f3..fdcf5a87af0a 100644
--- a/fs/super.c
+++ b/fs/super.c
@@ -883,7 +883,7 @@ int reconfigure_super(struct fs_context *fc)
 
 	if (fc->sb_flags_mask & ~MS_RMT_MASK)
 		return -EINVAL;
-	if (sb->s_writers.frozen != SB_UNFROZEN)
+	if (!sb_is_unfrozen(sb))
 		return -EBUSY;
 
 	retval = security_sb_remount(sb, fc->security);
@@ -907,7 +907,7 @@ int reconfigure_super(struct fs_context *fc)
 			down_write(&sb->s_umount);
 			if (!sb->s_root)
 				return 0;
-			if (sb->s_writers.frozen != SB_UNFROZEN)
+			if (!sb_is_unfrozen(sb))
 				return -EBUSY;
 			remount_ro = !sb_rdonly(sb);
 		}
diff --git a/fs/xfs/xfs_trans.c b/fs/xfs/xfs_trans.c
index 7bd16fbff534..ceb4890a4c96 100644
--- a/fs/xfs/xfs_trans.c
+++ b/fs/xfs/xfs_trans.c
@@ -267,8 +267,7 @@ xfs_trans_alloc(
 	 * Zero-reservation ("empty") transactions can't modify anything, so
 	 * they're allowed to run while we're frozen.
 	 */
-	WARN_ON(resp->tr_logres > 0 &&
-		mp->m_super->s_writers.frozen == SB_FREEZE_COMPLETE);
+	WARN_ON(resp->tr_logres > 0 && sb_is_frozen(mp->m_super));
 	ASSERT(!(flags & XFS_TRANS_RES_FDBLKS) ||
 	       xfs_has_lazysbcount(mp));
 
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 5042f5ab74a4..c0cab61f9f9a 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -1604,6 +1604,28 @@ static inline bool sb_start_intwrite_trylock(struct super_block *sb)
 	return __sb_start_write_trylock(sb, SB_FREEZE_FS);
 }
 
+/**
+ * sb_is_frozen - is superblock frozen
+ * @sb: the super to check
+ *
+ * Returns true only once the freeze has completed (SB_FREEZE_COMPLETE).
+ */
+static inline bool sb_is_frozen(struct super_block *sb)
+{
+	return sb->s_writers.frozen == SB_FREEZE_COMPLETE;
+}
+
+/**
+ * sb_is_unfrozen - is superblock unfrozen
+ * @sb: the super to check
+ *
+ * Returns true if the super is unfrozen.
+ */
+static inline bool sb_is_unfrozen(struct super_block *sb)
+{
+	return sb->s_writers.frozen == SB_UNFROZEN;
+}
+
 bool inode_owner_or_capable(struct user_namespace *mnt_userns,
 			    const struct inode *inode);
 
-- 
2.35.1


^ permalink raw reply related	[flat|nested] 74+ messages in thread

* [RFC v3 03/24] fs: distinguish between user initiated freeze and kernel initiated freeze
  2023-01-14  0:33 ` Luis Chamberlain
@ 2023-01-14  0:33   ` Luis Chamberlain
  -1 siblings, 0 replies; 74+ messages in thread
From: Luis Chamberlain @ 2023-01-14  0:33 UTC (permalink / raw)
  To: hch, djwong, song, rafael, gregkh, viro, jack, bvanassche, ebiederm
  Cc: mchehab, keescook, p.raghav, linux-fsdevel, kernel, kexec,
	linux-kernel, Luis Chamberlain

Userspace can initiate a freeze call using ioctls. If the kernel decides
to freeze a filesystem later it must be able to distinguish if userspace
had initiated the freeze, so that it does not unfreeze it later
automatically on resume.

Likewise, if the kernel is initiating a freeze on its own it should *not*
fail to freeze a filesystem if a user had already frozen it on our behalf.
The same concept applies to thawing, even if it's not possible for
userspace to beat the kernel in thawing a filesystem. This logic, however,
has never applied to userspace freezing and thawing: two consecutive
userspace freeze calls will result in only the first one succeeding, so
we must retain the same behaviour in userspace.

This doesn't yet implement kernel initiated filesystem freeze calls;
that will be done in subsequent patches. This change should introduce
no functional changes. It just extends the definition of a frozen
filesystem to account for future kernel initiated filesystem freezes
and lets us keep a record of when userspace initiated it, so the kernel
can respect a userspace initiated freeze upon a kernel initiated freeze
and its respective thaw cycle.

Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
---
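The intended state transitions can be sketched in user space as below.
This is a simplified model under stated assumptions, not the kernel
implementation: it has only two states, and it leaves out locking,
syncing, the read-only shortcut and the SB_BORN check. struct fz_sb,
fz_freeze() and fz_thaw() are hypothetical names made up for the
example. All callers converted in this patch pass true, so the
kernel-initiated branches are not exercised yet.

```c
#include <errno.h>
#include <stdbool.h>

/* Two-state simplification of sb->s_writers.frozen. */
enum { FZ_UNFROZEN, FZ_FREEZE_COMPLETE };

/* Hypothetical stand-in for the fields this patch touches. */
struct fz_sb {
	int frozen;
	bool frozen_by_user;
};

static int fz_freeze(struct fz_sb *sb, bool usercall)
{
	/* A kernel freeze of an already frozen filesystem is a no-op. */
	if (!usercall && sb->frozen == FZ_FREEZE_COMPLETE)
		return 0;

	/* Any other freeze request on a frozen filesystem fails. */
	if (sb->frozen != FZ_UNFROZEN)
		return -EBUSY;

	sb->frozen = FZ_FREEZE_COMPLETE;
	sb->frozen_by_user = usercall;
	return 0;
}

static int fz_thaw(struct fz_sb *sb, bool usercall)
{
	if (!usercall) {
		/* The kernel never undoes a freeze that userspace asked for. */
		if (sb->frozen == FZ_UNFROZEN ||
		    (sb->frozen == FZ_FREEZE_COMPLETE && sb->frozen_by_user))
			return 0;
	}

	if (sb->frozen != FZ_FREEZE_COMPLETE)
		return -EINVAL;

	sb->frozen = FZ_UNFROZEN;
	sb->frozen_by_user = false;
	return 0;
}
```

In this model a userspace freeze followed by a kernel freeze and a
kernel thaw leaves the filesystem frozen, while a second userspace
freeze still returns -EBUSY as before.
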
 block/bdev.c       |  4 ++--
 fs/f2fs/gc.c       |  4 ++--
 fs/gfs2/glops.c    |  2 +-
 fs/gfs2/super.c    |  2 +-
 fs/gfs2/sys.c      |  4 ++--
 fs/gfs2/util.c     |  2 +-
 fs/ioctl.c         |  4 ++--
 fs/super.c         | 31 ++++++++++++++++++++++++++-----
 include/linux/fs.h | 16 ++++++++++++++--
 9 files changed, 51 insertions(+), 18 deletions(-)

diff --git a/block/bdev.c b/block/bdev.c
index 8fd3a7991c02..668ebf2015bf 100644
--- a/block/bdev.c
+++ b/block/bdev.c
@@ -250,7 +250,7 @@ int freeze_bdev(struct block_device *bdev)
 	if (sb->s_op->freeze_super)
 		error = sb->s_op->freeze_super(sb);
 	else
-		error = freeze_super(sb);
+		error = freeze_super(sb, true);
 	deactivate_locked_super(sb);
 
 	if (error) {
@@ -295,7 +295,7 @@ int thaw_bdev(struct block_device *bdev)
 	if (sb->s_op->thaw_super)
 		error = sb->s_op->thaw_super(sb);
 	else
-		error = thaw_super(sb);
+		error = thaw_super(sb, true);
 	if (error)
 		bdev->bd_fsfreeze_count++;
 	else
diff --git a/fs/f2fs/gc.c b/fs/f2fs/gc.c
index 4c681fe487ee..8eac3042786b 100644
--- a/fs/f2fs/gc.c
+++ b/fs/f2fs/gc.c
@@ -2141,7 +2141,7 @@ int f2fs_resize_fs(struct f2fs_sb_info *sbi, __u64 block_count)
 
 	if (!get_active_super(sbi->sb->s_bdev))
 		return -ENOTTY;
-	freeze_super(sbi->sb);
+	freeze_super(sbi->sb, true);
 
 	f2fs_down_write(&sbi->gc_lock);
 	f2fs_down_write(&sbi->cp_global_sem);
@@ -2194,7 +2194,7 @@ int f2fs_resize_fs(struct f2fs_sb_info *sbi, __u64 block_count)
 	f2fs_up_write(&sbi->cp_global_sem);
 	f2fs_up_write(&sbi->gc_lock);
 	/* We use the same active reference from freeze */
-	thaw_super(sbi->sb);
+	thaw_super(sbi->sb, true);
 	deactivate_locked_super(sbi->sb);
 	return err;
 }
diff --git a/fs/gfs2/glops.c b/fs/gfs2/glops.c
index 081422644ec5..62a7e0693efa 100644
--- a/fs/gfs2/glops.c
+++ b/fs/gfs2/glops.c
@@ -574,7 +574,7 @@ static int freeze_go_sync(struct gfs2_glock *gl)
 	if (gl->gl_state == LM_ST_SHARED && !gfs2_withdrawn(sdp) &&
 	    !test_bit(SDF_NORECOVERY, &sdp->sd_flags)) {
 		atomic_set(&sdp->sd_freeze_state, SFS_STARTING_FREEZE);
-		error = freeze_super(sdp->sd_vfs);
+		error = freeze_super(sdp->sd_vfs, true);
 		if (error) {
 			fs_info(sdp, "GFS2: couldn't freeze filesystem: %d\n",
 				error);
diff --git a/fs/gfs2/super.c b/fs/gfs2/super.c
index 48df7b276b64..9c55b8042aa4 100644
--- a/fs/gfs2/super.c
+++ b/fs/gfs2/super.c
@@ -672,7 +672,7 @@ void gfs2_freeze_func(struct work_struct *work)
 		gfs2_assert_withdraw(sdp, 0);
 	} else {
 		atomic_set(&sdp->sd_freeze_state, SFS_UNFROZEN);
-		error = thaw_super(sb);
+		error = thaw_super(sb, true);
 		if (error) {
 			fs_info(sdp, "GFS2: couldn't thaw filesystem: %d\n",
 				error);
diff --git a/fs/gfs2/sys.c b/fs/gfs2/sys.c
index b98be03d0d1e..69514294215b 100644
--- a/fs/gfs2/sys.c
+++ b/fs/gfs2/sys.c
@@ -167,10 +167,10 @@ static ssize_t freeze_store(struct gfs2_sbd *sdp, const char *buf, size_t len)
 
 	switch (n) {
 	case 0:
-		error = thaw_super(sdp->sd_vfs);
+		error = thaw_super(sdp->sd_vfs, true);
 		break;
 	case 1:
-		error = freeze_super(sdp->sd_vfs);
+		error = freeze_super(sdp->sd_vfs, true);
 		break;
 	default:
 		deactivate_locked_super(sb);
diff --git a/fs/gfs2/util.c b/fs/gfs2/util.c
index 3a0cd5e9ad84..be9705d618ec 100644
--- a/fs/gfs2/util.c
+++ b/fs/gfs2/util.c
@@ -191,7 +191,7 @@ static void signal_our_withdraw(struct gfs2_sbd *sdp)
 		/* Make sure gfs2_unfreeze works if partially-frozen */
 		flush_work(&sdp->sd_freeze_work);
 		atomic_set(&sdp->sd_freeze_state, SFS_FROZEN);
-		thaw_super(sdp->sd_vfs);
+		thaw_super(sdp->sd_vfs, true);
 	} else {
 		wait_on_bit(&i_gl->gl_flags, GLF_DEMOTE,
 			    TASK_UNINTERRUPTIBLE);
diff --git a/fs/ioctl.c b/fs/ioctl.c
index 3d2536e1ea58..0ac1622785ad 100644
--- a/fs/ioctl.c
+++ b/fs/ioctl.c
@@ -401,7 +401,7 @@ static int ioctl_fsfreeze(struct file *filp)
 	/* Freeze */
 	if (sb->s_op->freeze_super)
 		ret = sb->s_op->freeze_super(sb);
-	ret = freeze_super(sb);
+	ret = freeze_super(sb, true);
 
 	deactivate_locked_super(sb);
 
@@ -418,7 +418,7 @@ static int ioctl_fsthaw(struct file *filp)
 	/* Thaw */
 	if (sb->s_op->thaw_super)
 		return sb->s_op->thaw_super(sb);
-	return thaw_super(sb);
+	return thaw_super(sb, true);
 }
 
 static int ioctl_file_dedupe_range(struct file *file,
diff --git a/fs/super.c b/fs/super.c
index fdcf5a87af0a..0d6b4de8da88 100644
--- a/fs/super.c
+++ b/fs/super.c
@@ -1004,7 +1004,7 @@ static void do_thaw_all_callback(struct super_block *sb)
 		return;
 	if (sb->s_root && sb->s_flags & SB_BORN) {
 		emergency_thaw_bdev(sb);
-		thaw_super(sb);
+		thaw_super(sb, true);
 	}
 	deactivate_locked_super(sb);
 }
@@ -1614,6 +1614,8 @@ static void sb_freeze_unlock(struct super_block *sb, int level)
 /**
  * freeze_super - lock the filesystem and force it into a consistent state
  * @sb: the super to lock
+ * @usercall: whether or not userspace initiated this via an ioctl or if it
+ * 	was a kernel freeze
  *
  * Syncs the super to make sure the filesystem is consistent and calls the fs's
  * freeze_fs.  Subsequent calls to this without first thawing the fs will return
@@ -1644,11 +1646,14 @@ static void sb_freeze_unlock(struct super_block *sb, int level)
  *
  * sb->s_writers.frozen is protected by sb->s_umount.
  */
-int freeze_super(struct super_block *sb)
+int freeze_super(struct super_block *sb, bool usercall)
 {
 	int ret;
 
-	if (sb->s_writers.frozen != SB_UNFROZEN)
+	if (!usercall && sb_is_frozen(sb))
+		return 0;
+
+	if (!sb_is_unfrozen(sb))
 		return -EBUSY;
 
 	if (!(sb->s_flags & SB_BORN))
@@ -1657,6 +1662,7 @@ int freeze_super(struct super_block *sb)
 	if (sb_rdonly(sb)) {
 		/* Nothing to do really... */
 		sb->s_writers.frozen = SB_FREEZE_COMPLETE;
+		sb->s_writers.frozen_by_user = usercall;
 		return 0;
 	}
 
@@ -1674,6 +1680,7 @@ int freeze_super(struct super_block *sb)
 	ret = sync_filesystem(sb);
 	if (ret) {
 		sb->s_writers.frozen = SB_UNFROZEN;
+		sb->s_writers.frozen_by_user = false;
 		sb_freeze_unlock(sb, SB_FREEZE_PAGEFAULT);
 		wake_up(&sb->s_writers.wait_unfrozen);
 		return ret;
@@ -1699,6 +1706,7 @@ int freeze_super(struct super_block *sb)
 	 * when frozen is set to SB_FREEZE_COMPLETE, and for thaw_super().
 	 */
 	sb->s_writers.frozen = SB_FREEZE_COMPLETE;
+	sb->s_writers.frozen_by_user = usercall;
 	lockdep_sb_freeze_release(sb);
 	return 0;
 }
@@ -1707,18 +1715,30 @@ EXPORT_SYMBOL(freeze_super);
 /**
  * thaw_super -- unlock filesystem
  * @sb: the super to thaw
+ * @usercall: whether or not userspace initiated this thaw or if it was the
+ * 	kernel which initiated it
  *
  * Unlocks the filesystem and marks it writeable again after freeze_super().
  */
-int thaw_super(struct super_block *sb)
+int thaw_super(struct super_block *sb, bool usercall)
 {
 	int error;
 
-	if (sb->s_writers.frozen != SB_FREEZE_COMPLETE)
+	if (!usercall) {
+		/*
+		 * If userspace initiated the freeze don't let the kernel
+		 * thaw it on return from a kernel initiated freeze.
+		 */
+		if (sb_is_unfrozen(sb) || sb_is_frozen_by_user(sb))
+			return 0;
+	}
+
+	if (!sb_is_frozen(sb))
 		return -EINVAL;
 
 	if (sb_rdonly(sb)) {
 		sb->s_writers.frozen = SB_UNFROZEN;
+		sb->s_writers.frozen_by_user = false;
 		goto out;
 	}
 
@@ -1735,6 +1755,7 @@ int thaw_super(struct super_block *sb)
 	}
 
 	sb->s_writers.frozen = SB_UNFROZEN;
+	sb->s_writers.frozen_by_user = false;
 	sb_freeze_unlock(sb, SB_FREEZE_FS);
 out:
 	wake_up(&sb->s_writers.wait_unfrozen);
diff --git a/include/linux/fs.h b/include/linux/fs.h
index c0cab61f9f9a..3b2586de4364 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -1129,6 +1129,7 @@ enum {
 
 struct sb_writers {
 	int				frozen;		/* Is sb frozen? */
+	bool				frozen_by_user;	/* User freeze? */
 	wait_queue_head_t		wait_unfrozen;	/* wait for thaw */
 	struct percpu_rw_semaphore	rw_sem[SB_FREEZE_LEVELS];
 };
@@ -1615,6 +1616,17 @@ static inline bool sb_is_frozen(struct super_block *sb)
 	return sb->s_writers.frozen == SB_FREEZE_COMPLETE;
 }
 
+/**
+ * sb_is_frozen_by_user - was the superblock frozen by userspace?
+ * @sb: the super to check
+ *
+ * Returns true if the super is frozen by userspace, such as an ioctl.
+ */
+static inline bool sb_is_frozen_by_user(struct super_block *sb)
+{
+	return sb_is_frozen(sb) && sb->s_writers.frozen_by_user;
+}
+
 /**
  * sb_is_unfrozen - is superblock unfrozen
  * @sb: the super to check
@@ -2292,8 +2304,8 @@ extern int unregister_filesystem(struct file_system_type *);
 extern int vfs_statfs(const struct path *, struct kstatfs *);
 extern int user_statfs(const char __user *, struct kstatfs *);
 extern int fd_statfs(int, struct kstatfs *);
-extern int freeze_super(struct super_block *super);
-extern int thaw_super(struct super_block *super);
+extern int freeze_super(struct super_block *super, bool usercall);
+extern int thaw_super(struct super_block *super, bool usercall);
 extern __printf(2, 3)
 int super_setup_bdi_name(struct super_block *sb, char *fmt, ...);
 extern int super_setup_bdi(struct super_block *sb);
-- 
2.35.1


^ permalink raw reply related	[flat|nested] 74+ messages in thread

* [RFC v3 04/24] fs: add iterate_supers_excl() and iterate_supers_reverse_excl()
  2023-01-14  0:33 ` Luis Chamberlain
@ 2023-01-14  0:33   ` Luis Chamberlain
  -1 siblings, 0 replies; 74+ messages in thread
From: Luis Chamberlain @ 2023-01-14  0:33 UTC (permalink / raw)
  To: hch, djwong, song, rafael, gregkh, viro, jack, bvanassche, ebiederm
  Cc: mchehab, keescook, p.raghav, linux-fsdevel, kernel, kexec,
	linux-kernel, Luis Chamberlain

There are use cases where we wish to traverse the superblock list
but also capture errors, and in which case we want to avoid having
our callers issue a lock themselves since we can do the locking for
the callers. Provide an iterate_supers_excl() which calls a function
with the s_umount write lock held. If an error occurs we capture it
and propagate it.

Likewise there are use cases where we wish to traverse the superblock
list but in reverse order. The new iterate_supers_reverse_excl() helper
does this and also captures any errors encountered.

Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
---
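The calling contract (stop at the first failing callback and hand its
error back to the caller) can be sketched in user space as follows. This
is an illustration only: the superblock list is modelled as a plain
array, and sb_lock, s_umount and the s_count reference handling are
left out. struct toy_sb and the toy_* functions are hypothetical names.

```c
#include <stddef.h>

/* Hypothetical stand-in for a superblock; fail is what the callback returns. */
struct toy_sb {
	int id;
	int fail;
};

typedef int (*toy_sb_fn)(struct toy_sb *sb, void *arg);

/* Forward walk: call f on each entry, stop at the first non-zero result. */
static int toy_iterate(struct toy_sb *sbs, size_t n, toy_sb_fn f, void *arg)
{
	for (size_t i = 0; i < n; i++) {
		int error = f(&sbs[i], arg);

		if (error)
			return error;
	}
	return 0;
}

/* Reverse walk with the same error contract. */
static int toy_iterate_reverse(struct toy_sb *sbs, size_t n, toy_sb_fn f,
			       void *arg)
{
	for (size_t i = n; i-- > 0; ) {
		int error = f(&sbs[i], arg);

		if (error)
			return error;
	}
	return 0;
}

/* Example callback: count the visit and report the entry's error code. */
static int toy_visit(struct toy_sb *sb, void *arg)
{
	int *visited = arg;

	(*visited)++;
	return sb->fail;
}
```

With a failing entry at the head of the array, the forward walk stops
after one visit while the reverse walk visits every entry before it
returns the same error.
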
 fs/super.c         | 91 ++++++++++++++++++++++++++++++++++++++++++++++
 include/linux/fs.h |  2 +
 2 files changed, 93 insertions(+)

diff --git a/fs/super.c b/fs/super.c
index 0d6b4de8da88..2f77fcb6e555 100644
--- a/fs/super.c
+++ b/fs/super.c
@@ -732,6 +732,97 @@ void iterate_supers(void (*f)(struct super_block *, void *), void *arg)
 	spin_unlock(&sb_lock);
 }
 
+/**
+ *	iterate_supers_excl - exclusively call func for all active superblocks
+ *	@f: function to call
+ *	@arg: argument to pass to it
+ *
+ *	Scans the superblock list and calls given function, passing it
+ *	locked superblock and given argument. Returns 0 unless an error
+ *	occurred on calling the function on any superblock.
+ */
+int iterate_supers_excl(int (*f)(struct super_block *, void *), void *arg)
+{
+	struct super_block *sb, *p = NULL;
+	int error = 0;
+
+	spin_lock(&sb_lock);
+	list_for_each_entry(sb, &super_blocks, s_list) {
+		if (hlist_unhashed(&sb->s_instances))
+			continue;
+		sb->s_count++;
+		spin_unlock(&sb_lock);
+
+		down_write(&sb->s_umount);
+		if (sb->s_root && (sb->s_flags & SB_BORN)) {
+			error = f(sb, arg);
+			if (error) {
+				up_write(&sb->s_umount);
+				spin_lock(&sb_lock);
+				__put_super(sb);
+				break;
+			}
+		}
+		up_write(&sb->s_umount);
+
+		spin_lock(&sb_lock);
+		if (p)
+			__put_super(p);
+		p = sb;
+	}
+	if (p)
+		__put_super(p);
+	spin_unlock(&sb_lock);
+
+	return error;
+}
+
+/**
+ *	iterate_supers_reverse_excl - exclusively calls func in reverse order
+ *	@f: function to call
+ *	@arg: argument to pass to it
+ *
+ *	Scans the superblock list and calls given function, passing it
+ *	locked superblock and given argument, in reverse order, and holding
+ *	the s_umount write lock. Returns 0, or the first error returned by @f.
+ */
+int iterate_supers_reverse_excl(int (*f)(struct super_block *, void *),
+					 void *arg)
+{
+	struct super_block *sb, *p = NULL;
+	int error = 0;
+
+	spin_lock(&sb_lock);
+	list_for_each_entry_reverse(sb, &super_blocks, s_list) {
+		if (hlist_unhashed(&sb->s_instances))
+			continue;
+		sb->s_count++;
+		spin_unlock(&sb_lock);
+
+		down_write(&sb->s_umount);
+		if (sb->s_root && (sb->s_flags & SB_BORN)) {
+			error = f(sb, arg);
+			if (error) {
+				up_write(&sb->s_umount);
+				spin_lock(&sb_lock);
+				__put_super(sb);
+				break;
+			}
+		}
+		up_write(&sb->s_umount);
+
+		spin_lock(&sb_lock);
+		if (p)
+			__put_super(p);
+		p = sb;
+	}
+	if (p)
+		__put_super(p);
+	spin_unlock(&sb_lock);
+
+	return error;
+}
+
 /**
  *	iterate_supers_type - call function for superblocks of given type
  *	@type: fs type
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 3b2586de4364..f168e72f6ca1 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -2916,6 +2916,8 @@ extern struct super_block *get_active_super(struct block_device *bdev);
 extern void drop_super(struct super_block *sb);
 extern void drop_super_exclusive(struct super_block *sb);
 extern void iterate_supers(void (*)(struct super_block *, void *), void *);
+extern int iterate_supers_excl(int (*f)(struct super_block *, void *), void *arg);
+extern int iterate_supers_reverse_excl(int (*)(struct super_block *, void *), void *);
 extern void iterate_supers_type(struct file_system_type *,
 			        void (*)(struct super_block *, void *), void *);
 
-- 
2.35.1



* [RFC v3 05/24] fs: add automatic kernel fs freeze / thaw and remove kthread freezing
  2023-01-14  0:33 ` Luis Chamberlain
@ 2023-01-14  0:33   ` Luis Chamberlain
  -1 siblings, 0 replies; 74+ messages in thread
From: Luis Chamberlain @ 2023-01-14  0:33 UTC (permalink / raw)
  To: hch, djwong, song, rafael, gregkh, viro, jack, bvanassche, ebiederm
  Cc: mchehab, keescook, p.raghav, linux-fsdevel, kernel, kexec,
	linux-kernel, Luis Chamberlain

Add support to automatically handle freezing and thawing filesystems
during the kernel's suspend/resume cycle.

This is needed so that we reliably stop IO in flight, without races,
after userspace has been frozen. Without this we rely on kthread
freezing, and its semantics are loose and error prone. For instance,
even though a kthread may use try_to_freeze() and end up being frozen,
we have no way of being sure that everything that has been spawned
asynchronously from it (such as timers) has also been stopped.

A long term advantage of adding filesystem freeze / thaw support
during suspend / hibernation is that we may eventually be able to
drop the kernel's thread freezing completely, as it was originally
added to stop disk IO in flight as we hibernate or suspend.

This does not remove the superfluous freezer calls from all filesystems.
Each filesystem must remove all of its kthread freezer usage and mark
its fs_type flags as supporting auto-freezing with the FS_AUTOFREEZE
flag.

Subsequent patches remove the kthread freezer usage from each
filesystem, one at a time to make all this work bisectable.
Once all filesystems remove the usage of the kthread freezer we
can remove the FS_AUTOFREEZE flag.

Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
---
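Reviewer note, not part of the commit: a minimal userspace sketch of the
suspend-time flow described above. Eligible filesystems are frozen in
reverse order, and if any freeze fails everything is thawed again in
forward order before the error is returned. The demo_* names and the
fail_freeze knob are hypothetical. The sketch also assumes the thaw pass
simply skips entries that are not frozen, which may differ from how
thaw_super() reports that case.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Same bit position the patch uses for FS_AUTOFREEZE. */
#define DEMO_FS_AUTOFREEZE (1u << 16)

/* Hypothetical userspace stand-in for struct super_block. */
struct demo_sb {
	unsigned int fs_flags;
	bool has_bdev;		/* models s_bdi != &noop_backing_dev_info */
	bool frozen;
	bool fail_freeze;	/* test knob: make the freeze fail */
};

/* Mirrors super_should_freeze(): opted in and backed by a device. */
static bool demo_should_freeze(const struct demo_sb *sb)
{
	return (sb->fs_flags & DEMO_FS_AUTOFREEZE) && sb->has_bdev;
}

static int demo_freeze_sb(struct demo_sb *sb)
{
	if (!demo_should_freeze(sb))
		return 0;
	if (sb->fail_freeze)
		return -EIO;
	sb->frozen = true;
	return 0;
}

/* Forward-order thaw; entries that are not frozen are left alone. */
void demo_thaw_all(struct demo_sb *sbs, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		if (demo_should_freeze(&sbs[i]) && sbs[i].frozen)
			sbs[i].frozen = false;
	}
}

/* Reverse-order freeze; roll back with a full thaw on the first error. */
int demo_freeze_all(struct demo_sb *sbs, size_t n)
{
	for (size_t i = n; i-- > 0; ) {
		int error = demo_freeze_sb(&sbs[i]);

		if (error) {
			demo_thaw_all(sbs, n);
			return error;
		}
	}
	return 0;
}
```

The rollback matters because a failure on an early list entry is only seen
after later entries have already been frozen by the reverse walk.
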
 fs/super.c             | 69 ++++++++++++++++++++++++++++++++++++++++++
 include/linux/fs.h     | 14 +++++++++
 kernel/power/process.c | 15 ++++++++-
 3 files changed, 97 insertions(+), 1 deletion(-)

diff --git a/fs/super.c b/fs/super.c
index 2f77fcb6e555..e8af4c8269ad 100644
--- a/fs/super.c
+++ b/fs/super.c
@@ -1853,3 +1853,72 @@ int thaw_super(struct super_block *sb, bool usercall)
 	return 0;
 }
 EXPORT_SYMBOL(thaw_super);
+
+#ifdef CONFIG_PM_SLEEP
+static bool super_should_freeze(struct super_block *sb)
+{
+	if (!(sb->s_type->fs_flags & FS_AUTOFREEZE))
+		return false;
+	/*
+	 * We don't freeze virtual filesystems, we skip those filesystems with
+	 * no backing device.
+	 */
+	if (sb->s_bdi == &noop_backing_dev_info)
+		return false;
+
+	return true;
+}
+
+int fs_suspend_freeze_sb(struct super_block *sb, void *priv)
+{
+	int error = 0;
+
+	if (!grab_lock_super(sb)) {
+		pr_err("%s (%s): freezing failed to grab_super()\n",
+		       sb->s_type->name, sb->s_id);
+		return -ENOTTY;
+	}
+
+	if (!super_should_freeze(sb))
+		goto out;
+
+	pr_info("%s (%s): freezing\n", sb->s_type->name, sb->s_id);
+
+	error = freeze_super(sb, false);
+	if (!error)
+		lockdep_sb_freeze_release(sb);
+	else if (error != -EBUSY)
+		pr_notice("%s (%s): Unable to freeze, error=%d\n",
+			  sb->s_type->name, sb->s_id, error);
+
+out:
+	deactivate_locked_super(sb);
+	return error;
+}
+
+int fs_suspend_thaw_sb(struct super_block *sb, void *priv)
+{
+	int error = 0;
+
+	if (!grab_lock_super(sb)) {
+		pr_err("%s (%s): thawing failed to grab_super()\n",
+		       sb->s_type->name, sb->s_id);
+		return -ENOTTY;
+	}
+
+	if (!super_should_freeze(sb))
+		goto out;
+
+	pr_info("%s (%s): thawing\n", sb->s_type->name, sb->s_id);
+
+	error = thaw_super(sb, false);
+	if (error && error != -EBUSY)
+		pr_notice("%s (%s): Unable to unfreeze, error=%d\n",
+			  sb->s_type->name, sb->s_id, error);
+
+out:
+	deactivate_locked_super(sb);
+	return error;
+}
+
+#endif
diff --git a/include/linux/fs.h b/include/linux/fs.h
index f168e72f6ca1..e5bee359e804 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -2231,6 +2231,7 @@ struct file_system_type {
 #define FS_DISALLOW_NOTIFY_PERM	16	/* Disable fanotify permission events */
 #define FS_ALLOW_IDMAP         32      /* FS has been updated to handle vfs idmappings. */
 #define FS_RENAME_DOES_D_MOVE	32768	/* FS will handle d_move() during rename() internally. */
+#define FS_AUTOFREEZE           (1<<16)	/*  temporary as we phase kthread freezer out */
 	int (*init_fs_context)(struct fs_context *);
 	const struct fs_parameter_spec *parameters;
 	struct dentry *(*mount) (struct file_system_type *, int,
@@ -2306,6 +2307,19 @@ extern int user_statfs(const char __user *, struct kstatfs *);
 extern int fd_statfs(int, struct kstatfs *);
 extern int freeze_super(struct super_block *super, bool usercall);
 extern int thaw_super(struct super_block *super, bool usercall);
+#ifdef CONFIG_PM_SLEEP
+int fs_suspend_freeze_sb(struct super_block *sb, void *priv);
+int fs_suspend_thaw_sb(struct super_block *sb, void *priv);
+#else
+static inline int fs_suspend_freeze_sb(struct super_block *sb, void *priv)
+{
+	return 0;
+}
+static inline int fs_suspend_thaw_sb(struct super_block *sb, void *priv)
+{
+	return 0;
+}
+#endif
 extern __printf(2, 3)
 int super_setup_bdi_name(struct super_block *sb, char *fmt, ...);
 extern int super_setup_bdi(struct super_block *sb);
diff --git a/kernel/power/process.c b/kernel/power/process.c
index 6c1c7e566d35..1dd6b0b6b4e5 100644
--- a/kernel/power/process.c
+++ b/kernel/power/process.c
@@ -140,6 +140,16 @@ int freeze_processes(void)
 
 	BUG_ON(in_atomic());
 
+	pr_info("Freezing filesystems ... ");
+	error = iterate_supers_reverse_excl(fs_suspend_freeze_sb, NULL);
+	if (error) {
+		pr_cont("failed\n");
+		iterate_supers_excl(fs_suspend_thaw_sb, NULL);
+		thaw_processes();
+		return error;
+	}
+	pr_cont("done.\n");
+
 	/*
 	 * Now that the whole userspace is frozen we need to disable
 	 * the OOM killer to disallow any further interference with
@@ -149,8 +159,10 @@ int freeze_processes(void)
 	if (!error && !oom_killer_disable(msecs_to_jiffies(freeze_timeout_msecs)))
 		error = -EBUSY;
 
-	if (error)
+	if (error) {
+		iterate_supers_excl(fs_suspend_thaw_sb, NULL);
 		thaw_processes();
+	}
 	return error;
 }
 
@@ -188,6 +200,7 @@ void thaw_processes(void)
 	pm_nosig_freezing = false;
 
 	oom_killer_enable();
+	iterate_supers_excl(fs_suspend_thaw_sb, NULL);
 
 	pr_info("Restarting tasks ... ");
 
-- 
2.35.1



* [RFC v3 06/24] xfs: replace kthread freezing with auto fs freezing
  2023-01-14  0:33 ` Luis Chamberlain
@ 2023-01-14  0:33   ` Luis Chamberlain
  -1 siblings, 0 replies; 74+ messages in thread
From: Luis Chamberlain @ 2023-01-14  0:33 UTC (permalink / raw)
  To: hch, djwong, song, rafael, gregkh, viro, jack, bvanassche, ebiederm
  Cc: mchehab, keescook, p.raghav, linux-fsdevel, kernel, kexec,
	linux-kernel, Luis Chamberlain

The kernel power management now supports allowing the VFS to handle
filesystem freezing and thawing. Take advantage of that and remove
the kthread freezing. This is needed so that we reliably stop IO in
flight, without races, after userspace has been frozen. Without this
we rely on kthread freezing, and its semantics are loose and error
prone.

The filesystem is therefore in charge of quiescing itself through
its callbacks if it thinks it knows better than the VFS how to do
so.

The following Coccinelle rule was used to remove the now superfluous
freezer calls:

spatch --sp-file fs-freeze-cleanup.cocci --in-place --timeout 120 --dir fs/xfs --jobs 12 --use-gitgrep

@ remove_set_freezable @
expression time;
statement S, S2;
expression task, current;
@@

(
-       set_freezable();
|
-       if (try_to_freeze())
-               continue;
|
-       try_to_freeze();
|
-       freezable_schedule();
+       schedule();
|
-       freezable_schedule_timeout(time);
+       schedule_timeout(time);
|
-       if (freezing(task)) { S }
|
-       if (freezing(task)) { S }
-       else
	    { S2 }
|
-       freezing(current)
)

@ remove_wq_freezable @
expression WQ_E, WQ_ARG1, WQ_ARG2, WQ_ARG3, WQ_ARG4;
identifier fs_wq_fn;
@@

(
    WQ_E = alloc_workqueue(WQ_ARG1,
-                              WQ_ARG2 | WQ_FREEZABLE,
+                              WQ_ARG2,
			   ...);
|
    WQ_E = alloc_workqueue(WQ_ARG1,
-                              WQ_ARG2 | WQ_FREEZABLE | WQ_ARG3,
+                              WQ_ARG2 | WQ_ARG3,
			   ...);
|
    WQ_E = alloc_workqueue(WQ_ARG1,
-                              WQ_ARG2 | WQ_ARG3 | WQ_FREEZABLE,
+                              WQ_ARG2 | WQ_ARG3,
			   ...);
|
    WQ_E = alloc_workqueue(WQ_ARG1,
-                              WQ_ARG2 | WQ_ARG3 | WQ_FREEZABLE | WQ_ARG4,
+                              WQ_ARG2 | WQ_ARG3 | WQ_ARG4,
			   ...);
|
	    WQ_E =
-               WQ_ARG1 | WQ_FREEZABLE
+               WQ_ARG1
|
	    WQ_E =
-               WQ_ARG1 | WQ_FREEZABLE | WQ_ARG3
+               WQ_ARG1 | WQ_ARG3
|
    fs_wq_fn(
-               WQ_FREEZABLE | WQ_ARG2 | WQ_ARG3
+               WQ_ARG2 | WQ_ARG3
    )
|
    fs_wq_fn(
-               WQ_FREEZABLE | WQ_ARG2
+               WQ_ARG2
    )
|
    fs_wq_fn(
-               WQ_FREEZABLE
+               0
    )
)

@ add_auto_flag @
expression E1;
identifier fs_type;
@@

struct file_system_type fs_type = {
	.fs_flags = E1
+                   | FS_AUTOFREEZE
	,
};

Generated-by: Coccinelle SmPL
Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
---
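Reviewer note, not part of the commit: the workqueue half of the semantic
patch amounts to clearing one bit from each flags expression. The small
sketch below shows that effect. The DEMO_* bit values are illustrative
only (the real definitions live in include/linux/workqueue.h), and
demo_strip_freezable() is a hypothetical helper, not a kernel API.

```c
#include <assert.h>

/* Illustrative flag bits, named after the workqueue flags touched here. */
#define DEMO_WQ_UNBOUND		(1u << 1)
#define DEMO_WQ_FREEZABLE	(1u << 2)
#define DEMO_WQ_MEM_RECLAIM	(1u << 3)
#define DEMO_WQ_HIGHPRI		(1u << 4)

/* Drop the freezable bit and leave every other flag as it was. */
unsigned int demo_strip_freezable(unsigned int flags)
{
	return flags & ~DEMO_WQ_FREEZABLE;
}
```

When the freezable flag was the only one set, the result is 0, which
matches the xfs-sync workqueue ending up with XFS_WQFLAGS(0) in the diff.
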
 fs/xfs/xfs_log.c       |  3 +--
 fs/xfs/xfs_log_cil.c   |  2 +-
 fs/xfs/xfs_mru_cache.c |  2 +-
 fs/xfs/xfs_pwork.c     |  2 +-
 fs/xfs/xfs_super.c     | 16 ++++++++--------
 fs/xfs/xfs_trans_ail.c |  3 ---
 6 files changed, 12 insertions(+), 16 deletions(-)

diff --git a/fs/xfs/xfs_log.c b/fs/xfs/xfs_log.c
index fc61cc024023..fbdbc81dc8ad 100644
--- a/fs/xfs/xfs_log.c
+++ b/fs/xfs/xfs_log.c
@@ -1678,8 +1678,7 @@ xlog_alloc_log(
 	log->l_iclog->ic_prev = prev_iclog;	/* re-write 1st prev ptr */
 
 	log->l_ioend_workqueue = alloc_workqueue("xfs-log/%s",
-			XFS_WQFLAGS(WQ_FREEZABLE | WQ_MEM_RECLAIM |
-				    WQ_HIGHPRI),
+			XFS_WQFLAGS(WQ_MEM_RECLAIM | WQ_HIGHPRI),
 			0, mp->m_super->s_id);
 	if (!log->l_ioend_workqueue)
 		goto out_free_iclog;
diff --git a/fs/xfs/xfs_log_cil.c b/fs/xfs/xfs_log_cil.c
index eccbfb99e894..bcc5c8234ce8 100644
--- a/fs/xfs/xfs_log_cil.c
+++ b/fs/xfs/xfs_log_cil.c
@@ -1842,7 +1842,7 @@ xlog_cil_init(
 	 * concurrency the log spinlocks will be exposed to.
 	 */
 	cil->xc_push_wq = alloc_workqueue("xfs-cil/%s",
-			XFS_WQFLAGS(WQ_FREEZABLE | WQ_MEM_RECLAIM | WQ_UNBOUND),
+			XFS_WQFLAGS(WQ_MEM_RECLAIM | WQ_UNBOUND),
 			4, log->l_mp->m_super->s_id);
 	if (!cil->xc_push_wq)
 		goto out_destroy_cil;
diff --git a/fs/xfs/xfs_mru_cache.c b/fs/xfs/xfs_mru_cache.c
index f85e3b07ab44..98832a84be66 100644
--- a/fs/xfs/xfs_mru_cache.c
+++ b/fs/xfs/xfs_mru_cache.c
@@ -294,7 +294,7 @@ int
 xfs_mru_cache_init(void)
 {
 	xfs_mru_reap_wq = alloc_workqueue("xfs_mru_cache",
-			XFS_WQFLAGS(WQ_MEM_RECLAIM | WQ_FREEZABLE), 1);
+			XFS_WQFLAGS(WQ_MEM_RECLAIM), 1);
 	if (!xfs_mru_reap_wq)
 		return -ENOMEM;
 	return 0;
diff --git a/fs/xfs/xfs_pwork.c b/fs/xfs/xfs_pwork.c
index c283b801cc5d..3f5bf53f8778 100644
--- a/fs/xfs/xfs_pwork.c
+++ b/fs/xfs/xfs_pwork.c
@@ -72,7 +72,7 @@ xfs_pwork_init(
 	trace_xfs_pwork_init(mp, nr_threads, current->pid);
 
 	pctl->wq = alloc_workqueue("%s-%d",
-			WQ_UNBOUND | WQ_SYSFS | WQ_FREEZABLE, nr_threads, tag,
+			WQ_UNBOUND | WQ_SYSFS, nr_threads, tag,
 			current->pid);
 	if (!pctl->wq)
 		return -ENOMEM;
diff --git a/fs/xfs/xfs_super.c b/fs/xfs/xfs_super.c
index 0c4b73e9b29d..54cbf15fc459 100644
--- a/fs/xfs/xfs_super.c
+++ b/fs/xfs/xfs_super.c
@@ -526,37 +526,37 @@ xfs_init_mount_workqueues(
 	struct xfs_mount	*mp)
 {
 	mp->m_buf_workqueue = alloc_workqueue("xfs-buf/%s",
-			XFS_WQFLAGS(WQ_FREEZABLE | WQ_MEM_RECLAIM),
+			XFS_WQFLAGS(WQ_MEM_RECLAIM),
 			1, mp->m_super->s_id);
 	if (!mp->m_buf_workqueue)
 		goto out;
 
 	mp->m_unwritten_workqueue = alloc_workqueue("xfs-conv/%s",
-			XFS_WQFLAGS(WQ_FREEZABLE | WQ_MEM_RECLAIM),
+			XFS_WQFLAGS(WQ_MEM_RECLAIM),
 			0, mp->m_super->s_id);
 	if (!mp->m_unwritten_workqueue)
 		goto out_destroy_buf;
 
 	mp->m_reclaim_workqueue = alloc_workqueue("xfs-reclaim/%s",
-			XFS_WQFLAGS(WQ_FREEZABLE | WQ_MEM_RECLAIM),
+			XFS_WQFLAGS(WQ_MEM_RECLAIM),
 			0, mp->m_super->s_id);
 	if (!mp->m_reclaim_workqueue)
 		goto out_destroy_unwritten;
 
 	mp->m_blockgc_wq = alloc_workqueue("xfs-blockgc/%s",
-			XFS_WQFLAGS(WQ_UNBOUND | WQ_FREEZABLE | WQ_MEM_RECLAIM),
+			XFS_WQFLAGS(WQ_UNBOUND | WQ_MEM_RECLAIM),
 			0, mp->m_super->s_id);
 	if (!mp->m_blockgc_wq)
 		goto out_destroy_reclaim;
 
 	mp->m_inodegc_wq = alloc_workqueue("xfs-inodegc/%s",
-			XFS_WQFLAGS(WQ_FREEZABLE | WQ_MEM_RECLAIM),
+			XFS_WQFLAGS(WQ_MEM_RECLAIM),
 			1, mp->m_super->s_id);
 	if (!mp->m_inodegc_wq)
 		goto out_destroy_blockgc;
 
 	mp->m_sync_workqueue = alloc_workqueue("xfs-sync/%s",
-			XFS_WQFLAGS(WQ_FREEZABLE), 0, mp->m_super->s_id);
+			XFS_WQFLAGS(0), 0, mp->m_super->s_id);
 	if (!mp->m_sync_workqueue)
 		goto out_destroy_inodegc;
 
@@ -1966,7 +1966,7 @@ static struct file_system_type xfs_fs_type = {
 	.init_fs_context	= xfs_init_fs_context,
 	.parameters		= xfs_fs_parameters,
 	.kill_sb		= kill_block_super,
-	.fs_flags		= FS_REQUIRES_DEV | FS_ALLOW_IDMAP,
+	.fs_flags		= FS_REQUIRES_DEV | FS_ALLOW_IDMAP | FS_AUTOFREEZE,
 };
 MODULE_ALIAS_FS("xfs");
 
@@ -2205,7 +2205,7 @@ xfs_init_workqueues(void)
 	 * max_active value for this workqueue.
 	 */
 	xfs_alloc_wq = alloc_workqueue("xfsalloc",
-			XFS_WQFLAGS(WQ_MEM_RECLAIM | WQ_FREEZABLE), 0);
+			XFS_WQFLAGS(WQ_MEM_RECLAIM), 0);
 	if (!xfs_alloc_wq)
 		return -ENOMEM;
 
diff --git a/fs/xfs/xfs_trans_ail.c b/fs/xfs/xfs_trans_ail.c
index 7d4109af193e..03a9bb64927c 100644
--- a/fs/xfs/xfs_trans_ail.c
+++ b/fs/xfs/xfs_trans_ail.c
@@ -600,7 +600,6 @@ xfsaild(
 	unsigned int	noreclaim_flag;
 
 	noreclaim_flag = memalloc_noreclaim_save();
-	set_freezable();
 
 	while (1) {
 		if (tout && tout <= 20)
@@ -666,8 +665,6 @@ xfsaild(
 
 		__set_current_state(TASK_RUNNING);
 
-		try_to_freeze();
-
 		tout = xfsaild_push(ailp);
 	}
 
-- 
2.35.1



* [RFC v3 06/24] xfs: replace kthread freezing with auto fs freezing
@ 2023-01-14  0:33   ` Luis Chamberlain
  0 siblings, 0 replies; 74+ messages in thread
From: Luis Chamberlain @ 2023-01-14  0:33 UTC (permalink / raw)
  To: hch, djwong, song, rafael, gregkh, viro, jack, bvanassche, ebiederm
  Cc: mchehab, keescook, p.raghav, linux-fsdevel, kernel, kexec,
	linux-kernel, Luis Chamberlain

The kernel power management now supports allowing the VFS to handle
filesystem freezing and thawing. Take advantage of that and remove
the kthread freezing. This is needed so that we reliably stop IO in
flight, without races, after userspace has been frozen. Without this
we rely on kthread freezing, and its semantics are loose and error
prone.

The filesystem is therefore in charge of quiescing itself through
its callbacks if it thinks it knows better than the VFS how to do
so.

The following Coccinelle rule was used to remove the now superfluous
freezer calls:

spatch --sp-file fs-freeze-cleanup.cocci --in-place --timeout 120 --dir fs/xfs --jobs 12 --use-gitgrep

@ remove_set_freezable @
expression time;
statement S, S2;
expression task, current;
@@

(
-       set_freezable();
|
-       if (try_to_freeze())
-               continue;
|
-       try_to_freeze();
|
-       freezable_schedule();
+       schedule();
|
-       freezable_schedule_timeout(time);
+       schedule_timeout(time);
|
-       if (freezing(task)) { S }
|
-       if (freezing(task)) { S }
-       else
	    { S2 }
|
-       freezing(current)
)

@ remove_wq_freezable @
expression WQ_E, WQ_ARG1, WQ_ARG2, WQ_ARG3, WQ_ARG4;
identifier fs_wq_fn;
@@

(
    WQ_E = alloc_workqueue(WQ_ARG1,
-                              WQ_ARG2 | WQ_FREEZABLE,
+                              WQ_ARG2,
			   ...);
|
    WQ_E = alloc_workqueue(WQ_ARG1,
-                              WQ_ARG2 | WQ_FREEZABLE | WQ_ARG3,
+                              WQ_ARG2 | WQ_ARG3,
			   ...);
|
    WQ_E = alloc_workqueue(WQ_ARG1,
-                              WQ_ARG2 | WQ_ARG3 | WQ_FREEZABLE,
+                              WQ_ARG2 | WQ_ARG3,
			   ...);
|
    WQ_E = alloc_workqueue(WQ_ARG1,
-                              WQ_ARG2 | WQ_ARG3 | WQ_FREEZABLE | WQ_ARG4,
+                              WQ_ARG2 | WQ_ARG3 | WQ_ARG4,
			   ...);
|
	    WQ_E =
-               WQ_ARG1 | WQ_FREEZABLE
+               WQ_ARG1
|
	    WQ_E =
-               WQ_ARG1 | WQ_FREEZABLE | WQ_ARG3
+               WQ_ARG1 | WQ_ARG3
|
    fs_wq_fn(
-               WQ_FREEZABLE | WQ_ARG2 | WQ_ARG3
+               WQ_ARG2 | WQ_ARG3
    )
|
    fs_wq_fn(
-               WQ_FREEZABLE | WQ_ARG2
+               WQ_ARG2
    )
|
    fs_wq_fn(
-               WQ_FREEZABLE
+               0
    )
)

@ add_auto_flag @
expression E1;
identifier fs_type;
@@

struct file_system_type fs_type = {
	.fs_flags = E1
+                   | FS_AUTOFREEZE
	,
};

Generated-by: Coccinelle SmPL
Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
---
 fs/xfs/xfs_log.c       |  3 +--
 fs/xfs/xfs_log_cil.c   |  2 +-
 fs/xfs/xfs_mru_cache.c |  2 +-
 fs/xfs/xfs_pwork.c     |  2 +-
 fs/xfs/xfs_super.c     | 16 ++++++++--------
 fs/xfs/xfs_trans_ail.c |  3 ---
 6 files changed, 12 insertions(+), 16 deletions(-)

diff --git a/fs/xfs/xfs_log.c b/fs/xfs/xfs_log.c
index fc61cc024023..fbdbc81dc8ad 100644
--- a/fs/xfs/xfs_log.c
+++ b/fs/xfs/xfs_log.c
@@ -1678,8 +1678,7 @@ xlog_alloc_log(
 	log->l_iclog->ic_prev = prev_iclog;	/* re-write 1st prev ptr */
 
 	log->l_ioend_workqueue = alloc_workqueue("xfs-log/%s",
-			XFS_WQFLAGS(WQ_FREEZABLE | WQ_MEM_RECLAIM |
-				    WQ_HIGHPRI),
+			XFS_WQFLAGS(WQ_MEM_RECLAIM | WQ_HIGHPRI),
 			0, mp->m_super->s_id);
 	if (!log->l_ioend_workqueue)
 		goto out_free_iclog;
diff --git a/fs/xfs/xfs_log_cil.c b/fs/xfs/xfs_log_cil.c
index eccbfb99e894..bcc5c8234ce8 100644
--- a/fs/xfs/xfs_log_cil.c
+++ b/fs/xfs/xfs_log_cil.c
@@ -1842,7 +1842,7 @@ xlog_cil_init(
 	 * concurrency the log spinlocks will be exposed to.
 	 */
 	cil->xc_push_wq = alloc_workqueue("xfs-cil/%s",
-			XFS_WQFLAGS(WQ_FREEZABLE | WQ_MEM_RECLAIM | WQ_UNBOUND),
+			XFS_WQFLAGS(WQ_MEM_RECLAIM | WQ_UNBOUND),
 			4, log->l_mp->m_super->s_id);
 	if (!cil->xc_push_wq)
 		goto out_destroy_cil;
diff --git a/fs/xfs/xfs_mru_cache.c b/fs/xfs/xfs_mru_cache.c
index f85e3b07ab44..98832a84be66 100644
--- a/fs/xfs/xfs_mru_cache.c
+++ b/fs/xfs/xfs_mru_cache.c
@@ -294,7 +294,7 @@ int
 xfs_mru_cache_init(void)
 {
 	xfs_mru_reap_wq = alloc_workqueue("xfs_mru_cache",
-			XFS_WQFLAGS(WQ_MEM_RECLAIM | WQ_FREEZABLE), 1);
+			XFS_WQFLAGS(WQ_MEM_RECLAIM), 1);
 	if (!xfs_mru_reap_wq)
 		return -ENOMEM;
 	return 0;
diff --git a/fs/xfs/xfs_pwork.c b/fs/xfs/xfs_pwork.c
index c283b801cc5d..3f5bf53f8778 100644
--- a/fs/xfs/xfs_pwork.c
+++ b/fs/xfs/xfs_pwork.c
@@ -72,7 +72,7 @@ xfs_pwork_init(
 	trace_xfs_pwork_init(mp, nr_threads, current->pid);
 
 	pctl->wq = alloc_workqueue("%s-%d",
-			WQ_UNBOUND | WQ_SYSFS | WQ_FREEZABLE, nr_threads, tag,
+			WQ_UNBOUND | WQ_SYSFS, nr_threads, tag,
 			current->pid);
 	if (!pctl->wq)
 		return -ENOMEM;
diff --git a/fs/xfs/xfs_super.c b/fs/xfs/xfs_super.c
index 0c4b73e9b29d..54cbf15fc459 100644
--- a/fs/xfs/xfs_super.c
+++ b/fs/xfs/xfs_super.c
@@ -526,37 +526,37 @@ xfs_init_mount_workqueues(
 	struct xfs_mount	*mp)
 {
 	mp->m_buf_workqueue = alloc_workqueue("xfs-buf/%s",
-			XFS_WQFLAGS(WQ_FREEZABLE | WQ_MEM_RECLAIM),
+			XFS_WQFLAGS(WQ_MEM_RECLAIM),
 			1, mp->m_super->s_id);
 	if (!mp->m_buf_workqueue)
 		goto out;
 
 	mp->m_unwritten_workqueue = alloc_workqueue("xfs-conv/%s",
-			XFS_WQFLAGS(WQ_FREEZABLE | WQ_MEM_RECLAIM),
+			XFS_WQFLAGS(WQ_MEM_RECLAIM),
 			0, mp->m_super->s_id);
 	if (!mp->m_unwritten_workqueue)
 		goto out_destroy_buf;
 
 	mp->m_reclaim_workqueue = alloc_workqueue("xfs-reclaim/%s",
-			XFS_WQFLAGS(WQ_FREEZABLE | WQ_MEM_RECLAIM),
+			XFS_WQFLAGS(WQ_MEM_RECLAIM),
 			0, mp->m_super->s_id);
 	if (!mp->m_reclaim_workqueue)
 		goto out_destroy_unwritten;
 
 	mp->m_blockgc_wq = alloc_workqueue("xfs-blockgc/%s",
-			XFS_WQFLAGS(WQ_UNBOUND | WQ_FREEZABLE | WQ_MEM_RECLAIM),
+			XFS_WQFLAGS(WQ_UNBOUND | WQ_MEM_RECLAIM),
 			0, mp->m_super->s_id);
 	if (!mp->m_blockgc_wq)
 		goto out_destroy_reclaim;
 
 	mp->m_inodegc_wq = alloc_workqueue("xfs-inodegc/%s",
-			XFS_WQFLAGS(WQ_FREEZABLE | WQ_MEM_RECLAIM),
+			XFS_WQFLAGS(WQ_MEM_RECLAIM),
 			1, mp->m_super->s_id);
 	if (!mp->m_inodegc_wq)
 		goto out_destroy_blockgc;
 
 	mp->m_sync_workqueue = alloc_workqueue("xfs-sync/%s",
-			XFS_WQFLAGS(WQ_FREEZABLE), 0, mp->m_super->s_id);
+			XFS_WQFLAGS(0), 0, mp->m_super->s_id);
 	if (!mp->m_sync_workqueue)
 		goto out_destroy_inodegc;
 
@@ -1966,7 +1966,7 @@ static struct file_system_type xfs_fs_type = {
 	.init_fs_context	= xfs_init_fs_context,
 	.parameters		= xfs_fs_parameters,
 	.kill_sb		= kill_block_super,
-	.fs_flags		= FS_REQUIRES_DEV | FS_ALLOW_IDMAP,
+	.fs_flags		= FS_REQUIRES_DEV | FS_ALLOW_IDMAP | FS_AUTOFREEZE,
 };
 MODULE_ALIAS_FS("xfs");
 
@@ -2205,7 +2205,7 @@ xfs_init_workqueues(void)
 	 * max_active value for this workqueue.
 	 */
 	xfs_alloc_wq = alloc_workqueue("xfsalloc",
-			XFS_WQFLAGS(WQ_MEM_RECLAIM | WQ_FREEZABLE), 0);
+			XFS_WQFLAGS(WQ_MEM_RECLAIM), 0);
 	if (!xfs_alloc_wq)
 		return -ENOMEM;
 
diff --git a/fs/xfs/xfs_trans_ail.c b/fs/xfs/xfs_trans_ail.c
index 7d4109af193e..03a9bb64927c 100644
--- a/fs/xfs/xfs_trans_ail.c
+++ b/fs/xfs/xfs_trans_ail.c
@@ -600,7 +600,6 @@ xfsaild(
 	unsigned int	noreclaim_flag;
 
 	noreclaim_flag = memalloc_noreclaim_save();
-	set_freezable();
 
 	while (1) {
 		if (tout && tout <= 20)
@@ -666,8 +665,6 @@ xfsaild(
 
 		__set_current_state(TASK_RUNNING);
 
-		try_to_freeze();
-
 		tout = xfsaild_push(ailp);
 	}
 
-- 
2.35.1


_______________________________________________
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec

^ permalink raw reply related	[flat|nested] 74+ messages in thread

* [RFC v3 07/24] btrfs: replace kthread freezing with auto fs freezing
  2023-01-14  0:33 ` Luis Chamberlain
@ 2023-01-14  0:33   ` Luis Chamberlain
  -1 siblings, 0 replies; 74+ messages in thread
From: Luis Chamberlain @ 2023-01-14  0:33 UTC (permalink / raw)
  To: hch, djwong, song, rafael, gregkh, viro, jack, bvanassche, ebiederm
  Cc: mchehab, keescook, p.raghav, linux-fsdevel, kernel, kexec,
	linux-kernel, Luis Chamberlain

The kernel's power management code now lets the VFS handle filesystem
freezing and thawing. Take advantage of that and remove the kthread
freezing. This is needed so that we properly stop IO in flight, without
races, after userspace has been frozen. Without this we rely on kthread
freezing, whose semantics are loose and error prone.

The filesystem is therefore in charge of quiescing itself through its
own callbacks if it thinks it knows better than the VFS how to do so.

The following Coccinelle rules were used to remove the now superfluous
freezer calls:

spatch --sp-file fs-freeze-cleanup.cocci --in-place --timeout 120 --dir fs/btrfs --jobs 12 --use-gitgrep

@ remove_set_freezable @
expression time;
statement S, S2;
expression task, current;
@@

(
-       set_freezable();
|
-       if (try_to_freeze())
-               continue;
|
-       try_to_freeze();
|
-       freezable_schedule();
+       schedule();
|
-       freezable_schedule_timeout(time);
+       schedule_timeout(time);
|
-       if (freezing(task)) { S }
|
-       if (freezing(task)) { S }
-       else
	    { S2 }
|
-       freezing(current)
)

@ remove_wq_freezable @
expression WQ_E, WQ_ARG1, WQ_ARG2, WQ_ARG3, WQ_ARG4;
identifier fs_wq_fn;
@@

(
    WQ_E = alloc_workqueue(WQ_ARG1,
-                              WQ_ARG2 | WQ_FREEZABLE,
+                              WQ_ARG2,
			   ...);
|
    WQ_E = alloc_workqueue(WQ_ARG1,
-                              WQ_ARG2 | WQ_FREEZABLE | WQ_ARG3,
+                              WQ_ARG2 | WQ_ARG3,
			   ...);
|
    WQ_E = alloc_workqueue(WQ_ARG1,
-                              WQ_ARG2 | WQ_ARG3 | WQ_FREEZABLE,
+                              WQ_ARG2 | WQ_ARG3,
			   ...);
|
    WQ_E = alloc_workqueue(WQ_ARG1,
-                              WQ_ARG2 | WQ_ARG3 | WQ_FREEZABLE | WQ_ARG4,
+                              WQ_ARG2 | WQ_ARG3 | WQ_ARG4,
			   ...);
|
	    WQ_E =
-               WQ_ARG1 | WQ_FREEZABLE
+               WQ_ARG1
|
	    WQ_E =
-               WQ_ARG1 | WQ_FREEZABLE | WQ_ARG3
+               WQ_ARG1 | WQ_ARG3
|
    fs_wq_fn(
-               WQ_FREEZABLE | WQ_ARG2 | WQ_ARG3
+               WQ_ARG2 | WQ_ARG3
    )
|
    fs_wq_fn(
-               WQ_FREEZABLE | WQ_ARG2
+               WQ_ARG2
    )
|
    fs_wq_fn(
-               WQ_FREEZABLE
+               0
    )
)

@ add_auto_flag @
expression E1;
identifier fs_type;
@@

struct file_system_type fs_type = {
	.fs_flags = E1
+                   | FS_AUTOFREEZE
	,
};
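
As a rough illustration of what the remove_wq_freezable and add_auto_flag
rules amount to at the flag level, here is a small user-space sketch. The
bit values are made up for the example and are not the kernel's
definitions, and the helper names (wq_drop_freezable(), fs_add_autofreeze(),
flags_selfcheck()) are hypothetical. The SmPL rules rewrite source text
rather than masking bits at run time, so this only models the net effect
on the flag words.

```c
#include <assert.h>

/*
 * Illustrative bit values only, NOT the kernel's definitions. The real
 * WQ_* flags live in include/linux/workqueue.h and the FS_* flags in
 * include/linux/fs.h; FS_AUTOFREEZE is introduced by this series.
 */
#define WQ_UNBOUND		(1u << 1)
#define WQ_FREEZABLE		(1u << 2)
#define WQ_MEM_RECLAIM		(1u << 3)

#define FS_REQUIRES_DEV		(1u << 0)
#define FS_BINARY_MOUNTDATA	(1u << 1)
#define FS_AUTOFREEZE		(1u << 16)	/* assumed bit position */

/* Net effect of remove_wq_freezable on a workqueue flag word. */
static unsigned int wq_drop_freezable(unsigned int flags)
{
	return flags & ~WQ_FREEZABLE;
}

/* Net effect of add_auto_flag on a file_system_type's .fs_flags. */
static unsigned int fs_add_autofreeze(unsigned int fs_flags)
{
	return fs_flags | FS_AUTOFREEZE;
}

/* Check the btrfs_init_workqueues() and btrfs_fs_type cases from the diff. */
static void flags_selfcheck(void)
{
	unsigned int wq_before = WQ_MEM_RECLAIM | WQ_FREEZABLE | WQ_UNBOUND;
	unsigned int fs_before = FS_REQUIRES_DEV | FS_BINARY_MOUNTDATA;

	/* All other workqueue flags survive; only WQ_FREEZABLE is gone. */
	assert(wq_drop_freezable(wq_before) == (WQ_MEM_RECLAIM | WQ_UNBOUND));

	/* Existing fs_flags are preserved and FS_AUTOFREEZE is added. */
	assert(fs_add_autofreeze(fs_before) ==
	       (FS_REQUIRES_DEV | FS_BINARY_MOUNTDATA | FS_AUTOFREEZE));
}
```

Calling flags_selfcheck() from a test program exercises both cases. The
last fs_wq_fn() branch of the rule, where WQ_FREEZABLE was the only flag,
corresponds to wq_drop_freezable(WQ_FREEZABLE) yielding 0.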

Generated-by: Coccinelle SmPL
Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
---
 fs/btrfs/disk-io.c | 4 ++--
 fs/btrfs/scrub.c   | 2 +-
 fs/btrfs/super.c   | 4 ++--
 3 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
index f330dfa066c0..bf7ad1f34e21 100644
--- a/fs/btrfs/disk-io.c
+++ b/fs/btrfs/disk-io.c
@@ -2354,7 +2354,7 @@ static void btrfs_init_qgroup(struct btrfs_fs_info *fs_info)
 static int btrfs_init_workqueues(struct btrfs_fs_info *fs_info)
 {
 	u32 max_active = fs_info->thread_pool_size;
-	unsigned int flags = WQ_MEM_RECLAIM | WQ_FREEZABLE | WQ_UNBOUND;
+	unsigned int flags = WQ_MEM_RECLAIM | WQ_UNBOUND;
 
 	fs_info->workers =
 		btrfs_alloc_workqueue(fs_info, "worker", flags, max_active, 16);
@@ -2395,7 +2395,7 @@ static int btrfs_init_workqueues(struct btrfs_fs_info *fs_info)
 	fs_info->qgroup_rescan_workers =
 		btrfs_alloc_workqueue(fs_info, "qgroup-rescan", flags, 1, 0);
 	fs_info->discard_ctl.discard_workers =
-		alloc_workqueue("btrfs_discard", WQ_UNBOUND | WQ_FREEZABLE, 1);
+		alloc_workqueue("btrfs_discard", WQ_UNBOUND, 1);
 
 	if (!(fs_info->workers && fs_info->hipri_workers &&
 	      fs_info->delalloc_workers && fs_info->flush_workers &&
diff --git a/fs/btrfs/scrub.c b/fs/btrfs/scrub.c
index 52b346795f66..d32d7308c3a1 100644
--- a/fs/btrfs/scrub.c
+++ b/fs/btrfs/scrub.c
@@ -4207,7 +4207,7 @@ static noinline_for_stack int scrub_workers_get(struct btrfs_fs_info *fs_info,
 	struct workqueue_struct *scrub_workers = NULL;
 	struct workqueue_struct *scrub_wr_comp = NULL;
 	struct workqueue_struct *scrub_parity = NULL;
-	unsigned int flags = WQ_FREEZABLE | WQ_UNBOUND;
+	unsigned int flags = WQ_UNBOUND;
 	int max_active = fs_info->thread_pool_size;
 	int ret = -ENOMEM;
 
diff --git a/fs/btrfs/super.c b/fs/btrfs/super.c
index 433ce221dc5c..35059fe276ac 100644
--- a/fs/btrfs/super.c
+++ b/fs/btrfs/super.c
@@ -2138,7 +2138,7 @@ static struct file_system_type btrfs_fs_type = {
 	.name		= "btrfs",
 	.mount		= btrfs_mount,
 	.kill_sb	= btrfs_kill_super,
-	.fs_flags	= FS_REQUIRES_DEV | FS_BINARY_MOUNTDATA,
+	.fs_flags	= FS_REQUIRES_DEV | FS_BINARY_MOUNTDATA | FS_AUTOFREEZE,
 };
 
 static struct file_system_type btrfs_root_fs_type = {
@@ -2146,7 +2146,7 @@ static struct file_system_type btrfs_root_fs_type = {
 	.name		= "btrfs",
 	.mount		= btrfs_mount_root,
 	.kill_sb	= btrfs_kill_super,
-	.fs_flags	= FS_REQUIRES_DEV | FS_BINARY_MOUNTDATA | FS_ALLOW_IDMAP,
+	.fs_flags	= FS_REQUIRES_DEV | FS_BINARY_MOUNTDATA | FS_ALLOW_IDMAP | FS_AUTOFREEZE,
 };
 
 MODULE_ALIAS_FS("btrfs");
-- 
2.35.1


^ permalink raw reply related	[flat|nested] 74+ messages in thread

* [RFC v3 08/24] ext4: replace kthread freezing with auto fs freezing
  2023-01-14  0:33 ` Luis Chamberlain
@ 2023-01-14  0:33   ` Luis Chamberlain
  -1 siblings, 0 replies; 74+ messages in thread
From: Luis Chamberlain @ 2023-01-14  0:33 UTC (permalink / raw)
  To: hch, djwong, song, rafael, gregkh, viro, jack, bvanassche, ebiederm
  Cc: mchehab, keescook, p.raghav, linux-fsdevel, kernel, kexec,
	linux-kernel, Luis Chamberlain

The kernel's power management code now lets the VFS handle filesystem
freezing and thawing. Take advantage of that and remove the kthread
freezing. This is needed so that we properly stop IO in flight, without
races, after userspace has been frozen. Without this we rely on kthread
freezing, whose semantics are loose and error prone.

The filesystem is therefore in charge of quiescing itself through its
own callbacks if it thinks it knows better than the VFS how to do so.
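
To make the division of labour concrete, here is a minimal user-space
sketch of a generic freeze path that defers to an optional filesystem
callback. Everything prefixed mock_ or demo_ is hypothetical and only
stands in for the kernel's struct super_block, struct super_operations
(whose ->freeze_fs and ->unfreeze_fs hooks do exist in mainline) and the
freeze_super() / thaw_super() entry points. It does not model the locking
this series is mostly about.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

struct mock_sb;

/* Stand-in for the freeze-related members of struct super_operations. */
struct mock_sops {
	int (*freeze_fs)(struct mock_sb *sb);
	int (*unfreeze_fs)(struct mock_sb *sb);
};

/* Stand-in for struct super_block; only what the sketch needs. */
struct mock_sb {
	const struct mock_sops *s_op;
	int frozen;		/* set by the generic path */
	int fs_quiesced;	/* set by the filesystem's own callback */
};

/* Generic freeze: let the filesystem quiesce itself if it asked to. */
static int mock_freeze_super(struct mock_sb *sb)
{
	int ret;

	if (sb->frozen)
		return -EBUSY;
	if (sb->s_op && sb->s_op->freeze_fs) {
		ret = sb->s_op->freeze_fs(sb);
		if (ret)
			return ret;
	}
	sb->frozen = 1;
	return 0;
}

/* Generic thaw: mirror image of the freeze path. */
static int mock_thaw_super(struct mock_sb *sb)
{
	int ret;

	if (!sb->frozen)
		return -EINVAL;
	if (sb->s_op && sb->s_op->unfreeze_fs) {
		ret = sb->s_op->unfreeze_fs(sb);
		if (ret)
			return ret;
	}
	sb->frozen = 0;
	return 0;
}

/* A filesystem that "knows better": it does its own quiescing. */
static int demo_freeze_fs(struct mock_sb *sb)
{
	sb->fs_quiesced = 1;
	return 0;
}

static int demo_unfreeze_fs(struct mock_sb *sb)
{
	sb->fs_quiesced = 0;
	return 0;
}

static const struct mock_sops demo_sops = {
	.freeze_fs	= demo_freeze_fs,
	.unfreeze_fs	= demo_unfreeze_fs,
};

/* Walk one superblock through a freeze / thaw cycle. */
static void mock_demo(void)
{
	struct mock_sb sb = { .s_op = &demo_sops };

	assert(mock_freeze_super(&sb) == 0 && sb.fs_quiesced == 1);
	assert(mock_thaw_super(&sb) == 0 && sb.fs_quiesced == 0);
}
```

A superblock whose s_op is NULL, or which leaves the hooks unset, is still
marked frozen by the generic path in this model. That is the case where
the filesystem is content with what the VFS does on its behalf.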

The following Coccinelle rules were used to remove the now superfluous
freezer calls:

spatch --sp-file fs-freeze-cleanup.cocci --in-place --timeout 120 --dir fs/ext4 --jobs 12 --use-gitgrep

@ remove_set_freezable @
expression time;
statement S, S2;
expression task, current;
@@

(
-       set_freezable();
|
-       if (try_to_freeze())
-               continue;
|
-       try_to_freeze();
|
-       freezable_schedule();
+       schedule();
|
-       freezable_schedule_timeout(time);
+       schedule_timeout(time);
|
-       if (freezing(task)) { S }
|
-       if (freezing(task)) { S }
-       else
	    { S2 }
|
-       freezing(current)
)

@ remove_wq_freezable @
expression WQ_E, WQ_ARG1, WQ_ARG2, WQ_ARG3, WQ_ARG4;
identifier fs_wq_fn;
@@

(
    WQ_E = alloc_workqueue(WQ_ARG1,
-                              WQ_ARG2 | WQ_FREEZABLE,
+                              WQ_ARG2,
			   ...);
|
    WQ_E = alloc_workqueue(WQ_ARG1,
-                              WQ_ARG2 | WQ_FREEZABLE | WQ_ARG3,
+                              WQ_ARG2 | WQ_ARG3,
			   ...);
|
    WQ_E = alloc_workqueue(WQ_ARG1,
-                              WQ_ARG2 | WQ_ARG3 | WQ_FREEZABLE,
+                              WQ_ARG2 | WQ_ARG3,
			   ...);
|
    WQ_E = alloc_workqueue(WQ_ARG1,
-                              WQ_ARG2 | WQ_ARG3 | WQ_FREEZABLE | WQ_ARG4,
+                              WQ_ARG2 | WQ_ARG3 | WQ_ARG4,
			   ...);
|
	    WQ_E =
-               WQ_ARG1 | WQ_FREEZABLE
+               WQ_ARG1
|
	    WQ_E =
-               WQ_ARG1 | WQ_FREEZABLE | WQ_ARG3
+               WQ_ARG1 | WQ_ARG3
|
    fs_wq_fn(
-               WQ_FREEZABLE | WQ_ARG2 | WQ_ARG3
+               WQ_ARG2 | WQ_ARG3
    )
|
    fs_wq_fn(
-               WQ_FREEZABLE | WQ_ARG2
+               WQ_ARG2
    )
|
    fs_wq_fn(
-               WQ_FREEZABLE
+               0
    )
)

@ add_auto_flag @
expression E1;
identifier fs_type;
@@

struct file_system_type fs_type = {
	.fs_flags = E1
+                   | FS_AUTOFREEZE
	,
};

Generated-by: Coccinelle SmPL
Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
---
 fs/ext4/super.c | 9 +++------
 1 file changed, 3 insertions(+), 6 deletions(-)

diff --git a/fs/ext4/super.c b/fs/ext4/super.c
index b31db521d6bf..0ae6f13c7fa4 100644
--- a/fs/ext4/super.c
+++ b/fs/ext4/super.c
@@ -136,7 +136,7 @@ static struct file_system_type ext2_fs_type = {
 	.init_fs_context	= ext4_init_fs_context,
 	.parameters		= ext4_param_specs,
 	.kill_sb		= kill_block_super,
-	.fs_flags		= FS_REQUIRES_DEV,
+	.fs_flags		= FS_REQUIRES_DEV | FS_AUTOFREEZE,
 };
 MODULE_ALIAS_FS("ext2");
 MODULE_ALIAS("ext2");
@@ -152,7 +152,7 @@ static struct file_system_type ext3_fs_type = {
 	.init_fs_context	= ext4_init_fs_context,
 	.parameters		= ext4_param_specs,
 	.kill_sb		= kill_block_super,
-	.fs_flags		= FS_REQUIRES_DEV,
+	.fs_flags		= FS_REQUIRES_DEV | FS_AUTOFREEZE,
 };
 MODULE_ALIAS_FS("ext3");
 MODULE_ALIAS("ext3");
@@ -3734,7 +3734,6 @@ static int ext4_lazyinit_thread(void *arg)
 	unsigned long next_wakeup, cur;
 
 	BUG_ON(NULL == eli);
-	set_freezable();
 
 cont_thread:
 	while (true) {
@@ -3786,8 +3785,6 @@ static int ext4_lazyinit_thread(void *arg)
 		}
 		mutex_unlock(&eli->li_list_mtx);
 
-		try_to_freeze();
-
 		cur = jiffies;
 		if ((time_after_eq(cur, next_wakeup)) ||
 		    (MAX_JIFFY_OFFSET == next_wakeup)) {
@@ -7192,7 +7189,7 @@ static struct file_system_type ext4_fs_type = {
 	.init_fs_context	= ext4_init_fs_context,
 	.parameters		= ext4_param_specs,
 	.kill_sb		= kill_block_super,
-	.fs_flags		= FS_REQUIRES_DEV | FS_ALLOW_IDMAP,
+	.fs_flags		= FS_REQUIRES_DEV | FS_ALLOW_IDMAP | FS_AUTOFREEZE,
 };
 MODULE_ALIAS_FS("ext4");
 
-- 
2.35.1


^ permalink raw reply related	[flat|nested] 74+ messages in thread

* [RFC v3 09/24] f2fs: replace kthread freezing with auto fs freezing
  2023-01-14  0:33 ` Luis Chamberlain
@ 2023-01-14  0:33   ` Luis Chamberlain
  -1 siblings, 0 replies; 74+ messages in thread
From: Luis Chamberlain @ 2023-01-14  0:33 UTC (permalink / raw)
  To: hch, djwong, song, rafael, gregkh, viro, jack, bvanassche, ebiederm
  Cc: mchehab, keescook, p.raghav, linux-fsdevel, kernel, kexec,
	linux-kernel, Luis Chamberlain

The kernel's power management code now lets the VFS handle filesystem
freezing and thawing. Take advantage of that and remove the kthread
freezing. This is needed so that we properly stop IO in flight, without
races, after userspace has been frozen. Without this we rely on kthread
freezing, whose semantics are loose and error prone.

The filesystem is therefore in charge of quiescing itself through its
own callbacks if it thinks it knows better than the VFS how to do so.

The following Coccinelle rules were used to remove the now superfluous
freezer calls:

spatch --sp-file fs-freeze-cleanup.cocci --in-place --timeout 120 --dir fs/f2fs --jobs 12 --use-gitgrep

@ remove_set_freezable @
expression time;
statement S, S2;
expression task, current;
@@

(
-       set_freezable();
|
-       if (try_to_freeze())
-               continue;
|
-       try_to_freeze();
|
-       freezable_schedule();
+       schedule();
|
-       freezable_schedule_timeout(time);
+       schedule_timeout(time);
|
-       if (freezing(task)) { S }
|
-       if (freezing(task)) { S }
-       else
	    { S2 }
|
-       freezing(current)
)

@ remove_wq_freezable @
expression WQ_E, WQ_ARG1, WQ_ARG2, WQ_ARG3, WQ_ARG4;
identifier fs_wq_fn;
@@

(
    WQ_E = alloc_workqueue(WQ_ARG1,
-                              WQ_ARG2 | WQ_FREEZABLE,
+                              WQ_ARG2,
			   ...);
|
    WQ_E = alloc_workqueue(WQ_ARG1,
-                              WQ_ARG2 | WQ_FREEZABLE | WQ_ARG3,
+                              WQ_ARG2 | WQ_ARG3,
			   ...);
|
    WQ_E = alloc_workqueue(WQ_ARG1,
-                              WQ_ARG2 | WQ_ARG3 | WQ_FREEZABLE,
+                              WQ_ARG2 | WQ_ARG3,
			   ...);
|
    WQ_E = alloc_workqueue(WQ_ARG1,
-                              WQ_ARG2 | WQ_ARG3 | WQ_FREEZABLE | WQ_ARG4,
+                              WQ_ARG2 | WQ_ARG3 | WQ_ARG4,
			   ...);
|
	    WQ_E =
-               WQ_ARG1 | WQ_FREEZABLE
+               WQ_ARG1
|
	    WQ_E =
-               WQ_ARG1 | WQ_FREEZABLE | WQ_ARG3
+               WQ_ARG1 | WQ_ARG3
|
    fs_wq_fn(
-               WQ_FREEZABLE | WQ_ARG2 | WQ_ARG3
+               WQ_ARG2 | WQ_ARG3
    )
|
    fs_wq_fn(
-               WQ_FREEZABLE | WQ_ARG2
+               WQ_ARG2
    )
|
    fs_wq_fn(
-               WQ_FREEZABLE
+               0
    )
)

@ add_auto_flag @
expression E1;
identifier fs_type;
@@

struct file_system_type fs_type = {
	.fs_flags = E1
+                   | FS_AUTOFREEZE
	,
};

Generated-by: Coccinelle SmPL
Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
---
 fs/f2fs/gc.c      | 3 +--
 fs/f2fs/segment.c | 6 +-----
 fs/f2fs/super.c   | 2 +-
 3 files changed, 3 insertions(+), 8 deletions(-)

diff --git a/fs/f2fs/gc.c b/fs/f2fs/gc.c
index 8eac3042786b..627a7ea95851 100644
--- a/fs/f2fs/gc.c
+++ b/fs/f2fs/gc.c
@@ -42,12 +42,11 @@ static int gc_thread_func(void *data)
 
 	wait_ms = gc_th->min_sleep_time;
 
-	set_freezable();
 	do {
 		bool sync_mode, foreground = false;
 
 		wait_event_interruptible_timeout(*wq,
-				kthread_should_stop() || freezing(current) ||
+				kthread_should_stop() ||
 				waitqueue_active(fggc_wq) ||
 				gc_th->gc_wake,
 				msecs_to_jiffies(wait_ms));
diff --git a/fs/f2fs/segment.c b/fs/f2fs/segment.c
index e2f95f46d298..11cad5287047 100644
--- a/fs/f2fs/segment.c
+++ b/fs/f2fs/segment.c
@@ -1682,11 +1682,9 @@ static int issue_discard_thread(void *data)
 	unsigned int wait_ms = dcc->min_discard_issue_time;
 	int issued;
 
-	set_freezable();
-
 	do {
 		wait_event_interruptible_timeout(*q,
-				kthread_should_stop() || freezing(current) ||
+				kthread_should_stop() ||
 				dcc->discard_wake,
 				msecs_to_jiffies(wait_ms));
 
@@ -1704,8 +1702,6 @@ static int issue_discard_thread(void *data)
 		if (atomic_read(&dcc->queued_discard))
 			__wait_all_discard_cmd(sbi, NULL);
 
-		if (try_to_freeze())
-			continue;
 		if (f2fs_readonly(sbi->sb))
 			continue;
 		if (kthread_should_stop())
diff --git a/fs/f2fs/super.c b/fs/f2fs/super.c
index 87d56a9883e6..e9c6fb04c713 100644
--- a/fs/f2fs/super.c
+++ b/fs/f2fs/super.c
@@ -4645,7 +4645,7 @@ static struct file_system_type f2fs_fs_type = {
 	.name		= "f2fs",
 	.mount		= f2fs_mount,
 	.kill_sb	= kill_f2fs_super,
-	.fs_flags	= FS_REQUIRES_DEV | FS_ALLOW_IDMAP,
+	.fs_flags	= FS_REQUIRES_DEV | FS_ALLOW_IDMAP | FS_AUTOFREEZE,
 };
 MODULE_ALIAS_FS("f2fs");
 
-- 
2.35.1


^ permalink raw reply related	[flat|nested] 74+ messages in thread

* [RFC v3 09/24] f2fs: replace kthread freezing with auto fs freezing
@ 2023-01-14  0:33   ` Luis Chamberlain
  0 siblings, 0 replies; 74+ messages in thread
From: Luis Chamberlain @ 2023-01-14  0:33 UTC (permalink / raw)
  To: hch, djwong, song, rafael, gregkh, viro, jack, bvanassche, ebiederm
  Cc: mchehab, keescook, p.raghav, linux-fsdevel, kernel, kexec,
	linux-kernel, Luis Chamberlain

The kernel power management now supports allowing the VFS
to handle filesystem freezing freezes and thawing. Take advantage
of that and remove the kthread freezing. This is needed so that we
properly really stop IO in flight without races after userspace
has been frozen. Without this we rely on kthread freezing and
its semantics are loose and error prone.

The filesystem therefore is in charge of properly dealing with
quiescing of the filesystem through its callbacks if it thinks
it knows better than how the VFS handles it.

The following Coccinelle rule was used as to remove the now superflous
freezer calls:

spatch --sp-file fs-freeze-cleanup.cocci --in-place --timeout 120 --dir fs/f2fs --jobs 12 --use-gitgrep

@ remove_set_freezable @
expression time;
statement S, S2;
expression task, current;
@@

(
-       set_freezable();
|
-       if (try_to_freeze())
-               continue;
|
-       try_to_freeze();
|
-       freezable_schedule();
+       schedule();
|
-       freezable_schedule_timeout(time);
+       schedule_timeout(time);
|
-       if (freezing(task)) { S }
|
-       if (freezing(task)) { S }
-       else
	    { S2 }
|
-       freezing(current)
)

@ remove_wq_freezable @
expression WQ_E, WQ_ARG1, WQ_ARG2, WQ_ARG3, WQ_ARG4;
identifier fs_wq_fn;
@@

(
    WQ_E = alloc_workqueue(WQ_ARG1,
-                              WQ_ARG2 | WQ_FREEZABLE,
+                              WQ_ARG2,
			   ...);
|
    WQ_E = alloc_workqueue(WQ_ARG1,
-                              WQ_ARG2 | WQ_FREEZABLE | WQ_ARG3,
+                              WQ_ARG2 | WQ_ARG3,
			   ...);
|
    WQ_E = alloc_workqueue(WQ_ARG1,
-                              WQ_ARG2 | WQ_ARG3 | WQ_FREEZABLE,
+                              WQ_ARG2 | WQ_ARG3,
			   ...);
|
    WQ_E = alloc_workqueue(WQ_ARG1,
-                              WQ_ARG2 | WQ_ARG3 | WQ_FREEZABLE | WQ_ARG4,
+                              WQ_ARG2 | WQ_ARG3 | WQ_ARG4,
			   ...);
|
	    WQ_E =
-               WQ_ARG1 | WQ_FREEZABLE
+               WQ_ARG1
|
	    WQ_E =
-               WQ_ARG1 | WQ_FREEZABLE | WQ_ARG3
+               WQ_ARG1 | WQ_ARG3
|
    fs_wq_fn(
-               WQ_FREEZABLE | WQ_ARG2 | WQ_ARG3
+               WQ_ARG2 | WQ_ARG3
    )
|
    fs_wq_fn(
-               WQ_FREEZABLE | WQ_ARG2
+               WQ_ARG2
    )
|
    fs_wq_fn(
-               WQ_FREEZABLE
+               0
    )
)

@ add_auto_flag @
expression E1;
identifier fs_type;
@@

struct file_system_type fs_type = {
	.fs_flags = E1
+                   | FS_AUTOFREEZE
	,
};

Generated-by: Coccinelle SmPL
Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
---
 fs/f2fs/gc.c      | 3 +--
 fs/f2fs/segment.c | 6 +-----
 fs/f2fs/super.c   | 2 +-
 3 files changed, 3 insertions(+), 8 deletions(-)

diff --git a/fs/f2fs/gc.c b/fs/f2fs/gc.c
index 8eac3042786b..627a7ea95851 100644
--- a/fs/f2fs/gc.c
+++ b/fs/f2fs/gc.c
@@ -42,12 +42,11 @@ static int gc_thread_func(void *data)
 
 	wait_ms = gc_th->min_sleep_time;
 
-	set_freezable();
 	do {
 		bool sync_mode, foreground = false;
 
 		wait_event_interruptible_timeout(*wq,
-				kthread_should_stop() || freezing(current) ||
+				kthread_should_stop() ||
 				waitqueue_active(fggc_wq) ||
 				gc_th->gc_wake,
 				msecs_to_jiffies(wait_ms));
diff --git a/fs/f2fs/segment.c b/fs/f2fs/segment.c
index e2f95f46d298..11cad5287047 100644
--- a/fs/f2fs/segment.c
+++ b/fs/f2fs/segment.c
@@ -1682,11 +1682,9 @@ static int issue_discard_thread(void *data)
 	unsigned int wait_ms = dcc->min_discard_issue_time;
 	int issued;
 
-	set_freezable();
-
 	do {
 		wait_event_interruptible_timeout(*q,
-				kthread_should_stop() || freezing(current) ||
+				kthread_should_stop() ||
 				dcc->discard_wake,
 				msecs_to_jiffies(wait_ms));
 
@@ -1704,8 +1702,6 @@ static int issue_discard_thread(void *data)
 		if (atomic_read(&dcc->queued_discard))
 			__wait_all_discard_cmd(sbi, NULL);
 
-		if (try_to_freeze())
-			continue;
 		if (f2fs_readonly(sbi->sb))
 			continue;
 		if (kthread_should_stop())
diff --git a/fs/f2fs/super.c b/fs/f2fs/super.c
index 87d56a9883e6..e9c6fb04c713 100644
--- a/fs/f2fs/super.c
+++ b/fs/f2fs/super.c
@@ -4645,7 +4645,7 @@ static struct file_system_type f2fs_fs_type = {
 	.name		= "f2fs",
 	.mount		= f2fs_mount,
 	.kill_sb	= kill_f2fs_super,
-	.fs_flags	= FS_REQUIRES_DEV | FS_ALLOW_IDMAP,
+	.fs_flags	= FS_REQUIRES_DEV | FS_ALLOW_IDMAP | FS_AUTOFREEZE,
 };
 MODULE_ALIAS_FS("f2fs");
 
-- 
2.35.1


^ permalink raw reply related	[flat|nested] 74+ messages in thread

* [RFC v3 10/24] cifs: replace kthread freezing with auto fs freezing
  2023-01-14  0:33 ` Luis Chamberlain
@ 2023-01-14  0:33   ` Luis Chamberlain
  -1 siblings, 0 replies; 74+ messages in thread
From: Luis Chamberlain @ 2023-01-14  0:33 UTC (permalink / raw)
  To: hch, djwong, song, rafael, gregkh, viro, jack, bvanassche, ebiederm
  Cc: mchehab, keescook, p.raghav, linux-fsdevel, kernel, kexec,
	linux-kernel, Luis Chamberlain

Kernel power management now lets the VFS handle filesystem
freezing and thawing. Take advantage of that and remove the
kthread freezing. This is needed so that we properly stop IO in
flight, without races, after userspace has been frozen. Without
this we rely on kthread freezing, whose semantics are loose and
error prone.

The filesystem is therefore in charge of quiescing itself properly
through its callbacks if it thinks it knows better than the VFS
how to handle it.

The following Coccinelle rule was used to remove the now superfluous
freezer calls:

spatch --sp-file fs-freeze-cleanup.cocci --in-place --timeout 120 --dir fs/cifs --jobs 12 --use-gitgrep

@ remove_set_freezable @
expression time;
statement S, S2;
expression task, current;
@@

(
-       set_freezable();
|
-       if (try_to_freeze())
-               continue;
|
-       try_to_freeze();
|
-       freezable_schedule();
+       schedule();
|
-       freezable_schedule_timeout(time);
+       schedule_timeout(time);
|
-       if (freezing(task)) { S }
|
-       if (freezing(task)) { S }
-       else
	    { S2 }
|
-       freezing(current)
)

@ remove_wq_freezable @
expression WQ_E, WQ_ARG1, WQ_ARG2, WQ_ARG3, WQ_ARG4;
identifier fs_wq_fn;
@@

(
    WQ_E = alloc_workqueue(WQ_ARG1,
-                              WQ_ARG2 | WQ_FREEZABLE,
+                              WQ_ARG2,
			   ...);
|
    WQ_E = alloc_workqueue(WQ_ARG1,
-                              WQ_ARG2 | WQ_FREEZABLE | WQ_ARG3,
+                              WQ_ARG2 | WQ_ARG3,
			   ...);
|
    WQ_E = alloc_workqueue(WQ_ARG1,
-                              WQ_ARG2 | WQ_ARG3 | WQ_FREEZABLE,
+                              WQ_ARG2 | WQ_ARG3,
			   ...);
|
    WQ_E = alloc_workqueue(WQ_ARG1,
-                              WQ_ARG2 | WQ_ARG3 | WQ_FREEZABLE | WQ_ARG4,
+                              WQ_ARG2 | WQ_ARG3 | WQ_ARG4,
			   ...);
|
	    WQ_E =
-               WQ_ARG1 | WQ_FREEZABLE
+               WQ_ARG1
|
	    WQ_E =
-               WQ_ARG1 | WQ_FREEZABLE | WQ_ARG3
+               WQ_ARG1 | WQ_ARG3
|
    fs_wq_fn(
-               WQ_FREEZABLE | WQ_ARG2 | WQ_ARG3
+               WQ_ARG2 | WQ_ARG3
    )
|
    fs_wq_fn(
-               WQ_FREEZABLE | WQ_ARG2
+               WQ_ARG2
    )
|
    fs_wq_fn(
-               WQ_FREEZABLE
+               0
    )
)

@ add_auto_flag @
expression E1;
identifier fs_type;
@@

struct file_system_type fs_type = {
	.fs_flags = E1
+                   | FS_AUTOFREEZE
	,
};
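
For readers less familiar with SmPL, the net effect of the
remove_wq_freezable rule on a workqueue flag mask can be sketched in
plain user-space C. This is only an illustration: the flag values are
stand-ins chosen for the example (the real constants live in
include/linux/workqueue.h), and drop_freezable() and check_examples()
are hypothetical helpers, not kernel functions.

```c
#include <assert.h>

/* Stand-in flag bits, assumed for illustration only. */
#define WQ_UNBOUND     (1u << 1)
#define WQ_FREEZABLE   (1u << 2)
#define WQ_MEM_RECLAIM (1u << 3)

/*
 * Hypothetical helper: return the mask with WQ_FREEZABLE cleared.
 * The SmPL rule does the same thing textually at each call site.
 */
static inline unsigned int drop_freezable(unsigned int flags)
{
	return flags & ~WQ_FREEZABLE;
}

/* Checks that mirror the shapes the rule matches. */
static inline void check_examples(void)
{
	/* WQ_FREEZABLE first, one other flag. */
	assert(drop_freezable(WQ_FREEZABLE | WQ_MEM_RECLAIM) == WQ_MEM_RECLAIM);

	/* WQ_FREEZABLE in the middle of three flags. */
	assert(drop_freezable(WQ_UNBOUND | WQ_FREEZABLE | WQ_MEM_RECLAIM) ==
	       (WQ_UNBOUND | WQ_MEM_RECLAIM));

	/* WQ_FREEZABLE alone collapses to 0, the last branch of the rule. */
	assert(drop_freezable(WQ_FREEZABLE) == 0);
}
```

The many branches in the SmPL rule exist only because the flag can sit
at any position in the expression; semantically every branch is the
same single-bit clear.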

Generated-by: Coccinelle SmPL
Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
---
 fs/cifs/cifsfs.c    | 14 +++++++-------
 fs/cifs/connect.c   |  8 --------
 fs/cifs/dfs_cache.c |  2 +-
 3 files changed, 8 insertions(+), 16 deletions(-)

diff --git a/fs/cifs/cifsfs.c b/fs/cifs/cifsfs.c
index f052f190b2e8..25ee05c8af65 100644
--- a/fs/cifs/cifsfs.c
+++ b/fs/cifs/cifsfs.c
@@ -1104,7 +1104,7 @@ struct file_system_type cifs_fs_type = {
 	.init_fs_context = smb3_init_fs_context,
 	.parameters = smb3_fs_parameters,
 	.kill_sb = cifs_kill_sb,
-	.fs_flags = FS_RENAME_DOES_D_MOVE,
+	.fs_flags = FS_RENAME_DOES_D_MOVE | FS_AUTOFREEZE,
 };
 MODULE_ALIAS_FS("cifs");
 
@@ -1114,7 +1114,7 @@ struct file_system_type smb3_fs_type = {
 	.init_fs_context = smb3_init_fs_context,
 	.parameters = smb3_fs_parameters,
 	.kill_sb = cifs_kill_sb,
-	.fs_flags = FS_RENAME_DOES_D_MOVE,
+	.fs_flags = FS_RENAME_DOES_D_MOVE | FS_AUTOFREEZE,
 };
 MODULE_ALIAS_FS("smb3");
 MODULE_ALIAS("smb3");
@@ -1668,7 +1668,7 @@ init_cifs(void)
 			 CIFS_MAX_REQ);
 	}
 
-	cifsiod_wq = alloc_workqueue("cifsiod", WQ_FREEZABLE|WQ_MEM_RECLAIM, 0);
+	cifsiod_wq = alloc_workqueue("cifsiod", WQ_MEM_RECLAIM, 0);
 	if (!cifsiod_wq) {
 		rc = -ENOMEM;
 		goto out_clean_proc;
@@ -1682,28 +1682,28 @@ init_cifs(void)
 
 	/* WQ_UNBOUND allows decrypt tasks to run on any CPU */
 	decrypt_wq = alloc_workqueue("smb3decryptd",
-				     WQ_UNBOUND|WQ_FREEZABLE|WQ_MEM_RECLAIM, 0);
+				     WQ_UNBOUND | WQ_MEM_RECLAIM, 0);
 	if (!decrypt_wq) {
 		rc = -ENOMEM;
 		goto out_destroy_cifsiod_wq;
 	}
 
 	fileinfo_put_wq = alloc_workqueue("cifsfileinfoput",
-				     WQ_UNBOUND|WQ_FREEZABLE|WQ_MEM_RECLAIM, 0);
+				     WQ_UNBOUND | WQ_MEM_RECLAIM, 0);
 	if (!fileinfo_put_wq) {
 		rc = -ENOMEM;
 		goto out_destroy_decrypt_wq;
 	}
 
 	cifsoplockd_wq = alloc_workqueue("cifsoplockd",
-					 WQ_FREEZABLE|WQ_MEM_RECLAIM, 0);
+					 WQ_MEM_RECLAIM, 0);
 	if (!cifsoplockd_wq) {
 		rc = -ENOMEM;
 		goto out_destroy_fileinfo_put_wq;
 	}
 
 	deferredclose_wq = alloc_workqueue("deferredclose",
-					   WQ_FREEZABLE|WQ_MEM_RECLAIM, 0);
+					   WQ_MEM_RECLAIM, 0);
 	if (!deferredclose_wq) {
 		rc = -ENOMEM;
 		goto out_destroy_cifsoplockd_wq;
diff --git a/fs/cifs/connect.c b/fs/cifs/connect.c
index 164beb365bfe..43a86a369a31 100644
--- a/fs/cifs/connect.c
+++ b/fs/cifs/connect.c
@@ -375,7 +375,6 @@ static int __cifs_reconnect(struct TCP_Server_Info *server,
 	cifs_abort_connection(server);
 
 	do {
-		try_to_freeze();
 		cifs_server_lock(server);
 
 		if (!cifs_swn_set_server_dstaddr(server)) {
@@ -504,7 +503,6 @@ static int reconnect_dfs_server(struct TCP_Server_Info *server)
 	cifs_abort_connection(server);
 
 	do {
-		try_to_freeze();
 		cifs_server_lock(server);
 
 		rc = reconnect_target_unlocked(server, &tl, &target_hint);
@@ -678,8 +676,6 @@ cifs_readv_from_socket(struct TCP_Server_Info *server, struct msghdr *smb_msg)
 	int total_read;
 
 	for (total_read = 0; msg_data_left(smb_msg); total_read += length) {
-		try_to_freeze();
-
 		/* reconnect if no credits and no requests in flight */
 		if (zero_credits(server)) {
 			cifs_reconnect(server, false);
@@ -1132,12 +1128,8 @@ cifs_demultiplex_thread(void *p)
 	if (length > 1)
 		mempool_resize(cifs_req_poolp, length + cifs_min_rcv);
 
-	set_freezable();
 	allow_kernel_signal(SIGKILL);
 	while (server->tcpStatus != CifsExiting) {
-		if (try_to_freeze())
-			continue;
-
 		if (!allocate_buffers(server))
 			continue;
 
diff --git a/fs/cifs/dfs_cache.c b/fs/cifs/dfs_cache.c
index e20f8880363f..371c5f0a3523 100644
--- a/fs/cifs/dfs_cache.c
+++ b/fs/cifs/dfs_cache.c
@@ -376,7 +376,7 @@ int dfs_cache_init(void)
 	int rc;
 	int i;
 
-	dfscache_wq = alloc_workqueue("cifs-dfscache", WQ_FREEZABLE | WQ_UNBOUND, 1);
+	dfscache_wq = alloc_workqueue("cifs-dfscache", WQ_UNBOUND, 1);
 	if (!dfscache_wq)
 		return -ENOMEM;
 
-- 
2.35.1


^ permalink raw reply related	[flat|nested] 74+ messages in thread

* [RFC v3 11/24] gfs2: replace kthread freezing with auto fs freezing
  2023-01-14  0:33 ` Luis Chamberlain
@ 2023-01-14  0:33   ` Luis Chamberlain
  -1 siblings, 0 replies; 74+ messages in thread
From: Luis Chamberlain @ 2023-01-14  0:33 UTC (permalink / raw)
  To: hch, djwong, song, rafael, gregkh, viro, jack, bvanassche, ebiederm
  Cc: mchehab, keescook, p.raghav, linux-fsdevel, kernel, kexec,
	linux-kernel, Luis Chamberlain

Kernel power management now lets the VFS handle filesystem
freezing and thawing. Take advantage of that and remove the
kthread freezing. This is needed so that we properly stop IO in
flight, without races, after userspace has been frozen. Without
this we rely on kthread freezing, whose semantics are loose and
error prone.

The filesystem is therefore in charge of quiescing itself properly
through its callbacks if it thinks it knows better than the VFS
how to handle it.

The following Coccinelle rule was used to remove the now superfluous
freezer calls:

spatch --sp-file fs-freeze-cleanup.cocci --in-place --timeout 120 --dir fs/gfs2 --jobs 12 --use-gitgrep

@ remove_set_freezable @
expression time;
statement S, S2;
expression task, current;
@@

(
-       set_freezable();
|
-       if (try_to_freeze())
-               continue;
|
-       try_to_freeze();
|
-       freezable_schedule();
+       schedule();
|
-       freezable_schedule_timeout(time);
+       schedule_timeout(time);
|
-       if (freezing(task)) { S }
|
-       if (freezing(task)) { S }
-       else
	    { S2 }
|
-       freezing(current)
)

@ remove_wq_freezable @
expression WQ_E, WQ_ARG1, WQ_ARG2, WQ_ARG3, WQ_ARG4;
identifier fs_wq_fn;
@@

(
    WQ_E = alloc_workqueue(WQ_ARG1,
-                              WQ_ARG2 | WQ_FREEZABLE,
+                              WQ_ARG2,
			   ...);
|
    WQ_E = alloc_workqueue(WQ_ARG1,
-                              WQ_ARG2 | WQ_FREEZABLE | WQ_ARG3,
+                              WQ_ARG2 | WQ_ARG3,
			   ...);
|
    WQ_E = alloc_workqueue(WQ_ARG1,
-                              WQ_ARG2 | WQ_ARG3 | WQ_FREEZABLE,
+                              WQ_ARG2 | WQ_ARG3,
			   ...);
|
    WQ_E = alloc_workqueue(WQ_ARG1,
-                              WQ_ARG2 | WQ_ARG3 | WQ_FREEZABLE | WQ_ARG4,
+                              WQ_ARG2 | WQ_ARG3 | WQ_ARG4,
			   ...);
|
	    WQ_E =
-               WQ_ARG1 | WQ_FREEZABLE
+               WQ_ARG1
|
	    WQ_E =
-               WQ_ARG1 | WQ_FREEZABLE | WQ_ARG3
+               WQ_ARG1 | WQ_ARG3
|
    fs_wq_fn(
-               WQ_FREEZABLE | WQ_ARG2 | WQ_ARG3
+               WQ_ARG2 | WQ_ARG3
    )
|
    fs_wq_fn(
-               WQ_FREEZABLE | WQ_ARG2
+               WQ_ARG2
    )
|
    fs_wq_fn(
-               WQ_FREEZABLE
+               0
    )
)

@ add_auto_flag @
expression E1;
identifier fs_type;
@@

struct file_system_type fs_type = {
	.fs_flags = E1
+                   | FS_AUTOFREEZE
	,
};

Generated-by: Coccinelle SmPL
Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
---
 fs/gfs2/glock.c      | 6 +++---
 fs/gfs2/log.c        | 2 --
 fs/gfs2/main.c       | 4 ++--
 fs/gfs2/ops_fstype.c | 4 ++--
 fs/gfs2/quota.c      | 2 --
 5 files changed, 7 insertions(+), 11 deletions(-)

diff --git a/fs/gfs2/glock.c b/fs/gfs2/glock.c
index 524f3c96b9a4..7ad1a1229ae3 100644
--- a/fs/gfs2/glock.c
+++ b/fs/gfs2/glock.c
@@ -2459,14 +2459,14 @@ int __init gfs2_glock_init(void)
 	if (ret < 0)
 		return ret;
 
-	glock_workqueue = alloc_workqueue("glock_workqueue", WQ_MEM_RECLAIM |
-					  WQ_HIGHPRI | WQ_FREEZABLE, 0);
+	glock_workqueue = alloc_workqueue("glock_workqueue",
+					  WQ_MEM_RECLAIM | WQ_HIGHPRI, 0);
 	if (!glock_workqueue) {
 		rhashtable_destroy(&gl_hash_table);
 		return -ENOMEM;
 	}
 	gfs2_delete_workqueue = alloc_workqueue("delete_workqueue",
-						WQ_MEM_RECLAIM | WQ_FREEZABLE,
+						WQ_MEM_RECLAIM,
 						0);
 	if (!gfs2_delete_workqueue) {
 		destroy_workqueue(glock_workqueue);
diff --git a/fs/gfs2/log.c b/fs/gfs2/log.c
index 1fcc829f02ab..213fafc367f4 100644
--- a/fs/gfs2/log.c
+++ b/fs/gfs2/log.c
@@ -1330,8 +1330,6 @@ int gfs2_logd(void *data)
 
 		t = gfs2_tune_get(sdp, gt_logd_secs) * HZ;
 
-		try_to_freeze();
-
 		do {
 			prepare_to_wait(&sdp->sd_logd_waitq, &wait,
 					TASK_INTERRUPTIBLE);
diff --git a/fs/gfs2/main.c b/fs/gfs2/main.c
index afcb32854f14..43d4748ad183 100644
--- a/fs/gfs2/main.c
+++ b/fs/gfs2/main.c
@@ -153,12 +153,12 @@ static int __init init_gfs2_fs(void)
 
 	error = -ENOMEM;
 	gfs_recovery_wq = alloc_workqueue("gfs_recovery",
-					  WQ_MEM_RECLAIM | WQ_FREEZABLE, 0);
+					  WQ_MEM_RECLAIM, 0);
 	if (!gfs_recovery_wq)
 		goto fail_wq1;
 
 	gfs2_control_wq = alloc_workqueue("gfs2_control",
-					  WQ_UNBOUND | WQ_FREEZABLE, 0);
+					  WQ_UNBOUND, 0);
 	if (!gfs2_control_wq)
 		goto fail_wq2;
 
diff --git a/fs/gfs2/ops_fstype.c b/fs/gfs2/ops_fstype.c
index c0cf1d2d0ef5..8f5a63148eaf 100644
--- a/fs/gfs2/ops_fstype.c
+++ b/fs/gfs2/ops_fstype.c
@@ -1740,7 +1740,7 @@ static void gfs2_kill_sb(struct super_block *sb)
 
 struct file_system_type gfs2_fs_type = {
 	.name = "gfs2",
-	.fs_flags = FS_REQUIRES_DEV,
+	.fs_flags = FS_REQUIRES_DEV | FS_AUTOFREEZE,
 	.init_fs_context = gfs2_init_fs_context,
 	.parameters = gfs2_fs_parameters,
 	.kill_sb = gfs2_kill_sb,
@@ -1750,7 +1750,7 @@ MODULE_ALIAS_FS("gfs2");
 
 struct file_system_type gfs2meta_fs_type = {
 	.name = "gfs2meta",
-	.fs_flags = FS_REQUIRES_DEV,
+	.fs_flags = FS_REQUIRES_DEV | FS_AUTOFREEZE,
 	.init_fs_context = gfs2_meta_init_fs_context,
 	.owner = THIS_MODULE,
 };
diff --git a/fs/gfs2/quota.c b/fs/gfs2/quota.c
index 1ed17226d9ed..710764af9d04 100644
--- a/fs/gfs2/quota.c
+++ b/fs/gfs2/quota.c
@@ -1555,8 +1555,6 @@ int gfs2_quotad(void *data)
 		quotad_check_timeo(sdp, "sync", gfs2_quota_sync, t,
 				   &quotad_timeo, &tune->gt_quota_quantum);
 
-		try_to_freeze();
-
 bypass:
 		t = min(quotad_timeo, statfs_timeo);
 
-- 
2.35.1


^ permalink raw reply related	[flat|nested] 74+ messages in thread

* [RFC v3 12/24] jfs: replace kthread freezing with auto fs freezing
  2023-01-14  0:33 ` Luis Chamberlain
@ 2023-01-14  0:33   ` Luis Chamberlain
  -1 siblings, 0 replies; 74+ messages in thread
From: Luis Chamberlain @ 2023-01-14  0:33 UTC (permalink / raw)
  To: hch, djwong, song, rafael, gregkh, viro, jack, bvanassche, ebiederm
  Cc: mchehab, keescook, p.raghav, linux-fsdevel, kernel, kexec,
	linux-kernel, Luis Chamberlain

Kernel power management now lets the VFS handle filesystem
freezing and thawing. Take advantage of that and remove the
kthread freezing. This is needed so that we properly stop IO in
flight, without races, after userspace has been frozen. Without
this we rely on kthread freezing, whose semantics are loose and
error prone.

The filesystem is therefore in charge of quiescing itself properly
through its callbacks if it thinks it knows better than the VFS
how to handle it.

The following Coccinelle rule was used to remove the now superfluous
freezer calls:

spatch --sp-file fs-freeze-cleanup.cocci --in-place --timeout 120 --dir fs/jfs --jobs 12 --use-gitgrep

@ remove_set_freezable @
expression time;
statement S, S2;
expression task, current;
@@

(
-       set_freezable();
|
-       if (try_to_freeze())
-               continue;
|
-       try_to_freeze();
|
-       freezable_schedule();
+       schedule();
|
-       freezable_schedule_timeout(time);
+       schedule_timeout(time);
|
-       if (freezing(task)) { S }
|
-       if (freezing(task)) { S }
-       else
	    { S2 }
|
-       freezing(current)
)

@ remove_wq_freezable @
expression WQ_E, WQ_ARG1, WQ_ARG2, WQ_ARG3, WQ_ARG4;
identifier fs_wq_fn;
@@

(
    WQ_E = alloc_workqueue(WQ_ARG1,
-                              WQ_ARG2 | WQ_FREEZABLE,
+                              WQ_ARG2,
			   ...);
|
    WQ_E = alloc_workqueue(WQ_ARG1,
-                              WQ_ARG2 | WQ_FREEZABLE | WQ_ARG3,
+                              WQ_ARG2 | WQ_ARG3,
			   ...);
|
    WQ_E = alloc_workqueue(WQ_ARG1,
-                              WQ_ARG2 | WQ_ARG3 | WQ_FREEZABLE,
+                              WQ_ARG2 | WQ_ARG3,
			   ...);
|
    WQ_E = alloc_workqueue(WQ_ARG1,
-                              WQ_ARG2 | WQ_ARG3 | WQ_FREEZABLE | WQ_ARG4,
+                              WQ_ARG2 | WQ_ARG3 | WQ_ARG4,
			   ...);
|
	    WQ_E =
-               WQ_ARG1 | WQ_FREEZABLE
+               WQ_ARG1
|
	    WQ_E =
-               WQ_ARG1 | WQ_FREEZABLE | WQ_ARG3
+               WQ_ARG1 | WQ_ARG3
|
    fs_wq_fn(
-               WQ_FREEZABLE | WQ_ARG2 | WQ_ARG3
+               WQ_ARG2 | WQ_ARG3
    )
|
    fs_wq_fn(
-               WQ_FREEZABLE | WQ_ARG2
+               WQ_ARG2
    )
|
    fs_wq_fn(
-               WQ_FREEZABLE
+               0
    )
)

@ add_auto_flag @
expression E1;
identifier fs_type;
@@

struct file_system_type fs_type = {
	.fs_flags = E1
+                   | FS_AUTOFREEZE
	,
};
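
For jfs the interesting branch of the rule is the
"if (freezing(task)) { S } else { S2 }" one: only the former else arm
survives. The user-space sketch below models the tail of such a thread
loop as a sequence of symbolic steps, to show that the new shape is
exactly what the old shape did for a task that was not freezing. The
enum labels and function names are hypothetical and stand in for the
real kernel calls.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Symbolic steps of the loop tail; labels only, not kernel calls. */
enum step {
	STEP_SET_STATE,		/* set_current_state(TASK_INTERRUPTIBLE) */
	STEP_UNLOCK,		/* drop the lock held across the loop body */
	STEP_SCHEDULE,		/* schedule() */
	STEP_TRY_TO_FREEZE	/* try_to_freeze() */
};

/* Old shape: branch on a modelled freezing flag. Returns steps written. */
static inline size_t old_tail(bool freezing, enum step out[3])
{
	size_t n = 0;

	if (freezing) {
		out[n++] = STEP_UNLOCK;
		out[n++] = STEP_TRY_TO_FREEZE;
	} else {
		out[n++] = STEP_SET_STATE;
		out[n++] = STEP_UNLOCK;
		out[n++] = STEP_SCHEDULE;
	}
	assert(n <= 3);
	return n;
}

/* New shape: only the former else arm remains. */
static inline size_t new_tail(enum step out[3])
{
	size_t n = 0;

	out[n++] = STEP_SET_STATE;
	out[n++] = STEP_UNLOCK;
	out[n++] = STEP_SCHEDULE;
	return n;
}

/* True when both shapes emit the same steps for a non-freezing task. */
static inline bool tails_match_when_not_freezing(void)
{
	enum step a[3], b[3];
	size_t na = old_tail(false, a);
	size_t nb = new_tail(b);

	if (na != nb)
		return false;
	for (size_t i = 0; i < na; i++)
		if (a[i] != b[i])
			return false;
	return true;
}
```

Note that the task state is still set before the lock is dropped, so
the ordering of the surviving arm is unchanged by the conversion.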

Generated-by: Coccinelle SmPL
Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
---
 fs/jfs/jfs_logmgr.c | 11 +++--------
 fs/jfs/jfs_txnmgr.c | 31 +++++++++----------------------
 fs/jfs/super.c      |  2 +-
 3 files changed, 13 insertions(+), 31 deletions(-)

diff --git a/fs/jfs/jfs_logmgr.c b/fs/jfs/jfs_logmgr.c
index 695415cbfe98..32df79fc09a2 100644
--- a/fs/jfs/jfs_logmgr.c
+++ b/fs/jfs/jfs_logmgr.c
@@ -2317,14 +2317,9 @@ int jfsIOWait(void *arg)
 			spin_lock_irq(&log_redrive_lock);
 		}
 
-		if (freezing(current)) {
-			spin_unlock_irq(&log_redrive_lock);
-			try_to_freeze();
-		} else {
-			set_current_state(TASK_INTERRUPTIBLE);
-			spin_unlock_irq(&log_redrive_lock);
-			schedule();
-		}
+		set_current_state(TASK_INTERRUPTIBLE);
+		spin_unlock_irq(&log_redrive_lock);
+		schedule();
 	} while (!kthread_should_stop());
 
 	jfs_info("jfsIOWait being killed!");
diff --git a/fs/jfs/jfs_txnmgr.c b/fs/jfs/jfs_txnmgr.c
index ffd4feece078..6c6dee3a16cc 100644
--- a/fs/jfs/jfs_txnmgr.c
+++ b/fs/jfs/jfs_txnmgr.c
@@ -2696,6 +2696,7 @@ int jfs_lazycommit(void *arg)
 	struct tblock *tblk;
 	unsigned long flags;
 	struct jfs_sb_info *sbi;
+	DECLARE_WAITQUEUE(wq, current);
 
 	do {
 		LAZY_LOCK(flags);
@@ -2742,19 +2743,11 @@ int jfs_lazycommit(void *arg)
 		}
 		/* In case a wakeup came while all threads were active */
 		jfs_commit_thread_waking = 0;
-
-		if (freezing(current)) {
-			LAZY_UNLOCK(flags);
-			try_to_freeze();
-		} else {
-			DECLARE_WAITQUEUE(wq, current);
-
-			add_wait_queue(&jfs_commit_thread_wait, &wq);
-			set_current_state(TASK_INTERRUPTIBLE);
-			LAZY_UNLOCK(flags);
-			schedule();
-			remove_wait_queue(&jfs_commit_thread_wait, &wq);
-		}
+		add_wait_queue(&jfs_commit_thread_wait, &wq);
+		set_current_state(TASK_INTERRUPTIBLE);
+		LAZY_UNLOCK(flags);
+		schedule();
+		remove_wait_queue(&jfs_commit_thread_wait, &wq);
 	} while (!kthread_should_stop());
 
 	if (!list_empty(&TxAnchor.unlock_queue))
@@ -2931,15 +2924,9 @@ int jfs_sync(void *arg)
 		}
 		/* Add anon_list2 back to anon_list */
 		list_splice_init(&TxAnchor.anon_list2, &TxAnchor.anon_list);
-
-		if (freezing(current)) {
-			TXN_UNLOCK();
-			try_to_freeze();
-		} else {
-			set_current_state(TASK_INTERRUPTIBLE);
-			TXN_UNLOCK();
-			schedule();
-		}
+		set_current_state(TASK_INTERRUPTIBLE);
+		TXN_UNLOCK();
+		schedule();
 	} while (!kthread_should_stop());
 
 	jfs_info("jfs_sync being killed");
diff --git a/fs/jfs/super.c b/fs/jfs/super.c
index d2f82cb7db1b..8ca77aa0b6f9 100644
--- a/fs/jfs/super.c
+++ b/fs/jfs/super.c
@@ -906,7 +906,7 @@ static struct file_system_type jfs_fs_type = {
 	.name		= "jfs",
 	.mount		= jfs_do_mount,
 	.kill_sb	= kill_block_super,
-	.fs_flags	= FS_REQUIRES_DEV,
+	.fs_flags	= FS_REQUIRES_DEV | FS_AUTOFREEZE,
 };
 MODULE_ALIAS_FS("jfs");
 
-- 
2.35.1


^ permalink raw reply related	[flat|nested] 74+ messages in thread

* [RFC v3 13/24] nilfs2: replace kthread freezing with auto fs freezing
  2023-01-14  0:33 ` Luis Chamberlain
@ 2023-01-14  0:33   ` Luis Chamberlain
  -1 siblings, 0 replies; 74+ messages in thread
From: Luis Chamberlain @ 2023-01-14  0:33 UTC (permalink / raw)
  To: hch, djwong, song, rafael, gregkh, viro, jack, bvanassche, ebiederm
  Cc: mchehab, keescook, p.raghav, linux-fsdevel, kernel, kexec,
	linux-kernel, Luis Chamberlain

The kernel power management code now allows the VFS to handle
filesystem freezing and thawing. Take advantage of that and remove
the kthread freezing. This is needed so that we properly stop IO
in flight, without races, after userspace has been frozen. Without
this we rely on kthread freezing, whose semantics are loose and
error prone.

The filesystem is therefore in charge of quiescing itself through
its callbacks if it thinks it knows better than the VFS how to
handle it.

The following Coccinelle rules were used to remove the now superfluous
freezer calls:

spatch --sp-file fs-freeze-cleanup.cocci --in-place --timeout 120 --dir fs/nilfs2 --jobs 12 --use-gitgrep

@ remove_set_freezable @
expression time;
statement S, S2;
expression task, current;
@@

(
-       set_freezable();
|
-       if (try_to_freeze())
-               continue;
|
-       try_to_freeze();
|
-       freezable_schedule();
+       schedule();
|
-       freezable_schedule_timeout(time);
+       schedule_timeout(time);
|
-       if (freezing(task)) { S }
|
-       if (freezing(task)) { S }
-       else
	    { S2 }
|
-       freezing(current)
)

@ remove_wq_freezable @
expression WQ_E, WQ_ARG1, WQ_ARG2, WQ_ARG3, WQ_ARG4;
identifier fs_wq_fn;
@@

(
    WQ_E = alloc_workqueue(WQ_ARG1,
-                              WQ_ARG2 | WQ_FREEZABLE,
+                              WQ_ARG2,
			   ...);
|
    WQ_E = alloc_workqueue(WQ_ARG1,
-                              WQ_ARG2 | WQ_FREEZABLE | WQ_ARG3,
+                              WQ_ARG2 | WQ_ARG3,
			   ...);
|
    WQ_E = alloc_workqueue(WQ_ARG1,
-                              WQ_ARG2 | WQ_ARG3 | WQ_FREEZABLE,
+                              WQ_ARG2 | WQ_ARG3,
			   ...);
|
    WQ_E = alloc_workqueue(WQ_ARG1,
-                              WQ_ARG2 | WQ_ARG3 | WQ_FREEZABLE | WQ_ARG4,
+                              WQ_ARG2 | WQ_ARG3 | WQ_ARG4,
			   ...);
|
	    WQ_E =
-               WQ_ARG1 | WQ_FREEZABLE
+               WQ_ARG1
|
	    WQ_E =
-               WQ_ARG1 | WQ_FREEZABLE | WQ_ARG3
+               WQ_ARG1 | WQ_ARG3
|
    fs_wq_fn(
-               WQ_FREEZABLE | WQ_ARG2 | WQ_ARG3
+               WQ_ARG2 | WQ_ARG3
    )
|
    fs_wq_fn(
-               WQ_FREEZABLE | WQ_ARG2
+               WQ_ARG2
    )
|
    fs_wq_fn(
-               WQ_FREEZABLE
+               0
    )
)

@ add_auto_flag @
expression E1;
identifier fs_type;
@@

struct file_system_type fs_type = {
	.fs_flags = E1
+                   | FS_AUTOFREEZE
	,
};
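
For readers unfamiliar with the add_auto_flag rule above, here is a
minimal user-space sketch of what the added flag amounts to at run time:
one more bit ORed into fs_flags that generic code can test. The bit
position chosen for FS_AUTOFREEZE, the cut-down struct and the helper
fs_wants_autofreeze() are assumptions made for illustration only; they
are not taken from this series.

```c
#include <stdbool.h>

/*
 * Illustrative bit values only. The flag names come from the patches;
 * the bit position of FS_AUTOFREEZE is assumed, not taken from the series.
 */
#define FS_REQUIRES_DEV	(1u << 0)
#define FS_AUTOFREEZE	(1u << 15)	/* hypothetical bit position */

/* Cut-down stand-in for the kernel structure: only what the sketch needs. */
struct file_system_type {
	const char *name;
	unsigned int fs_flags;
};

/* Hypothetical helper: has this filesystem opted in to automatic freezing? */
static bool fs_wants_autofreeze(const struct file_system_type *type)
{
	return (type->fs_flags & FS_AUTOFREEZE) != 0;
}
```

A type declared with `FS_REQUIRES_DEV | FS_AUTOFREEZE` would pass this
test, while one declared with only `FS_REQUIRES_DEV` would not.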

Generated-by: Coccinelle SmPL
Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
---
 fs/nilfs2/segment.c | 48 +++++++++++++++++++--------------------------
 1 file changed, 20 insertions(+), 28 deletions(-)

diff --git a/fs/nilfs2/segment.c b/fs/nilfs2/segment.c
index f7a14ed12a66..1c48aa9c7f56 100644
--- a/fs/nilfs2/segment.c
+++ b/fs/nilfs2/segment.c
@@ -2541,6 +2541,8 @@ static int nilfs_segctor_thread(void *arg)
 	struct nilfs_sc_info *sci = (struct nilfs_sc_info *)arg;
 	struct the_nilfs *nilfs = sci->sc_super->s_fs_info;
 	int timeout = 0;
+	DEFINE_WAIT(wait);
+	int should_sleep = 1;
 
 	sci->sc_timer_task = current;
 
@@ -2572,38 +2574,28 @@ static int nilfs_segctor_thread(void *arg)
 		timeout = 0;
 	}
 
+	prepare_to_wait(&sci->sc_wait_daemon, &wait,
+			TASK_INTERRUPTIBLE);
 
-	if (freezing(current)) {
+	if (sci->sc_seq_request != sci->sc_seq_done)
+		should_sleep = 0;
+	else if (sci->sc_flush_request)
+		should_sleep = 0;
+	else if (sci->sc_state & NILFS_SEGCTOR_COMMIT)
+		should_sleep = time_before(jiffies,
+				sci->sc_timer.expires);
+
+	if (should_sleep) {
 		spin_unlock(&sci->sc_state_lock);
-		try_to_freeze();
+		schedule();
 		spin_lock(&sci->sc_state_lock);
-	} else {
-		DEFINE_WAIT(wait);
-		int should_sleep = 1;
-
-		prepare_to_wait(&sci->sc_wait_daemon, &wait,
-				TASK_INTERRUPTIBLE);
-
-		if (sci->sc_seq_request != sci->sc_seq_done)
-			should_sleep = 0;
-		else if (sci->sc_flush_request)
-			should_sleep = 0;
-		else if (sci->sc_state & NILFS_SEGCTOR_COMMIT)
-			should_sleep = time_before(jiffies,
-					sci->sc_timer.expires);
-
-		if (should_sleep) {
-			spin_unlock(&sci->sc_state_lock);
-			schedule();
-			spin_lock(&sci->sc_state_lock);
-		}
-		finish_wait(&sci->sc_wait_daemon, &wait);
-		timeout = ((sci->sc_state & NILFS_SEGCTOR_COMMIT) &&
-			   time_after_eq(jiffies, sci->sc_timer.expires));
-
-		if (nilfs_sb_dirty(nilfs) && nilfs_sb_need_update(nilfs))
-			set_nilfs_discontinued(nilfs);
 	}
+	finish_wait(&sci->sc_wait_daemon, &wait);
+	timeout = ((sci->sc_state & NILFS_SEGCTOR_COMMIT) &&
+		   time_after_eq(jiffies, sci->sc_timer.expires));
+
+	if (nilfs_sb_dirty(nilfs) && nilfs_sb_need_update(nilfs))
+		set_nilfs_discontinued(nilfs);
 	goto loop;
 
  end_thread:
-- 
2.35.1
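
The sleep decision that the diff above un-nests in nilfs_segctor_thread()
can be read as a pure function of the segment constructor state. The
following user-space sketch restates it that way. The struct
segctor_model, the helper names and the numeric value of
NILFS_SEGCTOR_COMMIT are assumptions for illustration; only the order of
the checks mirrors the patch.

```c
#include <stdbool.h>

#define NILFS_SEGCTOR_COMMIT	0x0004	/* value assumed for illustration */

/* Wraparound-safe "a is earlier than b", as the kernel's time_before() does. */
static bool model_time_before(unsigned long a, unsigned long b)
{
	return (long)(a - b) < 0;
}

/* Hypothetical cut-down view of the fields the decision reads. */
struct segctor_model {
	unsigned int seq_request;
	unsigned int seq_done;
	int flush_request;
	unsigned long state;
	unsigned long timer_expires;
};

/* Starts from "sleep" on every call, then applies the checks in order. */
static bool segctor_should_sleep(const struct segctor_model *sci,
				 unsigned long now)
{
	if (sci->seq_request != sci->seq_done)
		return false;		/* a construction request is pending */
	if (sci->flush_request)
		return false;		/* a flush was requested */
	if (sci->state & NILFS_SEGCTOR_COMMIT)
		return model_time_before(now, sci->timer_expires);
	return true;			/* nothing to do */
}
```

Because the function computes its result afresh on each call, the default
of "sleep" applies on every pass through the loop.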


^ permalink raw reply related	[flat|nested] 74+ messages in thread

* [RFC v3 14/24] nfs: replace kthread freezing with auto fs freezing
  2023-01-14  0:33 ` Luis Chamberlain
@ 2023-01-14  0:33   ` Luis Chamberlain
  -1 siblings, 0 replies; 74+ messages in thread
From: Luis Chamberlain @ 2023-01-14  0:33 UTC (permalink / raw)
  To: hch, djwong, song, rafael, gregkh, viro, jack, bvanassche, ebiederm
  Cc: mchehab, keescook, p.raghav, linux-fsdevel, kernel, kexec,
	linux-kernel, Luis Chamberlain

The kernel power management code now allows the VFS to handle
filesystem freezing and thawing. Take advantage of that and remove
the kthread freezing. This is needed so that we properly stop IO
in flight, without races, after userspace has been frozen. Without
this we rely on kthread freezing, whose semantics are loose and
error prone.

The filesystem is therefore in charge of quiescing itself through
its callbacks if it thinks it knows better than the VFS how to
handle it.

The following Coccinelle rules were used to remove the now superfluous
freezer calls:

spatch --sp-file fs-freeze-cleanup.cocci --in-place --timeout 120 --dir fs/nfs --jobs 12 --use-gitgrep

@ remove_set_freezable @
expression time;
statement S, S2;
expression task, current;
@@

(
-       set_freezable();
|
-       if (try_to_freeze())
-               continue;
|
-       try_to_freeze();
|
-       freezable_schedule();
+       schedule();
|
-       freezable_schedule_timeout(time);
+       schedule_timeout(time);
|
-       if (freezing(task)) { S }
|
-       if (freezing(task)) { S }
-       else
	    { S2 }
|
-       freezing(current)
)

@ remove_wq_freezable @
expression WQ_E, WQ_ARG1, WQ_ARG2, WQ_ARG3, WQ_ARG4;
identifier fs_wq_fn;
@@

(
    WQ_E = alloc_workqueue(WQ_ARG1,
-                              WQ_ARG2 | WQ_FREEZABLE,
+                              WQ_ARG2,
			   ...);
|
    WQ_E = alloc_workqueue(WQ_ARG1,
-                              WQ_ARG2 | WQ_FREEZABLE | WQ_ARG3,
+                              WQ_ARG2 | WQ_ARG3,
			   ...);
|
    WQ_E = alloc_workqueue(WQ_ARG1,
-                              WQ_ARG2 | WQ_ARG3 | WQ_FREEZABLE,
+                              WQ_ARG2 | WQ_ARG3,
			   ...);
|
    WQ_E = alloc_workqueue(WQ_ARG1,
-                              WQ_ARG2 | WQ_ARG3 | WQ_FREEZABLE | WQ_ARG4,
+                              WQ_ARG2 | WQ_ARG3 | WQ_ARG4,
			   ...);
|
	    WQ_E =
-               WQ_ARG1 | WQ_FREEZABLE
+               WQ_ARG1
|
	    WQ_E =
-               WQ_ARG1 | WQ_FREEZABLE | WQ_ARG3
+               WQ_ARG1 | WQ_ARG3
|
    fs_wq_fn(
-               WQ_FREEZABLE | WQ_ARG2 | WQ_ARG3
+               WQ_ARG2 | WQ_ARG3
    )
|
    fs_wq_fn(
-               WQ_FREEZABLE | WQ_ARG2
+               WQ_ARG2
    )
|
    fs_wq_fn(
-               WQ_FREEZABLE
+               0
    )
)

@ add_auto_flag @
expression E1;
identifier fs_type;
@@

struct file_system_type fs_type = {
	.fs_flags = E1
+                   | FS_AUTOFREEZE
	,
};
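
The remove_wq_freezable rule above edits source text, but its effect on
the resulting flag value is the same as masking out one bit. A small
user-space sketch of that equivalence follows. The bit values are
illustrative and the helper wq_flags_without_freezable() is hypothetical;
neither is part of this series.

```c
/* Illustrative bit values; treat them as assumptions, not kernel ABI. */
#define WQ_UNBOUND	(1u << 1)
#define WQ_FREEZABLE	(1u << 2)
#define WQ_MEM_RECLAIM	(1u << 3)

/* Hypothetical helper: the flag word a call site ends up passing. */
static unsigned int wq_flags_without_freezable(unsigned int flags)
{
	return flags & ~WQ_FREEZABLE;
}
```

A lone WQ_FREEZABLE argument becomes 0, which matches the last branch of
the rule; any other flags in the expression are left untouched.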

Generated-by: Coccinelle SmPL
Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
---
 fs/nfs/callback.c   | 4 ----
 fs/nfs/fs_context.c | 4 ++--
 2 files changed, 2 insertions(+), 6 deletions(-)

diff --git a/fs/nfs/callback.c b/fs/nfs/callback.c
index 456af7d230cf..f5ba4d6bf2a7 100644
--- a/fs/nfs/callback.c
+++ b/fs/nfs/callback.c
@@ -77,8 +77,6 @@ nfs4_callback_svc(void *vrqstp)
 	int err;
 	struct svc_rqst *rqstp = vrqstp;
 
-	set_freezable();
-
 	while (!kthread_freezable_should_stop(NULL)) {
 
 		if (signal_pending(current))
@@ -109,8 +107,6 @@ nfs41_callback_svc(void *vrqstp)
 	int error;
 	DEFINE_WAIT(wq);
 
-	set_freezable();
-
 	while (!kthread_freezable_should_stop(NULL)) {
 
 		if (signal_pending(current))
diff --git a/fs/nfs/fs_context.c b/fs/nfs/fs_context.c
index 9bcd53d5c7d4..04753962db9a 100644
--- a/fs/nfs/fs_context.c
+++ b/fs/nfs/fs_context.c
@@ -1583,7 +1583,7 @@ struct file_system_type nfs_fs_type = {
 	.init_fs_context	= nfs_init_fs_context,
 	.parameters		= nfs_fs_parameters,
 	.kill_sb		= nfs_kill_super,
-	.fs_flags		= FS_RENAME_DOES_D_MOVE|FS_BINARY_MOUNTDATA,
+	.fs_flags		= FS_RENAME_DOES_D_MOVE|FS_BINARY_MOUNTDATA | FS_AUTOFREEZE,
 };
 MODULE_ALIAS_FS("nfs");
 EXPORT_SYMBOL_GPL(nfs_fs_type);
@@ -1595,7 +1595,7 @@ struct file_system_type nfs4_fs_type = {
 	.init_fs_context	= nfs_init_fs_context,
 	.parameters		= nfs_fs_parameters,
 	.kill_sb		= nfs_kill_super,
-	.fs_flags		= FS_RENAME_DOES_D_MOVE|FS_BINARY_MOUNTDATA,
+	.fs_flags		= FS_RENAME_DOES_D_MOVE|FS_BINARY_MOUNTDATA | FS_AUTOFREEZE,
 };
 MODULE_ALIAS_FS("nfs4");
 MODULE_ALIAS("nfs4");
-- 
2.35.1


^ permalink raw reply related	[flat|nested] 74+ messages in thread

* [RFC v3 15/24] nfsd: replace kthread freezing with auto fs freezing
  2023-01-14  0:33 ` Luis Chamberlain
@ 2023-01-14  0:34   ` Luis Chamberlain
  -1 siblings, 0 replies; 74+ messages in thread
From: Luis Chamberlain @ 2023-01-14  0:34 UTC (permalink / raw)
  To: hch, djwong, song, rafael, gregkh, viro, jack, bvanassche, ebiederm
  Cc: mchehab, keescook, p.raghav, linux-fsdevel, kernel, kexec,
	linux-kernel, Luis Chamberlain

The kernel power management code now allows the VFS to handle
filesystem freezing and thawing. Take advantage of that and remove
the kthread freezing. This is needed so that we properly stop IO
in flight, without races, after userspace has been frozen. Without
this we rely on kthread freezing, whose semantics are loose and
error prone.

The filesystem is therefore in charge of quiescing itself through
its callbacks if it thinks it knows better than the VFS how to
handle it.
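
As a rough illustration of "through its callbacks", here is a user-space
model of a generic freeze/thaw path that defers to a filesystem's
freeze_fs/unfreeze_fs hooks when they are provided. The callback names
follow struct super_operations; everything else (the cut-down structs,
the frozen and fs_quiesced fields, the demo_* and model_* functions and
the -1 error value) is assumed for the sketch and does not come from
this series.

```c
#include <stddef.h>

struct super_block;

/* Only the two freeze-related callbacks are modelled here. */
struct super_operations {
	int (*freeze_fs)(struct super_block *sb);
	int (*unfreeze_fs)(struct super_block *sb);
};

struct super_block {
	const struct super_operations *s_op;
	int frozen;		/* model-only state, not a kernel field */
	int fs_quiesced;	/* model-only: set by the fs callback */
};

/* Hypothetical filesystem callbacks that do their own quiescing. */
static int demo_freeze_fs(struct super_block *sb)
{
	sb->fs_quiesced = 1;
	return 0;
}

static int demo_unfreeze_fs(struct super_block *sb)
{
	sb->fs_quiesced = 0;
	return 0;
}

static const struct super_operations demo_sops = {
	.freeze_fs = demo_freeze_fs,
	.unfreeze_fs = demo_unfreeze_fs,
};

/* Generic path: use the callback when the filesystem provides one. */
static int model_freeze(struct super_block *sb)
{
	int ret;

	if (sb->frozen)
		return -1;	/* model-only error: already frozen */
	if (sb->s_op && sb->s_op->freeze_fs) {
		ret = sb->s_op->freeze_fs(sb);
		if (ret)
			return ret;
	}
	sb->frozen = 1;
	return 0;
}

static int model_thaw(struct super_block *sb)
{
	int ret;

	if (!sb->frozen)
		return -1;	/* model-only error: not frozen */
	if (sb->s_op && sb->s_op->unfreeze_fs) {
		ret = sb->s_op->unfreeze_fs(sb);
		if (ret)
			return ret;
	}
	sb->frozen = 0;
	return 0;
}
```

A superblock without callbacks still freezes in this model; it simply
gets no filesystem-specific quiescing on top of the generic path.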

The following Coccinelle rules were used to remove the now superfluous
freezer calls:

spatch --sp-file fs-freeze-cleanup.cocci --in-place --timeout 120 --dir fs/nfsd --jobs 12 --use-gitgrep

@ remove_set_freezable @
expression time;
statement S, S2;
expression task, current;
@@

(
-       set_freezable();
|
-       if (try_to_freeze())
-               continue;
|
-       try_to_freeze();
|
-       freezable_schedule();
+       schedule();
|
-       freezable_schedule_timeout(time);
+       schedule_timeout(time);
|
-       if (freezing(task)) { S }
|
-       if (freezing(task)) { S }
-       else
	    { S2 }
|
-       freezing(current)
)

@ remove_wq_freezable @
expression WQ_E, WQ_ARG1, WQ_ARG2, WQ_ARG3, WQ_ARG4;
identifier fs_wq_fn;
@@

(
    WQ_E = alloc_workqueue(WQ_ARG1,
-                              WQ_ARG2 | WQ_FREEZABLE,
+                              WQ_ARG2,
			   ...);
|
    WQ_E = alloc_workqueue(WQ_ARG1,
-                              WQ_ARG2 | WQ_FREEZABLE | WQ_ARG3,
+                              WQ_ARG2 | WQ_ARG3,
			   ...);
|
    WQ_E = alloc_workqueue(WQ_ARG1,
-                              WQ_ARG2 | WQ_ARG3 | WQ_FREEZABLE,
+                              WQ_ARG2 | WQ_ARG3,
			   ...);
|
    WQ_E = alloc_workqueue(WQ_ARG1,
-                              WQ_ARG2 | WQ_ARG3 | WQ_FREEZABLE | WQ_ARG4,
+                              WQ_ARG2 | WQ_ARG3 | WQ_ARG4,
			   ...);
|
	    WQ_E =
-               WQ_ARG1 | WQ_FREEZABLE
+               WQ_ARG1
|
	    WQ_E =
-               WQ_ARG1 | WQ_FREEZABLE | WQ_ARG3
+               WQ_ARG1 | WQ_ARG3
|
    fs_wq_fn(
-               WQ_FREEZABLE | WQ_ARG2 | WQ_ARG3
+               WQ_ARG2 | WQ_ARG3
    )
|
    fs_wq_fn(
-               WQ_FREEZABLE | WQ_ARG2
+               WQ_ARG2
    )
|
    fs_wq_fn(
-               WQ_FREEZABLE
+               0
    )
)

@ add_auto_flag @
expression E1;
identifier fs_type;
@@

struct file_system_type fs_type = {
	.fs_flags = E1
+                   | FS_AUTOFREEZE
	,
};

Generated-by: Coccinelle SmPL
Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
---
 fs/nfsd/nfssvc.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/fs/nfsd/nfssvc.c b/fs/nfsd/nfssvc.c
index 1ed29eac80ed..5345c415c2bc 100644
--- a/fs/nfsd/nfssvc.c
+++ b/fs/nfsd/nfssvc.c
@@ -963,8 +963,6 @@ nfsd(void *vrqstp)
 
 	atomic_inc(&nfsdstats.th_cnt);
 
-	set_freezable();
-
 	/*
 	 * The main request loop
 	 */
-- 
2.35.1


^ permalink raw reply related	[flat|nested] 74+ messages in thread

_______________________________________________
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec

^ permalink raw reply related	[flat|nested] 74+ messages in thread

* [RFC v3 16/24] ubifs: replace kthread freezing with auto fs freezing
  2023-01-14  0:33 ` Luis Chamberlain
@ 2023-01-14  0:34   ` Luis Chamberlain
  -1 siblings, 0 replies; 74+ messages in thread
From: Luis Chamberlain @ 2023-01-14  0:34 UTC (permalink / raw)
  To: hch, djwong, song, rafael, gregkh, viro, jack, bvanassche, ebiederm
  Cc: mchehab, keescook, p.raghav, linux-fsdevel, kernel, kexec,
	linux-kernel, Luis Chamberlain

Kernel power management now lets the VFS handle filesystem freezing
and thawing. Take advantage of that and remove the kthread freezing.
This is needed so that we properly stop IO in flight, without races,
after userspace has been frozen. Without this we rely on kthread
freezing, whose semantics are loose and error prone.

The filesystem is therefore in charge of properly quiescing itself
through its callbacks if it thinks it knows better than the VFS how
to handle it.

The following Coccinelle rule was used to remove the now superfluous
freezer calls:

spatch --sp-file fs-freeze-cleanup.cocci --in-place --timeout 120 --dir fs/ubifs --jobs 12 --use-gitgrep

@ remove_set_freezable @
expression time;
statement S, S2;
expression task, current;
@@

(
-       set_freezable();
|
-       if (try_to_freeze())
-               continue;
|
-       try_to_freeze();
|
-       freezable_schedule();
+       schedule();
|
-       freezable_schedule_timeout(time);
+       schedule_timeout(time);
|
-       if (freezing(task)) { S }
|
-       if (freezing(task)) { S }
-       else
	    { S2 }
|
-       freezing(current)
)

@ remove_wq_freezable @
expression WQ_E, WQ_ARG1, WQ_ARG2, WQ_ARG3, WQ_ARG4;
identifier fs_wq_fn;
@@

(
    WQ_E = alloc_workqueue(WQ_ARG1,
-                              WQ_ARG2 | WQ_FREEZABLE,
+                              WQ_ARG2,
			   ...);
|
    WQ_E = alloc_workqueue(WQ_ARG1,
-                              WQ_ARG2 | WQ_FREEZABLE | WQ_ARG3,
+                              WQ_ARG2 | WQ_ARG3,
			   ...);
|
    WQ_E = alloc_workqueue(WQ_ARG1,
-                              WQ_ARG2 | WQ_ARG3 | WQ_FREEZABLE,
+                              WQ_ARG2 | WQ_ARG3,
			   ...);
|
    WQ_E = alloc_workqueue(WQ_ARG1,
-                              WQ_ARG2 | WQ_ARG3 | WQ_FREEZABLE | WQ_ARG4,
+                              WQ_ARG2 | WQ_ARG3 | WQ_ARG4,
			   ...);
|
	    WQ_E =
-               WQ_ARG1 | WQ_FREEZABLE
+               WQ_ARG1
|
	    WQ_E =
-               WQ_ARG1 | WQ_FREEZABLE | WQ_ARG3
+               WQ_ARG1 | WQ_ARG3
|
    fs_wq_fn(
-               WQ_FREEZABLE | WQ_ARG2 | WQ_ARG3
+               WQ_ARG2 | WQ_ARG3
    )
|
    fs_wq_fn(
-               WQ_FREEZABLE | WQ_ARG2
+               WQ_ARG2
    )
|
    fs_wq_fn(
-               WQ_FREEZABLE
+               0
    )
)

@ add_auto_flag @
expression E1;
identifier fs_type;
@@

struct file_system_type fs_type = {
	.fs_flags = E1
+                   | FS_AUTOFREEZE
	,
};

Generated-by: Coccinelle SmPL
Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
---
 fs/ubifs/commit.c | 4 ----
 1 file changed, 4 deletions(-)

diff --git a/fs/ubifs/commit.c b/fs/ubifs/commit.c
index c4fc1047fc07..bdebb1702e88 100644
--- a/fs/ubifs/commit.c
+++ b/fs/ubifs/commit.c
@@ -279,15 +279,11 @@ int ubifs_bg_thread(void *info)
 
 	ubifs_msg(c, "background thread \"%s\" started, PID %d",
 		  c->bgt_name, current->pid);
-	set_freezable();
 
 	while (1) {
 		if (kthread_should_stop())
 			break;
 
-		if (try_to_freeze())
-			continue;
-
 		set_current_state(TASK_INTERRUPTIBLE);
 		/* Check if there is something to do */
 		if (!c->need_bgt) {
-- 
2.35.1


^ permalink raw reply related	[flat|nested] 74+ messages in thread

* [RFC v3 17/24] ksmbd: replace kthread freezing with auto fs freezing
  2023-01-14  0:33 ` Luis Chamberlain
@ 2023-01-14  0:34   ` Luis Chamberlain
  -1 siblings, 0 replies; 74+ messages in thread
From: Luis Chamberlain @ 2023-01-14  0:34 UTC (permalink / raw)
  To: hch, djwong, song, rafael, gregkh, viro, jack, bvanassche, ebiederm
  Cc: mchehab, keescook, p.raghav, linux-fsdevel, kernel, kexec,
	linux-kernel, Luis Chamberlain

Kernel power management now lets the VFS handle filesystem freezing
and thawing. Take advantage of that and remove the kthread freezing.
This is needed so that we properly stop IO in flight, without races,
after userspace has been frozen. Without this we rely on kthread
freezing, whose semantics are loose and error prone.

The filesystem is therefore in charge of properly quiescing itself
through its callbacks if it thinks it knows better than the VFS how
to handle it.

The following Coccinelle rule was used to remove the now superfluous
freezer calls:

spatch --sp-file fs-freeze-cleanup.cocci --in-place --timeout 120 --dir fs/ksmbd --jobs 12 --use-gitgrep

@ remove_set_freezable @
expression time;
statement S, S2;
expression task, current;
@@

(
-       set_freezable();
|
-       if (try_to_freeze())
-               continue;
|
-       try_to_freeze();
|
-       freezable_schedule();
+       schedule();
|
-       freezable_schedule_timeout(time);
+       schedule_timeout(time);
|
-       if (freezing(task)) { S }
|
-       if (freezing(task)) { S }
-       else
	    { S2 }
|
-       freezing(current)
)

@ remove_wq_freezable @
expression WQ_E, WQ_ARG1, WQ_ARG2, WQ_ARG3, WQ_ARG4;
identifier fs_wq_fn;
@@

(
    WQ_E = alloc_workqueue(WQ_ARG1,
-                              WQ_ARG2 | WQ_FREEZABLE,
+                              WQ_ARG2,
			   ...);
|
    WQ_E = alloc_workqueue(WQ_ARG1,
-                              WQ_ARG2 | WQ_FREEZABLE | WQ_ARG3,
+                              WQ_ARG2 | WQ_ARG3,
			   ...);
|
    WQ_E = alloc_workqueue(WQ_ARG1,
-                              WQ_ARG2 | WQ_ARG3 | WQ_FREEZABLE,
+                              WQ_ARG2 | WQ_ARG3,
			   ...);
|
    WQ_E = alloc_workqueue(WQ_ARG1,
-                              WQ_ARG2 | WQ_ARG3 | WQ_FREEZABLE | WQ_ARG4,
+                              WQ_ARG2 | WQ_ARG3 | WQ_ARG4,
			   ...);
|
	    WQ_E =
-               WQ_ARG1 | WQ_FREEZABLE
+               WQ_ARG1
|
	    WQ_E =
-               WQ_ARG1 | WQ_FREEZABLE | WQ_ARG3
+               WQ_ARG1 | WQ_ARG3
|
    fs_wq_fn(
-               WQ_FREEZABLE | WQ_ARG2 | WQ_ARG3
+               WQ_ARG2 | WQ_ARG3
    )
|
    fs_wq_fn(
-               WQ_FREEZABLE | WQ_ARG2
+               WQ_ARG2
    )
|
    fs_wq_fn(
-               WQ_FREEZABLE
+               0
    )
)

@ add_auto_flag @
expression E1;
identifier fs_type;
@@

struct file_system_type fs_type = {
	.fs_flags = E1
+                   | FS_AUTOFREEZE
	,
};
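
As a rough illustration of what the remove_wq_freezable and
add_auto_flag rules above do to the flag expressions they match, here
is a small userspace sketch. The SKETCH_* bit values and both helper
names are made up for this example and are not the kernel's
definitions; only the bit arithmetic is the point.

```c
#include <stdint.h>

/* Hypothetical bit values, chosen only for this sketch. */
#define SKETCH_WQ_MEM_RECLAIM  (1u << 0)
#define SKETCH_WQ_FREEZABLE    (1u << 1)
#define SKETCH_FS_REQUIRES_DEV (1u << 0)
#define SKETCH_FS_AUTOFREEZE   (1u << 1)

/* Same effect as dropping "| WQ_FREEZABLE" from a workqueue flags expression. */
static uint32_t sketch_drop_wq_freezable(uint32_t wq_flags)
{
	return wq_flags & ~SKETCH_WQ_FREEZABLE;
}

/* Same effect as appending "| FS_AUTOFREEZE" to a .fs_flags initializer. */
static uint32_t sketch_add_autofreeze(uint32_t fs_flags)
{
	return fs_flags | SKETCH_FS_AUTOFREEZE;
}
```

The SmPL rules rewrite the source text rather than compute a value at
run time, so the other flags in each expression are left untouched.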

Generated-by: Coccinelle SmPL
Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
---
 fs/ksmbd/connection.c    | 3 ---
 fs/ksmbd/transport_tcp.c | 2 --
 2 files changed, 5 deletions(-)

diff --git a/fs/ksmbd/connection.c b/fs/ksmbd/connection.c
index fd0a288af299..4ed17de1423e 100644
--- a/fs/ksmbd/connection.c
+++ b/fs/ksmbd/connection.c
@@ -292,9 +292,6 @@ int ksmbd_conn_handler_loop(void *p)
 
 	conn->last_active = jiffies;
 	while (ksmbd_conn_alive(conn)) {
-		if (try_to_freeze())
-			continue;
-
 		kvfree(conn->request_buf);
 		conn->request_buf = NULL;
 
diff --git a/fs/ksmbd/transport_tcp.c b/fs/ksmbd/transport_tcp.c
index 4c6bd0b69979..dadb4f306428 100644
--- a/fs/ksmbd/transport_tcp.c
+++ b/fs/ksmbd/transport_tcp.c
@@ -305,8 +305,6 @@ static int ksmbd_tcp_readv(struct tcp_transport *t, struct kvec *iov_orig,
 	ksmbd_msg.msg_controllen = 0;
 
 	for (total_read = 0; to_read; total_read += length, to_read -= length) {
-		try_to_freeze();
-
 		if (!ksmbd_conn_alive(conn)) {
 			total_read = -ESHUTDOWN;
 			break;
-- 
2.35.1


^ permalink raw reply related	[flat|nested] 74+ messages in thread

* [RFC v3 18/24] jffs2: replace kthread freezing with auto fs freezing
  2023-01-14  0:33 ` Luis Chamberlain
@ 2023-01-14  0:34   ` Luis Chamberlain
  -1 siblings, 0 replies; 74+ messages in thread
From: Luis Chamberlain @ 2023-01-14  0:34 UTC (permalink / raw)
  To: hch, djwong, song, rafael, gregkh, viro, jack, bvanassche, ebiederm
  Cc: mchehab, keescook, p.raghav, linux-fsdevel, kernel, kexec,
	linux-kernel, Luis Chamberlain

Kernel power management now lets the VFS handle filesystem freezing
and thawing. Take advantage of that and remove the kthread freezing.
This is needed so that we properly stop IO in flight, without races,
after userspace has been frozen. Without this we rely on kthread
freezing, whose semantics are loose and error prone.

The filesystem is therefore in charge of properly quiescing itself
through its callbacks if it thinks it knows better than the VFS how
to handle it.

The following Coccinelle rule was used to remove the now superfluous
freezer calls:

spatch --sp-file fs-freeze-cleanup.cocci --in-place --timeout 120 --dir fs/jffs2 --jobs 12 --use-gitgrep

@ remove_set_freezable @
expression time;
statement S, S2;
expression task, current;
@@

(
-       set_freezable();
|
-       if (try_to_freeze())
-               continue;
|
-       try_to_freeze();
|
-       freezable_schedule();
+       schedule();
|
-       freezable_schedule_timeout(time);
+       schedule_timeout(time);
|
-       if (freezing(task)) { S }
|
-       if (freezing(task)) { S }
-       else
	    { S2 }
|
-       freezing(current)
)

@ remove_wq_freezable @
expression WQ_E, WQ_ARG1, WQ_ARG2, WQ_ARG3, WQ_ARG4;
identifier fs_wq_fn;
@@

(
    WQ_E = alloc_workqueue(WQ_ARG1,
-                              WQ_ARG2 | WQ_FREEZABLE,
+                              WQ_ARG2,
			   ...);
|
    WQ_E = alloc_workqueue(WQ_ARG1,
-                              WQ_ARG2 | WQ_FREEZABLE | WQ_ARG3,
+                              WQ_ARG2 | WQ_ARG3,
			   ...);
|
    WQ_E = alloc_workqueue(WQ_ARG1,
-                              WQ_ARG2 | WQ_ARG3 | WQ_FREEZABLE,
+                              WQ_ARG2 | WQ_ARG3,
			   ...);
|
    WQ_E = alloc_workqueue(WQ_ARG1,
-                              WQ_ARG2 | WQ_ARG3 | WQ_FREEZABLE | WQ_ARG4,
+                              WQ_ARG2 | WQ_ARG3 | WQ_ARG4,
			   ...);
|
	    WQ_E =
-               WQ_ARG1 | WQ_FREEZABLE
+               WQ_ARG1
|
	    WQ_E =
-               WQ_ARG1 | WQ_FREEZABLE | WQ_ARG3
+               WQ_ARG1 | WQ_ARG3
|
    fs_wq_fn(
-               WQ_FREEZABLE | WQ_ARG2 | WQ_ARG3
+               WQ_ARG2 | WQ_ARG3
    )
|
    fs_wq_fn(
-               WQ_FREEZABLE | WQ_ARG2
+               WQ_ARG2
    )
|
    fs_wq_fn(
-               WQ_FREEZABLE
+               0
    )
)

@ add_auto_flag @
expression E1;
identifier fs_type;
@@

struct file_system_type fs_type = {
	.fs_flags = E1
+                   | FS_AUTOFREEZE
	,
};

Generated-by: Coccinelle SmPL
Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
---
 fs/jffs2/background.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/fs/jffs2/background.c b/fs/jffs2/background.c
index 6da92ecaf66d..e29fdf1ed878 100644
--- a/fs/jffs2/background.c
+++ b/fs/jffs2/background.c
@@ -87,7 +87,6 @@ static int jffs2_garbage_collect_thread(void *_c)
 
 	set_user_nice(current, 10);
 
-	set_freezable();
 	for (;;) {
 		sigprocmask(SIG_UNBLOCK, &hupmask, NULL);
 	again:
@@ -119,7 +118,7 @@ static int jffs2_garbage_collect_thread(void *_c)
 
 		/* Put_super will send a SIGKILL and then wait on the sem.
 		 */
-		while (signal_pending(current) || freezing(current)) {
+		while (signal_pending(current)) {
 			unsigned long signr;
 
 			if (try_to_freeze())
-- 
2.35.1


^ permalink raw reply related	[flat|nested] 74+ messages in thread

* [RFC v3 19/24] jbd2: replace kthread freezing with auto fs freezing
  2023-01-14  0:33 ` Luis Chamberlain
@ 2023-01-14  0:34   ` Luis Chamberlain
  -1 siblings, 0 replies; 74+ messages in thread
From: Luis Chamberlain @ 2023-01-14  0:34 UTC (permalink / raw)
  To: hch, djwong, song, rafael, gregkh, viro, jack, bvanassche, ebiederm
  Cc: mchehab, keescook, p.raghav, linux-fsdevel, kernel, kexec,
	linux-kernel, Luis Chamberlain

Kernel power management now lets the VFS handle filesystem freezing
and thawing. Take advantage of that and remove the kthread freezing.
This is needed so that we properly stop IO in flight, without races,
after userspace has been frozen. Without this we rely on kthread
freezing, whose semantics are loose and error prone.

The filesystem is therefore in charge of properly quiescing itself
through its callbacks if it thinks it knows better than the VFS how
to handle it.

The following Coccinelle rule was used to remove the now superfluous
freezer calls:

spatch --sp-file fs-freeze-cleanup.cocci --in-place --timeout 120 --dir fs/jbd2 --jobs 12 --use-gitgrep

@ remove_set_freezable @
expression time;
statement S, S2;
expression task, current;
@@

(
-       set_freezable();
|
-       if (try_to_freeze())
-               continue;
|
-       try_to_freeze();
|
-       freezable_schedule();
+       schedule();
|
-       freezable_schedule_timeout(time);
+       schedule_timeout(time);
|
-       if (freezing(task)) { S }
|
-       if (freezing(task)) { S }
-       else
	    { S2 }
|
-       freezing(current)
)

@ remove_wq_freezable @
expression WQ_E, WQ_ARG1, WQ_ARG2, WQ_ARG3, WQ_ARG4;
identifier fs_wq_fn;
@@

(
    WQ_E = alloc_workqueue(WQ_ARG1,
-                              WQ_ARG2 | WQ_FREEZABLE,
+                              WQ_ARG2,
			   ...);
|
    WQ_E = alloc_workqueue(WQ_ARG1,
-                              WQ_ARG2 | WQ_FREEZABLE | WQ_ARG3,
+                              WQ_ARG2 | WQ_ARG3,
			   ...);
|
    WQ_E = alloc_workqueue(WQ_ARG1,
-                              WQ_ARG2 | WQ_ARG3 | WQ_FREEZABLE,
+                              WQ_ARG2 | WQ_ARG3,
			   ...);
|
    WQ_E = alloc_workqueue(WQ_ARG1,
-                              WQ_ARG2 | WQ_ARG3 | WQ_FREEZABLE | WQ_ARG4,
+                              WQ_ARG2 | WQ_ARG3 | WQ_ARG4,
			   ...);
|
	    WQ_E =
-               WQ_ARG1 | WQ_FREEZABLE
+               WQ_ARG1
|
	    WQ_E =
-               WQ_ARG1 | WQ_FREEZABLE | WQ_ARG3
+               WQ_ARG1 | WQ_ARG3
|
    fs_wq_fn(
-               WQ_FREEZABLE | WQ_ARG2 | WQ_ARG3
+               WQ_ARG2 | WQ_ARG3
    )
|
    fs_wq_fn(
-               WQ_FREEZABLE | WQ_ARG2
+               WQ_ARG2
    )
|
    fs_wq_fn(
-               WQ_FREEZABLE
+               0
    )
)

@ add_auto_flag @
expression E1;
identifier fs_type;
@@

struct file_system_type fs_type = {
	.fs_flags = E1
+                   | FS_AUTOFREEZE
	,
};
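
For reviewers, the wait decision that kjournald2 keeps once the
freezing() branch is gone can be modelled in userspace roughly as
below. This is only a sketch: struct model_journal, the model_* names
and MODEL_JBD2_UNMOUNT are stand-ins invented for this example, and
model_time_after_eq() merely mimics the wraparound-safe comparison
that time_after_eq() provides.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the journal's "unmount in progress" flag. */
#define MODEL_JBD2_UNMOUNT 0x001u

/* Simplified view of the journal fields the decision looks at. */
struct model_journal {
	uint32_t commit_sequence;  /* most recently committed transaction */
	uint32_t commit_request;   /* most recently requested commit */
	bool     has_running;      /* is there a running transaction? */
	uint32_t running_expires;  /* tick at which it expires */
	unsigned int flags;
};

/* Wraparound-safe "a is at or after b" for 32-bit tick counters. */
static bool model_time_after_eq(uint32_t a, uint32_t b)
{
	return (int32_t)(a - b) >= 0;
}

/* Returns true only when there is nothing for the thread to do. */
static bool model_should_sleep(const struct model_journal *j, uint32_t now)
{
	if (j->commit_sequence != j->commit_request)
		return false;	/* a commit has been requested */
	if (j->has_running && model_time_after_eq(now, j->running_expires))
		return false;	/* the running transaction has expired */
	if (j->flags & MODEL_JBD2_UNMOUNT)
		return false;	/* the journal is being torn down */
	return true;
}
```

The model recomputes the decision from scratch on every call, which is
the behaviour the thread needs on each pass through its main loop.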

Generated-by: Coccinelle SmPL
Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
---
 fs/jbd2/journal.c | 54 ++++++++++++++++++-----------------------------
 1 file changed, 20 insertions(+), 34 deletions(-)

diff --git a/fs/jbd2/journal.c b/fs/jbd2/journal.c
index e80c781731f8..99a4db5b40fc 100644
--- a/fs/jbd2/journal.c
+++ b/fs/jbd2/journal.c
@@ -169,6 +169,8 @@ static int kjournald2(void *arg)
 {
 	journal_t *journal = arg;
 	transaction_t *transaction;
+	DEFINE_WAIT(wait);
+	int should_sleep = 1;
 
 	/*
 	 * Set up an interval timer which can be used to trigger a commit wakeup
@@ -176,8 +178,6 @@ static int kjournald2(void *arg)
 	 */
 	timer_setup(&journal->j_commit_timer, commit_timeout, 0);
 
-	set_freezable();
-
 	/* Record that the journal thread is running */
 	journal->j_task = current;
 	wake_up(&journal->j_wait_done_commit);
@@ -212,41 +212,27 @@ static int kjournald2(void *arg)
 	}
 
 	wake_up(&journal->j_wait_done_commit);
-	if (freezing(current)) {
-		/*
-		 * The simpler the better. Flushing journal isn't a
-		 * good idea, because that depends on threads that may
-		 * be already stopped.
-		 */
-		jbd2_debug(1, "Now suspending kjournald2\n");
+	/*
+	 * We assume on resume that commits are already there,
+	 * so we don't sleep
+	 */
+
+	prepare_to_wait(&journal->j_wait_commit, &wait,
+			TASK_INTERRUPTIBLE);
+	if (journal->j_commit_sequence != journal->j_commit_request)
+		should_sleep = 0;
+	transaction = journal->j_running_transaction;
+	if (transaction && time_after_eq(jiffies,
+					transaction->t_expires))
+		should_sleep = 0;
+	if (journal->j_flags & JBD2_UNMOUNT)
+		should_sleep = 0;
+	if (should_sleep) {
 		write_unlock(&journal->j_state_lock);
-		try_to_freeze();
+		schedule();
 		write_lock(&journal->j_state_lock);
-	} else {
-		/*
-		 * We assume on resume that commits are already there,
-		 * so we don't sleep
-		 */
-		DEFINE_WAIT(wait);
-		int should_sleep = 1;
-
-		prepare_to_wait(&journal->j_wait_commit, &wait,
-				TASK_INTERRUPTIBLE);
-		if (journal->j_commit_sequence != journal->j_commit_request)
-			should_sleep = 0;
-		transaction = journal->j_running_transaction;
-		if (transaction && time_after_eq(jiffies,
-						transaction->t_expires))
-			should_sleep = 0;
-		if (journal->j_flags & JBD2_UNMOUNT)
-			should_sleep = 0;
-		if (should_sleep) {
-			write_unlock(&journal->j_state_lock);
-			schedule();
-			write_lock(&journal->j_state_lock);
-		}
-		finish_wait(&journal->j_wait_commit, &wait);
 	}
+	finish_wait(&journal->j_wait_commit, &wait);
 
 	jbd2_debug(1, "kjournald2 wakes\n");
 
-- 
2.35.1


^ permalink raw reply related	[flat|nested] 74+ messages in thread

* [RFC v3 19/24] jbd2: replace kthread freezing with auto fs freezing
@ 2023-01-14  0:34   ` Luis Chamberlain
  0 siblings, 0 replies; 74+ messages in thread
From: Luis Chamberlain @ 2023-01-14  0:34 UTC (permalink / raw)
  To: hch, djwong, song, rafael, gregkh, viro, jack, bvanassche, ebiederm
  Cc: mchehab, keescook, p.raghav, linux-fsdevel, kernel, kexec,
	linux-kernel, Luis Chamberlain

Kernel power management now lets the VFS handle filesystem freezing
and thawing. Take advantage of that and remove the kthread freezing.
This is needed so that we properly stop IO in flight, without races,
after userspace has been frozen. Without this we rely on kthread
freezing, whose semantics are loose and error prone.

The filesystem is therefore in charge of properly quiescing itself
through its callbacks if it thinks it knows better than the VFS how
to handle it.

The following Coccinelle rule was used to remove the now superfluous
freezer calls:

spatch --sp-file fs-freeze-cleanup.cocci --in-place --timeout 120 --dir fs/jbd2 --jobs 12 --use-gitgrep

@ remove_set_freezable @
expression time;
statement S, S2;
expression task, current;
@@

(
-       set_freezable();
|
-       if (try_to_freeze())
-               continue;
|
-       try_to_freeze();
|
-       freezable_schedule();
+       schedule();
|
-       freezable_schedule_timeout(time);
+       schedule_timeout(time);
|
-       if (freezing(task)) { S }
|
-       if (freezing(task)) { S }
-       else
	    { S2 }
|
-       freezing(current)
)

@ remove_wq_freezable @
expression WQ_E, WQ_ARG1, WQ_ARG2, WQ_ARG3, WQ_ARG4;
identifier fs_wq_fn;
@@

(
    WQ_E = alloc_workqueue(WQ_ARG1,
-                              WQ_ARG2 | WQ_FREEZABLE,
+                              WQ_ARG2,
			   ...);
|
    WQ_E = alloc_workqueue(WQ_ARG1,
-                              WQ_ARG2 | WQ_FREEZABLE | WQ_ARG3,
+                              WQ_ARG2 | WQ_ARG3,
			   ...);
|
    WQ_E = alloc_workqueue(WQ_ARG1,
-                              WQ_ARG2 | WQ_ARG3 | WQ_FREEZABLE,
+                              WQ_ARG2 | WQ_ARG3,
			   ...);
|
    WQ_E = alloc_workqueue(WQ_ARG1,
-                              WQ_ARG2 | WQ_ARG3 | WQ_FREEZABLE | WQ_ARG4,
+                              WQ_ARG2 | WQ_ARG3 | WQ_ARG4,
			   ...);
|
	    WQ_E =
-               WQ_ARG1 | WQ_FREEZABLE
+               WQ_ARG1
|
	    WQ_E =
-               WQ_ARG1 | WQ_FREEZABLE | WQ_ARG3
+               WQ_ARG1 | WQ_ARG3
|
    fs_wq_fn(
-               WQ_FREEZABLE | WQ_ARG2 | WQ_ARG3
+               WQ_ARG2 | WQ_ARG3
    )
|
    fs_wq_fn(
-               WQ_FREEZABLE | WQ_ARG2
+               WQ_ARG2
    )
|
    fs_wq_fn(
-               WQ_FREEZABLE
+               0
    )
)

@ add_auto_flag @
expression E1;
identifier fs_type;
@@

struct file_system_type fs_type = {
	.fs_flags = E1
+                   | FS_AUTOFREEZE
	,
};

Generated-by: Coccinelle SmPL
Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
---
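
For review, a minimal userspace sketch of the sleep decision in the hunk
below. It is not kernel code: struct journal_model and
should_sleep_this_pass() are hypothetical stand-ins for the journal_t
fields and the inline checks in kjournald2(). It assumes the function
comes back around to re-evaluate these conditions, as kjournald2() does
through its loop label. The sketch starts the flag at "sleep" on every
pass, which is what the old block-scoped "int should_sleep = 1" did.
With the declaration hoisted to function scope the initializer runs only
once, so the flag probably wants to be reset before prepare_to_wait().

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the journal state read by kjournald2(). */
struct journal_model {
	unsigned int commit_sequence;	/* models j_commit_sequence */
	unsigned int commit_request;	/* models j_commit_request */
	bool txn_expired;		/* models time_after_eq(jiffies, t_expires) */
	bool unmounting;		/* models j_flags & JBD2_UNMOUNT */
};

/*
 * Decide whether the thread may sleep on this pass. The flag is set to
 * true at the start of every call, so one busy pass cannot leave it
 * stuck at false for later passes.
 */
bool should_sleep_this_pass(const struct journal_model *j)
{
	bool should_sleep = true;

	if (j->commit_sequence != j->commit_request)
		should_sleep = false;	/* a commit has been requested */
	if (j->txn_expired)
		should_sleep = false;	/* running transaction timed out */
	if (j->unmounting)
		should_sleep = false;	/* journal is being torn down */

	return should_sleep;
}
```

An idle journal sleeps, a journal with a pending commit request does not,
and an idle journal sleeps again on the following pass.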
 fs/jbd2/journal.c | 54 ++++++++++++++++++-----------------------------
 1 file changed, 20 insertions(+), 34 deletions(-)

diff --git a/fs/jbd2/journal.c b/fs/jbd2/journal.c
index e80c781731f8..99a4db5b40fc 100644
--- a/fs/jbd2/journal.c
+++ b/fs/jbd2/journal.c
@@ -169,6 +169,8 @@ static int kjournald2(void *arg)
 {
 	journal_t *journal = arg;
 	transaction_t *transaction;
+	DEFINE_WAIT(wait);
+	int should_sleep = 1;
 
 	/*
 	 * Set up an interval timer which can be used to trigger a commit wakeup
@@ -176,8 +178,6 @@ static int kjournald2(void *arg)
 	 */
 	timer_setup(&journal->j_commit_timer, commit_timeout, 0);
 
-	set_freezable();
-
 	/* Record that the journal thread is running */
 	journal->j_task = current;
 	wake_up(&journal->j_wait_done_commit);
@@ -212,41 +212,27 @@ static int kjournald2(void *arg)
 	}
 
 	wake_up(&journal->j_wait_done_commit);
-	if (freezing(current)) {
-		/*
-		 * The simpler the better. Flushing journal isn't a
-		 * good idea, because that depends on threads that may
-		 * be already stopped.
-		 */
-		jbd2_debug(1, "Now suspending kjournald2\n");
+	/*
+	 * We assume on resume that commits are already there,
+	 * so we don't sleep
+	 */
+
+	prepare_to_wait(&journal->j_wait_commit, &wait,
+			TASK_INTERRUPTIBLE);
+	if (journal->j_commit_sequence != journal->j_commit_request)
+		should_sleep = 0;
+	transaction = journal->j_running_transaction;
+	if (transaction && time_after_eq(jiffies,
+					transaction->t_expires))
+		should_sleep = 0;
+	if (journal->j_flags & JBD2_UNMOUNT)
+		should_sleep = 0;
+	if (should_sleep) {
 		write_unlock(&journal->j_state_lock);
-		try_to_freeze();
+		schedule();
 		write_lock(&journal->j_state_lock);
-	} else {
-		/*
-		 * We assume on resume that commits are already there,
-		 * so we don't sleep
-		 */
-		DEFINE_WAIT(wait);
-		int should_sleep = 1;
-
-		prepare_to_wait(&journal->j_wait_commit, &wait,
-				TASK_INTERRUPTIBLE);
-		if (journal->j_commit_sequence != journal->j_commit_request)
-			should_sleep = 0;
-		transaction = journal->j_running_transaction;
-		if (transaction && time_after_eq(jiffies,
-						transaction->t_expires))
-			should_sleep = 0;
-		if (journal->j_flags & JBD2_UNMOUNT)
-			should_sleep = 0;
-		if (should_sleep) {
-			write_unlock(&journal->j_state_lock);
-			schedule();
-			write_lock(&journal->j_state_lock);
-		}
-		finish_wait(&journal->j_wait_commit, &wait);
 	}
+	finish_wait(&journal->j_wait_commit, &wait);
 
 	jbd2_debug(1, "kjournald2 wakes\n");
 
-- 
2.35.1


_______________________________________________
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec

* [RFC v3 20/24] coredump: drop freezer usage
  2023-01-14  0:33 ` Luis Chamberlain
@ 2023-01-14  0:34   ` Luis Chamberlain
  -1 siblings, 0 replies; 74+ messages in thread
From: Luis Chamberlain @ 2023-01-14  0:34 UTC (permalink / raw)
  To: hch, djwong, song, rafael, gregkh, viro, jack, bvanassche, ebiederm
  Cc: mchehab, keescook, p.raghav, linux-fsdevel, kernel, kexec,
	linux-kernel, Luis Chamberlain

Kernel power management now lets the VFS handle filesystem
freezing and thawing. Take advantage of that and remove the kthread
freezing. This is needed so that we properly stop IO in flight,
without races, after userspace has been frozen. Without this we rely
on kthread freezing, whose semantics are loose and error prone.

The filesystem is therefore in charge of quiescing itself through
its callbacks if it thinks it knows better than the VFS.

Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
---
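
For review, a minimal userspace sketch of the behaviour change in the
hunk below. struct task_model and both helper names are hypothetical; the
two booleans stand in for fatal_signal_pending(current) and
freezing(current). It shows the one case where the result differs: a task
that is being frozen but has no fatal signal pending no longer reports
the dump as interrupted.

```c
#include <stdbool.h>

/* Hypothetical model of the two conditions dump_interrupted() looked at. */
struct task_model {
	bool fatal_signal;	/* models fatal_signal_pending(current) */
	bool freezing;		/* models freezing(current) */
};

/* Behaviour before this patch: a pending freeze also interrupts the dump. */
bool dump_interrupted_old(const struct task_model *t)
{
	return t->fatal_signal || t->freezing;
}

/* Behaviour after this patch: only a fatal signal interrupts the dump. */
bool dump_interrupted_new(const struct task_model *t)
{
	return t->fatal_signal;
}
```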
 fs/coredump.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/coredump.c b/fs/coredump.c
index f27d734f3102..3a0a5c946bf8 100644
--- a/fs/coredump.c
+++ b/fs/coredump.c
@@ -459,7 +459,7 @@ static bool dump_interrupted(void)
 	 * but then we need to teach dump_write() to restart and clear
 	 * TIF_SIGPENDING.
 	 */
-	return fatal_signal_pending(current) || freezing(current);
+	return fatal_signal_pending(current);
 }
 
 static void wait_for_dump_helpers(struct file *file)
-- 
2.35.1


* [RFC v3 21/24] ecryptfs: replace kthread freezing with auto fs freezing
  2023-01-14  0:33 ` Luis Chamberlain
@ 2023-01-14  0:34   ` Luis Chamberlain
  -1 siblings, 0 replies; 74+ messages in thread
From: Luis Chamberlain @ 2023-01-14  0:34 UTC (permalink / raw)
  To: hch, djwong, song, rafael, gregkh, viro, jack, bvanassche, ebiederm
  Cc: mchehab, keescook, p.raghav, linux-fsdevel, kernel, kexec,
	linux-kernel, Luis Chamberlain

Kernel power management now lets the VFS handle filesystem
freezing and thawing. Take advantage of that and remove the kthread
freezing. This is needed so that we properly stop IO in flight,
without races, after userspace has been frozen. Without this we rely
on kthread freezing, whose semantics are loose and error prone.

The filesystem is therefore in charge of quiescing itself through
its callbacks if it thinks it knows better than the VFS.

The following Coccinelle rule was used to remove the now superfluous
freezer calls:

spatch --sp-file fs-freeze-cleanup.cocci --in-place --timeout 120 --dir fs/ecryptfs/ --jobs 12 --use-gitgrep

@ remove_set_freezable @
expression time;
statement S, S2;
expression task, current;
@@

(
-       set_freezable();
|
-       if (try_to_freeze())
-               continue;
|
-       try_to_freeze();
|
-       freezable_schedule();
+       schedule();
|
-       freezable_schedule_timeout(time);
+       schedule_timeout(time);
|
-       if (freezing(task)) { S }
|
-       if (freezing(task)) { S }
-       else
	    { S2 }
|
-       freezing(current)
)

@ remove_wq_freezable @
expression WQ_E, WQ_ARG1, WQ_ARG2, WQ_ARG3, WQ_ARG4;
identifier fs_wq_fn;
@@

(
    WQ_E = alloc_workqueue(WQ_ARG1,
-                              WQ_ARG2 | WQ_FREEZABLE,
+                              WQ_ARG2,
			   ...);
|
    WQ_E = alloc_workqueue(WQ_ARG1,
-                              WQ_ARG2 | WQ_FREEZABLE | WQ_ARG3,
+                              WQ_ARG2 | WQ_ARG3,
			   ...);
|
    WQ_E = alloc_workqueue(WQ_ARG1,
-                              WQ_ARG2 | WQ_ARG3 | WQ_FREEZABLE,
+                              WQ_ARG2 | WQ_ARG3,
			   ...);
|
    WQ_E = alloc_workqueue(WQ_ARG1,
-                              WQ_ARG2 | WQ_ARG3 | WQ_FREEZABLE | WQ_ARG4,
+                              WQ_ARG2 | WQ_ARG3 | WQ_ARG4,
			   ...);
|
	    WQ_E =
-               WQ_ARG1 | WQ_FREEZABLE
+               WQ_ARG1
|
	    WQ_E =
-               WQ_ARG1 | WQ_FREEZABLE | WQ_ARG3
+               WQ_ARG1 | WQ_ARG3
|
    fs_wq_fn(
-               WQ_FREEZABLE | WQ_ARG2 | WQ_ARG3
+               WQ_ARG2 | WQ_ARG3
    )
|
    fs_wq_fn(
-               WQ_FREEZABLE | WQ_ARG2
+               WQ_ARG2
    )
|
    fs_wq_fn(
-               WQ_FREEZABLE
+               0
    )
)

@ add_auto_flag @
expression E1;
identifier fs_type;
@@

struct file_system_type fs_type = {
	.fs_flags = E1
+                   | FS_AUTOFREEZE
	,
};

Generated-by: Coccinelle SmPL
Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
---
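
For review, a minimal userspace sketch of what the add_auto_flag rule
produces for a filesystem whose flags were 0. It is not kernel code:
struct fs_type_model and fs_opted_in() are hypothetical, and the bit
chosen for FS_AUTOFREEZE_MODEL is made up, since the real value is
whatever this series defines in include/linux/fs.h. The point is only
that "0 | FLAG" is a valid initializer equal to the flag itself, so a
per-filesystem opt-in test sees ecryptfs as converted.

```c
#include <stdbool.h>

/* Illustrative flag values for the model only. */
#define FS_REQUIRES_DEV_MODEL		1
#define FS_BINARY_MOUNTDATA_MODEL	2
/* Hypothetical bit; the real FS_AUTOFREEZE value comes from this series. */
#define FS_AUTOFREEZE_MODEL		(1 << 20)

/* Hypothetical cut-down version of struct file_system_type. */
struct fs_type_model {
	const char *name;
	int fs_flags;
};

/* True when the filesystem has opted in to automatic freeze and thaw. */
bool fs_opted_in(const struct fs_type_model *t)
{
	return (t->fs_flags & FS_AUTOFREEZE_MODEL) != 0;
}
```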
 fs/ecryptfs/kthread.c | 1 -
 fs/ecryptfs/main.c    | 2 +-
 2 files changed, 1 insertion(+), 2 deletions(-)

diff --git a/fs/ecryptfs/kthread.c b/fs/ecryptfs/kthread.c
index ae4cb4e2e134..ff1226615f03 100644
--- a/fs/ecryptfs/kthread.c
+++ b/fs/ecryptfs/kthread.c
@@ -41,7 +41,6 @@ static struct task_struct *ecryptfs_kthread;
  */
 static int ecryptfs_threadfn(void *ignored)
 {
-	set_freezable();
 	while (1)  {
 		struct ecryptfs_open_req *req;
 
diff --git a/fs/ecryptfs/main.c b/fs/ecryptfs/main.c
index 2dc927ba067f..a91f5184edb7 100644
--- a/fs/ecryptfs/main.c
+++ b/fs/ecryptfs/main.c
@@ -637,7 +637,7 @@ static struct file_system_type ecryptfs_fs_type = {
 	.name = "ecryptfs",
 	.mount = ecryptfs_mount,
 	.kill_sb = ecryptfs_kill_block_super,
-	.fs_flags = 0
+	.fs_flags = 0| FS_AUTOFREEZE
 };
 MODULE_ALIAS_FS("ecryptfs");
 
-- 
2.35.1


* [RFC v3 22/24] fscache: replace kthread freezing with auto fs freezing
  2023-01-14  0:33 ` Luis Chamberlain
@ 2023-01-14  0:34   ` Luis Chamberlain
  -1 siblings, 0 replies; 74+ messages in thread
From: Luis Chamberlain @ 2023-01-14  0:34 UTC (permalink / raw)
  To: hch, djwong, song, rafael, gregkh, viro, jack, bvanassche, ebiederm
  Cc: mchehab, keescook, p.raghav, linux-fsdevel, kernel, kexec,
	linux-kernel, Luis Chamberlain

Kernel power management now lets the VFS handle filesystem
freezing and thawing. Take advantage of that and remove the kthread
freezing. This is needed so that we properly stop IO in flight,
without races, after userspace has been frozen. Without this we rely
on kthread freezing, whose semantics are loose and error prone.

The filesystem is therefore in charge of quiescing itself through
its callbacks if it thinks it knows better than the VFS.

The following Coccinelle rule was used to remove the now superfluous
freezer calls:

spatch --sp-file fs-freeze-cleanup.cocci --in-place --timeout 120 --dir fs/fscache/ --jobs 12 --use-gitgrep

@ remove_set_freezable @
expression time;
statement S, S2;
expression task, current;
@@

(
-       set_freezable();
|
-       if (try_to_freeze())
-               continue;
|
-       try_to_freeze();
|
-       freezable_schedule();
+       schedule();
|
-       freezable_schedule_timeout(time);
+       schedule_timeout(time);
|
-       if (freezing(task)) { S }
|
-       if (freezing(task)) { S }
-       else
	    { S2 }
|
-       freezing(current)
)

@ remove_wq_freezable @
expression WQ_E, WQ_ARG1, WQ_ARG2, WQ_ARG3, WQ_ARG4;
identifier fs_wq_fn;
@@

(
    WQ_E = alloc_workqueue(WQ_ARG1,
-                              WQ_ARG2 | WQ_FREEZABLE,
+                              WQ_ARG2,
			   ...);
|
    WQ_E = alloc_workqueue(WQ_ARG1,
-                              WQ_ARG2 | WQ_FREEZABLE | WQ_ARG3,
+                              WQ_ARG2 | WQ_ARG3,
			   ...);
|
    WQ_E = alloc_workqueue(WQ_ARG1,
-                              WQ_ARG2 | WQ_ARG3 | WQ_FREEZABLE,
+                              WQ_ARG2 | WQ_ARG3,
			   ...);
|
    WQ_E = alloc_workqueue(WQ_ARG1,
-                              WQ_ARG2 | WQ_ARG3 | WQ_FREEZABLE | WQ_ARG4,
+                              WQ_ARG2 | WQ_ARG3 | WQ_ARG4,
			   ...);
|
	    WQ_E =
-               WQ_ARG1 | WQ_FREEZABLE
+               WQ_ARG1
|
	    WQ_E =
-               WQ_ARG1 | WQ_FREEZABLE | WQ_ARG3
+               WQ_ARG1 | WQ_ARG3
|
    fs_wq_fn(
-               WQ_FREEZABLE | WQ_ARG2 | WQ_ARG3
+               WQ_ARG2 | WQ_ARG3
    )
|
    fs_wq_fn(
-               WQ_FREEZABLE | WQ_ARG2
+               WQ_ARG2
    )
|
    fs_wq_fn(
-               WQ_FREEZABLE
+               0
    )
)

@ add_auto_flag @
expression E1;
identifier fs_type;
@@

struct file_system_type fs_type = {
	.fs_flags = E1
+                   | FS_AUTOFREEZE
	,
};

Generated-by: Coccinelle SmPL
Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
---
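
For review, a minimal userspace sketch of what the remove_wq_freezable
rule amounts to at run time. The rule itself rewrites source text; the
helper below, wq_flags_without_freezable(), is a hypothetical equivalent
that masks the bit out of a flags word. The *_MODEL constants are
illustrative stand-ins for the flags in include/linux/workqueue.h.

```c
/* Illustrative bit values for the model only. */
#define WQ_UNBOUND_MODEL	(1u << 1)
#define WQ_FREEZABLE_MODEL	(1u << 2)
#define WQ_MEM_RECLAIM_MODEL	(1u << 3)

/*
 * Drop the freezable bit and leave every other flag untouched. A flags
 * word that held only the freezable bit becomes 0, matching the last
 * alternative of the rule.
 */
unsigned int wq_flags_without_freezable(unsigned int flags)
{
	return flags & ~WQ_FREEZABLE_MODEL;
}
```

For the fscache call in the hunk below, the input models
WQ_UNBOUND | WQ_FREEZABLE and the result models WQ_UNBOUND.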
 fs/fscache/main.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/fscache/main.c b/fs/fscache/main.c
index dad85fd84f6f..a6ae36da2315 100644
--- a/fs/fscache/main.c
+++ b/fs/fscache/main.c
@@ -75,7 +75,7 @@ static int __init fscache_init(void)
 {
 	int ret = -ENOMEM;
 
-	fscache_wq = alloc_workqueue("fscache", WQ_UNBOUND | WQ_FREEZABLE, 0);
+	fscache_wq = alloc_workqueue("fscache", WQ_UNBOUND, 0);
 	if (!fscache_wq)
 		goto error_wq;
 
-- 
2.35.1


* [RFC v3 23/24] lockd: replace kthread freezing with auto fs freezing
  2023-01-14  0:33 ` Luis Chamberlain
@ 2023-01-14  0:34   ` Luis Chamberlain
  -1 siblings, 0 replies; 74+ messages in thread
From: Luis Chamberlain @ 2023-01-14  0:34 UTC (permalink / raw)
  To: hch, djwong, song, rafael, gregkh, viro, jack, bvanassche, ebiederm
  Cc: mchehab, keescook, p.raghav, linux-fsdevel, kernel, kexec,
	linux-kernel, Luis Chamberlain

Kernel power management now lets the VFS handle filesystem
freezing and thawing. Take advantage of that and remove the kthread
freezing. This is needed so that we properly stop IO in flight,
without races, after userspace has been frozen. Without this we rely
on kthread freezing, whose semantics are loose and error prone.

The filesystem is therefore in charge of quiescing itself through
its callbacks if it thinks it knows better than the VFS.

The following Coccinelle rule was used to remove the now superfluous
freezer calls:

spatch --sp-file fs-freeze-cleanup.cocci --in-place --timeout 120 --dir fs/lockd --jobs 12 --use-gitgrep

@ remove_set_freezable @
expression time;
statement S, S2;
expression task, current;
@@

(
-       set_freezable();
|
-       if (try_to_freeze())
-               continue;
|
-       try_to_freeze();
|
-       freezable_schedule();
+       schedule();
|
-       freezable_schedule_timeout(time);
+       schedule_timeout(time);
|
-       if (freezing(task)) { S }
|
-       if (freezing(task)) { S }
-       else
	    { S2 }
|
-       freezing(current)
)

@ remove_wq_freezable @
expression WQ_E, WQ_ARG1, WQ_ARG2, WQ_ARG3, WQ_ARG4;
identifier fs_wq_fn;
@@

(
    WQ_E = alloc_workqueue(WQ_ARG1,
-                              WQ_ARG2 | WQ_FREEZABLE,
+                              WQ_ARG2,
			   ...);
|
    WQ_E = alloc_workqueue(WQ_ARG1,
-                              WQ_ARG2 | WQ_FREEZABLE | WQ_ARG3,
+                              WQ_ARG2 | WQ_ARG3,
			   ...);
|
    WQ_E = alloc_workqueue(WQ_ARG1,
-                              WQ_ARG2 | WQ_ARG3 | WQ_FREEZABLE,
+                              WQ_ARG2 | WQ_ARG3,
			   ...);
|
    WQ_E = alloc_workqueue(WQ_ARG1,
-                              WQ_ARG2 | WQ_ARG3 | WQ_FREEZABLE | WQ_ARG4,
+                              WQ_ARG2 | WQ_ARG3 | WQ_ARG4,
			   ...);
|
	    WQ_E =
-               WQ_ARG1 | WQ_FREEZABLE
+               WQ_ARG1
|
	    WQ_E =
-               WQ_ARG1 | WQ_FREEZABLE | WQ_ARG3
+               WQ_ARG1 | WQ_ARG3
|
    fs_wq_fn(
-               WQ_FREEZABLE | WQ_ARG2 | WQ_ARG3
+               WQ_ARG2 | WQ_ARG3
    )
|
    fs_wq_fn(
-               WQ_FREEZABLE | WQ_ARG2
+               WQ_ARG2
    )
|
    fs_wq_fn(
-               WQ_FREEZABLE
+               0
    )
)

@ add_auto_flag @
expression E1;
identifier fs_type;
@@

struct file_system_type fs_type = {
	.fs_flags = E1
+                   | FS_AUTOFREEZE
	,
};

Generated-by: Coccinelle SmPL
Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
---
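
For review, a minimal userspace sketch of what dropping set_freezable()
means for a kernel thread such as lockd. It is not kernel code: struct
kthread_model and the model_*() helpers are hypothetical, and
PF_NOFREEZE_MODEL is an illustrative bit. The sketch assumes the usual
rule that kernel threads start out exempt from the task freezer and only
become freezable once they clear that exemption themselves.

```c
#include <stdbool.h>

/* Illustrative stand-in for the "do not freeze this task" flag. */
#define PF_NOFREEZE_MODEL	0x8000u

/* Hypothetical cut-down task with only the flags word. */
struct kthread_model {
	unsigned int flags;
};

/* A new kernel thread starts out exempt from the freezer. */
void model_kthread_init(struct kthread_model *t)
{
	t->flags = PF_NOFREEZE_MODEL;
}

/* What set_freezable() does in this model: clear the exemption. */
void model_set_freezable(struct kthread_model *t)
{
	t->flags &= ~PF_NOFREEZE_MODEL;
}

/* True when the task freezer would try to stop this thread. */
bool model_freezer_targets(const struct kthread_model *t)
{
	return (t->flags & PF_NOFREEZE_MODEL) == 0;
}
```

After this patch lockd stays in the state model_kthread_init() leaves it
in, so the task freezer skips it and quiescing is left to the filesystem
freeze.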
 fs/lockd/clntproc.c | 1 -
 fs/lockd/svc.c      | 3 ---
 2 files changed, 4 deletions(-)

diff --git a/fs/lockd/clntproc.c b/fs/lockd/clntproc.c
index e875a3571c41..996f5b4d5d17 100644
--- a/fs/lockd/clntproc.c
+++ b/fs/lockd/clntproc.c
@@ -247,7 +247,6 @@ static int nlm_wait_on_grace(wait_queue_head_t *queue)
 	prepare_to_wait(queue, &wait, TASK_INTERRUPTIBLE);
 	if (!signalled ()) {
 		schedule_timeout(NLMCLNT_GRACE_WAIT);
-		try_to_freeze();
 		if (!signalled ())
 			status = 0;
 	}
diff --git a/fs/lockd/svc.c b/fs/lockd/svc.c
index e56d85335599..bfccbd6f20ed 100644
--- a/fs/lockd/svc.c
+++ b/fs/lockd/svc.c
@@ -135,9 +135,6 @@ lockd(void *vrqstp)
 	struct net *net = &init_net;
 	struct lockd_net *ln = net_generic(net, lockd_net_id);
 
-	/* try_to_freeze() is called from svc_recv() */
-	set_freezable();
-
 	/* Allow SIGKILL to tell lockd to drop all of its locks */
 	allow_signal(SIGKILL);
 
-- 
2.35.1


* [RFC v3 24/24] fs: remove FS_AUTOFREEZE
  2023-01-14  0:33 ` Luis Chamberlain
@ 2023-01-14  0:34   ` Luis Chamberlain
  -1 siblings, 0 replies; 74+ messages in thread
From: Luis Chamberlain @ 2023-01-14  0:34 UTC (permalink / raw)
  To: hch, djwong, song, rafael, gregkh, viro, jack, bvanassche, ebiederm
  Cc: mchehab, keescook, p.raghav, linux-fsdevel, kernel, kexec,
	linux-kernel, Luis Chamberlain

Now that all filesystems have been converted over to stop using the
kthread freezer APIs we can remove FS_AUTOFREEZE and its check.
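
As a rough illustration of what dropping the check means, here is a minimal
user-space sketch. It is not the kernel code: struct flag_sb, its fields and
the two helper names are made up for this example, and only the two flag
values are taken from the include/linux/fs.h hunk in this patch.

```c
#include <stdbool.h>

/* Values as they appear in the include/linux/fs.h hunk of this patch. */
#define FS_ALLOW_IDMAP  32
#define FS_AUTOFREEZE   (1 << 16)

/* Hypothetical stand-in for the two things the real check looks at. */
struct flag_sb {
	int fs_flags;          /* models sb->s_type->fs_flags */
	bool has_backing_dev;  /* models sb->s_bdi != &noop_backing_dev_info */
};

/* Before this patch: only filesystem types that opted in are frozen. */
static bool should_freeze_before(const struct flag_sb *sb)
{
	if (!(sb->fs_flags & FS_AUTOFREEZE))
		return false;
	return sb->has_backing_dev;
}

/* After this patch: the opt-in gate is gone, the bdi check remains. */
static bool should_freeze_after(const struct flag_sb *sb)
{
	return sb->has_backing_dev;
}
```

In this model a filesystem type that never set the flag is skipped by the
first helper and frozen by the second, which is why the flag can only go
away once every filesystem has been converted.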

The following Coccinelle rule was used to remove the flag:

spatch --sp-file remove-fs-autofreezeflag.cocci --in-place --timeout 120 --dir fs/ --jobs 12 --use-gitgrep
@ rm_auto_flag @
expression E1;
identifier fs_type;
@@

struct file_system_type fs_type = {
	.fs_flags = E1
-                   | FS_AUTOFREEZE
	,
};

Generated-by: Coccinelle SmPL
Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
---
 fs/btrfs/super.c     | 4 ++--
 fs/cifs/cifsfs.c     | 4 ++--
 fs/ecryptfs/main.c   | 2 +-
 fs/ext4/super.c      | 6 +++---
 fs/f2fs/super.c      | 2 +-
 fs/gfs2/ops_fstype.c | 4 ++--
 fs/jfs/super.c       | 2 +-
 fs/nfs/fs_context.c  | 4 ++--
 fs/super.c           | 2 --
 fs/xfs/xfs_super.c   | 2 +-
 include/linux/fs.h   | 1 -
 11 files changed, 15 insertions(+), 18 deletions(-)

diff --git a/fs/btrfs/super.c b/fs/btrfs/super.c
index 35059fe276ac..433ce221dc5c 100644
--- a/fs/btrfs/super.c
+++ b/fs/btrfs/super.c
@@ -2138,7 +2138,7 @@ static struct file_system_type btrfs_fs_type = {
 	.name		= "btrfs",
 	.mount		= btrfs_mount,
 	.kill_sb	= btrfs_kill_super,
-	.fs_flags	= FS_REQUIRES_DEV | FS_BINARY_MOUNTDATA | FS_AUTOFREEZE,
+	.fs_flags	= FS_REQUIRES_DEV | FS_BINARY_MOUNTDATA,
 };
 
 static struct file_system_type btrfs_root_fs_type = {
@@ -2146,7 +2146,7 @@ static struct file_system_type btrfs_root_fs_type = {
 	.name		= "btrfs",
 	.mount		= btrfs_mount_root,
 	.kill_sb	= btrfs_kill_super,
-	.fs_flags	= FS_REQUIRES_DEV | FS_BINARY_MOUNTDATA | FS_ALLOW_IDMAP | FS_AUTOFREEZE,
+	.fs_flags	= FS_REQUIRES_DEV | FS_BINARY_MOUNTDATA | FS_ALLOW_IDMAP,
 };
 
 MODULE_ALIAS_FS("btrfs");
diff --git a/fs/cifs/cifsfs.c b/fs/cifs/cifsfs.c
index 25ee05c8af65..1f7af4087b44 100644
--- a/fs/cifs/cifsfs.c
+++ b/fs/cifs/cifsfs.c
@@ -1104,7 +1104,7 @@ struct file_system_type cifs_fs_type = {
 	.init_fs_context = smb3_init_fs_context,
 	.parameters = smb3_fs_parameters,
 	.kill_sb = cifs_kill_sb,
-	.fs_flags = FS_RENAME_DOES_D_MOVE | FS_AUTOFREEZE,
+	.fs_flags = FS_RENAME_DOES_D_MOVE,
 };
 MODULE_ALIAS_FS("cifs");
 
@@ -1114,7 +1114,7 @@ struct file_system_type smb3_fs_type = {
 	.init_fs_context = smb3_init_fs_context,
 	.parameters = smb3_fs_parameters,
 	.kill_sb = cifs_kill_sb,
-	.fs_flags = FS_RENAME_DOES_D_MOVE | FS_AUTOFREEZE,
+	.fs_flags = FS_RENAME_DOES_D_MOVE,
 };
 MODULE_ALIAS_FS("smb3");
 MODULE_ALIAS("smb3");
diff --git a/fs/ecryptfs/main.c b/fs/ecryptfs/main.c
index a91f5184edb7..2dc927ba067f 100644
--- a/fs/ecryptfs/main.c
+++ b/fs/ecryptfs/main.c
@@ -637,7 +637,7 @@ static struct file_system_type ecryptfs_fs_type = {
 	.name = "ecryptfs",
 	.mount = ecryptfs_mount,
 	.kill_sb = ecryptfs_kill_block_super,
-	.fs_flags = 0| FS_AUTOFREEZE
+	.fs_flags = 0
 };
 MODULE_ALIAS_FS("ecryptfs");
 
diff --git a/fs/ext4/super.c b/fs/ext4/super.c
index 0ae6f13c7fa4..4c83eab8d769 100644
--- a/fs/ext4/super.c
+++ b/fs/ext4/super.c
@@ -136,7 +136,7 @@ static struct file_system_type ext2_fs_type = {
 	.init_fs_context	= ext4_init_fs_context,
 	.parameters		= ext4_param_specs,
 	.kill_sb		= kill_block_super,
-	.fs_flags		= FS_REQUIRES_DEV | FS_AUTOFREEZE,
+	.fs_flags		= FS_REQUIRES_DEV,
 };
 MODULE_ALIAS_FS("ext2");
 MODULE_ALIAS("ext2");
@@ -152,7 +152,7 @@ static struct file_system_type ext3_fs_type = {
 	.init_fs_context	= ext4_init_fs_context,
 	.parameters		= ext4_param_specs,
 	.kill_sb		= kill_block_super,
-	.fs_flags		= FS_REQUIRES_DEV | FS_AUTOFREEZE,
+	.fs_flags		= FS_REQUIRES_DEV,
 };
 MODULE_ALIAS_FS("ext3");
 MODULE_ALIAS("ext3");
@@ -7189,7 +7189,7 @@ static struct file_system_type ext4_fs_type = {
 	.init_fs_context	= ext4_init_fs_context,
 	.parameters		= ext4_param_specs,
 	.kill_sb		= kill_block_super,
-	.fs_flags		= FS_REQUIRES_DEV | FS_ALLOW_IDMAP | FS_AUTOFREEZE,
+	.fs_flags		= FS_REQUIRES_DEV | FS_ALLOW_IDMAP,
 };
 MODULE_ALIAS_FS("ext4");
 
diff --git a/fs/f2fs/super.c b/fs/f2fs/super.c
index e9c6fb04c713..87d56a9883e6 100644
--- a/fs/f2fs/super.c
+++ b/fs/f2fs/super.c
@@ -4645,7 +4645,7 @@ static struct file_system_type f2fs_fs_type = {
 	.name		= "f2fs",
 	.mount		= f2fs_mount,
 	.kill_sb	= kill_f2fs_super,
-	.fs_flags	= FS_REQUIRES_DEV | FS_ALLOW_IDMAP | FS_AUTOFREEZE,
+	.fs_flags	= FS_REQUIRES_DEV | FS_ALLOW_IDMAP,
 };
 MODULE_ALIAS_FS("f2fs");
 
diff --git a/fs/gfs2/ops_fstype.c b/fs/gfs2/ops_fstype.c
index 8f5a63148eaf..c0cf1d2d0ef5 100644
--- a/fs/gfs2/ops_fstype.c
+++ b/fs/gfs2/ops_fstype.c
@@ -1740,7 +1740,7 @@ static void gfs2_kill_sb(struct super_block *sb)
 
 struct file_system_type gfs2_fs_type = {
 	.name = "gfs2",
-	.fs_flags = FS_REQUIRES_DEV | FS_AUTOFREEZE,
+	.fs_flags = FS_REQUIRES_DEV,
 	.init_fs_context = gfs2_init_fs_context,
 	.parameters = gfs2_fs_parameters,
 	.kill_sb = gfs2_kill_sb,
@@ -1750,7 +1750,7 @@ MODULE_ALIAS_FS("gfs2");
 
 struct file_system_type gfs2meta_fs_type = {
 	.name = "gfs2meta",
-	.fs_flags = FS_REQUIRES_DEV | FS_AUTOFREEZE,
+	.fs_flags = FS_REQUIRES_DEV,
 	.init_fs_context = gfs2_meta_init_fs_context,
 	.owner = THIS_MODULE,
 };
diff --git a/fs/jfs/super.c b/fs/jfs/super.c
index 8ca77aa0b6f9..d2f82cb7db1b 100644
--- a/fs/jfs/super.c
+++ b/fs/jfs/super.c
@@ -906,7 +906,7 @@ static struct file_system_type jfs_fs_type = {
 	.name		= "jfs",
 	.mount		= jfs_do_mount,
 	.kill_sb	= kill_block_super,
-	.fs_flags	= FS_REQUIRES_DEV | FS_AUTOFREEZE,
+	.fs_flags	= FS_REQUIRES_DEV,
 };
 MODULE_ALIAS_FS("jfs");
 
diff --git a/fs/nfs/fs_context.c b/fs/nfs/fs_context.c
index 04753962db9a..9bcd53d5c7d4 100644
--- a/fs/nfs/fs_context.c
+++ b/fs/nfs/fs_context.c
@@ -1583,7 +1583,7 @@ struct file_system_type nfs_fs_type = {
 	.init_fs_context	= nfs_init_fs_context,
 	.parameters		= nfs_fs_parameters,
 	.kill_sb		= nfs_kill_super,
-	.fs_flags		= FS_RENAME_DOES_D_MOVE|FS_BINARY_MOUNTDATA | FS_AUTOFREEZE,
+	.fs_flags		= FS_RENAME_DOES_D_MOVE|FS_BINARY_MOUNTDATA,
 };
 MODULE_ALIAS_FS("nfs");
 EXPORT_SYMBOL_GPL(nfs_fs_type);
@@ -1595,7 +1595,7 @@ struct file_system_type nfs4_fs_type = {
 	.init_fs_context	= nfs_init_fs_context,
 	.parameters		= nfs_fs_parameters,
 	.kill_sb		= nfs_kill_super,
-	.fs_flags		= FS_RENAME_DOES_D_MOVE|FS_BINARY_MOUNTDATA | FS_AUTOFREEZE,
+	.fs_flags		= FS_RENAME_DOES_D_MOVE|FS_BINARY_MOUNTDATA,
 };
 MODULE_ALIAS_FS("nfs4");
 MODULE_ALIAS("nfs4");
diff --git a/fs/super.c b/fs/super.c
index e8af4c8269ad..2943157aa41c 100644
--- a/fs/super.c
+++ b/fs/super.c
@@ -1857,8 +1857,6 @@ EXPORT_SYMBOL(thaw_super);
 #ifdef CONFIG_PM_SLEEP
 static bool super_should_freeze(struct super_block *sb)
 {
-	if (!(sb->s_type->fs_flags & FS_AUTOFREEZE))
-		return false;
 	/*
 	 * We don't freeze virtual filesystems, we skip those filesystems with
 	 * no backing device.
diff --git a/fs/xfs/xfs_super.c b/fs/xfs/xfs_super.c
index 54cbf15fc459..e71e69895a94 100644
--- a/fs/xfs/xfs_super.c
+++ b/fs/xfs/xfs_super.c
@@ -1966,7 +1966,7 @@ static struct file_system_type xfs_fs_type = {
 	.init_fs_context	= xfs_init_fs_context,
 	.parameters		= xfs_fs_parameters,
 	.kill_sb		= kill_block_super,
-	.fs_flags		= FS_REQUIRES_DEV | FS_ALLOW_IDMAP | FS_AUTOFREEZE,
+	.fs_flags		= FS_REQUIRES_DEV | FS_ALLOW_IDMAP,
 };
 MODULE_ALIAS_FS("xfs");
 
diff --git a/include/linux/fs.h b/include/linux/fs.h
index e5bee359e804..64b0ed66e87f 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -2231,7 +2231,6 @@ struct file_system_type {
 #define FS_DISALLOW_NOTIFY_PERM	16	/* Disable fanotify permission events */
 #define FS_ALLOW_IDMAP         32      /* FS has been updated to handle vfs idmappings. */
 #define FS_RENAME_DOES_D_MOVE	32768	/* FS will handle d_move() during rename() internally. */
-#define FS_AUTOFREEZE           (1<<16)	/*  temporary as we phase kthread freezer out */
 	int (*init_fs_context)(struct fs_context *);
 	const struct fs_parameter_spec *parameters;
 	struct dentry *(*mount) (struct file_system_type *, int,
-- 
2.35.1


^ permalink raw reply related	[flat|nested] 74+ messages in thread

* Re: [RFC v3 05/24] fs: add automatic kernel fs freeze / thaw and remove kthread freezing
  2023-01-14  0:33   ` Luis Chamberlain
@ 2023-01-14  0:57     ` Luis Chamberlain
  -1 siblings, 0 replies; 74+ messages in thread
From: Luis Chamberlain @ 2023-01-14  0:57 UTC (permalink / raw)
  To: hch, djwong, song, rafael, gregkh, viro, jack, bvanassche, ebiederm
  Cc: mchehab, keescook, p.raghav, linux-fsdevel, kernel, kexec, linux-kernel

On Fri, Jan 13, 2023 at 04:33:50PM -0800, Luis Chamberlain wrote:
> +#ifdef CONFIG_PM_SLEEP
> +static bool super_should_freeze(struct super_block *sb)
> +{
> +	if (!(sb->s_type->fs_flags & FS_AUTOFREEZE))
> +		return false;

This is used.

> +	/*
> +	 * We don't freeze virtual filesystems, we skip those filesystems with
> +	 * no backing device.
> +	 */
> +	if (sb->s_bdi == &noop_backing_dev_info)
> +		return false;

I had, however, dropped this and forgot to update my branch.

> +
> +	return true;
> +}


So the call to super_should_freeze() is removed and the check for the
flag is open-coded.

  Luis
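
For readers unfamiliar with the term, "open-coded" here means the flag test
is written inline at the call site instead of going through a helper. A
minimal user-space sketch of that shape follows. It is only a guess at the
structure: struct loop_sb and model_freeze_all() are hypothetical names, and
the real code walks the kernel's superblock list rather than an array.

```c
#include <stddef.h>

/* Value taken from the FS_AUTOFREEZE definition in this series. */
#define FS_AUTOFREEZE (1 << 16)

/* Hypothetical stand-in for a superblock. */
struct loop_sb {
	int fs_flags;  /* models sb->s_type->fs_flags */
	int frozen;    /* set to 1 once the entry has been "frozen" */
};

/* Freeze every opted-in entry and return how many were frozen. */
static size_t model_freeze_all(struct loop_sb *sbs, size_t n)
{
	size_t count = 0;

	for (size_t i = 0; i < n; i++) {
		/* Open-coded flag check, in place of a helper call. */
		if (!(sbs[i].fs_flags & FS_AUTOFREEZE))
			continue;
		sbs[i].frozen = 1;
		count++;
	}
	return count;
}
```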

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [RFC v3 01/24] fs: unify locking semantics for fs freeze / thaw
  2023-01-14  0:33   ` Luis Chamberlain
@ 2023-01-16 15:14     ` Jan Kara
  -1 siblings, 0 replies; 74+ messages in thread
From: Jan Kara @ 2023-01-16 15:14 UTC (permalink / raw)
  To: Luis Chamberlain
  Cc: hch, djwong, song, rafael, gregkh, viro, jack, bvanassche,
	ebiederm, mchehab, keescook, p.raghav, linux-fsdevel, kernel,
	kexec, linux-kernel

On Fri 13-01-23 16:33:46, Luis Chamberlain wrote:
> Right now freeze_super()  and thaw_super() are called with
> different locking contexts. To expand on this is messy, so
> just unify the requirement to require grabbing an active
> reference and keep the superblock locked.
> 
> Suggested-by: Christoph Hellwig <hch@infradead.org>
> Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>

The cleanup is nice, but freeze_super() no longer increments s_active, so
nothing prevents the superblock from being torn down while it is frozen.
At a minimum this behavioral change needs documenting in the changelog, but
I think it may actually be problematic if the filesystem's ->put_super
method gets called on a frozen filesystem. So I think we may also need to
block attempts to unmount a frozen filesystem; GFS2 needs this as well [1].

								Honza

[1] lore.kernel.org/r/20221129230736.3462830-1-agruenba@redhat.com
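
To make the s_active concern concrete, here is a minimal user-space sketch
of the reference counting before and after the patch. It is a model under
stated assumptions, not kernel code: struct ref_sb and the model_*() helpers
are invented for illustration, locking is left out entirely, and teardown is
reduced to setting a flag where ->put_super would run.

```c
#include <stdbool.h>

/* Hypothetical model; the field names only echo the kernel's. */
struct ref_sb {
	int s_active;    /* active references; teardown happens at zero */
	bool frozen;     /* models s_writers.frozen == SB_FREEZE_COMPLETE */
	bool torn_down;  /* set where ->put_super would have been called */
};

/* Models deactivate_locked_super(): drop a reference, tear down at zero. */
static void model_deactivate(struct ref_sb *sb)
{
	if (--sb->s_active == 0)
		sb->torn_down = true;
}

/* Old behaviour: a successful freeze keeps its own active reference. */
static void model_freeze_old(struct ref_sb *sb)
{
	sb->s_active++;
	sb->frozen = true;
}

/* New behaviour: the caller takes and drops the reference around freeze. */
static void model_freeze_new(struct ref_sb *sb)
{
	sb->s_active++;        /* caller: get_active_super() */
	sb->frozen = true;     /* freeze_super() */
	model_deactivate(sb);  /* caller: deactivate_locked_super() */
}
```

Starting from s_active == 1 (the mount's reference), a later
model_deactivate() standing in for unmount leaves the old variant frozen but
alive, while the new variant reaches zero and is torn down while still
frozen.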

> ---
>  block/bdev.c    |  5 ++++-
>  fs/f2fs/gc.c    |  5 +++++
>  fs/gfs2/super.c |  9 +++++++--
>  fs/gfs2/sys.c   |  6 ++++++
>  fs/gfs2/util.c  |  5 +++++
>  fs/ioctl.c      | 12 ++++++++++--
>  fs/super.c      | 51 ++++++++++++++-----------------------------------
>  7 files changed, 51 insertions(+), 42 deletions(-)
> 
> diff --git a/block/bdev.c b/block/bdev.c
> index edc110d90df4..8fd3a7991c02 100644
> --- a/block/bdev.c
> +++ b/block/bdev.c
> @@ -251,7 +251,7 @@ int freeze_bdev(struct block_device *bdev)
>  		error = sb->s_op->freeze_super(sb);
>  	else
>  		error = freeze_super(sb);
> -	deactivate_super(sb);
> +	deactivate_locked_super(sb);
>  
>  	if (error) {
>  		bdev->bd_fsfreeze_count--;
> @@ -289,6 +289,8 @@ int thaw_bdev(struct block_device *bdev)
>  	sb = bdev->bd_fsfreeze_sb;
>  	if (!sb)
>  		goto out;
> +	if (!get_active_super(bdev))
> +		goto out;
>  
>  	if (sb->s_op->thaw_super)
>  		error = sb->s_op->thaw_super(sb);
> @@ -298,6 +300,7 @@ int thaw_bdev(struct block_device *bdev)
>  		bdev->bd_fsfreeze_count++;
>  	else
>  		bdev->bd_fsfreeze_sb = NULL;
> +	deactivate_locked_super(sb);
>  out:
>  	mutex_unlock(&bdev->bd_fsfreeze_mutex);
>  	return error;
> diff --git a/fs/f2fs/gc.c b/fs/f2fs/gc.c
> index 7444c392eab1..4c681fe487ee 100644
> --- a/fs/f2fs/gc.c
> +++ b/fs/f2fs/gc.c
> @@ -2139,7 +2139,10 @@ int f2fs_resize_fs(struct f2fs_sb_info *sbi, __u64 block_count)
>  	if (err)
>  		return err;
>  
> +	if (!get_active_super(sbi->sb->s_bdev))
> +		return -ENOTTY;
>  	freeze_super(sbi->sb);
> +
>  	f2fs_down_write(&sbi->gc_lock);
>  	f2fs_down_write(&sbi->cp_global_sem);
>  
> @@ -2190,6 +2193,8 @@ int f2fs_resize_fs(struct f2fs_sb_info *sbi, __u64 block_count)
>  out_err:
>  	f2fs_up_write(&sbi->cp_global_sem);
>  	f2fs_up_write(&sbi->gc_lock);
> +	/* We use the same active reference from freeze */
>  	thaw_super(sbi->sb);
> +	deactivate_locked_super(sbi->sb);
>  	return err;
>  }
> diff --git a/fs/gfs2/super.c b/fs/gfs2/super.c
> index 999cc146d708..48df7b276b64 100644
> --- a/fs/gfs2/super.c
> +++ b/fs/gfs2/super.c
> @@ -661,7 +661,12 @@ void gfs2_freeze_func(struct work_struct *work)
>  	struct gfs2_sbd *sdp = container_of(work, struct gfs2_sbd, sd_freeze_work);
>  	struct super_block *sb = sdp->sd_vfs;
>  
> -	atomic_inc(&sb->s_active);
> +	if (!get_active_super(sb->s_bdev)) {
> +		fs_info(sdp, "GFS2: couldn't grap super for thaw for filesystem\n");
> +		gfs2_assert_withdraw(sdp, 0);
> +		return;
> +	}
> +
>  	error = gfs2_freeze_lock(sdp, &freeze_gh, 0);
>  	if (error) {
>  		gfs2_assert_withdraw(sdp, 0);
> @@ -675,7 +680,7 @@ void gfs2_freeze_func(struct work_struct *work)
>  		}
>  		gfs2_freeze_unlock(&freeze_gh);
>  	}
> -	deactivate_super(sb);
> +	deactivate_locked_super(sb);
>  	clear_bit_unlock(SDF_FS_FROZEN, &sdp->sd_flags);
>  	wake_up_bit(&sdp->sd_flags, SDF_FS_FROZEN);
>  	return;
> diff --git a/fs/gfs2/sys.c b/fs/gfs2/sys.c
> index d87ea98cf535..d0b80552a678 100644
> --- a/fs/gfs2/sys.c
> +++ b/fs/gfs2/sys.c
> @@ -162,6 +162,9 @@ static ssize_t freeze_store(struct gfs2_sbd *sdp, const char *buf, size_t len)
>  	if (!capable(CAP_SYS_ADMIN))
>  		return -EPERM;
>  
> +	if (!get_active_super(sb->s_bdev))
> +		return -ENOTTY;
> +
>  	switch (n) {
>  	case 0:
>  		error = thaw_super(sdp->sd_vfs);
> @@ -170,9 +173,12 @@ static ssize_t freeze_store(struct gfs2_sbd *sdp, const char *buf, size_t len)
>  		error = freeze_super(sdp->sd_vfs);
>  		break;
>  	default:
> +		deactivate_locked_super(sb);
>  		return -EINVAL;
>  	}
>  
> +	deactivate_locked_super(sb);
> +
>  	if (error) {
>  		fs_warn(sdp, "freeze %d error %d\n", n, error);
>  		return error;
> diff --git a/fs/gfs2/util.c b/fs/gfs2/util.c
> index 7a6aeffcdf5c..3a0cd5e9ad84 100644
> --- a/fs/gfs2/util.c
> +++ b/fs/gfs2/util.c
> @@ -345,10 +345,15 @@ int gfs2_withdraw(struct gfs2_sbd *sdp)
>  	set_bit(SDF_WITHDRAW_IN_PROG, &sdp->sd_flags);
>  
>  	if (sdp->sd_args.ar_errors == GFS2_ERRORS_WITHDRAW) {
> +		if (!get_active_super(sb->s_bdev)) {
> +			fs_err(sdp, "could not grab super on withdraw for file system\n");
> +			return -1;
> +		}
>  		fs_err(sdp, "about to withdraw this file system\n");
>  		BUG_ON(sdp->sd_args.ar_debug);
>  
>  		signal_our_withdraw(sdp);
> +		deactivate_locked_super(sb);
>  
>  		kobject_uevent(&sdp->sd_kobj, KOBJ_OFFLINE);
>  
> diff --git a/fs/ioctl.c b/fs/ioctl.c
> index 80ac36aea913..3d2536e1ea58 100644
> --- a/fs/ioctl.c
> +++ b/fs/ioctl.c
> @@ -386,6 +386,7 @@ static int ioctl_fioasync(unsigned int fd, struct file *filp,
>  static int ioctl_fsfreeze(struct file *filp)
>  {
>  	struct super_block *sb = file_inode(filp)->i_sb;
> +	int ret;
>  
>  	if (!ns_capable(sb->s_user_ns, CAP_SYS_ADMIN))
>  		return -EPERM;
> @@ -394,10 +395,17 @@ static int ioctl_fsfreeze(struct file *filp)
>  	if (sb->s_op->freeze_fs == NULL && sb->s_op->freeze_super == NULL)
>  		return -EOPNOTSUPP;
>  
> +	if (!get_active_super(sb->s_bdev))
> +		return -ENOTTY;
> +
>  	/* Freeze */
>  	if (sb->s_op->freeze_super)
> -		return sb->s_op->freeze_super(sb);
> -	return freeze_super(sb);
> +		ret = sb->s_op->freeze_super(sb);
> +	ret = freeze_super(sb);
> +
> +	deactivate_locked_super(sb);
> +
> +	return ret;
>  }
>  
>  static int ioctl_fsthaw(struct file *filp)
> diff --git a/fs/super.c b/fs/super.c
> index 12c08cb20405..a31a41b313f3 100644
> --- a/fs/super.c
> +++ b/fs/super.c
> @@ -39,8 +39,6 @@
>  #include <uapi/linux/mount.h>
>  #include "internal.h"
>  
> -static int thaw_super_locked(struct super_block *sb);
> -
>  static LIST_HEAD(super_blocks);
>  static DEFINE_SPINLOCK(sb_lock);
>  
> @@ -830,7 +828,6 @@ struct super_block *get_active_super(struct block_device *bdev)
>  		if (sb->s_bdev == bdev) {
>  			if (!grab_super(sb))
>  				goto restart;
> -			up_write(&sb->s_umount);
>  			return sb;
>  		}
>  	}
> @@ -1003,13 +1000,13 @@ void emergency_remount(void)
>  
>  static void do_thaw_all_callback(struct super_block *sb)
>  {
> -	down_write(&sb->s_umount);
> +	if (!get_active_super(sb->s_bdev))
> +		return;
>  	if (sb->s_root && sb->s_flags & SB_BORN) {
>  		emergency_thaw_bdev(sb);
> -		thaw_super_locked(sb);
> -	} else {
> -		up_write(&sb->s_umount);
> +		thaw_super(sb);
>  	}
> +	deactivate_locked_super(sb);
>  }
>  
>  static void do_thaw_all(struct work_struct *work)
> @@ -1651,22 +1648,15 @@ int freeze_super(struct super_block *sb)
>  {
>  	int ret;
>  
> -	atomic_inc(&sb->s_active);
> -	down_write(&sb->s_umount);
> -	if (sb->s_writers.frozen != SB_UNFROZEN) {
> -		deactivate_locked_super(sb);
> +	if (sb->s_writers.frozen != SB_UNFROZEN)
>  		return -EBUSY;
> -	}
>  
> -	if (!(sb->s_flags & SB_BORN)) {
> -		up_write(&sb->s_umount);
> +	if (!(sb->s_flags & SB_BORN))
>  		return 0;	/* sic - it's "nothing to do" */
> -	}
>  
>  	if (sb_rdonly(sb)) {
>  		/* Nothing to do really... */
>  		sb->s_writers.frozen = SB_FREEZE_COMPLETE;
> -		up_write(&sb->s_umount);
>  		return 0;
>  	}
>  
> @@ -1686,7 +1676,6 @@ int freeze_super(struct super_block *sb)
>  		sb->s_writers.frozen = SB_UNFROZEN;
>  		sb_freeze_unlock(sb, SB_FREEZE_PAGEFAULT);
>  		wake_up(&sb->s_writers.wait_unfrozen);
> -		deactivate_locked_super(sb);
>  		return ret;
>  	}
>  
> @@ -1702,7 +1691,6 @@ int freeze_super(struct super_block *sb)
>  			sb->s_writers.frozen = SB_UNFROZEN;
>  			sb_freeze_unlock(sb, SB_FREEZE_FS);
>  			wake_up(&sb->s_writers.wait_unfrozen);
> -			deactivate_locked_super(sb);
>  			return ret;
>  		}
>  	}
> @@ -1712,19 +1700,22 @@ int freeze_super(struct super_block *sb)
>  	 */
>  	sb->s_writers.frozen = SB_FREEZE_COMPLETE;
>  	lockdep_sb_freeze_release(sb);
> -	up_write(&sb->s_umount);
>  	return 0;
>  }
>  EXPORT_SYMBOL(freeze_super);
>  
> -static int thaw_super_locked(struct super_block *sb)
> +/**
> + * thaw_super -- unlock filesystem
> + * @sb: the super to thaw
> + *
> + * Unlocks the filesystem and marks it writeable again after freeze_super().
> + */
> +int thaw_super(struct super_block *sb)
>  {
>  	int error;
>  
> -	if (sb->s_writers.frozen != SB_FREEZE_COMPLETE) {
> -		up_write(&sb->s_umount);
> +	if (sb->s_writers.frozen != SB_FREEZE_COMPLETE)
>  		return -EINVAL;
> -	}
>  
>  	if (sb_rdonly(sb)) {
>  		sb->s_writers.frozen = SB_UNFROZEN;
> @@ -1739,7 +1730,6 @@ static int thaw_super_locked(struct super_block *sb)
>  			printk(KERN_ERR
>  				"VFS:Filesystem thaw failed\n");
>  			lockdep_sb_freeze_release(sb);
> -			up_write(&sb->s_umount);
>  			return error;
>  		}
>  	}
> @@ -1748,19 +1738,6 @@ static int thaw_super_locked(struct super_block *sb)
>  	sb_freeze_unlock(sb, SB_FREEZE_FS);
>  out:
>  	wake_up(&sb->s_writers.wait_unfrozen);
> -	deactivate_locked_super(sb);
>  	return 0;
>  }
> -
> -/**
> - * thaw_super -- unlock filesystem
> - * @sb: the super to thaw
> - *
> - * Unlocks the filesystem and marks it writeable again after freeze_super().
> - */
> -int thaw_super(struct super_block *sb)
> -{
> -	down_write(&sb->s_umount);
> -	return thaw_super_locked(sb);
> -}
>  EXPORT_SYMBOL(thaw_super);
> -- 
> 2.35.1
> 
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [RFC v3 01/24] fs: unify locking semantics for fs freeze / thaw
@ 2023-01-16 15:14     ` Jan Kara
  0 siblings, 0 replies; 74+ messages in thread
From: Jan Kara @ 2023-01-16 15:14 UTC (permalink / raw)
  To: Luis Chamberlain
  Cc: hch, djwong, song, rafael, gregkh, viro, jack, bvanassche,
	ebiederm, mchehab, keescook, p.raghav, linux-fsdevel, kernel,
	kexec, linux-kernel

On Fri 13-01-23 16:33:46, Luis Chamberlain wrote:
> Right now freeze_super()  and thaw_super() are called with
> different locking contexts. To expand on this is messy, so
> just unify the requirement to require grabbing an active
> reference and keep the superblock locked.
> 
> Suggested-by: Christoph Hellwig <hch@infradead.org>
> Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>

The cleanup is nice but now freeze_super() does not increment s_active
anymore so nothing prevents the superblock from being torn down while it is
frozen. This is a behavioral change that needs documenting in the changelog
if nothing else but I think it may be actually problematic if the
filesystem's ->put_super method gets called on frozen filesystem. So I
think we may need to also block attempts to unmount frozen filesystem -
actually GFS2 needs this as well [1].

								Honza

[1] lore.kernel.org/r/20221129230736.3462830-1-agruenba@redhat.com

> ---
>  block/bdev.c    |  5 ++++-
>  fs/f2fs/gc.c    |  5 +++++
>  fs/gfs2/super.c |  9 +++++++--
>  fs/gfs2/sys.c   |  6 ++++++
>  fs/gfs2/util.c  |  5 +++++
>  fs/ioctl.c      | 12 ++++++++++--
>  fs/super.c      | 51 ++++++++++++++-----------------------------------
>  7 files changed, 51 insertions(+), 42 deletions(-)
> 
> diff --git a/block/bdev.c b/block/bdev.c
> index edc110d90df4..8fd3a7991c02 100644
> --- a/block/bdev.c
> +++ b/block/bdev.c
> @@ -251,7 +251,7 @@ int freeze_bdev(struct block_device *bdev)
>  		error = sb->s_op->freeze_super(sb);
>  	else
>  		error = freeze_super(sb);
> -	deactivate_super(sb);
> +	deactivate_locked_super(sb);
>  
>  	if (error) {
>  		bdev->bd_fsfreeze_count--;
> @@ -289,6 +289,8 @@ int thaw_bdev(struct block_device *bdev)
>  	sb = bdev->bd_fsfreeze_sb;
>  	if (!sb)
>  		goto out;
> +	if (!get_active_super(bdev))
> +		goto out;
>  
>  	if (sb->s_op->thaw_super)
>  		error = sb->s_op->thaw_super(sb);
> @@ -298,6 +300,7 @@ int thaw_bdev(struct block_device *bdev)
>  		bdev->bd_fsfreeze_count++;
>  	else
>  		bdev->bd_fsfreeze_sb = NULL;
> +	deactivate_locked_super(sb);
>  out:
>  	mutex_unlock(&bdev->bd_fsfreeze_mutex);
>  	return error;
> diff --git a/fs/f2fs/gc.c b/fs/f2fs/gc.c
> index 7444c392eab1..4c681fe487ee 100644
> --- a/fs/f2fs/gc.c
> +++ b/fs/f2fs/gc.c
> @@ -2139,7 +2139,10 @@ int f2fs_resize_fs(struct f2fs_sb_info *sbi, __u64 block_count)
>  	if (err)
>  		return err;
>  
> +	if (!get_active_super(sbi->sb->s_bdev))
> +		return -ENOTTY;
>  	freeze_super(sbi->sb);
> +
>  	f2fs_down_write(&sbi->gc_lock);
>  	f2fs_down_write(&sbi->cp_global_sem);
>  
> @@ -2190,6 +2193,8 @@ int f2fs_resize_fs(struct f2fs_sb_info *sbi, __u64 block_count)
>  out_err:
>  	f2fs_up_write(&sbi->cp_global_sem);
>  	f2fs_up_write(&sbi->gc_lock);
> +	/* We use the same active reference from freeze */
>  	thaw_super(sbi->sb);
> +	deactivate_locked_super(sbi->sb);
>  	return err;
>  }
> diff --git a/fs/gfs2/super.c b/fs/gfs2/super.c
> index 999cc146d708..48df7b276b64 100644
> --- a/fs/gfs2/super.c
> +++ b/fs/gfs2/super.c
> @@ -661,7 +661,12 @@ void gfs2_freeze_func(struct work_struct *work)
>  	struct gfs2_sbd *sdp = container_of(work, struct gfs2_sbd, sd_freeze_work);
>  	struct super_block *sb = sdp->sd_vfs;
>  
> -	atomic_inc(&sb->s_active);
> +	if (!get_active_super(sb->s_bdev)) {
> +		fs_info(sdp, "GFS2: couldn't grab super for thaw for filesystem\n");
> +		gfs2_assert_withdraw(sdp, 0);
> +		return;
> +	}
> +
>  	error = gfs2_freeze_lock(sdp, &freeze_gh, 0);
>  	if (error) {
>  		gfs2_assert_withdraw(sdp, 0);
> @@ -675,7 +680,7 @@ void gfs2_freeze_func(struct work_struct *work)
>  		}
>  		gfs2_freeze_unlock(&freeze_gh);
>  	}
> -	deactivate_super(sb);
> +	deactivate_locked_super(sb);
>  	clear_bit_unlock(SDF_FS_FROZEN, &sdp->sd_flags);
>  	wake_up_bit(&sdp->sd_flags, SDF_FS_FROZEN);
>  	return;
> diff --git a/fs/gfs2/sys.c b/fs/gfs2/sys.c
> index d87ea98cf535..d0b80552a678 100644
> --- a/fs/gfs2/sys.c
> +++ b/fs/gfs2/sys.c
> @@ -162,6 +162,9 @@ static ssize_t freeze_store(struct gfs2_sbd *sdp, const char *buf, size_t len)
>  	if (!capable(CAP_SYS_ADMIN))
>  		return -EPERM;
>  
> +	if (!get_active_super(sb->s_bdev))
> +		return -ENOTTY;
> +
>  	switch (n) {
>  	case 0:
>  		error = thaw_super(sdp->sd_vfs);
> @@ -170,9 +173,12 @@ static ssize_t freeze_store(struct gfs2_sbd *sdp, const char *buf, size_t len)
>  		error = freeze_super(sdp->sd_vfs);
>  		break;
>  	default:
> +		deactivate_locked_super(sb);
>  		return -EINVAL;
>  	}
>  
> +	deactivate_locked_super(sb);
> +
>  	if (error) {
>  		fs_warn(sdp, "freeze %d error %d\n", n, error);
>  		return error;
> diff --git a/fs/gfs2/util.c b/fs/gfs2/util.c
> index 7a6aeffcdf5c..3a0cd5e9ad84 100644
> --- a/fs/gfs2/util.c
> +++ b/fs/gfs2/util.c
> @@ -345,10 +345,15 @@ int gfs2_withdraw(struct gfs2_sbd *sdp)
>  	set_bit(SDF_WITHDRAW_IN_PROG, &sdp->sd_flags);
>  
>  	if (sdp->sd_args.ar_errors == GFS2_ERRORS_WITHDRAW) {
> +		if (!get_active_super(sb->s_bdev)) {
> +			fs_err(sdp, "could not grab super on withdraw for file system\n");
> +			return -1;
> +		}
>  		fs_err(sdp, "about to withdraw this file system\n");
>  		BUG_ON(sdp->sd_args.ar_debug);
>  
>  		signal_our_withdraw(sdp);
> +		deactivate_locked_super(sb);
>  
>  		kobject_uevent(&sdp->sd_kobj, KOBJ_OFFLINE);
>  
> diff --git a/fs/ioctl.c b/fs/ioctl.c
> index 80ac36aea913..3d2536e1ea58 100644
> --- a/fs/ioctl.c
> +++ b/fs/ioctl.c
> @@ -386,6 +386,7 @@ static int ioctl_fioasync(unsigned int fd, struct file *filp,
>  static int ioctl_fsfreeze(struct file *filp)
>  {
>  	struct super_block *sb = file_inode(filp)->i_sb;
> +	int ret;
>  
>  	if (!ns_capable(sb->s_user_ns, CAP_SYS_ADMIN))
>  		return -EPERM;
> @@ -394,10 +395,17 @@ static int ioctl_fsfreeze(struct file *filp)
>  	if (sb->s_op->freeze_fs == NULL && sb->s_op->freeze_super == NULL)
>  		return -EOPNOTSUPP;
>  
> +	if (!get_active_super(sb->s_bdev))
> +		return -ENOTTY;
> +
>  	/* Freeze */
>  	if (sb->s_op->freeze_super)
> -		return sb->s_op->freeze_super(sb);
> -	return freeze_super(sb);
> +		ret = sb->s_op->freeze_super(sb);
> +	ret = freeze_super(sb);
> +
> +	deactivate_locked_super(sb);
> +
> +	return ret;
>  }
>  
>  static int ioctl_fsthaw(struct file *filp)
> diff --git a/fs/super.c b/fs/super.c
> index 12c08cb20405..a31a41b313f3 100644
> --- a/fs/super.c
> +++ b/fs/super.c
> @@ -39,8 +39,6 @@
>  #include <uapi/linux/mount.h>
>  #include "internal.h"
>  
> -static int thaw_super_locked(struct super_block *sb);
> -
>  static LIST_HEAD(super_blocks);
>  static DEFINE_SPINLOCK(sb_lock);
>  
> @@ -830,7 +828,6 @@ struct super_block *get_active_super(struct block_device *bdev)
>  		if (sb->s_bdev == bdev) {
>  			if (!grab_super(sb))
>  				goto restart;
> -			up_write(&sb->s_umount);
>  			return sb;
>  		}
>  	}
> @@ -1003,13 +1000,13 @@ void emergency_remount(void)
>  
>  static void do_thaw_all_callback(struct super_block *sb)
>  {
> -	down_write(&sb->s_umount);
> +	if (!get_active_super(sb->s_bdev))
> +		return;
>  	if (sb->s_root && sb->s_flags & SB_BORN) {
>  		emergency_thaw_bdev(sb);
> -		thaw_super_locked(sb);
> -	} else {
> -		up_write(&sb->s_umount);
> +		thaw_super(sb);
>  	}
> +	deactivate_locked_super(sb);
>  }
>  
>  static void do_thaw_all(struct work_struct *work)
> @@ -1651,22 +1648,15 @@ int freeze_super(struct super_block *sb)
>  {
>  	int ret;
>  
> -	atomic_inc(&sb->s_active);
> -	down_write(&sb->s_umount);
> -	if (sb->s_writers.frozen != SB_UNFROZEN) {
> -		deactivate_locked_super(sb);
> +	if (sb->s_writers.frozen != SB_UNFROZEN)
>  		return -EBUSY;
> -	}
>  
> -	if (!(sb->s_flags & SB_BORN)) {
> -		up_write(&sb->s_umount);
> +	if (!(sb->s_flags & SB_BORN))
>  		return 0;	/* sic - it's "nothing to do" */
> -	}
>  
>  	if (sb_rdonly(sb)) {
>  		/* Nothing to do really... */
>  		sb->s_writers.frozen = SB_FREEZE_COMPLETE;
> -		up_write(&sb->s_umount);
>  		return 0;
>  	}
>  
> @@ -1686,7 +1676,6 @@ int freeze_super(struct super_block *sb)
>  		sb->s_writers.frozen = SB_UNFROZEN;
>  		sb_freeze_unlock(sb, SB_FREEZE_PAGEFAULT);
>  		wake_up(&sb->s_writers.wait_unfrozen);
> -		deactivate_locked_super(sb);
>  		return ret;
>  	}
>  
> @@ -1702,7 +1691,6 @@ int freeze_super(struct super_block *sb)
>  			sb->s_writers.frozen = SB_UNFROZEN;
>  			sb_freeze_unlock(sb, SB_FREEZE_FS);
>  			wake_up(&sb->s_writers.wait_unfrozen);
> -			deactivate_locked_super(sb);
>  			return ret;
>  		}
>  	}
> @@ -1712,19 +1700,22 @@ int freeze_super(struct super_block *sb)
>  	 */
>  	sb->s_writers.frozen = SB_FREEZE_COMPLETE;
>  	lockdep_sb_freeze_release(sb);
> -	up_write(&sb->s_umount);
>  	return 0;
>  }
>  EXPORT_SYMBOL(freeze_super);
>  
> -static int thaw_super_locked(struct super_block *sb)
> +/**
> + * thaw_super -- unlock filesystem
> + * @sb: the super to thaw
> + *
> + * Unlocks the filesystem and marks it writeable again after freeze_super().
> + */
> +int thaw_super(struct super_block *sb)
>  {
>  	int error;
>  
> -	if (sb->s_writers.frozen != SB_FREEZE_COMPLETE) {
> -		up_write(&sb->s_umount);
> +	if (sb->s_writers.frozen != SB_FREEZE_COMPLETE)
>  		return -EINVAL;
> -	}
>  
>  	if (sb_rdonly(sb)) {
>  		sb->s_writers.frozen = SB_UNFROZEN;
> @@ -1739,7 +1730,6 @@ static int thaw_super_locked(struct super_block *sb)
>  			printk(KERN_ERR
>  				"VFS:Filesystem thaw failed\n");
>  			lockdep_sb_freeze_release(sb);
> -			up_write(&sb->s_umount);
>  			return error;
>  		}
>  	}
> @@ -1748,19 +1738,6 @@ static int thaw_super_locked(struct super_block *sb)
>  	sb_freeze_unlock(sb, SB_FREEZE_FS);
>  out:
>  	wake_up(&sb->s_writers.wait_unfrozen);
> -	deactivate_locked_super(sb);
>  	return 0;
>  }
> -
> -/**
> - * thaw_super -- unlock filesystem
> - * @sb: the super to thaw
> - *
> - * Unlocks the filesystem and marks it writeable again after freeze_super().
> - */
> -int thaw_super(struct super_block *sb)
> -{
> -	down_write(&sb->s_umount);
> -	return thaw_super_locked(sb);
> -}
>  EXPORT_SYMBOL(thaw_super);
> -- 
> 2.35.1
> 
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [RFC v3 05/24] fs: add automatic kernel fs freeze / thaw and remove kthread freezing
  2023-01-14  0:33   ` Luis Chamberlain
@ 2023-01-16 16:09     ` Jan Kara
  -1 siblings, 0 replies; 74+ messages in thread
From: Jan Kara @ 2023-01-16 16:09 UTC (permalink / raw)
  To: Luis Chamberlain
  Cc: hch, djwong, song, rafael, gregkh, viro, jack, bvanassche,
	ebiederm, mchehab, keescook, p.raghav, linux-fsdevel, kernel,
	kexec, linux-kernel

On Fri 13-01-23 16:33:50, Luis Chamberlain wrote:
> Add support to automatically handle freezing and thawing filesystems
> during the kernel's suspend/resume cycle.
> 
> This is needed so that we properly stop IO in flight without
> races after userspace has been frozen. Without this we rely on
> kthread freezing, and its semantics are loose and error prone.
> For instance, even though a kthread may use try_to_freeze() and end
> up being frozen, we have no way of being sure that everything that
> has been spawned asynchronously from it (such as timers) has also
> been stopped.
> 
> A long-term advantage of adding filesystem freeze / thaw
> support during suspend / hibernation is that we may eventually
> be able to drop the kernel's thread freezing completely,
> as it was originally added to stop disk IO in flight as we hibernate
> or suspend.
> 
> This does not remove the superfluous freezer calls on all filesystems.
> Each filesystem must remove all the kthread freezer stuff and peg
> the fs_type flags as supporting auto-freezing with the FS_AUTOFREEZE
> flag.
> 
> Subsequent patches remove the kthread freezer usage from each
> filesystem, one at a time to make all this work bisectable.
> Once all filesystems remove the usage of the kthread freezer we
> can remove the FS_AUTOFREEZE flag.
> 
> Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>

Looks good to me. Feel free to add:

Reviewed-by: Jan Kara <jack@suse.cz>

								Honza
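
For readers following the suspend path, the control flow in this patch can
be sketched in userspace roughly as follows. This is only an illustrative
model under stated assumptions: struct model_fs, its fields, and all
model_* functions are hypothetical. The real code walks the superblock list
with iterate_supers_reverse_excl() and iterate_supers_excl() while holding
an active reference and s_umount, none of which is modelled here.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for one mounted filesystem. */
struct model_fs {
	bool autofreeze;   /* fs type has FS_AUTOFREEZE set */
	bool has_bdev;     /* false models s_bdi == &noop_backing_dev_info */
	bool fail_freeze;  /* inject a freeze failure for the demo */
	bool frozen;
};

/* Mirrors super_should_freeze(): opt-in flag plus a real backing device. */
static bool model_should_freeze(const struct model_fs *fs)
{
	return fs->autofreeze && fs->has_bdev;
}

static int model_suspend_freeze(struct model_fs *fs)
{
	if (!model_should_freeze(fs))
		return 0;		/* skipped, not an error */
	if (fs->fail_freeze)
		return -EIO;
	fs->frozen = true;
	return 0;
}

static void model_suspend_thaw(struct model_fs *fs)
{
	if (model_should_freeze(fs))
		fs->frozen = false;
}

/*
 * Freeze in reverse list order; on the first failure thaw everything
 * in forward order and report the error, as freeze_processes() does.
 */
static int model_freeze_all(struct model_fs *fs, size_t n)
{
	for (size_t i = n; i-- > 0; ) {
		int error = model_suspend_freeze(&fs[i]);

		if (error) {
			for (size_t j = 0; j < n; j++)
				model_suspend_thaw(&fs[j]);
			return error;
		}
	}
	return 0;
}
```

The rollback thaws every eligible entry rather than tracking which ones
were frozen, which matches the patch: thawing a superblock that was never
frozen is treated as a no-op on the kernel-initiated path.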

> ---
>  fs/super.c             | 69 ++++++++++++++++++++++++++++++++++++++++++
>  include/linux/fs.h     | 14 +++++++++
>  kernel/power/process.c | 15 ++++++++-
>  3 files changed, 97 insertions(+), 1 deletion(-)
> 
> diff --git a/fs/super.c b/fs/super.c
> index 2f77fcb6e555..e8af4c8269ad 100644
> --- a/fs/super.c
> +++ b/fs/super.c
> @@ -1853,3 +1853,72 @@ int thaw_super(struct super_block *sb, bool usercall)
>  	return 0;
>  }
>  EXPORT_SYMBOL(thaw_super);
> +
> +#ifdef CONFIG_PM_SLEEP
> +static bool super_should_freeze(struct super_block *sb)
> +{
> +	if (!(sb->s_type->fs_flags & FS_AUTOFREEZE))
> +		return false;
> +	/*
> +	 * We don't freeze virtual filesystems, we skip those filesystems with
> +	 * no backing device.
> +	 */
> +	if (sb->s_bdi == &noop_backing_dev_info)
> +		return false;
> +
> +	return true;
> +}
> +
> +int fs_suspend_freeze_sb(struct super_block *sb, void *priv)
> +{
> +	int error = 0;
> +
> +	if (!grab_lock_super(sb)) {
> +		pr_err("%s (%s): freezing failed to grab_super()\n",
> +		       sb->s_type->name, sb->s_id);
> +		return -ENOTTY;
> +	}
> +
> +	if (!super_should_freeze(sb))
> +		goto out;
> +
> +	pr_info("%s (%s): freezing\n", sb->s_type->name, sb->s_id);
> +
> +	error = freeze_super(sb, false);
> +	if (!error)
> +		lockdep_sb_freeze_release(sb);
> +	else if (error != -EBUSY)
> +		pr_notice("%s (%s): Unable to freeze, error=%d",
> +			  sb->s_type->name, sb->s_id, error);
> +
> +out:
> +	deactivate_locked_super(sb);
> +	return error;
> +}
> +
> +int fs_suspend_thaw_sb(struct super_block *sb, void *priv)
> +{
> +	int error = 0;
> +
> +	if (!grab_lock_super(sb)) {
> +		pr_err("%s (%s): thawing failed to grab_super()\n",
> +		       sb->s_type->name, sb->s_id);
> +		return -ENOTTY;
> +	}
> +
> +	if (!super_should_freeze(sb))
> +		goto out;
> +
> +	pr_info("%s (%s): thawing\n", sb->s_type->name, sb->s_id);
> +
> +	error = thaw_super(sb, false);
> +	if (error && error != -EBUSY)
> +		pr_notice("%s (%s): Unable to unfreeze, error=%d",
> +			  sb->s_type->name, sb->s_id, error);
> +
> +out:
> +	deactivate_locked_super(sb);
> +	return error;
> +}
> +
> +#endif
> diff --git a/include/linux/fs.h b/include/linux/fs.h
> index f168e72f6ca1..e5bee359e804 100644
> --- a/include/linux/fs.h
> +++ b/include/linux/fs.h
> @@ -2231,6 +2231,7 @@ struct file_system_type {
>  #define FS_DISALLOW_NOTIFY_PERM	16	/* Disable fanotify permission events */
>  #define FS_ALLOW_IDMAP         32      /* FS has been updated to handle vfs idmappings. */
>  #define FS_RENAME_DOES_D_MOVE	32768	/* FS will handle d_move() during rename() internally. */
> +#define FS_AUTOFREEZE           (1<<16)	/*  temporary as we phase kthread freezer out */
>  	int (*init_fs_context)(struct fs_context *);
>  	const struct fs_parameter_spec *parameters;
>  	struct dentry *(*mount) (struct file_system_type *, int,
> @@ -2306,6 +2307,19 @@ extern int user_statfs(const char __user *, struct kstatfs *);
>  extern int fd_statfs(int, struct kstatfs *);
>  extern int freeze_super(struct super_block *super, bool usercall);
>  extern int thaw_super(struct super_block *super, bool usercall);
> +#ifdef CONFIG_PM_SLEEP
> +int fs_suspend_freeze_sb(struct super_block *sb, void *priv);
> +int fs_suspend_thaw_sb(struct super_block *sb, void *priv);
> +#else
> +static inline int fs_suspend_freeze_sb(struct super_block *sb, void *priv)
> +{
> +	return 0;
> +}
> +static inline int fs_suspend_thaw_sb(struct super_block *sb, void *priv)
> +{
> +	return 0;
> +}
> +#endif
>  extern __printf(2, 3)
>  int super_setup_bdi_name(struct super_block *sb, char *fmt, ...);
>  extern int super_setup_bdi(struct super_block *sb);
> diff --git a/kernel/power/process.c b/kernel/power/process.c
> index 6c1c7e566d35..1dd6b0b6b4e5 100644
> --- a/kernel/power/process.c
> +++ b/kernel/power/process.c
> @@ -140,6 +140,16 @@ int freeze_processes(void)
>  
>  	BUG_ON(in_atomic());
>  
> +	pr_info("Freezing filesystems ... ");
> +	error = iterate_supers_reverse_excl(fs_suspend_freeze_sb, NULL);
> +	if (error) {
> +		pr_cont("failed\n");
> +		iterate_supers_excl(fs_suspend_thaw_sb, NULL);
> +		thaw_processes();
> +		return error;
> +	}
> +	pr_cont("done.\n");
> +
>  	/*
>  	 * Now that the whole userspace is frozen we need to disable
>  	 * the OOM killer to disallow any further interference with
> @@ -149,8 +159,10 @@ int freeze_processes(void)
>  	if (!error && !oom_killer_disable(msecs_to_jiffies(freeze_timeout_msecs)))
>  		error = -EBUSY;
>  
> -	if (error)
> +	if (error) {
> +		iterate_supers_excl(fs_suspend_thaw_sb, NULL);
>  		thaw_processes();
> +	}
>  	return error;
>  }
>  
> @@ -188,6 +200,7 @@ void thaw_processes(void)
>  	pm_nosig_freezing = false;
>  
>  	oom_killer_enable();
> +	iterate_supers_excl(fs_suspend_thaw_sb, NULL);
>  
>  	pr_info("Restarting tasks ... ");
>  
> -- 
> 2.35.1
> 
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [RFC v3 03/24] fs: distinguish between user initiated freeze and kernel initiated freeze
  2023-01-14  0:33   ` Luis Chamberlain
@ 2023-01-16 16:10     ` Jan Kara
  -1 siblings, 0 replies; 74+ messages in thread
From: Jan Kara @ 2023-01-16 16:10 UTC (permalink / raw)
  To: Luis Chamberlain
  Cc: hch, djwong, song, rafael, gregkh, viro, jack, bvanassche,
	ebiederm, mchehab, keescook, p.raghav, linux-fsdevel, kernel,
	kexec, linux-kernel

On Fri 13-01-23 16:33:48, Luis Chamberlain wrote:
> Userspace can initiate a freeze call using ioctls. If the kernel decides
> to freeze a filesystem later it must be able to distinguish if userspace
> had initiated the freeze, so that it does not unfreeze it later
> automatically on resume.
> 
> Likewise, if the kernel is initiating a freeze on its own it should *not*
> fail to freeze a filesystem if a user had already frozen it on our behalf.
> This same concept applies to thawing, even if it's not possible for
> userspace to beat the kernel in thawing a filesystem. This logic however
> has never applied to userspace freezing and thawing: two consecutive
> userspace freeze calls will result in only the first one succeeding, so
> we must retain the same behaviour in userspace.
> 
> This doesn't yet implement kernel initiated filesystem freeze calls;
> that will be done in subsequent patches. This change should introduce
> no functional changes, it just extends the definition of a frozen
> filesystem to account for future kernel initiated filesystem freezes
> and lets us keep a record of when userspace initiated it, so the kernel
> can respect a userspace initiated freeze upon kernel initiated freeze
> and its respective thaw cycle.
> 
> Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>

This is slightly ugly but it should work and I don't have a better solution
so feel free to add:

Reviewed-by: Jan Kara <jack@suse.cz>

								Honza
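
The usercall semantics are easier to check against a small state model.
The sketch below is a userspace approximation, assuming only two states
(frozen or not) and ignoring SB_BORN, read-only mounts and the intermediate
freeze levels. struct model_writers and the model_* functions are
hypothetical names; the return values follow the checks at the top of
freeze_super() and thaw_super() in this patch.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical reduction of struct sb_writers to the two fields that matter. */
struct model_writers {
	bool frozen;          /* frozen == SB_FREEZE_COMPLETE */
	bool frozen_by_user;  /* set when userspace did the freeze */
};

static int model_freeze(struct model_writers *w, bool usercall)
{
	/* A kernel freeze of an already frozen fs succeeds and changes nothing. */
	if (!usercall && w->frozen)
		return 0;
	/* A second userspace freeze keeps failing, as before the patch. */
	if (w->frozen)
		return -EBUSY;

	w->frozen = true;
	w->frozen_by_user = usercall;
	return 0;
}

static int model_thaw(struct model_writers *w, bool usercall)
{
	/* The kernel never undoes a userspace freeze, nor thaws twice. */
	if (!usercall && (!w->frozen || w->frozen_by_user))
		return 0;
	if (!w->frozen)
		return -EINVAL;

	w->frozen = false;
	w->frozen_by_user = false;
	return 0;
}
```

Walking through it: a user freeze followed by a suspend/resume cycle leaves
the filesystem frozen, because model_freeze(w, false) and model_thaw(w, false)
both return 0 without touching the state. Only a later model_thaw(w, true)
unfreezes it.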

> ---
>  block/bdev.c       |  4 ++--
>  fs/f2fs/gc.c       |  4 ++--
>  fs/gfs2/glops.c    |  2 +-
>  fs/gfs2/super.c    |  2 +-
>  fs/gfs2/sys.c      |  4 ++--
>  fs/gfs2/util.c     |  2 +-
>  fs/ioctl.c         |  4 ++--
>  fs/super.c         | 31 ++++++++++++++++++++++++++-----
>  include/linux/fs.h | 16 ++++++++++++++--
>  9 files changed, 51 insertions(+), 18 deletions(-)
> 
> diff --git a/block/bdev.c b/block/bdev.c
> index 8fd3a7991c02..668ebf2015bf 100644
> --- a/block/bdev.c
> +++ b/block/bdev.c
> @@ -250,7 +250,7 @@ int freeze_bdev(struct block_device *bdev)
>  	if (sb->s_op->freeze_super)
>  		error = sb->s_op->freeze_super(sb);
>  	else
> -		error = freeze_super(sb);
> +		error = freeze_super(sb, true);
>  	deactivate_locked_super(sb);
>  
>  	if (error) {
> @@ -295,7 +295,7 @@ int thaw_bdev(struct block_device *bdev)
>  	if (sb->s_op->thaw_super)
>  		error = sb->s_op->thaw_super(sb);
>  	else
> -		error = thaw_super(sb);
> +		error = thaw_super(sb, true);
>  	if (error)
>  		bdev->bd_fsfreeze_count++;
>  	else
> diff --git a/fs/f2fs/gc.c b/fs/f2fs/gc.c
> index 4c681fe487ee..8eac3042786b 100644
> --- a/fs/f2fs/gc.c
> +++ b/fs/f2fs/gc.c
> @@ -2141,7 +2141,7 @@ int f2fs_resize_fs(struct f2fs_sb_info *sbi, __u64 block_count)
>  
>  	if (!get_active_super(sbi->sb->s_bdev))
>  		return -ENOTTY;
> -	freeze_super(sbi->sb);
> +	freeze_super(sbi->sb, true);
>  
>  	f2fs_down_write(&sbi->gc_lock);
>  	f2fs_down_write(&sbi->cp_global_sem);
> @@ -2194,7 +2194,7 @@ int f2fs_resize_fs(struct f2fs_sb_info *sbi, __u64 block_count)
>  	f2fs_up_write(&sbi->cp_global_sem);
>  	f2fs_up_write(&sbi->gc_lock);
>  	/* We use the same active reference from freeze */
> -	thaw_super(sbi->sb);
> +	thaw_super(sbi->sb, true);
>  	deactivate_locked_super(sbi->sb);
>  	return err;
>  }
> diff --git a/fs/gfs2/glops.c b/fs/gfs2/glops.c
> index 081422644ec5..62a7e0693efa 100644
> --- a/fs/gfs2/glops.c
> +++ b/fs/gfs2/glops.c
> @@ -574,7 +574,7 @@ static int freeze_go_sync(struct gfs2_glock *gl)
>  	if (gl->gl_state == LM_ST_SHARED && !gfs2_withdrawn(sdp) &&
>  	    !test_bit(SDF_NORECOVERY, &sdp->sd_flags)) {
>  		atomic_set(&sdp->sd_freeze_state, SFS_STARTING_FREEZE);
> -		error = freeze_super(sdp->sd_vfs);
> +		error = freeze_super(sdp->sd_vfs, true);
>  		if (error) {
>  			fs_info(sdp, "GFS2: couldn't freeze filesystem: %d\n",
>  				error);
> diff --git a/fs/gfs2/super.c b/fs/gfs2/super.c
> index 48df7b276b64..9c55b8042aa4 100644
> --- a/fs/gfs2/super.c
> +++ b/fs/gfs2/super.c
> @@ -672,7 +672,7 @@ void gfs2_freeze_func(struct work_struct *work)
>  		gfs2_assert_withdraw(sdp, 0);
>  	} else {
>  		atomic_set(&sdp->sd_freeze_state, SFS_UNFROZEN);
> -		error = thaw_super(sb);
> +		error = thaw_super(sb, true);
>  		if (error) {
>  			fs_info(sdp, "GFS2: couldn't thaw filesystem: %d\n",
>  				error);
> diff --git a/fs/gfs2/sys.c b/fs/gfs2/sys.c
> index b98be03d0d1e..69514294215b 100644
> --- a/fs/gfs2/sys.c
> +++ b/fs/gfs2/sys.c
> @@ -167,10 +167,10 @@ static ssize_t freeze_store(struct gfs2_sbd *sdp, const char *buf, size_t len)
>  
>  	switch (n) {
>  	case 0:
> -		error = thaw_super(sdp->sd_vfs);
> +		error = thaw_super(sdp->sd_vfs, true);
>  		break;
>  	case 1:
> -		error = freeze_super(sdp->sd_vfs);
> +		error = freeze_super(sdp->sd_vfs, true);
>  		break;
>  	default:
>  		deactivate_locked_super(sb);
> diff --git a/fs/gfs2/util.c b/fs/gfs2/util.c
> index 3a0cd5e9ad84..be9705d618ec 100644
> --- a/fs/gfs2/util.c
> +++ b/fs/gfs2/util.c
> @@ -191,7 +191,7 @@ static void signal_our_withdraw(struct gfs2_sbd *sdp)
>  		/* Make sure gfs2_unfreeze works if partially-frozen */
>  		flush_work(&sdp->sd_freeze_work);
>  		atomic_set(&sdp->sd_freeze_state, SFS_FROZEN);
> -		thaw_super(sdp->sd_vfs);
> +		thaw_super(sdp->sd_vfs, true);
>  	} else {
>  		wait_on_bit(&i_gl->gl_flags, GLF_DEMOTE,
>  			    TASK_UNINTERRUPTIBLE);
> diff --git a/fs/ioctl.c b/fs/ioctl.c
> index 3d2536e1ea58..0ac1622785ad 100644
> --- a/fs/ioctl.c
> +++ b/fs/ioctl.c
> @@ -401,7 +401,7 @@ static int ioctl_fsfreeze(struct file *filp)
>  	/* Freeze */
>  	if (sb->s_op->freeze_super)
>  		ret = sb->s_op->freeze_super(sb);
> -	ret = freeze_super(sb);
> +	ret = freeze_super(sb, true);
>  
>  	deactivate_locked_super(sb);
>  
> @@ -418,7 +418,7 @@ static int ioctl_fsthaw(struct file *filp)
>  	/* Thaw */
>  	if (sb->s_op->thaw_super)
>  		return sb->s_op->thaw_super(sb);
> -	return thaw_super(sb);
> +	return thaw_super(sb, true);
>  }
>  
>  static int ioctl_file_dedupe_range(struct file *file,
> diff --git a/fs/super.c b/fs/super.c
> index fdcf5a87af0a..0d6b4de8da88 100644
> --- a/fs/super.c
> +++ b/fs/super.c
> @@ -1004,7 +1004,7 @@ static void do_thaw_all_callback(struct super_block *sb)
>  		return;
>  	if (sb->s_root && sb->s_flags & SB_BORN) {
>  		emergency_thaw_bdev(sb);
> -		thaw_super(sb);
> +		thaw_super(sb, true);
>  	}
>  	deactivate_locked_super(sb);
>  }
> @@ -1614,6 +1614,8 @@ static void sb_freeze_unlock(struct super_block *sb, int level)
>  /**
>   * freeze_super - lock the filesystem and force it into a consistent state
>   * @sb: the super to lock
> + * @usercall: whether or not userspace initiated this via an ioctl or if it
> + * 	was a kernel freeze
>   *
>   * Syncs the super to make sure the filesystem is consistent and calls the fs's
>   * freeze_fs.  Subsequent calls to this without first thawing the fs will return
> @@ -1644,11 +1646,14 @@ static void sb_freeze_unlock(struct super_block *sb, int level)
>   *
>   * sb->s_writers.frozen is protected by sb->s_umount.
>   */
> -int freeze_super(struct super_block *sb)
> +int freeze_super(struct super_block *sb, bool usercall)
>  {
>  	int ret;
>  
> -	if (sb->s_writers.frozen != SB_UNFROZEN)
> +	if (!usercall && sb_is_frozen(sb))
> +		return 0;
> +
> +	if (!sb_is_unfrozen(sb))
>  		return -EBUSY;
>  
>  	if (!(sb->s_flags & SB_BORN))
> @@ -1657,6 +1662,7 @@ int freeze_super(struct super_block *sb)
>  	if (sb_rdonly(sb)) {
>  		/* Nothing to do really... */
>  		sb->s_writers.frozen = SB_FREEZE_COMPLETE;
> +		sb->s_writers.frozen_by_user = usercall;
>  		return 0;
>  	}
>  
> @@ -1674,6 +1680,7 @@ int freeze_super(struct super_block *sb)
>  	ret = sync_filesystem(sb);
>  	if (ret) {
>  		sb->s_writers.frozen = SB_UNFROZEN;
> +		sb->s_writers.frozen_by_user = false;
>  		sb_freeze_unlock(sb, SB_FREEZE_PAGEFAULT);
>  		wake_up(&sb->s_writers.wait_unfrozen);
>  		return ret;
> @@ -1699,6 +1706,7 @@ int freeze_super(struct super_block *sb)
>  	 * when frozen is set to SB_FREEZE_COMPLETE, and for thaw_super().
>  	 */
>  	sb->s_writers.frozen = SB_FREEZE_COMPLETE;
> +	sb->s_writers.frozen_by_user = usercall;
>  	lockdep_sb_freeze_release(sb);
>  	return 0;
>  }
> @@ -1707,18 +1715,30 @@ EXPORT_SYMBOL(freeze_super);
>  /**
>   * thaw_super -- unlock filesystem
>   * @sb: the super to thaw
> + * @usercall: whether or not userspace initiated this thaw or if it was the
> + * 	kernel which initiated it
>   *
>   * Unlocks the filesystem and marks it writeable again after freeze_super().
>   */
> -int thaw_super(struct super_block *sb)
> +int thaw_super(struct super_block *sb, bool usercall)
>  {
>  	int error;
>  
> -	if (sb->s_writers.frozen != SB_FREEZE_COMPLETE)
> +	if (!usercall) {
> +		/*
> +		 * If userspace initiated the freeze don't let the kernel
> +		 * thaw it on return from a kernel initiated freeze.
> +		 */
> +		if (sb_is_unfrozen(sb) || sb_is_frozen_by_user(sb))
> +			return 0;
> +	}
> +
> +	if (!sb_is_frozen(sb))
>  		return -EINVAL;
>  
>  	if (sb_rdonly(sb)) {
>  		sb->s_writers.frozen = SB_UNFROZEN;
> +		sb->s_writers.frozen_by_user = false;
>  		goto out;
>  	}
>  
> @@ -1735,6 +1755,7 @@ int thaw_super(struct super_block *sb)
>  	}
>  
>  	sb->s_writers.frozen = SB_UNFROZEN;
> +	sb->s_writers.frozen_by_user = false;
>  	sb_freeze_unlock(sb, SB_FREEZE_FS);
>  out:
>  	wake_up(&sb->s_writers.wait_unfrozen);
> diff --git a/include/linux/fs.h b/include/linux/fs.h
> index c0cab61f9f9a..3b2586de4364 100644
> --- a/include/linux/fs.h
> +++ b/include/linux/fs.h
> @@ -1129,6 +1129,7 @@ enum {
>  
>  struct sb_writers {
>  	int				frozen;		/* Is sb frozen? */
> +	bool				frozen_by_user;	/* User freeze? */
>  	wait_queue_head_t		wait_unfrozen;	/* wait for thaw */
>  	struct percpu_rw_semaphore	rw_sem[SB_FREEZE_LEVELS];
>  };
> @@ -1615,6 +1616,17 @@ static inline bool sb_is_frozen(struct super_block *sb)
>  	return sb->s_writers.frozen == SB_FREEZE_COMPLETE;
>  }
>  
> +/**
> + * sb_is_frozen_by_user - was the superblock frozen by userspace?
> + * @sb: the super to check
> + *
> + * Returns true if the super is frozen by userspace, such as an ioctl.
> + */
> +static inline bool sb_is_frozen_by_user(struct super_block *sb)
> +{
> +	return sb_is_frozen(sb) && sb->s_writers.frozen_by_user;
> +}
> +
>  /**
>   * sb_is_unfrozen - is superblock unfrozen
>   * @sb: the super to check
> @@ -2292,8 +2304,8 @@ extern int unregister_filesystem(struct file_system_type *);
>  extern int vfs_statfs(const struct path *, struct kstatfs *);
>  extern int user_statfs(const char __user *, struct kstatfs *);
>  extern int fd_statfs(int, struct kstatfs *);
> -extern int freeze_super(struct super_block *super);
> -extern int thaw_super(struct super_block *super);
> +extern int freeze_super(struct super_block *super, bool usercall);
> +extern int thaw_super(struct super_block *super, bool usercall);
>  extern __printf(2, 3)
>  int super_setup_bdi_name(struct super_block *sb, char *fmt, ...);
>  extern int super_setup_bdi(struct super_block *sb);
> -- 
> 2.35.1
> 
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [RFC v3 02/24] fs: add frozen sb state helpers
  2023-01-14  0:33   ` Luis Chamberlain
@ 2023-01-16 16:11     ` Jan Kara
  -1 siblings, 0 replies; 74+ messages in thread
From: Jan Kara @ 2023-01-16 16:11 UTC (permalink / raw)
  To: Luis Chamberlain
  Cc: hch, djwong, song, rafael, gregkh, viro, jack, bvanassche,
	ebiederm, mchehab, keescook, p.raghav, linux-fsdevel, kernel,
	kexec, linux-kernel

On Fri 13-01-23 16:33:47, Luis Chamberlain wrote:
> Provide helpers so that we can check a superblock's frozen state.
> This will make subsequent changes easier to read. This makes
> no functional changes.
> 
> Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>

Sure. Feel free to add:

Reviewed-by: Jan Kara <jack@suse.cz>

								Honza
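
A minimal userspace sketch of the two helpers, assuming the freeze
levels from include/linux/fs.h (SB_UNFROZEN = 0 through
SB_FREEZE_COMPLETE = 4). struct mock_sb and mock_sb_is_freezing() are
hypothetical stand-ins for illustration and are not part of the patch.
The sketch shows that the helpers are not complements of each other:
while a freeze is in progress neither one returns true, so
!sb_is_unfrozen() and sb_is_frozen() are different checks.

```c
#include <assert.h>
#include <stdbool.h>

/* Freeze levels, mirroring the enum in include/linux/fs.h. */
enum {
	SB_UNFROZEN = 0,
	SB_FREEZE_WRITE = 1,
	SB_FREEZE_PAGEFAULT = 2,
	SB_FREEZE_FS = 3,
	SB_FREEZE_COMPLETE = 4,
};

/* Stand-in for struct super_block; only the field the helpers read. */
struct mock_sb {
	int frozen;
};

/* Same test as the helper in the patch: the freeze has fully completed. */
static inline bool sb_is_frozen(const struct mock_sb *sb)
{
	return sb->frozen == SB_FREEZE_COMPLETE;
}

/* Same test as the helper in the patch: no freeze is active at all. */
static inline bool sb_is_unfrozen(const struct mock_sb *sb)
{
	return sb->frozen == SB_UNFROZEN;
}

/* Hypothetical helper, not in the patch: a freeze is under way. */
static inline bool mock_sb_is_freezing(const struct mock_sb *sb)
{
	return !sb_is_frozen(sb) && !sb_is_unfrozen(sb);
}
```

A caller that used to test "frozen != SB_UNFROZEN" therefore has to be
converted to !sb_is_unfrozen(), not to sb_is_frozen(), to keep its
behaviour for the intermediate levels.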

> ---
>  fs/ext4/ext4_jbd2.c |  2 +-
>  fs/gfs2/sys.c       |  2 +-
>  fs/quota/quota.c    |  4 ++--
>  fs/super.c          |  4 ++--
>  fs/xfs/xfs_trans.c  |  3 +--
>  include/linux/fs.h  | 22 ++++++++++++++++++++++
>  6 files changed, 29 insertions(+), 8 deletions(-)
> 
> diff --git a/fs/ext4/ext4_jbd2.c b/fs/ext4/ext4_jbd2.c
> index 77f318ec8abb..ef441f15053b 100644
> --- a/fs/ext4/ext4_jbd2.c
> +++ b/fs/ext4/ext4_jbd2.c
> @@ -72,7 +72,7 @@ static int ext4_journal_check_start(struct super_block *sb)
>  
>  	if (sb_rdonly(sb))
>  		return -EROFS;
> -	WARN_ON(sb->s_writers.frozen == SB_FREEZE_COMPLETE);
> +	WARN_ON(sb_is_frozen(sb));
>  	journal = EXT4_SB(sb)->s_journal;
>  	/*
>  	 * Special case here: if the journal has aborted behind our
> diff --git a/fs/gfs2/sys.c b/fs/gfs2/sys.c
> index d0b80552a678..b98be03d0d1e 100644
> --- a/fs/gfs2/sys.c
> +++ b/fs/gfs2/sys.c
> @@ -146,7 +146,7 @@ static ssize_t uuid_show(struct gfs2_sbd *sdp, char *buf)
>  static ssize_t freeze_show(struct gfs2_sbd *sdp, char *buf)
>  {
>  	struct super_block *sb = sdp->sd_vfs;
> -	int frozen = (sb->s_writers.frozen == SB_UNFROZEN) ? 0 : 1;
> +	int frozen = sb_is_unfrozen(sb) ? 0 : 1;
>  
>  	return snprintf(buf, PAGE_SIZE, "%d\n", frozen);
>  }
> diff --git a/fs/quota/quota.c b/fs/quota/quota.c
> index 052f143e2e0e..d8147c21bf03 100644
> --- a/fs/quota/quota.c
> +++ b/fs/quota/quota.c
> @@ -890,13 +890,13 @@ static struct super_block *quotactl_block(const char __user *special, int cmd)
>  	sb = user_get_super(dev, excl);
>  	if (!sb)
>  		return ERR_PTR(-ENODEV);
> -	if (thawed && sb->s_writers.frozen != SB_UNFROZEN) {
> +	if (thawed && !sb_is_unfrozen(sb)) {
>  		if (excl)
>  			up_write(&sb->s_umount);
>  		else
>  			up_read(&sb->s_umount);
>  		wait_event(sb->s_writers.wait_unfrozen,
> -			   sb->s_writers.frozen == SB_UNFROZEN);
> +			   sb_is_unfrozen(sb));
>  		put_super(sb);
>  		goto retry;
>  	}
> diff --git a/fs/super.c b/fs/super.c
> index a31a41b313f3..fdcf5a87af0a 100644
> --- a/fs/super.c
> +++ b/fs/super.c
> @@ -883,7 +883,7 @@ int reconfigure_super(struct fs_context *fc)
>  
>  	if (fc->sb_flags_mask & ~MS_RMT_MASK)
>  		return -EINVAL;
> -	if (sb->s_writers.frozen != SB_UNFROZEN)
> +	if (!sb_is_unfrozen(sb))
>  		return -EBUSY;
>  
>  	retval = security_sb_remount(sb, fc->security);
> @@ -907,7 +907,7 @@ int reconfigure_super(struct fs_context *fc)
>  			down_write(&sb->s_umount);
>  			if (!sb->s_root)
>  				return 0;
> -			if (sb->s_writers.frozen != SB_UNFROZEN)
> +			if (!sb_is_unfrozen(sb))
>  				return -EBUSY;
>  			remount_ro = !sb_rdonly(sb);
>  		}
> diff --git a/fs/xfs/xfs_trans.c b/fs/xfs/xfs_trans.c
> index 7bd16fbff534..ceb4890a4c96 100644
> --- a/fs/xfs/xfs_trans.c
> +++ b/fs/xfs/xfs_trans.c
> @@ -267,8 +267,7 @@ xfs_trans_alloc(
>  	 * Zero-reservation ("empty") transactions can't modify anything, so
>  	 * they're allowed to run while we're frozen.
>  	 */
> -	WARN_ON(resp->tr_logres > 0 &&
> -		mp->m_super->s_writers.frozen == SB_FREEZE_COMPLETE);
> +	WARN_ON(resp->tr_logres > 0 && sb_is_frozen(mp->m_super));
>  	ASSERT(!(flags & XFS_TRANS_RES_FDBLKS) ||
>  	       xfs_has_lazysbcount(mp));
>  
> diff --git a/include/linux/fs.h b/include/linux/fs.h
> index 5042f5ab74a4..c0cab61f9f9a 100644
> --- a/include/linux/fs.h
> +++ b/include/linux/fs.h
> @@ -1604,6 +1604,28 @@ static inline bool sb_start_intwrite_trylock(struct super_block *sb)
>  	return __sb_start_write_trylock(sb, SB_FREEZE_FS);
>  }
>  
> +/**
> + * sb_is_frozen - is superblock frozen
> + * @sb: the super to check
> + *
> + * Returns true if the super is frozen.
> + */
> +static inline bool sb_is_frozen(struct super_block *sb)
> +{
> +	return sb->s_writers.frozen == SB_FREEZE_COMPLETE;
> +}
> +
> +/**
> + * sb_is_unfrozen - is superblock unfrozen
> + * @sb: the super to check
> + *
> + * Returns true if the super is unfrozen.
> + */
> +static inline bool sb_is_unfrozen(struct super_block *sb)
> +{
> +	return sb->s_writers.frozen == SB_UNFROZEN;
> +}
> +
>  bool inode_owner_or_capable(struct user_namespace *mnt_userns,
>  			    const struct inode *inode);
>  
> -- 
> 2.35.1
> 
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [RFC v3 03/24] fs: distinguish between user initiated freeze and kernel initiated freeze
  2023-01-14  0:33   ` Luis Chamberlain
@ 2023-01-18  2:25     ` Darrick J. Wong
  -1 siblings, 0 replies; 74+ messages in thread
From: Darrick J. Wong @ 2023-01-18  2:25 UTC (permalink / raw)
  To: Luis Chamberlain
  Cc: hch, song, rafael, gregkh, viro, jack, bvanassche, ebiederm,
	mchehab, keescook, p.raghav, linux-fsdevel, kernel, kexec,
	linux-kernel, xfs

[add linux-xfs to cc on this one]

On Fri, Jan 13, 2023 at 04:33:48PM -0800, Luis Chamberlain wrote:
> Userspace can initiate a freeze call using ioctls. If the kernel decides
> to freeze a filesystem later it must be able to distinguish if userspace
> had initiated the freeze, so that it does not unfreeze it later
> automatically on resume.

Hm.  Zooming out a bit here, I want to think about how kernel freezes
should behave...

> Likewise if the kernel is initiating a freeze on its own it should *not*
> fail to freeze a filesystem if a user had already frozen it on our behalf.

...because kernel freezes can absorb an existing userspace freeze.  Does
that mean that userspace should be prevented from undoing a kernel
freeze?  Even in that absorption case?

Also, should we permit multiple kernel freezes of the same fs at the
same time?  And if we do allow that, would they nest like freeze used to
do?

(My suggestions here are 'yes', 'yes', and '**** no'.)

The reason I ask (besides wanting to drop the xfs vs. suspend fix
that I've been carrying for years) is that I've been playing in this
space in the online fsck patchset[1].

[1]
https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux.git/commit/?h=repair-fscounters&id=3f842a53b29f70502a4331b34decb06bf46130c8

For this somewhat different use case, I need to stabilize the free block
and inode counters in the incore xfs superblock so that I can check and
repair them.  To do that, I've forked enough of the vfs freeze code
(yuck) to block the filesystem from updating those counters or starting
new transactions.  To prevent anyone /else/ from thawing the fs, I set
sb->s_writers.frozen to an unknown value (SB_FREEZE_COMPLETE + 1) for
the duration.
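
A rough userspace sketch of why the out-of-range value keeps everyone
else out, assuming only the pre-patch entry checks of freeze_super()
and thaw_super() visible as removed lines in the diff. struct mock_sb,
the mock_* functions and the MOCK_FSCK_FROZEN name are hypothetical;
the real code takes locks and does much more.

```c
#include <assert.h>
#include <errno.h>

/* The two freeze levels the entry checks compare against. */
enum {
	SB_UNFROZEN = 0,
	SB_FREEZE_COMPLETE = 4,
};

/* Hypothetical name for the "unknown" value described above. */
#define MOCK_FSCK_FROZEN (SB_FREEZE_COMPLETE + 1)

/* Stand-in for struct super_block (illustration only). */
struct mock_sb {
	int frozen;
};

/* Pre-patch entry check of freeze_super(): anything but unfrozen is busy. */
static inline int mock_freeze_super(struct mock_sb *sb)
{
	if (sb->frozen != SB_UNFROZEN)
		return -EBUSY;
	sb->frozen = SB_FREEZE_COMPLETE;
	return 0;
}

/* Pre-patch entry check of thaw_super(): only a complete freeze thaws. */
static inline int mock_thaw_super(struct mock_sb *sb)
{
	if (sb->frozen != SB_FREEZE_COMPLETE)
		return -EINVAL;
	sb->frozen = SB_UNFROZEN;
	return 0;
}
```

With frozen set to MOCK_FSCK_FROZEN, mock_thaw_super() returns -EINVAL
and mock_freeze_super() returns -EBUSY, so neither path can change the
state until the owner puts a known value back.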

I /think/ these are pretty similar concepts, with two differences:

1. nobody else may thaw the fs while fsck is running

2. online fsck doesn't need to quiesce the log, which means that suspend
   must wait for fsck to finish

> This same concept applies to thawing, even if it's not possible for
> userspace to beat the kernel in thawing a filesystem. This logic however
> has never applied to userspace freezing and thawing: two consecutive
> userspace freeze calls will result in only the first one succeeding, so
> we must retain the same behaviour in userspace.

(ISTR that we used to allow nested freezes, but that's been gone for
years.)

> This doesn't yet implement kernel initiated filesystem freeze calls,
> this will be done in subsequent patches. This change should introduce
> no functional changes, it just extends the definitions of a frozen
> filesystem to account for future kernel initiated filesystem freeze
> and lets us keep record of when userspace initiated it so the kernel
> can respect a userspace initiated freeze upon kernel initiated freeze
> and its respective thaw cycle.
> 
> Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
> ---
>  block/bdev.c       |  4 ++--
>  fs/f2fs/gc.c       |  4 ++--
>  fs/gfs2/glops.c    |  2 +-
>  fs/gfs2/super.c    |  2 +-
>  fs/gfs2/sys.c      |  4 ++--
>  fs/gfs2/util.c     |  2 +-
>  fs/ioctl.c         |  4 ++--
>  fs/super.c         | 31 ++++++++++++++++++++++++++-----
>  include/linux/fs.h | 16 ++++++++++++++--
>  9 files changed, 51 insertions(+), 18 deletions(-)
> 
> diff --git a/block/bdev.c b/block/bdev.c
> index 8fd3a7991c02..668ebf2015bf 100644
> --- a/block/bdev.c
> +++ b/block/bdev.c
> @@ -250,7 +250,7 @@ int freeze_bdev(struct block_device *bdev)
>  	if (sb->s_op->freeze_super)
>  		error = sb->s_op->freeze_super(sb);
>  	else
> -		error = freeze_super(sb);
> +		error = freeze_super(sb, true);
>  	deactivate_locked_super(sb);
>  
>  	if (error) {
> @@ -295,7 +295,7 @@ int thaw_bdev(struct block_device *bdev)
>  	if (sb->s_op->thaw_super)
>  		error = sb->s_op->thaw_super(sb);
>  	else
> -		error = thaw_super(sb);
> +		error = thaw_super(sb, true);
>  	if (error)
>  		bdev->bd_fsfreeze_count++;
>  	else
> diff --git a/fs/f2fs/gc.c b/fs/f2fs/gc.c
> index 4c681fe487ee..8eac3042786b 100644
> --- a/fs/f2fs/gc.c
> +++ b/fs/f2fs/gc.c
> @@ -2141,7 +2141,7 @@ int f2fs_resize_fs(struct f2fs_sb_info *sbi, __u64 block_count)
>  
>  	if (!get_active_super(sbi->sb->s_bdev))
>  		return -ENOTTY;
> -	freeze_super(sbi->sb);
> +	freeze_super(sbi->sb, true);
>  
>  	f2fs_down_write(&sbi->gc_lock);
>  	f2fs_down_write(&sbi->cp_global_sem);
> @@ -2194,7 +2194,7 @@ int f2fs_resize_fs(struct f2fs_sb_info *sbi, __u64 block_count)
>  	f2fs_up_write(&sbi->cp_global_sem);
>  	f2fs_up_write(&sbi->gc_lock);
>  	/* We use the same active reference from freeze */
> -	thaw_super(sbi->sb);
> +	thaw_super(sbi->sb, true);
>  	deactivate_locked_super(sbi->sb);
>  	return err;
>  }
> diff --git a/fs/gfs2/glops.c b/fs/gfs2/glops.c
> index 081422644ec5..62a7e0693efa 100644
> --- a/fs/gfs2/glops.c
> +++ b/fs/gfs2/glops.c
> @@ -574,7 +574,7 @@ static int freeze_go_sync(struct gfs2_glock *gl)
>  	if (gl->gl_state == LM_ST_SHARED && !gfs2_withdrawn(sdp) &&
>  	    !test_bit(SDF_NORECOVERY, &sdp->sd_flags)) {
>  		atomic_set(&sdp->sd_freeze_state, SFS_STARTING_FREEZE);
> -		error = freeze_super(sdp->sd_vfs);
> +		error = freeze_super(sdp->sd_vfs, true);
>  		if (error) {
>  			fs_info(sdp, "GFS2: couldn't freeze filesystem: %d\n",
>  				error);
> diff --git a/fs/gfs2/super.c b/fs/gfs2/super.c
> index 48df7b276b64..9c55b8042aa4 100644
> --- a/fs/gfs2/super.c
> +++ b/fs/gfs2/super.c
> @@ -672,7 +672,7 @@ void gfs2_freeze_func(struct work_struct *work)
>  		gfs2_assert_withdraw(sdp, 0);
>  	} else {
>  		atomic_set(&sdp->sd_freeze_state, SFS_UNFROZEN);
> -		error = thaw_super(sb);
> +		error = thaw_super(sb, true);
>  		if (error) {
>  			fs_info(sdp, "GFS2: couldn't thaw filesystem: %d\n",
>  				error);
> diff --git a/fs/gfs2/sys.c b/fs/gfs2/sys.c
> index b98be03d0d1e..69514294215b 100644
> --- a/fs/gfs2/sys.c
> +++ b/fs/gfs2/sys.c
> @@ -167,10 +167,10 @@ static ssize_t freeze_store(struct gfs2_sbd *sdp, const char *buf, size_t len)
>  
>  	switch (n) {
>  	case 0:
> -		error = thaw_super(sdp->sd_vfs);
> +		error = thaw_super(sdp->sd_vfs, true);
>  		break;
>  	case 1:
> -		error = freeze_super(sdp->sd_vfs);
> +		error = freeze_super(sdp->sd_vfs, true);
>  		break;
>  	default:
>  		deactivate_locked_super(sb);
> diff --git a/fs/gfs2/util.c b/fs/gfs2/util.c
> index 3a0cd5e9ad84..be9705d618ec 100644
> --- a/fs/gfs2/util.c
> +++ b/fs/gfs2/util.c
> @@ -191,7 +191,7 @@ static void signal_our_withdraw(struct gfs2_sbd *sdp)
>  		/* Make sure gfs2_unfreeze works if partially-frozen */
>  		flush_work(&sdp->sd_freeze_work);
>  		atomic_set(&sdp->sd_freeze_state, SFS_FROZEN);
> -		thaw_super(sdp->sd_vfs);
> +		thaw_super(sdp->sd_vfs, true);
>  	} else {
>  		wait_on_bit(&i_gl->gl_flags, GLF_DEMOTE,
>  			    TASK_UNINTERRUPTIBLE);
> diff --git a/fs/ioctl.c b/fs/ioctl.c
> index 3d2536e1ea58..0ac1622785ad 100644
> --- a/fs/ioctl.c
> +++ b/fs/ioctl.c
> @@ -401,7 +401,7 @@ static int ioctl_fsfreeze(struct file *filp)
>  	/* Freeze */
>  	if (sb->s_op->freeze_super)
>  		ret = sb->s_op->freeze_super(sb);
> -	ret = freeze_super(sb);
> +	ret = freeze_super(sb, true);
>  
>  	deactivate_locked_super(sb);
>  
> @@ -418,7 +418,7 @@ static int ioctl_fsthaw(struct file *filp)
>  	/* Thaw */
>  	if (sb->s_op->thaw_super)
>  		return sb->s_op->thaw_super(sb);
> -	return thaw_super(sb);
> +	return thaw_super(sb, true);
>  }
>  
>  static int ioctl_file_dedupe_range(struct file *file,
> diff --git a/fs/super.c b/fs/super.c
> index fdcf5a87af0a..0d6b4de8da88 100644
> --- a/fs/super.c
> +++ b/fs/super.c
> @@ -1004,7 +1004,7 @@ static void do_thaw_all_callback(struct super_block *sb)
>  		return;
>  	if (sb->s_root && sb->s_flags & SB_BORN) {
>  		emergency_thaw_bdev(sb);
> -		thaw_super(sb);
> +		thaw_super(sb, true);
>  	}
>  	deactivate_locked_super(sb);
>  }
> @@ -1614,6 +1614,8 @@ static void sb_freeze_unlock(struct super_block *sb, int level)
>  /**
>   * freeze_super - lock the filesystem and force it into a consistent state
>   * @sb: the super to lock
> + * @usercall: whether or not userspace initiated this via an ioctl or if it
> + * 	was a kernel freeze
>   *
>   * Syncs the super to make sure the filesystem is consistent and calls the fs's
>   * freeze_fs.  Subsequent calls to this without first thawing the fs will return
> @@ -1644,11 +1646,14 @@ static void sb_freeze_unlock(struct super_block *sb, int level)
>   *
>   * sb->s_writers.frozen is protected by sb->s_umount.
>   */
> -int freeze_super(struct super_block *sb)
> +int freeze_super(struct super_block *sb, bool usercall)
>  {
>  	int ret;
>  
> -	if (sb->s_writers.frozen != SB_UNFROZEN)
> +	if (!usercall && sb_is_frozen(sb))
> +		return 0;

Hrm.  Are user freezes capable of thawing a kernel freeze?  Let's say
the following happens:

1. userspace calls FIFREEZE

2. kernel calls freeze_super(sb, false) due to suspend

3. "Freezing filesystems..." step completes, process gets preempted

4. userspace calls FITHAW

AFAICT at this point the fs is now thawed, but the freezer thinks it
finished freezing all filesystems.  That's not good, I don't think.

Also: does hibernation need to wake the fs back up?  I hope it doesn't,
but I do not know.
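
To make the sequence above concrete, here is a minimal userspace model of
the early-return checks in the patched freeze_super() / thaw_super().  It
is only a sketch: struct model_sb, model_freeze(), model_thaw() and
model_replay_race() are hypothetical names, no locking or writer blocking
is modeled, and the suspend-side call is assumed to pass usercall == false,
as fs_suspend_freeze_sb() does in patch 05/24.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-ins for SB_UNFROZEN / SB_FREEZE_COMPLETE. */
enum { MODEL_UNFROZEN, MODEL_FREEZE_COMPLETE };

/* Hypothetical stand-in for the two sb_writers fields the patch touches. */
struct model_sb {
	int frozen;
	bool frozen_by_user;
};

/* Follows the early-return checks of the patched freeze_super(). */
int model_freeze(struct model_sb *sb, bool usercall)
{
	if (!usercall && sb->frozen == MODEL_FREEZE_COMPLETE)
		return 0;	/* kernel freeze piggybacks on the existing one */
	if (sb->frozen != MODEL_UNFROZEN)
		return -EBUSY;
	sb->frozen = MODEL_FREEZE_COMPLETE;
	sb->frozen_by_user = usercall;
	return 0;
}

/* Follows the early-return checks of the patched thaw_super(). */
int model_thaw(struct model_sb *sb, bool usercall)
{
	bool by_user = sb->frozen == MODEL_FREEZE_COMPLETE &&
		       sb->frozen_by_user;

	if (!usercall && (sb->frozen == MODEL_UNFROZEN || by_user))
		return 0;	/* kernel leaves a userspace freeze alone */
	if (sb->frozen != MODEL_FREEZE_COMPLETE)
		return -EINVAL;
	sb->frozen = MODEL_UNFROZEN;
	sb->frozen_by_user = false;
	return 0;
}

/*
 * Replays steps 1-4.  Returns true if the fs ends up thawed even though
 * the suspend-side freeze reported success.
 */
bool model_replay_race(void)
{
	struct model_sb sb = { MODEL_UNFROZEN, false };
	int ret;

	ret = model_freeze(&sb, true);		/* 1. FIFREEZE */
	assert(ret == 0);
	ret = model_freeze(&sb, false);		/* 2-3. suspend freeze "succeeds" */
	if (ret)
		return false;
	ret = model_thaw(&sb, true);		/* 4. FITHAW */
	return ret == 0 && sb.frozen == MODEL_UNFROZEN;
}
```

In this model the FITHAW in step 4 succeeds because the only record of who
froze the fs is the single frozen_by_user flag, which still says
"userspace" after the kernel freeze was absorbed.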

> +
> +	if (!sb_is_unfrozen(sb))
>  		return -EBUSY;
>  
>  	if (!(sb->s_flags & SB_BORN))
> @@ -1657,6 +1662,7 @@ int freeze_super(struct super_block *sb)
>  	if (sb_rdonly(sb)) {
>  		/* Nothing to do really... */
>  		sb->s_writers.frozen = SB_FREEZE_COMPLETE;
> +		sb->s_writers.frozen_by_user = usercall;
>  		return 0;
>  	}
>  
> @@ -1674,6 +1680,7 @@ int freeze_super(struct super_block *sb)
>  	ret = sync_filesystem(sb);
>  	if (ret) {
>  		sb->s_writers.frozen = SB_UNFROZEN;
> +		sb->s_writers.frozen_by_user = false;
>  		sb_freeze_unlock(sb, SB_FREEZE_PAGEFAULT);
>  		wake_up(&sb->s_writers.wait_unfrozen);
>  		return ret;
> @@ -1699,6 +1706,7 @@ int freeze_super(struct super_block *sb)
>  	 * when frozen is set to SB_FREEZE_COMPLETE, and for thaw_super().
>  	 */
>  	sb->s_writers.frozen = SB_FREEZE_COMPLETE;
> +	sb->s_writers.frozen_by_user = usercall;
>  	lockdep_sb_freeze_release(sb);
>  	return 0;
>  }
> @@ -1707,18 +1715,30 @@ EXPORT_SYMBOL(freeze_super);
>  /**
>   * thaw_super -- unlock filesystem
>   * @sb: the super to thaw
> + * @usercall: whether or not userspace initiated this thaw or if it was the
> + * 	kernel which initiated it
>   *
>   * Unlocks the filesystem and marks it writeable again after freeze_super().
>   */
> -int thaw_super(struct super_block *sb)
> +int thaw_super(struct super_block *sb, bool usercall)
>  {
>  	int error;
>  
> -	if (sb->s_writers.frozen != SB_FREEZE_COMPLETE)
> +	if (!usercall) {
> +		/*
> +		 * If userspace initiated the freeze don't let the kernel
> +		 * thaw it on return from a kernel initiated freeze.
> +		 */
> +		if (sb_is_unfrozen(sb) || sb_is_frozen_by_user(sb))
> +			return 0;
> +	}
> +
> +	if (!sb_is_frozen(sb))
>  		return -EINVAL;

I guess the downside of implementing my ramblings above is that now
userspace can freeze and thaw the fs, and the thaw can return EINVAL
because the program is racing with a suspend.

--D

>  
>  	if (sb_rdonly(sb)) {
>  		sb->s_writers.frozen = SB_UNFROZEN;
> +		sb->s_writers.frozen_by_user = false;
>  		goto out;
>  	}
>  
> @@ -1735,6 +1755,7 @@ int thaw_super(struct super_block *sb)
>  	}
>  
>  	sb->s_writers.frozen = SB_UNFROZEN;
> +	sb->s_writers.frozen_by_user = false;
>  	sb_freeze_unlock(sb, SB_FREEZE_FS);
>  out:
>  	wake_up(&sb->s_writers.wait_unfrozen);
> diff --git a/include/linux/fs.h b/include/linux/fs.h
> index c0cab61f9f9a..3b2586de4364 100644
> --- a/include/linux/fs.h
> +++ b/include/linux/fs.h
> @@ -1129,6 +1129,7 @@ enum {
>  
>  struct sb_writers {
>  	int				frozen;		/* Is sb frozen? */
> +	bool				frozen_by_user;	/* User freeze? */
>  	wait_queue_head_t		wait_unfrozen;	/* wait for thaw */
>  	struct percpu_rw_semaphore	rw_sem[SB_FREEZE_LEVELS];
>  };
> @@ -1615,6 +1616,17 @@ static inline bool sb_is_frozen(struct super_block *sb)
>  	return sb->s_writers.frozen == SB_FREEZE_COMPLETE;
>  }
>  
> +/**
> + * sb_is_frozen_by_user - was the superblock frozen by userspace?
> + * @sb: the super to check
> + *
> + * Returns true if the super is frozen by userspace, such as an ioctl.
> + */
> +static inline bool sb_is_frozen_by_user(struct super_block *sb)
> +{
> +	return sb_is_frozen(sb) && sb->s_writers.frozen_by_user;
> +}
> +
>  /**
>   * sb_is_unfrozen - is superblock unfrozen
>   * @sb: the super to check
> @@ -2292,8 +2304,8 @@ extern int unregister_filesystem(struct file_system_type *);
>  extern int vfs_statfs(const struct path *, struct kstatfs *);
>  extern int user_statfs(const char __user *, struct kstatfs *);
>  extern int fd_statfs(int, struct kstatfs *);
> -extern int freeze_super(struct super_block *super);
> -extern int thaw_super(struct super_block *super);
> +extern int freeze_super(struct super_block *super, bool usercall);
> +extern int thaw_super(struct super_block *super, bool usercall);
>  extern __printf(2, 3)
>  int super_setup_bdi_name(struct super_block *sb, char *fmt, ...);
>  extern int super_setup_bdi(struct super_block *sb);
> -- 
> 2.35.1
> 

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [RFC v3 03/24] fs: distinguish between user initiated freeze and kernel initiated freeze
  2023-01-18  2:25     ` Darrick J. Wong
@ 2023-01-18  9:28       ` Jan Kara
  -1 siblings, 0 replies; 74+ messages in thread
From: Jan Kara @ 2023-01-18  9:28 UTC (permalink / raw)
  To: Darrick J. Wong
  Cc: Luis Chamberlain, hch, song, rafael, gregkh, viro, jack,
	bvanassche, ebiederm, mchehab, keescook, p.raghav, linux-fsdevel,
	kernel, kexec, linux-kernel, xfs

On Tue 17-01-23 18:25:40, Darrick J. Wong wrote:
> [add linux-xfs to cc on this one]
> 
> On Fri, Jan 13, 2023 at 04:33:48PM -0800, Luis Chamberlain wrote:
> > Userspace can initiate a freeze call using ioctls. If the kernel decides
> > to freeze a filesystem later it must be able to distinguish if userspace
> > had initiated the freeze, so that it does not unfreeze it later
> > automatically on resume.
> 
> Hm.  Zooming out a bit here, I want to think about how kernel freezes
> should behave...
> 
> > Likewise if the kernel is initiating a freeze on its own it should *not*
> > fail to freeze a filesystem if a user had already frozen it on our behalf.
> 
> ...because kernel freezes can absorb an existing userspace freeze.  Does
> that mean that userspace should be prevented from undoing a kernel
> freeze?  Even in that absorption case?
> 
> Also, should we permit multiple kernel freezes of the same fs at the
> same time?  And if we do allow that, would they nest like freeze used to
> do?
> 
> (My suggestions here are 'yes', 'yes', and '**** no'.)

Yeah, makes sense to me. So I think the mental model to make things safe
is that there are two flags - frozen_by_user, frozen_by_kernel - and the
superblock is kept frozen as long as either of these is set.

								Honza
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR
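
The two-flag mental model from the reply above could look roughly like the
following userspace sketch.  Everything here is an assumption made for
illustration: struct two_flag_sb and the two_flag_*() helpers are
hypothetical names, a repeated freeze by the same holder is rejected with
-EBUSY (no nesting), and a thaw by one holder returns 0 even if the other
holder keeps the fs frozen.  The thread does not settle either return code.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical model: one flag per freeze holder. */
struct two_flag_sb {
	bool frozen_by_user;
	bool frozen_by_kernel;
};

/* The superblock stays frozen while either holder is set. */
bool two_flag_is_frozen(const struct two_flag_sb *sb)
{
	return sb->frozen_by_user || sb->frozen_by_kernel;
}

/* Assumption: a repeated freeze by the same holder fails (no nesting). */
int two_flag_freeze(struct two_flag_sb *sb, bool usercall)
{
	bool *mine;

	assert(sb);
	mine = usercall ? &sb->frozen_by_user : &sb->frozen_by_kernel;
	if (*mine)
		return -EBUSY;
	/* Real code would sync and block writers only if not yet frozen. */
	*mine = true;
	return 0;
}

/* A holder can only drop its own flag, never the other holder's. */
int two_flag_thaw(struct two_flag_sb *sb, bool usercall)
{
	bool *mine;

	assert(sb);
	mine = usercall ? &sb->frozen_by_user : &sb->frozen_by_kernel;
	if (!*mine)
		return -EINVAL;
	*mine = false;
	/* Real code would unblock writers only once neither flag is set. */
	return 0;
}
```

With this shape, a FITHAW issued while a suspend freeze is in place only
clears frozen_by_user, and two_flag_is_frozen() keeps returning true until
the kernel side thaws as well.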

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [RFC v3 05/24] fs: add automatic kernel fs freeze / thaw and remove kthread freezing
  2023-01-14  0:33   ` Luis Chamberlain
@ 2023-02-24  3:08     ` Darrick J. Wong
  -1 siblings, 0 replies; 74+ messages in thread
From: Darrick J. Wong @ 2023-02-24  3:08 UTC (permalink / raw)
  To: Luis Chamberlain
  Cc: hch, song, rafael, gregkh, viro, jack, bvanassche, ebiederm,
	mchehab, keescook, p.raghav, linux-fsdevel, kernel, kexec,
	linux-kernel

On Fri, Jan 13, 2023 at 04:33:50PM -0800, Luis Chamberlain wrote:
> Add support to automatically handle freezing and thawing filesystems
> during the kernel's suspend/resume cycle.
> 
> This is needed so that we properly stop IO in flight without
> races after userspace has been frozen. Without this we rely on
> kthread freezing, and its semantics are loose and error prone.
> For instance, even though a kthread may use try_to_freeze() and end
> up being frozen, we have no way of being sure that everything that
> has been spawned asynchronously from it (such as timers) has also
> been stopped.
> 
> A long term advantage of adding filesystem freeze / thaw support
> during suspend / hibernation is that we may eventually be able to
> drop the kernel's thread freezing completely, as it was originally
> added to stop disk IO in flight as we hibernate or suspend.

Hooray!

One evil question though --

Say you have dm devices A and B.  Each has a distinct fs on it.
If you mount A and then B and initiate a suspend, that should result in
first B and then A freezing, right?

After resuming, you then change A's dm-table definition to point it
at a loop device backed by a file on B.

What happens now when you initiate a suspend?  B freezes, then A tries
to flush data to the loop-mounted file on B, but it's too late for that.
That sounds like a deadlock?

Though I don't know how much we care about this corner case, since (a)
freezing has been busted on xfs for years and (b) one can think up all
sorts of horrid ouroboros scenarios like:

Change A's dm-table to point to a loop-mounted file on B, and change B's
dm-table to point to a loop-mounted file on A.  Then try to write to
either filesystem and see what kind of storm you get.

Anyway, just wondering if you'd thought about that kind of doomsday
scenario that a nutty sysadmin could set up.

The only way I can think of to solve that kind of thing would be to hook
filesystems and loop devices into the device model, make fs "device"
suspend actually freeze, hope the suspend code suspends from the leaves
inward, and hope I actually understand how the device model works (I
don't.)
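
As a rough illustration of the "suspend from the leaves inward" idea, the
sketch below orders filesystems so that each one is frozen before the
filesystem holding its backing file, and reports a cycle for the
A-on-B-on-A case.  This is a userspace model under stated assumptions, not
how the device model or the VFS works: model_freeze_order(), the backing[]
array and the MODEL_* constants are hypothetical, and each filesystem is
assumed to sit on at most one other filesystem.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define MODEL_MAX_FS		8
#define MODEL_NO_BACKING	(-1)

/*
 * backing[i] is the index of the filesystem holding the file that backs
 * filesystem i, or MODEL_NO_BACKING.  On success order[] lists every
 * filesystem ahead of the one it is stacked on, so it can be frozen first.
 * Returns -ELOOP if the stacking forms a cycle.
 */
int model_freeze_order(const int *backing, size_t n, size_t *order)
{
	size_t stacked_on_me[MODEL_MAX_FS] = { 0 };
	size_t emitted = 0, scan = 0, i;

	assert(backing && order);
	if (n > MODEL_MAX_FS)
		return -EINVAL;

	for (i = 0; i < n; i++) {
		if (backing[i] == MODEL_NO_BACKING)
			continue;
		if (backing[i] < 0 || (size_t)backing[i] >= n)
			return -EINVAL;
		stacked_on_me[backing[i]]++;
	}

	/* Leaves: nothing is stacked on them, so they can freeze first. */
	for (i = 0; i < n; i++)
		if (stacked_on_me[i] == 0)
			order[emitted++] = i;

	/* order[] doubles as the work queue (Kahn's algorithm). */
	while (scan < emitted) {
		int below = backing[order[scan++]];

		if (below != MODEL_NO_BACKING && --stacked_on_me[below] == 0)
			order[emitted++] = (size_t)below;
	}

	/* Anything left over sits on a cycle and has no safe freeze order. */
	return emitted == n ? 0 : -ELOOP;
}
```

With A as index 0 and B as index 1, backing = { 1, MODEL_NO_BACKING }
yields the order A, B, while backing = { 1, 0 } is rejected with -ELOOP.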

--D

> This does not remove the superfluous freezer calls on all filesystems.
> Each filesystem must remove all the kthread freezer stuff and peg
> the fs_type flags as supporting auto-freezing with the FS_AUTOFREEZE
> flag.
> 
> Subsequent patches remove the kthread freezer usage from each
> filesystem, one at a time to make all this work bisectable.
> Once all filesystems remove the usage of the kthread freezer we
> can remove the FS_AUTOFREEZE flag.
> 
> Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
> ---
>  fs/super.c             | 69 ++++++++++++++++++++++++++++++++++++++++++
>  include/linux/fs.h     | 14 +++++++++
>  kernel/power/process.c | 15 ++++++++-
>  3 files changed, 97 insertions(+), 1 deletion(-)
> 
> diff --git a/fs/super.c b/fs/super.c
> index 2f77fcb6e555..e8af4c8269ad 100644
> --- a/fs/super.c
> +++ b/fs/super.c
> @@ -1853,3 +1853,72 @@ int thaw_super(struct super_block *sb, bool usercall)
>  	return 0;
>  }
>  EXPORT_SYMBOL(thaw_super);
> +
> +#ifdef CONFIG_PM_SLEEP
> +static bool super_should_freeze(struct super_block *sb)
> +{
> +	if (!(sb->s_type->fs_flags & FS_AUTOFREEZE))
> +		return false;
> +	/*
> +	 * We don't freeze virtual filesystems, we skip those filesystems with
> +	 * no backing device.
> +	 */
> +	if (sb->s_bdi == &noop_backing_dev_info)
> +		return false;
> +
> +	return true;
> +}
> +
> +int fs_suspend_freeze_sb(struct super_block *sb, void *priv)
> +{
> +	int error = 0;
> +
> +	if (!grab_lock_super(sb)) {
> +		pr_err("%s (%s): freezing failed to grab_super()\n",
> +		       sb->s_type->name, sb->s_id);
> +		return -ENOTTY;
> +	}
> +
> +	if (!super_should_freeze(sb))
> +		goto out;
> +
> +	pr_info("%s (%s): freezing\n", sb->s_type->name, sb->s_id);
> +
> +	error = freeze_super(sb, false);
> +	if (!error)
> +		lockdep_sb_freeze_release(sb);
> +	else if (error != -EBUSY)
> +		pr_notice("%s (%s): Unable to freeze, error=%d",
> +			  sb->s_type->name, sb->s_id, error);
> +
> +out:
> +	deactivate_locked_super(sb);
> +	return error;
> +}
> +
> +int fs_suspend_thaw_sb(struct super_block *sb, void *priv)
> +{
> +	int error = 0;
> +
> +	if (!grab_lock_super(sb)) {
> +		pr_err("%s (%s): thawing failed to grab_super()\n",
> +		       sb->s_type->name, sb->s_id);
> +		return -ENOTTY;
> +	}
> +
> +	if (!super_should_freeze(sb))
> +		goto out;
> +
> +	pr_info("%s (%s): thawing\n", sb->s_type->name, sb->s_id);
> +
> +	error = thaw_super(sb, false);
> +	if (error && error != -EBUSY)
> +		pr_notice("%s (%s): Unable to unfreeze, error=%d",
> +			  sb->s_type->name, sb->s_id, error);
> +
> +out:
> +	deactivate_locked_super(sb);
> +	return error;
> +}
> +
> +#endif
> diff --git a/include/linux/fs.h b/include/linux/fs.h
> index f168e72f6ca1..e5bee359e804 100644
> --- a/include/linux/fs.h
> +++ b/include/linux/fs.h
> @@ -2231,6 +2231,7 @@ struct file_system_type {
>  #define FS_DISALLOW_NOTIFY_PERM	16	/* Disable fanotify permission events */
>  #define FS_ALLOW_IDMAP         32      /* FS has been updated to handle vfs idmappings. */
>  #define FS_RENAME_DOES_D_MOVE	32768	/* FS will handle d_move() during rename() internally. */
> +#define FS_AUTOFREEZE           (1<<16)	/*  temporary as we phase kthread freezer out */
>  	int (*init_fs_context)(struct fs_context *);
>  	const struct fs_parameter_spec *parameters;
>  	struct dentry *(*mount) (struct file_system_type *, int,
> @@ -2306,6 +2307,19 @@ extern int user_statfs(const char __user *, struct kstatfs *);
>  extern int fd_statfs(int, struct kstatfs *);
>  extern int freeze_super(struct super_block *super, bool usercall);
>  extern int thaw_super(struct super_block *super, bool usercall);
> +#ifdef CONFIG_PM_SLEEP
> +int fs_suspend_freeze_sb(struct super_block *sb, void *priv);
> +int fs_suspend_thaw_sb(struct super_block *sb, void *priv);
> +#else
> +static inline int fs_suspend_freeze_sb(struct super_block *sb, void *priv)
> +{
> +	return 0;
> +}
> +static inline int fs_suspend_thaw_sb(struct super_block *sb, void *priv)
> +{
> +	return 0;
> +}
> +#endif
>  extern __printf(2, 3)
>  int super_setup_bdi_name(struct super_block *sb, char *fmt, ...);
>  extern int super_setup_bdi(struct super_block *sb);
> diff --git a/kernel/power/process.c b/kernel/power/process.c
> index 6c1c7e566d35..1dd6b0b6b4e5 100644
> --- a/kernel/power/process.c
> +++ b/kernel/power/process.c
> @@ -140,6 +140,16 @@ int freeze_processes(void)
>  
>  	BUG_ON(in_atomic());
>  
> +	pr_info("Freezing filesystems ... ");
> +	error = iterate_supers_reverse_excl(fs_suspend_freeze_sb, NULL);
> +	if (error) {
> +		pr_cont("failed\n");
> +		iterate_supers_excl(fs_suspend_thaw_sb, NULL);
> +		thaw_processes();
> +		return error;
> +	}
> +	pr_cont("done.\n");
> +
>  	/*
>  	 * Now that the whole userspace is frozen we need to disable
>  	 * the OOM killer to disallow any further interference with
> @@ -149,8 +159,10 @@ int freeze_processes(void)
>  	if (!error && !oom_killer_disable(msecs_to_jiffies(freeze_timeout_msecs)))
>  		error = -EBUSY;
>  
> -	if (error)
> +	if (error) {
> +		iterate_supers_excl(fs_suspend_thaw_sb, NULL);
>  		thaw_processes();
> +	}
>  	return error;
>  }
>  
> @@ -188,6 +200,7 @@ void thaw_processes(void)
>  	pm_nosig_freezing = false;
>  
>  	oom_killer_enable();
> +	iterate_supers_excl(fs_suspend_thaw_sb, NULL);
>  
>  	pr_info("Restarting tasks ... ");
>  
> -- 
> 2.35.1
> 

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [RFC v3 05/24] fs: add automatic kernel fs freeze / thaw and remove kthread freezing
@ 2023-02-24  3:08     ` Darrick J. Wong
  0 siblings, 0 replies; 74+ messages in thread
From: Darrick J. Wong @ 2023-02-24  3:08 UTC (permalink / raw)
  To: Luis Chamberlain
  Cc: hch, song, rafael, gregkh, viro, jack, bvanassche, ebiederm,
	mchehab, keescook, p.raghav, linux-fsdevel, kernel, kexec,
	linux-kernel

On Fri, Jan 13, 2023 at 04:33:50PM -0800, Luis Chamberlain wrote:
> Add support to automatically handle freezing and thawing filesystems
> during the kernel's suspend/resume cycle.
> 
> This is needed so that we properly stop IO in flight without
> races after userspace has been frozen. Without this we rely on
> kthread freezing, and its semantics are loose and error prone.
> For instance, even though a kthread may use try_to_freeze() and end
> up being frozen, we have no way of being sure that everything that
> has been spawned asynchronously from it (such as timers) has also
> been stopped.
> 
> A long term advantage of adding filesystem freeze / thaw support
> during suspend / hibernation is that we may eventually be able to
> drop the kernel's thread freezing completely, as it was originally
> added to stop disk IO in flight as we hibernate or suspend.

Hooray!

One evil question though --

Say you have dm devices A and B.  Each has a distinct fs on it.
If you mount A and then B and initiate a suspend, that should result in
first B and then A freezing, right?

After resuming, you then change A's dm-table definition to point it
at a loop device backed by a file on B.

What happens now when you initiate a suspend?  B freezes, then A tries
to flush data to the loop-mounted file on B, but it's too late for that.
That sounds like a deadlock?

Though I don't know how much we care about this corner case, since (a)
freezing has been busted on xfs for years and (b) one can think up all
sorts of horrid ouroborous scenarios like:

Change A's dm-table to point to a loop-mounted file on B, and changing B
to point to a loop-mounted file on A.  Then try to write to either
filesystem and see what kind of storm you get.

Anyway, just wondering if you'd thought about that kind of doomsday
scenario that a nutty sysadmin could set up.

The only way I can think of to solve that kind of thing would be to hook
filesystems and loop devices into the device model, make fs "device"
suspend actually freeze, hope the suspend code suspends from the leaves
inward, and hope I actually understand how the device model works (I
don't.)

--D

> This does not remove the superflous freezer calls on all filesystems.
> Each filesystem must remove all the kthread freezer stuff and peg
> the fs_type flags as supporting auto-freezing with the FS_AUTOFREEZE
> flag.
> 
> Subsequent patches remove the kthread freezer usage from each
> filesystem, one at a time to make all this work bisectable.
> Once all filesystems remove the usage of the kthread freezer we
> can remove the FS_AUTOFREEZE flag.
> 
> Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
> ---
>  fs/super.c             | 69 ++++++++++++++++++++++++++++++++++++++++++
>  include/linux/fs.h     | 14 +++++++++
>  kernel/power/process.c | 15 ++++++++-
>  3 files changed, 97 insertions(+), 1 deletion(-)
> 
> diff --git a/fs/super.c b/fs/super.c
> index 2f77fcb6e555..e8af4c8269ad 100644
> --- a/fs/super.c
> +++ b/fs/super.c
> @@ -1853,3 +1853,72 @@ int thaw_super(struct super_block *sb, bool usercall)
>  	return 0;
>  }
>  EXPORT_SYMBOL(thaw_super);
> +
> +#ifdef CONFIG_PM_SLEEP
> +static bool super_should_freeze(struct super_block *sb)
> +{
> +	if (!(sb->s_type->fs_flags & FS_AUTOFREEZE))
> +		return false;
> +	/*
> +	 * We don't freeze virtual filesystems, we skip those filesystems with
> +	 * no backing device.
> +	 */
> +	if (sb->s_bdi == &noop_backing_dev_info)
> +		return false;
> +
> +	return true;
> +}
> +
> +int fs_suspend_freeze_sb(struct super_block *sb, void *priv)
> +{
> +	int error = 0;
> +
> +	if (!grab_lock_super(sb)) {
> +		pr_err("%s (%s): freezing failed to grab_super()\n",
> +		       sb->s_type->name, sb->s_id);
> +		return -ENOTTY;
> +	}
> +
> +	if (!super_should_freeze(sb))
> +		goto out;
> +
> +	pr_info("%s (%s): freezing\n", sb->s_type->name, sb->s_id);
> +
> +	error = freeze_super(sb, false);
> +	if (!error)
> +		lockdep_sb_freeze_release(sb);
> +	else if (error != -EBUSY)
> +		pr_notice("%s (%s): Unable to freeze, error=%d",
> +			  sb->s_type->name, sb->s_id, error);
> +
> +out:
> +	deactivate_locked_super(sb);
> +	return error;
> +}
> +
> +int fs_suspend_thaw_sb(struct super_block *sb, void *priv)
> +{
> +	int error = 0;
> +
> +	if (!grab_lock_super(sb)) {
> +		pr_err("%s (%s): thawing failed to grab_super()\n",
> +		       sb->s_type->name, sb->s_id);
> +		return -ENOTTY;
> +	}
> +
> +	if (!super_should_freeze(sb))
> +		goto out;
> +
> +	pr_info("%s (%s): thawing\n", sb->s_type->name, sb->s_id);
> +
> +	error = thaw_super(sb, false);
> +	if (error && error != -EBUSY)
> +		pr_notice("%s (%s): Unable to unfreeze, error=%d",
> +			  sb->s_type->name, sb->s_id, error);
> +
> +out:
> +	deactivate_locked_super(sb);
> +	return error;
> +}
> +
> +#endif
> diff --git a/include/linux/fs.h b/include/linux/fs.h
> index f168e72f6ca1..e5bee359e804 100644
> --- a/include/linux/fs.h
> +++ b/include/linux/fs.h
> @@ -2231,6 +2231,7 @@ struct file_system_type {
>  #define FS_DISALLOW_NOTIFY_PERM	16	/* Disable fanotify permission events */
>  #define FS_ALLOW_IDMAP         32      /* FS has been updated to handle vfs idmappings. */
>  #define FS_RENAME_DOES_D_MOVE	32768	/* FS will handle d_move() during rename() internally. */
> +#define FS_AUTOFREEZE           (1<<16)	/*  temporary as we phase kthread freezer out */
>  	int (*init_fs_context)(struct fs_context *);
>  	const struct fs_parameter_spec *parameters;
>  	struct dentry *(*mount) (struct file_system_type *, int,
> @@ -2306,6 +2307,19 @@ extern int user_statfs(const char __user *, struct kstatfs *);
>  extern int fd_statfs(int, struct kstatfs *);
>  extern int freeze_super(struct super_block *super, bool usercall);
>  extern int thaw_super(struct super_block *super, bool usercall);
> +#ifdef CONFIG_PM_SLEEP
> +int fs_suspend_freeze_sb(struct super_block *sb, void *priv);
> +int fs_suspend_thaw_sb(struct super_block *sb, void *priv);
> +#else
> +static inline int fs_suspend_freeze_sb(struct super_block *sb, void *priv)
> +{
> +	return 0;
> +}
> +static inline int fs_suspend_thaw_sb(struct super_block *sb, void *priv)
> +{
> +	return 0;
> +}
> +#endif
>  extern __printf(2, 3)
>  int super_setup_bdi_name(struct super_block *sb, char *fmt, ...);
>  extern int super_setup_bdi(struct super_block *sb);
> diff --git a/kernel/power/process.c b/kernel/power/process.c
> index 6c1c7e566d35..1dd6b0b6b4e5 100644
> --- a/kernel/power/process.c
> +++ b/kernel/power/process.c
> @@ -140,6 +140,16 @@ int freeze_processes(void)
>  
>  	BUG_ON(in_atomic());
>  
> +	pr_info("Freezing filesystems ... ");
> +	error = iterate_supers_reverse_excl(fs_suspend_freeze_sb, NULL);
> +	if (error) {
> +		pr_cont("failed\n");
> +		iterate_supers_excl(fs_suspend_thaw_sb, NULL);
> +		thaw_processes();
> +		return error;
> +	}
> +	pr_cont("done.\n");
> +
>  	/*
>  	 * Now that the whole userspace is frozen we need to disable
>  	 * the OOM killer to disallow any further interference with
> @@ -149,8 +159,10 @@ int freeze_processes(void)
>  	if (!error && !oom_killer_disable(msecs_to_jiffies(freeze_timeout_msecs)))
>  		error = -EBUSY;
>  
> -	if (error)
> +	if (error) {
> +		iterate_supers_excl(fs_suspend_thaw_sb, NULL);
>  		thaw_processes();
> +	}
>  	return error;
>  }
>  
> @@ -188,6 +200,7 @@ void thaw_processes(void)
>  	pm_nosig_freezing = false;
>  
>  	oom_killer_enable();
> +	iterate_supers_excl(fs_suspend_thaw_sb, NULL);
>  
>  	pr_info("Restarting tasks ... ");
>  
> -- 
> 2.35.1
> 

_______________________________________________
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [RFC v3 01/24] fs: unify locking semantics for fs freeze / thaw
  2023-01-16 15:14     ` Jan Kara
@ 2023-05-07  3:47       ` Luis Chamberlain
  -1 siblings, 0 replies; 74+ messages in thread
From: Luis Chamberlain @ 2023-05-07  3:47 UTC (permalink / raw)
  To: Jan Kara, Darrick J. Wong, Andreas Gruenbacher, Alexander Viro
  Cc: hch, djwong, song, rafael, gregkh, viro, bvanassche, ebiederm,
	mchehab, keescook, p.raghav, linux-fsdevel, kernel, kexec,
	linux-kernel

On Mon, Jan 16, 2023 at 04:14:55PM +0100, Jan Kara wrote:
> So I  think we may need to also block attempts to unmount frozen filesystem -
> actually GFS2 needs this as well [1].
> 
> [1] lore.kernel.org/r/20221129230736.3462830-1-agruenba@redhat.com

Yes, I reviewed Andreas's patch and I think we end up just complicating
things by allowing us to continue to "support" unmounting frozen
filesystems. Instead it is easier for us to just block that insanity.

Current attempt below; it is not boot tested or anything.

From: Luis Chamberlain <mcgrof@kernel.org>
Date: Sat, 6 May 2023 20:13:49 -0700
Subject: [RFC] fs: prevent mount / umount of frozen filesystems

Today you can unmount a frozen filesystem. Doing that turns it into
a zombie filesystem: you cannot shut it down until you first remount
it and then thaw it.

Enabling this sort of behaviour is madness.

Simplify this by preventing the unmounting of frozen filesystems,
and likewise prevent mounting frozen filesystems.

Suggested-by: Jan Kara <jack@suse.cz>
Reported-by: Andreas Gruenbacher <agruenba@redhat.com>
Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
---
 fs/namespace.c |  3 +++
 fs/super.c     | 14 ++++++++++++++
 2 files changed, 17 insertions(+)

diff --git a/fs/namespace.c b/fs/namespace.c
index 54847db5b819..9c21d8662fc8 100644
--- a/fs/namespace.c
+++ b/fs/namespace.c
@@ -1636,6 +1636,9 @@ static int do_umount(struct mount *mnt, int flags)
 	if (retval)
 		return retval;
 
+	if (!(sb_is_unfrozen(sb)))
+		return -EBUSY;
+
 	/*
 	 * Allow userspace to request a mountpoint be expired rather than
 	 * unmounting unconditionally. Unmount only happens if:
diff --git a/fs/super.c b/fs/super.c
index 34afe411cf2b..55f5728f5090 100644
--- a/fs/super.c
+++ b/fs/super.c
@@ -441,6 +441,7 @@ void retire_super(struct super_block *sb)
 {
 	WARN_ON(!sb->s_bdev);
 	down_write(&sb->s_umount);
+	WARN_ON_ONCE(!(sb_is_unfrozen(sb)));
 	if (sb->s_iflags & SB_I_PERSB_BDI) {
 		bdi_unregister(sb->s_bdi);
 		sb->s_iflags &= ~SB_I_PERSB_BDI;
@@ -468,6 +469,7 @@ void generic_shutdown_super(struct super_block *sb)
 {
 	const struct super_operations *sop = sb->s_op;
 
+	WARN_ON_ONCE(!(sb_is_unfrozen(sb)));
 	if (sb->s_root) {
 		shrink_dcache_for_umount(sb);
 		sync_filesystem(sb);
@@ -1354,6 +1356,12 @@ struct dentry *mount_bdev(struct file_system_type *fs_type,
 	if (IS_ERR(s))
 		goto error_s;
 
+	if (!(sb_is_unfrozen(s))) {
+		deactivate_locked_super(s);
+		error = -EBUSY;
+		goto error_bdev;
+	}
+
 	if (s->s_root) {
 		if ((flags ^ s->s_flags) & SB_RDONLY) {
 			deactivate_locked_super(s);
@@ -1473,6 +1481,10 @@ struct dentry *mount_single(struct file_system_type *fs_type,
 	s = sget(fs_type, compare_single, set_anon_super, flags, NULL);
 	if (IS_ERR(s))
 		return ERR_CAST(s);
+	if (!(sb_is_unfrozen(s))) {
+		deactivate_locked_super(s);
+		return ERR_PTR(-EBUSY);
+	}
 	if (!s->s_root) {
 		error = fill_super(s, data, flags & SB_SILENT ? 1 : 0);
 		if (!error)
@@ -1522,6 +1534,8 @@ int vfs_get_tree(struct fs_context *fc)
 
 	sb = fc->root->d_sb;
 	WARN_ON(!sb->s_bdi);
+	if (!(sb_is_unfrozen(sb)))
+		return -EBUSY;
 
 	/*
 	 * Write barrier is for super_cache_count(). We place it before setting
-- 
2.39.2
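
As a rough userspace illustration of the rule this RFC is after (not
the patch itself): struct sb_model, model_sb_is_unfrozen() and
model_umount() are hypothetical names, and -EBUSY is used only because
the hunks above return it.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a superblock; not a kernel structure. */
struct sb_model {
	bool frozen;
	bool mounted;
};

/* Models the sb_is_unfrozen() check used in the hunks above. */
static bool model_sb_is_unfrozen(const struct sb_model *sb)
{
	return !sb->frozen;
}

/* Refuse to unmount while frozen instead of creating a zombie fs. */
static int model_umount(struct sb_model *sb)
{
	assert(sb != NULL);
	if (!model_sb_is_unfrozen(sb))
		return -EBUSY;
	sb->mounted = false;
	return 0;
}
```

With this rule a frozen filesystem stays mounted until it is thawed, so
there is never a superblock left frozen with no mount to thaw it
through.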


^ permalink raw reply related	[flat|nested] 74+ messages in thread

* Re: [RFC v3 05/24] fs: add automatic kernel fs freeze / thaw and remove kthread freezing
  2023-02-24  3:08     ` Darrick J. Wong
@ 2023-05-07  4:07       ` Luis Chamberlain
  -1 siblings, 0 replies; 74+ messages in thread
From: Luis Chamberlain @ 2023-05-07  4:07 UTC (permalink / raw)
  To: Darrick J. Wong
  Cc: hch, song, rafael, gregkh, viro, jack, bvanassche, ebiederm,
	mchehab, keescook, p.raghav, linux-fsdevel, kernel, kexec,
	linux-kernel

On Thu, Feb 23, 2023 at 07:08:37PM -0800, Darrick J. Wong wrote:
> On Fri, Jan 13, 2023 at 04:33:50PM -0800, Luis Chamberlain wrote:
> > Add support to automatically handle freezing and thawing filesystems
> > during the kernel's suspend/resume cycle.
> > 
> > This is needed so that we properly really stop IO in flight without
> > races after userspace has been frozen. Without this we rely on
> > kthread freezing and its semantics are loose and error prone.
> > For instance, even though a kthread may use try_to_freeze() and end
> > up being frozen we have no way of being sure that everything that
> > has been spawned asynchronously from it (such as timers) have also
> > been stopped as well.
> > 
> > A long term advantage of also adding filesystem freeze / thawing
> > supporting during suspend / hibernation is that long term we may
> > be able to eventually drop the kernel's thread freezing completely
> > as it was originally added to stop disk IO in flight as we hibernate
> > or suspend.
> 
> Hooray!
> 
> One evil question though --
> 
> Say you have dm devices A and B.  Each has a distinct fs on it.
> If you mount A and then B and initiate a suspend, that should result in
> first B and then A freezing, right?
> 
> After resuming, you then change A's dm-table definition to point it
> at a loop device backed by a file on B.
> 
> What happens now when you initiate a suspend?  B freezes, then A tries
> to flush data to the loop-mounted file on B, but it's too late for that.
> That sounds like a deadlock?
> 
> Though I don't know how much we care about this corner case,

As you suggest, this is not the only corner case that one could draw
upon. There was that evil ioctl added years ago to allow flipping an
installed system booted from a USB or ISO over to the real freshly
installed root mount point. To make this bullet-proof we'll eventually
need to add a simple graph implementation to keep tabs on ordering
requirements for the super blocks. I have some C code which tries to
implement a graph Linux style, but since these are all corner cases at
this time, I think it's best we first fix suspend for most setups and
later address a proper graph solution.

> Anyway, just wondering if you'd thought about that kind of doomsday
> scenario that a nutty sysadmin could set up.
> 
> The only way I can think of to solve that kind of thing would be to hook
> filesystems and loop devices into the device model, make fs "device"
> suspend actually freeze, hope the suspend code suspends from the leaves
> inward, and hope I actually understand how the device model works (I
> don't.)

There are probably really odd things one can do, and one thing I think
we can do later is simply annotate those cases and, over time, *not*
allow auto-freeze for those horrible situations.

A real long term solution I think will involve a graph.
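
To give a feel for what a graph could buy us, here is a minimal
userspace sketch and not the code mentioned above. It assumes the
dependencies are known as a small matrix; MAX_SB, depends_on and
freeze_order() are hypothetical names. It computes an order in which
every filesystem is frozen before the one it writes through, and
reports a cycle (the A-on-B, B-on-A case) as an error.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_SB 16

/*
 * depends_on[i][j] is true when filesystem i writes through filesystem j
 * (for example, i sits on a loop device backed by a file on j), so i
 * has to be frozen before j.
 *
 * Fills order[] with a valid freeze order and returns 0, or returns -1
 * if the dependencies form a cycle.
 */
static int freeze_order(size_t n, bool depends_on[MAX_SB][MAX_SB],
			size_t order[MAX_SB])
{
	size_t pending[MAX_SB] = { 0 };	/* unfrozen dependents per fs */
	bool done[MAX_SB] = { false };
	size_t count = 0;

	assert(n <= MAX_SB);
	for (size_t i = 0; i < n; i++)
		for (size_t j = 0; j < n; j++)
			if (depends_on[i][j])
				pending[j]++;

	while (count < n) {
		bool progress = false;

		for (size_t j = 0; j < n; j++) {
			if (done[j] || pending[j] != 0)
				continue;
			/* Nothing unfrozen writes through j any more. */
			done[j] = true;
			order[count++] = j;
			for (size_t k = 0; k < n; k++)
				if (depends_on[j][k])
					pending[k]--;
			progress = true;
		}
		if (!progress)
			return -1;	/* cycle: no safe order exists */
	}
	return 0;
}
```

For the dm example, with A as index 0 backed by a file on B as index 1,
this yields A first and then B, regardless of mount order.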

  Luis

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [RFC v3 03/24] fs: distinguish between user initiated freeze and kernel initiated freeze
  2023-01-18  9:28       ` Jan Kara
@ 2023-05-07  4:08         ` Luis Chamberlain
  -1 siblings, 0 replies; 74+ messages in thread
From: Luis Chamberlain @ 2023-05-07  4:08 UTC (permalink / raw)
  To: Jan Kara
  Cc: Darrick J. Wong, hch, song, rafael, gregkh, viro, bvanassche,
	ebiederm, mchehab, keescook, p.raghav, linux-fsdevel, kernel,
	kexec, linux-kernel, xfs

On Wed, Jan 18, 2023 at 10:28:12AM +0100, Jan Kara wrote:
> On Tue 17-01-23 18:25:40, Darrick J. Wong wrote:
> > [add linux-xfs to cc on this one]
> > 
> > On Fri, Jan 13, 2023 at 04:33:48PM -0800, Luis Chamberlain wrote:
> > > Userspace can initiate a freeze call using ioctls. If the kernel decides
> > > to freeze a filesystem later it must be able to distinguish if userspace
> > > had initiated the freeze, so that it does not unfreeze it later
> > > automatically on resume.
> > 
> > Hm.  Zooming out a bit here, I want to think about how kernel freezes
> > should behave...
> > 
> > > Likewise if the kernel is initiating a freeze on its own it should *not*
> > > fail to freeze a filesystem if a user had already frozen it on our behalf.
> > 
> > ...because kernel freezes can absorb an existing userspace freeze.  Does
> > that mean that userspace should be prevented from undoing a kernel
> > freeze?  Even in that absorption case?
> > 
> > Also, should we permit multiple kernel freezes of the same fs at the
> > same time?  And if we do allow that, would they nest like freeze used to
> > do?
> > 
> > (My suggestions here are 'yes', 'yes', and '**** no'.)
> 
> Yeah, makes sense to me. So I think the mental model to make things safe
> is that there are two flags - frozen_by_user, frozen_by_kernel - and the
> superblock is kept frozen as long as either of these is set.

Makes sense to me.
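
A minimal userspace sketch of that two-flag model, only to pin down the
semantics being agreed on here. The FROZEN_BY_* macros, struct sb_model
and the model_* helpers are hypothetical, and the -EBUSY / -EINVAL
return values are assumptions rather than anything decided in this
thread.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

#define FROZEN_BY_USER		(1u << 0)
#define FROZEN_BY_KERNEL	(1u << 1)

/* Hypothetical stand-in for a superblock; not a kernel structure. */
struct sb_model {
	unsigned int holders;	/* who currently holds a freeze */
};

/* Each holder may freeze once; freezes by one holder do not nest. */
static int model_freeze(struct sb_model *sb, unsigned int who)
{
	assert(who == FROZEN_BY_USER || who == FROZEN_BY_KERNEL);
	if (sb->holders & who)
		return -EBUSY;
	sb->holders |= who;
	return 0;
}

/* A holder can only drop its own freeze, never the other one's. */
static int model_thaw(struct sb_model *sb, unsigned int who)
{
	assert(who == FROZEN_BY_USER || who == FROZEN_BY_KERNEL);
	if (!(sb->holders & who))
		return -EINVAL;
	sb->holders &= ~who;
	return 0;
}

/* The superblock stays frozen as long as either flag is set. */
static bool model_is_frozen(const struct sb_model *sb)
{
	return sb->holders != 0;
}
```

In this model a kernel freeze on top of a user freeze succeeds, a user
thaw during suspend leaves the filesystem frozen, and resume does not
undo a freeze that userspace asked for.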

  Luis

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [RFC v3 03/24] fs: distinguish between user initiated freeze and kernel initiated freeze
  2023-05-07  4:08         ` Luis Chamberlain
@ 2023-05-23  0:33           ` Darrick J. Wong
  -1 siblings, 0 replies; 74+ messages in thread
From: Darrick J. Wong @ 2023-05-23  0:33 UTC (permalink / raw)
  To: Luis Chamberlain
  Cc: Jan Kara, hch, song, rafael, gregkh, viro, bvanassche, ebiederm,
	mchehab, keescook, p.raghav, linux-fsdevel, kernel, kexec,
	linux-kernel, xfs

On Sat, May 06, 2023 at 09:08:35PM -0700, Luis Chamberlain wrote:
> On Wed, Jan 18, 2023 at 10:28:12AM +0100, Jan Kara wrote:
> > On Tue 17-01-23 18:25:40, Darrick J. Wong wrote:
> > > [add linux-xfs to cc on this one]
> > > 
> > > On Fri, Jan 13, 2023 at 04:33:48PM -0800, Luis Chamberlain wrote:
> > > > Userspace can initiate a freeze call using ioctls. If the kernel decides
> > > > to freeze a filesystem later it must be able to distinguish if userspace
> > > > had initiated the freeze, so that it does not unfreeze it later
> > > > automatically on resume.
> > > 
> > > Hm.  Zooming out a bit here, I want to think about how kernel freezes
> > > should behave...
> > > 
> > > > Likewise if the kernel is initiating a freeze on its own it should *not*
> > > > fail to freeze a filesystem if a user had already frozen it on our behalf.
> > > 
> > > ...because kernel freezes can absorb an existing userspace freeze.  Does
> > > that mean that userspace should be prevented from undoing a kernel
> > > freeze?  Even in that absorption case?
> > > 
> > > Also, should we permit multiple kernel freezes of the same fs at the
> > > same time?  And if we do allow that, would they nest like freeze used to
> > > do?
> > > 
> > > (My suggestions here are 'yes', 'yes', and '**** no'.)
> > 
> > Yeah, makes sense to me. So I think the mental model to make things safe
> > is that there are two flags - frozen_by_user, frozen_by_kernel - and the
> > superblock is kept frozen as long as either of these is set.
> 
> Makes sense to me.

Just sent a patch for this, sorry it took a couple of weeks while I was
busy merging in parent pointers...

--D

>   Luis

^ permalink raw reply	[flat|nested] 74+ messages in thread

end of thread, other threads:[~2023-05-23  0:46 UTC | newest]

Thread overview: 74+ messages
2023-01-14  0:33 [RFC v3 00/24] vfs: provide automatic kernel freeze / resume Luis Chamberlain
2023-01-14  0:33 ` Luis Chamberlain
2023-01-14  0:33 ` [RFC v3 01/24] fs: unify locking semantics for fs freeze / thaw Luis Chamberlain
2023-01-14  0:33   ` Luis Chamberlain
2023-01-16 15:14   ` Jan Kara
2023-01-16 15:14     ` Jan Kara
2023-05-07  3:47     ` Luis Chamberlain
2023-05-07  3:47       ` Luis Chamberlain
2023-01-14  0:33 ` [RFC v3 02/24] fs: add frozen sb state helpers Luis Chamberlain
2023-01-14  0:33   ` Luis Chamberlain
2023-01-16 16:11   ` Jan Kara
2023-01-16 16:11     ` Jan Kara
2023-01-14  0:33 ` [RFC v3 03/24] fs: distinguish between user initiated freeze and kernel initiated freeze Luis Chamberlain
2023-01-14  0:33   ` Luis Chamberlain
2023-01-16 16:10   ` Jan Kara
2023-01-16 16:10     ` Jan Kara
2023-01-18  2:25   ` Darrick J. Wong
2023-01-18  2:25     ` Darrick J. Wong
2023-01-18  9:28     ` Jan Kara
2023-01-18  9:28       ` Jan Kara
2023-05-07  4:08       ` Luis Chamberlain
2023-05-07  4:08         ` Luis Chamberlain
2023-05-23  0:33         ` Darrick J. Wong
2023-05-23  0:33           ` Darrick J. Wong
2023-01-14  0:33 ` [RFC v3 04/24] fs: add iterate_supers_excl() and iterate_supers_reverse_excl() Luis Chamberlain
2023-01-14  0:33   ` Luis Chamberlain
2023-01-14  0:33 ` [RFC v3 05/24] fs: add automatic kernel fs freeze / thaw and remove kthread freezing Luis Chamberlain
2023-01-14  0:33   ` Luis Chamberlain
2023-01-14  0:57   ` Luis Chamberlain
2023-01-14  0:57     ` Luis Chamberlain
2023-01-16 16:09   ` Jan Kara
2023-01-16 16:09     ` Jan Kara
2023-02-24  3:08   ` Darrick J. Wong
2023-02-24  3:08     ` Darrick J. Wong
2023-05-07  4:07     ` Luis Chamberlain
2023-05-07  4:07       ` Luis Chamberlain
2023-01-14  0:33 ` [RFC v3 06/24] xfs: replace kthread freezing with auto fs freezing Luis Chamberlain
2023-01-14  0:33   ` Luis Chamberlain
2023-01-14  0:33 ` [RFC v3 07/24] btrfs: " Luis Chamberlain
2023-01-14  0:33   ` Luis Chamberlain
2023-01-14  0:33 ` [RFC v3 08/24] ext4: " Luis Chamberlain
2023-01-14  0:33   ` Luis Chamberlain
2023-01-14  0:33 ` [RFC v3 09/24] f2fs: " Luis Chamberlain
2023-01-14  0:33   ` Luis Chamberlain
2023-01-14  0:33 ` [RFC v3 10/24] cifs: " Luis Chamberlain
2023-01-14  0:33   ` Luis Chamberlain
2023-01-14  0:33 ` [RFC v3 11/24] gfs2: " Luis Chamberlain
2023-01-14  0:33   ` Luis Chamberlain
2023-01-14  0:33 ` [RFC v3 12/24] jfs: " Luis Chamberlain
2023-01-14  0:33   ` Luis Chamberlain
2023-01-14  0:33 ` [RFC v3 13/24] nilfs2: " Luis Chamberlain
2023-01-14  0:33   ` Luis Chamberlain
2023-01-14  0:33 ` [RFC v3 14/24] nfs: " Luis Chamberlain
2023-01-14  0:33   ` Luis Chamberlain
2023-01-14  0:34 ` [RFC v3 15/24] nfsd: " Luis Chamberlain
2023-01-14  0:34   ` Luis Chamberlain
2023-01-14  0:34 ` [RFC v3 16/24] ubifs: " Luis Chamberlain
2023-01-14  0:34   ` Luis Chamberlain
2023-01-14  0:34 ` [RFC v3 17/24] ksmbd: " Luis Chamberlain
2023-01-14  0:34   ` Luis Chamberlain
2023-01-14  0:34 ` [RFC v3 18/24] jffs2: " Luis Chamberlain
2023-01-14  0:34   ` Luis Chamberlain
2023-01-14  0:34 ` [RFC v3 19/24] jbd2: " Luis Chamberlain
2023-01-14  0:34   ` Luis Chamberlain
2023-01-14  0:34 ` [RFC v3 20/24] coredump: drop freezer usage Luis Chamberlain
2023-01-14  0:34   ` Luis Chamberlain
2023-01-14  0:34 ` [RFC v3 21/24] ecryptfs: replace kthread freezing with auto fs freezing Luis Chamberlain
2023-01-14  0:34   ` Luis Chamberlain
2023-01-14  0:34 ` [RFC v3 22/24] fscache: " Luis Chamberlain
2023-01-14  0:34   ` Luis Chamberlain
2023-01-14  0:34 ` [RFC v3 23/24] lockd: " Luis Chamberlain
2023-01-14  0:34   ` Luis Chamberlain
2023-01-14  0:34 ` [RFC v3 24/24] fs: remove FS_AUTOFREEZE Luis Chamberlain
2023-01-14  0:34   ` Luis Chamberlain
