linux-block.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 00/11] fs: use freeze_fs on suspend/hibernate
@ 2017-11-29 23:23 Luis R. Rodriguez
  2017-11-29 23:23 ` [PATCH 01/11] fs: provide unlocked helper for freeze_super() Luis R. Rodriguez
                   ` (14 more replies)
  0 siblings, 15 replies; 47+ messages in thread
From: Luis R. Rodriguez @ 2017-11-29 23:23 UTC (permalink / raw)
  To: viro, bart.vanassche, ming.lei, tytso, darrick.wong, jikos, rjw,
	pavel, len.brown, linux-fsdevel
  Cc: boris.ostrovsky, jgross, todd.e.brandt, nborisov, jack,
	martin.petersen, ONeukum, oleksandr, oleg.b.antonyan,
	yu.chen.surf, dan.j.williams, linux-pm, linux-block, linux-xfs,
	linux-kernel, Luis R. Rodriguez

This is a followup from the original RFC which proposed to start
to kill kthread freezing all together [0]. Instead of going straight
out to the jugular for kthread freezing this series only addresses
killing freezer calls on filesystems which implement freeze_fs, after
we let the kernel freeze these filesystems for us on suspend.

This approach puts on a slow but steady path towards the original goal
though. Each subsystem could look for similar solutions. Even with
filesystems we're not all done yet, after this we'll still have to
decide what to do about filesystems which do not implement freeze_fs().

Motivation and problem:

kthreads have some semantics for freezing, which helps the kernel
freeze them when a system is going to suspend or hibernation. These
semantics are not well defined though, and it actually turns out
pretty hard to get it right.

Without a proper solution suspend and hibernation are fragile on filesystems,
it can easily break suspend and fixing such issues are in no way trivial [1]
[2].

Proposed solution:

Instead of fixing such semantics and trying to get all filesystems to do it
right, we can easily do away with all freezing calls if the filesystem
implements a proper freeze_fs() callback. The following 9 filesystems have
freeze_fs() implemented as such we can let the kernel issue the callback upon
suspend and thaw on resume automatically on our behalf.

  o xfs
  o reiserfs
  o nilfs2
  o jfs
  o f2fs
  o ext4
  o ext2
  o btrfs

Of these, the following have freezer helpers, which can then be removed
after the kernel automaticaly calls freeze_fs for us on suspend:

  o xfs
  o nilfs2
  o jfs
  o f2fs
  o ext4

I've tested this on a system with ext4 and XFS, and have let 0-day go at
without issues. I have branches availabe for linux-next [3] and Linus'
latest tree [4].

Further testing, thoughts, reviews, flames are all equally appreciated.

[0] https://lkml.kernel.org/r/20171003185313.1017-1-mcgrof@kernel.org
[1] https://bugzilla.suse.com/show_bug.cgi?id=1043449
[2] https://lkml.kernel.org/r/20171113103139.GA18936@yu-chen.sh.intel.com
[3] https://git.kernel.org/pub/scm/linux/kernel/git/mcgrof/linux-next.git/log/?h=20171129-fs-freeze-cleanup
[4] https://git.kernel.org/pub/scm/linux/kernel/git/mcgrof/linux.git/log/?h=20171129-fs-freeze-cleanup

Luis R. Rodriguez (11):
  fs: provide unlocked helper for freeze_super()
  fs: provide unlocked helper thaw_super()
  fs: add frozen sb state helpers
  fs: distinguish between user initiated freeze and kernel initiated
    freeze
  fs: add iterate_supers_excl() and iterate_supers_reverse_excl()
  fs: freeze on suspend and thaw on resume
  xfs: remove not needed freezing calls
  ext4: remove not needed freezing calls
  f2fs: remove not needed freezing calls
  nilfs2: remove not needed freezing calls
  jfs: remove not needed freezing calls

 fs/ext4/ext4_jbd2.c    |   2 +-
 fs/ext4/super.c        |   2 -
 fs/f2fs/gc.c           |   5 +-
 fs/f2fs/segment.c      |   6 +-
 fs/jfs/jfs_logmgr.c    |  11 +-
 fs/jfs/jfs_txnmgr.c    |  31 ++---
 fs/nilfs2/segment.c    |  48 ++++----
 fs/super.c             | 320 ++++++++++++++++++++++++++++++++++++++++---------
 fs/xfs/xfs_trans.c     |   2 +-
 fs/xfs/xfs_trans_ail.c |   7 +-
 include/linux/fs.h     |  63 +++++++++-
 kernel/power/process.c |  15 ++-
 12 files changed, 378 insertions(+), 134 deletions(-)

-- 
2.15.0

^ permalink raw reply	[flat|nested] 47+ messages in thread

* [PATCH 01/11] fs: provide unlocked helper for freeze_super()
  2017-11-29 23:23 [PATCH 00/11] fs: use freeze_fs on suspend/hibernate Luis R. Rodriguez
@ 2017-11-29 23:23 ` Luis R. Rodriguez
  2017-11-30 16:58   ` Jan Kara
  2017-11-29 23:23 ` [PATCH 02/11] fs: provide unlocked helper thaw_super() Luis R. Rodriguez
                   ` (13 subsequent siblings)
  14 siblings, 1 reply; 47+ messages in thread
From: Luis R. Rodriguez @ 2017-11-29 23:23 UTC (permalink / raw)
  To: viro, bart.vanassche, ming.lei, tytso, darrick.wong, jikos, rjw,
	pavel, len.brown, linux-fsdevel
  Cc: boris.ostrovsky, jgross, todd.e.brandt, nborisov, jack,
	martin.petersen, ONeukum, oleksandr, oleg.b.antonyan,
	yu.chen.surf, dan.j.williams, linux-pm, linux-block, linux-xfs,
	linux-kernel, Luis R. Rodriguez

freeze_super() holds a write lock, however we wish to also enable
callers which already hold the write lock. To do this provide a helper
and make freeze_super() use it. This way, all that freeze_super() does
now is lock handling and active count management.

This change has no functional changes.

Suggested-by: Dave Chinner <david@fromorbit.com>
Signed-off-by: Luis R. Rodriguez <mcgrof@kernel.org>
---
 fs/super.c | 100 +++++++++++++++++++++++++++++++++----------------------------
 1 file changed, 55 insertions(+), 45 deletions(-)

diff --git a/fs/super.c b/fs/super.c
index d4e33e8f1e6f..a7650ff22f0e 100644
--- a/fs/super.c
+++ b/fs/super.c
@@ -1387,59 +1387,20 @@ static void sb_freeze_unlock(struct super_block *sb)
 		percpu_up_write(sb->s_writers.rw_sem + level);
 }
 
-/**
- * freeze_super - lock the filesystem and force it into a consistent state
- * @sb: the super to lock
- *
- * Syncs the super to make sure the filesystem is consistent and calls the fs's
- * freeze_fs.  Subsequent calls to this without first thawing the fs will return
- * -EBUSY.
- *
- * During this function, sb->s_writers.frozen goes through these values:
- *
- * SB_UNFROZEN: File system is normal, all writes progress as usual.
- *
- * SB_FREEZE_WRITE: The file system is in the process of being frozen.  New
- * writes should be blocked, though page faults are still allowed. We wait for
- * all writes to complete and then proceed to the next stage.
- *
- * SB_FREEZE_PAGEFAULT: Freezing continues. Now also page faults are blocked
- * but internal fs threads can still modify the filesystem (although they
- * should not dirty new pages or inodes), writeback can run etc. After waiting
- * for all running page faults we sync the filesystem which will clean all
- * dirty pages and inodes (no new dirty pages or inodes can be created when
- * sync is running).
- *
- * SB_FREEZE_FS: The file system is frozen. Now all internal sources of fs
- * modification are blocked (e.g. XFS preallocation truncation on inode
- * reclaim). This is usually implemented by blocking new transactions for
- * filesystems that have them and need this additional guard. After all
- * internal writers are finished we call ->freeze_fs() to finish filesystem
- * freezing. Then we transition to SB_FREEZE_COMPLETE state. This state is
- * mostly auxiliary for filesystems to verify they do not modify frozen fs.
- *
- * sb->s_writers.frozen is protected by sb->s_umount.
- */
-int freeze_super(struct super_block *sb)
+/* Caller takes lock and handles active count */
+static int freeze_locked_super(struct super_block *sb)
 {
 	int ret;
 
-	atomic_inc(&sb->s_active);
-	down_write(&sb->s_umount);
-	if (sb->s_writers.frozen != SB_UNFROZEN) {
-		deactivate_locked_super(sb);
+	if (sb->s_writers.frozen != SB_UNFROZEN)
 		return -EBUSY;
-	}
 
-	if (!(sb->s_flags & SB_BORN)) {
-		up_write(&sb->s_umount);
+	if (!(sb->s_flags & SB_BORN))
 		return 0;	/* sic - it's "nothing to do" */
-	}
 
 	if (sb_rdonly(sb)) {
 		/* Nothing to do really... */
 		sb->s_writers.frozen = SB_FREEZE_COMPLETE;
-		up_write(&sb->s_umount);
 		return 0;
 	}
 
@@ -1468,7 +1429,6 @@ int freeze_super(struct super_block *sb)
 			sb->s_writers.frozen = SB_UNFROZEN;
 			sb_freeze_unlock(sb);
 			wake_up(&sb->s_writers.wait_unfrozen);
-			deactivate_locked_super(sb);
 			return ret;
 		}
 	}
@@ -1478,9 +1438,59 @@ int freeze_super(struct super_block *sb)
 	 */
 	sb->s_writers.frozen = SB_FREEZE_COMPLETE;
 	lockdep_sb_freeze_release(sb);
-	up_write(&sb->s_umount);
 	return 0;
 }
+
+/**
+ * freeze_super - lock the filesystem and force it into a consistent state
+ * @sb: the super to lock
+ *
+ * Syncs the super to make sure the filesystem is consistent and calls the fs's
+ * freeze_fs.  Subsequent calls to this without first thawing the fs will return
+ * -EBUSY.
+ *
+ * During this function, sb->s_writers.frozen goes through these values:
+ *
+ * SB_UNFROZEN: File system is normal, all writes progress as usual.
+ *
+ * SB_FREEZE_WRITE: The file system is in the process of being frozen.  New
+ * writes should be blocked, though page faults are still allowed. We wait for
+ * all writes to complete and then proceed to the next stage.
+ *
+ * SB_FREEZE_PAGEFAULT: Freezing continues. Now also page faults are blocked
+ * but internal fs threads can still modify the filesystem (although they
+ * should not dirty new pages or inodes), writeback can run etc. After waiting
+ * for all running page faults we sync the filesystem which will clean all
+ * dirty pages and inodes (no new dirty pages or inodes can be created when
+ * sync is running).
+ *
+ * SB_FREEZE_FS: The file system is frozen. Now all internal sources of fs
+ * modification are blocked (e.g. XFS preallocation truncation on inode
+ * reclaim). This is usually implemented by blocking new transactions for
+ * filesystems that have them and need this additional guard. After all
+ * internal writers are finished we call ->freeze_fs() to finish filesystem
+ * freezing. Then we transition to SB_FREEZE_COMPLETE state. This state is
+ * mostly auxiliary for filesystems to verify they do not modify frozen fs.
+ *
+ * sb->s_writers.frozen is protected by sb->s_umount.
+ */
+int freeze_super(struct super_block *sb)
+{
+	int error;
+
+	atomic_inc(&sb->s_active);
+
+	down_write(&sb->s_umount);
+	error = freeze_locked_super(sb);
+	if (error) {
+		deactivate_locked_super(sb);
+		goto out;
+	}
+	up_write(&sb->s_umount);
+
+out:
+	return error;
+}
 EXPORT_SYMBOL(freeze_super);
 
 /**
-- 
2.15.0

^ permalink raw reply related	[flat|nested] 47+ messages in thread

* [PATCH 02/11] fs: provide unlocked helper thaw_super()
  2017-11-29 23:23 [PATCH 00/11] fs: use freeze_fs on suspend/hibernate Luis R. Rodriguez
  2017-11-29 23:23 ` [PATCH 01/11] fs: provide unlocked helper for freeze_super() Luis R. Rodriguez
@ 2017-11-29 23:23 ` Luis R. Rodriguez
  2017-11-30 16:59   ` Jan Kara
  2017-11-29 23:23 ` [PATCH 03/11] fs: add frozen sb state helpers Luis R. Rodriguez
                   ` (12 subsequent siblings)
  14 siblings, 1 reply; 47+ messages in thread
From: Luis R. Rodriguez @ 2017-11-29 23:23 UTC (permalink / raw)
  To: viro, bart.vanassche, ming.lei, tytso, darrick.wong, jikos, rjw,
	pavel, len.brown, linux-fsdevel
  Cc: boris.ostrovsky, jgross, todd.e.brandt, nborisov, jack,
	martin.petersen, ONeukum, oleksandr, oleg.b.antonyan,
	yu.chen.surf, dan.j.williams, linux-pm, linux-block, linux-xfs,
	linux-kernel, Luis R. Rodriguez

thaw_super() hold a write lock, however we wish to also enable
callers which already hold the write lock. To do this provide a helper
and make thaw_super() use it. This way, all that thaw_super() does
now is lock handling and active count management.

This change has no functional changes.

Suggested-by: Dave Chinner <david@fromorbit.com>
Signed-off-by: Luis R. Rodriguez <mcgrof@kernel.org>
---
 fs/super.c | 39 ++++++++++++++++++++++++++-------------
 1 file changed, 26 insertions(+), 13 deletions(-)

diff --git a/fs/super.c b/fs/super.c
index a7650ff22f0e..cecc279beecd 100644
--- a/fs/super.c
+++ b/fs/super.c
@@ -1493,21 +1493,13 @@ int freeze_super(struct super_block *sb)
 }
 EXPORT_SYMBOL(freeze_super);
 
-/**
- * thaw_super -- unlock filesystem
- * @sb: the super to thaw
- *
- * Unlocks the filesystem and marks it writeable again after freeze_super().
- */
-int thaw_super(struct super_block *sb)
+/* Caller takes lock and handles active count */
+static int thaw_locked_super(struct super_block *sb)
 {
 	int error;
 
-	down_write(&sb->s_umount);
-	if (sb->s_writers.frozen != SB_FREEZE_COMPLETE) {
-		up_write(&sb->s_umount);
+	if (sb->s_writers.frozen != SB_FREEZE_COMPLETE)
 		return -EINVAL;
-	}
 
 	if (sb_rdonly(sb)) {
 		sb->s_writers.frozen = SB_UNFROZEN;
@@ -1522,7 +1514,6 @@ int thaw_super(struct super_block *sb)
 			printk(KERN_ERR
 				"VFS:Filesystem thaw failed\n");
 			lockdep_sb_freeze_release(sb);
-			up_write(&sb->s_umount);
 			return error;
 		}
 	}
@@ -1531,7 +1522,29 @@ int thaw_super(struct super_block *sb)
 	sb_freeze_unlock(sb);
 out:
 	wake_up(&sb->s_writers.wait_unfrozen);
-	deactivate_locked_super(sb);
 	return 0;
 }
+
+/**
+ * thaw_super -- unlock filesystem
+ * @sb: the super to thaw
+ *
+ * Unlocks the filesystem and marks it writeable again after freeze_super().
+ */
+int thaw_super(struct super_block *sb)
+{
+	int error;
+
+	down_write(&sb->s_umount);
+	error = thaw_locked_super(sb);
+	if (error) {
+		up_write(&sb->s_umount);
+		goto out;
+	}
+
+	deactivate_locked_super(sb);
+
+out:
+	return error;
+}
 EXPORT_SYMBOL(thaw_super);
-- 
2.15.0

^ permalink raw reply related	[flat|nested] 47+ messages in thread

* [PATCH 03/11] fs: add frozen sb state helpers
  2017-11-29 23:23 [PATCH 00/11] fs: use freeze_fs on suspend/hibernate Luis R. Rodriguez
  2017-11-29 23:23 ` [PATCH 01/11] fs: provide unlocked helper for freeze_super() Luis R. Rodriguez
  2017-11-29 23:23 ` [PATCH 02/11] fs: provide unlocked helper thaw_super() Luis R. Rodriguez
@ 2017-11-29 23:23 ` Luis R. Rodriguez
  2017-11-30 17:13   ` Jan Kara
  2017-11-29 23:23 ` [PATCH 04/11] fs: distinguish between user initiated freeze and kernel initiated freeze Luis R. Rodriguez
                   ` (11 subsequent siblings)
  14 siblings, 1 reply; 47+ messages in thread
From: Luis R. Rodriguez @ 2017-11-29 23:23 UTC (permalink / raw)
  To: viro, bart.vanassche, ming.lei, tytso, darrick.wong, jikos, rjw,
	pavel, len.brown, linux-fsdevel
  Cc: boris.ostrovsky, jgross, todd.e.brandt, nborisov, jack,
	martin.petersen, ONeukum, oleksandr, oleg.b.antonyan,
	yu.chen.surf, dan.j.williams, linux-pm, linux-block, linux-xfs,
	linux-kernel, Luis R. Rodriguez

The question of whether or not a superblock is frozen needs to be
augmented in the future to account for differences between a user
initiated freeze and a kernel initiated freeze done automatically
on behalf of the kernel.

Provide helpers so that these can be used instead so that we don't
have to expand checks later in these same call sites as we expand
the definition of a frozen superblock.

Signed-off-by: Luis R. Rodriguez <mcgrof@kernel.org>
---
 fs/ext4/ext4_jbd2.c |  2 +-
 fs/super.c          |  4 ++--
 fs/xfs/xfs_trans.c  |  2 +-
 include/linux/fs.h  | 33 +++++++++++++++++++++++++++++++++
 4 files changed, 37 insertions(+), 4 deletions(-)

diff --git a/fs/ext4/ext4_jbd2.c b/fs/ext4/ext4_jbd2.c
index 2d593201cf7a..090b8cd4551a 100644
--- a/fs/ext4/ext4_jbd2.c
+++ b/fs/ext4/ext4_jbd2.c
@@ -50,7 +50,7 @@ static int ext4_journal_check_start(struct super_block *sb)
 
 	if (sb_rdonly(sb))
 		return -EROFS;
-	WARN_ON(sb->s_writers.frozen == SB_FREEZE_COMPLETE);
+	WARN_ON(sb_is_frozen(sb));
 	journal = EXT4_SB(sb)->s_journal;
 	/*
 	 * Special case here: if the journal has aborted behind our
diff --git a/fs/super.c b/fs/super.c
index cecc279beecd..e8f5a7139b8f 100644
--- a/fs/super.c
+++ b/fs/super.c
@@ -1392,7 +1392,7 @@ static int freeze_locked_super(struct super_block *sb)
 {
 	int ret;
 
-	if (sb->s_writers.frozen != SB_UNFROZEN)
+	if (!sb_is_unfrozen(sb))
 		return -EBUSY;
 
 	if (!(sb->s_flags & SB_BORN))
@@ -1498,7 +1498,7 @@ static int thaw_locked_super(struct super_block *sb)
 {
 	int error;
 
-	if (sb->s_writers.frozen != SB_FREEZE_COMPLETE)
+	if (!sb_is_frozen(sb))
 		return -EINVAL;
 
 	if (sb_rdonly(sb)) {
diff --git a/fs/xfs/xfs_trans.c b/fs/xfs/xfs_trans.c
index a87f657f59c9..b1180c26d902 100644
--- a/fs/xfs/xfs_trans.c
+++ b/fs/xfs/xfs_trans.c
@@ -241,7 +241,7 @@ xfs_trans_alloc(
 	if (!(flags & XFS_TRANS_NO_WRITECOUNT))
 		sb_start_intwrite(mp->m_super);
 
-	WARN_ON(mp->m_super->s_writers.frozen == SB_FREEZE_COMPLETE);
+	WARN_ON(sb_is_frozen(mp->m_super));
 	atomic_inc(&mp->m_active_trans);
 
 	tp = kmem_zone_zalloc(xfs_trans_zone,
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 511fbaabf624..1e10239c1d3b 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -1589,6 +1589,39 @@ static inline void sb_start_intwrite(struct super_block *sb)
 	__sb_start_write(sb, SB_FREEZE_FS, true);
 }
 
+/**
+ * sb_is_frozen_by_user - is superblock frozen by a user call
+ * @sb: the super to check
+ *
+ * Returns true if the super freeze was initiated by userspace, for instance,
+ * an ioctl call.
+ */
+static inline bool sb_is_frozen_by_user(struct super_block *sb)
+{
+	return sb->s_writers.frozen == SB_FREEZE_COMPLETE;
+}
+
+/**
+ * sb_is_frozen - is superblock frozen
+ * @sb: the super to check
+ *
+ * Returns true if the super is frozen.
+ */
+static inline bool sb_is_frozen(struct super_block *sb)
+{
+	return sb_is_frozen_by_user(sb);
+}
+
+/**
+ * sb_is_unfrozen - is superblock unfrozen
+ * @sb: the super to check
+ *
+ * Returns true if the super is unfrozen.
+ */
+static inline bool sb_is_unfrozen(struct super_block *sb)
+{
+	return sb->s_writers.frozen == SB_UNFROZEN;
+}
 
 extern bool inode_owner_or_capable(const struct inode *inode);
 
-- 
2.15.0

^ permalink raw reply related	[flat|nested] 47+ messages in thread

* [PATCH 04/11] fs: distinguish between user initiated freeze and kernel initiated freeze
  2017-11-29 23:23 [PATCH 00/11] fs: use freeze_fs on suspend/hibernate Luis R. Rodriguez
                   ` (2 preceding siblings ...)
  2017-11-29 23:23 ` [PATCH 03/11] fs: add frozen sb state helpers Luis R. Rodriguez
@ 2017-11-29 23:23 ` Luis R. Rodriguez
  2017-11-29 23:23 ` [PATCH 05/11] fs: add iterate_supers_excl() and iterate_supers_reverse_excl() Luis R. Rodriguez
                   ` (10 subsequent siblings)
  14 siblings, 0 replies; 47+ messages in thread
From: Luis R. Rodriguez @ 2017-11-29 23:23 UTC (permalink / raw)
  To: viro, bart.vanassche, ming.lei, tytso, darrick.wong, jikos, rjw,
	pavel, len.brown, linux-fsdevel
  Cc: boris.ostrovsky, jgross, todd.e.brandt, nborisov, jack,
	martin.petersen, ONeukum, oleksandr, oleg.b.antonyan,
	yu.chen.surf, dan.j.williams, linux-pm, linux-block, linux-xfs,
	linux-kernel, Luis R. Rodriguez

Userspace can initiate a freeze call using ioctls. If the kernel decides
to freeze a filesystem later it must be able to distinguish if userspace
had initiated the freeze, so that it does not unfreeze it later
automatically on resume.

Likewise if the kernel is initiating a freeze on its own it should *not*
fail to freeze a filesystem if a user had already frozen it on our behalf.
This same concept applies to thawing, even if its not possible for
userspace to beat the kernel in thawing a filesystem. This logic however
has never applied to userspace freezing and thawing, two consecutive
userspace freeze calls will results in only the first one succeeding, so
we must retain the same behaviour in userspace.

This doesn't implement yet kernel initiated filesystem freeze calls,
this will be done in subsequent calls. This change should introduce
no functional changes, it just extends the definitions a frozen
filesystem to account for future kernel initiated filesystem freeze.

Signed-off-by: Luis R. Rodriguez <mcgrof@kernel.org>
---
 fs/super.c         | 19 ++++++++++++++-----
 include/linux/fs.h | 17 +++++++++++++++--
 2 files changed, 29 insertions(+), 7 deletions(-)

diff --git a/fs/super.c b/fs/super.c
index e8f5a7139b8f..a63513d187e8 100644
--- a/fs/super.c
+++ b/fs/super.c
@@ -1388,10 +1388,13 @@ static void sb_freeze_unlock(struct super_block *sb)
 }
 
 /* Caller takes lock and handles active count */
-static int freeze_locked_super(struct super_block *sb)
+static int freeze_locked_super(struct super_block *sb, bool usercall)
 {
 	int ret;
 
+	if (!usercall && sb_is_frozen(sb))
+		return 0;
+
 	if (!sb_is_unfrozen(sb))
 		return -EBUSY;
 
@@ -1436,7 +1439,10 @@ static int freeze_locked_super(struct super_block *sb)
 	 * For debugging purposes so that fs can warn if it sees write activity
 	 * when frozen is set to SB_FREEZE_COMPLETE, and for thaw_super().
 	 */
-	sb->s_writers.frozen = SB_FREEZE_COMPLETE;
+	if (usercall)
+		sb->s_writers.frozen = SB_FREEZE_COMPLETE;
+	else
+		sb->s_writers.frozen = SB_FREEZE_COMPLETE_AUTO;
 	lockdep_sb_freeze_release(sb);
 	return 0;
 }
@@ -1481,7 +1487,7 @@ int freeze_super(struct super_block *sb)
 	atomic_inc(&sb->s_active);
 
 	down_write(&sb->s_umount);
-	error = freeze_locked_super(sb);
+	error = freeze_locked_super(sb, true);
 	if (error) {
 		deactivate_locked_super(sb);
 		goto out;
@@ -1494,10 +1500,13 @@ int freeze_super(struct super_block *sb)
 EXPORT_SYMBOL(freeze_super);
 
 /* Caller takes lock and handles active count */
-static int thaw_locked_super(struct super_block *sb)
+static int thaw_locked_super(struct super_block *sb, bool usercall)
 {
 	int error;
 
+	if (!usercall && sb_is_unfrozen(sb))
+		return 0;
+
 	if (!sb_is_frozen(sb))
 		return -EINVAL;
 
@@ -1536,7 +1545,7 @@ int thaw_super(struct super_block *sb)
 	int error;
 
 	down_write(&sb->s_umount);
-	error = thaw_locked_super(sb);
+	error = thaw_locked_super(sb, true);
 	if (error) {
 		up_write(&sb->s_umount);
 		goto out;
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 1e10239c1d3b..107725b20fad 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -1324,9 +1324,10 @@ enum {
 	SB_FREEZE_FS = 3,		/* For internal FS use (e.g. to stop
 					 * internal threads if needed) */
 	SB_FREEZE_COMPLETE = 4,		/* ->freeze_fs finished successfully */
+	SB_FREEZE_COMPLETE_AUTO = 5,	/* same but initiated automatically */
 };
 
-#define SB_FREEZE_LEVELS (SB_FREEZE_COMPLETE - 1)
+#define SB_FREEZE_LEVELS (SB_FREEZE_COMPLETE_AUTO - 1)
 
 struct sb_writers {
 	int				frozen;		/* Is sb frozen? */
@@ -1601,6 +1602,18 @@ static inline bool sb_is_frozen_by_user(struct super_block *sb)
 	return sb->s_writers.frozen == SB_FREEZE_COMPLETE;
 }
 
+/**
+ * sb_is_frozen_by_kernel - is superblock frozen by the kernel automatically
+ * @sb: the super to check
+ *
+ * Returns true if the super freeze was initiated by the kernel, automatically,
+ * for instance during system sleep or hibernation.
+ */
+static inline bool sb_is_frozen_by_kernel(struct super_block *sb)
+{
+	return sb->s_writers.frozen == SB_FREEZE_COMPLETE_AUTO;
+}
+
 /**
  * sb_is_frozen - is superblock frozen
  * @sb: the super to check
@@ -1609,7 +1622,7 @@ static inline bool sb_is_frozen_by_user(struct super_block *sb)
  */
 static inline bool sb_is_frozen(struct super_block *sb)
 {
-	return sb_is_frozen_by_user(sb);
+	return sb_is_frozen_by_user(sb) || sb_is_frozen_by_kernel(sb);
 }
 
 /**
-- 
2.15.0

^ permalink raw reply related	[flat|nested] 47+ messages in thread

* [PATCH 05/11] fs: add iterate_supers_excl() and iterate_supers_reverse_excl()
  2017-11-29 23:23 [PATCH 00/11] fs: use freeze_fs on suspend/hibernate Luis R. Rodriguez
                   ` (3 preceding siblings ...)
  2017-11-29 23:23 ` [PATCH 04/11] fs: distinguish between user initiated freeze and kernel initiated freeze Luis R. Rodriguez
@ 2017-11-29 23:23 ` Luis R. Rodriguez
  2017-11-29 23:48   ` Rafael J. Wysocki
  2017-11-30 16:57   ` Jan Kara
  2017-11-29 23:23 ` [PATCH 06/11] fs: freeze on suspend and thaw on resume Luis R. Rodriguez
                   ` (9 subsequent siblings)
  14 siblings, 2 replies; 47+ messages in thread
From: Luis R. Rodriguez @ 2017-11-29 23:23 UTC (permalink / raw)
  To: viro, bart.vanassche, ming.lei, tytso, darrick.wong, jikos, rjw,
	pavel, len.brown, linux-fsdevel
  Cc: boris.ostrovsky, jgross, todd.e.brandt, nborisov, jack,
	martin.petersen, ONeukum, oleksandr, oleg.b.antonyan,
	yu.chen.surf, dan.j.williams, linux-pm, linux-block, linux-xfs,
	linux-kernel, Luis R. Rodriguez

There are use cases where we wish to traverse the superblock list
but also capture errors, and in which case we want to avoid having
our callers issue a lock themselves since we can do the locking for
the callers. Provide a iterate_supers_excl() which calls a function
with the write lock held. If an error occurs we capture it and
propagate it.

Likewise there are use cases where we wish to traverse the superblock
list but in reverse order. The new iterate_supers_reverse_excl() helpers
does this but also also captures any errors encountered.

Signed-off-by: Luis R. Rodriguez <mcgrof@kernel.org>
---
 fs/super.c         | 91 ++++++++++++++++++++++++++++++++++++++++++++++++++++++
 include/linux/fs.h |  2 ++
 2 files changed, 93 insertions(+)

diff --git a/fs/super.c b/fs/super.c
index a63513d187e8..885711c1d35b 100644
--- a/fs/super.c
+++ b/fs/super.c
@@ -605,6 +605,97 @@ void iterate_supers(void (*f)(struct super_block *, void *), void *arg)
 	spin_unlock(&sb_lock);
 }
 
+/**
+ *	iterate_supers_excl - exclusively call func for all active superblocks
+ *	@f: function to call
+ *	@arg: argument to pass to it
+ *
+ *	Scans the superblock list and calls given function, passing it
+ *	locked superblock and given argument. Returns 0 unless an error
+ *	occurred on calling the function on any superblock.
+ */
+int iterate_supers_excl(int (*f)(struct super_block *, void *), void *arg)
+{
+	struct super_block *sb, *p = NULL;
+	int error = 0;
+
+	spin_lock(&sb_lock);
+	list_for_each_entry(sb, &super_blocks, s_list) {
+		if (hlist_unhashed(&sb->s_instances))
+			continue;
+		sb->s_count++;
+		spin_unlock(&sb_lock);
+
+		down_write(&sb->s_umount);
+		if (sb->s_root && (sb->s_flags & SB_BORN)) {
+			error = f(sb, arg);
+			if (error) {
+				up_write(&sb->s_umount);
+				spin_lock(&sb_lock);
+				__put_super(sb);
+				break;
+			}
+		}
+		up_write(&sb->s_umount);
+
+		spin_lock(&sb_lock);
+		if (p)
+			__put_super(p);
+		p = sb;
+	}
+	if (p)
+		__put_super(p);
+	spin_unlock(&sb_lock);
+
+	return error;
+}
+
+/**
+ *	iterate_supers_reverse_excl - exclusively calls func in reverse order
+ *	@f: function to call
+ *	@arg: argument to pass to it
+ *
+ *	Scans the superblock list and calls given function, passing it
+ *	locked superblock and given argument, in reverse order, and holding
+ *	the s_umount write lock. Returns if an error occurred.
+ */
+int iterate_supers_reverse_excl(int (*f)(struct super_block *, void *),
+					 void *arg)
+{
+	struct super_block *sb, *p = NULL;
+	int error = 0;
+
+	spin_lock(&sb_lock);
+	list_for_each_entry_reverse(sb, &super_blocks, s_list) {
+		if (hlist_unhashed(&sb->s_instances))
+			continue;
+		sb->s_count++;
+		spin_unlock(&sb_lock);
+
+		down_write(&sb->s_umount);
+		if (sb->s_root && (sb->s_flags & SB_BORN)) {
+			error = f(sb, arg);
+			if (error) {
+				up_write(&sb->s_umount);
+				spin_lock(&sb_lock);
+				__put_super(sb);
+				break;
+			}
+		}
+		up_write(&sb->s_umount);
+
+		spin_lock(&sb_lock);
+		if (p)
+			__put_super(p);
+		p = sb;
+	}
+	if (p)
+		__put_super(p);
+	spin_unlock(&sb_lock);
+
+	return error;
+}
+
 /**
  *	iterate_supers_type - call function for superblocks of given type
  *	@type: fs type
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 107725b20fad..fe90b6542697 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -3164,6 +3164,8 @@ extern struct super_block *get_active_super(struct block_device *bdev);
 extern void drop_super(struct super_block *sb);
 extern void drop_super_exclusive(struct super_block *sb);
 extern void iterate_supers(void (*)(struct super_block *, void *), void *);
+extern int iterate_supers_excl(int (*f)(struct super_block *, void *), void *arg);
+extern int iterate_supers_reverse_excl(int (*)(struct super_block *, void *), void *);
 extern void iterate_supers_type(struct file_system_type *,
 			        void (*)(struct super_block *, void *), void *);
 
-- 
2.15.0

^ permalink raw reply related	[flat|nested] 47+ messages in thread

* [PATCH 06/11] fs: freeze on suspend and thaw on resume
  2017-11-29 23:23 [PATCH 00/11] fs: use freeze_fs on suspend/hibernate Luis R. Rodriguez
                   ` (4 preceding siblings ...)
  2017-11-29 23:23 ` [PATCH 05/11] fs: add iterate_supers_excl() and iterate_supers_reverse_excl() Luis R. Rodriguez
@ 2017-11-29 23:23 ` Luis R. Rodriguez
  2017-11-29 23:23 ` [PATCH 07/11] xfs: remove not needed freezing calls Luis R. Rodriguez
                   ` (8 subsequent siblings)
  14 siblings, 0 replies; 47+ messages in thread
From: Luis R. Rodriguez @ 2017-11-29 23:23 UTC (permalink / raw)
  To: viro, bart.vanassche, ming.lei, tytso, darrick.wong, jikos, rjw,
	pavel, len.brown, linux-fsdevel
  Cc: boris.ostrovsky, jgross, todd.e.brandt, nborisov, jack,
	martin.petersen, ONeukum, oleksandr, oleg.b.antonyan,
	yu.chen.surf, dan.j.williams, linux-pm, linux-block, linux-xfs,
	linux-kernel, Luis R. Rodriguez

This uses the existing filesystem freeze and thaw callbacks to
freeze each filesystem on suspend/hibernation and thaw upon resume.

This is needed so that we properly really stop IO in flight without
races after userspace has been frozen. Without this we rely on
kthread freezing and its semantics are loose and error prone.
For instance, even though a kthread may use try_to_freeze() and end
up being frozen we have no way of being sure that everything that
has been spawned asynchronously from it (such as timers) have also
been stopped as well.

A long term advantage of also adding filesystem freeze / thawing
supporting durign suspend / hibernation is that long term we may
be able to eventually drop the kernel's thread freezing completely
as it was originally added to stop disk IO in flight as we hibernate
or suspend.

This also implies that many kthread users exist which have been
adding freezer semantics onto its kthreads without need. These also
will need to be reviewed later.

This is based on prior work originally by Rafael Wysocki and later by
Jiri Kosina.

Signed-off-by: Luis R. Rodriguez <mcgrof@kernel.org>
---
 fs/super.c             | 85 ++++++++++++++++++++++++++++++++++++++++++++++++++
 include/linux/fs.h     | 13 ++++++++
 kernel/power/process.c | 15 ++++++++-
 3 files changed, 112 insertions(+), 1 deletion(-)

diff --git a/fs/super.c b/fs/super.c
index 885711c1d35b..c3a2842e5690 100644
--- a/fs/super.c
+++ b/fs/super.c
@@ -1648,3 +1648,88 @@ int thaw_super(struct super_block *sb)
 	return error;
 }
 EXPORT_SYMBOL(thaw_super);
+
+#ifdef CONFIG_PM_SLEEP
+static bool super_should_freeze(struct super_block *sb)
+{
+	if (!sb->s_root)
+		return false;
+	if (!(sb->s_flags & MS_BORN))
+		return false;
+	/*
+	 * We don't freeze virtual filesystems, we skip those filesystems with
+	 * no backing device.
+	 */
+	if (sb->s_bdi == &noop_backing_dev_info)
+		return false;
+	/* No need to freeze read-only filesystems */
+	if (sb->s_flags & MS_RDONLY)
+		return false;
+
+	return true;
+}
+
+static int fs_suspend_freeze_sb(struct super_block *sb, void *priv)
+{
+	int error = 0;
+
+	spin_lock(&sb_lock);
+	if (!super_should_freeze(sb))
+		goto out;
+
+	pr_info("%s (%s): freezing\n", sb->s_type->name, sb->s_id);
+
+	spin_unlock(&sb_lock);
+
+	atomic_inc(&sb->s_active);
+	error = freeze_locked_super(sb, false);
+	if (error)
+		atomic_dec(&sb->s_active);
+
+	spin_lock(&sb_lock);
+	if (error && error != -EBUSY)
+		pr_notice("%s (%s): Unable to freeze, error=%d",
+			  sb->s_type->name, sb->s_id, error);
+
+out:
+	spin_unlock(&sb_lock);
+	return error;
+}
+
+int fs_suspend_freeze(void)
+{
+	return iterate_supers_reverse_excl(fs_suspend_freeze_sb, NULL);
+}
+
+static int fs_suspend_thaw_sb(struct super_block *sb, void *priv)
+{
+	int error = 0;
+
+	spin_lock(&sb_lock);
+	if (!super_should_freeze(sb))
+		goto out;
+
+	pr_info("%s (%s): thawing\n", sb->s_type->name, sb->s_id);
+
+	spin_unlock(&sb_lock);
+
+	error = thaw_locked_super(sb, false);
+	if (!error)
+		atomic_dec(&sb->s_active);
+
+	spin_lock(&sb_lock);
+	if (error && error != -EBUSY)
+		pr_notice("%s (%s): Unable to unfreeze, error=%d",
+			  sb->s_type->name, sb->s_id, error);
+
+out:
+	spin_unlock(&sb_lock);
+	return error;
+}
+
+int fs_resume_unfreeze(void)
+{
+	return iterate_supers_excl(fs_suspend_thaw_sb, NULL);
+}
+
+#endif
diff --git a/include/linux/fs.h b/include/linux/fs.h
index fe90b6542697..dbaa69c3a4cf 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -2237,6 +2237,19 @@ extern int user_statfs(const char __user *, struct kstatfs *);
 extern int fd_statfs(int, struct kstatfs *);
 extern int freeze_super(struct super_block *super);
 extern int thaw_super(struct super_block *super);
+#ifdef CONFIG_PM_SLEEP
+int fs_suspend_freeze(void);
+int fs_resume_unfreeze(void);
+#else
+static inline int fs_suspend_freeze(void)
+{
+	return 0;
+}
+static inline int fs_resume_unfreeze(void)
+{
+	return 0;
+}
+#endif
 extern bool our_mnt(struct vfsmount *mnt);
 extern __printf(2, 3)
 int super_setup_bdi_name(struct super_block *sb, char *fmt, ...);
diff --git a/kernel/power/process.c b/kernel/power/process.c
index c326d7235c5f..7a44f8310968 100644
--- a/kernel/power/process.c
+++ b/kernel/power/process.c
@@ -145,6 +145,16 @@ int freeze_processes(void)
 	pr_cont("\n");
 	BUG_ON(in_atomic());
 
+	pr_info("Freezing filesystems ... ");
+	error = fs_suspend_freeze();
+	if (error) {
+		pr_cont("failed\n");
+		fs_resume_unfreeze();
+		thaw_processes();
+		return error;
+	}
+	pr_cont("done.\n");
+
 	/*
 	 * Now that the whole userspace is frozen we need to disbale
 	 * the OOM killer to disallow any further interference with
@@ -154,8 +164,10 @@ int freeze_processes(void)
 	if (!error && !oom_killer_disable(msecs_to_jiffies(freeze_timeout_msecs)))
 		error = -EBUSY;
 
-	if (error)
+	if (error) {
+		fs_resume_unfreeze();
 		thaw_processes();
+	}
 	return error;
 }
 
@@ -198,6 +210,7 @@ void thaw_processes(void)
 	pm_nosig_freezing = false;
 
 	oom_killer_enable();
+	fs_resume_unfreeze();
 
 	pr_info("Restarting tasks ... ");
 
-- 
2.15.0

^ permalink raw reply related	[flat|nested] 47+ messages in thread

* [PATCH 07/11] xfs: remove not needed freezing calls
  2017-11-29 23:23 [PATCH 00/11] fs: use freeze_fs on suspend/hibernate Luis R. Rodriguez
                   ` (5 preceding siblings ...)
  2017-11-29 23:23 ` [PATCH 06/11] fs: freeze on suspend and thaw on resume Luis R. Rodriguez
@ 2017-11-29 23:23 ` Luis R. Rodriguez
  2017-11-30 16:21   ` Jan Kara
  2017-11-29 23:23 ` [PATCH 08/11] ext4: " Luis R. Rodriguez
                   ` (7 subsequent siblings)
  14 siblings, 1 reply; 47+ messages in thread
From: Luis R. Rodriguez @ 2017-11-29 23:23 UTC (permalink / raw)
  To: viro, bart.vanassche, ming.lei, tytso, darrick.wong, jikos, rjw,
	pavel, len.brown, linux-fsdevel
  Cc: boris.ostrovsky, jgross, todd.e.brandt, nborisov, jack,
	martin.petersen, ONeukum, oleksandr, oleg.b.antonyan,
	yu.chen.surf, dan.j.williams, linux-pm, linux-block, linux-xfs,
	linux-kernel, Luis R. Rodriguez

This removes superflous freezer calls as they are no longer needed
as the VFS now performs filesystem freezing/thaw if the filesystem has
support for it.

The following Coccinelle rule was used as follows:

spatch --sp-file fs-freeze-cleanup.cocci --in-place fs/$FS/

@ has_freeze_fs @
identifier super_ops;
expression freeze_op;
@@

struct super_operations super_ops = {
	.freeze_fs = freeze_op,
};

@ remove_set_freezable depends on has_freeze_fs @
expression time;
statement S, S2, S3;
expression task;
@@

(
-	set_freezable();
|
-	if (try_to_freeze())
-		continue;
|
-	try_to_freeze();
|
-	freezable_schedule();
+	schedule();
|
-	freezable_schedule_timeout(time);
+	schedule_timeout(time);
|
-	if (freezing(task)) { S }
|
-	if (freezing(task)) { S }
-	else
	{ S2 }
|
-	freezing(current)
)

Generated-by: Coccinelle SmPL
Signed-off-by: Luis R. Rodriguez <mcgrof@kernel.org>
---
 fs/xfs/xfs_trans_ail.c | 7 ++-----
 1 file changed, 2 insertions(+), 5 deletions(-)

diff --git a/fs/xfs/xfs_trans_ail.c b/fs/xfs/xfs_trans_ail.c
index cef89f7127d3..1f3dd10a9d00 100644
--- a/fs/xfs/xfs_trans_ail.c
+++ b/fs/xfs/xfs_trans_ail.c
@@ -513,7 +513,6 @@ xfsaild(
 	long		tout = 0;	/* milliseconds */
 
 	current->flags |= PF_MEMALLOC;
-	set_freezable();
 
 	while (1) {
 		if (tout && tout <= 20)
@@ -551,19 +550,17 @@ xfsaild(
 		if (!xfs_ail_min(ailp) &&
 		    ailp->xa_target == ailp->xa_target_prev) {
 			spin_unlock(&ailp->xa_lock);
-			freezable_schedule();
+			schedule();
 			tout = 0;
 			continue;
 		}
 		spin_unlock(&ailp->xa_lock);
 
 		if (tout)
-			freezable_schedule_timeout(msecs_to_jiffies(tout));
+			schedule_timeout(msecs_to_jiffies(tout));
 
 		__set_current_state(TASK_RUNNING);
 
-		try_to_freeze();
-
 		tout = xfsaild_push(ailp);
 	}
 
-- 
2.15.0

^ permalink raw reply related	[flat|nested] 47+ messages in thread

* [PATCH 08/11] ext4: remove not needed freezing calls
  2017-11-29 23:23 [PATCH 00/11] fs: use freeze_fs on suspend/hibernate Luis R. Rodriguez
                   ` (6 preceding siblings ...)
  2017-11-29 23:23 ` [PATCH 07/11] xfs: remove not needed freezing calls Luis R. Rodriguez
@ 2017-11-29 23:23 ` Luis R. Rodriguez
  2017-11-29 23:23 ` [PATCH 09/11] f2fs: " Luis R. Rodriguez
                   ` (6 subsequent siblings)
  14 siblings, 0 replies; 47+ messages in thread
From: Luis R. Rodriguez @ 2017-11-29 23:23 UTC (permalink / raw)
  To: viro, bart.vanassche, ming.lei, tytso, darrick.wong, jikos, rjw,
	pavel, len.brown, linux-fsdevel
  Cc: boris.ostrovsky, jgross, todd.e.brandt, nborisov, jack,
	martin.petersen, ONeukum, oleksandr, oleg.b.antonyan,
	yu.chen.surf, dan.j.williams, linux-pm, linux-block, linux-xfs,
	linux-kernel, Luis R. Rodriguez

This removes superflous freezer calls as they are no longer needed
as the VFS now performs filesystem freezing/thaw if the filesystem has
support for it.

The following Coccinelle rule was used as follows:

spatch --sp-file fs-freeze-cleanup.cocci --in-place fs/$FS/

@ has_freeze_fs @
identifier super_ops;
expression freeze_op;
@@

struct super_operations super_ops = {
	.freeze_fs = freeze_op,
};

@ remove_set_freezable depends on has_freeze_fs @
expression time;
statement S, S2, S3;
expression task;
@@

(
-	set_freezable();
|
-	if (try_to_freeze())
-		continue;
|
-	try_to_freeze();
|
-	freezable_schedule();
+	schedule();
|
-	freezable_schedule_timeout(time);
+	schedule_timeout(time);
|
-	if (freezing(task)) { S }
|
-	if (freezing(task)) { S }
-	else
	{ S2 }
|
-	freezing(current)
)

Generated-by: Coccinelle SmPL
Signed-off-by: Luis R. Rodriguez <mcgrof@kernel.org>
---
 fs/ext4/super.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/fs/ext4/super.c b/fs/ext4/super.c
index 57a8fa451d3e..8a510b1c2d92 100644
--- a/fs/ext4/super.c
+++ b/fs/ext4/super.c
@@ -2980,8 +2980,6 @@ static int ext4_lazyinit_thread(void *arg)
 		}
 		mutex_unlock(&eli->li_list_mtx);
 
-		try_to_freeze();
-
 		cur = jiffies;
 		if ((time_after_eq(cur, next_wakeup)) ||
 		    (MAX_JIFFY_OFFSET == next_wakeup)) {
-- 
2.15.0

^ permalink raw reply related	[flat|nested] 47+ messages in thread

* [PATCH 09/11] f2fs: remove not needed freezing calls
  2017-11-29 23:23 [PATCH 00/11] fs: use freeze_fs on suspend/hibernate Luis R. Rodriguez
                   ` (7 preceding siblings ...)
  2017-11-29 23:23 ` [PATCH 08/11] ext4: " Luis R. Rodriguez
@ 2017-11-29 23:23 ` Luis R. Rodriguez
  2017-11-29 23:23 ` [PATCH 10/11] nilfs2: " Luis R. Rodriguez
                   ` (5 subsequent siblings)
  14 siblings, 0 replies; 47+ messages in thread
From: Luis R. Rodriguez @ 2017-11-29 23:23 UTC (permalink / raw)
  To: viro, bart.vanassche, ming.lei, tytso, darrick.wong, jikos, rjw,
	pavel, len.brown, linux-fsdevel
  Cc: boris.ostrovsky, jgross, todd.e.brandt, nborisov, jack,
	martin.petersen, ONeukum, oleksandr, oleg.b.antonyan,
	yu.chen.surf, dan.j.williams, linux-pm, linux-block, linux-xfs,
	linux-kernel, Luis R. Rodriguez

This removes superflous freezer calls as they are no longer needed
as the VFS now performs filesystem freezing/thaw if the filesystem has
support for it.

The following Coccinelle rule was used as follows:

spatch --sp-file fs-freeze-cleanup.cocci --in-place fs/$FS/

@ has_freeze_fs @
identifier super_ops;
expression freeze_op;
@@

struct super_operations super_ops = {
	.freeze_fs = freeze_op,
};

@ remove_set_freezable depends on has_freeze_fs @
expression time;
statement S, S2, S3;
expression task;
@@

(
-	set_freezable();
|
-	if (try_to_freeze())
-		continue;
|
-	try_to_freeze();
|
-	freezable_schedule();
+	schedule();
|
-	freezable_schedule_timeout(time);
+	schedule_timeout(time);
|
-	if (freezing(task)) { S }
|
-	if (freezing(task)) { S }
-	else
	{ S2 }
|
-	freezing(current)
)

Generated-by: Coccinelle SmPL
Signed-off-by: Luis R. Rodriguez <mcgrof@kernel.org>
---
 fs/f2fs/gc.c      | 5 +----
 fs/f2fs/segment.c | 6 +-----
 2 files changed, 2 insertions(+), 9 deletions(-)

diff --git a/fs/f2fs/gc.c b/fs/f2fs/gc.c
index d844dcb80570..1032d6aa1756 100644
--- a/fs/f2fs/gc.c
+++ b/fs/f2fs/gc.c
@@ -32,10 +32,9 @@ static int gc_thread_func(void *data)
 
 	wait_ms = gc_th->min_sleep_time;
 
-	set_freezable();
 	do {
 		wait_event_interruptible_timeout(*wq,
-				kthread_should_stop() || freezing(current) ||
+				kthread_should_stop() ||
 				gc_th->gc_wake,
 				msecs_to_jiffies(wait_ms));
 
@@ -43,8 +42,6 @@ static int gc_thread_func(void *data)
 		if (gc_th->gc_wake)
 			gc_th->gc_wake = 0;
 
-		if (try_to_freeze())
-			continue;
 		if (kthread_should_stop())
 			break;
 
diff --git a/fs/f2fs/segment.c b/fs/f2fs/segment.c
index c117e0913f2a..a55e456e67ee 100644
--- a/fs/f2fs/segment.c
+++ b/fs/f2fs/segment.c
@@ -1382,18 +1382,14 @@ static int issue_discard_thread(void *data)
 	unsigned int wait_ms = DEF_MIN_DISCARD_ISSUE_TIME;
 	int issued;
 
-	set_freezable();
-
 	do {
 		init_discard_policy(&dpolicy, DPOLICY_BG,
 					dcc->discard_granularity);
 
 		wait_event_interruptible_timeout(*q,
-				kthread_should_stop() || freezing(current) ||
+				kthread_should_stop() ||
 				dcc->discard_wake,
 				msecs_to_jiffies(wait_ms));
-		if (try_to_freeze())
-			continue;
 		if (kthread_should_stop())
 			return 0;
 
-- 
2.15.0

^ permalink raw reply related	[flat|nested] 47+ messages in thread

* [PATCH 10/11] nilfs2: remove not needed freezing calls
  2017-11-29 23:23 [PATCH 00/11] fs: use freeze_fs on suspend/hibernate Luis R. Rodriguez
                   ` (8 preceding siblings ...)
  2017-11-29 23:23 ` [PATCH 09/11] f2fs: " Luis R. Rodriguez
@ 2017-11-29 23:23 ` Luis R. Rodriguez
  2017-11-29 23:23 ` [PATCH 11/11] jfs: " Luis R. Rodriguez
                   ` (4 subsequent siblings)
  14 siblings, 0 replies; 47+ messages in thread
From: Luis R. Rodriguez @ 2017-11-29 23:23 UTC (permalink / raw)
  To: viro, bart.vanassche, ming.lei, tytso, darrick.wong, jikos, rjw,
	pavel, len.brown, linux-fsdevel
  Cc: boris.ostrovsky, jgross, todd.e.brandt, nborisov, jack,
	martin.petersen, ONeukum, oleksandr, oleg.b.antonyan,
	yu.chen.surf, dan.j.williams, linux-pm, linux-block, linux-xfs,
	linux-kernel, Luis R. Rodriguez

This removes superflous freezer calls as they are no longer needed
as the VFS now performs filesystem freezing/thaw if the filesystem has
support for it.

The following Coccinelle rule was used as follows:

spatch --sp-file fs-freeze-cleanup.cocci --in-place fs/$FS/

@ has_freeze_fs @
identifier super_ops;
expression freeze_op;
@@

struct super_operations super_ops = {
	.freeze_fs = freeze_op,
};

@ remove_set_freezable depends on has_freeze_fs @
expression time;
statement S, S2, S3;
expression task;
@@

(
-	set_freezable();
|
-	if (try_to_freeze())
-		continue;
|
-	try_to_freeze();
|
-	freezable_schedule();
+	schedule();
|
-	freezable_schedule_timeout(time);
+	schedule_timeout(time);
|
-	if (freezing(task)) { S }
|
-	if (freezing(task)) { S }
-	else
	{ S2 }
|
-	freezing(current)
)

Generated-by: Coccinelle SmPL
Signed-off-by: Luis R. Rodriguez <mcgrof@kernel.org>
---
 fs/nilfs2/segment.c | 48 ++++++++++++++++++++----------------------------
 1 file changed, 20 insertions(+), 28 deletions(-)

diff --git a/fs/nilfs2/segment.c b/fs/nilfs2/segment.c
index 9f3ffba41533..407e12a60145 100644
--- a/fs/nilfs2/segment.c
+++ b/fs/nilfs2/segment.c
@@ -2543,6 +2543,8 @@ static int nilfs_segctor_thread(void *arg)
 	struct nilfs_sc_info *sci = (struct nilfs_sc_info *)arg;
 	struct the_nilfs *nilfs = sci->sc_super->s_fs_info;
 	int timeout = 0;
+	DEFINE_WAIT(wait);
+	int should_sleep = 1;
 
 	sci->sc_timer_task = current;
 
@@ -2574,38 +2576,28 @@ static int nilfs_segctor_thread(void *arg)
 		timeout = 0;
 	}
 
+	prepare_to_wait(&sci->sc_wait_daemon, &wait,
+			TASK_INTERRUPTIBLE);
 
-	if (freezing(current)) {
+	if (sci->sc_seq_request != sci->sc_seq_done)
+		should_sleep = 0;
+	else if (sci->sc_flush_request)
+		should_sleep = 0;
+	else if (sci->sc_state & NILFS_SEGCTOR_COMMIT)
+		should_sleep = time_before(jiffies,
+				sci->sc_timer.expires);
+
+	if (should_sleep) {
 		spin_unlock(&sci->sc_state_lock);
-		try_to_freeze();
+		schedule();
 		spin_lock(&sci->sc_state_lock);
-	} else {
-		DEFINE_WAIT(wait);
-		int should_sleep = 1;
-
-		prepare_to_wait(&sci->sc_wait_daemon, &wait,
-				TASK_INTERRUPTIBLE);
-
-		if (sci->sc_seq_request != sci->sc_seq_done)
-			should_sleep = 0;
-		else if (sci->sc_flush_request)
-			should_sleep = 0;
-		else if (sci->sc_state & NILFS_SEGCTOR_COMMIT)
-			should_sleep = time_before(jiffies,
-					sci->sc_timer.expires);
-
-		if (should_sleep) {
-			spin_unlock(&sci->sc_state_lock);
-			schedule();
-			spin_lock(&sci->sc_state_lock);
-		}
-		finish_wait(&sci->sc_wait_daemon, &wait);
-		timeout = ((sci->sc_state & NILFS_SEGCTOR_COMMIT) &&
-			   time_after_eq(jiffies, sci->sc_timer.expires));
-
-		if (nilfs_sb_dirty(nilfs) && nilfs_sb_need_update(nilfs))
-			set_nilfs_discontinued(nilfs);
 	}
+	finish_wait(&sci->sc_wait_daemon, &wait);
+	timeout = ((sci->sc_state & NILFS_SEGCTOR_COMMIT) &&
+		   time_after_eq(jiffies, sci->sc_timer.expires));
+
+	if (nilfs_sb_dirty(nilfs) && nilfs_sb_need_update(nilfs))
+		set_nilfs_discontinued(nilfs);
 	goto loop;
 
  end_thread:
-- 
2.15.0

^ permalink raw reply related	[flat|nested] 47+ messages in thread

* [PATCH 11/11] jfs: remove not needed freezing calls
  2017-11-29 23:23 [PATCH 00/11] fs: use freeze_fs on suspend/hibernate Luis R. Rodriguez
                   ` (9 preceding siblings ...)
  2017-11-29 23:23 ` [PATCH 10/11] nilfs2: " Luis R. Rodriguez
@ 2017-11-29 23:23 ` Luis R. Rodriguez
  2017-11-30 16:36 ` [PATCH 00/11] fs: use freeze_fs on suspend/hibernate Yu Chen
                   ` (3 subsequent siblings)
  14 siblings, 0 replies; 47+ messages in thread
From: Luis R. Rodriguez @ 2017-11-29 23:23 UTC (permalink / raw)
  To: viro, bart.vanassche, ming.lei, tytso, darrick.wong, jikos, rjw,
	pavel, len.brown, linux-fsdevel
  Cc: boris.ostrovsky, jgross, todd.e.brandt, nborisov, jack,
	martin.petersen, ONeukum, oleksandr, oleg.b.antonyan,
	yu.chen.surf, dan.j.williams, linux-pm, linux-block, linux-xfs,
	linux-kernel, Luis R. Rodriguez

This removes superflous freezer calls as they are no longer needed
as the VFS now performs filesystem freezing/thaw if the filesystem has
support for it.

The following Coccinelle rule was used as follows:

spatch --sp-file fs-freeze-cleanup.cocci --in-place fs/$FS/

@ has_freeze_fs @
identifier super_ops;
expression freeze_op;
@@

struct super_operations super_ops = {
	.freeze_fs = freeze_op,
};

@ remove_set_freezable depends on has_freeze_fs @
expression time;
statement S, S2, S3;
expression task;
@@

(
-	set_freezable();
|
-	if (try_to_freeze())
-		continue;
|
-	try_to_freeze();
|
-	freezable_schedule();
+	schedule();
|
-	freezable_schedule_timeout(time);
+	schedule_timeout(time);
|
-	if (freezing(task)) { S }
|
-	if (freezing(task)) { S }
-	else
	{ S2 }
|
-	freezing(current)
)

Generated-by: Coccinelle SmPL
Signed-off-by: Luis R. Rodriguez <mcgrof@kernel.org>
---
 fs/jfs/jfs_logmgr.c | 11 +++--------
 fs/jfs/jfs_txnmgr.c | 31 +++++++++----------------------
 2 files changed, 12 insertions(+), 30 deletions(-)

diff --git a/fs/jfs/jfs_logmgr.c b/fs/jfs/jfs_logmgr.c
index 0e5d412c0b01..fa5a95d8fba8 100644
--- a/fs/jfs/jfs_logmgr.c
+++ b/fs/jfs/jfs_logmgr.c
@@ -2344,14 +2344,9 @@ int jfsIOWait(void *arg)
 			spin_lock_irq(&log_redrive_lock);
 		}
 
-		if (freezing(current)) {
-			spin_unlock_irq(&log_redrive_lock);
-			try_to_freeze();
-		} else {
-			set_current_state(TASK_INTERRUPTIBLE);
-			spin_unlock_irq(&log_redrive_lock);
-			schedule();
-		}
+		set_current_state(TASK_INTERRUPTIBLE);
+		spin_unlock_irq(&log_redrive_lock);
+		schedule();
 	} while (!kthread_should_stop());
 
 	jfs_info("jfsIOWait being killed!");
diff --git a/fs/jfs/jfs_txnmgr.c b/fs/jfs/jfs_txnmgr.c
index 4d973524c887..a313300c4651 100644
--- a/fs/jfs/jfs_txnmgr.c
+++ b/fs/jfs/jfs_txnmgr.c
@@ -2747,6 +2747,7 @@ int jfs_lazycommit(void *arg)
 	struct tblock *tblk;
 	unsigned long flags;
 	struct jfs_sb_info *sbi;
+	DECLARE_WAITQUEUE(wq, current);
 
 	do {
 		LAZY_LOCK(flags);
@@ -2793,19 +2794,11 @@ int jfs_lazycommit(void *arg)
 		}
 		/* In case a wakeup came while all threads were active */
 		jfs_commit_thread_waking = 0;
-
-		if (freezing(current)) {
-			LAZY_UNLOCK(flags);
-			try_to_freeze();
-		} else {
-			DECLARE_WAITQUEUE(wq, current);
-
-			add_wait_queue(&jfs_commit_thread_wait, &wq);
-			set_current_state(TASK_INTERRUPTIBLE);
-			LAZY_UNLOCK(flags);
-			schedule();
-			remove_wait_queue(&jfs_commit_thread_wait, &wq);
-		}
+		add_wait_queue(&jfs_commit_thread_wait, &wq);
+		set_current_state(TASK_INTERRUPTIBLE);
+		LAZY_UNLOCK(flags);
+		schedule();
+		remove_wait_queue(&jfs_commit_thread_wait, &wq);
 	} while (!kthread_should_stop());
 
 	if (!list_empty(&TxAnchor.unlock_queue))
@@ -2982,15 +2975,9 @@ int jfs_sync(void *arg)
 		}
 		/* Add anon_list2 back to anon_list */
 		list_splice_init(&TxAnchor.anon_list2, &TxAnchor.anon_list);
-
-		if (freezing(current)) {
-			TXN_UNLOCK();
-			try_to_freeze();
-		} else {
-			set_current_state(TASK_INTERRUPTIBLE);
-			TXN_UNLOCK();
-			schedule();
-		}
+		set_current_state(TASK_INTERRUPTIBLE);
+		TXN_UNLOCK();
+		schedule();
 	} while (!kthread_should_stop());
 
 	jfs_info("jfs_sync being killed");
-- 
2.15.0

^ permalink raw reply related	[flat|nested] 47+ messages in thread

* Re: [PATCH 05/11] fs: add iterate_supers_excl() and iterate_supers_reverse_excl()
  2017-11-29 23:23 ` [PATCH 05/11] fs: add iterate_supers_excl() and iterate_supers_reverse_excl() Luis R. Rodriguez
@ 2017-11-29 23:48   ` Rafael J. Wysocki
  2017-11-30  0:22     ` Luis R. Rodriguez
  2017-11-30  1:34     ` Dave Chinner
  2017-11-30 16:57   ` Jan Kara
  1 sibling, 2 replies; 47+ messages in thread
From: Rafael J. Wysocki @ 2017-11-29 23:48 UTC (permalink / raw)
  To: Luis R. Rodriguez
  Cc: Al Viro, bart.vanassche, ming.lei, Ted Ts'o, Darrick J. Wong,
	Jiri Kosina, Rafael J. Wysocki, Pavel Machek, Len Brown,
	linux-fsdevel, Boris Ostrovsky, Juergen Gross, Todd Brandt,
	nborisov, Jan Kara, Martin K. Petersen, Oliver Neukum, oleksandr,
	Oleg Antonyan, Yu Chen, Dan Williams, Linux PM, linux-block,
	linux-xfs, Linux Kernel Mailing List

On Thu, Nov 30, 2017 at 12:23 AM, Luis R. Rodriguez <mcgrof@kernel.org> wrote:
> There are use cases where we wish to traverse the superblock list
> but also capture errors, and in which case we want to avoid having
> our callers issue a lock themselves since we can do the locking for
> the callers. Provide a iterate_supers_excl() which calls a function
> with the write lock held. If an error occurs we capture it and
> propagate it.
>
> Likewise there are use cases where we wish to traverse the superblock
> list but in reverse order. The new iterate_supers_reverse_excl() helpers
> does this but also also captures any errors encountered.
>
> Signed-off-by: Luis R. Rodriguez <mcgrof@kernel.org>
> ---
>  fs/super.c         | 91 ++++++++++++++++++++++++++++++++++++++++++++++++++++++
>  include/linux/fs.h |  2 ++
>  2 files changed, 93 insertions(+)
>
> diff --git a/fs/super.c b/fs/super.c
> index a63513d187e8..885711c1d35b 100644
> --- a/fs/super.c
> +++ b/fs/super.c
> @@ -605,6 +605,97 @@ void iterate_supers(void (*f)(struct super_block *, void *), void *arg)
>         spin_unlock(&sb_lock);
>  }
>
> +/**
> + *     iterate_supers_excl - exclusively call func for all active superblocks
> + *     @f: function to call
> + *     @arg: argument to pass to it
> + *
> + *     Scans the superblock list and calls given function, passing it
> + *     locked superblock and given argument. Returns 0 unless an error
> + *     occurred on calling the function on any superblock.
> + */
> +int iterate_supers_excl(int (*f)(struct super_block *, void *), void *arg)
> +{
> +       struct super_block *sb, *p = NULL;
> +       int error = 0;
> +
> +       spin_lock(&sb_lock);
> +       list_for_each_entry(sb, &super_blocks, s_list) {
> +               if (hlist_unhashed(&sb->s_instances))
> +                       continue;
> +               sb->s_count++;
> +               spin_unlock(&sb_lock);

Can anything bad happen if the list is modified at this point by a
concurrent thread?

> +
> +               down_write(&sb->s_umount);
> +               if (sb->s_root && (sb->s_flags & SB_BORN)) {
> +                       error = f(sb, arg);
> +                       if (error) {
> +                               up_write(&sb->s_umount);
> +                               spin_lock(&sb_lock);
> +                               __put_super(sb);
> +                               break;
> +                       }
> +               }
> +               up_write(&sb->s_umount);
> +
> +               spin_lock(&sb_lock);
> +               if (p)
> +                       __put_super(p);
> +               p = sb;
> +       }
> +       if (p)
> +               __put_super(p);
> +       spin_unlock(&sb_lock);
> +
> +       return error;
> +}
> +

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH 05/11] fs: add iterate_supers_excl() and iterate_supers_reverse_excl()
  2017-11-29 23:48   ` Rafael J. Wysocki
@ 2017-11-30  0:22     ` Luis R. Rodriguez
  2017-11-30  1:34     ` Dave Chinner
  1 sibling, 0 replies; 47+ messages in thread
From: Luis R. Rodriguez @ 2017-11-30  0:22 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Luis R. Rodriguez, Al Viro, bart.vanassche, ming.lei,
	Ted Ts'o, Darrick J. Wong, Jiri Kosina, Rafael J. Wysocki,
	Pavel Machek, Len Brown, linux-fsdevel, Boris Ostrovsky,
	Juergen Gross, Todd Brandt, nborisov, Jan Kara,
	Martin K. Petersen, Oliver Neukum, oleksandr, Oleg Antonyan,
	Yu Chen, Dan Williams, Linux PM, linux-block, linux-xfs,
	Linux Kernel Mailing List

On Thu, Nov 30, 2017 at 12:48:15AM +0100, Rafael J. Wysocki wrote:
> On Thu, Nov 30, 2017 at 12:23 AM, Luis R. Rodriguez <mcgrof@kernel.org> wrote:
> > +int iterate_supers_excl(int (*f)(struct super_block *, void *), void *arg)
> > +{
> > +       struct super_block *sb, *p = NULL;
> > +       int error = 0;
> > +
> > +       spin_lock(&sb_lock);
> > +       list_for_each_entry(sb, &super_blocks, s_list) {
> > +               if (hlist_unhashed(&sb->s_instances))
> > +                       continue;
> > +               sb->s_count++;
> > +               spin_unlock(&sb_lock);
> 
> Can anything bad happen if the list is modified at this point by a
> concurrent thread?

The race is theoretical and applies to all users of iterate_supers() as well.

Its certainly worth considering, however given other code is implicated its
not a *new* issue or race. Its the best we can do with the current design.

That said, as I looked at all this code I considered that perhaps super_blocks
deserves its own RCU lock to enable us to do full swap operations on the list,
without having to do these nasty releases in between.

If that's possible / desirable I'd consider it a welcomed future optimization,
and I could give it a shot, however its unclear if this is a requirement for
this feature at this time.

  Luis

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH 05/11] fs: add iterate_supers_excl() and iterate_supers_reverse_excl()
  2017-11-29 23:48   ` Rafael J. Wysocki
  2017-11-30  0:22     ` Luis R. Rodriguez
@ 2017-11-30  1:34     ` Dave Chinner
  2017-11-30  1:40       ` Rafael J. Wysocki
  1 sibling, 1 reply; 47+ messages in thread
From: Dave Chinner @ 2017-11-30  1:34 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Luis R. Rodriguez, Al Viro, bart.vanassche, ming.lei,
	Ted Ts'o, Darrick J. Wong, Jiri Kosina, Rafael J. Wysocki,
	Pavel Machek, Len Brown, linux-fsdevel, Boris Ostrovsky,
	Juergen Gross, Todd Brandt, nborisov, Jan Kara,
	Martin K. Petersen, Oliver Neukum, oleksandr, Oleg Antonyan,
	Yu Chen, Dan Williams, Linux PM, linux-block, linux-xfs,
	Linux Kernel Mailing List

On Thu, Nov 30, 2017 at 12:48:15AM +0100, Rafael J. Wysocki wrote:
> On Thu, Nov 30, 2017 at 12:23 AM, Luis R. Rodriguez <mcgrof@kernel.org> wrote:
> > There are use cases where we wish to traverse the superblock list
> > but also capture errors, and in which case we want to avoid having
> > our callers issue a lock themselves since we can do the locking for
> > the callers. Provide a iterate_supers_excl() which calls a function
> > with the write lock held. If an error occurs we capture it and
> > propagate it.
> >
> > Likewise there are use cases where we wish to traverse the superblock
> > list but in reverse order. The new iterate_supers_reverse_excl() helpers
> > does this but also also captures any errors encountered.
> >
> > Signed-off-by: Luis R. Rodriguez <mcgrof@kernel.org>
> > ---
> >  fs/super.c         | 91 ++++++++++++++++++++++++++++++++++++++++++++++++++++++
> >  include/linux/fs.h |  2 ++
> >  2 files changed, 93 insertions(+)
> >
> > diff --git a/fs/super.c b/fs/super.c
> > index a63513d187e8..885711c1d35b 100644
> > --- a/fs/super.c
> > +++ b/fs/super.c
> > @@ -605,6 +605,97 @@ void iterate_supers(void (*f)(struct super_block *, void *), void *arg)
> >         spin_unlock(&sb_lock);
> >  }
> >
> > +/**
> > + *     iterate_supers_excl - exclusively call func for all active superblocks
> > + *     @f: function to call
> > + *     @arg: argument to pass to it
> > + *
> > + *     Scans the superblock list and calls given function, passing it
> > + *     locked superblock and given argument. Returns 0 unless an error
> > + *     occurred on calling the function on any superblock.
> > + */
> > +int iterate_supers_excl(int (*f)(struct super_block *, void *), void *arg)
> > +{
> > +       struct super_block *sb, *p = NULL;
> > +       int error = 0;
> > +
> > +       spin_lock(&sb_lock);
> > +       list_for_each_entry(sb, &super_blocks, s_list) {
> > +               if (hlist_unhashed(&sb->s_instances))
> > +                       continue;
> > +               sb->s_count++;
> > +               spin_unlock(&sb_lock);
> 
> Can anything bad happen if the list is modified at this point by a
> concurrent thread?

No. We have a valid reference to sb->s_count and that keeps it on
the list while we have the lock dropped. The sb reference isn't
dropped until we've iterated to the next sb on the list and taken a
reference to that, hence it's safe to drop and regain the list lock
without needing to restart the iteration.

> > +
> > +               down_write(&sb->s_umount);
> > +               if (sb->s_root && (sb->s_flags & SB_BORN)) {
> > +                       error = f(sb, arg);
> > +                       if (error) {
> > +                               up_write(&sb->s_umount);
> > +                               spin_lock(&sb_lock);
> > +                               __put_super(sb);
> > +                               break;
> > +                       }
> > +               }
> > +               up_write(&sb->s_umount);
> > +
> > +               spin_lock(&sb_lock);
> > +               if (p)
> > +                       __put_super(p);
> > +               p = sb;

This code here is what drops the reference to the previous sb
we've iterated past.

FWIW, this "hold until next is held" iteration pattern is used
frequently for inodes, dentries, and other reference counted VFS
objects so we can iterate the list without needing to hold the
list lock for the entire iteration....

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH 05/11] fs: add iterate_supers_excl() and iterate_supers_reverse_excl()
  2017-11-30  1:34     ` Dave Chinner
@ 2017-11-30  1:40       ` Rafael J. Wysocki
  0 siblings, 0 replies; 47+ messages in thread
From: Rafael J. Wysocki @ 2017-11-30  1:40 UTC (permalink / raw)
  To: Dave Chinner
  Cc: Rafael J. Wysocki, Luis R. Rodriguez, Al Viro, bart.vanassche,
	ming.lei, Ted Ts'o, Darrick J. Wong, Jiri Kosina,
	Rafael J. Wysocki, Pavel Machek, Len Brown, linux-fsdevel,
	Boris Ostrovsky, Juergen Gross, Todd Brandt, nborisov, Jan Kara,
	Martin K. Petersen, Oliver Neukum, oleksandr, Oleg Antonyan,
	Yu Chen, Dan Williams, Linux PM, linux-block, linux-xfs,
	Linux Kernel Mailing List

On Thu, Nov 30, 2017 at 2:34 AM, Dave Chinner <david@fromorbit.com> wrote:
> On Thu, Nov 30, 2017 at 12:48:15AM +0100, Rafael J. Wysocki wrote:
>> On Thu, Nov 30, 2017 at 12:23 AM, Luis R. Rodriguez <mcgrof@kernel.org> wrote:
>> > There are use cases where we wish to traverse the superblock list
>> > but also capture errors, and in which case we want to avoid having
>> > our callers issue a lock themselves since we can do the locking for
>> > the callers. Provide a iterate_supers_excl() which calls a function
>> > with the write lock held. If an error occurs we capture it and
>> > propagate it.
>> >
>> > Likewise there are use cases where we wish to traverse the superblock
>> > list but in reverse order. The new iterate_supers_reverse_excl() helpers
>> > does this but also also captures any errors encountered.
>> >
>> > Signed-off-by: Luis R. Rodriguez <mcgrof@kernel.org>
>> > ---
>> >  fs/super.c         | 91 ++++++++++++++++++++++++++++++++++++++++++++++++++++++
>> >  include/linux/fs.h |  2 ++
>> >  2 files changed, 93 insertions(+)
>> >
>> > diff --git a/fs/super.c b/fs/super.c
>> > index a63513d187e8..885711c1d35b 100644
>> > --- a/fs/super.c
>> > +++ b/fs/super.c
>> > @@ -605,6 +605,97 @@ void iterate_supers(void (*f)(struct super_block *, void *), void *arg)
>> >         spin_unlock(&sb_lock);
>> >  }
>> >
>> > +/**
>> > + *     iterate_supers_excl - exclusively call func for all active superblocks
>> > + *     @f: function to call
>> > + *     @arg: argument to pass to it
>> > + *
>> > + *     Scans the superblock list and calls given function, passing it
>> > + *     locked superblock and given argument. Returns 0 unless an error
>> > + *     occurred on calling the function on any superblock.
>> > + */
>> > +int iterate_supers_excl(int (*f)(struct super_block *, void *), void *arg)
>> > +{
>> > +       struct super_block *sb, *p = NULL;
>> > +       int error = 0;
>> > +
>> > +       spin_lock(&sb_lock);
>> > +       list_for_each_entry(sb, &super_blocks, s_list) {
>> > +               if (hlist_unhashed(&sb->s_instances))
>> > +                       continue;
>> > +               sb->s_count++;
>> > +               spin_unlock(&sb_lock);
>>
>> Can anything bad happen if the list is modified at this point by a
>> concurrent thread?
>
> No. We have a valid reference to sb->s_count and that keeps it on
> the list while we have the lock dropped. The sb reference isn't
> dropped until we've iterated to the next sb on the list and taken a
> reference to that, hence it's safe to drop and regain the list lock
> without needing to restart the iteration.
>
>> > +
>> > +               down_write(&sb->s_umount);
>> > +               if (sb->s_root && (sb->s_flags & SB_BORN)) {
>> > +                       error = f(sb, arg);
>> > +                       if (error) {
>> > +                               up_write(&sb->s_umount);
>> > +                               spin_lock(&sb_lock);
>> > +                               __put_super(sb);
>> > +                               break;
>> > +                       }
>> > +               }
>> > +               up_write(&sb->s_umount);
>> > +
>> > +               spin_lock(&sb_lock);
>> > +               if (p)
>> > +                       __put_super(p);
>> > +               p = sb;
>
> This code here is what drops the reference to the previous sb
> we've iterated past.
>
> FWIW, this "hold until next is held" iteration pattern is used
> frequently for inodes, dentries, and other reference counted VFS
> objects so we can iterate the list without needing to hold the
> list lock for the entire iteration....

OK, thanks!

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH 07/11] xfs: remove not needed freezing calls
  2017-11-29 23:23 ` [PATCH 07/11] xfs: remove not needed freezing calls Luis R. Rodriguez
@ 2017-11-30 16:21   ` Jan Kara
  2017-11-30 20:32     ` Rafael J. Wysocki
  0 siblings, 1 reply; 47+ messages in thread
From: Jan Kara @ 2017-11-30 16:21 UTC (permalink / raw)
  To: Luis R. Rodriguez
  Cc: viro, bart.vanassche, ming.lei, tytso, darrick.wong, jikos, rjw,
	pavel, len.brown, linux-fsdevel, boris.ostrovsky, jgross,
	todd.e.brandt, nborisov, jack, martin.petersen, ONeukum,
	oleksandr, oleg.b.antonyan, yu.chen.surf, dan.j.williams,
	linux-pm, linux-block, linux-xfs, linux-kernel

On Wed 29-11-17 15:23:52, Luis R. Rodriguez wrote:
> This removes superflous freezer calls as they are no longer needed
> as the VFS now performs filesystem freezing/thaw if the filesystem has
> support for it.
> 
> The following Coccinelle rule was used as follows:
> 
> spatch --sp-file fs-freeze-cleanup.cocci --in-place fs/$FS/

I think your rule misses WQ_FREEZABLE flag for workqueues? That would be
also good to get rid of...

								Honza

> 
> @ has_freeze_fs @
> identifier super_ops;
> expression freeze_op;
> @@
> 
> struct super_operations super_ops = {
> 	.freeze_fs = freeze_op,
> };
> 
> @ remove_set_freezable depends on has_freeze_fs @
> expression time;
> statement S, S2, S3;
> expression task;
> @@
> 
> (
> -	set_freezable();
> |
> -	if (try_to_freeze())
> -		continue;
> |
> -	try_to_freeze();
> |
> -	freezable_schedule();
> +	schedule();
> |
> -	freezable_schedule_timeout(time);
> +	schedule_timeout(time);
> |
> -	if (freezing(task)) { S }
> |
> -	if (freezing(task)) { S }
> -	else
> 	{ S2 }
> |
> -	freezing(current)
> )
> 
> Generated-by: Coccinelle SmPL
> Signed-off-by: Luis R. Rodriguez <mcgrof@kernel.org>
> ---
>  fs/xfs/xfs_trans_ail.c | 7 ++-----
>  1 file changed, 2 insertions(+), 5 deletions(-)
> 
> diff --git a/fs/xfs/xfs_trans_ail.c b/fs/xfs/xfs_trans_ail.c
> index cef89f7127d3..1f3dd10a9d00 100644
> --- a/fs/xfs/xfs_trans_ail.c
> +++ b/fs/xfs/xfs_trans_ail.c
> @@ -513,7 +513,6 @@ xfsaild(
>  	long		tout = 0;	/* milliseconds */
>  
>  	current->flags |= PF_MEMALLOC;
> -	set_freezable();
>  
>  	while (1) {
>  		if (tout && tout <= 20)
> @@ -551,19 +550,17 @@ xfsaild(
>  		if (!xfs_ail_min(ailp) &&
>  		    ailp->xa_target == ailp->xa_target_prev) {
>  			spin_unlock(&ailp->xa_lock);
> -			freezable_schedule();
> +			schedule();
>  			tout = 0;
>  			continue;
>  		}
>  		spin_unlock(&ailp->xa_lock);
>  
>  		if (tout)
> -			freezable_schedule_timeout(msecs_to_jiffies(tout));
> +			schedule_timeout(msecs_to_jiffies(tout));
>  
>  		__set_current_state(TASK_RUNNING);
>  
> -		try_to_freeze();
> -
>  		tout = xfsaild_push(ailp);
>  	}
>  
> -- 
> 2.15.0
> 
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH 00/11] fs: use freeze_fs on suspend/hibernate
  2017-11-29 23:23 [PATCH 00/11] fs: use freeze_fs on suspend/hibernate Luis R. Rodriguez
                   ` (10 preceding siblings ...)
  2017-11-29 23:23 ` [PATCH 11/11] jfs: " Luis R. Rodriguez
@ 2017-11-30 16:36 ` Yu Chen
  2017-11-30 16:41   ` Jiri Kosina
  2017-11-30 17:01 ` Bart Van Assche
                   ` (2 subsequent siblings)
  14 siblings, 1 reply; 47+ messages in thread
From: Yu Chen @ 2017-11-30 16:36 UTC (permalink / raw)
  To: Luis R. Rodriguez
  Cc: viro, bart.vanassche, ming.lei, tytso, darrick.wong, jikos,
	Rafael J. Wysocki, Pavel Machek, Len Brown, linux-fsdevel,
	boris.ostrovsky, jgross, todd.e.brandt, nborisov, jack,
	martin.petersen, ONeukum, oleksandr, oleg.b.antonyan,
	Dan Williams, Linux PM list, linux-block, linux-xfs,
	Linux Kernel Mailing List, Zhang Rui

On Thu, Nov 30, 2017 at 7:23 AM, Luis R. Rodriguez <mcgrof@kernel.org> wrote:
> This is a followup from the original RFC which proposed to start
> to kill kthread freezing all together [0]. Instead of going straight
> out to the jugular for kthread freezing this series only addresses
> killing freezer calls on filesystems which implement freeze_fs, after
> we let the kernel freeze these filesystems for us on suspend.
>
>
>   o xfs
>   o reiserfs
>   o nilfs2
>   o jfs
>   o f2fs
>   o ext4
>   o ext2
>   o btrfs
>

Thanks for your work on this!  I'll test xfs part first.
BTW, is nfs able to be included in this set? I also encountered a
freeze() failure
due to nfs access during that stage recently.
-- 
Thanks,
Yu

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH 00/11] fs: use freeze_fs on suspend/hibernate
  2017-11-30 16:36 ` [PATCH 00/11] fs: use freeze_fs on suspend/hibernate Yu Chen
@ 2017-11-30 16:41   ` Jiri Kosina
  2017-11-30 16:50     ` Yu Chen
  2017-12-01 19:05     ` Jeff Layton
  0 siblings, 2 replies; 47+ messages in thread
From: Jiri Kosina @ 2017-11-30 16:41 UTC (permalink / raw)
  To: Yu Chen
  Cc: Luis R. Rodriguez, viro, bart.vanassche, ming.lei, tytso,
	darrick.wong, Rafael J. Wysocki, Pavel Machek, Len Brown,
	linux-fsdevel, boris.ostrovsky, jgross, todd.e.brandt, nborisov,
	jack, martin.petersen, ONeukum, oleksandr, oleg.b.antonyan,
	Dan Williams, Linux PM list, linux-block, linux-xfs,
	Linux Kernel Mailing List, Zhang Rui

On Fri, 1 Dec 2017, Yu Chen wrote:

> BTW, is nfs able to be included in this set? I also encountered a 
> freeze() failure due to nfs access during that stage recently.

The freezer usage in NFS is magnitudes more complicated, so it makes sense 
to first go after the lower hanging fruit to figure out the viability of 
the whole aproach in practice.

-- 
Jiri Kosina
SUSE Labs

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH 00/11] fs: use freeze_fs on suspend/hibernate
  2017-11-30 16:41   ` Jiri Kosina
@ 2017-11-30 16:50     ` Yu Chen
  2017-12-01 19:05     ` Jeff Layton
  1 sibling, 0 replies; 47+ messages in thread
From: Yu Chen @ 2017-11-30 16:50 UTC (permalink / raw)
  To: Jiri Kosina
  Cc: Luis R. Rodriguez, viro, bart.vanassche, ming.lei, tytso,
	darrick.wong, Rafael J. Wysocki, Pavel Machek, Len Brown,
	linux-fsdevel, boris.ostrovsky, jgross, todd.e.brandt, nborisov,
	jack, martin.petersen, ONeukum, oleksandr, oleg.b.antonyan,
	Dan Williams, Linux PM list, linux-block, linux-xfs,
	Linux Kernel Mailing List, Zhang Rui

On Fri, Dec 1, 2017 at 12:41 AM, Jiri Kosina <jikos@kernel.org> wrote:
> On Fri, 1 Dec 2017, Yu Chen wrote:
>
>> BTW, is nfs able to be included in this set? I also encountered a
>> freeze() failure due to nfs access during that stage recently.
>
> The freezer usage in NFS is magnitudes more complicated, so it makes sense
> to first go after the lower hanging fruit to figure out the viability of
> the whole aproach in practice.

Ok, thanks!

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH 05/11] fs: add iterate_supers_excl() and iterate_supers_reverse_excl()
  2017-11-29 23:23 ` [PATCH 05/11] fs: add iterate_supers_excl() and iterate_supers_reverse_excl() Luis R. Rodriguez
  2017-11-29 23:48   ` Rafael J. Wysocki
@ 2017-11-30 16:57   ` Jan Kara
  1 sibling, 0 replies; 47+ messages in thread
From: Jan Kara @ 2017-11-30 16:57 UTC (permalink / raw)
  To: Luis R. Rodriguez
  Cc: viro, bart.vanassche, ming.lei, tytso, darrick.wong, jikos, rjw,
	pavel, len.brown, linux-fsdevel, boris.ostrovsky, jgross,
	todd.e.brandt, nborisov, jack, martin.petersen, ONeukum,
	oleksandr, oleg.b.antonyan, yu.chen.surf, dan.j.williams,
	linux-pm, linux-block, linux-xfs, linux-kernel

On Wed 29-11-17 15:23:50, Luis R. Rodriguez wrote:
> There are use cases where we wish to traverse the superblock list
> but also capture errors, and in which case we want to avoid having
> our callers issue a lock themselves since we can do the locking for
> the callers. Provide a iterate_supers_excl() which calls a function
> with the write lock held. If an error occurs we capture it and
> propagate it.
> 
> Likewise there are use cases where we wish to traverse the superblock
> list but in reverse order. The new iterate_supers_reverse_excl() helpers
> does this but also also captures any errors encountered.
> 
> Signed-off-by: Luis R. Rodriguez <mcgrof@kernel.org>

The patch looks good to me. You can add:

Reviewed-by: Jan Kara <jack@suse.cz>

								Honza


> ---
>  fs/super.c         | 91 ++++++++++++++++++++++++++++++++++++++++++++++++++++++
>  include/linux/fs.h |  2 ++
>  2 files changed, 93 insertions(+)
> 
> diff --git a/fs/super.c b/fs/super.c
> index a63513d187e8..885711c1d35b 100644
> --- a/fs/super.c
> +++ b/fs/super.c
> @@ -605,6 +605,97 @@ void iterate_supers(void (*f)(struct super_block *, void *), void *arg)
>  	spin_unlock(&sb_lock);
>  }
>  
> +/**
> + *	iterate_supers_excl - exclusively call func for all active superblocks
> + *	@f: function to call
> + *	@arg: argument to pass to it
> + *
> + *	Scans the superblock list and calls given function, passing it
> + *	locked superblock and given argument. Returns 0 unless an error
> + *	occurred on calling the function on any superblock.
> + */
> +int iterate_supers_excl(int (*f)(struct super_block *, void *), void *arg)
> +{
> +	struct super_block *sb, *p = NULL;
> +	int error = 0;
> +
> +	spin_lock(&sb_lock);
> +	list_for_each_entry(sb, &super_blocks, s_list) {
> +		if (hlist_unhashed(&sb->s_instances))
> +			continue;
> +		sb->s_count++;
> +		spin_unlock(&sb_lock);
> +
> +		down_write(&sb->s_umount);
> +		if (sb->s_root && (sb->s_flags & SB_BORN)) {
> +			error = f(sb, arg);
> +			if (error) {
> +				up_write(&sb->s_umount);
> +				spin_lock(&sb_lock);
> +				__put_super(sb);
> +				break;
> +			}
> +		}
> +		up_write(&sb->s_umount);
> +
> +		spin_lock(&sb_lock);
> +		if (p)
> +			__put_super(p);
> +		p = sb;
> +	}
> +	if (p)
> +		__put_super(p);
> +	spin_unlock(&sb_lock);
> +
> +	return error;
> +}
> +
> +/**
> + *	iterate_supers_reverse_excl - exclusively calls func in reverse order
> + *	@f: function to call
> + *	@arg: argument to pass to it
> + *
> + *	Scans the superblock list and calls given function, passing it
> + *	locked superblock and given argument, in reverse order, and holding
> + *	the s_umount write lock. Returns if an error occurred.
> + */
> +int iterate_supers_reverse_excl(int (*f)(struct super_block *, void *),
> +					 void *arg)
> +{
> +	struct super_block *sb, *p = NULL;
> +	int error = 0;
> +
> +	spin_lock(&sb_lock);
> +	list_for_each_entry_reverse(sb, &super_blocks, s_list) {
> +		if (hlist_unhashed(&sb->s_instances))
> +			continue;
> +		sb->s_count++;
> +		spin_unlock(&sb_lock);
> +
> +		down_write(&sb->s_umount);
> +		if (sb->s_root && (sb->s_flags & SB_BORN)) {
> +			error = f(sb, arg);
> +			if (error) {
> +				up_write(&sb->s_umount);
> +				spin_lock(&sb_lock);
> +				__put_super(sb);
> +				break;
> +			}
> +		}
> +		up_write(&sb->s_umount);
> +
> +		spin_lock(&sb_lock);
> +		if (p)
> +			__put_super(p);
> +		p = sb;
> +	}
> +	if (p)
> +		__put_super(p);
> +	spin_unlock(&sb_lock);
> +
> +	return error;
> +}
> +
>  /**
>   *	iterate_supers_type - call function for superblocks of given type
>   *	@type: fs type
> diff --git a/include/linux/fs.h b/include/linux/fs.h
> index 107725b20fad..fe90b6542697 100644
> --- a/include/linux/fs.h
> +++ b/include/linux/fs.h
> @@ -3164,6 +3164,8 @@ extern struct super_block *get_active_super(struct block_device *bdev);
>  extern void drop_super(struct super_block *sb);
>  extern void drop_super_exclusive(struct super_block *sb);
>  extern void iterate_supers(void (*)(struct super_block *, void *), void *);
> +extern int iterate_supers_excl(int (*f)(struct super_block *, void *), void *arg);
> +extern int iterate_supers_reverse_excl(int (*)(struct super_block *, void *), void *);
>  extern void iterate_supers_type(struct file_system_type *,
>  			        void (*)(struct super_block *, void *), void *);
>  
> -- 
> 2.15.0
> 
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH 01/11] fs: provide unlocked helper for freeze_super()
  2017-11-29 23:23 ` [PATCH 01/11] fs: provide unlocked helper for freeze_super() Luis R. Rodriguez
@ 2017-11-30 16:58   ` Jan Kara
  0 siblings, 0 replies; 47+ messages in thread
From: Jan Kara @ 2017-11-30 16:58 UTC (permalink / raw)
  To: Luis R. Rodriguez
  Cc: viro, bart.vanassche, ming.lei, tytso, darrick.wong, jikos, rjw,
	pavel, len.brown, linux-fsdevel, boris.ostrovsky, jgross,
	todd.e.brandt, nborisov, jack, martin.petersen, ONeukum,
	oleksandr, oleg.b.antonyan, yu.chen.surf, dan.j.williams,
	linux-pm, linux-block, linux-xfs, linux-kernel

On Wed 29-11-17 15:23:46, Luis R. Rodriguez wrote:
> freeze_super() holds a write lock, however we wish to also enable
> callers which already hold the write lock. To do this provide a helper
> and make freeze_super() use it. This way, all that freeze_super() does
> now is lock handling and active count management.
> 
> This change has no functional changes.
> 
> Suggested-by: Dave Chinner <david@fromorbit.com>
> Signed-off-by: Luis R. Rodriguez <mcgrof@kernel.org>

Looks good to me. You can add:

Reviewed-by: Jan Kara <jack@suse.cz>

								Honza

> ---
>  fs/super.c | 100 +++++++++++++++++++++++++++++++++----------------------------
>  1 file changed, 55 insertions(+), 45 deletions(-)
> 
> diff --git a/fs/super.c b/fs/super.c
> index d4e33e8f1e6f..a7650ff22f0e 100644
> --- a/fs/super.c
> +++ b/fs/super.c
> @@ -1387,59 +1387,20 @@ static void sb_freeze_unlock(struct super_block *sb)
>  		percpu_up_write(sb->s_writers.rw_sem + level);
>  }
>  
> -/**
> - * freeze_super - lock the filesystem and force it into a consistent state
> - * @sb: the super to lock
> - *
> - * Syncs the super to make sure the filesystem is consistent and calls the fs's
> - * freeze_fs.  Subsequent calls to this without first thawing the fs will return
> - * -EBUSY.
> - *
> - * During this function, sb->s_writers.frozen goes through these values:
> - *
> - * SB_UNFROZEN: File system is normal, all writes progress as usual.
> - *
> - * SB_FREEZE_WRITE: The file system is in the process of being frozen.  New
> - * writes should be blocked, though page faults are still allowed. We wait for
> - * all writes to complete and then proceed to the next stage.
> - *
> - * SB_FREEZE_PAGEFAULT: Freezing continues. Now also page faults are blocked
> - * but internal fs threads can still modify the filesystem (although they
> - * should not dirty new pages or inodes), writeback can run etc. After waiting
> - * for all running page faults we sync the filesystem which will clean all
> - * dirty pages and inodes (no new dirty pages or inodes can be created when
> - * sync is running).
> - *
> - * SB_FREEZE_FS: The file system is frozen. Now all internal sources of fs
> - * modification are blocked (e.g. XFS preallocation truncation on inode
> - * reclaim). This is usually implemented by blocking new transactions for
> - * filesystems that have them and need this additional guard. After all
> - * internal writers are finished we call ->freeze_fs() to finish filesystem
> - * freezing. Then we transition to SB_FREEZE_COMPLETE state. This state is
> - * mostly auxiliary for filesystems to verify they do not modify frozen fs.
> - *
> - * sb->s_writers.frozen is protected by sb->s_umount.
> - */
> -int freeze_super(struct super_block *sb)
> +/* Caller takes lock and handles active count */
> +static int freeze_locked_super(struct super_block *sb)
>  {
>  	int ret;
>  
> -	atomic_inc(&sb->s_active);
> -	down_write(&sb->s_umount);
> -	if (sb->s_writers.frozen != SB_UNFROZEN) {
> -		deactivate_locked_super(sb);
> +	if (sb->s_writers.frozen != SB_UNFROZEN)
>  		return -EBUSY;
> -	}
>  
> -	if (!(sb->s_flags & SB_BORN)) {
> -		up_write(&sb->s_umount);
> +	if (!(sb->s_flags & SB_BORN))
>  		return 0;	/* sic - it's "nothing to do" */
> -	}
>  
>  	if (sb_rdonly(sb)) {
>  		/* Nothing to do really... */
>  		sb->s_writers.frozen = SB_FREEZE_COMPLETE;
> -		up_write(&sb->s_umount);
>  		return 0;
>  	}
>  
> @@ -1468,7 +1429,6 @@ int freeze_super(struct super_block *sb)
>  			sb->s_writers.frozen = SB_UNFROZEN;
>  			sb_freeze_unlock(sb);
>  			wake_up(&sb->s_writers.wait_unfrozen);
> -			deactivate_locked_super(sb);
>  			return ret;
>  		}
>  	}
> @@ -1478,9 +1438,59 @@ int freeze_super(struct super_block *sb)
>  	 */
>  	sb->s_writers.frozen = SB_FREEZE_COMPLETE;
>  	lockdep_sb_freeze_release(sb);
> -	up_write(&sb->s_umount);
>  	return 0;
>  }
> +
> +/**
> + * freeze_super - lock the filesystem and force it into a consistent state
> + * @sb: the super to lock
> + *
> + * Syncs the super to make sure the filesystem is consistent and calls the fs's
> + * freeze_fs.  Subsequent calls to this without first thawing the fs will return
> + * -EBUSY.
> + *
> + * During this function, sb->s_writers.frozen goes through these values:
> + *
> + * SB_UNFROZEN: File system is normal, all writes progress as usual.
> + *
> + * SB_FREEZE_WRITE: The file system is in the process of being frozen.  New
> + * writes should be blocked, though page faults are still allowed. We wait for
> + * all writes to complete and then proceed to the next stage.
> + *
> + * SB_FREEZE_PAGEFAULT: Freezing continues. Now also page faults are blocked
> + * but internal fs threads can still modify the filesystem (although they
> + * should not dirty new pages or inodes), writeback can run etc. After waiting
> + * for all running page faults we sync the filesystem which will clean all
> + * dirty pages and inodes (no new dirty pages or inodes can be created when
> + * sync is running).
> + *
> + * SB_FREEZE_FS: The file system is frozen. Now all internal sources of fs
> + * modification are blocked (e.g. XFS preallocation truncation on inode
> + * reclaim). This is usually implemented by blocking new transactions for
> + * filesystems that have them and need this additional guard. After all
> + * internal writers are finished we call ->freeze_fs() to finish filesystem
> + * freezing. Then we transition to SB_FREEZE_COMPLETE state. This state is
> + * mostly auxiliary for filesystems to verify they do not modify frozen fs.
> + *
> + * sb->s_writers.frozen is protected by sb->s_umount.
> + */
> +int freeze_super(struct super_block *sb)
> +{
> +	int error;
> +
> +	atomic_inc(&sb->s_active);
> +
> +	down_write(&sb->s_umount);
> +	error = freeze_locked_super(sb);
> +	if (error) {
> +		deactivate_locked_super(sb);
> +		goto out;
> +	}
> +	up_write(&sb->s_umount);
> +
> +out:
> +	return error;
> +}
>  EXPORT_SYMBOL(freeze_super);
>  
>  /**
> -- 
> 2.15.0
> 
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH 02/11] fs: provide unlocked helper thaw_super()
  2017-11-29 23:23 ` [PATCH 02/11] fs: provide unlocked helper thaw_super() Luis R. Rodriguez
@ 2017-11-30 16:59   ` Jan Kara
  0 siblings, 0 replies; 47+ messages in thread
From: Jan Kara @ 2017-11-30 16:59 UTC (permalink / raw)
  To: Luis R. Rodriguez
  Cc: viro, bart.vanassche, ming.lei, tytso, darrick.wong, jikos, rjw,
	pavel, len.brown, linux-fsdevel, boris.ostrovsky, jgross,
	todd.e.brandt, nborisov, jack, martin.petersen, ONeukum,
	oleksandr, oleg.b.antonyan, yu.chen.surf, dan.j.williams,
	linux-pm, linux-block, linux-xfs, linux-kernel

On Wed 29-11-17 15:23:47, Luis R. Rodriguez wrote:
> thaw_super() hold a write lock, however we wish to also enable
> callers which already hold the write lock. To do this provide a helper
> and make thaw_super() use it. This way, all that thaw_super() does
> now is lock handling and active count management.
> 
> This change has no functional changes.
> 
> Suggested-by: Dave Chinner <david@fromorbit.com>
> Signed-off-by: Luis R. Rodriguez <mcgrof@kernel.org>

Looks good to me. You can add:

Reviewed-by: Jan Kara <jack@suse.cz>

								Honza

> ---
>  fs/super.c | 39 ++++++++++++++++++++++++++-------------
>  1 file changed, 26 insertions(+), 13 deletions(-)
> 
> diff --git a/fs/super.c b/fs/super.c
> index a7650ff22f0e..cecc279beecd 100644
> --- a/fs/super.c
> +++ b/fs/super.c
> @@ -1493,21 +1493,13 @@ int freeze_super(struct super_block *sb)
>  }
>  EXPORT_SYMBOL(freeze_super);
>  
> -/**
> - * thaw_super -- unlock filesystem
> - * @sb: the super to thaw
> - *
> - * Unlocks the filesystem and marks it writeable again after freeze_super().
> - */
> -int thaw_super(struct super_block *sb)
> +/* Caller takes lock and handles active count */
> +static int thaw_locked_super(struct super_block *sb)
>  {
>  	int error;
>  
> -	down_write(&sb->s_umount);
> -	if (sb->s_writers.frozen != SB_FREEZE_COMPLETE) {
> -		up_write(&sb->s_umount);
> +	if (sb->s_writers.frozen != SB_FREEZE_COMPLETE)
>  		return -EINVAL;
> -	}
>  
>  	if (sb_rdonly(sb)) {
>  		sb->s_writers.frozen = SB_UNFROZEN;
> @@ -1522,7 +1514,6 @@ int thaw_super(struct super_block *sb)
>  			printk(KERN_ERR
>  				"VFS:Filesystem thaw failed\n");
>  			lockdep_sb_freeze_release(sb);
> -			up_write(&sb->s_umount);
>  			return error;
>  		}
>  	}
> @@ -1531,7 +1522,29 @@ int thaw_super(struct super_block *sb)
>  	sb_freeze_unlock(sb);
>  out:
>  	wake_up(&sb->s_writers.wait_unfrozen);
> -	deactivate_locked_super(sb);
>  	return 0;
>  }
> +
> +/**
> + * thaw_super -- unlock filesystem
> + * @sb: the super to thaw
> + *
> + * Unlocks the filesystem and marks it writeable again after freeze_super().
> + */
> +int thaw_super(struct super_block *sb)
> +{
> +	int error;
> +
> +	down_write(&sb->s_umount);
> +	error = thaw_locked_super(sb);
> +	if (error) {
> +		up_write(&sb->s_umount);
> +		goto out;
> +	}
> +
> +	deactivate_locked_super(sb);
> +
> +out:
> +	return error;
> +}
>  EXPORT_SYMBOL(thaw_super);
> -- 
> 2.15.0
> 
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH 00/11] fs: use freeze_fs on suspend/hibernate
  2017-11-29 23:23 [PATCH 00/11] fs: use freeze_fs on suspend/hibernate Luis R. Rodriguez
                   ` (11 preceding siblings ...)
  2017-11-30 16:36 ` [PATCH 00/11] fs: use freeze_fs on suspend/hibernate Yu Chen
@ 2017-11-30 17:01 ` Bart Van Assche
  2017-11-30 19:42   ` Luis R. Rodriguez
  2017-11-30 21:51 ` Pavel Machek
  2017-12-13  1:09 ` Rafael J. Wysocki
  14 siblings, 1 reply; 47+ messages in thread
From: Bart Van Assche @ 2017-11-30 17:01 UTC (permalink / raw)
  To: pavel, darrick.wong, viro, ming.lei, rjw, mcgrof, linux-fsdevel,
	jikos, len.brown, tytso
  Cc: boris.ostrovsky, ONeukum, linux-block, linux-kernel, nborisov,
	oleg.b.antonyan, linux-pm, linux-xfs, jgross, martin.petersen,
	dan.j.williams, yu.chen.surf, oleksandr, todd.e.brandt, jack

T24gV2VkLCAyMDE3LTExLTI5IGF0IDE1OjIzIC0wODAwLCBMdWlzIFIuIFJvZHJpZ3VleiB3cm90
ZToNCj4gVGhpcyBpcyBhIGZvbGxvd3VwIGZyb20gdGhlIG9yaWdpbmFsIFJGQyB3aGljaCBwcm9w
b3NlZCB0byBzdGFydA0KPiB0byBraWxsIGt0aHJlYWQgZnJlZXppbmcgYWxsIHRvZ2V0aGVyIFsw
XS4gSW5zdGVhZCBvZiBnb2luZyBzdHJhaWdodA0KPiBvdXQgdG8gdGhlIGp1Z3VsYXIgZm9yIGt0
aHJlYWQgZnJlZXppbmcgdGhpcyBzZXJpZXMgb25seSBhZGRyZXNzZXMNCj4ga2lsbGluZyBmcmVl
emVyIGNhbGxzIG9uIGZpbGVzeXN0ZW1zIHdoaWNoIGltcGxlbWVudCBmcmVlemVfZnMsIGFmdGVy
DQo+IHdlIGxldCB0aGUga2VybmVsIGZyZWV6ZSB0aGVzZSBmaWxlc3lzdGVtcyBmb3IgdXMgb24g
c3VzcGVuZC4NCj4gDQo+IFRoaXMgYXBwcm9hY2ggcHV0cyBvbiBhIHNsb3cgYnV0IHN0ZWFkeSBw
YXRoIHRvd2FyZHMgdGhlIG9yaWdpbmFsIGdvYWwNCj4gdGhvdWdoLiBFYWNoIHN1YnN5c3RlbSBj
b3VsZCBsb29rIGZvciBzaW1pbGFyIHNvbHV0aW9ucy4gRXZlbiB3aXRoDQo+IGZpbGVzeXN0ZW1z
IHdlJ3JlIG5vdCBhbGwgZG9uZSB5ZXQsIGFmdGVyIHRoaXMgd2UnbGwgc3RpbGwgaGF2ZSB0bw0K
PiBkZWNpZGUgd2hhdCB0byBkbyBhYm91dCBmaWxlc3lzdGVtcyB3aGljaCBkbyBub3QgaW1wbGVt
ZW50IGZyZWV6ZV9mcygpLg0KPiANCj4gTW90aXZhdGlvbiBhbmQgcHJvYmxlbToNCj4gDQo+IGt0
aHJlYWRzIGhhdmUgc29tZSBzZW1hbnRpY3MgZm9yIGZyZWV6aW5nLCB3aGljaCBoZWxwcyB0aGUg
a2VybmVsDQo+IGZyZWV6ZSB0aGVtIHdoZW4gYSBzeXN0ZW0gaXMgZ29pbmcgdG8gc3VzcGVuZCBv
ciBoaWJlcm5hdGlvbi4gVGhlc2UNCj4gc2VtYW50aWNzIGFyZSBub3Qgd2VsbCBkZWZpbmVkIHRo
b3VnaCwgYW5kIGl0IGFjdHVhbGx5IHR1cm5zIG91dA0KPiBwcmV0dHkgaGFyZCB0byBnZXQgaXQg
cmlnaHQuDQo+IA0KPiBXaXRob3V0IGEgcHJvcGVyIHNvbHV0aW9uIHN1c3BlbmQgYW5kIGhpYmVy
bmF0aW9uIGFyZSBmcmFnaWxlIG9uIGZpbGVzeXN0ZW1zLA0KPiBpdCBjYW4gZWFzaWx5IGJyZWFr
IHN1c3BlbmQgYW5kIGZpeGluZyBzdWNoIGlzc3VlcyBhcmUgaW4gbm8gd2F5IHRyaXZpYWwgWzFd
DQo+IFsyXS4NCj4gDQo+IFByb3Bvc2VkIHNvbHV0aW9uOg0KPiANCj4gSW5zdGVhZCBvZiBmaXhp
bmcgc3VjaCBzZW1hbnRpY3MgYW5kIHRyeWluZyB0byBnZXQgYWxsIGZpbGVzeXN0ZW1zIHRvIGRv
IGl0DQo+IHJpZ2h0LCB3ZSBjYW4gZWFzaWx5IGRvIGF3YXkgd2l0aCBhbGwgZnJlZXppbmcgY2Fs
bHMgaWYgdGhlIGZpbGVzeXN0ZW0NCj4gaW1wbGVtZW50cyBhIHByb3BlciBmcmVlemVfZnMoKSBj
YWxsYmFjay4gVGhlIGZvbGxvd2luZyA5IGZpbGVzeXN0ZW1zIGhhdmUNCj4gZnJlZXplX2ZzKCkg
aW1wbGVtZW50ZWQgYXMgc3VjaCB3ZSBjYW4gbGV0IHRoZSBrZXJuZWwgaXNzdWUgdGhlIGNhbGxi
YWNrIHVwb24NCj4gc3VzcGVuZCBhbmQgdGhhdyBvbiByZXN1bWUgYXV0b21hdGljYWxseSBvbiBv
dXIgYmVoYWxmLg0KPiANCj4gICBvIHhmcw0KPiAgIG8gcmVpc2VyZnMNCj4gICBvIG5pbGZzMg0K
PiAgIG8gamZzDQo+ICAgbyBmMmZzDQo+ICAgbyBleHQ0DQo+ICAgbyBleHQyDQo+ICAgbyBidHJm
cw0KPiANCj4gT2YgdGhlc2UsIHRoZSBmb2xsb3dpbmcgaGF2ZSBmcmVlemVyIGhlbHBlcnMsIHdo
aWNoIGNhbiB0aGVuIGJlIHJlbW92ZWQNCj4gYWZ0ZXIgdGhlIGtlcm5lbCBhdXRvbWF0aWNhbHkg
Y2FsbHMgZnJlZXplX2ZzIGZvciB1cyBvbiBzdXNwZW5kOg0KPiANCj4gICBvIHhmcw0KPiAgIG8g
bmlsZnMyDQo+ICAgbyBqZnMNCj4gICBvIGYyZnMNCj4gICBvIGV4dDQNCj4gDQo+IEkndmUgdGVz
dGVkIHRoaXMgb24gYSBzeXN0ZW0gd2l0aCBleHQ0IGFuZCBYRlMsIGFuZCBoYXZlIGxldCAwLWRh
eSBnbyBhdA0KPiB3aXRob3V0IGlzc3Vlcy4gSSBoYXZlIGJyYW5jaGVzIGF2YWlsYWJlIGZvciBs
aW51eC1uZXh0IFszXSBhbmQgTGludXMnDQo+IGxhdGVzdCB0cmVlIFs0XS4NCj4gDQo+IEZ1cnRo
ZXIgdGVzdGluZywgdGhvdWdodHMsIHJldmlld3MsIGZsYW1lcyBhcmUgYWxsIGVxdWFsbHkgYXBw
cmVjaWF0ZWQuDQoNCkhlbGxvIEx1aXMsDQoNCkl0J3MgZ3JlYXQgdG8gc2VlIHRoYXQgeW91IGFy
ZSBtYWtpbmcgcHJvZ3Jlc3Mgd2l0aCB0aGlzIHdvcmsgOi0pIEhvd2V2ZXIsDQp3aGF0J3Mgbm90
IGNsZWFyIHRvIG1lIGlzIHdoYXQgeW91ciAobG9uZy10ZXJtKSBwbGFuIGlzIGZvciBmcmVlemlu
Zw0KZmlsZXN5c3RlbXMgdGhhdCBlLmcuIGV4aXN0IG9uIHRvcCBvZiBhIG1kIFJBSUQxIGJsb2Nr
IGRldmljZT8gVGhlIG1kIHJlc3luYw0KdGhyZWFkIG11c3QgYmUgc3RvcHBlZCBiZWZvcmUgYSBz
eXN0ZW0gaXMgZnJvemVuLiBUb2RheSB0aGUgbWQgZHJpdmVyIHVzZXMNCnRoZSBrdGhyZWFkIGZy
ZWV6aW5nIG1lY2hhbmlzbSBmb3IgdGhhdCBwdXJwb3NlLiBEbyB5b3UgaGF2ZSBhIHBsYW4gZm9y
DQpoYW5kbGluZyB0aGUgbW9yZSBjb21wbGljYXRlZCBzY2VuYXJpb3MsIGUuZy4gYSBmaWxlc3lz
dGVtIHRoYXQgZXhpc3RzIG9uIHRvcA0Kb2YgYW4gbWQgZGV2aWNlIHdoZXJlIHRoZSBtZCBkZXZp
Y2UgdXNlcyBvbmUgb3IgbW9yZSBmaWxlcyBhcyBiYWNraW5nIHN0b3JlDQphbmQgd2l0aCB0aGUg
bG9vcCBkcml2ZXIgYmV0d2VlbiB0aGUgbWQgZGV2aWNlIGFuZCB0aGUgZmlsZXM/DQoNCkhvdyBh
Ym91dCBmaWxlc3lzdGVtcyBpbXBsZW1lbnRlZCBpbiB1c2VyIHNwYWNlIHVzaW5nIHRoZSBGVVNF
IGRyaXZlcj8NClBhdGNoIDYvMTEgb2YgdGhpcyBzZXJpZXMgZnJlZXplcyB1c2VyIHNwYWNlIHBy
b2Nlc3NlcyBiZWZvcmUgZnJlZXppbmcNCmZpbGVzeXN0ZW1zLiBXaWxsIHRoYXQgd29yayBmb3Ig
RlVTRSBmaWxlc3lzdGVtcz8NCg0KVGhhbmtzLA0KDQpCYXJ0Lg==

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH 03/11] fs: add frozen sb state helpers
  2017-11-29 23:23 ` [PATCH 03/11] fs: add frozen sb state helpers Luis R. Rodriguez
@ 2017-11-30 17:13   ` Jan Kara
  2017-11-30 19:05     ` Luis R. Rodriguez
  2018-04-22  2:53     ` Luis R. Rodriguez
  0 siblings, 2 replies; 47+ messages in thread
From: Jan Kara @ 2017-11-30 17:13 UTC (permalink / raw)
  To: Luis R. Rodriguez
  Cc: viro, bart.vanassche, ming.lei, tytso, darrick.wong, jikos, rjw,
	pavel, len.brown, linux-fsdevel, boris.ostrovsky, jgross,
	todd.e.brandt, nborisov, jack, martin.petersen, ONeukum,
	oleksandr, oleg.b.antonyan, yu.chen.surf, dan.j.williams,
	linux-pm, linux-block, linux-xfs, linux-kernel

On Wed 29-11-17 15:23:48, Luis R. Rodriguez wrote:
> The question of whether or not a superblock is frozen needs to be
> augmented in the future to account for differences between a user
> initiated freeze and a kernel initiated freeze done automatically
> on behalf of the kernel.
> 
> Provide helpers so that these can be used instead so that we don't
> have to expand checks later in these same call sites as we expand
> the definition of a frozen superblock.
> 
> Signed-off-by: Luis R. Rodriguez <mcgrof@kernel.org>

So helpers are fine but...

> +/**
> + * sb_is_frozen_by_user - is superblock frozen by a user call
> + * @sb: the super to check
> + *
> + * Returns true if the super freeze was initiated by userspace, for instance,
> + * an ioctl call.
> + */
> +static inline bool sb_is_frozen_by_user(struct super_block *sb)
> +{
> +	return sb->s_writers.frozen == SB_FREEZE_COMPLETE;
> +}

... I dislike the _by_user() suffix as there may be different places that
call freeze_super() (e.g. device mapper does this during some operations).
Clearly we need to distinguish "by system suspend" and "the other" cases.
So please make this clear in the naming.

In fact, what might be a cleaner solution is to introduce a 'freeze_count'
for superblock freezing (we already do have this for block devices). Then
you don't need to differentiate these two cases - but you'd still need to
properly handle cleanup if freezing of all superblocks fails in the middle.
So I'm not 100% this works out nicely in the end. But it's certainly worth
a consideration.

								Honza

-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH 03/11] fs: add frozen sb state helpers
  2017-11-30 17:13   ` Jan Kara
@ 2017-11-30 19:05     ` Luis R. Rodriguez
  2017-12-01 11:47       ` Jan Kara
  2018-04-22  2:53     ` Luis R. Rodriguez
  1 sibling, 1 reply; 47+ messages in thread
From: Luis R. Rodriguez @ 2017-11-30 19:05 UTC (permalink / raw)
  To: Jan Kara
  Cc: Luis R. Rodriguez, viro, bart.vanassche, ming.lei, tytso,
	darrick.wong, jikos, rjw, pavel, len.brown, linux-fsdevel,
	boris.ostrovsky, jgross, todd.e.brandt, nborisov,
	martin.petersen, ONeukum, oleksandr, oleg.b.antonyan,
	yu.chen.surf, dan.j.williams, linux-pm, linux-block, linux-xfs,
	linux-kernel

On Thu, Nov 30, 2017 at 06:13:10PM +0100, Jan Kara wrote:
> ... I dislike the _by_user() suffix as there may be different places that
> call freeze_super() (e.g. device mapper does this during some operations).
> Clearly we need to distinguish "by system suspend" and "the other" cases.
> So please make this clear in the naming.

Ah. How about sb_frozen_by_cb() ?

> In fact, what might be a cleaner solution is to introduce a 'freeze_count'
> for superblock freezing (we already do have this for block devices). Then
> you don't need to differentiate these two cases - but you'd still need to
> properly handle cleanup if freezing of all superblocks fails in the middle.
> So I'm not 100% this works out nicely in the end. But it's certainly worth
> a consideration.

Ah, there are three important reasons for doing it the way I did it which are
easy to miss, unless you read the commit log message very carefully.

0) The ioctl interface causes a failure to be sent back to userspace if
you issue two consecutive freezes, or two thaws. Ie, once a filesystem is
frozen, a secondary call will result in an error. Likewise for thaw.

1) The new iterate supers stuff I added bail on the first error and return that
error. If we kept the ioctl() interface error scheme we'd be erroring out
if on suspend if userspace had already frozen a filesystem. Clearly that'd
be silly so we need to distinguish between the automatic kernel freezing
and the old userspace ioctl initiated interface, so that we can keep the
old behaviour but allow in-kernel auto freeze on suspend to work properly.

2) If we fail to suspend we need to then thaw up all filesystems. The order
in which we try to freeze is in reverse order on the super_block list. If we
fail though we iterate in proper order on the super_block list and thaw. If
you had two filesystems this means that if a failure happened on freezing
the first filesystem, we'd first thaw the other filesystem -- and because of
0) if we don't distinguish between the ioctl interface or auto freezing, we'd
also fail on thaw'ing given the other superblock wouldn't have been frozen.

So we need to keep two separate approaches. The count stuff would not suffice
to distinguish origin of source for freeze call.

Come to think of it, I think I forgot to avoid thaw if the freeze was ioctl
initiated..

thaw_unlocked(bool cb_call)
{
  if (sb_frozen_by_cb(sb) && !cb_call)
    return 0; /* skip as the user wanted to keep this fs frozen */
  ...
}

Even though the kernel auto call is new I think we need to keep ioctl initiated
frozen filesystems frozen to not break old userspace assumptions.

So, keeping all this in mind, does a count method still suffice?

  Luis

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH 00/11] fs: use freeze_fs on suspend/hibernate
  2017-11-30 17:01 ` Bart Van Assche
@ 2017-11-30 19:42   ` Luis R. Rodriguez
  2017-11-30 20:53     ` Bart Van Assche
  0 siblings, 1 reply; 47+ messages in thread
From: Luis R. Rodriguez @ 2017-11-30 19:42 UTC (permalink / raw)
  To: Bart Van Assche, Benjamin Marzinski
  Cc: pavel, darrick.wong, viro, ming.lei, rjw, mcgrof, linux-fsdevel,
	jikos, len.brown, tytso, boris.ostrovsky, ONeukum, linux-block,
	linux-kernel, nborisov, oleg.b.antonyan, linux-pm, linux-xfs,
	jgross, martin.petersen, dan.j.williams, yu.chen.surf, oleksandr,
	todd.e.brandt, jack

On Thu, Nov 30, 2017 at 05:01:13PM +0000, Bart Van Assche wrote:
> It's great to see that you are making progress with this work :-) However,
> what's not clear to me is what your (long-term) plan is for freezing
> filesystems that e.g. exist on top of a md RAID1 block device?

The original approach of doing a full swing kill on kthread freezing helpers
in one shot was too much. We must therefore approach this turtle style and
carefully. This patch series currently only addresses filesystems which
have a freeze_fs() callback implemented, that's it. The rest of the kernel
needs more work.

So this work is far as I've gotten so far.

Note, it does *not* remove the kthread freezing API calls on any other drivers,
as such that should still work.

> The md resync
> thread must be stopped before a system is frozen. Today the md driver uses
> the kthread freezing mechanism for that purpose. Do you have a plan for
> handling the more complicated scenarios, e.g. a filesystem that exists on top
> of an md device where the md device uses one or more files as backing store
> and with the loop driver between the md device and the files?

Nope not yet. It seems you have given this some thought though so you're 
help here is greatly appreciated. In fact the way we should see the long
term 'kill kthread freezing' effort should be a collaborative one. I've
never touched md, so folks more familiar with md should give this some
thought.

Can for instance md register_pm_notifier() and register_syscore_ops()
and do handy work to pause some work to replace kthread freezing?
Note that I believe a pm notifier is needed in case syscore_suspend()
is not called, say on a suspend fail.

> How about filesystems implemented in user space using the FUSE driver?

Not sure, someone more familiar with FUSE drivers should chime in. Right
now all this does is leverage filesystems which *do* implement freeze_fs()
only, and removes the kthread freezing calls after that. That's all. It
should not affect any other filesystems

Likewise consider filesystems that implement freeze_super() instead (like gfs2)
which don't hold sb->s_umount -- see commit 48b6bca6b7b8309 ("fs: add
freeze_super/thaw_super fs hooks"). If its desirable to address those as well,
it would seem a pair of non-locking iters somehow are needed. More work to
be done.

This all needs to be considered for the future as well.

> Patch 6/11 of this series freezes user space processes before freezing
> filesystems. Will that work for FUSE filesystems?

It doesn't add new code which should negatively affect FUSE drivers.
Long term if we wanted to address FUSE, it may require other work.

In the future then fs_suspend_freeze() may need to work for different
types of filesystems. Right now only filesystems which implement freeze_fs()
are addressed.

  Luis

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH 07/11] xfs: remove not needed freezing calls
  2017-11-30 16:21   ` Jan Kara
@ 2017-11-30 20:32     ` Rafael J. Wysocki
  2017-11-30 23:30       ` Dave Chinner
  0 siblings, 1 reply; 47+ messages in thread
From: Rafael J. Wysocki @ 2017-11-30 20:32 UTC (permalink / raw)
  To: Jan Kara
  Cc: Luis R. Rodriguez, Al Viro, bart.vanassche, ming.lei,
	Ted Ts'o, Darrick J. Wong, Jiri Kosina, Rafael J. Wysocki,
	Pavel Machek, Len Brown, linux-fsdevel, Boris Ostrovsky,
	Juergen Gross, Todd Brandt, nborisov, Martin K. Petersen,
	Oliver Neukum, oleksandr, Oleg Antonyan, Yu Chen, Dan Williams,
	Linux PM, linux-block, linux-xfs, Linux Kernel Mailing List

On Thu, Nov 30, 2017 at 5:21 PM, Jan Kara <jack@suse.cz> wrote:
> On Wed 29-11-17 15:23:52, Luis R. Rodriguez wrote:
>> This removes superflous freezer calls as they are no longer needed
>> as the VFS now performs filesystem freezing/thaw if the filesystem has
>> support for it.
>>
>> The following Coccinelle rule was used as follows:
>>
>> spatch --sp-file fs-freeze-cleanup.cocci --in-place fs/$FS/
>
> I think your rule misses WQ_FREEZABLE flag for workqueues? That would be
> also good to get rid of...

We need that one (or equivalent) for the runtime PM workqueue at least.

Thanks,
Rafael

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH 00/11] fs: use freeze_fs on suspend/hibernate
  2017-11-30 19:42   ` Luis R. Rodriguez
@ 2017-11-30 20:53     ` Bart Van Assche
  2017-11-30 21:03       ` Dave Chinner
  0 siblings, 1 reply; 47+ messages in thread
From: Bart Van Assche @ 2017-11-30 20:53 UTC (permalink / raw)
  To: bmarzins, mcgrof
  Cc: boris.ostrovsky, ONeukum, linux-block, linux-kernel, nborisov,
	oleg.b.antonyan, linux-pm, linux-xfs, pavel, darrick.wong, viro,
	ming.lei, dan.j.williams, rjw, jgross, oleksandr, yu.chen.surf,
	todd.e.brandt, martin.petersen, linux-fsdevel, jikos, len.brown,
	tytso, jack

T24gVGh1LCAyMDE3LTExLTMwIGF0IDIwOjQyICswMTAwLCBMdWlzIFIuIFJvZHJpZ3VleiB3cm90
ZToNCj4gT24gVGh1LCBOb3YgMzAsIDIwMTcgYXQgMDU6MDE6MTNQTSArMDAwMCwgQmFydCBWYW4g
QXNzY2hlIHdyb3RlOg0KPiA+IFRoZSBtZCByZXN5bmMNCj4gPiB0aHJlYWQgbXVzdCBiZSBzdG9w
cGVkIGJlZm9yZSBhIHN5c3RlbSBpcyBmcm96ZW4uIFRvZGF5IHRoZSBtZCBkcml2ZXIgdXNlcw0K
PiA+IHRoZSBrdGhyZWFkIGZyZWV6aW5nIG1lY2hhbmlzbSBmb3IgdGhhdCBwdXJwb3NlLiBEbyB5
b3UgaGF2ZSBhIHBsYW4gZm9yDQo+ID4gaGFuZGxpbmcgdGhlIG1vcmUgY29tcGxpY2F0ZWQgc2Nl
bmFyaW9zLCBlLmcuIGEgZmlsZXN5c3RlbSB0aGF0IGV4aXN0cyBvbiB0b3ANCj4gPiBvZiBhbiBt
ZCBkZXZpY2Ugd2hlcmUgdGhlIG1kIGRldmljZSB1c2VzIG9uZSBvciBtb3JlIGZpbGVzIGFzIGJh
Y2tpbmcgc3RvcmUNCj4gPiBhbmQgd2l0aCB0aGUgbG9vcCBkcml2ZXIgYmV0d2VlbiB0aGUgbWQg
ZGV2aWNlIGFuZCB0aGUgZmlsZXM/DQo+IA0KPiBOb3BlIG5vdCB5ZXQuIEl0IHNlZW1zIHlvdSBo
YXZlIGdpdmVuIHRoaXMgc29tZSB0aG91Z2h0IHRob3VnaCBzbyB5b3UncmUgDQo+IGhlbHAgaGVy
ZSBpcyBncmVhdGx5IGFwcHJlY2lhdGVkLiBJbiBmYWN0IHRoZSB3YXkgd2Ugc2hvdWxkIHNlZSB0
aGUgbG9uZw0KPiB0ZXJtICdraWxsIGt0aHJlYWQgZnJlZXppbmcnIGVmZm9ydCBzaG91bGQgYmUg
YSBjb2xsYWJvcmF0aXZlIG9uZS4gSSd2ZQ0KPiBuZXZlciB0b3VjaGVkIG1kLCBzbyBmb2xrcyBt
b3JlIGZhbWlsaWFyIHdpdGggbWQgc2hvdWxkIGdpdmUgdGhpcyBzb21lDQo+IHRob3VnaHQuDQo+
IA0KPiBDYW4gZm9yIGluc3RhbmNlIG1kIHJlZ2lzdGVyX3BtX25vdGlmaWVyKCkgYW5kIHJlZ2lz
dGVyX3N5c2NvcmVfb3BzKCkNCj4gYW5kIGRvIGhhbmR5IHdvcmsgdG8gcGF1c2Ugc29tZSB3b3Jr
IHRvIHJlcGxhY2Uga3RocmVhZCBmcmVlemluZz8NCj4gTm90ZSB0aGF0IEkgYmVsaWV2ZSBhIHBt
IG5vdGlmaWVyIGlzIG5lZWRlZCBpbiBjYXNlIHN5c2NvcmVfc3VzcGVuZCgpDQo+IGlzIG5vdCBj
YWxsZWQsIHNheSBvbiBhIHN1c3BlbmQgZmFpbC4NCg0KU29ycnkgYnV0IEkgZG9uJ3QgdGhpbmsg
dGhhdCBhIHNvbHV0aW9uIGNhbiBiZSBiYXNlZCBvbiBhIG5vdGlmaWVyIG1lY2hhbmlzbS4NCkZy
ZWV6aW5nIGhhcyB0byBoYXBwZW4gaW4gdGhlIG9yZGVyIGluIHdoaWNoIGRyaXZlcnMgYW5kIGZp
bGVzeXN0ZW1zIGhhdmUNCmJlZW4gc3RhY2tlZCAoZmlsZXN5c3RlbSA+IG1kIGRldmljZSA+IGZp
bGVzeXN0ZW0gZm9yIHRoZSBhYm92ZSBleGFtcGxlKS4NClNpbmNlIHRoZSBvcmRlciBpbiB3aGlj
aCBub3RpZmllcnMgYXJlIGNhbGxlZCBpcyByZWxhdGVkIHRvIHRoZSBvcmRlciBpbg0Kd2hpY2gg
bm90aWZpZXJzIGhhdmUgYmVlbiByZWdpc3RlcmVkIEkgZG9uJ3QgdGhpbmsgdGhhdCBhIHNvbHV0
aW9uIGZvciB0aGUNCmV4YW1wbGUgSSBkZXNjcmliZWQgY2FuIGJlIGJhc2VkIG9uIG5vdGlmaWVy
cy4gV2hhdCBJIHRoaW5rIHdlIG5lZWQgaXMgYQ0KbWVjaGFuaXNtIGZvciB0cmF2ZXJzaW5nIHRo
ZSBzdG9yYWdlIHN0YWNrIHRoYXQgaW5jbHVkZXMgYmxvY2sgZHJpdmVycyBhbmQNCmFuIGVxdWl2
YWxlbnQgb2YgZnJlZXplX2ZzKCkgZm9yIGJsb2NrIGRyaXZlcnMuIEZyZWV6aW5nIHNob3VsZCBv
Y2N1ciBieQ0KY2FsbGluZyB0aGUgZnJlZXplX2ZzKCkgbWV0aG9kcyBmb3IgZWFjaCBzdG9yYWdl
IGxheWVyIHN0YXJ0aW5nIGF0IHRoZSB0b3Agb2YNCnRoZSBzdG9yYWdlIHN0YWNrIGFuZCBwcm9j
ZWVkaW5nIHRvd2FyZHMgdGhlIGJvdHRvbS4NCg0KQmFydC4=

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH 00/11] fs: use freeze_fs on suspend/hibernate
  2017-11-30 20:53     ` Bart Van Assche
@ 2017-11-30 21:03       ` Dave Chinner
  0 siblings, 0 replies; 47+ messages in thread
From: Dave Chinner @ 2017-11-30 21:03 UTC (permalink / raw)
  To: Bart Van Assche
  Cc: bmarzins, mcgrof, boris.ostrovsky, ONeukum, linux-block,
	linux-kernel, nborisov, oleg.b.antonyan, linux-pm, linux-xfs,
	pavel, darrick.wong, viro, ming.lei, dan.j.williams, rjw, jgross,
	oleksandr, yu.chen.surf, todd.e.brandt, martin.petersen,
	linux-fsdevel, jikos, len.brown, tytso, jack

On Thu, Nov 30, 2017 at 08:53:52PM +0000, Bart Van Assche wrote:
> On Thu, 2017-11-30 at 20:42 +0100, Luis R. Rodriguez wrote:
> > On Thu, Nov 30, 2017 at 05:01:13PM +0000, Bart Van Assche wrote:
> > > The md resync
> > > thread must be stopped before a system is frozen. Today the md driver uses
> > > the kthread freezing mechanism for that purpose. Do you have a plan for
> > > handling the more complicated scenarios, e.g. a filesystem that exists on top
> > > of an md device where the md device uses one or more files as backing store
> > > and with the loop driver between the md device and the files?
> > 
> > Nope not yet. It seems you have given this some thought though so you're 
> > help here is greatly appreciated. In fact the way we should see the long
> > term 'kill kthread freezing' effort should be a collaborative one. I've
> > never touched md, so folks more familiar with md should give this some
> > thought.
> > 
> > Can for instance md register_pm_notifier() and register_syscore_ops()
> > and do handy work to pause some work to replace kthread freezing?
> > Note that I believe a pm notifier is needed in case syscore_suspend()
> > is not called, say on a suspend fail.
> 
> Sorry but I don't think that a solution can be based on a notifier mechanism.
> Freezing has to happen in the order in which drivers and filesystems have
> been stacked (filesystem > md device > filesystem for the above example).

Yup, we need top down stack freezing. We get that at the filesystem
layer by the reverse order superblock iteration - that freezes newer
mounts before older mounts, so things like filesystems on loopback
freeze before the lower filesystem that hosts the loopback image.

I think what we need is the opposite of the "freeze_bdev()" interface.
That allows a block device to freeze the superblock on the block
device. i.e. when the block device needs to quiesce (e.g. for a
snapshot to be taken) it locks the filesystem and waits for it to
quiesce, then does what it needs to and unlocks the filesystem.

ISTM that you're asking for the opposite of this - a call that
allows the superblock to quiesce the underlying block device into a
known, unchanging state.  This would allow MD/dm to suspend whatever
on-going maintenance operations it has in progress until the
filesystem says "we need you to start again" when it gets a thaw
call....

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH 00/11] fs: use freeze_fs on suspend/hibernate
  2017-11-29 23:23 [PATCH 00/11] fs: use freeze_fs on suspend/hibernate Luis R. Rodriguez
                   ` (12 preceding siblings ...)
  2017-11-30 17:01 ` Bart Van Assche
@ 2017-11-30 21:51 ` Pavel Machek
  2017-12-01  0:44   ` Luis R. Rodriguez
  2017-12-13  1:09 ` Rafael J. Wysocki
  14 siblings, 1 reply; 47+ messages in thread
From: Pavel Machek @ 2017-11-30 21:51 UTC (permalink / raw)
  To: Luis R. Rodriguez
  Cc: viro, bart.vanassche, ming.lei, tytso, darrick.wong, jikos, rjw,
	len.brown, linux-fsdevel, boris.ostrovsky, jgross, todd.e.brandt,
	nborisov, jack, martin.petersen, ONeukum, oleksandr,
	oleg.b.antonyan, yu.chen.surf, dan.j.williams, linux-pm,
	linux-block, linux-xfs, linux-kernel

[-- Attachment #1: Type: text/plain, Size: 1087 bytes --]

Hi!

> Proposed solution:
> 
> Instead of fixing such semantics and trying to get all filesystems to do it
> right, we can easily do away with all freezing calls if the filesystem
> implements a proper freeze_fs() callback. The following 9 filesystems have
> freeze_fs() implemented as such we can let the kernel issue the callback upon
> suspend and thaw on resume automatically on our behalf.
> 
>   o xfs
>   o reiserfs
>   o nilfs2
>   o jfs
>   o f2fs
>   o ext4
>   o ext2
>   o btrfs
> 
> Of these, the following have freezer helpers, which can then be removed
> after the kernel automaticaly calls freeze_fs for us on suspend:
> 
>   o xfs
>   o nilfs2
>   o jfs
>   o f2fs
>   o ext4
> 
> I've tested this on a system with ext4 and XFS, and have let 0-day go at
> without issues. I have branches availabe for linux-next [3] and Linus'
> latest tree [4].

Was hibernation tested? uswsusp?

									Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 181 bytes --]

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH 07/11] xfs: remove not needed freezing calls
  2017-11-30 20:32     ` Rafael J. Wysocki
@ 2017-11-30 23:30       ` Dave Chinner
  2017-11-30 23:40         ` Rafael J. Wysocki
  0 siblings, 1 reply; 47+ messages in thread
From: Dave Chinner @ 2017-11-30 23:30 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Jan Kara, Luis R. Rodriguez, Al Viro, bart.vanassche, ming.lei,
	Ted Ts'o, Darrick J. Wong, Jiri Kosina, Rafael J. Wysocki,
	Pavel Machek, Len Brown, linux-fsdevel, Boris Ostrovsky,
	Juergen Gross, Todd Brandt, nborisov, Martin K. Petersen,
	Oliver Neukum, oleksandr, Oleg Antonyan, Yu Chen, Dan Williams,
	Linux PM, linux-block, linux-xfs, Linux Kernel Mailing List

On Thu, Nov 30, 2017 at 09:32:53PM +0100, Rafael J. Wysocki wrote:
> On Thu, Nov 30, 2017 at 5:21 PM, Jan Kara <jack@suse.cz> wrote:
> > On Wed 29-11-17 15:23:52, Luis R. Rodriguez wrote:
> >> This removes superflous freezer calls as they are no longer needed
> >> as the VFS now performs filesystem freezing/thaw if the filesystem has
> >> support for it.
> >>
> >> The following Coccinelle rule was used as follows:
> >>
> >> spatch --sp-file fs-freeze-cleanup.cocci --in-place fs/$FS/
> >
> > I think your rule misses WQ_FREEZABLE flag for workqueues? That would be
> > also good to get rid of...
> 
> We need that one (or equivalent) for the runtime PM workqueue at least.

I think Jan was talking about all the WQ_FREEZABLE flags in
filesystem workqueues (such as all the XFS wqs) that are no longer
necessary once filesystems are frozen appropriately.

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH 07/11] xfs: remove not needed freezing calls
  2017-11-30 23:30       ` Dave Chinner
@ 2017-11-30 23:40         ` Rafael J. Wysocki
  0 siblings, 0 replies; 47+ messages in thread
From: Rafael J. Wysocki @ 2017-11-30 23:40 UTC (permalink / raw)
  To: Dave Chinner
  Cc: Rafael J. Wysocki, Jan Kara, Luis R. Rodriguez, Al Viro,
	bart.vanassche, ming.lei, Ted Ts'o, Darrick J. Wong,
	Jiri Kosina, Pavel Machek, Len Brown, linux-fsdevel,
	Boris Ostrovsky, Juergen Gross, Todd Brandt, nborisov,
	Martin K. Petersen, Oliver Neukum, oleksandr, Oleg Antonyan,
	Yu Chen, Dan Williams, Linux PM, linux-block, linux-xfs,
	Linux Kernel Mailing List

On Friday, December 1, 2017 12:30:33 AM CET Dave Chinner wrote:
> On Thu, Nov 30, 2017 at 09:32:53PM +0100, Rafael J. Wysocki wrote:
> > On Thu, Nov 30, 2017 at 5:21 PM, Jan Kara <jack@suse.cz> wrote:
> > > On Wed 29-11-17 15:23:52, Luis R. Rodriguez wrote:
> > >> This removes superflous freezer calls as they are no longer needed
> > >> as the VFS now performs filesystem freezing/thaw if the filesystem has
> > >> support for it.
> > >>
> > >> The following Coccinelle rule was used as follows:
> > >>
> > >> spatch --sp-file fs-freeze-cleanup.cocci --in-place fs/$FS/
> > >
> > > I think your rule misses WQ_FREEZABLE flag for workqueues? That would be
> > > also good to get rid of...
> > 
> > We need that one (or equivalent) for the runtime PM workqueue at least.
> 
> I think Jan was talking about all the WQ_FREEZABLE flags in
> filesystem workqueues (such as all the XFS wqs) that are no longer
> necessary once filesystems are frozen appropriately.

I see, thanks.  That was unclear to me, sorry for the confusion.

Thanks,
Rafael

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH 00/11] fs: use freeze_fs on suspend/hibernate
  2017-11-30 21:51 ` Pavel Machek
@ 2017-12-01  0:44   ` Luis R. Rodriguez
  0 siblings, 0 replies; 47+ messages in thread
From: Luis R. Rodriguez @ 2017-12-01  0:44 UTC (permalink / raw)
  To: Pavel Machek
  Cc: Luis R. Rodriguez, viro, bart.vanassche, ming.lei, tytso,
	darrick.wong, jikos, rjw, len.brown, linux-fsdevel,
	boris.ostrovsky, jgross, todd.e.brandt, nborisov, jack,
	martin.petersen, ONeukum, oleksandr, oleg.b.antonyan,
	yu.chen.surf, dan.j.williams, linux-pm, linux-block, linux-xfs,
	linux-kernel

On Thu, Nov 30, 2017 at 10:51:57PM +0100, Pavel Machek wrote:
> Hi!
> 
> > Proposed solution:
> > 
> > Instead of fixing such semantics and trying to get all filesystems to do it
> > right, we can easily do away with all freezing calls if the filesystem
> > implements a proper freeze_fs() callback. The following 9 filesystems have
> > freeze_fs() implemented as such we can let the kernel issue the callback upon
> > suspend and thaw on resume automatically on our behalf.
> > 
> >   o xfs
> >   o reiserfs
> >   o nilfs2
> >   o jfs
> >   o f2fs
> >   o ext4
> >   o ext2
> >   o btrfs
> > 
> > Of these, the following have freezer helpers, which can then be removed
> > after the kernel automaticaly calls freeze_fs for us on suspend:
> > 
> >   o xfs
> >   o nilfs2
> >   o jfs
> >   o f2fs
> >   o ext4
> > 
> > I've tested this on a system with ext4 and XFS, and have let 0-day go at
> > without issues. I have branches availabe for linux-next [3] and Linus'
> > latest tree [4].
> 
> Was hibernation tested? uswsusp?

I had not done a test before but I just did and it worked, in fact I was able
to keep my ssh connection to my qemu guest after resume from hibernation with
this.

root@piggy:~# echo shutdown > /sys/power/disk; echo disk > /sys/power/state

[   87.930446] PM: hibernation entry
[   87.936294] firmware_class: device_cache_fw_images
[   87.936363] PM: Syncing filesystems ... 
[   87.979960] PM: done.
[   87.980594] Freezing user space processes ... (elapsed 0.001 seconds) done.
[   87.983839] Freezing filesystems ... 
[   87.983844] xfs (sdb1): freezing
[   88.013313] ext4 (sda1): freezing
[   88.057635] done.
[   88.058242] OOM killer disabled.
...
[   88.145364] Freezing remaining freezable tasks ... (elapsed 0.001 seconds) done.
[   88.221583] Disabling non-boot CPUs ...
[   88.241231] Unregister pv shared memory for cpu 1
[   88.244139] smpboot: CPU 1 is now offline
[   88.273290] Unregister pv shared memory for cpu 2
[   88.276170] smpboot: CPU 2 is now offline
[   88.297296] Unregister pv shared memory for cpu 3
[   88.299991] smpboot: CPU 3 is now offline
[   88.302665] PM: Creating hibernation image:
[   88.305633] PM: Need to copy 107866 pages
[   88.305633] PM: Normal pages needed: 107866 + 1024, available pages: 940488
...

At this point my qemu session ends. I start it up again.

   88.305633] Enabling non-boot CPUs ...
[   88.305633] x86: Booting SMP configuration:
[   88.305633] smpboot: Booting Node 0 Processor 1 APIC 0x1
[   88.244097] kvm-clock: cpu 1, msr 1:3ffef041, secondary cpu clock
[   88.334899] KVM setup async PF for cpu 1
[   88.335174] kvm-stealtime: cpu 1, msr 13fc8d9c0
...
[   88.424852] sd 0:0:0:0: [sda] Starting disk
[   88.424893] sd 0:0:1:0: [sdb] Starting disk
[   88.424928] sd 1:0:0:0: [sdc] Starting disk
[   88.473888] PM: Basic memory bitmaps freed
[   88.473890] OOM killer enabled.
[   88.474876] ext4 (sda1): thawing
[   88.585233] ata2.01: NODEV after polling detection
[   88.587805] ata2.00: configured for MWDMA2
[   88.589436] ata1.00: configured for MWDMA2
[   88.592421] ata1.01: configured for MWDMA2
[   88.601141] xfs (sdb1): thawing
[   88.602430] Restarting tasks ... done.
[   88.611055] PM: hibernation exit
[   90.492926] e1000: ens3 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: RX
[   98.780159] firmware_class: device_uncache_fw_images

And ssh is working after this, without requiring to initiate another connection.

Let me know if you'd like me to test something else.

  Luis

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH 03/11] fs: add frozen sb state helpers
  2017-11-30 19:05     ` Luis R. Rodriguez
@ 2017-12-01 11:47       ` Jan Kara
  2017-12-01 21:13         ` Luis R. Rodriguez
  0 siblings, 1 reply; 47+ messages in thread
From: Jan Kara @ 2017-12-01 11:47 UTC (permalink / raw)
  To: Luis R. Rodriguez
  Cc: Jan Kara, viro, bart.vanassche, ming.lei, tytso, darrick.wong,
	jikos, rjw, pavel, len.brown, linux-fsdevel, boris.ostrovsky,
	jgross, todd.e.brandt, nborisov, martin.petersen, ONeukum,
	oleksandr, oleg.b.antonyan, yu.chen.surf, dan.j.williams,
	linux-pm, linux-block, linux-xfs, linux-kernel

On Thu 30-11-17 20:05:48, Luis R. Rodriguez wrote:
> On Thu, Nov 30, 2017 at 06:13:10PM +0100, Jan Kara wrote:
> > ... I dislike the _by_user() suffix as there may be different places that
> > call freeze_super() (e.g. device mapper does this during some operations).
> > Clearly we need to distinguish "by system suspend" and "the other" cases.
> > So please make this clear in the naming.
> 
> Ah. How about sb_frozen_by_cb() ?

And what does 'cb' stand for? :)

> > In fact, what might be a cleaner solution is to introduce a 'freeze_count'
> > for superblock freezing (we already do have this for block devices). Then
> > you don't need to differentiate these two cases - but you'd still need to
> > properly handle cleanup if freezing of all superblocks fails in the middle.
> > So I'm not 100% this works out nicely in the end. But it's certainly worth
> > a consideration.
> 
> Ah, there are three important reasons for doing it the way I did it which are
> easy to miss, unless you read the commit log message very carefully.
> 
> 0) The ioctl interface causes a failure to be sent back to userspace if
> you issue two consecutive freezes, or two thaws. Ie, once a filesystem is
> frozen, a secondary call will result in an error. Likewise for thaw.

Yep. But also note that there's *another* interface to filesystem freezing
which behaves differently - freeze_bdev() (used internally by dm). That
interface uses the counter and freezing of already frozen device succeeds.
IOW it is a mess. We cannot change the behavior of the ioctl but we could
still provide an in-kernel interface to freeze_super() with the same
semantics as freeze_bdev() which might be easier to use by suspend - maybe
we could call it 'exclusive' (for the current freeze_super() semantics) and
'non-exclusive' (for the freeze_bdev() semantics) since this is very much
like O_EXCL open of block devices...

> 1) The new iterate supers stuff I added bail on the first error and return that
> error. If we kept the ioctl() interface error scheme we'd be erroring out
> if on suspend if userspace had already frozen a filesystem. Clearly that'd
> be silly so we need to distinguish between the automatic kernel freezing
> and the old userspace ioctl initiated interface, so that we can keep the
> old behaviour but allow in-kernel auto freeze on suspend to work properly.

This would work fine with the non-exclusive semantics I believe.

> 2) If we fail to suspend we need to then thaw up all filesystems. The order
> in which we try to freeze is in reverse order on the super_block list. If we
> fail though we iterate in proper order on the super_block list and thaw. If
> you had two filesystems this means that if a failure happened on freezing
> the first filesystem, we'd first thaw the other filesystem -- and because of
> 0) if we don't distinguish between the ioctl interface or auto freezing, we'd
> also fail on thaw'ing given the other superblock wouldn't have been frozen.
> 
> So we need to keep two separate approaches. The count stuff would not suffice
> to distinguish origin of source for freeze call.
> 
> Come to think of it, I think I forgot to avoid thaw if the freeze was ioctl
> initiated..
> 
> thaw_unlocked(bool cb_call)
> {
>   if (sb_frozen_by_cb(sb) && !cb_call)
>     return 0; /* skip as the user wanted to keep this fs frozen */
>   ...
> }
> 
> Even though the kernel auto call is new I think we need to keep ioctl initiated
> frozen filesystems frozen to not break old userspace assumptions.
> 
> So, keeping all this in mind, does a count method still suffice?

The count method would need a different error recovery method - i.e. if you
fail freezing filesystems somewhere in the middle of iteration through
superblock list, you'd need to iterate from that point on to the superblock
where you've started. This is somewhat more complicated than your approach
but it seems cleaner to me:

1) Function freezing all superblocks either succeeds and all superblocks
are frozen or fails and no superblocks are (additionally) frozen.

2) It is not that normal users + one special user (who owns the "flag" in
the superblock in form of a special freeze state) setup. We'd simply have
exclusive and non-exclusive users of superblock freezing and there can be
arbitrary numbers of them.

								Honza
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH 00/11] fs: use freeze_fs on suspend/hibernate
  2017-11-30 16:41   ` Jiri Kosina
  2017-11-30 16:50     ` Yu Chen
@ 2017-12-01 19:05     ` Jeff Layton
  2017-12-01 21:51       ` Dave Chinner
  1 sibling, 1 reply; 47+ messages in thread
From: Jeff Layton @ 2017-12-01 19:05 UTC (permalink / raw)
  To: Jiri Kosina, Yu Chen
  Cc: Luis R. Rodriguez, viro, bart.vanassche, ming.lei, tytso,
	darrick.wong, Rafael J. Wysocki, Pavel Machek, Len Brown,
	linux-fsdevel, boris.ostrovsky, jgross, todd.e.brandt, nborisov,
	jack, martin.petersen, ONeukum, oleksandr, oleg.b.antonyan,
	Dan Williams, Linux PM list, linux-block, linux-xfs,
	Linux Kernel Mailing List, Zhang Rui

On Thu, 2017-11-30 at 17:41 +0100, Jiri Kosina wrote:
> On Fri, 1 Dec 2017, Yu Chen wrote:
> 
> > BTW, is nfs able to be included in this set? I also encountered a 
> > freeze() failure due to nfs access during that stage recently.
> 
> The freezer usage in NFS is magnitudes more complicated, so it makes sense 
> to first go after the lower hanging fruit to figure out the viability of 
> the whole aproach in practice.
> 

Agreed that we should do this in stages. It doesn't help that freezer
handling in the client is a bit of a mess at this point...

At a high level for NFS, I think we need to have freeze_fs make the RPC
engine "park" newly issued RPCs for that fs' client onto a
rpc_wait_queue. Any RPC that has already been sent however, we need to
wait for a reply.

Once everything is quiesced we can return and call it frozen.
unfreeze_fs can then just have the engine stop parking RPCs and wake up
the waitq.

That should be enough to make suspend and resume work more reliably. If,
however, you're interested in making the cgroup freezer also work, then
we may need to do a bit more work to ensure that we don't end up with
frozen tasks squatting on VFS locks.
-- 
Jeff Layton <jlayton@redhat.com>

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH 03/11] fs: add frozen sb state helpers
  2017-12-01 11:47       ` Jan Kara
@ 2017-12-01 21:13         ` Luis R. Rodriguez
  2017-12-21 11:03           ` Jan Kara
  0 siblings, 1 reply; 47+ messages in thread
From: Luis R. Rodriguez @ 2017-12-01 21:13 UTC (permalink / raw)
  To: Jan Kara
  Cc: Luis R. Rodriguez, viro, bart.vanassche, ming.lei, tytso,
	darrick.wong, jikos, rjw, pavel, len.brown, linux-fsdevel,
	boris.ostrovsky, jgross, todd.e.brandt, nborisov,
	martin.petersen, ONeukum, oleksandr, oleg.b.antonyan,
	yu.chen.surf, dan.j.williams, linux-pm, linux-block, linux-xfs,
	linux-kernel

On Fri, Dec 01, 2017 at 12:47:24PM +0100, Jan Kara wrote:
> On Thu 30-11-17 20:05:48, Luis R. Rodriguez wrote:
> > On Thu, Nov 30, 2017 at 06:13:10PM +0100, Jan Kara wrote:
> > > ... I dislike the _by_user() suffix as there may be different places that
> > > call freeze_super() (e.g. device mapper does this during some operations).
> > > Clearly we need to distinguish "by system suspend" and "the other" cases.
> > > So please make this clear in the naming.
> > 
> > Ah. How about sb_frozen_by_cb() ?
> 
> And what does 'cb' stand for? :)

Callback. But let me think about bdev usage a bit and we can worry about the
bikeshedding later.

> > > In fact, what might be a cleaner solution is to introduce a 'freeze_count'
> > > for superblock freezing (we already do have this for block devices). Then
> > > you don't need to differentiate these two cases - but you'd still need to
> > > properly handle cleanup if freezing of all superblocks fails in the middle.
> > > So I'm not 100% this works out nicely in the end. But it's certainly worth
> > > a consideration.
> > 
> > Ah, there are three important reasons for doing it the way I did it which are
> > easy to miss, unless you read the commit log message very carefully.
> > 
> > 0) The ioctl interface causes a failure to be sent back to userspace if
> > you issue two consecutive freezes, or two thaws. Ie, once a filesystem is
> > frozen, a secondary call will result in an error. Likewise for thaw.
> 
> Yep. But also note that there's *another* interface to filesystem freezing
> which behaves differently - freeze_bdev() (used internally by dm). That
> interface uses the counter and freezing of already frozen device succeeds.

Ah... so freeze_bdev() semantics matches the same semantics I was looking
for.

> IOW it is a mess.

To say the least.

> We cannot change the behavior of the ioctl but we could
> still provide an in-kernel interface to freeze_super() with the same
> semantics as freeze_bdev() which might be easier to use by suspend - maybe
> we could call it 'exclusive' (for the current freeze_super() semantics) and
> 'non-exclusive' (for the freeze_bdev() semantics) since this is very much
> like O_EXCL open of block devices...

Sure, now typically I see we make exclusive calls with the postfix _excl() so
I take it you'd be OK in renaming freeze_super() freeze_super_excl() eventually
then?

I totally missed freeze_bdev() otherwise I think I would have picked up on the
shared semantics stuff and I would have just made a helper out of what
freeze_bdev() uses, and then have both in-kernel and freeze_bdev() use it.

I'll note that its still not perfectly clear if really the semantics behind
freeze_bdev() match what I described above fully. That still needs to be
vetted for. For instance, does thaw_bdev() keep a superblock frozen if we
an ioctl initiated freeze had occurred before? If so then great. Otherwise
I think we'll need to distinguish the ioctl interface. Worst possible case
is that bdev semantics and in-kernel semantics differ somehow, then that
will really create a holy fucking mess.

> > 1) The new iterate supers stuff I added bail on the first error and return that
> > error. If we kept the ioctl() interface error scheme we'd be erroring out
> > if on suspend if userspace had already frozen a filesystem. Clearly that'd
> > be silly so we need to distinguish between the automatic kernel freezing
> > and the old userspace ioctl initiated interface, so that we can keep the
> > old behaviour but allow in-kernel auto freeze on suspend to work properly.
> 
> This would work fine with the non-exclusive semantics I believe.

Groovy.

> > 2) If we fail to suspend we need to then thaw up all filesystems. The order
> > in which we try to freeze is in reverse order on the super_block list. If we
> > fail though we iterate in proper order on the super_block list and thaw. If
> > you had two filesystems this means that if a failure happened on freezing
> > the first filesystem, we'd first thaw the other filesystem -- and because of
> > 0) if we don't distinguish between the ioctl interface or auto freezing, we'd
> > also fail on thaw'ing given the other superblock wouldn't have been frozen.
> > 
> > So we need to keep two separate approaches. The count stuff would not suffice
> > to distinguish origin of source for freeze call.
> > 
> > Come to think of it, I think I forgot to avoid thaw if the freeze was ioctl
> > initiated..
> > 
> > thaw_unlocked(bool cb_call)
> > {
> >   if (sb_frozen_by_cb(sb) && !cb_call)
> >     return 0; /* skip as the user wanted to keep this fs frozen */
> >   ...
> > }
> > 
> > Even though the kernel auto call is new I think we need to keep ioctl initiated
> > frozen filesystems frozen to not break old userspace assumptions.
> > 
> > So, keeping all this in mind, does a count method still suffice?
> 
> The count method would need a different error recovery method - i.e. if you
> fail freezing filesystems somewhere in the middle of iteration through
> superblock list, you'd need to iterate from that point on to the superblock
> where you've started. This is somewhat more complicated than your approach
> but it seems cleaner to me:
> 
> 1) Function freezing all superblocks either succeeds and all superblocks
> are frozen or fails and no superblocks are (additionally) frozen.

To be clear, for now this would just be, all superblocks that support
freeze_fs() are frozen :)

> 2) It is not that normal users + one special user (who owns the "flag" in
> the superblock in form of a special freeze state) setup. We'd simply have
> exclusive and non-exclusive users of superblock freezing and there can be
> arbitrary numbers of them.

Sorry I did not understand this point. Can you rephrase perhaps a bit?

Anyway, I just tried implementing this and it seemed rather easy to
use a pivot, however note that then freeze_processes() which calls
fs_suspend_freeze() would somehow need to pass the failed sb... do
we want to have let fs_suspend_freeze() pass a parameter to be set
to the failed sb of it failed? Locking-wise this seems racy.

So I mean, adding support to thaw using a pivot, the failed sb is
rather easy:

diff --git a/fs/super.c b/fs/super.c
index 885711c1d35b..8cb6f38652d8 100644
--- a/fs/super.c
+++ b/fs/super.c
@@ -614,13 +614,21 @@ void iterate_supers(void (*f)(struct super_block *, void *), void *arg)
  *	locked superblock and given argument. Returns 0 unless an error
  *	occurred on calling the function on any superblock.
  */
-int iterate_supers_excl(int (*f)(struct super_block *, void *), void *arg)
+int iterate_supers_excl(int (*f)(struct super_block *, void *), void *arg,
+			struct super_block *pivot)
 {
 	struct super_block *sb, *p = NULL;
 	int error = 0;
 
 	spin_lock(&sb_lock);
 	list_for_each_entry(sb, &super_blocks, s_list) {
+		/* If we have a pivot, start work on the next item */
+		if (pivot) {
+			if (sb != pivot)
+				continue;
+			pivot = NULL;
+			continue;
+		}
 		if (hlist_unhashed(&sb->s_instances))
 			continue;
 		sb->s_count++;

But we'd still need to to give enough context to let thaw use the failed sb
as a pivot.

  Luis

^ permalink raw reply related	[flat|nested] 47+ messages in thread

* Re: [PATCH 00/11] fs: use freeze_fs on suspend/hibernate
  2017-12-01 19:05     ` Jeff Layton
@ 2017-12-01 21:51       ` Dave Chinner
  0 siblings, 0 replies; 47+ messages in thread
From: Dave Chinner @ 2017-12-01 21:51 UTC (permalink / raw)
  To: Jeff Layton
  Cc: Jiri Kosina, Yu Chen, Luis R. Rodriguez, viro, bart.vanassche,
	ming.lei, tytso, darrick.wong, Rafael J. Wysocki, Pavel Machek,
	Len Brown, linux-fsdevel, boris.ostrovsky, jgross, todd.e.brandt,
	nborisov, jack, martin.petersen, ONeukum, oleksandr,
	oleg.b.antonyan, Dan Williams, Linux PM list, linux-block,
	linux-xfs, Linux Kernel Mailing List, Zhang Rui

On Fri, Dec 01, 2017 at 02:05:44PM -0500, Jeff Layton wrote:
> On Thu, 2017-11-30 at 17:41 +0100, Jiri Kosina wrote:
> > On Fri, 1 Dec 2017, Yu Chen wrote:
> > 
> > > BTW, is nfs able to be included in this set? I also encountered a 
> > > freeze() failure due to nfs access during that stage recently.
> > 
> > The freezer usage in NFS is magnitudes more complicated, so it makes sense 
> > to first go after the lower hanging fruit to figure out the viability of 
> > the whole aproach in practice.
> > 
> 
> Agreed that we should do this in stages. It doesn't help that freezer
> handling in the client is a bit of a mess at this point...
> 
> At a high level for NFS, I think we need to have freeze_fs make the RPC
> engine "park" newly issued RPCs for that fs' client onto a
> rpc_wait_queue. Any RPC that has already been sent however, we need to
> wait for a reply.
> 
> Once everything is quiesced we can return and call it frozen.
> unfreeze_fs can then just have the engine stop parking RPCs and wake up
> the waitq.

That seems pretty reasonable. freezing is expected to take a bit of
time to run - local filesystems can do a fair bit of IO draining
queues, inflight operations and bringing the journal into a
consistent state on disk before declaring the filesystem is frozen.

> That should be enough to make suspend and resume work more reliably. If,
> however, you're interested in making the cgroup freezer also work, then
> we may need to do a bit more work to ensure that we don't end up with
> frozen tasks squatting on VFS locks.

None of the existing freezing code gives those guarantees. In fact,
freezing a filesystem pretty much guarantees the opposite - that
tasks *will freeze when holding VFS locks* - and so the cgroup
freezer is broken by design if it requires tasks to be frozen
without holding any VFS/filesystem lock context. So I wouldn't
really worry about the cgroup freezer....

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH 00/11] fs: use freeze_fs on suspend/hibernate
  2017-11-29 23:23 [PATCH 00/11] fs: use freeze_fs on suspend/hibernate Luis R. Rodriguez
                   ` (13 preceding siblings ...)
  2017-11-30 21:51 ` Pavel Machek
@ 2017-12-13  1:09 ` Rafael J. Wysocki
  2017-12-19 16:50   ` Luis R. Rodriguez
  14 siblings, 1 reply; 47+ messages in thread
From: Rafael J. Wysocki @ 2017-12-13  1:09 UTC (permalink / raw)
  To: Luis R. Rodriguez
  Cc: viro, bart.vanassche, ming.lei, tytso, darrick.wong, jikos,
	pavel, len.brown, linux-fsdevel, boris.ostrovsky, jgross,
	todd.e.brandt, nborisov, jack, martin.petersen, ONeukum,
	oleksandr, oleg.b.antonyan, yu.chen.surf, dan.j.williams,
	linux-pm, linux-block, linux-xfs, linux-kernel

On Thursday, November 30, 2017 12:23:45 AM CET Luis R. Rodriguez wrote:
> This is a followup from the original RFC which proposed to start
> to kill kthread freezing all together [0]. Instead of going straight
> out to the jugular for kthread freezing this series only addresses
> killing freezer calls on filesystems which implement freeze_fs, after
> we let the kernel freeze these filesystems for us on suspend.
> 
> This approach puts on a slow but steady path towards the original goal
> though. Each subsystem could look for similar solutions. Even with
> filesystems we're not all done yet, after this we'll still have to
> decide what to do about filesystems which do not implement freeze_fs().
> 
> Motivation and problem:
> 
> kthreads have some semantics for freezing, which helps the kernel
> freeze them when a system is going to suspend or hibernation. These
> semantics are not well defined though, and it actually turns out
> pretty hard to get it right.
> 
> Without a proper solution suspend and hibernation are fragile on filesystems,
> it can easily break suspend and fixing such issues are in no way trivial [1]
> [2].
> 
> Proposed solution:
> 
> Instead of fixing such semantics and trying to get all filesystems to do it
> right, we can easily do away with all freezing calls if the filesystem
> implements a proper freeze_fs() callback. The following 9 filesystems have
> freeze_fs() implemented as such we can let the kernel issue the callback upon
> suspend and thaw on resume automatically on our behalf.
> 
>   o xfs
>   o reiserfs
>   o nilfs2
>   o jfs
>   o f2fs
>   o ext4
>   o ext2
>   o btrfs
> 
> Of these, the following have freezer helpers, which can then be removed
> after the kernel automaticaly calls freeze_fs for us on suspend:
> 
>   o xfs
>   o nilfs2
>   o jfs
>   o f2fs
>   o ext4
> 
> I've tested this on a system with ext4 and XFS, and have let 0-day go at
> without issues. I have branches availabe for linux-next [3] and Linus'
> latest tree [4].
> 
> Further testing, thoughts, reviews, flames are all equally appreciated.
> 
> [0] https://lkml.kernel.org/r/20171003185313.1017-1-mcgrof@kernel.org
> [1] https://bugzilla.suse.com/show_bug.cgi?id=1043449
> [2] https://lkml.kernel.org/r/20171113103139.GA18936@yu-chen.sh.intel.com
> [3] https://git.kernel.org/pub/scm/linux/kernel/git/mcgrof/linux-next.git/log/?h=20171129-fs-freeze-cleanup
> [4] https://git.kernel.org/pub/scm/linux/kernel/git/mcgrof/linux.git/log/?h=20171129-fs-freeze-cleanup
> 
> Luis R. Rodriguez (11):
>   fs: provide unlocked helper for freeze_super()
>   fs: provide unlocked helper thaw_super()
>   fs: add frozen sb state helpers
>   fs: distinguish between user initiated freeze and kernel initiated
>     freeze
>   fs: add iterate_supers_excl() and iterate_supers_reverse_excl()
>   fs: freeze on suspend and thaw on resume
>   xfs: remove not needed freezing calls
>   ext4: remove not needed freezing calls
>   f2fs: remove not needed freezing calls
>   nilfs2: remove not needed freezing calls
>   jfs: remove not needed freezing calls
> 
>  fs/ext4/ext4_jbd2.c    |   2 +-
>  fs/ext4/super.c        |   2 -
>  fs/f2fs/gc.c           |   5 +-
>  fs/f2fs/segment.c      |   6 +-
>  fs/jfs/jfs_logmgr.c    |  11 +-
>  fs/jfs/jfs_txnmgr.c    |  31 ++---
>  fs/nilfs2/segment.c    |  48 ++++----
>  fs/super.c             | 320 ++++++++++++++++++++++++++++++++++++++++---------
>  fs/xfs/xfs_trans.c     |   2 +-
>  fs/xfs/xfs_trans_ail.c |   7 +-
>  include/linux/fs.h     |  63 +++++++++-
>  kernel/power/process.c |  15 ++-
>  12 files changed, 378 insertions(+), 134 deletions(-)
> 

I'm assuming an update of this to be posted due to the comments from Jan
on patch [3/11] and patch [7/11] probably.

Is there anything else that needs to be addressed?

Thanks,
Rafael

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH 00/11] fs: use freeze_fs on suspend/hibernate
  2017-12-13  1:09 ` Rafael J. Wysocki
@ 2017-12-19 16:50   ` Luis R. Rodriguez
  0 siblings, 0 replies; 47+ messages in thread
From: Luis R. Rodriguez @ 2017-12-19 16:50 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Luis R. Rodriguez, viro, bart.vanassche, ming.lei, tytso,
	darrick.wong, jikos, pavel, len.brown, linux-fsdevel,
	boris.ostrovsky, jgross, todd.e.brandt, nborisov, jack,
	martin.petersen, ONeukum, oleksandr, oleg.b.antonyan,
	yu.chen.surf, dan.j.williams, linux-pm, linux-block, linux-xfs,
	linux-kernel

On Wed, Dec 13, 2017 at 02:09:49AM +0100, Rafael J. Wysocki wrote:
> I'm assuming an update of this to be posted due to the comments from Jan
> on patch [3/11] and patch [7/11] probably.
> 
> Is there anything else that needs to be addressed?

I was waiting on Jan Kara's feedback on how he'd like to proceed with the
unthawing on error given his point on that the device mapper API seems to match
the in kernel automatic freezing just that I didn't use that same interface.

0-day did come back with one RCU issue which I also have to address:

[  422.919958] kernel BUG at kernel/rcu/sync.c:228!                                                                                                                                           
[  422.920115] invalid opcode: 0000 [#1] SMP                                                                                                                                                  
[  422.920212] Modules linked in: rpcsec_gss_krb5 auth_rpcgss nfsv4 dns_resolver netconsole sr_mod cdrom sd_mod sg snd_hda_codec_idt snd_hda_codec_generic intel_rapl x86_pkg_temp_thermal    
+intel_powerclamp coretemp kvm irqbypass crct10dif_pclmul snd_hda_intel crc32_pclmul crc32c_intel snd_hda_codec snd_hda_core snd_hwdep i915 ghash_clmulni_intel cryptd snd_pcm pcspkr         
+drm_kms_helper snd_timer ahci libahci syscopyarea sysfillrect sysimgblt fb_sys_fops snd libata shpchp soundcore drm video ip_tables                                                          
[  422.921168] CPU: 2 PID: 237 Comm: kworker/2:3 Not tainted 4.15.0-rc1-00030-gf95c16a #1                                                                                                     
[  422.921347] Hardware name: Hewlett-Packard p6-1451cx/2ADA, BIOS 8.15 02/05/2013                                                                                                            
[  422.921515] Workqueue: events destroy_super_work                                                                                                                                           
[  422.921628] task: ffff8801bfcd0000 task.stack: ffffc90001128000                                                                                                                            
[  422.921768] RIP: 0010:rcu_sync_dtor+0x65/0x70                                                                                                                                              
[  422.921874] RSP: 0000:ffffc9000112be60 EFLAGS: 00010286                                                                                                                                    
[  422.921985] RAX: 0000000000080000 RBX: ffff8801c00793d8 RCX: 000000000001fece                                                                                                              
[  422.922133] RDX: 00000000fffffff6 RSI: 0000000000000282 RDI: ffff8801c00793d8                                                                                                              
[  422.922283] RBP: ffff880212f1b6c0 R08: 0000000000000000 R09: 000000000000009c                                                                                                              
[  422.922432] R10: 0000000000000000 R11: 0000000000000000 R12: ffff880212f1f800                                                                                                              
[  422.922579] R13: 0000000000000000 R14: ffff8801c036db40 R15: ffff8801c00795b0                                                                                                              
[  422.922728] FS:  0000000000000000(0000) GS:ffff880212f00000(0000) knlGS:0000000000000000                                                                                                   
[  422.922931] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033                                                                                                                              
[  422.923054] CR2: 00007ffe2f820ff8 CR3: 0000000001e09002 CR4: 00000000001606e0                                                                                                              
[  422.923203] Call Trace:                                                                                                                                                                    
[  422.923266]  percpu_free_rwsem+0x15/0x30                                                                                                                                                   
[  422.923357]  destroy_super_work+0x3d/0x50                                                                                                                                                  
[  422.923449]  process_one_work+0x18f/0x3e0                                                                                                                                                  
[  422.923540]  worker_thread+0x3a/0x3b0                                                                                                                                                      
[  422.923623]  ? process_one_work+0x3e0/0x3e0                                                                                                                                                
[  422.923716]  kthread+0x11c/0x140                                                                                                                                                           
[  422.923792]  ? kthread_create_worker_on_cpu+0x50/0x50                                                                                                                                      
[  422.923905]  ret_from_fork+0x1f/0x30                                                                                                                                                       
[  422.923986] Code: 00 fb 66 0f 1f 44 00 00 65 ff 0d 57 6c f2 7e 85 d2 74 15 8b 43 38 48 8d 04 40 ff 14 c5 f0 f5 a1 81 8b 43 20 85 c0 75 05 5b 5d c3 <0f> 0b 0f 0b 90 90 90 90 90 90 90 0f 1f
+44 00 00 8b 87 c8 c3 00                                                                                                                                                                      
[  422.924405] RIP: rcu_sync_dtor+0x65/0x70 RSP: ffffc9000112be60 

  Luis

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH 03/11] fs: add frozen sb state helpers
  2017-12-01 21:13         ` Luis R. Rodriguez
@ 2017-12-21 11:03           ` Jan Kara
  2018-04-18  0:59             ` Luis R. Rodriguez
  0 siblings, 1 reply; 47+ messages in thread
From: Jan Kara @ 2017-12-21 11:03 UTC (permalink / raw)
  To: Luis R. Rodriguez
  Cc: Jan Kara, viro, bart.vanassche, ming.lei, tytso, darrick.wong,
	jikos, rjw, pavel, len.brown, linux-fsdevel, boris.ostrovsky,
	jgross, todd.e.brandt, nborisov, martin.petersen, ONeukum,
	oleksandr, oleg.b.antonyan, yu.chen.surf, dan.j.williams,
	linux-pm, linux-block, linux-xfs, linux-kernel

Hello,

I think I owe you a reply here... Sorry that it took so long.

On Fri 01-12-17 22:13:27, Luis R. Rodriguez wrote:
> On Fri, Dec 01, 2017 at 12:47:24PM +0100, Jan Kara wrote:
> > On Thu 30-11-17 20:05:48, Luis R. Rodriguez wrote:
> > > > In fact, what might be a cleaner solution is to introduce a 'freeze_count'
> > > > for superblock freezing (we already do have this for block devices). Then
> > > > you don't need to differentiate these two cases - but you'd still need to
> > > > properly handle cleanup if freezing of all superblocks fails in the middle.
> > > > So I'm not 100% this works out nicely in the end. But it's certainly worth
> > > > a consideration.
> > > 
> > > Ah, there are three important reasons for doing it the way I did it which are
> > > easy to miss, unless you read the commit log message very carefully.
> > > 
> > > 0) The ioctl interface causes a failure to be sent back to userspace if
> > > you issue two consecutive freezes, or two thaws. Ie, once a filesystem is
> > > frozen, a secondary call will result in an error. Likewise for thaw.
> > 
> > Yep. But also note that there's *another* interface to filesystem freezing
> > which behaves differently - freeze_bdev() (used internally by dm). That
> > interface uses the counter and freezing of already frozen device succeeds.
> 
> Ah... so freeze_bdev() semantics matches the same semantics I was looking
> for.
> 
> > IOW it is a mess.
> 
> To say the least.
> 
> > We cannot change the behavior of the ioctl but we could
> > still provide an in-kernel interface to freeze_super() with the same
> > semantics as freeze_bdev() which might be easier to use by suspend - maybe
> > we could call it 'exclusive' (for the current freeze_super() semantics) and
> > 'non-exclusive' (for the freeze_bdev() semantics) since this is very much
> > like O_EXCL open of block devices...
> 
> Sure, now typically I see we make exclusive calls with the postfix _excl() so
> I take it you'd be OK in renaming freeze_super() freeze_super_excl() eventually
> then?

In principle yes but let's leave the naming disputes to a later time when
it is clear what API do we actually want to provide.

> I totally missed freeze_bdev() otherwise I think I would have picked up on the
> shared semantics stuff and I would have just made a helper out of what
> freeze_bdev() uses, and then have both in-kernel and freeze_bdev() use it.
> 
> I'll note that its still not perfectly clear if really the semantics behind
> freeze_bdev() match what I described above fully. That still needs to be
> vetted for. For instance, does thaw_bdev() keep a superblock frozen if we
> an ioctl initiated freeze had occurred before? If so then great. Otherwise
> I think we'll need to distinguish the ioctl interface. Worst possible case
> is that bdev semantics and in-kernel semantics differ somehow, then that
> will really create a holy fucking mess.

I believe nobody really thought about mixing those two interfaces to fs
freezing and so the behavior is basically defined by the implementation.
That is:

freeze_bdev() on sb frozen by ioctl_fsfreeze() -> EBUSY
freeze_bdev() on sb frozen by freeze_bdev() -> success
ioctl_fsfreeze() on sb frozen by freeze_bdev() -> EBUSY
ioctl_fsfreeze() on sb frozen by ioctl_fsfreeze() -> EBUSY

thaw_bdev() on sb frozen by ioctl_fsfreeze() -> EINVAL
ioctl_fsthaw() on sb frozen by freeze_bdev() -> success

What I propose is the following API:

freeze_super_excl()
  - freezes superblock, returns EBUSY if the superblock is already frozen
    (either by another freeze_super_excl() or by freeze_super())
freeze_super()
  - this function will make sure superblock is frozen when the function
    returns with success. It can be nested with other freeze_super() or
    freeze_super_excl() calls (this second part is different from how
    freeze_bdev() behaves currently but AFAICT this behavior is actually
    what all current users of freeze_bdev() really want - just make sure
    fs cannot be written to)
thaw_super()
  - counterpart to freeze_super(), would fail with EINVAL if we were to
    drop the last "freeze reference" but sb was actually frozen with
    freeze_super_excl()
thaw_super_excl()
  - counterpart to freeze_super_excl(). Fails with EINVAL if sb was not
    frozen with freeze_super_excl() (this is different to current behavior
    but I don't believe anyone relies on this and doing otherwise is asking
    for data corruption).

I'd implement it by a freeze counter in the superblock (similar to what we
currently have in bdev) where every call to freeze_super() or
freeze_super_excl() would add one. Additionally we'd have a flag in the
superblock whether the first freeze (it could not be any other since those
would fail with EBUSY) came from freeze_super_excl().

Then we could make ioctl interface use the _excl() variant of the freezing
API, freeze_bdev() would use the non-exclusive variant (we could drop the
freeze counter in bdev completely), your freezing on suspend could then use
the non-exclusive variant as well.

Also when doing this, you'd need to move code like:

        if (sb->s_op->freeze_super)
                error = sb->s_op->freeze_super(sb);
        else
                error = freeze_super(sb);

into the freeze_super() / freeze_super_excl() handler behind the
freeze counting code which might be a bit tricky WRT locking. GFS2 is the
only fs having ->freeze_super() and that callback was implemented specifically
so that it can do its own (cluster wide) locking before generic code grabbing
s_umount semaphore. Then internally GFS2 ends up calling freeze_super()
from freeze_go_sync() when cluster lock is acquired.


> > 2) It is not that normal users + one special user (who owns the "flag" in
> > the superblock in form of a special freeze state) setup. We'd simply have
> > exclusive and non-exclusive users of superblock freezing and there can be
> > arbitrary numbers of them.
> 
> Sorry I did not understand this point. Can you rephrase perhaps a bit?
> 
> Anyway, I just tried implementing this and it seemed rather easy to
> use a pivot, however note that then freeze_processes() which calls
> fs_suspend_freeze() would somehow need to pass the failed sb... do
> we want to have let fs_suspend_freeze() pass a parameter to be set
> to the failed sb of it failed? Locking-wise this seems racy.

So with your iterate_supers_excl() doing this is somewhat difficult but you
could have something like:

int freeze_all_supers(void)
{
	struct super_block *sb, *p = NULL;
	int error = 0;

	spin_lock(&sb_lock);
	list_for_each_entry_reverse(sb, &super_blocks, s_list) {
		if (hlist_unhashed(&sb->s_instances))
			continue;
		sb->s_count++;
		spin_unlock(&sb_lock);

		down_write(&sb->s_umount);
		if (sb->s_root && (sb->s_flags & SB_BORN)) {
			error = freeze_super(sb, arg);
			if (error) {
				up_write(&sb->s_umount);
				spin_lock(&sb_lock);
				if (p)
					__put_super(p);
				p = sb;
				list_for_each_entry_continue(sb, &super_blocks,
							     s_list) {
					if (hlist_unhashed(&sb->s_instances))
						continue;
					sb->s_count++;
					spin_unlock(&sb_lock);

					down_write(&sb->s_umount);
					if (sb->s_root && (sb->s_flags & SB_BORN))
						thaw_super(sb, arg);
					up_write(&sb->s_umount);

					spin_lock(&sb_lock);
					if (p)
						__put_super(p);
					p = sb;
				}
				break;
			}
		}
		up_write(&sb->s_umount);

		spin_lock(&sb_lock);
		if (p)
			__put_super(p);
		p = sb;
	}
	if (p)
		__put_super(p);
	spin_unlock(&sb_lock);

	return error;
}

And you could possibly factor that out into two helper functions for
iterating the superblocks, just they'd need more parameters and you'd need
to pass reference (sb->count) when passing in the 'pivot' as you call it.

								Honza
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH 03/11] fs: add frozen sb state helpers
  2017-12-21 11:03           ` Jan Kara
@ 2018-04-18  0:59             ` Luis R. Rodriguez
  2018-04-18 10:12               ` Jan Kara
  2018-04-20 18:49               ` Luis R. Rodriguez
  0 siblings, 2 replies; 47+ messages in thread
From: Luis R. Rodriguez @ 2018-04-18  0:59 UTC (permalink / raw)
  To: Jan Kara
  Cc: Luis R. Rodriguez, viro, bart.vanassche, ming.lei, tytso,
	darrick.wong, jikos, rjw, pavel, len.brown, linux-fsdevel,
	boris.ostrovsky, jgross, todd.e.brandt, nborisov,
	martin.petersen, ONeukum, oleksandr, oleg.b.antonyan,
	yu.chen.surf, dan.j.williams, linux-pm, linux-block, linux-xfs,
	linux-kernel, jlayton

On Thu, Dec 21, 2017 at 12:03:29PM +0100, Jan Kara wrote:
> Hello,
> 
> I think I owe you a reply here... Sorry that it took so long.

Took me just as long :)

> On Fri 01-12-17 22:13:27, Luis R. Rodriguez wrote:
> > 
> > I'll note that its still not perfectly clear if really the semantics behind
> > freeze_bdev() match what I described above fully. That still needs to be
> > vetted for. For instance, does thaw_bdev() keep a superblock frozen if we
> > an ioctl initiated freeze had occurred before? If so then great. Otherwise
> > I think we'll need to distinguish the ioctl interface. Worst possible case
> > is that bdev semantics and in-kernel semantics differ somehow, then that
> > will really create a holy fucking mess.
> 
> I believe nobody really thought about mixing those two interfaces to fs
> freezing and so the behavior is basically defined by the implementation.
> That is:
> 
> freeze_bdev() on sb frozen by ioctl_fsfreeze() -> EBUSY

Note below as well on your *future* freeze_super() implementation.

> freeze_bdev() on sb frozen by freeze_bdev() -> success
> ioctl_fsfreeze() on sb frozen by freeze_bdev() -> EBUSY
> ioctl_fsfreeze() on sb frozen by ioctl_fsfreeze() -> EBUSY
> 
> thaw_bdev() on sb frozen by ioctl_fsfreeze() -> EINVAL

Phew, so this is what we want for the in-kernel freezing so we're good
and *can* combine these then.

> ioctl_fsthaw() on sb frozen by freeze_bdev() -> success
> 
> What I propose is the following API:
> 
> freeze_super_excl()
>   - freezes superblock, returns EBUSY if the superblock is already frozen
>     (either by another freeze_super_excl() or by freeze_super())
> freeze_super()
>   - this function will make sure superblock is frozen when the function
>     returns with success. 

That's straight forward.

>     It can be nested with other freeze_super() or
>     freeze_super_excl() calls 

This is where it can get hairy. More below.

>     (this second part is different from how
>     freeze_bdev() behaves currently but AFAICT this behavior is actually
>     what all current users of freeze_bdev() really want - just make sure
>     fs cannot be written to)

If we can agree to this, then sure. However there are two types of
possible nested calls to consider, one where the sb was already frozen
by an IOCTL, and the other where it was initiated by either another
freeze_super_excl() or another freeze_super() call which is currently
being processed. For the first type, its easy to say the device is
already frozen as such return success. If the freezing is ongoing,
we may want to wait or not wait, and this will depend on our current
use cases for freeze_bdev().

As you noted above, freeze_bdev() currently returns EBUSY if we had
the sb already frozen by ioctl_fsfreeze(). It may be a welcomed
enhancement to correct the semantics first to address the first case,
but keep the EBUSY for the other case. A secondary patch could then
add a completion mechanism and let callers decide to either wait or not.
*Iff* the caller did not opt-in to wait we keep the EBUSY return.

Seem reasonable?

I'll address the rest of the mail later.

  Luis

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH 03/11] fs: add frozen sb state helpers
  2018-04-18  0:59             ` Luis R. Rodriguez
@ 2018-04-18 10:12               ` Jan Kara
  2018-04-20 18:49               ` Luis R. Rodriguez
  1 sibling, 0 replies; 47+ messages in thread
From: Jan Kara @ 2018-04-18 10:12 UTC (permalink / raw)
  To: Luis R. Rodriguez
  Cc: Jan Kara, viro, bart.vanassche, ming.lei, tytso, darrick.wong,
	jikos, rjw, pavel, len.brown, linux-fsdevel, boris.ostrovsky,
	jgross, todd.e.brandt, nborisov, martin.petersen, ONeukum,
	oleksandr, oleg.b.antonyan, yu.chen.surf, dan.j.williams,
	linux-pm, linux-block, linux-xfs, linux-kernel, jlayton

On Tue 17-04-18 17:59:36, Luis R. Rodriguez wrote:
> On Thu, Dec 21, 2017 at 12:03:29PM +0100, Jan Kara wrote:
> > On Fri 01-12-17 22:13:27, Luis R. Rodriguez wrote:
> > > 
> > > I'll note that its still not perfectly clear if really the semantics behind
> > > freeze_bdev() match what I described above fully. That still needs to be
> > > vetted for. For instance, does thaw_bdev() keep a superblock frozen if we
> > > an ioctl initiated freeze had occurred before? If so then great. Otherwise
> > > I think we'll need to distinguish the ioctl interface. Worst possible case
> > > is that bdev semantics and in-kernel semantics differ somehow, then that
> > > will really create a holy fucking mess.
> > 
> > I believe nobody really thought about mixing those two interfaces to fs
> > freezing and so the behavior is basically defined by the implementation.
> > That is:
> > 
> > freeze_bdev() on sb frozen by ioctl_fsfreeze() -> EBUSY
> 
> Note below as well on your *future* freeze_super() implementation.
> 
> > freeze_bdev() on sb frozen by freeze_bdev() -> success
> > ioctl_fsfreeze() on sb frozen by freeze_bdev() -> EBUSY
> > ioctl_fsfreeze() on sb frozen by ioctl_fsfreeze() -> EBUSY
> > 
> > thaw_bdev() on sb frozen by ioctl_fsfreeze() -> EINVAL
> 
> Phew, so this is what we want for the in-kernel freezing so we're good
> and *can* combine these then.
> 
> > ioctl_fsthaw() on sb frozen by freeze_bdev() -> success
> > 
> > What I propose is the following API:
> > 
> > freeze_super_excl()
> >   - freezes superblock, returns EBUSY if the superblock is already frozen
> >     (either by another freeze_super_excl() or by freeze_super())
> > freeze_super()
> >   - this function will make sure superblock is frozen when the function
> >     returns with success. 
> 
> That's straight forward.
> 
> >     It can be nested with other freeze_super() or
> >     freeze_super_excl() calls 
> 
> This is where it can get hairy. More below.
> 
> >     (this second part is different from how
> >     freeze_bdev() behaves currently but AFAICT this behavior is actually
> >     what all current users of freeze_bdev() really want - just make sure
> >     fs cannot be written to)
> 
> If we can agree to this, then sure. However there are two types of
> possible nested calls to consider, one where the sb was already frozen
> by an IOCTL, and the other where it was initiated by either another
> freeze_super_excl() or another freeze_super() call which is currently
> being processed. For the first type, its easy to say the device is
> already frozen as such return success. If the freezing is ongoing,
> we may want to wait or not wait, and this will depend on our current
> use cases for freeze_bdev().

A side note since I'm not sure I wrote this down in my previous email:
I want ioctl_fsfreeze() directly use freeze_super_excl().

Now to your freeze in progress question: freeze_super_excl() can
immediately return EBUSY when there's freezing in progress. OTOH
freeze_super() always has to wait for the current freeze / thaw to finish
and then do what's necessary. I don't see a use case where you'd like to
have freeze_super() not wait.

> As you noted above, freeze_bdev() currently returns EBUSY if we had
> the sb already frozen by ioctl_fsfreeze(). It may be a welcomed
> enhancement to correct the semantics first to address the first case,
> but keep the EBUSY for the other case. A secondary patch could then
> add a completion mechanism and let callers decide to either wait or not.
> *Iff* the caller did not opt-in to wait we keep the EBUSY return.

You're now speaking about steps to transition to the new API, right? I'd
structure the transition as follows:

1) Move bdev->bd_fsfreeze_count to a superblock.
2) Make freeze_super() grab the counter as well, thaw_super() drops it and
  unfreezes the filesystem only if the counter dropped to zero.
3) Rename freeze_super() to freeze_super_excl().
4) Only now I'd go for messing with freeze_bdev() as it now combines sanely
with freeze_super_excl(). Probably I'd just implement new freeze_super()
with the desired semantics (including waiting for ongoing operation to
finish).
5) And then switch all users (there are 4 in the kernel) from freeze_bdev()
to freeze_super() with the justification in each case why the new semantics
is actually desirable.
6) Drop old freeze_bdev() - note that only one freeze_bdev() user (in
drivers/md/dm.c) is actually interested in passing bdev, all the others are
better off just passing in superblock to new freeze_super(). Anyway for
that user in dm we might still provide a convenience wrapper to grab the
superblock and call new freeze_super() on it.

								Honza
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH 03/11] fs: add frozen sb state helpers
  2018-04-18  0:59             ` Luis R. Rodriguez
  2018-04-18 10:12               ` Jan Kara
@ 2018-04-20 18:49               ` Luis R. Rodriguez
  2018-04-21 23:53                 ` Jan Kara
  1 sibling, 1 reply; 47+ messages in thread
From: Luis R. Rodriguez @ 2018-04-20 18:49 UTC (permalink / raw)
  To: Luis R. Rodriguez
  Cc: Jan Kara, viro, bart.vanassche, ming.lei, tytso, darrick.wong,
	jikos, rjw, pavel, len.brown, linux-fsdevel, boris.ostrovsky,
	jgross, todd.e.brandt, nborisov, martin.petersen, ONeukum,
	oleksandr, oleg.b.antonyan, yu.chen.surf, dan.j.williams,
	linux-pm, linux-block, linux-xfs, linux-kernel, jlayton

On Tue, Apr 17, 2018 at 05:59:36PM -0700, Luis R. Rodriguez wrote:
> On Thu, Dec 21, 2017 at 12:03:29PM +0100, Jan Kara wrote:
> > Hello,
> > 
> > I think I owe you a reply here... Sorry that it took so long.
> 
> Took me just as long :)
> 
> > On Fri 01-12-17 22:13:27, Luis R. Rodriguez wrote:
> > > 
> > > I'll note that its still not perfectly clear if really the semantics behind
> > > freeze_bdev() match what I described above fully. That still needs to be
> > > vetted for. For instance, does thaw_bdev() keep a superblock frozen if we
> > > an ioctl initiated freeze had occurred before? If so then great. Otherwise
> > > I think we'll need to distinguish the ioctl interface. Worst possible case
> > > is that bdev semantics and in-kernel semantics differ somehow, then that
> > > will really create a holy fucking mess.
> > 
> > I believe nobody really thought about mixing those two interfaces to fs
> > freezing and so the behavior is basically defined by the implementation.
> > That is:
> > 
> > freeze_bdev() on sb frozen by ioctl_fsfreeze() -> EBUSY
> > freeze_bdev() on sb frozen by freeze_bdev() -> success
> > ioctl_fsfreeze() on sb frozen by freeze_bdev() -> EBUSY
> > ioctl_fsfreeze() on sb frozen by ioctl_fsfreeze() -> EBUSY
> > 
> > thaw_bdev() on sb frozen by ioctl_fsfreeze() -> EINVAL
> 
> Phew, so this is what we want for the in-kernel freezing so we're good
> and *can* combine these then.

I double checked, and I don't see where you get EINVAL for this case.
We *do* keep the sb frozen though, which is good, and the worst fear
I had was that we did not. However we return 0 if there was already
a prior freeze_bdev() or ioctl_fsfreeze() other than the context that
started the prior freeze (--bdev->bd_fsfreeze_count > 0).

The -EINVAL is only returned currently if there were no freezers.

int thaw_bdev(struct block_device *bdev, struct super_block *sb)
{
	int error = -EINVAL;

	mutex_lock(&bdev->bd_fsfreeze_mutex);
	if (!bdev->bd_fsfreeze_count)
		goto out;

	error = 0;
	if (--bdev->bd_fsfreeze_count > 0)
		goto out;
	...
out:
	mutex_unlock(&bdev->bd_fsfreeze_mutex);
	return error;
}

  Luis

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH 03/11] fs: add frozen sb state helpers
  2018-04-20 18:49               ` Luis R. Rodriguez
@ 2018-04-21 23:53                 ` Jan Kara
  2018-04-22  1:22                   ` Luis R. Rodriguez
  0 siblings, 1 reply; 47+ messages in thread
From: Jan Kara @ 2018-04-21 23:53 UTC (permalink / raw)
  To: Luis R. Rodriguez
  Cc: Jan Kara, viro, bart.vanassche, ming.lei, tytso, darrick.wong,
	jikos, rjw, pavel, len.brown, linux-fsdevel, boris.ostrovsky,
	jgross, todd.e.brandt, nborisov, martin.petersen, ONeukum,
	oleksandr, oleg.b.antonyan, yu.chen.surf, dan.j.williams,
	linux-pm, linux-block, linux-xfs, linux-kernel, jlayton

On Fri 20-04-18 11:49:32, Luis R. Rodriguez wrote:
> On Tue, Apr 17, 2018 at 05:59:36PM -0700, Luis R. Rodriguez wrote:
> > On Thu, Dec 21, 2017 at 12:03:29PM +0100, Jan Kara wrote:
> > > Hello,
> > > 
> > > I think I owe you a reply here... Sorry that it took so long.
> > 
> > Took me just as long :)
> > 
> > > On Fri 01-12-17 22:13:27, Luis R. Rodriguez wrote:
> > > > 
> > > > I'll note that its still not perfectly clear if really the semantics behind
> > > > freeze_bdev() match what I described above fully. That still needs to be
> > > > vetted for. For instance, does thaw_bdev() keep a superblock frozen if we
> > > > an ioctl initiated freeze had occurred before? If so then great. Otherwise
> > > > I think we'll need to distinguish the ioctl interface. Worst possible case
> > > > is that bdev semantics and in-kernel semantics differ somehow, then that
> > > > will really create a holy fucking mess.
> > > 
> > > I believe nobody really thought about mixing those two interfaces to fs
> > > freezing and so the behavior is basically defined by the implementation.
> > > That is:
> > > 
> > > freeze_bdev() on sb frozen by ioctl_fsfreeze() -> EBUSY
> > > freeze_bdev() on sb frozen by freeze_bdev() -> success
> > > ioctl_fsfreeze() on sb frozen by freeze_bdev() -> EBUSY
> > > ioctl_fsfreeze() on sb frozen by ioctl_fsfreeze() -> EBUSY
> > > 
> > > thaw_bdev() on sb frozen by ioctl_fsfreeze() -> EINVAL
> > 
> > Phew, so this is what we want for the in-kernel freezing so we're good
> > and *can* combine these then.
> 
> I double checked, and I don't see where you get EINVAL for this case.
> We *do* keep the sb frozen though, which is good, and the worst fear
> I had was that we did not. However we return 0 if there was already
> a prior freeze_bdev() or ioctl_fsfreeze() other than the context that
> started the prior freeze (--bdev->bd_fsfreeze_count > 0).
> 
> The -EINVAL is only returned currently if there were no freezers.
> 
> int thaw_bdev(struct block_device *bdev, struct super_block *sb)
> {
> 	int error = -EINVAL;
> 
> 	mutex_lock(&bdev->bd_fsfreeze_mutex);
> 	if (!bdev->bd_fsfreeze_count)
> 		goto out;

But this is precisely where we'd bail if we freeze sb by ioctl_fsfreeze()
but try to thaw by thaw_bdev(). ioctl_fsfreeze() does not touch
bd_fsfreeze_count...

								Honza

-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH 03/11] fs: add frozen sb state helpers
  2018-04-21 23:53                 ` Jan Kara
@ 2018-04-22  1:22                   ` Luis R. Rodriguez
  0 siblings, 0 replies; 47+ messages in thread
From: Luis R. Rodriguez @ 2018-04-22  1:22 UTC (permalink / raw)
  To: Jan Kara
  Cc: Luis R. Rodriguez, viro, bart.vanassche, ming.lei, tytso,
	darrick.wong, jikos, rjw, pavel, len.brown, linux-fsdevel,
	boris.ostrovsky, jgross, todd.e.brandt, nborisov,
	martin.petersen, ONeukum, oleksandr, oleg.b.antonyan,
	yu.chen.surf, dan.j.williams, linux-pm, linux-block, linux-xfs,
	linux-kernel, jlayton

On Sun, Apr 22, 2018 at 01:53:23AM +0200, Jan Kara wrote:
> On Fri 20-04-18 11:49:32, Luis R. Rodriguez wrote:
> > On Tue, Apr 17, 2018 at 05:59:36PM -0700, Luis R. Rodriguez wrote:
> > > On Thu, Dec 21, 2017 at 12:03:29PM +0100, Jan Kara wrote:
> > > > Hello,
> > > > 
> > > > I think I owe you a reply here... Sorry that it took so long.
> > > 
> > > Took me just as long :)
> > > 
> > > > On Fri 01-12-17 22:13:27, Luis R. Rodriguez wrote:
> > > > > 
> > > > > I'll note that its still not perfectly clear if really the semantics behind
> > > > > freeze_bdev() match what I described above fully. That still needs to be
> > > > > vetted for. For instance, does thaw_bdev() keep a superblock frozen if we
> > > > > an ioctl initiated freeze had occurred before? If so then great. Otherwise
> > > > > I think we'll need to distinguish the ioctl interface. Worst possible case
> > > > > is that bdev semantics and in-kernel semantics differ somehow, then that
> > > > > will really create a holy fucking mess.
> > > > 
> > > > I believe nobody really thought about mixing those two interfaces to fs
> > > > freezing and so the behavior is basically defined by the implementation.
> > > > That is:
> > > > 
> > > > freeze_bdev() on sb frozen by ioctl_fsfreeze() -> EBUSY
> > > > freeze_bdev() on sb frozen by freeze_bdev() -> success
> > > > ioctl_fsfreeze() on sb frozen by freeze_bdev() -> EBUSY
> > > > ioctl_fsfreeze() on sb frozen by ioctl_fsfreeze() -> EBUSY
> > > > 
> > > > thaw_bdev() on sb frozen by ioctl_fsfreeze() -> EINVAL
> > > 
> > > Phew, so this is what we want for the in-kernel freezing so we're good
> > > and *can* combine these then.
> > 
> > I double checked, and I don't see where you get EINVAL for this case.
> > We *do* keep the sb frozen though, which is good, and the worst fear
> > I had was that we did not. However we return 0 if there was already
> > a prior freeze_bdev() or ioctl_fsfreeze() other than the context that
> > started the prior freeze (--bdev->bd_fsfreeze_count > 0).
> > 
> > The -EINVAL is only returned currently if there were no freezers.
> > 
> > int thaw_bdev(struct block_device *bdev, struct super_block *sb)
> > {
> > 	int error = -EINVAL;
> > 
> > 	mutex_lock(&bdev->bd_fsfreeze_mutex);
> > 	if (!bdev->bd_fsfreeze_count)
> > 		goto out;
> 
> But this is precisely where we'd bail if we freeze sb by ioctl_fsfreeze()
> but try to thaw by thaw_bdev(). ioctl_fsfreeze() does not touch
> bd_fsfreeze_count...

Ah, yes, I see that now, thanks!

  Luis

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH 03/11] fs: add frozen sb state helpers
  2017-11-30 17:13   ` Jan Kara
  2017-11-30 19:05     ` Luis R. Rodriguez
@ 2018-04-22  2:53     ` Luis R. Rodriguez
  1 sibling, 0 replies; 47+ messages in thread
From: Luis R. Rodriguez @ 2018-04-22  2:53 UTC (permalink / raw)
  To: Jan Kara
  Cc: Luis R. Rodriguez, viro, bart.vanassche, ming.lei, tytso,
	darrick.wong, jikos, rjw, pavel, len.brown, linux-fsdevel,
	boris.ostrovsky, jgross, todd.e.brandt, nborisov,
	martin.petersen, ONeukum, oleksandr, oleg.b.antonyan,
	yu.chen.surf, dan.j.williams, linux-pm, linux-block, linux-xfs,
	linux-kernel

On Thu, Nov 30, 2017 at 06:13:10PM +0100, Jan Kara wrote:
> On Wed 29-11-17 15:23:48, Luis R. Rodriguez wrote:
> > The question of whether or not a superblock is frozen needs to be
> > augmented in the future to account for differences between a user
> > initiated freeze and a kernel initiated freeze done automatically
> > on behalf of the kernel.
> > 
> > Provide helpers so that these can be used instead so that we don't
> > have to expand checks later in these same call sites as we expand
> > the definition of a frozen superblock.
> > 
> > Signed-off-by: Luis R. Rodriguez <mcgrof@kernel.org>
> 
> So helpers are fine but...
> 
> > +/**
> > + * sb_is_frozen_by_user - is superblock frozen by a user call
> > + * @sb: the super to check
> > + *
> > + * Returns true if the super freeze was initiated by userspace, for instance,
> > + * an ioctl call.
> > + */
> > +static inline bool sb_is_frozen_by_user(struct super_block *sb)
> > +{
> > +	return sb->s_writers.frozen == SB_FREEZE_COMPLETE;
> > +}
> 
> ... I dislike the _by_user() suffix as there may be different places that
> call freeze_super() (e.g. device mapper does this during some operations).
> Clearly we need to distinguish "by system suspend" and "the other" cases.
> So please make this clear in the naming.
> 
> In fact, what might be a cleaner solution is to introduce a 'freeze_count'
> for superblock freezing (we already do have this for block devices). Then
> you don't need to differentiate these two cases - but you'd still need to
> properly handle cleanup if freezing of all superblocks fails in the middle.
> So I'm not 100% this works out nicely in the end. But it's certainly worth
> a consideration.

Seems reasonable.

  Luis

^ permalink raw reply	[flat|nested] 47+ messages in thread

end of thread, other threads:[~2018-04-22  2:53 UTC | newest]

Thread overview: 47+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-11-29 23:23 [PATCH 00/11] fs: use freeze_fs on suspend/hibernate Luis R. Rodriguez
2017-11-29 23:23 ` [PATCH 01/11] fs: provide unlocked helper for freeze_super() Luis R. Rodriguez
2017-11-30 16:58   ` Jan Kara
2017-11-29 23:23 ` [PATCH 02/11] fs: provide unlocked helper thaw_super() Luis R. Rodriguez
2017-11-30 16:59   ` Jan Kara
2017-11-29 23:23 ` [PATCH 03/11] fs: add frozen sb state helpers Luis R. Rodriguez
2017-11-30 17:13   ` Jan Kara
2017-11-30 19:05     ` Luis R. Rodriguez
2017-12-01 11:47       ` Jan Kara
2017-12-01 21:13         ` Luis R. Rodriguez
2017-12-21 11:03           ` Jan Kara
2018-04-18  0:59             ` Luis R. Rodriguez
2018-04-18 10:12               ` Jan Kara
2018-04-20 18:49               ` Luis R. Rodriguez
2018-04-21 23:53                 ` Jan Kara
2018-04-22  1:22                   ` Luis R. Rodriguez
2018-04-22  2:53     ` Luis R. Rodriguez
2017-11-29 23:23 ` [PATCH 04/11] fs: distinguish between user initiated freeze and kernel initiated freeze Luis R. Rodriguez
2017-11-29 23:23 ` [PATCH 05/11] fs: add iterate_supers_excl() and iterate_supers_reverse_excl() Luis R. Rodriguez
2017-11-29 23:48   ` Rafael J. Wysocki
2017-11-30  0:22     ` Luis R. Rodriguez
2017-11-30  1:34     ` Dave Chinner
2017-11-30  1:40       ` Rafael J. Wysocki
2017-11-30 16:57   ` Jan Kara
2017-11-29 23:23 ` [PATCH 06/11] fs: freeze on suspend and thaw on resume Luis R. Rodriguez
2017-11-29 23:23 ` [PATCH 07/11] xfs: remove not needed freezing calls Luis R. Rodriguez
2017-11-30 16:21   ` Jan Kara
2017-11-30 20:32     ` Rafael J. Wysocki
2017-11-30 23:30       ` Dave Chinner
2017-11-30 23:40         ` Rafael J. Wysocki
2017-11-29 23:23 ` [PATCH 08/11] ext4: " Luis R. Rodriguez
2017-11-29 23:23 ` [PATCH 09/11] f2fs: " Luis R. Rodriguez
2017-11-29 23:23 ` [PATCH 10/11] nilfs2: " Luis R. Rodriguez
2017-11-29 23:23 ` [PATCH 11/11] jfs: " Luis R. Rodriguez
2017-11-30 16:36 ` [PATCH 00/11] fs: use freeze_fs on suspend/hibernate Yu Chen
2017-11-30 16:41   ` Jiri Kosina
2017-11-30 16:50     ` Yu Chen
2017-12-01 19:05     ` Jeff Layton
2017-12-01 21:51       ` Dave Chinner
2017-11-30 17:01 ` Bart Van Assche
2017-11-30 19:42   ` Luis R. Rodriguez
2017-11-30 20:53     ` Bart Van Assche
2017-11-30 21:03       ` Dave Chinner
2017-11-30 21:51 ` Pavel Machek
2017-12-01  0:44   ` Luis R. Rodriguez
2017-12-13  1:09 ` Rafael J. Wysocki
2017-12-19 16:50   ` Luis R. Rodriguez

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).