linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [RFC 0/5] fs: replace kthread freezing with filesystem freeze/thaw
@ 2017-10-03 18:53 Luis R. Rodriguez
  2017-10-03 18:53 ` [RFC 1/5] fs: add iterate_supers_reverse() Luis R. Rodriguez
                   ` (5 more replies)
  0 siblings, 6 replies; 53+ messages in thread
From: Luis R. Rodriguez @ 2017-10-03 18:53 UTC (permalink / raw)
  To: viro, bart.vanassche, ming.lei, tytso, darrick.wong, jikos, rjw,
	pavel, len.brown, linux-fsdevel
  Cc: boris.ostrovsky, jgross, todd.e.brandt, nborisov, jack,
	martin.petersen, ONeukum, oleksandr, oleg.b.antonyan, linux-pm,
	linux-block, linux-xfs, linux-kernel, Luis R. Rodriguez

At the 2015 South Korea kernel summit Jiri Kosina had pointd out the issue of
the sloppy semantics of the kthread freezer, lwn covered this pretty well [0].
In short, he explained how the original purpose of the freezer was to aid
in going to suspend to ensure no uwanted IO activity would cause filesystem
corruption. Kernel kthreads require special freezer handling though, the call
try_to_freeze() is often sprinkled at strategic places, but sometimes this is
done without set_freezable() making try_to_freeze() useless. Other helpers such
as freezable_schedule_timeout() exist, and very likely they are not used in any
consistent and proper way either all over the kernel. Dealing with these
helpers alone also does not and cannot ensure that everything that has been
spawned asynchronously from a kthread (such as timers) are stopped as well,
these are left to the kthread user implementation, and chances are pretty high
there are many bugs lurking here. There are even non-IO bounds kthreads now
using the freezer APIs, where this is not even needed!

Jiri suggested we can easily replace the complexities of the kthread freezer
by just using the existing filesystem freeze/thaw callbacks on hibernation and
suspend.

I've picked up Jiri's work given there are bugs which after inspection don't
see like real bugs, but just random IO loosely waiting to be completed and the
kernel not really being able to tell me who the culprit is. In fact even if one
plugs a fix, one ends up in another random place and its really unclear who is
to blaim for next.

For instance, to reproduce a simple suspend bug on what may at first seem to be
an XFS bug, one can issue a dd onto disk prior to suspend, and we'll get a
stall on our way to suspend, claiming the issue was the SCSI layer not being
able to quiesce the disk. This was reported on OpenSUSE and reproduced on
linux-next easily [1]. The following script can be run while we loop on
systemctl suspend and qemu system_wakeup calls to resume:

	while true; do
		dd if=/dev/zero of=crap bs=1M count=1024 &> /dev/null
	done

You end up with with a hung suspend attempt, and eventually a splat
as follows with a hunk task notification:

INFO: task kworker/u8:8:1320 blocked for more than 10 seconds.
      Tainted: G            E   4.13.0-next-20170907+ #88
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
kworker/u8:8    D    0  1320      2 0x80000000
Workqueue: events_unbound async_run_entry_fn
Call Trace:
 __schedule+0x2ec/0x7a0
 schedule+0x36/0x80
 io_schedule+0x16/0x40
 get_request+0x278/0x780
 ? remove_wait_queue+0x70/0x70
 blk_get_request+0x9c/0x110
 scsi_execute+0x7a/0x310 [scsi_mod]
 sd_sync_cache+0xa3/0x190 [sd_mod]
 ? blk_run_queue+0x3f/0x50
 sd_suspend_common+0x7b/0x130 [sd_mod]
 ? scsi_print_result+0x270/0x270 [scsi_mod]
 sd_suspend_system+0x13/0x20 [sd_mod]
 do_scsi_suspend+0x1b/0x30 [scsi_mod]
 scsi_bus_suspend_common+0xb1/0xd0 [scsi_mod]
 ? device_for_each_child+0x69/0x90
 scsi_bus_suspend+0x15/0x20 [scsi_mod]
 dpm_run_callback+0x56/0x140
 ? scsi_bus_freeze+0x20/0x20 [scsi_mod]
 __device_suspend+0xf1/0x340
 async_suspend+0x1f/0xa0
 async_run_entry_fn+0x38/0x160
 process_one_work+0x191/0x380
 worker_thread+0x4e/0x3c0
 kthread+0x109/0x140
 ? process_one_work+0x380/0x380
 ? kthread_create_on_node+0x70/0x70
 ret_from_fork+0x25/0x30

Sprinkling freezer_do_not_count() on scsi_execute() *iff* we're going to
suspend is one way to fix this splat, however this just pushes the bug
elsewhere, and we keep doing this ever on. The right thing to do here is ensure
that after userpsace is frozen we also freeze all filesystem behaviour.

Its possible other symptoms have been observed elsewhere, and similar work
arounds are being devised. Perhaps this work can help put a stop gap for such
work arounds and complexities.

While this all seems trivial, well discussed, and so it should be well accepted
for us to go full swing with a replacement of the kthread freezer with
filesystem suspend / thaw -- not all filesystems have a respective freeze/thaw
call. Also in practice it would seem some thaw'ing calls need some revision.
For instance thawing with a sync call on ext4 currently issues a bio_submit(),
which in turn stalls on resume. Each call/filesystem then should be vetted
manually, as I've done for ext4 in this series.

Likewise while kthread freezing may have been designed to help with avoiding IO
on disk, its sloppy use on kthreads may now be providing buggy functionality
which *helps* somehow now. The removal of freezer helpers on kthreads which are
not IO bound should also be carefully vetted for to avoid potential
regressions.

Then we have the few areas in the kernel which may generate disk IO which do
not come from filesystems -- this was mentioned recently the the Alpine Linux
Persistence and Storage Summit [2] -- such drivers would need to get proper
suspend calls eventually.

Given all the above then I'd like to move *carefully* with the kthread freezer
replacement with filesystem freeze/thawing instead of doing it all in one shot,
as Jiri had originally intended.

Filesystmes which have been vetted to work properly can use the super block
FS_FREEZE_ON_SUSPEND flag when things are ready as a transitional flag. I hope
we can get to the point we can rely on all filesystmes supporting it, so that
once all filesystem code is vetted for, and all non-IO bound kthread callers
have been cleared of the freezer call we can just remove the
FS_FREEZE_ON_SUSPEND flag and then kill the kthread freezer completely (the
last patch in this series).

The way this would be merged then is, only the first four patches would
be considered for upstream at first. We'd then move on to do the same with
other filesystems. Then or in parallel move on to tackle vetting removal
of all disk non-IO bounds kthread freezer callers. Only once this is all
done should we consider the last patch to put final nail on the kthread
freezer coffin.

[0] https://lwn.net/Articles/662703/
[1] https://bugzilla.suse.com/show_bug.cgi?id=1043449
[2] http://alpss.at/

Thoughts? Rants?

Luis R. Rodriguez (5):
  fs: add iterate_supers_reverse()
  fs: freeze on suspend and thaw on resume
  xfs: allow fs freeze on suspend/hibernation
  ext4: add fs freezing support on suspend/hibernation
  pm: remove kernel thread freezing

 Documentation/power/freezing-of-tasks.txt |   9 ---
 drivers/xen/manage.c                      |   6 --
 fs/ext4/super.c                           |  13 ++--
 fs/super.c                                | 122 ++++++++++++++++++++++++++++++
 fs/xfs/xfs_super.c                        |   3 +-
 fs/xfs/xfs_trans_ail.c                    |   7 +-
 include/linux/freezer.h                   |   4 -
 include/linux/fs.h                        |  14 ++++
 kernel/power/hibernate.c                  |  10 +--
 kernel/power/power.h                      |  20 +----
 kernel/power/process.c                    |  61 ++++-----------
 kernel/power/user.c                       |   9 ---
 tools/power/pm-graph/analyze_suspend.py   |   1 -
 13 files changed, 163 insertions(+), 116 deletions(-)

-- 
2.14.0

^ permalink raw reply	[flat|nested] 53+ messages in thread

* [RFC 1/5] fs: add iterate_supers_reverse()
  2017-10-03 18:53 [RFC 0/5] fs: replace kthread freezing with filesystem freeze/thaw Luis R. Rodriguez
@ 2017-10-03 18:53 ` Luis R. Rodriguez
  2017-10-03 18:53 ` [RFC 2/5] fs: freeze on suspend and thaw on resume Luis R. Rodriguez
                   ` (4 subsequent siblings)
  5 siblings, 0 replies; 53+ messages in thread
From: Luis R. Rodriguez @ 2017-10-03 18:53 UTC (permalink / raw)
  To: viro, bart.vanassche, ming.lei, tytso, darrick.wong, jikos, rjw,
	pavel, len.brown, linux-fsdevel
  Cc: boris.ostrovsky, jgross, todd.e.brandt, nborisov, jack,
	martin.petersen, ONeukum, oleksandr, oleg.b.antonyan, linux-pm,
	linux-block, linux-xfs, linux-kernel, Luis R. Rodriguez

There are use cases where we wish to traverse the superblock list
in reverse order. This particular implementation will also enable
to capture errors.

Signed-off-by: Luis R. Rodriguez <mcgrof@kernel.org>
---
 fs/super.c         | 43 +++++++++++++++++++++++++++++++++++++++++++
 include/linux/fs.h |  1 +
 2 files changed, 44 insertions(+)

diff --git a/fs/super.c b/fs/super.c
index 02da00410de8..d45e92d9a38f 100644
--- a/fs/super.c
+++ b/fs/super.c
@@ -609,6 +609,49 @@ void iterate_supers(void (*f)(struct super_block *, void *), void *arg)
 	spin_unlock(&sb_lock);
 }
 
+/**
+ *	iterate_supers_reverse - call function for active superblocks in reverse
+ *	@f: function to call
+ *	@arg: argument to pass to it
+ *
+ *	Scans the superblock list and calls given function, passing it
+ *	locked superblock and given argument, in reverse order. Returns if
+ *	an error occurred.
+ */
+int iterate_supers_reverse(int (*f)(struct super_block *, void *), void *arg)
+{
+	struct super_block *sb, *p = NULL;
+	int error = 0;
+
+	spin_lock(&sb_lock);
+	list_for_each_entry_reverse(sb, &super_blocks, s_list) {
+		if (hlist_unhashed(&sb->s_instances))
+			continue;
+		sb->s_count++;
+		spin_unlock(&sb_lock);
+
+		down_read(&sb->s_umount);
+		if (sb->s_root && (sb->s_flags & SB_BORN)) {
+			error = f(sb, arg);
+			if (error) {
+				spin_lock(&sb_lock);
+				break;
+			}
+		}
+		up_read(&sb->s_umount);
+
+		spin_lock(&sb_lock);
+		if (p)
+			__put_super(p);
+		p = sb;
+	}
+	if (p)
+		__put_super(p);
+	spin_unlock(&sb_lock);
+
+	return error;
+}
+
 /**
  *	iterate_supers_type - call function for superblocks of given type
  *	@type: fs type
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 46796b2f956b..cd084792cf39 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -3097,6 +3097,7 @@ extern struct super_block *get_active_super(struct block_device *bdev);
 extern void drop_super(struct super_block *sb);
 extern void drop_super_exclusive(struct super_block *sb);
 extern void iterate_supers(void (*)(struct super_block *, void *), void *);
+extern int iterate_supers_reverse(int (*)(struct super_block *, void *), void *);
 extern void iterate_supers_type(struct file_system_type *,
 			        void (*)(struct super_block *, void *), void *);
 
-- 
2.14.0

^ permalink raw reply related	[flat|nested] 53+ messages in thread

* [RFC 2/5] fs: freeze on suspend and thaw on resume
  2017-10-03 18:53 [RFC 0/5] fs: replace kthread freezing with filesystem freeze/thaw Luis R. Rodriguez
  2017-10-03 18:53 ` [RFC 1/5] fs: add iterate_supers_reverse() Luis R. Rodriguez
@ 2017-10-03 18:53 ` Luis R. Rodriguez
  2017-10-03 20:02   ` Bart Van Assche
                     ` (2 more replies)
  2017-10-03 18:53 ` [RFC 3/5] xfs: allow fs freeze on suspend/hibernation Luis R. Rodriguez
                   ` (3 subsequent siblings)
  5 siblings, 3 replies; 53+ messages in thread
From: Luis R. Rodriguez @ 2017-10-03 18:53 UTC (permalink / raw)
  To: viro, bart.vanassche, ming.lei, tytso, darrick.wong, jikos, rjw,
	pavel, len.brown, linux-fsdevel
  Cc: boris.ostrovsky, jgross, todd.e.brandt, nborisov, jack,
	martin.petersen, ONeukum, oleksandr, oleg.b.antonyan, linux-pm,
	linux-block, linux-xfs, linux-kernel, Luis R. Rodriguez

This uses the existing filesystem freeze and thaw callbacks to
freeze each filesystem on suspend/hibernation and thaw upon resume.

This is needed so that we properly really stop IO in flight without
races after userspace has been frozen. Without this we rely on
kthread freezing and its semantics are loose and error prone.
For instance, even though a kthread may use try_to_freeze() and end
up being frozen we have no way of being sure that everything that
has been spawned asynchronously from it (such as timers) have also
been stopped as well.

A long term advantage of also adding filesystem freeze / thawing
supporting durign suspend / hibernation is that long term we may
be able to eventually drop the kernel's thread freezing completely
as it was originally added to stop disk IO in flight as we hibernate
or suspend.

This also implies that many kthread users exist which have been
adding freezer semantics onto its kthreads without need. These also
will need to be reviewed later.

This is based on prior work originally by Rafael Wysocki and later by
Jiri Kosina.

Signed-off-by: Luis R. Rodriguez <mcgrof@kernel.org>
---
 fs/super.c             | 79 ++++++++++++++++++++++++++++++++++++++++++++++++++
 include/linux/fs.h     | 13 +++++++++
 kernel/power/process.c | 14 ++++++++-
 3 files changed, 105 insertions(+), 1 deletion(-)

diff --git a/fs/super.c b/fs/super.c
index d45e92d9a38f..ce8da8b187b1 100644
--- a/fs/super.c
+++ b/fs/super.c
@@ -1572,3 +1572,82 @@ int thaw_super(struct super_block *sb)
 	return 0;
 }
 EXPORT_SYMBOL(thaw_super);
+
+#ifdef CONFIG_PM_SLEEP
+static bool super_allows_freeze(struct super_block *sb)
+{
+	return !!(sb->s_type->fs_flags & FS_FREEZE_ON_SUSPEND);
+}
+
+static bool super_should_freeze(struct super_block *sb)
+{
+	if (!sb->s_root)
+		return false;
+	if (!(sb->s_flags & MS_BORN))
+		return false;
+	/*
+	 * We don't freeze virtual filesystems, we skip those filesystems with
+	 * no backing device.
+	 */
+	if (sb->s_bdi == &noop_backing_dev_info)
+		return false;
+	/* No need to freeze read-only filesystems */
+	if (sb->s_flags & MS_RDONLY)
+		return false;
+
+	if (!super_allows_freeze(sb))
+		return false;
+
+	return true;
+}
+
+int fs_suspend_freeze_sb(struct super_block *sb, void *priv)
+{
+	int error = 0;
+
+	spin_lock(&sb_lock);
+	if (!super_should_freeze(sb))
+		goto out;
+
+	up_read(&sb->s_umount);
+	pr_info("%s (%s): freezing\n", sb->s_type->name, sb->s_id);
+	error = freeze_super(sb);
+	down_read(&sb->s_umount);
+out:
+	if (error && error != -EBUSY)
+		pr_notice("%s (%s): Unable to freeze, error=%d",
+			  sb->s_type->name, sb->s_id, error);
+	spin_unlock(&sb_lock);
+	return error;
+}
+
+int fs_suspend_freeze(void)
+{
+	return iterate_supers_reverse(fs_suspend_freeze_sb, NULL);
+}
+
+void fs_resume_unfreeze_sb(struct super_block *sb, void *priv)
+{
+	int error = 0;
+
+	spin_lock(&sb_lock);
+	if (!super_should_freeze(sb))
+		goto out;
+
+	up_read(&sb->s_umount);
+	pr_info("%s (%s): thawing\n", sb->s_type->name, sb->s_id);
+	error = thaw_super(sb);
+	down_read(&sb->s_umount);
+out:
+	if (error && error != -EBUSY)
+		pr_notice("%s (%s): Unable to unfreeze, error=%d",
+			  sb->s_type->name, sb->s_id, error);
+	spin_unlock(&sb_lock);
+}
+
+void fs_resume_unfreeze(void)
+{
+	iterate_supers(fs_resume_unfreeze_sb, NULL);
+}
+
+#endif
diff --git a/include/linux/fs.h b/include/linux/fs.h
index cd084792cf39..7754230d6afc 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -2071,6 +2071,7 @@ struct file_system_type {
 #define FS_BINARY_MOUNTDATA	2
 #define FS_HAS_SUBTYPE		4
 #define FS_USERNS_MOUNT		8	/* Can be mounted by userns root */
+#define FS_FREEZE_ON_SUSPEND   16      /* should filesystem freeze on suspend? */
 #define FS_RENAME_DOES_D_MOVE	32768	/* FS will handle d_move() during rename() internally. */
 	struct dentry *(*mount) (struct file_system_type *, int,
 		       const char *, void *);
@@ -2172,6 +2173,18 @@ extern int fd_statfs(int, struct kstatfs *);
 extern int vfs_ustat(dev_t, struct kstatfs *);
 extern int freeze_super(struct super_block *super);
 extern int thaw_super(struct super_block *super);
+#ifdef CONFIG_PM_SLEEP
+int fs_suspend_freeze(void);
+void fs_resume_unfreeze(void);
+#else
+static inline int fs_suspend_freeze(void)
+{
+	return 0;
+}
+static inline void fs_resume_unfreeze(void)
+{
+}
+#endif
 extern bool our_mnt(struct vfsmount *mnt);
 extern __printf(2, 3)
 int super_setup_bdi_name(struct super_block *sb, char *fmt, ...);
diff --git a/kernel/power/process.c b/kernel/power/process.c
index 50f25cb370c6..9d1277768312 100644
--- a/kernel/power/process.c
+++ b/kernel/power/process.c
@@ -144,6 +144,15 @@ int freeze_processes(void)
 	pr_cont("\n");
 	BUG_ON(in_atomic());
 
+	pr_info("Freezing filesystems ... ");
+	error = fs_suspend_freeze();
+	if (error) {
+		pr_cont("failed\n");
+		fs_resume_unfreeze();
+		return error;
+	}
+	pr_cont("done.\n");
+
 	/*
 	 * Now that the whole userspace is frozen we need to disbale
 	 * the OOM killer to disallow any further interference with
@@ -153,8 +162,10 @@ int freeze_processes(void)
 	if (!error && !oom_killer_disable(msecs_to_jiffies(freeze_timeout_msecs)))
 		error = -EBUSY;
 
-	if (error)
+	if (error) {
 		thaw_processes();
+		fs_resume_unfreeze();
+	}
 	return error;
 }
 
@@ -197,6 +208,7 @@ void thaw_processes(void)
 	pm_nosig_freezing = false;
 
 	oom_killer_enable();
+	fs_resume_unfreeze();
 
 	pr_info("Restarting tasks ... ");
 
-- 
2.14.0

^ permalink raw reply related	[flat|nested] 53+ messages in thread

* [RFC 3/5] xfs: allow fs freeze on suspend/hibernation
  2017-10-03 18:53 [RFC 0/5] fs: replace kthread freezing with filesystem freeze/thaw Luis R. Rodriguez
  2017-10-03 18:53 ` [RFC 1/5] fs: add iterate_supers_reverse() Luis R. Rodriguez
  2017-10-03 18:53 ` [RFC 2/5] fs: freeze on suspend and thaw on resume Luis R. Rodriguez
@ 2017-10-03 18:53 ` Luis R. Rodriguez
  2017-10-03 18:53 ` [RFC 4/5] ext4: add fs freezing support " Luis R. Rodriguez
                   ` (2 subsequent siblings)
  5 siblings, 0 replies; 53+ messages in thread
From: Luis R. Rodriguez @ 2017-10-03 18:53 UTC (permalink / raw)
  To: viro, bart.vanassche, ming.lei, tytso, darrick.wong, jikos, rjw,
	pavel, len.brown, linux-fsdevel
  Cc: boris.ostrovsky, jgross, todd.e.brandt, nborisov, jack,
	martin.petersen, ONeukum, oleksandr, oleg.b.antonyan, linux-pm,
	linux-block, linux-xfs, linux-kernel, Luis R. Rodriguez

This also removes the freezer calls on the XFS kthread as they are
no longer needed.

Signed-off-by: Luis R. Rodriguez <mcgrof@kernel.org>
---
 fs/xfs/xfs_super.c     | 3 ++-
 fs/xfs/xfs_trans_ail.c | 7 ++-----
 2 files changed, 4 insertions(+), 6 deletions(-)

diff --git a/fs/xfs/xfs_super.c b/fs/xfs/xfs_super.c
index 64013b2db8e0..75c3415657eb 100644
--- a/fs/xfs/xfs_super.c
+++ b/fs/xfs/xfs_super.c
@@ -1775,7 +1775,8 @@ static struct file_system_type xfs_fs_type = {
 	.name			= "xfs",
 	.mount			= xfs_fs_mount,
 	.kill_sb		= kill_block_super,
-	.fs_flags		= FS_REQUIRES_DEV,
+	.fs_flags		= FS_REQUIRES_DEV |
+				  FS_FREEZE_ON_SUSPEND,
 };
 MODULE_ALIAS_FS("xfs");
 
diff --git a/fs/xfs/xfs_trans_ail.c b/fs/xfs/xfs_trans_ail.c
index 354368a906e5..8cebda88e91a 100644
--- a/fs/xfs/xfs_trans_ail.c
+++ b/fs/xfs/xfs_trans_ail.c
@@ -512,7 +512,6 @@ xfsaild(
 	long		tout = 0;	/* milliseconds */
 
 	current->flags |= PF_MEMALLOC;
-	set_freezable();
 
 	while (!kthread_should_stop()) {
 		if (tout && tout <= 20)
@@ -535,19 +534,17 @@ xfsaild(
 		if (!xfs_ail_min(ailp) &&
 		    ailp->xa_target == ailp->xa_target_prev) {
 			spin_unlock(&ailp->xa_lock);
-			freezable_schedule();
+			schedule();
 			tout = 0;
 			continue;
 		}
 		spin_unlock(&ailp->xa_lock);
 
 		if (tout)
-			freezable_schedule_timeout(msecs_to_jiffies(tout));
+			schedule_timeout(msecs_to_jiffies(tout));
 
 		__set_current_state(TASK_RUNNING);
 
-		try_to_freeze();
-
 		tout = xfsaild_push(ailp);
 	}
 
-- 
2.14.0

^ permalink raw reply related	[flat|nested] 53+ messages in thread

* [RFC 4/5] ext4: add fs freezing support on suspend/hibernation
  2017-10-03 18:53 [RFC 0/5] fs: replace kthread freezing with filesystem freeze/thaw Luis R. Rodriguez
                   ` (2 preceding siblings ...)
  2017-10-03 18:53 ` [RFC 3/5] xfs: allow fs freeze on suspend/hibernation Luis R. Rodriguez
@ 2017-10-03 18:53 ` Luis R. Rodriguez
  2017-10-03 19:59   ` Theodore Ts'o
  2017-10-03 18:53 ` [RFC 5/5] pm: remove kernel thread freezing Luis R. Rodriguez
  2017-10-03 19:33 ` [RFC 0/5] fs: replace kthread freezing with filesystem freeze/thaw Ming Lei
  5 siblings, 1 reply; 53+ messages in thread
From: Luis R. Rodriguez @ 2017-10-03 18:53 UTC (permalink / raw)
  To: viro, bart.vanassche, ming.lei, tytso, darrick.wong, jikos, rjw,
	pavel, len.brown, linux-fsdevel
  Cc: boris.ostrovsky, jgross, todd.e.brandt, nborisov, jack,
	martin.petersen, ONeukum, oleksandr, oleg.b.antonyan, linux-pm,
	linux-block, linux-xfs, linux-kernel, Luis R. Rodriguez

This also removes the superflous freezer calls as they are no longer
needed. We need to avoid sync call on thaw as otherwise we end up with
a stall on bio_submit().

Signed-off-by: Luis R. Rodriguez <mcgrof@kernel.org>
---
 fs/ext4/super.c | 13 +++++++------
 1 file changed, 7 insertions(+), 6 deletions(-)

diff --git a/fs/ext4/super.c b/fs/ext4/super.c
index 3cc5635d00cc..03d3bbd27dae 100644
--- a/fs/ext4/super.c
+++ b/fs/ext4/super.c
@@ -119,7 +119,8 @@ static struct file_system_type ext2_fs_type = {
 	.name		= "ext2",
 	.mount		= ext4_mount,
 	.kill_sb	= kill_block_super,
-	.fs_flags	= FS_REQUIRES_DEV,
+	.fs_flags	= FS_REQUIRES_DEV |
+			  FS_FREEZE_ON_SUSPEND,
 };
 MODULE_ALIAS_FS("ext2");
 MODULE_ALIAS("ext2");
@@ -134,7 +135,8 @@ static struct file_system_type ext3_fs_type = {
 	.name		= "ext3",
 	.mount		= ext4_mount,
 	.kill_sb	= kill_block_super,
-	.fs_flags	= FS_REQUIRES_DEV,
+	.fs_flags	= FS_REQUIRES_DEV |
+			  FS_FREEZE_ON_SUSPEND,
 };
 MODULE_ALIAS_FS("ext3");
 MODULE_ALIAS("ext3");
@@ -2979,8 +2981,6 @@ static int ext4_lazyinit_thread(void *arg)
 		}
 		mutex_unlock(&eli->li_list_mtx);
 
-		try_to_freeze();
-
 		cur = jiffies;
 		if ((time_after_eq(cur, next_wakeup)) ||
 		    (MAX_JIFFY_OFFSET == next_wakeup)) {
@@ -4926,7 +4926,7 @@ static int ext4_unfreeze(struct super_block *sb)
 		ext4_set_feature_journal_needs_recovery(sb);
 	}
 
-	ext4_commit_super(sb, 1);
+	ext4_commit_super(sb, 0);
 	return 0;
 }
 
@@ -5771,7 +5771,8 @@ static struct file_system_type ext4_fs_type = {
 	.name		= "ext4",
 	.mount		= ext4_mount,
 	.kill_sb	= kill_block_super,
-	.fs_flags	= FS_REQUIRES_DEV,
+	.fs_flags	= FS_REQUIRES_DEV |
+			  FS_FREEZE_ON_SUSPEND,
 };
 MODULE_ALIAS_FS("ext4");
 
-- 
2.14.0

^ permalink raw reply related	[flat|nested] 53+ messages in thread

* [RFC 5/5] pm: remove kernel thread freezing
  2017-10-03 18:53 [RFC 0/5] fs: replace kthread freezing with filesystem freeze/thaw Luis R. Rodriguez
                   ` (3 preceding siblings ...)
  2017-10-03 18:53 ` [RFC 4/5] ext4: add fs freezing support " Luis R. Rodriguez
@ 2017-10-03 18:53 ` Luis R. Rodriguez
  2017-10-03 18:59   ` Rafael J. Wysocki
                     ` (4 more replies)
  2017-10-03 19:33 ` [RFC 0/5] fs: replace kthread freezing with filesystem freeze/thaw Ming Lei
  5 siblings, 5 replies; 53+ messages in thread
From: Luis R. Rodriguez @ 2017-10-03 18:53 UTC (permalink / raw)
  To: viro, bart.vanassche, ming.lei, tytso, darrick.wong, jikos, rjw,
	pavel, len.brown, linux-fsdevel
  Cc: boris.ostrovsky, jgross, todd.e.brandt, nborisov, jack,
	martin.petersen, ONeukum, oleksandr, oleg.b.antonyan, linux-pm,
	linux-block, linux-xfs, linux-kernel, Luis R. Rodriguez

Now that all filesystems which used to rely on kthread
freezing have been converted to filesystem freeze/thawing
we can remove the kernel kthread freezer.

Signed-off-by: Luis R. Rodriguez <mcgrof@kernel.org>
---
 Documentation/power/freezing-of-tasks.txt |  9 ------
 drivers/xen/manage.c                      |  6 ----
 include/linux/freezer.h                   |  4 ---
 kernel/power/hibernate.c                  | 10 ++-----
 kernel/power/power.h                      | 20 +------------
 kernel/power/process.c                    | 47 -------------------------------
 kernel/power/user.c                       |  9 ------
 tools/power/pm-graph/analyze_suspend.py   |  1 -
 8 files changed, 3 insertions(+), 103 deletions(-)

diff --git a/Documentation/power/freezing-of-tasks.txt b/Documentation/power/freezing-of-tasks.txt
index af005770e767..477559f57c95 100644
--- a/Documentation/power/freezing-of-tasks.txt
+++ b/Documentation/power/freezing-of-tasks.txt
@@ -71,15 +71,6 @@ Rationale behind the functions dealing with freezing and thawing of tasks:
 freeze_processes():
   - freezes only userspace tasks
 
-freeze_kernel_threads():
-  - freezes all tasks (including kernel threads) because we can't freeze
-    kernel threads without freezing userspace tasks
-
-thaw_kernel_threads():
-  - thaws only kernel threads; this is particularly useful if we need to do
-    anything special in between thawing of kernel threads and thawing of
-    userspace tasks, or if we want to postpone the thawing of userspace tasks
-
 thaw_processes():
   - thaws all tasks (including kernel threads) because we can't thaw userspace
     tasks without thawing kernel threads
diff --git a/drivers/xen/manage.c b/drivers/xen/manage.c
index c425d03d37d2..8ca0e0c9a7d5 100644
--- a/drivers/xen/manage.c
+++ b/drivers/xen/manage.c
@@ -109,12 +109,6 @@ static void do_suspend(void)
 		goto out;
 	}
 
-	err = freeze_kernel_threads();
-	if (err) {
-		pr_err("%s: freeze kernel threads failed %d\n", __func__, err);
-		goto out_thaw;
-	}
-
 	err = dpm_suspend_start(PMSG_FREEZE);
 	if (err) {
 		pr_err("%s: dpm_suspend_start %d\n", __func__, err);
diff --git a/include/linux/freezer.h b/include/linux/freezer.h
index dd03e837ebb7..037ef3f16173 100644
--- a/include/linux/freezer.h
+++ b/include/linux/freezer.h
@@ -43,9 +43,7 @@ extern void __thaw_task(struct task_struct *t);
 
 extern bool __refrigerator(bool check_kthr_stop);
 extern int freeze_processes(void);
-extern int freeze_kernel_threads(void);
 extern void thaw_processes(void);
-extern void thaw_kernel_threads(void);
 
 /*
  * DO NOT ADD ANY NEW CALLERS OF THIS FUNCTION
@@ -263,9 +261,7 @@ static inline void __thaw_task(struct task_struct *t) {}
 
 static inline bool __refrigerator(bool check_kthr_stop) { return false; }
 static inline int freeze_processes(void) { return -ENOSYS; }
-static inline int freeze_kernel_threads(void) { return -ENOSYS; }
 static inline void thaw_processes(void) {}
-static inline void thaw_kernel_threads(void) {}
 
 static inline bool try_to_freeze_nowarn(void) { return false; }
 static inline bool try_to_freeze(void) { return false; }
diff --git a/kernel/power/hibernate.c b/kernel/power/hibernate.c
index a5c36e9c56a6..7c3af084b10a 100644
--- a/kernel/power/hibernate.c
+++ b/kernel/power/hibernate.c
@@ -352,10 +352,6 @@ int hibernation_snapshot(int platform_mode)
 	if (error)
 		goto Close;
 
-	error = freeze_kernel_threads();
-	if (error)
-		goto Cleanup;
-
 	if (hibernation_test(TEST_FREEZER)) {
 
 		/*
@@ -363,13 +359,13 @@ int hibernation_snapshot(int platform_mode)
 		 * successful freezer test.
 		 */
 		freezer_test_done = true;
-		goto Thaw;
+		goto Cleanup;
 	}
 
 	error = dpm_prepare(PMSG_FREEZE);
 	if (error) {
 		dpm_complete(PMSG_RECOVER);
-		goto Thaw;
+		goto Cleanup;
 	}
 
 	suspend_console();
@@ -405,8 +401,6 @@ int hibernation_snapshot(int platform_mode)
 	platform_end(platform_mode);
 	return error;
 
- Thaw:
-	thaw_kernel_threads();
  Cleanup:
 	swsusp_free();
 	goto Close;
diff --git a/kernel/power/power.h b/kernel/power/power.h
index 1d2d761e3c25..333bde062e42 100644
--- a/kernel/power/power.h
+++ b/kernel/power/power.h
@@ -253,25 +253,7 @@ extern int pm_test_level;
 #ifdef CONFIG_SUSPEND_FREEZER
 static inline int suspend_freeze_processes(void)
 {
-	int error;
-
-	error = freeze_processes();
-	/*
-	 * freeze_processes() automatically thaws every task if freezing
-	 * fails. So we need not do anything extra upon error.
-	 */
-	if (error)
-		return error;
-
-	error = freeze_kernel_threads();
-	/*
-	 * freeze_kernel_threads() thaws only kernel threads upon freezing
-	 * failure. So we have to thaw the userspace tasks ourselves.
-	 */
-	if (error)
-		thaw_processes();
-
-	return error;
+	return freeze_processes();
 }
 
 static inline void suspend_thaw_processes(void)
diff --git a/kernel/power/process.c b/kernel/power/process.c
index 9d1277768312..2e223555b764 100644
--- a/kernel/power/process.c
+++ b/kernel/power/process.c
@@ -169,33 +169,6 @@ int freeze_processes(void)
 	return error;
 }
 
-/**
- * freeze_kernel_threads - Make freezable kernel threads go to the refrigerator.
- *
- * On success, returns 0.  On failure, -errno and only the kernel threads are
- * thawed, so as to give a chance to the caller to do additional cleanups
- * (if any) before thawing the userspace tasks. So, it is the responsibility
- * of the caller to thaw the userspace tasks, when the time is right.
- */
-int freeze_kernel_threads(void)
-{
-	int error;
-
-	pr_info("Freezing remaining freezable tasks ... ");
-
-	pm_nosig_freezing = true;
-	error = try_to_freeze_tasks(false);
-	if (!error)
-		pr_cont("done.");
-
-	pr_cont("\n");
-	BUG_ON(in_atomic());
-
-	if (error)
-		thaw_kernel_threads();
-	return error;
-}
-
 void thaw_processes(void)
 {
 	struct task_struct *g, *p;
@@ -234,23 +207,3 @@ void thaw_processes(void)
 	pr_cont("done.\n");
 	trace_suspend_resume(TPS("thaw_processes"), 0, false);
 }
-
-void thaw_kernel_threads(void)
-{
-	struct task_struct *g, *p;
-
-	pm_nosig_freezing = false;
-	pr_info("Restarting kernel threads ... ");
-
-	thaw_workqueues();
-
-	read_lock(&tasklist_lock);
-	for_each_process_thread(g, p) {
-		if (p->flags & (PF_KTHREAD | PF_WQ_WORKER))
-			__thaw_task(p);
-	}
-	read_unlock(&tasklist_lock);
-
-	schedule();
-	pr_cont("done.\n");
-}
diff --git a/kernel/power/user.c b/kernel/power/user.c
index 22df9f7ff672..ebb2e6a8ddc8 100644
--- a/kernel/power/user.c
+++ b/kernel/power/user.c
@@ -277,15 +277,6 @@ static long snapshot_ioctl(struct file *filp, unsigned int cmd,
 		swsusp_free();
 		memset(&data->handle, 0, sizeof(struct snapshot_handle));
 		data->ready = false;
-		/*
-		 * It is necessary to thaw kernel threads here, because
-		 * SNAPSHOT_CREATE_IMAGE may be invoked directly after
-		 * SNAPSHOT_FREE.  In that case, if kernel threads were not
-		 * thawed, the preallocation of memory carried out by
-		 * hibernation_snapshot() might run into problems (i.e. it
-		 * might fail or even deadlock).
-		 */
-		thaw_kernel_threads();
 		break;
 
 	case SNAPSHOT_PREF_IMAGE_SIZE:
diff --git a/tools/power/pm-graph/analyze_suspend.py b/tools/power/pm-graph/analyze_suspend.py
index 1b60fe203741..545a5e2eafbe 100755
--- a/tools/power/pm-graph/analyze_suspend.py
+++ b/tools/power/pm-graph/analyze_suspend.py
@@ -138,7 +138,6 @@ class SystemValues:
 		'pm_prepare_console': dict(),
 		'pm_notifier_call_chain': dict(),
 		'freeze_processes': dict(),
-		'freeze_kernel_threads': dict(),
 		'pm_restrict_gfp_mask': dict(),
 		'acpi_suspend_begin': dict(),
 		'suspend_console': dict(),
-- 
2.14.0

^ permalink raw reply related	[flat|nested] 53+ messages in thread

* Re: [RFC 5/5] pm: remove kernel thread freezing
  2017-10-03 18:53 ` [RFC 5/5] pm: remove kernel thread freezing Luis R. Rodriguez
@ 2017-10-03 18:59   ` Rafael J. Wysocki
  2017-10-03 21:15     ` Rafael J. Wysocki
  2017-10-03 20:12   ` Pavel Machek
                     ` (3 subsequent siblings)
  4 siblings, 1 reply; 53+ messages in thread
From: Rafael J. Wysocki @ 2017-10-03 18:59 UTC (permalink / raw)
  To: Luis R. Rodriguez
  Cc: viro, bart.vanassche, ming.lei, tytso, darrick.wong, jikos,
	pavel, len.brown, linux-fsdevel, boris.ostrovsky, jgross,
	todd.e.brandt, nborisov, jack, martin.petersen, ONeukum,
	oleksandr, oleg.b.antonyan, linux-pm, linux-block, linux-xfs,
	linux-kernel

On Tuesday, October 3, 2017 8:53:13 PM CEST Luis R. Rodriguez wrote:
> Now that all filesystems which used to rely on kthread
> freezing have been converted to filesystem freeze/thawing
> we can remove the kernel kthread freezer.
> 
> Signed-off-by: Luis R. Rodriguez <mcgrof@kernel.org>

I like this one. :-)

Thanks,
Rafael

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [RFC 0/5] fs: replace kthread freezing with filesystem freeze/thaw
  2017-10-03 18:53 [RFC 0/5] fs: replace kthread freezing with filesystem freeze/thaw Luis R. Rodriguez
                   ` (4 preceding siblings ...)
  2017-10-03 18:53 ` [RFC 5/5] pm: remove kernel thread freezing Luis R. Rodriguez
@ 2017-10-03 19:33 ` Ming Lei
  2017-10-03 20:05   ` Luis R. Rodriguez
  5 siblings, 1 reply; 53+ messages in thread
From: Ming Lei @ 2017-10-03 19:33 UTC (permalink / raw)
  To: Luis R. Rodriguez
  Cc: viro, bart.vanassche, tytso, darrick.wong, jikos, rjw, pavel,
	len.brown, linux-fsdevel, boris.ostrovsky, jgross, todd.e.brandt,
	nborisov, jack, martin.petersen, ONeukum, oleksandr,
	oleg.b.antonyan, linux-pm, linux-block, linux-xfs, linux-kernel

On Tue, Oct 03, 2017 at 11:53:08AM -0700, Luis R. Rodriguez wrote:
> At the 2015 South Korea kernel summit Jiri Kosina had pointd out the issue of
> the sloppy semantics of the kthread freezer, lwn covered this pretty well [0].
> In short, he explained how the original purpose of the freezer was to aid
> in going to suspend to ensure no uwanted IO activity would cause filesystem
> corruption. Kernel kthreads require special freezer handling though, the call
> try_to_freeze() is often sprinkled at strategic places, but sometimes this is
> done without set_freezable() making try_to_freeze() useless. Other helpers such
> as freezable_schedule_timeout() exist, and very likely they are not used in any
> consistent and proper way either all over the kernel. Dealing with these
> helpers alone also does not and cannot ensure that everything that has been
> spawned asynchronously from a kthread (such as timers) are stopped as well,
> these are left to the kthread user implementation, and chances are pretty high
> there are many bugs lurking here. There are even non-IO bounds kthreads now
> using the freezer APIs, where this is not even needed!
> 
> Jiri suggested we can easily replace the complexities of the kthread freezer
> by just using the existing filesystem freeze/thaw callbacks on hibernation and
> suspend.
> 
> I've picked up Jiri's work given there are bugs which after inspection don't
> see like real bugs, but just random IO loosely waiting to be completed and the
> kernel not really being able to tell me who the culprit is. In fact even if one
> plugs a fix, one ends up in another random place and its really unclear who is
> to blaim for next.
> 
> For instance, to reproduce a simple suspend bug on what may at first seem to be
> an XFS bug, one can issue a dd onto disk prior to suspend, and we'll get a
> stall on our way to suspend, claiming the issue was the SCSI layer not being
> able to quiesce the disk. This was reported on OpenSUSE and reproduced on
> linux-next easily [1]. The following script can be run while we loop on
> systemctl suspend and qemu system_wakeup calls to resume:
> 
> 	while true; do
> 		dd if=/dev/zero of=crap bs=1M count=1024 &> /dev/null
> 	done
> 
> You end up with with a hung suspend attempt, and eventually a splat
> as follows with a hunk task notification:
> 
> INFO: task kworker/u8:8:1320 blocked for more than 10 seconds.
>       Tainted: G            E   4.13.0-next-20170907+ #88
> "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> kworker/u8:8    D    0  1320      2 0x80000000
> Workqueue: events_unbound async_run_entry_fn
> Call Trace:
>  __schedule+0x2ec/0x7a0
>  schedule+0x36/0x80
>  io_schedule+0x16/0x40
>  get_request+0x278/0x780
>  ? remove_wait_queue+0x70/0x70
>  blk_get_request+0x9c/0x110
>  scsi_execute+0x7a/0x310 [scsi_mod]
>  sd_sync_cache+0xa3/0x190 [sd_mod]
>  ? blk_run_queue+0x3f/0x50
>  sd_suspend_common+0x7b/0x130 [sd_mod]
>  ? scsi_print_result+0x270/0x270 [scsi_mod]
>  sd_suspend_system+0x13/0x20 [sd_mod]
>  do_scsi_suspend+0x1b/0x30 [scsi_mod]
>  scsi_bus_suspend_common+0xb1/0xd0 [scsi_mod]
>  ? device_for_each_child+0x69/0x90
>  scsi_bus_suspend+0x15/0x20 [scsi_mod]
>  dpm_run_callback+0x56/0x140
>  ? scsi_bus_freeze+0x20/0x20 [scsi_mod]
>  __device_suspend+0xf1/0x340
>  async_suspend+0x1f/0xa0
>  async_run_entry_fn+0x38/0x160
>  process_one_work+0x191/0x380
>  worker_thread+0x4e/0x3c0
>  kthread+0x109/0x140
>  ? process_one_work+0x380/0x380
>  ? kthread_create_on_node+0x70/0x70
>  ret_from_fork+0x25/0x30

Actually we are trying to fix this issue inside block layer/SCSI, please
see the following link:

https://marc.info/?l=linux-scsi&m=150703947029304&w=2

Even though this patch can make kthread to not do I/O during
suspend/resume, the SCSI quiesce still can cause similar issue
in other case, like when sending SCSI domain validation
to transport_spi, which happens in revalidate path, nothing
to do with suspend/resume.

So IMO the root cause is in SCSI's quiesce.

You can find the similar description in above link:

	Once SCSI device is put into QUIESCE, no new request except for
	RQF_PREEMPT can be dispatched to SCSI successfully, and
	scsi_device_quiesce() just simply waits for completion of I/Os
	dispatched to SCSI stack. It isn't enough at all.

	Because new request still can be coming, but all the allocated
	requests can't be dispatched successfully, so request pool can be
	consumed up easily. Then RQF_PREEMPT can't be allocated, and
	hang forever, just like the stack trace you posted.

-- 
Ming

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [RFC 4/5] ext4: add fs freezing support on suspend/hibernation
  2017-10-03 18:53 ` [RFC 4/5] ext4: add fs freezing support " Luis R. Rodriguez
@ 2017-10-03 19:59   ` Theodore Ts'o
  2017-10-03 20:13     ` Luis R. Rodriguez
  0 siblings, 1 reply; 53+ messages in thread
From: Theodore Ts'o @ 2017-10-03 19:59 UTC (permalink / raw)
  To: Luis R. Rodriguez
  Cc: viro, bart.vanassche, ming.lei, darrick.wong, jikos, rjw, pavel,
	len.brown, linux-fsdevel, boris.ostrovsky, jgross, todd.e.brandt,
	nborisov, jack, martin.petersen, ONeukum, oleksandr,
	oleg.b.antonyan, linux-pm, linux-block, linux-xfs, linux-kernel

On Tue, Oct 03, 2017 at 11:53:12AM -0700, Luis R. Rodriguez wrote:
> @@ -4926,7 +4926,7 @@ static int ext4_unfreeze(struct super_block *sb)
>  		ext4_set_feature_journal_needs_recovery(sb);
>  	}
>  
> -	ext4_commit_super(sb, 1);
> +	ext4_commit_super(sb, 0);
>  	return 0;
>  }
>

After we remove add the NEEDS_RECOVERY flag, we need to make sure
recovery flag is pushed out to disk before any other changes are
allowed to be pushed out to disk.  That's why we originally did the
update synchronously.

There are other ways we could fulfill this requirements, but doing a
synchronous update is the simplest way to handle this.  Was it
necessary to change this given the other changes you are making the fs
freeze implementation?

					- Ted

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [RFC 2/5] fs: freeze on suspend and thaw on resume
  2017-10-03 18:53 ` [RFC 2/5] fs: freeze on suspend and thaw on resume Luis R. Rodriguez
@ 2017-10-03 20:02   ` Bart Van Assche
  2017-10-03 20:23     ` Luis R. Rodriguez
  2017-10-03 20:06   ` Jiri Kosina
  2017-10-03 20:58   ` Dave Chinner
  2 siblings, 1 reply; 53+ messages in thread
From: Bart Van Assche @ 2017-10-03 20:02 UTC (permalink / raw)
  To: pavel, darrick.wong, viro, ming.lei, rjw, mcgrof, linux-fsdevel,
	jikos, len.brown, tytso
  Cc: boris.ostrovsky, ONeukum, linux-block, linux-kernel, nborisov,
	oleg.b.antonyan, linux-pm, linux-xfs, jgross, martin.petersen,
	oleksandr, todd.e.brandt, jack

On Tue, 2017-10-03 at 11:53 -0700, Luis R. Rodriguez wrote:
> +static bool super_allows_freeze(struct super_block *sb)
> +{
> +	return !!(sb->s_type->fs_flags & FS_FREEZE_ON_SUSPEND);
> +}

A minor comment: if "!!" would be left out the compiler will perform the
conversion from int to bool implicitly so I propose to leave out the "!!"
and parentheses. Anyway, I agree with the approach of this patch and I think
that freezing filesystems before processes are frozen would be a big step
forward.

Bart.

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [RFC 0/5] fs: replace kthread freezing with filesystem freeze/thaw
  2017-10-03 19:33 ` [RFC 0/5] fs: replace kthread freezing with filesystem freeze/thaw Ming Lei
@ 2017-10-03 20:05   ` Luis R. Rodriguez
  2017-10-03 20:47     ` Matthew Wilcox
  2017-10-04 15:43     ` Ming Lei
  0 siblings, 2 replies; 53+ messages in thread
From: Luis R. Rodriguez @ 2017-10-03 20:05 UTC (permalink / raw)
  To: Ming Lei
  Cc: Luis R. Rodriguez, viro, bart.vanassche, tytso, darrick.wong,
	jikos, rjw, pavel, len.brown, linux-fsdevel, boris.ostrovsky,
	jgross, todd.e.brandt, nborisov, jack, martin.petersen, ONeukum,
	oleksandr, oleg.b.antonyan, linux-pm, linux-block, linux-xfs,
	linux-kernel

On Wed, Oct 04, 2017 at 03:33:01AM +0800, Ming Lei wrote:
> On Tue, Oct 03, 2017 at 11:53:08AM -0700, Luis R. Rodriguez wrote:
> > INFO: task kworker/u8:8:1320 blocked for more than 10 seconds.
> >       Tainted: G            E   4.13.0-next-20170907+ #88
> > "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> > kworker/u8:8    D    0  1320      2 0x80000000
> > Workqueue: events_unbound async_run_entry_fn
> > Call Trace:
> >  __schedule+0x2ec/0x7a0
> >  schedule+0x36/0x80
> >  io_schedule+0x16/0x40
> >  get_request+0x278/0x780
> >  ? remove_wait_queue+0x70/0x70
> >  blk_get_request+0x9c/0x110
> >  scsi_execute+0x7a/0x310 [scsi_mod]
> >  sd_sync_cache+0xa3/0x190 [sd_mod]
> >  ? blk_run_queue+0x3f/0x50
> >  sd_suspend_common+0x7b/0x130 [sd_mod]
> >  ? scsi_print_result+0x270/0x270 [scsi_mod]
> >  sd_suspend_system+0x13/0x20 [sd_mod]
> >  do_scsi_suspend+0x1b/0x30 [scsi_mod]
> >  scsi_bus_suspend_common+0xb1/0xd0 [scsi_mod]
> >  ? device_for_each_child+0x69/0x90
> >  scsi_bus_suspend+0x15/0x20 [scsi_mod]
> >  dpm_run_callback+0x56/0x140
> >  ? scsi_bus_freeze+0x20/0x20 [scsi_mod]
> >  __device_suspend+0xf1/0x340
> >  async_suspend+0x1f/0xa0
> >  async_run_entry_fn+0x38/0x160
> >  process_one_work+0x191/0x380
> >  worker_thread+0x4e/0x3c0
> >  kthread+0x109/0x140
> >  ? process_one_work+0x380/0x380
> >  ? kthread_create_on_node+0x70/0x70
> >  ret_from_fork+0x25/0x30
> 
> Actually we are trying to fix this issue inside block layer/SCSI, please
> see the following link:
> 
> https://marc.info/?l=linux-scsi&m=150703947029304&w=2
> 
> Even though this patch can make kthread to not do I/O during
> suspend/resume, the SCSI quiesce still can cause similar issue
> in other case, like when sending SCSI domain validation
> to transport_spi, which happens in revalidate path, nothing
> to do with suspend/resume.

Are you saying that the SCSI layer can generate IO even without the filesystem
triggering it?

If so then by all means these are certainly other areas we should address
quiescing as I noted in my email.

Also, *iff* the generated IO is triggered on the SCSI suspend callback, then
clearly the next question is if this is truly needed. If so then yes, it
should be quiesced and all restrictions should be considered.

Note that device pm ops get called first, then later the notifiers are
processed, and only later is userspace frozen. Its this gap this patch
set addresses, and its also where where I saw the issue creep in. Depending on
the questions above we may or not need more work in other layers.

So I am not saying this patch set is sufficient to address all IO quiescing,
quite the contrary I acknowledged that each subsystem should vet if they have
non-FS generated IO (seems you and Bart are doing  great job at doing this
analysis on SCSI). This patchset however should help with odd corner cases
which *are* triggered by the FS and the spaghetti code requirements of the
kthread freezing clearly does not suffice.

> So IMO the root cause is in SCSI's quiesce.
> 
> You can find the similar description in above link:
> 
> 	Once SCSI device is put into QUIESCE, no new request except for
> 	RQF_PREEMPT can be dispatched to SCSI successfully, and
> 	scsi_device_quiesce() just simply waits for completion of I/Os
> 	dispatched to SCSI stack. It isn't enough at all.

I see so the race here is *on* the pm ops of SCSI we have generated IO
to QUIESCE.

> 
> 	Because new request still can be coming, but all the allocated
> 	requests can't be dispatched successfully, so request pool can be
> 	consumed up easily. Then RQF_PREEMPT can't be allocated, and
> 	hang forever, just like the stack trace you posted.
> 

I see. Makes sense. So SCSI quiesce has restrictions and they're being
violated.

Anyway, don't think of this as a replacement for yours or Bart's work then, but
rather supplemental.

Are you saying we should not move forward with this patch set, or simply that
the above splat is rather properly fixed with SCSI quiescing? Given you're
explanation I'd have to agree. But even with this considered and accepted, from
a theoretical perspective -- why would this patch set actually seem to fix the
same issue? Is it, that it just *seems* to fix it?

  Luis

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [RFC 2/5] fs: freeze on suspend and thaw on resume
  2017-10-03 18:53 ` [RFC 2/5] fs: freeze on suspend and thaw on resume Luis R. Rodriguez
  2017-10-03 20:02   ` Bart Van Assche
@ 2017-10-03 20:06   ` Jiri Kosina
  2017-10-03 20:58   ` Dave Chinner
  2 siblings, 0 replies; 53+ messages in thread
From: Jiri Kosina @ 2017-10-03 20:06 UTC (permalink / raw)
  To: Luis R. Rodriguez
  Cc: viro, bart.vanassche, ming.lei, tytso, darrick.wong, rjw, pavel,
	len.brown, linux-fsdevel, boris.ostrovsky, jgross, todd.e.brandt,
	nborisov, jack, martin.petersen, ONeukum, oleksandr,
	oleg.b.antonyan, linux-pm, linux-block, linux-xfs, linux-kernel

On Tue, 3 Oct 2017, Luis R. Rodriguez wrote:

> This uses the existing filesystem freeze and thaw callbacks to
> freeze each filesystem on suspend/hibernation and thaw upon resume.
> 
> This is needed so that we properly really stop IO in flight without
> races after userspace has been frozen. Without this we rely on
> kthread freezing and its semantics are loose and error prone.
> For instance, even though a kthread may use try_to_freeze() and end
> up being frozen we have no way of being sure that everything that
> has been spawned asynchronously from it (such as timers) have also
> been stopped as well.
> 
> A long term advantage of also adding filesystem freeze / thawing
> supporting durign suspend / hibernation is that long term we may
> be able to eventually drop the kernel's thread freezing completely
> as it was originally added to stop disk IO in flight as we hibernate
> or suspend.
> 
> This also implies that many kthread users exist which have been
> adding freezer semantics onto its kthreads without need. These also
> will need to be reviewed later.
> 
> This is based on prior work originally by Rafael Wysocki and later by
> Jiri Kosina.
> 
> Signed-off-by: Luis R. Rodriguez <mcgrof@kernel.org>

Thanks a lot for picking this up; I never found time to actually finalize 
it.

	Acked-by: Jiri Kosina <jkosina@suse.cz>

for patches 2 and 5 (the fs agnostic code), which were in the core of my 
original work.

-- 
Jiri Kosina
SUSE Labs

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [RFC 5/5] pm: remove kernel thread freezing
  2017-10-03 18:53 ` [RFC 5/5] pm: remove kernel thread freezing Luis R. Rodriguez
  2017-10-03 18:59   ` Rafael J. Wysocki
@ 2017-10-03 20:12   ` Pavel Machek
  2017-10-03 20:15     ` Jiri Kosina
  2017-10-03 20:49     ` Luis R. Rodriguez
  2017-10-03 20:13   ` Bart Van Assche
                     ` (2 subsequent siblings)
  4 siblings, 2 replies; 53+ messages in thread
From: Pavel Machek @ 2017-10-03 20:12 UTC (permalink / raw)
  To: Luis R. Rodriguez
  Cc: viro, bart.vanassche, ming.lei, tytso, darrick.wong, jikos, rjw,
	len.brown, linux-fsdevel, boris.ostrovsky, jgross, todd.e.brandt,
	nborisov, jack, martin.petersen, ONeukum, oleksandr,
	oleg.b.antonyan, linux-pm, linux-block, linux-xfs, linux-kernel

[-- Attachment #1: Type: text/plain, Size: 694 bytes --]

On Tue 2017-10-03 11:53:13, Luis R. Rodriguez wrote:
> Now that all filesystems which used to rely on kthread
> freezing have been converted to filesystem freeze/thawing
> we can remove the kernel kthread freezer.

Are you surely-sure? You mentioned other in kernel sources of writes;
what about those?

Can knfsd be a problem?

Can memory management doing background writes be a problem?

Did you do some hibernation testing?

> Signed-off-by: Luis R. Rodriguez <mcgrof@kernel.org>

Please don't apply this just yet.

									Pavel

-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 181 bytes --]

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [RFC 4/5] ext4: add fs freezing support on suspend/hibernation
  2017-10-03 19:59   ` Theodore Ts'o
@ 2017-10-03 20:13     ` Luis R. Rodriguez
  2017-10-04  1:42       ` Theodore Ts'o
  0 siblings, 1 reply; 53+ messages in thread
From: Luis R. Rodriguez @ 2017-10-03 20:13 UTC (permalink / raw)
  To: Theodore Ts'o, Luis R. Rodriguez, viro, bart.vanassche,
	ming.lei, darrick.wong, jikos, rjw, pavel, len.brown,
	linux-fsdevel, boris.ostrovsky, jgross, todd.e.brandt, nborisov,
	jack, martin.petersen, ONeukum, oleksandr, oleg.b.antonyan,
	linux-pm, linux-block, linux-xfs, linux-kernel

On Tue, Oct 03, 2017 at 03:59:30PM -0400, Theodore Ts'o wrote:
> On Tue, Oct 03, 2017 at 11:53:12AM -0700, Luis R. Rodriguez wrote:
> > @@ -4926,7 +4926,7 @@ static int ext4_unfreeze(struct super_block *sb)
> >  		ext4_set_feature_journal_needs_recovery(sb);
> >  	}
> >  
> > -	ext4_commit_super(sb, 1);
> > +	ext4_commit_super(sb, 0);
> >  	return 0;
> >  }
> >
> 
> After we remove add the NEEDS_RECOVERY flag, we need to make sure
> recovery flag is pushed out to disk before any other changes are
> allowed to be pushed out to disk.  That's why we originally did the
> update synchronously.

I see. I had to try to dig through linux-history to see why this was done, and
I actually could not find an exact commit which explained what you just did.
Thanks!

Hrm, so on freeze we issue the same commit as well, so is a sync *really* needed
on thaw?

> There are other ways we could fulfill this requirements, but doing a
> synchronous update is the simplest way to handle this.  Was it
> necessary to change this given the other changes you are making the fs
> freeze implementation?

No, it was am empirical evaluation done at testing, I observed bio_submit()
stalls, we never get completion from the device. Even if we call thaw at the
very end of resume, after the devices should be up, we still end up in the same
situation. Given how I order freezing filesystems after freezing userspace I do
believe we should thaw filesystems prior unfreezing userspace, its why I placed
the call where it is now.

So this is something I was hoping others could help grok/iron out. I'd prefer
if we could live with thaw'ing unchanged, but this requires to figure this
issue out. Why we can't implicate bio_submit() on fs thaw as-is in this
implementation.

  Luis

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [RFC 5/5] pm: remove kernel thread freezing
  2017-10-03 18:53 ` [RFC 5/5] pm: remove kernel thread freezing Luis R. Rodriguez
  2017-10-03 18:59   ` Rafael J. Wysocki
  2017-10-03 20:12   ` Pavel Machek
@ 2017-10-03 20:13   ` Bart Van Assche
  2017-10-03 20:17     ` Jiri Kosina
  2017-10-03 21:04   ` Dave Chinner
  2017-10-04  6:07   ` Hannes Reinecke
  4 siblings, 1 reply; 53+ messages in thread
From: Bart Van Assche @ 2017-10-03 20:13 UTC (permalink / raw)
  To: pavel, darrick.wong, viro, ming.lei, rjw, mcgrof, linux-fsdevel,
	jikos, len.brown, tytso
  Cc: boris.ostrovsky, ONeukum, linux-block, linux-kernel, nborisov,
	oleg.b.antonyan, linux-pm, linux-xfs, jgross, martin.petersen,
	oleksandr, todd.e.brandt, jack

On Tue, 2017-10-03 at 11:53 -0700, Luis R. Rodriguez wrote:
> Now that all filesystems which used to rely on kthread
> freezing have been converted to filesystem freeze/thawing
> we can remove the kernel kthread freezer.

Many subsystems use system_freezable_wq and/or
system_freezable_power_efficient_wq to queue work that should not run while
processes are frozen. Should it be mentioned in the patch description that
this patch does not affect workqueues for which WQ_FREEZABLE has been set?

What about the many drivers outside filesystems that use the set_freezable() /
try_to_freeze() / wait_event_freezable() API?

Thanks,

Bart.

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [RFC 5/5] pm: remove kernel thread freezing
  2017-10-03 20:12   ` Pavel Machek
@ 2017-10-03 20:15     ` Jiri Kosina
  2017-10-03 20:21       ` Pavel Machek
  2017-10-03 20:49     ` Luis R. Rodriguez
  1 sibling, 1 reply; 53+ messages in thread
From: Jiri Kosina @ 2017-10-03 20:15 UTC (permalink / raw)
  To: Pavel Machek
  Cc: Luis R. Rodriguez, viro, bart.vanassche, ming.lei, tytso,
	darrick.wong, rjw, len.brown, linux-fsdevel, boris.ostrovsky,
	jgross, todd.e.brandt, nborisov, jack, martin.petersen, ONeukum,
	oleksandr, oleg.b.antonyan, linux-pm, linux-block, linux-xfs,
	linux-kernel

On Tue, 3 Oct 2017, Pavel Machek wrote:

> Can knfsd be a problem?

It'd actually be useful to walk through all the 'try_to_freeze()' 
instances in kthreads, and analyze those. I've started this effort, and as 
a result I removed most of them; but there are still quite a few 
remaining. And knfsd is one of those.

> Can memory management doing background writes be a problem?

This point I don't understand. What exactly do you mean?

-- 
Jiri Kosina
SUSE Labs

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [RFC 5/5] pm: remove kernel thread freezing
  2017-10-03 20:13   ` Bart Van Assche
@ 2017-10-03 20:17     ` Jiri Kosina
  2017-10-03 20:21       ` Bart Van Assche
  2017-10-03 20:51       ` Jiri Kosina
  0 siblings, 2 replies; 53+ messages in thread
From: Jiri Kosina @ 2017-10-03 20:17 UTC (permalink / raw)
  To: Bart Van Assche
  Cc: pavel, darrick.wong, viro, ming.lei, rjw, mcgrof, linux-fsdevel,
	len.brown, tytso, boris.ostrovsky, ONeukum, linux-block,
	linux-kernel, nborisov, oleg.b.antonyan, linux-pm, linux-xfs,
	jgross, martin.petersen, oleksandr, todd.e.brandt, jack

On Tue, 3 Oct 2017, Bart Van Assche wrote:

> What about the many drivers outside filesystems that use the 
> set_freezable() / try_to_freeze() / wait_event_freezable() API?

Many/most of them are just completely bogus and pointless. I've killed a 
lot of those in the past, but the copy/paste programming is just too 
strong enemy to fight against.

-- 
Jiri Kosina
SUSE Labs

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [RFC 5/5] pm: remove kernel thread freezing
  2017-10-03 20:15     ` Jiri Kosina
@ 2017-10-03 20:21       ` Pavel Machek
  2017-10-03 20:38         ` Jiri Kosina
  0 siblings, 1 reply; 53+ messages in thread
From: Pavel Machek @ 2017-10-03 20:21 UTC (permalink / raw)
  To: Jiri Kosina
  Cc: Luis R. Rodriguez, viro, bart.vanassche, ming.lei, tytso,
	darrick.wong, rjw, len.brown, linux-fsdevel, boris.ostrovsky,
	jgross, todd.e.brandt, nborisov, jack, martin.petersen, ONeukum,
	oleksandr, oleg.b.antonyan, linux-pm, linux-block, linux-xfs,
	linux-kernel

[-- Attachment #1: Type: text/plain, Size: 592 bytes --]

Hi!

> > Can memory management doing background writes be a problem?
> 
> This point I don't understand. What exactly do you mean?

Hibernation:

We freeze, do system snapshot, unfreeze, and write image to
disk. Memory management wakes up (this used to be called bdflush),
decides its time to write some dirty data back to disk. Powerdown.

But state on the disk now does not correspond to what the image
expects. Problem?
									Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 181 bytes --]

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [RFC 5/5] pm: remove kernel thread freezing
  2017-10-03 20:17     ` Jiri Kosina
@ 2017-10-03 20:21       ` Bart Van Assche
  2017-10-03 20:24         ` Jiri Kosina
  2017-10-03 20:27         ` Luis R. Rodriguez
  2017-10-03 20:51       ` Jiri Kosina
  1 sibling, 2 replies; 53+ messages in thread
From: Bart Van Assche @ 2017-10-03 20:21 UTC (permalink / raw)
  To: jikos
  Cc: boris.ostrovsky, ONeukum, linux-block, linux-kernel, nborisov,
	oleg.b.antonyan, linux-pm, linux-xfs, pavel, darrick.wong, viro,
	ming.lei, rjw, jgross, oleksandr, todd.e.brandt, martin.petersen,
	linux-fsdevel, mcgrof, len.brown, tytso, jack

On Tue, 2017-10-03 at 22:17 +0200, Jiri Kosina wrote:
> On Tue, 3 Oct 2017, Bart Van Assche wrote:
> > What about the many drivers outside filesystems that use the 
> > set_freezable() / try_to_freeze() / wait_event_freezable() API?
> 
> Many/most of them are just completely bogus and pointless. I've killed a 
> lot of those in the past, but the copy/paste programming is just too 
> strong enemy to fight against.

If just a single driver would use that API to prevent that I/O occurs while
processes are frozen then this patch will break that driver. I would like to
know your opinion about the following patch in particular: "[PATCH v4 1/7] md:
Make md resync and reshape threads freezable"
(https://www.spinics.net/lists/linux-block/msg17854.html).

Thanks,

Bart.

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [RFC 2/5] fs: freeze on suspend and thaw on resume
  2017-10-03 20:02   ` Bart Van Assche
@ 2017-10-03 20:23     ` Luis R. Rodriguez
  2017-10-03 20:32       ` Bart Van Assche
  0 siblings, 1 reply; 53+ messages in thread
From: Luis R. Rodriguez @ 2017-10-03 20:23 UTC (permalink / raw)
  To: Bart Van Assche
  Cc: pavel, darrick.wong, viro, ming.lei, rjw, mcgrof, linux-fsdevel,
	jikos, len.brown, tytso, boris.ostrovsky, ONeukum, linux-block,
	linux-kernel, nborisov, oleg.b.antonyan, linux-pm, linux-xfs,
	jgross, martin.petersen, oleksandr, todd.e.brandt, jack

On Tue, Oct 03, 2017 at 08:02:22PM +0000, Bart Van Assche wrote:
> On Tue, 2017-10-03 at 11:53 -0700, Luis R. Rodriguez wrote:
> > +static bool super_allows_freeze(struct super_block *sb)
> > +{
> > +	return !!(sb->s_type->fs_flags & FS_FREEZE_ON_SUSPEND);
> > +}
> 
> A minor comment: if "!!" would be left out the compiler will perform the
> conversion from int to bool implicitly

For all compilers?

> so I propose to leave out the "!!" and parentheses.

OK!

> Anyway, I agree with the approach of this patch and I think
> that freezing filesystems before processes are frozen would be a big step
> forward.

Great! But please note, the current implementation calls fs_suspend_freeze()
*after* try_to_freeze_tasks(), ie: this implementation freezes userspace and
only after then filesystems.

Order will be *critical* here to get right, so we should definitely figure
out if this is definitely the right place (TM) to call fs_suspend_freeze().

Lastly, a final minor implementation note:

I think using a PM notifier would have been much cleaner, in fact it was my the
way I originally implemented this orthogonally to Jiri's work, however to get
this right the semantics of __pm_notifier_call_chain() would need to be
expanded with another state, perhaps PM_USERSPACE_FROZEN, for example. I
decided in the end a new state was not worth it give we would just have one
user: fs freezing. So to be clear using a notifier to wrap this code into
the fs code and not touching kernel/power/process.c is not possible with
today's semantics nor do I think its worth it to expand on these semantics.

This approach is explicit about order and requirements for those that should
care: those that will maintain kernel/power/process.c and friends. Having
this in a notifier would shift this implicitly.

  Luis

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [RFC 5/5] pm: remove kernel thread freezing
  2017-10-03 20:21       ` Bart Van Assche
@ 2017-10-03 20:24         ` Jiri Kosina
  2017-10-03 20:27         ` Luis R. Rodriguez
  1 sibling, 0 replies; 53+ messages in thread
From: Jiri Kosina @ 2017-10-03 20:24 UTC (permalink / raw)
  To: Bart Van Assche
  Cc: boris.ostrovsky, ONeukum, linux-block, linux-kernel, nborisov,
	oleg.b.antonyan, linux-pm, linux-xfs, pavel, darrick.wong, viro,
	ming.lei, rjw, jgross, oleksandr, todd.e.brandt, martin.petersen,
	linux-fsdevel, mcgrof, len.brown, tytso, jack

On Tue, 3 Oct 2017, Bart Van Assche wrote:

> If just a single driver would use that API to prevent that I/O occurs while
> processes are frozen then this patch will break that driver. I would like to
> know your opinion about the following patch in particular: "[PATCH v4 1/7] md:
> Make md resync and reshape threads freezable"
> (https://www.spinics.net/lists/linux-block/msg17854.html).

Alright, if there are kthreads which are actually *creating* new I/O 
(contrary to kthreads being I/O completion helpers) -- which apparently is 
the md case here, then those definitely have to be frozen in some form.

-- 
Jiri Kosina
SUSE Labs

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [RFC 5/5] pm: remove kernel thread freezing
  2017-10-03 20:21       ` Bart Van Assche
  2017-10-03 20:24         ` Jiri Kosina
@ 2017-10-03 20:27         ` Luis R. Rodriguez
  1 sibling, 0 replies; 53+ messages in thread
From: Luis R. Rodriguez @ 2017-10-03 20:27 UTC (permalink / raw)
  To: Bart Van Assche
  Cc: jikos, boris.ostrovsky, ONeukum, linux-block, linux-kernel,
	nborisov, oleg.b.antonyan, linux-pm, linux-xfs, pavel,
	darrick.wong, viro, ming.lei, rjw, jgross, oleksandr,
	todd.e.brandt, martin.petersen, linux-fsdevel, mcgrof, len.brown,
	tytso, jack

On Tue, Oct 03, 2017 at 08:21:42PM +0000, Bart Van Assche wrote:
> On Tue, 2017-10-03 at 22:17 +0200, Jiri Kosina wrote:
> > On Tue, 3 Oct 2017, Bart Van Assche wrote:
> > > What about the many drivers outside filesystems that use the 
> > > set_freezable() / try_to_freeze() / wait_event_freezable() API?
> > 
> > Many/most of them are just completely bogus and pointless. I've killed a 
> > lot of those in the past, but the copy/paste programming is just too 
> > strong enemy to fight against.
> 
> If just a single driver would use that API to prevent that I/O occurs while
> processes are frozen then this patch will break that driver.

Yes! And although as Jiri points out, its debatable where this is being used,
but as you suggest there may be *valid* reasons for it *now* even though
originally it *may* have been bogus...

Its why I believe this now can only be done piecemeal wise, slowly but steady.
To avoid regressions, and make this effort bisectable.

  Luis

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [RFC 2/5] fs: freeze on suspend and thaw on resume
  2017-10-03 20:23     ` Luis R. Rodriguez
@ 2017-10-03 20:32       ` Bart Van Assche
  2017-10-03 20:39         ` Luis R. Rodriguez
  0 siblings, 1 reply; 53+ messages in thread
From: Bart Van Assche @ 2017-10-03 20:32 UTC (permalink / raw)
  To: mcgrof
  Cc: boris.ostrovsky, ONeukum, linux-block, linux-kernel, nborisov,
	oleg.b.antonyan, linux-pm, linux-xfs, pavel, darrick.wong, viro,
	ming.lei, rjw, jgross, oleksandr, todd.e.brandt, martin.petersen,
	linux-fsdevel, jikos, len.brown, tytso, jack

On Tue, 2017-10-03 at 22:23 +0200, Luis R. Rodriguez wrote:
> On Tue, Oct 03, 2017 at 08:02:22PM +0000, Bart Van Assche wrote:
> > On Tue, 2017-10-03 at 11:53 -0700, Luis R. Rodriguez wrote:
> > > +static bool super_allows_freeze(struct super_block *sb)
> > > +{
> > > +	return !!(sb->s_type->fs_flags & FS_FREEZE_ON_SUSPEND);
> > > +}
> > 
> > A minor comment: if "!!" would be left out the compiler will perform the
> > conversion from int to bool implicitly
> 
> For all compilers?

Let's have a look at the output of the following commands:

$ PAGER= git grep 'typedef.*[[:blank:]]bool;' include
include/linux/types.h:typedef _Bool                     bool;
$ PAGER= git grep std= Makefile
Makefile:               -fomit-frame-pointer -std=gnu89 $(HOST_LFS_CFLAGS)
Makefile:                  -std=gnu89 $(call cc-option,-fno-PIE)

>From https://gcc.gnu.org/onlinedocs/gcc-7.2.0/gcc/C-Dialect-Options.html#C-Dialect-Options:
‘gnu89’
GNU dialect of ISO C90 (including some C99 features).

I think this means that the Linux kernel tree can only be compiled correctly
by compilers that support the C11 type _Bool.

> > Anyway, I agree with the approach of this patch and I think
> > that freezing filesystems before processes are frozen would be a big step
> > forward.
> 
> Great! But please note, the current implementation calls fs_suspend_freeze()
> *after* try_to_freeze_tasks(), ie: this implementation freezes userspace and
> only after then filesystems.

What will the impact be of that choice on filesystems implemented in userspace?

Thanks,

Bart.

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [RFC 5/5] pm: remove kernel thread freezing
  2017-10-03 20:21       ` Pavel Machek
@ 2017-10-03 20:38         ` Jiri Kosina
  2017-10-03 20:41           ` Rafael J. Wysocki
  2017-10-03 20:57           ` Pavel Machek
  0 siblings, 2 replies; 53+ messages in thread
From: Jiri Kosina @ 2017-10-03 20:38 UTC (permalink / raw)
  To: Pavel Machek
  Cc: Luis R. Rodriguez, viro, bart.vanassche, ming.lei, tytso,
	darrick.wong, rjw, len.brown, linux-fsdevel, boris.ostrovsky,
	jgross, todd.e.brandt, nborisov, jack, martin.petersen, ONeukum,
	oleksandr, oleg.b.antonyan, linux-pm, linux-block, linux-xfs,
	linux-kernel

On Tue, 3 Oct 2017, Pavel Machek wrote:

> > This point I don't understand. What exactly do you mean?
> 
> Hibernation:
> 
> We freeze, do system snapshot, unfreeze, and write image to
> disk. 

Where/why do you unfreeze between creating the snapshot and writing it 
out?

Again, I agree that the (rare) kthreads that are actually "creating" new 
I/O have to be somehow frozen and require special care.

-- 
Jiri Kosina
SUSE Labs

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [RFC 2/5] fs: freeze on suspend and thaw on resume
  2017-10-03 20:32       ` Bart Van Assche
@ 2017-10-03 20:39         ` Luis R. Rodriguez
  0 siblings, 0 replies; 53+ messages in thread
From: Luis R. Rodriguez @ 2017-10-03 20:39 UTC (permalink / raw)
  To: Bart Van Assche
  Cc: mcgrof, boris.ostrovsky, ONeukum, linux-block, linux-kernel,
	nborisov, oleg.b.antonyan, linux-pm, linux-xfs, pavel,
	darrick.wong, viro, ming.lei, rjw, jgross, oleksandr,
	todd.e.brandt, martin.petersen, linux-fsdevel, jikos, len.brown,
	tytso, jack

On Tue, Oct 03, 2017 at 08:32:39PM +0000, Bart Van Assche wrote:
> On Tue, 2017-10-03 at 22:23 +0200, Luis R. Rodriguez wrote:
> > On Tue, Oct 03, 2017 at 08:02:22PM +0000, Bart Van Assche wrote:
> > > On Tue, 2017-10-03 at 11:53 -0700, Luis R. Rodriguez wrote:
> > > > +static bool super_allows_freeze(struct super_block *sb)
> > > > +{
> > > > +	return !!(sb->s_type->fs_flags & FS_FREEZE_ON_SUSPEND);
> > > > +}
> > > 
> > > A minor comment: if "!!" would be left out the compiler will perform the
> > > conversion from int to bool implicitly
> > 
> > For all compilers?
> 
> Let's have a look at the output of the following commands:
> 
> $ PAGER= git grep 'typedef.*[[:blank:]]bool;' include
> include/linux/types.h:typedef _Bool                     bool;
> $ PAGER= git grep std= Makefile
> Makefile:               -fomit-frame-pointer -std=gnu89 $(HOST_LFS_CFLAGS)
> Makefile:                  -std=gnu89 $(call cc-option,-fno-PIE)
> 
> From https://gcc.gnu.org/onlinedocs/gcc-7.2.0/gcc/C-Dialect-Options.html#C-Dialect-Options:
> ‘gnu89’
> GNU dialect of ISO C90 (including some C99 features).
> 
> I think this means that the Linux kernel tree can only be compiled correctly
> by compilers that support the C11 type _Bool.

:*) beautiful, thanks.

> > > Anyway, I agree with the approach of this patch and I think
> > > that freezing filesystems before processes are frozen would be a big step
> > > forward.
> > 
> > Great! But please note, the current implementation calls fs_suspend_freeze()
> > *after* try_to_freeze_tasks(), ie: this implementation freezes userspace and
> > only after then filesystems.
> 
> What will the impact be of that choice on filesystems implemented in userspace?

Depends on what kernel hooks those use? Also now is a good time for those working
on userspace filesystems to chime in :) Its why I am stating -- I am not saying
I have found the right order, I have find the right order that works for me, and
we need consensus on what the right order should be.

  Luis

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [RFC 5/5] pm: remove kernel thread freezing
  2017-10-03 20:38         ` Jiri Kosina
@ 2017-10-03 20:41           ` Rafael J. Wysocki
  2017-10-03 20:57           ` Pavel Machek
  1 sibling, 0 replies; 53+ messages in thread
From: Rafael J. Wysocki @ 2017-10-03 20:41 UTC (permalink / raw)
  To: Jiri Kosina
  Cc: Pavel Machek, Luis R. Rodriguez, Al Viro, bart.vanassche,
	ming.lei, Ted Ts'o, darrick.wong, Rafael J. Wysocki,
	Len Brown, linux-fsdevel, Boris Ostrovsky, Juergen Gross,
	Todd Brandt, nborisov, Jan Kara, martin.petersen, Oliver Neukum,
	oleksandr, oleg.b.antonyan, Linux PM, linux-block, linux-xfs,
	Linux Kernel Mailing List

On Tue, Oct 3, 2017 at 10:38 PM, Jiri Kosina <jikos@kernel.org> wrote:
> On Tue, 3 Oct 2017, Pavel Machek wrote:
>
>> > This point I don't understand. What exactly do you mean?
>>
>> Hibernation:
>>
>> We freeze, do system snapshot, unfreeze, and write image to
>> disk.
>
> Where/why do you unfreeze between creating the snapshot and writing it
> out?

We do not unfreeze then.

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [RFC 0/5] fs: replace kthread freezing with filesystem freeze/thaw
  2017-10-03 20:05   ` Luis R. Rodriguez
@ 2017-10-03 20:47     ` Matthew Wilcox
  2017-10-03 20:54       ` Luis R. Rodriguez
  2017-10-03 20:59       ` Bart Van Assche
  2017-10-04 15:43     ` Ming Lei
  1 sibling, 2 replies; 53+ messages in thread
From: Matthew Wilcox @ 2017-10-03 20:47 UTC (permalink / raw)
  To: Luis R. Rodriguez
  Cc: Ming Lei, viro, bart.vanassche, tytso, darrick.wong, jikos, rjw,
	pavel, len.brown, linux-fsdevel, boris.ostrovsky, jgross,
	todd.e.brandt, nborisov, jack, martin.petersen, ONeukum,
	oleksandr, oleg.b.antonyan, linux-pm, linux-block, linux-xfs,
	linux-kernel

On Tue, Oct 03, 2017 at 10:05:11PM +0200, Luis R. Rodriguez wrote:
> On Wed, Oct 04, 2017 at 03:33:01AM +0800, Ming Lei wrote:
> > On Tue, Oct 03, 2017 at 11:53:08AM -0700, Luis R. Rodriguez wrote:
> > > INFO: task kworker/u8:8:1320 blocked for more than 10 seconds.
> > >       Tainted: G            E   4.13.0-next-20170907+ #88
> > > "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> > > kworker/u8:8    D    0  1320      2 0x80000000
> > > Workqueue: events_unbound async_run_entry_fn
> > > Call Trace:
> > >  __schedule+0x2ec/0x7a0
> > >  schedule+0x36/0x80
> > >  io_schedule+0x16/0x40
> > >  get_request+0x278/0x780
> > >  ? remove_wait_queue+0x70/0x70
> > >  blk_get_request+0x9c/0x110
> > >  scsi_execute+0x7a/0x310 [scsi_mod]
> > >  sd_sync_cache+0xa3/0x190 [sd_mod]
> > >  ? blk_run_queue+0x3f/0x50
> > >  sd_suspend_common+0x7b/0x130 [sd_mod]
> > >  ? scsi_print_result+0x270/0x270 [scsi_mod]
> > >  sd_suspend_system+0x13/0x20 [sd_mod]
> > >  do_scsi_suspend+0x1b/0x30 [scsi_mod]
> > >  scsi_bus_suspend_common+0xb1/0xd0 [scsi_mod]
> > >  ? device_for_each_child+0x69/0x90
> > >  scsi_bus_suspend+0x15/0x20 [scsi_mod]
> > >  dpm_run_callback+0x56/0x140
> > >  ? scsi_bus_freeze+0x20/0x20 [scsi_mod]
> > >  __device_suspend+0xf1/0x340
> > >  async_suspend+0x1f/0xa0
> > >  async_run_entry_fn+0x38/0x160
> > >  process_one_work+0x191/0x380
> > >  worker_thread+0x4e/0x3c0
> > >  kthread+0x109/0x140
> > >  ? process_one_work+0x380/0x380
> > >  ? kthread_create_on_node+0x70/0x70
> > >  ret_from_fork+0x25/0x30
> > 
> > Actually we are trying to fix this issue inside block layer/SCSI, please
> > see the following link:
> > 
> > https://marc.info/?l=linux-scsi&m=150703947029304&w=2
> > 
> > Even though this patch can make kthread to not do I/O during
> > suspend/resume, the SCSI quiesce still can cause similar issue
> > in other case, like when sending SCSI domain validation
> > to transport_spi, which happens in revalidate path, nothing
> > to do with suspend/resume.
> 
> Are you saying that the SCSI layer can generate IO even without the filesystem
> triggering it?

The SCSI layer can send SCSI commands; they aren't I/Os in the sense that
they do reads and writes to media, but they are block requests.  Maybe those
should be allowed even to frozen devices?

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [RFC 5/5] pm: remove kernel thread freezing
  2017-10-03 20:12   ` Pavel Machek
  2017-10-03 20:15     ` Jiri Kosina
@ 2017-10-03 20:49     ` Luis R. Rodriguez
  2017-10-06 12:07       ` Pavel Machek
  1 sibling, 1 reply; 53+ messages in thread
From: Luis R. Rodriguez @ 2017-10-03 20:49 UTC (permalink / raw)
  To: Pavel Machek
  Cc: Luis R. Rodriguez, viro, bart.vanassche, ming.lei, tytso,
	darrick.wong, jikos, rjw, len.brown, linux-fsdevel,
	boris.ostrovsky, jgross, todd.e.brandt, nborisov, jack,
	martin.petersen, ONeukum, oleksandr, oleg.b.antonyan, linux-pm,
	linux-block, linux-xfs, linux-kernel

On Tue, Oct 03, 2017 at 10:12:04PM +0200, Pavel Machek wrote:
> On Tue 2017-10-03 11:53:13, Luis R. Rodriguez wrote:
> > Now that all filesystems which used to rely on kthread
> > freezing have been converted to filesystem freeze/thawing
> > we can remove the kernel kthread freezer.
> 
> Are you surely-sure? You mentioned other in kernel sources of writes;
> what about those?

You perhaps did not read the cover letter, it describes this patch will very
likely take time to *ever* apply. We must cover tons of grounds first, to
address precisely what you say.

In fact other than kthreads that generate IO we may have now even crazy stupid
kthreads using the freezer API which *do not generate IO* which are totally
bogus but now acts as "features". We'll need to carefully carve out all freezer 
API uses on all kthreads. This should be done atomically to avoid regressions
and ensure bisectability.

  Luis

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [RFC 5/5] pm: remove kernel thread freezing
  2017-10-03 20:17     ` Jiri Kosina
  2017-10-03 20:21       ` Bart Van Assche
@ 2017-10-03 20:51       ` Jiri Kosina
  1 sibling, 0 replies; 53+ messages in thread
From: Jiri Kosina @ 2017-10-03 20:51 UTC (permalink / raw)
  To: Bart Van Assche
  Cc: pavel, darrick.wong, viro, ming.lei, rjw, mcgrof, linux-fsdevel,
	len.brown, tytso, boris.ostrovsky, ONeukum, linux-block,
	linux-kernel, nborisov, oleg.b.antonyan, linux-pm, linux-xfs,
	jgross, martin.petersen, oleksandr, todd.e.brandt, jack

On Tue, 3 Oct 2017, Jiri Kosina wrote:

> > What about the many drivers outside filesystems that use the 
> > set_freezable() / try_to_freeze() / wait_event_freezable() API?
> 
> Many/most of them are just completely bogus and pointless. 

More specifically -- they don't really care at all whether they get 
scheduled out exactly at the try_to_freeze() point; they are perfectly 
happy being scheduled out at any other scheduling point, and land on 
runqueue after the resume has been completed.

Sure, certain drivers need to take action when system is undergoing 
hibernation/suspend. But that's what PM callbacks are for, not kthread 
hibernation.

-- 
Jiri Kosina
SUSE Labs

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [RFC 0/5] fs: replace kthread freezing with filesystem freeze/thaw
  2017-10-03 20:47     ` Matthew Wilcox
@ 2017-10-03 20:54       ` Luis R. Rodriguez
  2017-10-03 20:59       ` Bart Van Assche
  1 sibling, 0 replies; 53+ messages in thread
From: Luis R. Rodriguez @ 2017-10-03 20:54 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: Luis R. Rodriguez, Ming Lei, viro, bart.vanassche, tytso,
	darrick.wong, jikos, rjw, pavel, len.brown, linux-fsdevel,
	boris.ostrovsky, jgross, todd.e.brandt, nborisov, jack,
	martin.petersen, ONeukum, oleksandr, oleg.b.antonyan, linux-pm,
	linux-block, linux-xfs, linux-kernel

On Tue, Oct 03, 2017 at 01:47:55PM -0700, Matthew Wilcox wrote:
> On Tue, Oct 03, 2017 at 10:05:11PM +0200, Luis R. Rodriguez wrote:
> > On Wed, Oct 04, 2017 at 03:33:01AM +0800, Ming Lei wrote:
> > > Even though this patch can make kthread to not do I/O during
> > > suspend/resume, the SCSI quiesce still can cause similar issue
> > > in other case, like when sending SCSI domain validation
> > > to transport_spi, which happens in revalidate path, nothing
> > > to do with suspend/resume.
> > 
> > Are you saying that the SCSI layer can generate IO even without the filesystem
> > triggering it?
> 
> The SCSI layer can send SCSI commands; they aren't I/Os in the sense that
> they do reads and writes to media,

Then there should be no issue waiting for the device to respond?

> but they are block requests.  Maybe those should be allowed even to frozen
> devices?

I'm afraid *maybe* won't suffice here, we need a clear explanation as
to what happens with frozen devices on the block layer *if* the device
driver is already suspended. Does the block layer just queue these?
If the goal is to just get queued but not processed, then why even
send these quiesce commands? Wouldn't they only be processed *after*
resume?

  Luis

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [RFC 5/5] pm: remove kernel thread freezing
  2017-10-03 20:38         ` Jiri Kosina
  2017-10-03 20:41           ` Rafael J. Wysocki
@ 2017-10-03 20:57           ` Pavel Machek
  2017-10-03 21:00             ` Jiri Kosina
  1 sibling, 1 reply; 53+ messages in thread
From: Pavel Machek @ 2017-10-03 20:57 UTC (permalink / raw)
  To: Jiri Kosina
  Cc: Luis R. Rodriguez, viro, bart.vanassche, ming.lei, tytso,
	darrick.wong, rjw, len.brown, linux-fsdevel, boris.ostrovsky,
	jgross, todd.e.brandt, nborisov, jack, martin.petersen, ONeukum,
	oleksandr, oleg.b.antonyan, linux-pm, linux-block, linux-xfs,
	linux-kernel

[-- Attachment #1: Type: text/plain, Size: 753 bytes --]

On Tue 2017-10-03 22:38:33, Jiri Kosina wrote:
> On Tue, 3 Oct 2017, Pavel Machek wrote:
> 
> > > This point I don't understand. What exactly do you mean?
> > 
> > Hibernation:
> > 
> > We freeze, do system snapshot, unfreeze, and write image to
> > disk. 
> 
> Where/why do you unfreeze between creating the snapshot and writing it 
> out?

Yep, sorry, we don't unfreeze.

> Again, I agree that the (rare) kthreads that are actually "creating" new 
> I/O have to be somehow frozen and require special care.

Agreed. Was any effort made to identify those special kernel threads?
									Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 181 bytes --]

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [RFC 2/5] fs: freeze on suspend and thaw on resume
  2017-10-03 18:53 ` [RFC 2/5] fs: freeze on suspend and thaw on resume Luis R. Rodriguez
  2017-10-03 20:02   ` Bart Van Assche
  2017-10-03 20:06   ` Jiri Kosina
@ 2017-10-03 20:58   ` Dave Chinner
  2017-10-03 21:16     ` Luis R. Rodriguez
  2 siblings, 1 reply; 53+ messages in thread
From: Dave Chinner @ 2017-10-03 20:58 UTC (permalink / raw)
  To: Luis R. Rodriguez
  Cc: viro, bart.vanassche, ming.lei, tytso, darrick.wong, jikos, rjw,
	pavel, len.brown, linux-fsdevel, boris.ostrovsky, jgross,
	todd.e.brandt, nborisov, jack, martin.petersen, ONeukum,
	oleksandr, oleg.b.antonyan, linux-pm, linux-block, linux-xfs,
	linux-kernel

On Tue, Oct 03, 2017 at 11:53:10AM -0700, Luis R. Rodriguez wrote:
> This uses the existing filesystem freeze and thaw callbacks to
> freeze each filesystem on suspend/hibernation and thaw upon resume.
> 
> This is needed so that we properly really stop IO in flight without
> races after userspace has been frozen. Without this we rely on
> kthread freezing and its semantics are loose and error prone.
> For instance, even though a kthread may use try_to_freeze() and end
> up being frozen we have no way of being sure that everything that
> has been spawned asynchronously from it (such as timers) have also
> been stopped as well.
> 
> A long term advantage of also adding filesystem freeze / thawing
> supporting durign suspend / hibernation is that long term we may
> be able to eventually drop the kernel's thread freezing completely
> as it was originally added to stop disk IO in flight as we hibernate
> or suspend.
> 
> This also implies that many kthread users exist which have been
> adding freezer semantics onto its kthreads without need. These also
> will need to be reviewed later.
> 
> This is based on prior work originally by Rafael Wysocki and later by
> Jiri Kosina.
> 
> Signed-off-by: Luis R. Rodriguez <mcgrof@kernel.org>
> ---
>  fs/super.c             | 79 ++++++++++++++++++++++++++++++++++++++++++++++++++
>  include/linux/fs.h     | 13 +++++++++
>  kernel/power/process.c | 14 ++++++++-
>  3 files changed, 105 insertions(+), 1 deletion(-)
> 
> diff --git a/fs/super.c b/fs/super.c
> index d45e92d9a38f..ce8da8b187b1 100644
> --- a/fs/super.c
> +++ b/fs/super.c
> @@ -1572,3 +1572,82 @@ int thaw_super(struct super_block *sb)
>  	return 0;
>  }
>  EXPORT_SYMBOL(thaw_super);
> +
> +#ifdef CONFIG_PM_SLEEP
> +static bool super_allows_freeze(struct super_block *sb)
> +{
> +	return !!(sb->s_type->fs_flags & FS_FREEZE_ON_SUSPEND);
> +}

That's a completely misleading function name. All superblocks can be
frozen - freeze_super() is filesystem independent. And given that, I
don't see why these super_should_freeze() hoops need to be jumped
through...

> +
> +static bool super_should_freeze(struct super_block *sb)
> +{
> +	if (!sb->s_root)
> +		return false;
> +	if (!(sb->s_flags & MS_BORN))
> +		return false;
> +	/*
> +	 * We don't freeze virtual filesystems, we skip those filesystems with
> +	 * no backing device.
> +	 */
> +	if (sb->s_bdi == &noop_backing_dev_info)
> +		return false;
> +	/* No need to freeze read-only filesystems */
> +	if (sb->s_flags & MS_RDONLY)
> +		return false;
> +	if (!super_allows_freeze(sb))
> +		return false;
> +
> +	return true;
> +}

> +
> +int fs_suspend_freeze_sb(struct super_block *sb, void *priv)
> +{
> +	int error = 0;
> +
> +	spin_lock(&sb_lock);
> +	if (!super_should_freeze(sb))
> +		goto out;
> +
> +	up_read(&sb->s_umount);
> +	pr_info("%s (%s): freezing\n", sb->s_type->name, sb->s_id);
> +	error = freeze_super(sb);
> +	down_read(&sb->s_umount);
> +out:
> +	if (error && error != -EBUSY)
> +		pr_notice("%s (%s): Unable to freeze, error=%d",
> +			  sb->s_type->name, sb->s_id, error);
> +	spin_unlock(&sb_lock);
> +	return error;
> +}

I don't think this was ever tested.  Calling freeze_super() with a
spinlock held with through "sleeping in atomic" errors all over the
place.

Also, the s_umount lock juggling is nasty. Your new copy+pasted
iterate_supers_reverse() takes the lock in read mode, yet all the
freeze/thaw callers here want to take it in write mode. So, really,
iterate_supers_reverse() needs to be iterate_supers_reverse_excl()
and take the write lock, and freeze_super/thaw_super need to be
factored into locked and unlocked versions.

In which case, we end up with:

int fs_suspend_freeze_sb(struct super_block *sb, void *priv)
{
	return freeze_locked_super(sb);
}

int fs_suspend_thaw_sb(struct super_block *sb, void *priv)
{
	return thaw_locked_super(sb);
}

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [RFC 0/5] fs: replace kthread freezing with filesystem freeze/thaw
  2017-10-03 20:47     ` Matthew Wilcox
  2017-10-03 20:54       ` Luis R. Rodriguez
@ 2017-10-03 20:59       ` Bart Van Assche
  1 sibling, 0 replies; 53+ messages in thread
From: Bart Van Assche @ 2017-10-03 20:59 UTC (permalink / raw)
  To: willy, mcgrof
  Cc: boris.ostrovsky, ONeukum, linux-block, linux-kernel, nborisov,
	oleg.b.antonyan, linux-pm, linux-xfs, pavel, darrick.wong, viro,
	ming.lei, rjw, jgross, oleksandr, todd.e.brandt, martin.petersen,
	linux-fsdevel, jikos, len.brown, tytso, jack

On Tue, 2017-10-03 at 13:47 -0700, Matthew Wilcox wrote:
> The SCSI layer can send SCSI commands; they aren't I/Os in the sense that
> they do reads and writes to media, but they are block requests.  Maybe those
> should be allowed even to frozen devices?

Hello Matthew,

I think only RQF_PM requests should be processed for frozen SCSI devices
and no other requests.

Bart.

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [RFC 5/5] pm: remove kernel thread freezing
  2017-10-03 20:57           ` Pavel Machek
@ 2017-10-03 21:00             ` Jiri Kosina
  2017-10-03 21:09               ` Shuah Khan
  0 siblings, 1 reply; 53+ messages in thread
From: Jiri Kosina @ 2017-10-03 21:00 UTC (permalink / raw)
  To: Pavel Machek
  Cc: Luis R. Rodriguez, viro, bart.vanassche, ming.lei, tytso,
	darrick.wong, rjw, len.brown, linux-fsdevel, boris.ostrovsky,
	jgross, todd.e.brandt, nborisov, jack, martin.petersen, ONeukum,
	oleksandr, oleg.b.antonyan, linux-pm, linux-block, linux-xfs,
	linux-kernel

On Tue, 3 Oct 2017, Pavel Machek wrote:

> > Again, I agree that the (rare) kthreads that are actually "creating" new 
> > I/O have to be somehow frozen and require special care.
> 
> Agreed. Was any effort made to identify those special kernel threads?

I don't think there is any other way than just inspecting all the 
try_to_freeze() instances in the kernel, and understanding what that 
particular kthread is doing.

I've cleaned up most of the low-hanging fruit already, where the 
try_to_freeze() was obviously completely pointless, but a lot more time 
needs to be invested into this.

-- 
Jiri Kosina
SUSE Labs

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [RFC 5/5] pm: remove kernel thread freezing
  2017-10-03 18:53 ` [RFC 5/5] pm: remove kernel thread freezing Luis R. Rodriguez
                     ` (2 preceding siblings ...)
  2017-10-03 20:13   ` Bart Van Assche
@ 2017-10-03 21:04   ` Dave Chinner
  2017-10-03 21:07     ` Luis R. Rodriguez
  2017-10-04  6:07   ` Hannes Reinecke
  4 siblings, 1 reply; 53+ messages in thread
From: Dave Chinner @ 2017-10-03 21:04 UTC (permalink / raw)
  To: Luis R. Rodriguez
  Cc: viro, bart.vanassche, ming.lei, tytso, darrick.wong, jikos, rjw,
	pavel, len.brown, linux-fsdevel, boris.ostrovsky, jgross,
	todd.e.brandt, nborisov, jack, martin.petersen, ONeukum,
	oleksandr, oleg.b.antonyan, linux-pm, linux-block, linux-xfs,
	linux-kernel

On Tue, Oct 03, 2017 at 11:53:13AM -0700, Luis R. Rodriguez wrote:
> Now that all filesystems which used to rely on kthread
> freezing have been converted to filesystem freeze/thawing
> we can remove the kernel kthread freezer.

Really? There's no other subsystem that relies on kernel thread
and workqueue freezing to function correctly on suspend?

> -/**
> - * freeze_kernel_threads - Make freezable kernel threads go to the refrigerator.
> - *
> - * On success, returns 0.  On failure, -errno and only the kernel threads are
> - * thawed, so as to give a chance to the caller to do additional cleanups
> - * (if any) before thawing the userspace tasks. So, it is the responsibility
> - * of the caller to thaw the userspace tasks, when the time is right.
> - */
> -int freeze_kernel_threads(void)
> -{
> -	int error;
> -
> -	pr_info("Freezing remaining freezable tasks ... ");
> -
> -	pm_nosig_freezing = true;
> -	error = try_to_freeze_tasks(false);

This freezes workqueues as well as kernel threads, so this affects
any subsystem that uses WQ_FREEZABLE. A quick glance tells me this
includes graphics drivers, spi devices, usb hubs, power management,
and a few filesystems, too.

> -	if (!error)
> -		pr_cont("done.");
> -
> -	pr_cont("\n");
> -	BUG_ON(in_atomic());
> -
> -	if (error)
> -		thaw_kernel_threads();
> -	return error;
> -}
> -
>  void thaw_processes(void)
>  {
>  	struct task_struct *g, *p;
> @@ -234,23 +207,3 @@ void thaw_processes(void)
>  	pr_cont("done.\n");
>  	trace_suspend_resume(TPS("thaw_processes"), 0, false);
>  }
> -
> -void thaw_kernel_threads(void)
> -{
> -	struct task_struct *g, *p;
> -
> -	pm_nosig_freezing = false;
> -	pr_info("Restarting kernel threads ... ");
> -
> -	thaw_workqueues();

And this is where the workqueues are thawed.

So I doubt we can safely remove all this code like this...

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [RFC 5/5] pm: remove kernel thread freezing
  2017-10-03 21:04   ` Dave Chinner
@ 2017-10-03 21:07     ` Luis R. Rodriguez
  0 siblings, 0 replies; 53+ messages in thread
From: Luis R. Rodriguez @ 2017-10-03 21:07 UTC (permalink / raw)
  To: Dave Chinner
  Cc: Luis R. Rodriguez, viro, bart.vanassche, ming.lei, tytso,
	darrick.wong, jikos, rjw, pavel, len.brown, linux-fsdevel,
	boris.ostrovsky, jgross, todd.e.brandt, nborisov, jack,
	martin.petersen, ONeukum, oleksandr, oleg.b.antonyan, linux-pm,
	linux-block, linux-xfs, linux-kernel

On Wed, Oct 04, 2017 at 08:04:49AM +1100, Dave Chinner wrote:
> On Tue, Oct 03, 2017 at 11:53:13AM -0700, Luis R. Rodriguez wrote:
> > Now that all filesystems which used to rely on kthread
> > freezing have been converted to filesystem freeze/thawing
> > we can remove the kernel kthread freezer.
> 
> Really? There's no other subsystem that relies on kernel thread
> and workqueue freezing to function correctly on suspend?

On my cover letter I noted this patch should not be consideredd
until *all* kthreads are vetted. So I agree vehemently with you.
The patch set only addressed 2 filesystem, so two kthreads. Much
other work would be needed before this lat patch could *ever*
be considered.

  Luis

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [RFC 5/5] pm: remove kernel thread freezing
  2017-10-03 21:00             ` Jiri Kosina
@ 2017-10-03 21:09               ` Shuah Khan
  2017-10-03 21:18                 ` Luis R. Rodriguez
  0 siblings, 1 reply; 53+ messages in thread
From: Shuah Khan @ 2017-10-03 21:09 UTC (permalink / raw)
  To: Jiri Kosina, mchehab
  Cc: Pavel Machek, Luis R. Rodriguez, viro, bart.vanassche, ming.lei,
	tytso, darrick.wong, rjw, len.brown, linux-fsdevel,
	boris.ostrovsky, jgross, todd.e.brandt, nborisov, jack,
	martin.petersen, ONeukum, oleksandr, oleg.b.antonyan,
	Linux PM list, linux-block, linux-xfs, LKML, linux-media,
	shuahkh

On Tue, Oct 3, 2017 at 3:00 PM, Jiri Kosina <jikos@kernel.org> wrote:
> On Tue, 3 Oct 2017, Pavel Machek wrote:
>
>> > Again, I agree that the (rare) kthreads that are actually "creating" new
>> > I/O have to be somehow frozen and require special care.
>>
>> Agreed. Was any effort made to identify those special kernel threads?
>
> I don't think there is any other way than just inspecting all the
> try_to_freeze() instances in the kernel, and understanding what that
> particular kthread is doing.
>
> I've cleaned up most of the low-hanging fruit already, where the
> try_to_freeze() was obviously completely pointless, but a lot more time
> needs to be invested into this.
>

There are about 36 drivers that call try_to_freeze() and half (18 ) of
those are media drivers. Maybe it is easier handle sub-system by
sub-system basis for a review of which one of these usages could be
removed. cc'ing Mauro and linux-media

thanks,
-- Shuah

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [RFC 5/5] pm: remove kernel thread freezing
  2017-10-03 18:59   ` Rafael J. Wysocki
@ 2017-10-03 21:15     ` Rafael J. Wysocki
  2017-10-04  0:47       ` Luis R. Rodriguez
  0 siblings, 1 reply; 53+ messages in thread
From: Rafael J. Wysocki @ 2017-10-03 21:15 UTC (permalink / raw)
  To: Luis R. Rodriguez
  Cc: viro, bart.vanassche, ming.lei, tytso, darrick.wong, jikos,
	pavel, len.brown, linux-fsdevel, boris.ostrovsky, jgross,
	todd.e.brandt, nborisov, jack, martin.petersen, ONeukum,
	oleksandr, oleg.b.antonyan, linux-pm, linux-block, linux-xfs,
	linux-kernel

On Tuesday, October 3, 2017 8:59:00 PM CEST Rafael J. Wysocki wrote:
> On Tuesday, October 3, 2017 8:53:13 PM CEST Luis R. Rodriguez wrote:
> > Now that all filesystems which used to rely on kthread
> > freezing have been converted to filesystem freeze/thawing
> > we can remove the kernel kthread freezer.
> > 
> > Signed-off-by: Luis R. Rodriguez <mcgrof@kernel.org>
> 
> I like this one. :-)

However, suspend_freeze/thaw_processes() require some more work.

In particular, the freezing of workqueues is being removed here
without a replacement.

Thanks,
Rafael

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [RFC 2/5] fs: freeze on suspend and thaw on resume
  2017-10-03 20:58   ` Dave Chinner
@ 2017-10-03 21:16     ` Luis R. Rodriguez
  0 siblings, 0 replies; 53+ messages in thread
From: Luis R. Rodriguez @ 2017-10-03 21:16 UTC (permalink / raw)
  To: Dave Chinner
  Cc: Luis R. Rodriguez, viro, bart.vanassche, ming.lei, tytso,
	darrick.wong, jikos, rjw, pavel, len.brown, linux-fsdevel,
	boris.ostrovsky, jgross, todd.e.brandt, nborisov, jack,
	martin.petersen, ONeukum, oleksandr, oleg.b.antonyan, linux-pm,
	linux-block, linux-xfs, linux-kernel

On Wed, Oct 04, 2017 at 07:58:41AM +1100, Dave Chinner wrote:
> On Tue, Oct 03, 2017 at 11:53:10AM -0700, Luis R. Rodriguez wrote:
> > diff --git a/fs/super.c b/fs/super.c
> > index d45e92d9a38f..ce8da8b187b1 100644
> > --- a/fs/super.c
> > +++ b/fs/super.c
> > @@ -1572,3 +1572,82 @@ int thaw_super(struct super_block *sb)
> >  	return 0;
> >  }
> >  EXPORT_SYMBOL(thaw_super);
> > +
> > +#ifdef CONFIG_PM_SLEEP
> > +static bool super_allows_freeze(struct super_block *sb)
> > +{
> > +	return !!(sb->s_type->fs_flags & FS_FREEZE_ON_SUSPEND);
> > +}
> 
> That's a completely misleading function name. All superblocks can be
> frozen - freeze_super() is filesystem independent. And given that, I
> don't see why these super_should_freeze() hoops need to be jumped
> through...

I added this given ext4's current thaw implementation stalls on resume 
as it implicates a bio_submit() and this never completes. So I refactored
ext4 thaw skip a sync on thaw. This requires more eyeballs. This may be an
underlying issue elsewhere.  If its not an bug elsewhere or on ordering, then
there may be some restrictions on thaw when used on resume. Refer to my notes
to Ted on patch #4 for ext4.

> > +int fs_suspend_freeze_sb(struct super_block *sb, void *priv)
> > +{
> > +	int error = 0;
> > +
> > +	spin_lock(&sb_lock);
> > +	if (!super_should_freeze(sb))
> > +		goto out;
> > +
> > +	up_read(&sb->s_umount);
> > +	pr_info("%s (%s): freezing\n", sb->s_type->name, sb->s_id);
> > +	error = freeze_super(sb);
> > +	down_read(&sb->s_umount);
> > +out:
> > +	if (error && error != -EBUSY)
> > +		pr_notice("%s (%s): Unable to freeze, error=%d",
> > +			  sb->s_type->name, sb->s_id, error);
> > +	spin_unlock(&sb_lock);
> > +	return error;
> > +}
> 
> I don't think this was ever tested.  Calling freeze_super() with a
> spinlock held with through "sleeping in atomic" errors all over the
> place.

No, I run time tested it with a rootfs with ext4 and xfs with my
development tree and hammering on both while I loop on suspend
and resume.

> Also, the s_umount lock juggling is nasty. Your new copy+pasted
> iterate_supers_reverse() takes the lock in read mode, yet all the
> freeze/thaw callers here want to take it in write mode.

Yeap, yuk!

> So, really,
> iterate_supers_reverse() needs to be iterate_supers_reverse_excl()
> and take the write lock, and freeze_super/thaw_super need to be
> factored into locked and unlocked versions.
> 
> In which case, we end up with:
> 
> int fs_suspend_freeze_sb(struct super_block *sb, void *priv)
> {
> 	return freeze_locked_super(sb);
> }
> 
> int fs_suspend_thaw_sb(struct super_block *sb, void *priv)
> {
> 	return thaw_locked_super(sb);
> }

Groovy, thanks, I suspected that locking was convoluted and
we could come up with something better. Will do it this way.

Its really what I hoped we could do :) I just needed to get
it in writing.

  Luis

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [RFC 5/5] pm: remove kernel thread freezing
  2017-10-03 21:09               ` Shuah Khan
@ 2017-10-03 21:18                 ` Luis R. Rodriguez
  0 siblings, 0 replies; 53+ messages in thread
From: Luis R. Rodriguez @ 2017-10-03 21:18 UTC (permalink / raw)
  To: Shuah Khan
  Cc: Jiri Kosina, mchehab, Pavel Machek, Luis R. Rodriguez, viro,
	bart.vanassche, ming.lei, tytso, darrick.wong, rjw, len.brown,
	linux-fsdevel, boris.ostrovsky, jgross, todd.e.brandt, nborisov,
	jack, martin.petersen, ONeukum, oleksandr, oleg.b.antonyan,
	Linux PM list, linux-block, linux-xfs, LKML, linux-media,
	shuahkh

On Tue, Oct 03, 2017 at 03:09:53PM -0600, Shuah Khan wrote:
> On Tue, Oct 3, 2017 at 3:00 PM, Jiri Kosina <jikos@kernel.org> wrote:
> > On Tue, 3 Oct 2017, Pavel Machek wrote:
> >
> >> > Again, I agree that the (rare) kthreads that are actually "creating" new
> >> > I/O have to be somehow frozen and require special care.
> >>
> >> Agreed. Was any effort made to identify those special kernel threads?
> >
> > I don't think there is any other way than just inspecting all the
> > try_to_freeze() instances in the kernel, and understanding what that
> > particular kthread is doing.
> >
> > I've cleaned up most of the low-hanging fruit already, where the
> > try_to_freeze() was obviously completely pointless, but a lot more time
> > needs to be invested into this.
> >
> 
> There are about 36 drivers that call try_to_freeze() and half (18 ) of
> those are media drivers. Maybe it is easier handle sub-system by
> sub-system basis for a review of which one of these usages could be
> removed. cc'ing Mauro and linux-media

Yes :)

I guess no one reads cover letters, but indeed. To be clear, this last
patch should only go in after a few kernels from now all kthreads
are vetted for piece-meal wise.

This patch would be the nail on the kthread freezer coffin. It should
go in last, who knows how many years from now, and if ever.

  Luis

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [RFC 5/5] pm: remove kernel thread freezing
  2017-10-03 21:15     ` Rafael J. Wysocki
@ 2017-10-04  0:47       ` Luis R. Rodriguez
  2017-10-04  1:03         ` Bart Van Assche
  2017-10-04  7:18         ` Dave Chinner
  0 siblings, 2 replies; 53+ messages in thread
From: Luis R. Rodriguez @ 2017-10-04  0:47 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Luis R. Rodriguez, viro, bart.vanassche, ming.lei, tytso,
	darrick.wong, jikos, pavel, len.brown, linux-fsdevel,
	boris.ostrovsky, jgross, todd.e.brandt, nborisov, jack,
	martin.petersen, ONeukum, oleksandr, oleg.b.antonyan, linux-pm,
	linux-block, linux-xfs, linux-kernel

On Tue, Oct 03, 2017 at 11:15:07PM +0200, Rafael J. Wysocki wrote:
> On Tuesday, October 3, 2017 8:59:00 PM CEST Rafael J. Wysocki wrote:
> > On Tuesday, October 3, 2017 8:53:13 PM CEST Luis R. Rodriguez wrote:
> > > Now that all filesystems which used to rely on kthread
> > > freezing have been converted to filesystem freeze/thawing
> > > we can remove the kernel kthread freezer.
> > > 
> > > Signed-off-by: Luis R. Rodriguez <mcgrof@kernel.org>
> > 
> > I like this one. :-)
> 
> However, suspend_freeze/thaw_processes() require some more work.
> 
> In particular, the freezing of workqueues is being removed here
> without a replacement.

Hrm, where do you see that? freeze_workqueues_busy() is still called on
try_to_freeze_tasks(). Likewise thaw_processes() also calls thaw_workqueues().
I did forget to nuke pm_nosig_freezing though.

Also as I have noted a few times now we have yet to determine if we can remove
all freezer calls on kthreads without causing a regression. Granted I'm being
overly careful here, I still would not be surprised to learn about a stupid use
case where this will be hard to remove now.

Only once this is done should this patch be considered. This will take time. So
I'd like more general consensus on long term approach:

  1) first address each fs to use its freezer calls on susend/hibernate / and thaw
     on resume. While at it, remove freezer API calls on their respective
     kthreads.
  2) In parallel address removing freezer API calls on non-IO kthreads, these
     should be trivial, but who knows what surprises lurk here
  3) Lookup for kthreads which seem to generate IO -- address / review if
     removal of the freezer API can be done somehow with a quescing. This
     is currently for example being done on SCSI / md.
  4) Only after all the above is done should we consider this patch or some
     form of it.

  Luis

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [RFC 5/5] pm: remove kernel thread freezing
  2017-10-04  0:47       ` Luis R. Rodriguez
@ 2017-10-04  1:03         ` Bart Van Assche
  2017-11-29 23:05           ` Luis R. Rodriguez
  2017-10-04  7:18         ` Dave Chinner
  1 sibling, 1 reply; 53+ messages in thread
From: Bart Van Assche @ 2017-10-04  1:03 UTC (permalink / raw)
  To: rjw, mcgrof
  Cc: boris.ostrovsky, ONeukum, linux-block, linux-kernel, nborisov,
	oleg.b.antonyan, linux-pm, linux-xfs, pavel, darrick.wong, viro,
	ming.lei, jgross, oleksandr, todd.e.brandt, martin.petersen,
	linux-fsdevel, jikos, len.brown, tytso, jack

On Wed, 2017-10-04 at 02:47 +0200, Luis R. Rodriguez wrote:
>   3) Lookup for kthreads which seem to generate IO -- address / review if
>      removal of the freezer API can be done somehow with a quescing. This
>      is currently for example being done on SCSI / md.
>  4) Only after all the above is done should we consider this patch or some
>     form of it.

After having given this more thought, I think we should omit these last two
steps. Modifying the md driver such that it does not submit I/O requests while
processes are frozen requires either to use the freezer API or to open-code it.
I think there is general agreement in the kernel community that open-coding a
single mechanism in multiple drivers is wrong. Does this mean that instead of
removing the freezer API we should keep it and review all its users instead?

Bart.

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [RFC 4/5] ext4: add fs freezing support on suspend/hibernation
  2017-10-03 20:13     ` Luis R. Rodriguez
@ 2017-10-04  1:42       ` Theodore Ts'o
  2017-10-04  7:05         ` Dave Chinner
  0 siblings, 1 reply; 53+ messages in thread
From: Theodore Ts'o @ 2017-10-04  1:42 UTC (permalink / raw)
  To: Luis R. Rodriguez
  Cc: viro, bart.vanassche, ming.lei, darrick.wong, jikos, rjw, pavel,
	len.brown, linux-fsdevel, boris.ostrovsky, jgross, todd.e.brandt,
	nborisov, jack, martin.petersen, ONeukum, oleksandr,
	oleg.b.antonyan, linux-pm, linux-block, linux-xfs, linux-kernel

On Tue, Oct 03, 2017 at 10:13:21PM +0200, Luis R. Rodriguez wrote:
> > After we remove add the NEEDS_RECOVERY flag, we need to make sure
> > recovery flag is pushed out to disk before any other changes are
> > allowed to be pushed out to disk.  That's why we originally did the
> > update synchronously.
> 
> I see. I had to try to dig through linux-history to see why this was done, and
> I actually could not find an exact commit which explained what you just did.
> Thanks!
> 
> Hrm, so on freeze we issue the same commit as well, so is a sync *really* needed
> on thaw?

So let me explain what's going on.  When we do a freeze, we make sure
all of the blocks that are written to the journal are writen to the
final location on disk, and the journal is is truncated.  (This is
called a "journal checkpoint".)  Then we clear the NEEDS RECOVERY
feature flag and set the EXT4_VALID_FS flags in the superblock.  In
the no journal case, we flush out all metadata writes, and we set the
EXT4_VALID_FS flag.  In both cases, we call ext4_commit_super(sb, 1)
to make sure the flags update is pushed out to disk.  This puts the
file system into a "quiscent" state; in fact, it looks like the file
system has been unmounted, so that it becomes safe to run
dump/restore, run fsck -n on the file system, etc.

The problem on the thaw side is that we need to make sure that
EXT4_VALID_FS flag is removed (and if journaling is enabled, the NEEDS
RECOVERY feature flag is set) and the superblock is flushed out to the
storage device before any other writes are persisted on the disk.  In
the case where we have journalling enabled, we could wait until the
first journal commit to write out the superblock, since in journal
mode all metadata updates to the disk are suppressed until the journal
commit.  We don't do that today, but it is a change we could make.

However, in the no journal node we need to make sure the EXT4_VALID_FS
flag is cleared on disk before any other metadata operations can take
place, and calling ext4_commit_super(sb, 1) is the only real way to do
that.

> No, it was am empirical evaluation done at testing, I observed bio_submit()
> stalls, we never get completion from the device. Even if we call thaw at the
> very end of resume, after the devices should be up, we still end up in the same
> situation. Given how I order freezing filesystems after freezing userspace I do
> believe we should thaw filesystems prior unfreezing userspace, its why I placed
> the call where it is now.

So when we call ext4_commit_super(sb, 1), we issue the bio, and then
we block waiting for the I/O to complete.  That's a blocking call.  Is
the thaw context one which allows blocking?  If userspace is still
frozen, does that imply that the scheduler isn't allow to run?  If
that's the case, then that's probably the problem.

More generally, if the thawing code needs to allocate memory, or do
any number of things that could potentially block, this could
potentially be an issue.  Or if we have a network block device, or
something else in the storage stack that needs to run a kernel thread
context (or a workqueue, etc.) --- is the fact that userspace is
frozen mean the scheduler is going to refuse to schedule()?

I know at one point we made a distinction between freezing userspace
threads and kernel threads, but were there people who didn't like this
because of unnecessary complexity.  It sounds to me like on the thaw
side, we might also need to unfreeze kernel threads first, and then
allow userspace threads to run.  Do we do that today, or in the new
freeze/thaw code?  It's been quite a while since I've looked at that
part of the kernel.

					- Ted
					

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [RFC 5/5] pm: remove kernel thread freezing
  2017-10-03 18:53 ` [RFC 5/5] pm: remove kernel thread freezing Luis R. Rodriguez
                     ` (3 preceding siblings ...)
  2017-10-03 21:04   ` Dave Chinner
@ 2017-10-04  6:07   ` Hannes Reinecke
  4 siblings, 0 replies; 53+ messages in thread
From: Hannes Reinecke @ 2017-10-04  6:07 UTC (permalink / raw)
  To: Luis R. Rodriguez, viro, bart.vanassche, ming.lei, tytso,
	darrick.wong, jikos, rjw, pavel, len.brown, linux-fsdevel
  Cc: boris.ostrovsky, jgross, todd.e.brandt, nborisov, jack,
	martin.petersen, ONeukum, oleksandr, oleg.b.antonyan, linux-pm,
	linux-block, linux-xfs, linux-kernel

On 10/03/2017 08:53 PM, Luis R. Rodriguez wrote:
> Now that all filesystems which used to rely on kthread
> freezing have been converted to filesystem freeze/thawing
> we can remove the kernel kthread freezer.
> 
> Signed-off-by: Luis R. Rodriguez <mcgrof@kernel.org>
> ---
>  Documentation/power/freezing-of-tasks.txt |  9 ------
>  drivers/xen/manage.c                      |  6 ----
>  include/linux/freezer.h                   |  4 ---
>  kernel/power/hibernate.c                  | 10 ++-----
>  kernel/power/power.h                      | 20 +------------
>  kernel/power/process.c                    | 47 -------------------------------
>  kernel/power/user.c                       |  9 ------
>  tools/power/pm-graph/analyze_suspend.py   |  1 -
>  8 files changed, 3 insertions(+), 103 deletions(-)
> 
Weelll ... have you checked MD?
It's using kernel threads, which probably should be stopped, too, when
going into suspend. After all, MD might be doing on of it's internal
operations (eg resync) at that time, which will generate quite some I/O.

Cheers,

Hannes
-- 
Dr. Hannes Reinecke		   Teamlead Storage & Networking
hare@suse.de			               +49 911 74053 688
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: F. Imendörffer, J. Smithard, J. Guild, D. Upmanyu, G. Norton
HRB 21284 (AG Nürnberg)

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [RFC 4/5] ext4: add fs freezing support on suspend/hibernation
  2017-10-04  1:42       ` Theodore Ts'o
@ 2017-10-04  7:05         ` Dave Chinner
  2017-10-04 15:25           ` Bart Van Assche
  2017-10-04 16:48           ` Theodore Ts'o
  0 siblings, 2 replies; 53+ messages in thread
From: Dave Chinner @ 2017-10-04  7:05 UTC (permalink / raw)
  To: Theodore Ts'o, Luis R. Rodriguez, viro, bart.vanassche,
	ming.lei, darrick.wong, jikos, rjw, pavel, len.brown,
	linux-fsdevel, boris.ostrovsky, jgross, todd.e.brandt, nborisov,
	jack, martin.petersen, ONeukum, oleksandr, oleg.b.antonyan,
	linux-pm, linux-block, linux-xfs, linux-kernel

On Tue, Oct 03, 2017 at 09:42:04PM -0400, Theodore Ts'o wrote:
> On Tue, Oct 03, 2017 at 10:13:21PM +0200, Luis R. Rodriguez wrote:
> > > After we remove add the NEEDS_RECOVERY flag, we need to make sure
> > > recovery flag is pushed out to disk before any other changes are
> > > allowed to be pushed out to disk.  That's why we originally did the
> > > update synchronously.
> > 
> > I see. I had to try to dig through linux-history to see why this was done, and
> > I actually could not find an exact commit which explained what you just did.
> > Thanks!
> > 
> > Hrm, so on freeze we issue the same commit as well, so is a sync *really* needed
> > on thaw?
> 
> So let me explain what's going on.  When we do a freeze, we make sure
> all of the blocks that are written to the journal are writen to the
> final location on disk, and the journal is is truncated.  (This is
> called a "journal checkpoint".)  Then we clear the NEEDS RECOVERY
> feature flag and set the EXT4_VALID_FS flags in the superblock.  In
> the no journal case, we flush out all metadata writes, and we set the
> EXT4_VALID_FS flag.  In both cases, we call ext4_commit_super(sb, 1)
> to make sure the flags update is pushed out to disk.  This puts the
> file system into a "quiscent" state; in fact, it looks like the file
> system has been unmounted, so that it becomes safe to run
> dump/restore, run fsck -n on the file system, etc.
> 
> The problem on the thaw side is that we need to make sure that
> EXT4_VALID_FS flag is removed (and if journaling is enabled, the NEEDS
> RECOVERY feature flag is set) and the superblock is flushed out to the
> storage device before any other writes are persisted on the disk.  In
> the case where we have journalling enabled, we could wait until the
> first journal commit to write out the superblock, since in journal
> mode all metadata updates to the disk are suppressed until the journal
> commit.  We don't do that today, but it is a change we could make.
> 
> However, in the no journal node we need to make sure the EXT4_VALID_FS
> flag is cleared on disk before any other metadata operations can take
> place, and calling ext4_commit_super(sb, 1) is the only real way to do
> that.
> 
> > No, it was am empirical evaluation done at testing, I observed bio_submit()
> > stalls, we never get completion from the device. Even if we call thaw at the
> > very end of resume, after the devices should be up, we still end up in the same
> > situation. Given how I order freezing filesystems after freezing userspace I do
> > believe we should thaw filesystems prior unfreezing userspace, its why I placed
> > the call where it is now.
> 
> So when we call ext4_commit_super(sb, 1), we issue the bio, and then
> we block waiting for the I/O to complete.  That's a blocking call.  Is
> the thaw context one which allows blocking?

thaw_super() does down_write(), so it *must* be able to sleep.

> If userspace is still
> frozen, does that imply that the scheduler isn't allow to run?  If
> that's the case, then that's probably the problem.
> 
> More generally, if the thawing code needs to allocate memory, or do

Which is does. xfs_fs_unfreeze() queues work to a workqueue, so
there's memory allocation there.

> any number of things that could potentially block, this could
> potentially be an issue.

yup, gfs2_unfreeze does even more stuff - it releases glocks which
may end up queuing work, waking other threads and freeing stuff via
call_rcu()...

Basically, before thawing filesystems the rest of the kernel
infrastructure needs to have been restarted. i.e. the order
needs to be:

freeze userspace
freeze filesystems
freeze kernel threads
freeze workqueues

thaw workqueues
thaw kernel threads
thaw filesystems
thaw userspace

and it should end up that way.

> Or if we have a network block device, or
> something else in the storage stack that needs to run a kernel thread
> context (or a workqueue, etc.) --- is the fact that userspace is
> frozen mean the scheduler is going to refuse to schedule()?

No.

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [RFC 5/5] pm: remove kernel thread freezing
  2017-10-04  0:47       ` Luis R. Rodriguez
  2017-10-04  1:03         ` Bart Van Assche
@ 2017-10-04  7:18         ` Dave Chinner
  1 sibling, 0 replies; 53+ messages in thread
From: Dave Chinner @ 2017-10-04  7:18 UTC (permalink / raw)
  To: Luis R. Rodriguez
  Cc: Rafael J. Wysocki, viro, bart.vanassche, ming.lei, tytso,
	darrick.wong, jikos, pavel, len.brown, linux-fsdevel,
	boris.ostrovsky, jgross, todd.e.brandt, nborisov, jack,
	martin.petersen, ONeukum, oleksandr, oleg.b.antonyan, linux-pm,
	linux-block, linux-xfs, linux-kernel

On Wed, Oct 04, 2017 at 02:47:56AM +0200, Luis R. Rodriguez wrote:
> On Tue, Oct 03, 2017 at 11:15:07PM +0200, Rafael J. Wysocki wrote:
> > On Tuesday, October 3, 2017 8:59:00 PM CEST Rafael J. Wysocki wrote:
> > > On Tuesday, October 3, 2017 8:53:13 PM CEST Luis R. Rodriguez wrote:
> > > > Now that all filesystems which used to rely on kthread
> > > > freezing have been converted to filesystem freeze/thawing
> > > > we can remove the kernel kthread freezer.
> > > > 
> > > > Signed-off-by: Luis R. Rodriguez <mcgrof@kernel.org>
> > > 
> > > I like this one. :-)
> > 
> > However, suspend_freeze/thaw_processes() require some more work.
> > 
> > In particular, the freezing of workqueues is being removed here
> > without a replacement.
> 
> Hrm, where do you see that? freeze_workqueues_busy() is still called on
> try_to_freeze_tasks(). Likewise thaw_processes() also calls thaw_workqueues().
> I did forget to nuke pm_nosig_freezing though.

static int try_to_freeze_tasks(bool user_only)
{
.....
        if (!user_only)
		freeze_workqueues_begin();
.....
                if (!user_only) {
			wq_busy = freeze_workqueues_busy();
.....
}

i.e. only try_to_freeze_tasks(false) will freeze workqueues, and the
only function that calls that - freeze_kernel_threads() - is removed
by this patch.

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [RFC 4/5] ext4: add fs freezing support on suspend/hibernation
  2017-10-04  7:05         ` Dave Chinner
@ 2017-10-04 15:25           ` Bart Van Assche
  2017-10-04 16:48           ` Theodore Ts'o
  1 sibling, 0 replies; 53+ messages in thread
From: Bart Van Assche @ 2017-10-04 15:25 UTC (permalink / raw)
  To: boris.ostrovsky, ONeukum, linux-block, linux-kernel, nborisov,
	oleg.b.antonyan, linux-pm, linux-xfs, pavel, darrick.wong, viro,
	ming.lei, rjw, jgross, oleksandr, todd.e.brandt, martin.petersen,
	mcgrof, linux-fsdevel, jikos, len.brown, tytso, david, jack

On Wed, 2017-10-04 at 18:05 +1100, Dave Chinner wrote:
> Basically, before thawing filesystems the rest of the kernel
> infrastructure needs to have been restarted. i.e. the order
> needs to be:
> 
> freeze userspace
> freeze filesystems
> freeze kernel threads
> freeze workqueues
> 
> thaw workqueues
> thaw kernel threads
> thaw filesystems
> thaw userspace
> 
> and it should end up that way.

Does this order support filesystems that have been implemented in user space,
e.g. filesystems based on FUSE?

Thanks,

Bart.

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [RFC 0/5] fs: replace kthread freezing with filesystem freeze/thaw
  2017-10-03 20:05   ` Luis R. Rodriguez
  2017-10-03 20:47     ` Matthew Wilcox
@ 2017-10-04 15:43     ` Ming Lei
  1 sibling, 0 replies; 53+ messages in thread
From: Ming Lei @ 2017-10-04 15:43 UTC (permalink / raw)
  To: Luis R. Rodriguez
  Cc: viro, bart.vanassche, tytso, darrick.wong, jikos, rjw, pavel,
	len.brown, linux-fsdevel, boris.ostrovsky, jgross, todd.e.brandt,
	nborisov, jack, martin.petersen, ONeukum, oleksandr,
	oleg.b.antonyan, linux-pm, linux-block, linux-xfs, linux-kernel

On Tue, Oct 03, 2017 at 10:05:11PM +0200, Luis R. Rodriguez wrote:
> On Wed, Oct 04, 2017 at 03:33:01AM +0800, Ming Lei wrote:
> > On Tue, Oct 03, 2017 at 11:53:08AM -0700, Luis R. Rodriguez wrote:
> > > INFO: task kworker/u8:8:1320 blocked for more than 10 seconds.
> > >       Tainted: G            E   4.13.0-next-20170907+ #88
> > > "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> > > kworker/u8:8    D    0  1320      2 0x80000000
> > > Workqueue: events_unbound async_run_entry_fn
> > > Call Trace:
> > >  __schedule+0x2ec/0x7a0
> > >  schedule+0x36/0x80
> > >  io_schedule+0x16/0x40
> > >  get_request+0x278/0x780
> > >  ? remove_wait_queue+0x70/0x70
> > >  blk_get_request+0x9c/0x110
> > >  scsi_execute+0x7a/0x310 [scsi_mod]
> > >  sd_sync_cache+0xa3/0x190 [sd_mod]
> > >  ? blk_run_queue+0x3f/0x50
> > >  sd_suspend_common+0x7b/0x130 [sd_mod]
> > >  ? scsi_print_result+0x270/0x270 [scsi_mod]
> > >  sd_suspend_system+0x13/0x20 [sd_mod]
> > >  do_scsi_suspend+0x1b/0x30 [scsi_mod]
> > >  scsi_bus_suspend_common+0xb1/0xd0 [scsi_mod]
> > >  ? device_for_each_child+0x69/0x90
> > >  scsi_bus_suspend+0x15/0x20 [scsi_mod]
> > >  dpm_run_callback+0x56/0x140
> > >  ? scsi_bus_freeze+0x20/0x20 [scsi_mod]
> > >  __device_suspend+0xf1/0x340
> > >  async_suspend+0x1f/0xa0
> > >  async_run_entry_fn+0x38/0x160
> > >  process_one_work+0x191/0x380
> > >  worker_thread+0x4e/0x3c0
> > >  kthread+0x109/0x140
> > >  ? process_one_work+0x380/0x380
> > >  ? kthread_create_on_node+0x70/0x70
> > >  ret_from_fork+0x25/0x30
> > 
> > Actually we are trying to fix this issue inside block layer/SCSI, please
> > see the following link:
> > 
> > https://marc.info/?l=linux-scsi&m=150703947029304&w=2
> > 
> > Even though this patch can make kthread to not do I/O during
> > suspend/resume, the SCSI quiesce still can cause similar issue
> > in other case, like when sending SCSI domain validation
> > to transport_spi, which happens in revalidate path, nothing
> > to do with suspend/resume.
> 
> Are you saying that the SCSI layer can generate IO even without the filesystem
> triggering it?

Yes, such as sg_io, in case of transport_spi, actually with SCSI
quiesced involved in the revalidate path, not related with PM.

> 
> If so then by all means these are certainly other areas we should address
> quiescing as I noted in my email.
> 
> Also, *iff* the generated IO is triggered on the SCSI suspend callback, then
> clearly the next question is if this is truly needed. If so then yes, it
> should be quiesced and all restrictions should be considered.
> 
> Note that device pm ops get called first, then later the notifiers are
> processed, and only later is userspace frozen. Its this gap this patch
> set addresses, and its also where where I saw the issue creep in. Depending on
> the questions above we may or not need more work in other layers.
> 
> So I am not saying this patch set is sufficient to address all IO quiescing,
> quite the contrary I acknowledged that each subsystem should vet if they have
> non-FS generated IO (seems you and Bart are doing  great job at doing this
> analysis on SCSI). This patchset however should help with odd corner cases
> which *are* triggered by the FS and the spaghetti code requirements of the
> kthread freezing clearly does not suffice.

Could you share us a bit what the odd corner case is?

> 
> > So IMO the root cause is in SCSI's quiesce.
> > 
> > You can find the similar description in above link:
> > 
> > 	Once SCSI device is put into QUIESCE, no new request except for
> > 	RQF_PREEMPT can be dispatched to SCSI successfully, and
> > 	scsi_device_quiesce() just simply waits for completion of I/Os
> > 	dispatched to SCSI stack. It isn't enough at all.
> 
> I see so the race here is *on* the pm ops of SCSI we have generated IO
> to QUIESCE.
> 
> > 
> > 	Because new request still can be coming, but all the allocated
> > 	requests can't be dispatched successfully, so request pool can be
> > 	consumed up easily. Then RQF_PREEMPT can't be allocated, and
> > 	hang forever, just like the stack trace you posted.
> > 
> 
> I see. Makes sense. So SCSI quiesce has restrictions and they're being
> violated.
> 
> Anyway, don't think of this as a replacement for yours or Bart's work then, but
> rather supplemental.
> 
> Are you saying we should not move forward with this patch set, or simply that
> the above splat is rather properly fixed with SCSI quiescing? Given you're
> explanation I'd have to agree. But even with this considered and accepted, from
> a theoretical perspective -- why would this patch set actually seem to fix the
> same issue? Is it, that it just *seems* to fix it?

Actually it is just because you posted out the very same stack trace,
and I am pretty sure that is caused by SCSI quiesce vs. RQF_PREEMPT.

Also IMO, SCSI quiesce vs. RQF_PREEMPT is one specific case wrt.
IO hang, and maybe there isn't same case on other disks. If that is
true, even without any change in kthread freeze, the patchset of
'making SCSI quiesce safe' should be enough for avoiding IO hang
in PM suspend/resume.

But I still don't understand your real motivation of this patchset
completely yet, is it only for avoiding I/O hang? Or is there other
purposes?  Looks I need to dig into more the patches.

-- 
Ming

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [RFC 4/5] ext4: add fs freezing support on suspend/hibernation
  2017-10-04  7:05         ` Dave Chinner
  2017-10-04 15:25           ` Bart Van Assche
@ 2017-10-04 16:48           ` Theodore Ts'o
  2017-10-04 22:22             ` Dave Chinner
  1 sibling, 1 reply; 53+ messages in thread
From: Theodore Ts'o @ 2017-10-04 16:48 UTC (permalink / raw)
  To: Dave Chinner
  Cc: Luis R. Rodriguez, viro, bart.vanassche, ming.lei, darrick.wong,
	jikos, rjw, pavel, len.brown, linux-fsdevel, boris.ostrovsky,
	jgross, todd.e.brandt, nborisov, jack, martin.petersen, ONeukum,
	oleksandr, oleg.b.antonyan, linux-pm, linux-block, linux-xfs,
	linux-kernel

On Wed, Oct 04, 2017 at 06:05:23PM +1100, Dave Chinner wrote:
> Basically, before thawing filesystems the rest of the kernel
> infrastructure needs to have been restarted. i.e. the order
> needs to be:
> 
> freeze userspace
> freeze filesystems
> freeze kernel threads
> freeze workqueues
> 
> thaw workqueues
> thaw kernel threads
> thaw filesystems
> thaw userspace
> 
> and it should end up that way.
> 
> > Or if we have a network block device, or
> > something else in the storage stack that needs to run a kernel thread
> > context (or a workqueue, etc.) --- is the fact that userspace is
> > frozen mean the scheduler is going to refuse to schedule()?
> 
> No.

Well, that's what the answer *should* be.  I was asking what this
patch series does, and given that Luis reported that with this patch
series ext4_commit_super(sb, 1) is hanging, I have my suspicions about
what the answer might be with this patch set.  (Especially since the
claimed goal of the patch set is, "kthread freezing with filesystem
freeze/thaw".

Cheers,

						- Ted

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [RFC 4/5] ext4: add fs freezing support on suspend/hibernation
  2017-10-04 16:48           ` Theodore Ts'o
@ 2017-10-04 22:22             ` Dave Chinner
  0 siblings, 0 replies; 53+ messages in thread
From: Dave Chinner @ 2017-10-04 22:22 UTC (permalink / raw)
  To: Theodore Ts'o, Luis R. Rodriguez, viro, bart.vanassche,
	ming.lei, darrick.wong, jikos, rjw, pavel, len.brown,
	linux-fsdevel, boris.ostrovsky, jgross, todd.e.brandt, nborisov,
	jack, martin.petersen, ONeukum, oleksandr, oleg.b.antonyan,
	linux-pm, linux-block, linux-xfs, linux-kernel

On Wed, Oct 04, 2017 at 12:48:39PM -0400, Theodore Ts'o wrote:
> On Wed, Oct 04, 2017 at 06:05:23PM +1100, Dave Chinner wrote:
> > Basically, before thawing filesystems the rest of the kernel
> > infrastructure needs to have been restarted. i.e. the order
> > needs to be:
> > 
> > freeze userspace
> > freeze filesystems
> > freeze kernel threads
> > freeze workqueues
> > 
> > thaw workqueues
> > thaw kernel threads
> > thaw filesystems
> > thaw userspace
> > 
> > and it should end up that way.
> > 
> > > Or if we have a network block device, or
> > > something else in the storage stack that needs to run a kernel thread
> > > context (or a workqueue, etc.) --- is the fact that userspace is
> > > frozen mean the scheduler is going to refuse to schedule()?
> > 
> > No.
> 
> Well, that's what the answer *should* be.  I was asking what this
> patch series does, and given that Luis reported that with this patch
> series ext4_commit_super(sb, 1) is hanging, I have my suspicions about
> what the answer might be with this patch set.  (Especially since the
> claimed goal of the patch set is, "kthread freezing with filesystem
> freeze/thaw".

There are know bugs in the patchset w.r.t.  workqueues and kernel
threads, and IO completion requires workqueues and kernel threads to
be running correctly. Hence filesystem thaw needs to occur after
workqueues and kernel threads are thawed.

The existing code has this assumption - that filesystem will start
working again the moment workqueues and kernel threads are thawed,
but trying to do IO before that will not work. So it's the same
with freeze/thaw of filesystems - thaw of filesystems will not work
if the rest of the kernel machinery is still frozen...

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [RFC 5/5] pm: remove kernel thread freezing
  2017-10-03 20:49     ` Luis R. Rodriguez
@ 2017-10-06 12:07       ` Pavel Machek
  2017-10-06 12:54         ` Theodore Ts'o
  0 siblings, 1 reply; 53+ messages in thread
From: Pavel Machek @ 2017-10-06 12:07 UTC (permalink / raw)
  To: Luis R. Rodriguez
  Cc: viro, bart.vanassche, ming.lei, tytso, darrick.wong, jikos, rjw,
	len.brown, linux-fsdevel, boris.ostrovsky, jgross, todd.e.brandt,
	nborisov, jack, martin.petersen, ONeukum, oleksandr,
	oleg.b.antonyan, linux-pm, linux-block, linux-xfs, linux-kernel

[-- Attachment #1: Type: text/plain, Size: 1092 bytes --]

On Tue 2017-10-03 22:49:40, Luis R. Rodriguez wrote:
> On Tue, Oct 03, 2017 at 10:12:04PM +0200, Pavel Machek wrote:
> > On Tue 2017-10-03 11:53:13, Luis R. Rodriguez wrote:
> > > Now that all filesystems which used to rely on kthread
> > > freezing have been converted to filesystem freeze/thawing
> > > we can remove the kernel kthread freezer.
> > 
> > Are you surely-sure? You mentioned other in kernel sources of writes;
> > what about those?
> 
> You perhaps did not read the cover letter, it describes this patch will very
> likely take time to *ever* apply. We must cover tons of grounds first, to
> address precisely what you say.

Yeah, I was not careful enough reading cover letter. Having series
where 1-4/5 are ready to go, and 5/5 not-good-idea for years to come
is quite confusing.

Really relevant parts of cover letter should be moved to 5/5 and it
should be clearly marked as "not for application".
									Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 181 bytes --]

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [RFC 5/5] pm: remove kernel thread freezing
  2017-10-06 12:07       ` Pavel Machek
@ 2017-10-06 12:54         ` Theodore Ts'o
  0 siblings, 0 replies; 53+ messages in thread
From: Theodore Ts'o @ 2017-10-06 12:54 UTC (permalink / raw)
  To: Pavel Machek
  Cc: Luis R. Rodriguez, viro, bart.vanassche, ming.lei, darrick.wong,
	jikos, rjw, len.brown, linux-fsdevel, boris.ostrovsky, jgross,
	todd.e.brandt, nborisov, jack, martin.petersen, ONeukum,
	oleksandr, oleg.b.antonyan, linux-pm, linux-block, linux-xfs,
	linux-kernel

On Fri, Oct 06, 2017 at 02:07:13PM +0200, Pavel Machek wrote:
> 
> Yeah, I was not careful enough reading cover letter. Having series
> where 1-4/5 are ready to go, and 5/5 not-good-idea for years to come
> is quite confusing.

4/5 is not ready to go either, at the very least....

       	   	       	       	  - Ted

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [RFC 5/5] pm: remove kernel thread freezing
  2017-10-04  1:03         ` Bart Van Assche
@ 2017-11-29 23:05           ` Luis R. Rodriguez
  0 siblings, 0 replies; 53+ messages in thread
From: Luis R. Rodriguez @ 2017-11-29 23:05 UTC (permalink / raw)
  To: Bart Van Assche
  Cc: rjw, mcgrof, boris.ostrovsky, ONeukum, linux-block, linux-kernel,
	nborisov, oleg.b.antonyan, linux-pm, linux-xfs, pavel,
	darrick.wong, viro, ming.lei, jgross, oleksandr, todd.e.brandt,
	martin.petersen, linux-fsdevel, jikos, len.brown, tytso, jack

On Wed, Oct 04, 2017 at 01:03:54AM +0000, Bart Van Assche wrote:
> On Wed, 2017-10-04 at 02:47 +0200, Luis R. Rodriguez wrote:
> >   3) Lookup for kthreads which seem to generate IO -- address / review if
> >      removal of the freezer API can be done somehow with a quescing. This
> >      is currently for example being done on SCSI / md.
> >  4) Only after all the above is done should we consider this patch or some
> >     form of it.
> 
> After having given this more thought, I think we should omit these last two
> steps. Modifying the md driver such that it does not submit I/O requests while
> processes are frozen requires either to use the freezer API or to open-code it.
> I think there is general agreement in the kernel community that open-coding a
> single mechanism in multiple drivers is wrong. Does this mean that instead of
> removing the freezer API we should keep it and review all its users instead?

Agreed.

  Luis

^ permalink raw reply	[flat|nested] 53+ messages in thread

end of thread, other threads:[~2017-11-29 23:05 UTC | newest]

Thread overview: 53+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-10-03 18:53 [RFC 0/5] fs: replace kthread freezing with filesystem freeze/thaw Luis R. Rodriguez
2017-10-03 18:53 ` [RFC 1/5] fs: add iterate_supers_reverse() Luis R. Rodriguez
2017-10-03 18:53 ` [RFC 2/5] fs: freeze on suspend and thaw on resume Luis R. Rodriguez
2017-10-03 20:02   ` Bart Van Assche
2017-10-03 20:23     ` Luis R. Rodriguez
2017-10-03 20:32       ` Bart Van Assche
2017-10-03 20:39         ` Luis R. Rodriguez
2017-10-03 20:06   ` Jiri Kosina
2017-10-03 20:58   ` Dave Chinner
2017-10-03 21:16     ` Luis R. Rodriguez
2017-10-03 18:53 ` [RFC 3/5] xfs: allow fs freeze on suspend/hibernation Luis R. Rodriguez
2017-10-03 18:53 ` [RFC 4/5] ext4: add fs freezing support " Luis R. Rodriguez
2017-10-03 19:59   ` Theodore Ts'o
2017-10-03 20:13     ` Luis R. Rodriguez
2017-10-04  1:42       ` Theodore Ts'o
2017-10-04  7:05         ` Dave Chinner
2017-10-04 15:25           ` Bart Van Assche
2017-10-04 16:48           ` Theodore Ts'o
2017-10-04 22:22             ` Dave Chinner
2017-10-03 18:53 ` [RFC 5/5] pm: remove kernel thread freezing Luis R. Rodriguez
2017-10-03 18:59   ` Rafael J. Wysocki
2017-10-03 21:15     ` Rafael J. Wysocki
2017-10-04  0:47       ` Luis R. Rodriguez
2017-10-04  1:03         ` Bart Van Assche
2017-11-29 23:05           ` Luis R. Rodriguez
2017-10-04  7:18         ` Dave Chinner
2017-10-03 20:12   ` Pavel Machek
2017-10-03 20:15     ` Jiri Kosina
2017-10-03 20:21       ` Pavel Machek
2017-10-03 20:38         ` Jiri Kosina
2017-10-03 20:41           ` Rafael J. Wysocki
2017-10-03 20:57           ` Pavel Machek
2017-10-03 21:00             ` Jiri Kosina
2017-10-03 21:09               ` Shuah Khan
2017-10-03 21:18                 ` Luis R. Rodriguez
2017-10-03 20:49     ` Luis R. Rodriguez
2017-10-06 12:07       ` Pavel Machek
2017-10-06 12:54         ` Theodore Ts'o
2017-10-03 20:13   ` Bart Van Assche
2017-10-03 20:17     ` Jiri Kosina
2017-10-03 20:21       ` Bart Van Assche
2017-10-03 20:24         ` Jiri Kosina
2017-10-03 20:27         ` Luis R. Rodriguez
2017-10-03 20:51       ` Jiri Kosina
2017-10-03 21:04   ` Dave Chinner
2017-10-03 21:07     ` Luis R. Rodriguez
2017-10-04  6:07   ` Hannes Reinecke
2017-10-03 19:33 ` [RFC 0/5] fs: replace kthread freezing with filesystem freeze/thaw Ming Lei
2017-10-03 20:05   ` Luis R. Rodriguez
2017-10-03 20:47     ` Matthew Wilcox
2017-10-03 20:54       ` Luis R. Rodriguez
2017-10-03 20:59       ` Bart Van Assche
2017-10-04 15:43     ` Ming Lei

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).