All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0/2][RFC v3] PM / hibernate: Introduce snapshot test mode for hibernation
@ 2016-07-14 10:45 Chen Yu
  2016-07-14 10:45 ` [PATCH 1/2][RFC " Chen Yu
  2016-07-14 10:45 ` [PATCH 2/2][RFC v3] PM / Documentation: Add description for snapshot test mode Chen Yu
  0 siblings, 2 replies; 4+ messages in thread
From: Chen Yu @ 2016-07-14 10:45 UTC (permalink / raw)
  To: linux-pm
  Cc: Rafael J. Wysocki, Pavel Machek, Len Brown, linux-kernel, Chen Yu

Sometimes we need to debug during hibernation resume stage, and
it would be convenient for the user to test if the snapshot stored
in the disk can be successfully restored without BIOSes or system
re-initialization involved in. So this patch has introduced a snapshot
test mode for this purpose.

Chen Yu (2):
  PM / hibernate: Introduce snapshot test mode for hibernation
  PM / Documentation: Add description for snapshot test mode

 Documentation/power/basic-pm-debugging.txt | 15 +++++++--
 kernel/power/hibernate.c                   | 49 +++++++++++++++++++++++++++++-
 kernel/power/main.c                        |  3 ++
 kernel/power/power.h                       |  3 ++
 kernel/power/suspend.c                     |  7 +++++
 kernel/power/swap.c                        |  7 +++++
 6 files changed, 80 insertions(+), 4 deletions(-)

-- 
2.7.4

^ permalink raw reply	[flat|nested] 4+ messages in thread

* [PATCH 1/2][RFC v3] PM / hibernate: Introduce snapshot test mode for hibernation
  2016-07-14 10:45 [PATCH 0/2][RFC v3] PM / hibernate: Introduce snapshot test mode for hibernation Chen Yu
@ 2016-07-14 10:45 ` Chen Yu
  2016-07-14 11:09   ` Chen Yu
  2016-07-14 10:45 ` [PATCH 2/2][RFC v3] PM / Documentation: Add description for snapshot test mode Chen Yu
  1 sibling, 1 reply; 4+ messages in thread
From: Chen Yu @ 2016-07-14 10:45 UTC (permalink / raw)
  To: linux-pm
  Cc: Rafael J. Wysocki, Pavel Machek, Len Brown, linux-kernel, Chen Yu

This mode is used to verify if the snapshot data written to
the swap device can be successfully restored to the memory. It
is useful to ease the debugging process on hibernation,
since this mode can not only bypass the BIOSes/bootloader,
but also the system re-initialization.

For example:

echo snapshot > /sys/power/pm_test
echo disk > /sys/power/state

[  267.959784] PM: Image saving progress:  80%
[  268.038669] PM: Image saving progress:  90%
[  268.111745] PM: Image saving progress: 100%
[  268.129269] PM: Image saving done.
[  268.133485] PM: Wrote 518612 kbytes in 0.75 seconds (691.48 MB/s)
[  268.140564] PM: S|
[  268.160067] hibernation debug: Waiting for 5 seconds.
...
[  273.508638] PM: Looking for hibernation image.
[  273.516583] PM: Image signature found, resuming
[  273.926547] PM: Loading and decompressing image data (129653 pages)...
[  274.122452] PM: Image loading progress:   0%
[  274.322127] PM: Image loading progress:  10%
...

Suggested-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Chen Yu <yu.c.chen@intel.com>
---
v3:
 - According to the discussion, there was a potential risk in previous
   version that might break the filesystem. This version avoids that issue
   by restoring the pages with user/kernel threads kept in frozen.
   Also updated the document.
---
 kernel/power/hibernate.c | 49 +++++++++++++++++++++++++++++++++++++++++++++++-
 kernel/power/main.c      |  3 +++
 kernel/power/power.h     |  3 +++
 kernel/power/suspend.c   |  7 +++++++
 kernel/power/swap.c      |  7 +++++++
 5 files changed, 68 insertions(+), 1 deletion(-)

diff --git a/kernel/power/hibernate.c b/kernel/power/hibernate.c
index 51441d8..c0c8fe3 100644
--- a/kernel/power/hibernate.c
+++ b/kernel/power/hibernate.c
@@ -643,11 +643,53 @@ static void power_down(void)
 }
 
 /**
+ * software_resume_unthaw - Resume from a saved hibernation image with threads frozen.
+ *
+ * This routine is similar to software_resume, except that this one tries
+ * to resume from hibernation with user/kernel threads frozen.
+ * It is mainly used for snapshot test mode because it is important to keep the
+ * threads frozen, otherwise the filesystem might be broken due to inconsistent
+ * disk-data/metadata across hibernate.
+ */
+static int software_resume_unthaw(void)
+{
+	int error;
+	unsigned int flags;
+
+	pr_debug("PM: Hibernation image partition %d:%d present\n",
+		MAJOR(swsusp_resume_device), MINOR(swsusp_resume_device));
+
+	pr_debug("PM: Looking for hibernation image.\n");
+	error = swsusp_check();
+	if (error)
+		return error;
+
+	lock_device_hotplug();
+	error = create_basic_memory_bitmaps();
+	if (error)
+		goto Unlock;
+
+	error = swsusp_read(&flags);
+	swsusp_close(FMODE_READ);
+	if (!error)
+		hibernation_restore(flags & SF_PLATFORM_MODE);
+
+	printk(KERN_ERR "PM: Failed to load hibernation image, recovering.\n");
+	swsusp_free();
+	free_basic_memory_bitmaps();
+ Unlock:
+	unlock_device_hotplug();
+
+	return error;
+}
+
+/**
  * hibernate - Carry out system hibernation, including saving the image.
  */
 int hibernate(void)
 {
 	int error, nr_calls = 0;
+	bool snapshot_test = false;
 
 	if (!hibernation_available()) {
 		pr_debug("PM: Hibernation not available.\n");
@@ -699,7 +741,9 @@ int hibernate(void)
 		pr_debug("PM: writing image.\n");
 		error = swsusp_write(flags);
 		swsusp_free();
-		if (!error)
+		if (hibernation_test(TEST_SNAPSHOT))
+			snapshot_test = true;
+		if (!error && !snapshot_test)
 			power_down();
 		in_suspend = 0;
 		pm_restore_gfp_mask();
@@ -711,6 +755,8 @@ int hibernate(void)
 	free_basic_memory_bitmaps();
  Thaw:
 	unlock_device_hotplug();
+	if (snapshot_test)
+		software_resume_unthaw();
 	thaw_processes();
 
 	/* Don't bother checking whether freezer_test_done is true */
@@ -721,6 +767,7 @@ int hibernate(void)
 	atomic_inc(&snapshot_device_available);
  Unlock:
 	unlock_system_sleep();
+
 	return error;
 }
 
diff --git a/kernel/power/main.c b/kernel/power/main.c
index 5ea50b1..80fe48e 100644
--- a/kernel/power/main.c
+++ b/kernel/power/main.c
@@ -83,6 +83,9 @@ int pm_test_level = TEST_NONE;
 
 static const char * const pm_tests[__TEST_AFTER_LAST] = {
 	[TEST_NONE] = "none",
+#ifdef CONFIG_HIBERNATION
+	[TEST_SNAPSHOT] = "snapshot",
+#endif
 	[TEST_CORE] = "core",
 	[TEST_CPUS] = "processors",
 	[TEST_PLATFORM] = "platform",
diff --git a/kernel/power/power.h b/kernel/power/power.h
index 064963e..101d636 100644
--- a/kernel/power/power.h
+++ b/kernel/power/power.h
@@ -225,6 +225,9 @@ static inline int restore_highmem(void) { return 0; }
 enum {
 	/* keep first */
 	TEST_NONE,
+#ifdef CONFIG_HIBERNATION
+	TEST_SNAPSHOT,
+#endif
 	TEST_CORE,
 	TEST_CPUS,
 	TEST_PLATFORM,
diff --git a/kernel/power/suspend.c b/kernel/power/suspend.c
index 0acab9d..7c0d5e9 100644
--- a/kernel/power/suspend.c
+++ b/kernel/power/suspend.c
@@ -479,6 +479,13 @@ static int enter_state(suspend_state_t state)
 			return -EAGAIN;
 		}
 #endif
+	} else if (state == PM_SUSPEND_MEM) {
+#ifdef CONFIG_PM_DEBUG
+		if (pm_test_level != TEST_NONE && pm_test_level < TEST_CORES) {
+			pr_warn("PM: Unsupported test mode for suspend to ram, please choose none/freezer/devices/platform/processors/core.\n");
+			return -EAGAIN;
+		}
+#endif
 	} else if (!valid_state(state)) {
 		return -EINVAL;
 	}
diff --git a/kernel/power/swap.c b/kernel/power/swap.c
index 160e100..facd71b 100644
--- a/kernel/power/swap.c
+++ b/kernel/power/swap.c
@@ -348,6 +348,13 @@ static int swsusp_swap_check(void)
 	if (res < 0)
 		blkdev_put(hib_resume_bdev, FMODE_WRITE);
 
+	/*
+	 * Update the resume device to the one actually used,
+	 * so software_resume() can use it in case it is invoked
+	 * from hibernate() to test the snapshot.
+	 */
+	swsusp_resume_device = hib_resume_bdev->bd_dev;
+
 	return res;
 }
 
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 4+ messages in thread

* [PATCH 2/2][RFC v3] PM / Documentation: Add description for snapshot test mode
  2016-07-14 10:45 [PATCH 0/2][RFC v3] PM / hibernate: Introduce snapshot test mode for hibernation Chen Yu
  2016-07-14 10:45 ` [PATCH 1/2][RFC " Chen Yu
@ 2016-07-14 10:45 ` Chen Yu
  1 sibling, 0 replies; 4+ messages in thread
From: Chen Yu @ 2016-07-14 10:45 UTC (permalink / raw)
  To: linux-pm
  Cc: Rafael J. Wysocki, Pavel Machek, Len Brown, linux-kernel, Chen Yu

Introduce snapshot test mode for hibernation debugging, and
update the document accordingly.

Signed-off-by: Chen Yu <yu.c.chen@intel.com>
---
 Documentation/power/basic-pm-debugging.txt | 15 ++++++++++++---
 1 file changed, 12 insertions(+), 3 deletions(-)

diff --git a/Documentation/power/basic-pm-debugging.txt b/Documentation/power/basic-pm-debugging.txt
index b96098c..5cc5b21 100644
--- a/Documentation/power/basic-pm-debugging.txt
+++ b/Documentation/power/basic-pm-debugging.txt
@@ -41,7 +41,7 @@ a) Test modes of hibernation
 To find out why hibernation fails on your system, you can use a special testing
 facility available if the kernel is compiled with CONFIG_PM_DEBUG set.  Then,
 there is the file /sys/power/pm_test that can be used to make the hibernation
-core run in a test mode.  There are 5 test modes available:
+core run in a test mode.  There are 6 test modes available:
 
 freezer
 - test the freezing of processes
@@ -62,6 +62,11 @@ core
   control methods(*), the disabling of nonboot CPUs and suspending of
   platform/system devices
 
+snapshot
+- test the freezing of processes, suspending of devices, platform global
+  control methods(*), the disabling of nonboot CPUs, suspending of
+  platform/system devices and the restoring of snapshot image
+
 (*) the platform global control methods are only available on ACPI systems
     and are only tested if the hibernation mode is set to "platform"
 
@@ -95,7 +100,7 @@ one and the "core" level tests the hardware and drivers as deeply as possible
 without creating a hibernation image.  Obviously, if the "devices" test fails,
 the "platform" test will fail as well and so on.  Thus, as a rule of thumb, you
 should try the test modes starting from "freezer", through "devices", "platform"
-and "processors" up to "core" (repeat the test on each level a couple of times
+, "processors" and "core" up to "snapshot" (repeat the test on each level a couple of times
 to make sure that any random factors are avoided).
 
 If the "freezer" test fails, there is a task that cannot be frozen (in that case
@@ -140,7 +145,11 @@ devices has failed (these devices are suspended on one CPU with interrupts off),
 the problem is most probably hardware-related and serious, so it should be
 reported.
 
-A failure of any of the "platform", "processors" or "core" tests may cause your
+If the "snapshot" test fails, which means that restoring of the pages from the
+swap device has failed, it might be caused by a broken swap device, or other memory
+management issues, so it should be reported.
+
+A failure of any of the "platform", "processors" "core" or "snapshot" tests may cause your
 system to hang or become unstable, so please beware.  Such a failure usually
 indicates a serious problem that very well may be related to the hardware, but
 please report it anyway.
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 4+ messages in thread

* Re: [PATCH 1/2][RFC v3] PM / hibernate: Introduce snapshot test mode for hibernation
  2016-07-14 10:45 ` [PATCH 1/2][RFC " Chen Yu
@ 2016-07-14 11:09   ` Chen Yu
  0 siblings, 0 replies; 4+ messages in thread
From: Chen Yu @ 2016-07-14 11:09 UTC (permalink / raw)
  To: linux-pm; +Cc: Rafael J. Wysocki, Pavel Machek, Len Brown, linux-kernel

Please ignore this patch, will resend another version later.
thanks!

On 2016年07月14日 18:45, Chen Yu wrote:
> This mode is used to verify if the snapshot data written to
> the swap device can be successfully restored to the memory. It
> is useful to ease the debugging process on hibernation,
> since this mode can not only bypass the BIOSes/bootloader,
> but also the system re-initialization.
>
> For example:
>
> echo snapshot > /sys/power/pm_test
> echo disk > /sys/power/state
>
> [  267.959784] PM: Image saving progress:  80%
> [  268.038669] PM: Image saving progress:  90%
> [  268.111745] PM: Image saving progress: 100%
> [  268.129269] PM: Image saving done.
> [  268.133485] PM: Wrote 518612 kbytes in 0.75 seconds (691.48 MB/s)
> [  268.140564] PM: S|
> [  268.160067] hibernation debug: Waiting for 5 seconds.
> ...
> [  273.508638] PM: Looking for hibernation image.
> [  273.516583] PM: Image signature found, resuming
> [  273.926547] PM: Loading and decompressing image data (129653 pages)...
> [  274.122452] PM: Image loading progress:   0%
> [  274.322127] PM: Image loading progress:  10%
> ...
>
> Suggested-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> Signed-off-by: Chen Yu <yu.c.chen@intel.com>
> ---
> v3:
>   - According to the discussion, there was a potential risk in previous
>     version that might break the filesystem. This version avoids that issue
>     by restoring the pages with user/kernel threads kept in frozen.
>     Also updated the document.
> ---
>   kernel/power/hibernate.c | 49 +++++++++++++++++++++++++++++++++++++++++++++++-
>   kernel/power/main.c      |  3 +++
>   kernel/power/power.h     |  3 +++
>   kernel/power/suspend.c   |  7 +++++++
>   kernel/power/swap.c      |  7 +++++++
>   5 files changed, 68 insertions(+), 1 deletion(-)
>
> diff --git a/kernel/power/hibernate.c b/kernel/power/hibernate.c
> index 51441d8..c0c8fe3 100644
> --- a/kernel/power/hibernate.c
> +++ b/kernel/power/hibernate.c
> @@ -643,11 +643,53 @@ static void power_down(void)
>   }
>   
>   /**
> + * software_resume_unthaw - Resume from a saved hibernation image with threads frozen.
> + *
> + * This routine is similar to software_resume, except that this one tries
> + * to resume from hibernation with user/kernel threads frozen.
> + * It is mainly used for snapshot test mode because it is important to keep the
> + * threads frozen, otherwise the filesystem might be broken due to inconsistent
> + * disk-data/metadata across hibernate.
> + */
> +static int software_resume_unthaw(void)
> +{
> +	int error;
> +	unsigned int flags;
> +
> +	pr_debug("PM: Hibernation image partition %d:%d present\n",
> +		MAJOR(swsusp_resume_device), MINOR(swsusp_resume_device));
> +
> +	pr_debug("PM: Looking for hibernation image.\n");
> +	error = swsusp_check();
> +	if (error)
> +		return error;
> +
> +	lock_device_hotplug();
> +	error = create_basic_memory_bitmaps();
> +	if (error)
> +		goto Unlock;
> +
> +	error = swsusp_read(&flags);
> +	swsusp_close(FMODE_READ);
> +	if (!error)
> +		hibernation_restore(flags & SF_PLATFORM_MODE);
> +
> +	printk(KERN_ERR "PM: Failed to load hibernation image, recovering.\n");
> +	swsusp_free();
> +	free_basic_memory_bitmaps();
> + Unlock:
> +	unlock_device_hotplug();
> +
> +	return error;
> +}
> +
> +/**
>    * hibernate - Carry out system hibernation, including saving the image.
>    */
>   int hibernate(void)
>   {
>   	int error, nr_calls = 0;
> +	bool snapshot_test = false;
>   
>   	if (!hibernation_available()) {
>   		pr_debug("PM: Hibernation not available.\n");
> @@ -699,7 +741,9 @@ int hibernate(void)
>   		pr_debug("PM: writing image.\n");
>   		error = swsusp_write(flags);
>   		swsusp_free();
> -		if (!error)
> +		if (hibernation_test(TEST_SNAPSHOT))
> +			snapshot_test = true;
> +		if (!error && !snapshot_test)
>   			power_down();
>   		in_suspend = 0;
>   		pm_restore_gfp_mask();
> @@ -711,6 +755,8 @@ int hibernate(void)
>   	free_basic_memory_bitmaps();
>    Thaw:
>   	unlock_device_hotplug();
> +	if (snapshot_test)
> +		software_resume_unthaw();
>   	thaw_processes();
>   
>   	/* Don't bother checking whether freezer_test_done is true */
> @@ -721,6 +767,7 @@ int hibernate(void)
>   	atomic_inc(&snapshot_device_available);
>    Unlock:
>   	unlock_system_sleep();
> +
>   	return error;
>   }
>   
> diff --git a/kernel/power/main.c b/kernel/power/main.c
> index 5ea50b1..80fe48e 100644
> --- a/kernel/power/main.c
> +++ b/kernel/power/main.c
> @@ -83,6 +83,9 @@ int pm_test_level = TEST_NONE;
>   
>   static const char * const pm_tests[__TEST_AFTER_LAST] = {
>   	[TEST_NONE] = "none",
> +#ifdef CONFIG_HIBERNATION
> +	[TEST_SNAPSHOT] = "snapshot",
> +#endif
>   	[TEST_CORE] = "core",
>   	[TEST_CPUS] = "processors",
>   	[TEST_PLATFORM] = "platform",
> diff --git a/kernel/power/power.h b/kernel/power/power.h
> index 064963e..101d636 100644
> --- a/kernel/power/power.h
> +++ b/kernel/power/power.h
> @@ -225,6 +225,9 @@ static inline int restore_highmem(void) { return 0; }
>   enum {
>   	/* keep first */
>   	TEST_NONE,
> +#ifdef CONFIG_HIBERNATION
> +	TEST_SNAPSHOT,
> +#endif
>   	TEST_CORE,
>   	TEST_CPUS,
>   	TEST_PLATFORM,
> diff --git a/kernel/power/suspend.c b/kernel/power/suspend.c
> index 0acab9d..7c0d5e9 100644
> --- a/kernel/power/suspend.c
> +++ b/kernel/power/suspend.c
> @@ -479,6 +479,13 @@ static int enter_state(suspend_state_t state)
>   			return -EAGAIN;
>   		}
>   #endif
> +	} else if (state == PM_SUSPEND_MEM) {
> +#ifdef CONFIG_PM_DEBUG
> +		if (pm_test_level != TEST_NONE && pm_test_level < TEST_CORES) {
> +			pr_warn("PM: Unsupported test mode for suspend to ram, please choose none/freezer/devices/platform/processors/core.\n");
> +			return -EAGAIN;
> +		}
> +#endif
>   	} else if (!valid_state(state)) {
>   		return -EINVAL;
>   	}
> diff --git a/kernel/power/swap.c b/kernel/power/swap.c
> index 160e100..facd71b 100644
> --- a/kernel/power/swap.c
> +++ b/kernel/power/swap.c
> @@ -348,6 +348,13 @@ static int swsusp_swap_check(void)
>   	if (res < 0)
>   		blkdev_put(hib_resume_bdev, FMODE_WRITE);
>   
> +	/*
> +	 * Update the resume device to the one actually used,
> +	 * so software_resume() can use it in case it is invoked
> +	 * from hibernate() to test the snapshot.
> +	 */
> +	swsusp_resume_device = hib_resume_bdev->bd_dev;
> +
>   	return res;
>   }
>   

^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2016-07-14 11:01 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-07-14 10:45 [PATCH 0/2][RFC v3] PM / hibernate: Introduce snapshot test mode for hibernation Chen Yu
2016-07-14 10:45 ` [PATCH 1/2][RFC " Chen Yu
2016-07-14 11:09   ` Chen Yu
2016-07-14 10:45 ` [PATCH 2/2][RFC v3] PM / Documentation: Add description for snapshot test mode Chen Yu

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.