[Sorry, resending with the correct mailinglist address as recipient] Hello, This patch adds a run-time switch at `/sys/power/suspend_sync`. The switch allows to enable or disable the final sync() from the suspend.c Linux Kernel system suspend implementation. This is useful to avoid race conditions if block devices have been suspended before. Be aware that you have to take care of sync() yourself before suspending the system if you disable it here. Since this is my first patch against the Linux kernel and I don't consider it ready for inclusion yet, I decided to send it to pm-linux and the PM subsystem maintainers only first. Would be very glad if you could take a look and comment on it :) Some questions: * There already is a build-time config flag[2] for en- or disabling the sync() in suspend.c. Is it acceptable to have both a build-time *and* a *run-time* switch? Or would a run-time switch have to replace the build-time switch? If so, a direct question to Rafael, as you added the build-time flag: Would that be ok for you? * I'm unsure about the naming: since the default is to have the sync enabled, would `suspend_disable_sync` be a better name for the switch, obviously defaulting to 0 then and skipping the sync at value 1? To give a bit more contect: In Debian, we're currently working[3] on support to suspend unlocked dm-crypt devices before system suspend. During that work, we realized that the final sync() from Linux Kernel system suspend implementation can lead to a dead lock. I wrote a simple reproducer[4] to cause the dead lock in a reliable way. [1] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/kernel/power/suspend.c?id=54ecb8f#n569 [2] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=2fd77f [3] https://salsa.debian.org/mejo/cryptsetup-suspend [4] https://salsa.debian.org/mejo/cryptsetup-suspend/snippets/334 Signed-off-by: Jonas Meurer --- Documentation/ABI/testing/sysfs-power | 16 ++++++++++++++- include/linux/suspend.h | 2 + kernel/power/main.c | 35 ++++++++++++++++++++++++++++++++++ kernel/power/suspend.c | 2 - 4 files changed, 53 insertions(+), 2 deletions(-) --- a/kernel/power/suspend.c +++ b/kernel/power/suspend.c @@ -575,7 +575,7 @@ static int enter_state(suspend_state_t s if (state == PM_SUSPEND_TO_IDLE) s2idle_begin(); - if (!IS_ENABLED(CONFIG_SUSPEND_SKIP_SYNC)) { + if (!IS_ENABLED(CONFIG_SUSPEND_SKIP_SYNC) && suspend_sync_enabled) { trace_suspend_resume(TPS("sync_filesystems"), 0, true); ksys_sync_helper(); trace_suspend_resume(TPS("sync_filesystems"), 0, false); --- a/include/linux/suspend.h +++ b/include/linux/suspend.h @@ -328,6 +328,7 @@ extern void arch_suspend_disable_irqs(vo extern void arch_suspend_enable_irqs(void); extern int pm_suspend(suspend_state_t state); +extern bool suspend_sync_enabled; #else /* !CONFIG_SUSPEND */ #define suspend_valid_only_mem NULL @@ -340,6 +341,7 @@ static inline bool pm_suspend_via_s2idle static inline void suspend_set_ops(const struct platform_suspend_ops *ops) {} static inline int pm_suspend(suspend_state_t state) { return -ENOSYS; } +static inline bool suspend_sync_enabled(void) { return true; } static inline bool idle_should_enter_s2idle(void) { return false; } static inline void __init pm_states_init(void) {} static inline void s2idle_set_ops(const struct platform_s2idle_ops *ops) {} --- a/kernel/power/main.c +++ b/kernel/power/main.c @@ -191,6 +191,40 @@ static ssize_t mem_sleep_store(struct ko power_attr(mem_sleep); #endif /* CONFIG_SUSPEND */ +#ifdef CONFIG_SUSPEND +/* + * suspend_sync: invoke ksys_sync_helper() before suspend. + * + * show() returns whether ksys_sync_helper() is invoked before suspend. + * store() accepts 0 or 1. 0 disables ksys_sync_helper() and 1 enables it. + */ +bool suspend_sync_enabled = true; + +static ssize_t suspend_sync_show(struct kobject *kobj, + struct kobj_attribute *attr, char *buf) +{ + return sprintf(buf, "%d\n", suspend_sync_enabled); +} + +static ssize_t suspend_sync_store(struct kobject *kobj, + struct kobj_attribute *attr, + const char *buf, size_t n) +{ + unsigned long val; + + if (kstrtoul(buf, 10, &val)) + return -EINVAL; + + if (val > 1) + return -EINVAL; + + suspend_sync_enabled = !!val; + return n; +} + +power_attr(suspend_sync); +#endif /* CONFIG_SUSPEND */ + #ifdef CONFIG_PM_SLEEP_DEBUG int pm_test_level = TEST_NONE; @@ -769,6 +803,7 @@ static struct attribute * g[] = { &wakeup_count_attr.attr, #ifdef CONFIG_SUSPEND &mem_sleep_attr.attr, + &suspend_sync_attr.attr, #endif #ifdef CONFIG_PM_AUTOSLEEP &autosleep_attr.attr, --- a/Documentation/ABI/testing/sysfs-power +++ b/Documentation/ABI/testing/sysfs-power @@ -300,4 +300,18 @@ Description: attempt. Using this sysfs file will override any values that were - set using the kernel command line for disk offset. \ No newline at end of file + set using the kernel command line for disk offset. + +What: /sys/power/suspend_sync +Date: October 2019 +Contact: Jonas Meurer +Description: + This file controls the switch to enable or disable the final + sync() before system suspend. This is useful to avoid race + conditions if block devices have been suspended before. Be + aware that you have to take care of sync() yourself before + suspending the system if you disable it here. + + Writing a "1" (default) to this file enables the sync() and + writing a "0" disables it. Reads from the file return the + current value. -- Jonas Meurer