* [RFC][PATCH 0/2] Rework disabling of interrupts during suspend-resume
From: Rafael J. Wysocki @ 2009-02-22 17:37 UTC
To: LKML
Cc: Linus Torvalds, Ingo Molnar, Eric W. Biederman, Benjamin Herrenschmidt, Jeremy Fitzhardinge, pm list, Len Brown, Jesse Barnes, Thomas Gleixner

Hi,

The following two patches modify the way in which we handle disabling
interrupts during suspend and enabling them during resume. Namely, interrupts
are currently disabled on the boot CPU as soon as the nonboot CPUs have been
disabled, which doesn't allow device drivers' "late" suspend and "early"
resume callbacks to sleep. Among other things this means they cannot execute
ACPI AML routines, which leads to problems with suspend-resume of PCI devices,
as recently discussed on this list.

1/2 is based on an earlier patch from Linus and it only splits up
sysdev_[suspend|resume] from the ["late" suspend|"early" resume] of devices.

2/2 actually modifies the [suspend|hibernation] and resume code, as well as
the other code using the device PM framework.

The patches have been initially tested and they don't appear to break suspend
on my boxes, but this is only a first approximation. In particular, I'm not
sure if I did the XEN, kexec and APM parts right, so people with experience in
these areas are kindly requested to have a look and tell me if there's
anything to fix in there.

Moreover, the real purpose of these changes is to be able to execute the
"late" suspend and "early" resume device callbacks with timer interrupts
enabled, so that they can use mutexes etc. However, x86 currently doesn't set
the IRQF_TIMER flag and I need to make it do so before going further in this
direction and changing the PCI PM framework to take advantage of the $subject
changes, for example. So, I need to know how to modify the x86 timer code so
that the IRQF_TIMER flag is set by it.

Comments welcome.

Thanks,
Rafael

* [RFC][PATCH 1/2] PM: Split up sysdev_[suspend|resume] from device_power_[down|up]
From: Rafael J. Wysocki @ 2009-02-22 17:38 UTC
To: LKML
Cc: Jeremy Fitzhardinge, Jesse Barnes, Thomas Gleixner, Eric W. Biederman, Ingo Molnar, Linus Torvalds, pm list

From: Rafael J. Wysocki <rjw@sisk.pl>

Move the sysdev_suspend/resume from the callee to the callers, with
no real change in semantics, so that we can rework the disabling of
interrupts during suspend/hibernation.

This is based on an earlier patch from Linus.

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
---
 arch/x86/kernel/apm_32.c  |    4 ++++
 drivers/base/base.h       |    2 --
 drivers/base/power/main.c |    3 ---
 drivers/xen/manage.c      |    8 ++++++++
 include/linux/pm.h        |    2 ++
 kernel/kexec.c            |    7 +++++++
 kernel/power/disk.c       |   11 +++++++++++
 kernel/power/main.c       |    8 ++++++--
 8 files changed, 38 insertions(+), 7 deletions(-)

Index: linux-2.6/arch/x86/kernel/apm_32.c
===================================================================
--- linux-2.6.orig/arch/x86/kernel/apm_32.c
+++ linux-2.6/arch/x86/kernel/apm_32.c
@@ -1192,6 +1192,7 @@ static int suspend(int vetoable)
 	device_suspend(PMSG_SUSPEND);
 	local_irq_disable();
 	device_power_down(PMSG_SUSPEND);
+	sysdev_suspend(PMSG_SUSPEND);
 
 	local_irq_enable();
 
@@ -1208,6 +1209,7 @@ static int suspend(int vetoable)
 	if (err != APM_SUCCESS)
 		apm_error("suspend", err);
 	err = (err == APM_SUCCESS) ? 0 : -EIO;
+	sysdev_resume();
 	device_power_up(PMSG_RESUME);
 	local_irq_enable();
 	device_resume(PMSG_RESUME);
@@ -1228,6 +1230,7 @@ static void standby(void)
 
 	local_irq_disable();
 	device_power_down(PMSG_SUSPEND);
+	sysdev_suspend(PMSG_SUSPEND);
 	local_irq_enable();
 
 	err = set_system_power_state(APM_STATE_STANDBY);
@@ -1235,6 +1238,7 @@ static void standby(void)
 		apm_error("standby", err);
 
 	local_irq_disable();
+	sysdev_resume();
 	device_power_up(PMSG_RESUME);
 	local_irq_enable();
 }
Index: linux-2.6/drivers/base/power/main.c
===================================================================
--- linux-2.6.orig/drivers/base/power/main.c
+++ linux-2.6/drivers/base/power/main.c
@@ -333,7 +333,6 @@ static void dpm_power_up(pm_message_t st
  */
 void device_power_up(pm_message_t state)
 {
-	sysdev_resume();
 	dpm_power_up(state);
 }
 EXPORT_SYMBOL_GPL(device_power_up);
@@ -577,8 +576,6 @@ int device_power_down(pm_message_t state
 		}
 		dev->power.status = DPM_OFF_IRQ;
 	}
-	if (!error)
-		error = sysdev_suspend(state);
 	if (error)
 		dpm_power_up(resume_event(state));
 	return error;
Index: linux-2.6/drivers/xen/manage.c
===================================================================
--- linux-2.6.orig/drivers/xen/manage.c
+++ linux-2.6/drivers/xen/manage.c
@@ -45,6 +45,13 @@ static int xen_suspend(void *data)
 			err);
 		return err;
 	}
+	err = sysdev_suspend(PMSG_SUSPEND);
+	if (err) {
+		printk(KERN_ERR "xen_suspend: sysdev_suspend failed: %d\n",
+			err);
+		device_power_up(PMSG_RESUME);
+		return err;
+	}
 
 	xen_mm_pin_all();
 	gnttab_suspend();
@@ -61,6 +68,7 @@ static int xen_suspend(void *data)
 	gnttab_resume();
 	xen_mm_unpin_all();
 
+	sysdev_resume();
 	device_power_up(PMSG_RESUME);
 
 	if (!*cancelled) {
Index: linux-2.6/kernel/kexec.c
===================================================================
--- linux-2.6.orig/kernel/kexec.c
+++ linux-2.6/kernel/kexec.c
@@ -1465,6 +1465,11 @@ int kernel_kexec(void)
 		error = device_power_down(PMSG_FREEZE);
 		if (error)
 			goto Enable_irqs;
+
+		/* Suspend system devices */
+		error = sysdev_suspend(PMSG_FREEZE);
+		if (error)
+			goto Power_up_devices;
 	} else
 #endif
 	{
@@ -1477,6 +1482,8 @@ int kernel_kexec(void)
 
 #ifdef CONFIG_KEXEC_JUMP
 	if (kexec_image->preserve_context) {
+		sysdev_resume();
+ Power_up_devices:
 		device_power_up(PMSG_RESTORE);
  Enable_irqs:
 		local_irq_enable();
Index: linux-2.6/kernel/power/disk.c
===================================================================
--- linux-2.6.orig/kernel/power/disk.c
+++ linux-2.6/kernel/power/disk.c
@@ -227,6 +227,12 @@ static int create_image(int platform_mod
 			"aborting hibernation\n");
 		goto Enable_irqs;
 	}
+	error = sysdev_suspend(PMSG_FREEZE);
+	if (error) {
+		printk(KERN_ERR "PM: Some devices failed to power down, "
+			"aborting hibernation\n");
+		goto Power_up_devices;
+	}
 
 	if (hibernation_test(TEST_CORE))
 		goto Power_up;
@@ -242,9 +248,11 @@ static int create_image(int platform_mod
 	if (!in_suspend)
 		platform_leave(platform_mode);
  Power_up:
+	sysdev_resume();
 	/* NOTE:  device_power_up() is just a resume() for devices
 	 * that suspended with irqs off ... no overall powerup.
 	 */
+ Power_up_devices:
 	device_power_up(in_suspend ?
 		(error ? PMSG_RECOVER : PMSG_THAW) : PMSG_RESTORE);
  Enable_irqs:
@@ -335,6 +343,7 @@ static int resume_target_kernel(void)
 			"aborting resume\n");
 		goto Enable_irqs;
 	}
+	sysdev_suspend(PMSG_QUIESCE);
 	/* We'll ignore saved state, but this gets preempt count (etc) right */
 	save_processor_state();
 	error = restore_highmem();
@@ -357,6 +366,7 @@ static int resume_target_kernel(void)
 	swsusp_free();
 	restore_processor_state();
 	touch_softlockup_watchdog();
+	sysdev_resume();
 	device_power_up(PMSG_RECOVER);
  Enable_irqs:
 	local_irq_enable();
@@ -440,6 +450,7 @@ int hibernation_platform_enter(void)
 	local_irq_disable();
 	error = device_power_down(PMSG_HIBERNATE);
 	if (!error) {
+		sysdev_suspend(PMSG_HIBERNATE);
 		hibernation_ops->enter();
 		/* We should never get here */
 		while (1);
Index: linux-2.6/kernel/power/main.c
===================================================================
--- linux-2.6.orig/kernel/power/main.c
+++ linux-2.6/kernel/power/main.c
@@ -298,8 +298,12 @@ static int suspend_enter(suspend_state_t
 		goto Done;
 	}
 
-	if (!suspend_test(TEST_CORE))
-		error = suspend_ops->enter(state);
+	error = sysdev_suspend(PMSG_SUSPEND);
+	if (!error) {
+		if (!suspend_test(TEST_CORE))
+			error = suspend_ops->enter(state);
+		sysdev_resume();
+	}
 
 	device_power_up(PMSG_RESUME);
  Done:
Index: linux-2.6/drivers/base/base.h
===================================================================
--- linux-2.6.orig/drivers/base/base.h
+++ linux-2.6/drivers/base/base.h
@@ -88,8 +88,6 @@ extern void driver_detach(struct device_
 extern int driver_probe_device(struct device_driver *drv, struct device *dev);
 
 extern void sysdev_shutdown(void);
-extern int sysdev_suspend(pm_message_t state);
-extern int sysdev_resume(void);
 
 extern char *make_class_name(const char *name, struct kobject *kobj);
 
Index: linux-2.6/include/linux/pm.h
===================================================================
--- linux-2.6.orig/include/linux/pm.h
+++ linux-2.6/include/linux/pm.h
@@ -381,10 +381,12 @@ struct dev_pm_info {
 
 #ifdef CONFIG_PM_SLEEP
 extern void device_pm_lock(void);
+extern int sysdev_resume(void);
 extern void device_power_up(pm_message_t state);
 extern void device_resume(pm_message_t state);
 
 extern void device_pm_unlock(void);
+extern int sysdev_suspend(pm_message_t state);
 extern int device_power_down(pm_message_t state);
 extern int device_suspend(pm_message_t state);
 extern int device_prepare_suspend(pm_message_t state);

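The caller-side ordering that the kernel/power/main.c hunk gives suspend_enter() can be sketched in userspace. This is only an illustrative mock under stated assumptions, not kernel code: every function below is a hypothetical stand-in that merely records its own name, the pm_message_t arguments and the suspend_test(TEST_CORE) check are left out, and call_log, fail_sysdev_suspend, platform_enter and mock_suspend_enter are names invented for the example. It shows that sysdev_resume() runs only when sysdev_suspend() succeeded, while device_power_up() runs in both cases.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical userspace stand-ins for the kernel calls named in the patch.
 * Each one only appends its name to call_log so the ordering is visible. */
static char call_log[256];
static int fail_sysdev_suspend;	/* set nonzero to simulate a sysdev failure */

static void log_call(const char *name)
{
	/* Guard against overflowing the fixed-size log buffer. */
	assert(strlen(call_log) + strlen(name) + 2 < sizeof(call_log));
	if (call_log[0])
		strcat(call_log, " ");
	strcat(call_log, name);
}

static int device_power_down(void)
{
	log_call("device_power_down");
	return 0;
}

static int sysdev_suspend(void)
{
	log_call("sysdev_suspend");
	return fail_sysdev_suspend ? -1 : 0;
}

static int platform_enter(void)	/* stands in for suspend_ops->enter() */
{
	log_call("enter");
	return 0;
}

static void sysdev_resume(void)
{
	log_call("sysdev_resume");
}

static void device_power_up(void)
{
	log_call("device_power_up");
}

/* Mirrors the shape of the patched suspend_enter(): the caller, not
 * device_power_down()/device_power_up(), now brackets the platform
 * entry with sysdev_suspend()/sysdev_resume(). */
static int mock_suspend_enter(void)
{
	int error;

	call_log[0] = '\0';

	error = device_power_down();
	if (error)
		return error;

	error = sysdev_suspend();
	if (!error) {
		error = platform_enter();
		sysdev_resume();	/* only undone if the suspend succeeded */
	}

	device_power_up();		/* always undoes device_power_down() */
	return error;
}
```

Calling mock_suspend_enter() with fail_sysdev_suspend set to 0 and then to 1 and inspecting call_log afterwards shows the two paths; the same "undo only what succeeded" rule is what the Power_up and Power_up_devices labels express in the kexec and hibernation hunks.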
* Re: [RFC][PATCH 1/2] PM: Split up sysdev_[suspend|resume] from device_power_[down|up]
From: Adrian Bunk @ 2009-02-22 20:56 UTC
To: Rafael J. Wysocki
Cc: LKML, Linus Torvalds, Ingo Molnar, Eric W. Biederman, Benjamin Herrenschmidt, Jeremy Fitzhardinge, pm list, Len Brown, Jesse Barnes, Thomas Gleixner

On Sun, Feb 22, 2009 at 06:38:50PM +0100, Rafael J. Wysocki wrote:
> From: Rafael J. Wysocki <rjw@sisk.pl>
>
> Move the sysdev_suspend/resume from the callee to the callers, with
> no real change in semantics, so that we can rework the disabling of
> interrupts during suspend/hibernation.
>
> This is based on an earlier patch from Linus.
>
> Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
>...
> --- linux-2.6.orig/arch/x86/kernel/apm_32.c
> +++ linux-2.6/arch/x86/kernel/apm_32.c
> @@ -1192,6 +1192,7 @@ static int suspend(int vetoable)
>  	device_suspend(PMSG_SUSPEND);
>  	local_irq_disable();
>  	device_power_down(PMSG_SUSPEND);
> +	sysdev_suspend(PMSG_SUSPEND);
>
>  	local_irq_enable();
>
> @@ -1208,6 +1209,7 @@ static int suspend(int vetoable)
>  	if (err != APM_SUCCESS)
>  		apm_error("suspend", err);
>  	err = (err == APM_SUCCESS) ? 0 : -EIO;
> +	sysdev_resume();
>  	device_power_up(PMSG_RESUME);
>  	local_irq_enable();
>  	device_resume(PMSG_RESUME);
> @@ -1228,6 +1230,7 @@ static void standby(void)
>
>  	local_irq_disable();
>  	device_power_down(PMSG_SUSPEND);
> +	sysdev_suspend(PMSG_SUSPEND);
>  	local_irq_enable();
>
>  	err = set_system_power_state(APM_STATE_STANDBY);
> @@ -1235,6 +1238,7 @@ static void standby(void)
>  		apm_error("standby", err);
>
>  	local_irq_disable();
> +	sysdev_resume();
>  	device_power_up(PMSG_RESUME);
>  	local_irq_enable();
>  }
>...

This causes the following build error with CONFIG_APM=m:

<--  snip  -->

...
  MODPOST 2586 modules
ERROR: "sysdev_resume" [arch/x86/kernel/apm.ko] undefined!
ERROR: "sysdev_suspend" [arch/x86/kernel/apm.ko] undefined!
make[2]: *** [__modpost] Error 1

<--  snip  -->

cu
Adrian

-- 
       "Is there not promise of rain?" Ling Tan asked suddenly out
        of the darkness. There had been need of rain for many days.
       "Only a promise," Lao Er said.
                                       Pearl S. Buck - Dragon Seed

* Re: [RFC][PATCH 1/2] PM: Split up sysdev_[suspend|resume] from device_power_[down|up]
From: Linus Torvalds @ 2009-02-22 21:07 UTC
To: Adrian Bunk
Cc: Rafael J. Wysocki, LKML, Ingo Molnar, Eric W. Biederman, Benjamin Herrenschmidt, Jeremy Fitzhardinge, pm list, Len Brown, Jesse Barnes, Thomas Gleixner

On Sun, 22 Feb 2009, Adrian Bunk wrote:
> ...
>   MODPOST 2586 modules
> ERROR: "sysdev_resume" [arch/x86/kernel/apm.ko] undefined!
> ERROR: "sysdev_suspend" [arch/x86/kernel/apm.ko] undefined!
> make[2]: *** [__modpost] Error 1

Ahh. device_power_[down|up] were EXPORT_SYMBOL_GPL, so now that we've
split them, so must sysdev_[suspend|resume] be.

Does this fix it?

		Linus
---
 drivers/base/sys.c |    2 ++
 1 files changed, 2 insertions(+), 0 deletions(-)

diff --git a/drivers/base/sys.c b/drivers/base/sys.c
index c98c31e..ef2055e 100644
--- a/drivers/base/sys.c
+++ b/drivers/base/sys.c
@@ -432,6 +432,7 @@ aux_driver:
 	}
 	return ret;
 }
+EXPORT_SYMBOL_GPL(sysdev_suspend);
 
 
 /**
@@ -463,6 +464,7 @@ int sysdev_resume(void)
 	}
 	return 0;
 }
+EXPORT_SYMBOL_GPL(sysdev_resume);
 
 
 int __init system_bus_init(void)

* Re: [RFC][PATCH 1/2] PM: Split up sysdev_[suspend|resume] from device_power_[down|up]
From: Ingo Molnar @ 2009-02-22 21:12 UTC
To: Linus Torvalds
Cc: Adrian Bunk, Rafael J. Wysocki, LKML, Eric W. Biederman, Benjamin Herrenschmidt, Jeremy Fitzhardinge, pm list, Len Brown, Jesse Barnes, Thomas Gleixner

* Linus Torvalds <torvalds@linux-foundation.org> wrote:

>
>
> On Sun, 22 Feb 2009, Adrian Bunk wrote:
> > ...
> >   MODPOST 2586 modules
> > ERROR: "sysdev_resume" [arch/x86/kernel/apm.ko] undefined!
> > ERROR: "sysdev_suspend" [arch/x86/kernel/apm.ko] undefined!
> > make[2]: *** [__modpost] Error 1
>
> Ahh. device_power_[down|up] were EXPORT_SYMBOL_GPL, so now that we've
> split them, so must sysdev_[suspend|resume] be.
>
> Does this fix it?

I just hit the same issue in -tip testing and did the same fix:

  git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip.git core/urgent

	Ingo

------------------>
Ingo Molnar (1):
      PM: Split up sysdev_[suspend|resume] from device_power_[down|up], fix

 drivers/base/sys.c |    7 ++-----
 1 files changed, 2 insertions(+), 5 deletions(-)

diff --git a/drivers/base/sys.c b/drivers/base/sys.c
index c98c31e..b428c8c 100644
--- a/drivers/base/sys.c
+++ b/drivers/base/sys.c
@@ -303,7 +303,6 @@ void sysdev_unregister(struct sys_device * sysdev)
  * is guaranteed by virtue of the fact that child devices are registered
  * after their parents.
  */
-
 void sysdev_shutdown(void)
 {
 	struct sysdev_class * cls;
@@ -363,7 +362,6 @@ static void __sysdev_resume(struct sys_device *dev)
  * This is only called by the device PM core, so we let them handle
  * all synchronization.
  */
-
 int sysdev_suspend(pm_message_t state)
 {
 	struct sysdev_class * cls;
@@ -432,7 +430,7 @@ aux_driver:
 	}
 	return ret;
 }
-
+EXPORT_SYMBOL_GPL(sysdev_suspend);
 
 /**
  *	sysdev_resume - Bring system devices back to life.
@@ -442,7 +440,6 @@ aux_driver:
  *
  *	Note: Interrupts are disabled when called.
  */
-
 int sysdev_resume(void)
 {
 	struct sysdev_class * cls;
@@ -463,7 +460,7 @@ int sysdev_resume(void)
 	}
 	return 0;
 }
-
+EXPORT_SYMBOL_GPL(sysdev_resume);
 
 int __init system_bus_init(void)
 {

* Re: [RFC][PATCH 1/2] PM: Split up sysdev_[suspend|resume] from device_power_[down|up]
From: Adrian Bunk @ 2009-02-22 22:42 UTC
To: Linus Torvalds
Cc: Rafael J. Wysocki, LKML, Ingo Molnar, Eric W. Biederman, Benjamin Herrenschmidt, Jeremy Fitzhardinge, pm list, Len Brown, Jesse Barnes, Thomas Gleixner

On Sun, Feb 22, 2009 at 01:07:28PM -0800, Linus Torvalds wrote:
>
>
> On Sun, 22 Feb 2009, Adrian Bunk wrote:
> > ...
> >   MODPOST 2586 modules
> > ERROR: "sysdev_resume" [arch/x86/kernel/apm.ko] undefined!
> > ERROR: "sysdev_suspend" [arch/x86/kernel/apm.ko] undefined!
> > make[2]: *** [__modpost] Error 1
>
> Ahh. device_power_[down|up] were EXPORT_SYMBOL_GPL, so now that we've
> split them, so must sysdev_[suspend|resume] be.
>
> Does this fix it?

Thanks, works fine.

> 		Linus
> ---
>  drivers/base/sys.c |    2 ++
>  1 files changed, 2 insertions(+), 0 deletions(-)
>
> diff --git a/drivers/base/sys.c b/drivers/base/sys.c
> index c98c31e..ef2055e 100644
> --- a/drivers/base/sys.c
> +++ b/drivers/base/sys.c
> @@ -432,6 +432,7 @@ aux_driver:
>  	}
>  	return ret;
>  }
> +EXPORT_SYMBOL_GPL(sysdev_suspend);
>
>
>  /**
> @@ -463,6 +464,7 @@ int sysdev_resume(void)
>  	}
>  	return 0;
>  }
> +EXPORT_SYMBOL_GPL(sysdev_resume);
>
>
>  int __init system_bus_init(void)

cu
Adrian

-- 
       "Is there not promise of rain?" Ling Tan asked suddenly out
        of the darkness. There had been need of rain for many days.
       "Only a promise," Lao Er said.
                                       Pearl S. Buck - Dragon Seed

* Re: [RFC][PATCH 1/2] PM: Split up sysdev_[suspend|resume] from device_power_[down|up] 2009-02-22 17:38 ` Rafael J. Wysocki 2009-02-22 20:56 ` Adrian Bunk 2009-02-22 20:56 ` Adrian Bunk @ 2009-03-05 16:54 ` Pavel Machek 2009-03-05 16:54 ` Pavel Machek 3 siblings, 0 replies; 373+ messages in thread From: Pavel Machek @ 2009-03-05 16:54 UTC (permalink / raw) To: Rafael J. Wysocki Cc: Jeremy Fitzhardinge, LKML, Jesse Barnes, Thomas Gleixner, Eric W. Biederman, Ingo Molnar, Linus Torvalds, pm list Hi! > Move the sysdev_suspend/resume from the callee to the callers, with > no real change in semantics, so that we can rework the disabling of > interrupts during suspend/hibernation. > > This is based on an earlier patch from Linus. > Index: linux-2.6/drivers/base/power/main.c > =================================================================== > --- linux-2.6.orig/drivers/base/power/main.c > +++ linux-2.6/drivers/base/power/main.c > @@ -333,7 +333,6 @@ static void dpm_power_up(pm_message_t st > */ > void device_power_up(pm_message_t state) > { > - sysdev_resume(); > dpm_power_up(state); > } And at this point we can rename dpm_power_up -> device_power_up. Good. -- (english) http://www.livejournal.com/~pavelmachek (cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html ^ permalink raw reply [flat|nested] 373+ messages in thread
* [RFC][PATCH 2/2] PM: Rework handling of interrupts during suspend-resume 2009-02-22 17:37 ` Rafael J. Wysocki ` (2 preceding siblings ...) (?) @ 2009-02-22 17:39 ` Rafael J. Wysocki 2009-02-22 18:01 ` Linus Torvalds ` (3 more replies) -1 siblings, 4 replies; 373+ messages in thread From: Rafael J. Wysocki @ 2009-02-22 17:39 UTC (permalink / raw) To: LKML Cc: Linus Torvalds, Ingo Molnar, Eric W. Biederman, Benjamin Herrenschmidt, Jeremy Fitzhardinge, pm list, Len Brown, Jesse Barnes, Thomas Gleixner From: Rafael J. Wysocki <rjw@sisk.pl> Introduce two helper functions allowing us to disable device interrupts (at the IO-APIC level) during suspend or hibernation and enable them during the subsequent resume, respectively, so that the timer interrupts are enabled while "late" suspend callbacks and "early" resume callbacks provided by device drivers are being executed. Use these functions to rework the handling of interrupts during suspend (hibernation) and resume. Namely, interrupts will only be disabled on the CPU right before suspending sysdevs, while device interrupts will be disabled (at the IO-APIC level), with the help of the new helper function, before calling "late" suspend callbacks provided by device drivers and analogously during resume. Signed-off-by: Rafael J. 
Wysocki <rjw@sisk.pl> --- arch/x86/kernel/apm_32.c | 20 ++++++++-- drivers/xen/manage.c | 37 ++++++++++++-------- include/linux/interrupt.h | 3 + kernel/irq/manage.c | 85 ++++++++++++++++++++++++++++++++++++++++++++++ kernel/kexec.c | 11 ++++- kernel/power/disk.c | 56 +++++++++++++++++++++++++++--- kernel/power/main.c | 27 +++++++++++--- 7 files changed, 208 insertions(+), 31 deletions(-) Index: linux-2.6/kernel/irq/manage.c =================================================================== --- linux-2.6.orig/kernel/irq/manage.c +++ linux-2.6/kernel/irq/manage.c @@ -746,3 +746,88 @@ int request_irq(unsigned int irq, irq_ha return retval; } EXPORT_SYMBOL(request_irq); + +#ifdef CONFIG_PM_SLEEP +struct disabled_irq { + struct list_head list; + int irq; +}; + +static LIST_HEAD(resume_irqs_list); + +/** + * enable_device_irqs - enable interrupts disabled by disable_device_irqs() + * + * Enable all interrupt lines previously disabled by disable_device_irqs() + * that are on resume_irqs_list. + */ +void enable_device_irqs(void) +{ + struct disabled_irq *resume_irq, *tmp; + + list_for_each_entry_safe(resume_irq, tmp, &resume_irqs_list, list) { + enable_irq(resume_irq->irq); + list_del(&resume_irq->list); + kfree(resume_irq); + } +} + +/** + * disable_device_irqs - disable all enabled interrupt lines + * + * During system-wide suspend or hibernation device interrupts need to be + * disabled at the chip level and this function is provided for this + * purpose. It disables all interrupt lines that are enabled at the + * moment and saves their numbers for enable_device_irqs(). 
+ */ +int disable_device_irqs(void) +{ + struct irq_desc *desc; + int irq; + + for_each_irq_desc(irq, desc) { + unsigned long flags; + struct disabled_irq *resume_irq; + struct irqaction *action; + bool is_timer_irq; + + resume_irq = kzalloc(sizeof(*resume_irq), GFP_NOIO); + if (!resume_irq) { + enable_device_irqs(); + return -ENOMEM; + } + + spin_lock_irqsave(&desc->lock, flags); + + is_timer_irq = false; + action = desc->action; + while (action) { + if (action->flags | IRQF_TIMER) { + is_timer_irq = true; + break; + } + action = action->next; + } + + if (!is_timer_irq && !desc->depth) { + desc->depth++; + desc->status |= IRQ_DISABLED; + desc->chip->disable(irq); + } else { + spin_unlock_irqrestore(&desc->lock, flags); + kfree(resume_irq); + continue; + } + + spin_unlock_irqrestore(&desc->lock, flags); + + if (desc->action) + synchronize_irq(irq); + + resume_irq->irq = irq; + list_add(&resume_irq->list, &resume_irqs_list); + } + + return 0; +} +#endif /* CONFIG_PM_SLEEP */ Index: linux-2.6/include/linux/interrupt.h =================================================================== --- linux-2.6.orig/include/linux/interrupt.h +++ linux-2.6/include/linux/interrupt.h @@ -470,4 +470,7 @@ extern int early_irq_init(void); extern int arch_early_irq_init(void); extern int arch_init_chip_data(struct irq_desc *desc, int cpu); +extern int disable_device_irqs(void); +extern void enable_device_irqs(void); + #endif Index: linux-2.6/kernel/power/main.c =================================================================== --- linux-2.6.orig/kernel/power/main.c +++ linux-2.6/kernel/power/main.c @@ -22,6 +22,7 @@ #include <linux/freezer.h> #include <linux/vmstat.h> #include <linux/syscalls.h> +#include <linux/interrupt.h> #include "power.h" @@ -287,17 +288,25 @@ void __attribute__ ((weak)) arch_suspend */ static int suspend_enter(suspend_state_t state) { - int error = 0; + int error; device_pm_lock(); - arch_suspend_disable_irqs(); - BUG_ON(!irqs_disabled()); - if ((error = 
device_power_down(PMSG_SUSPEND))) { + error = disable_device_irqs(); + if (error) { + printk(KERN_ERR "PM: Failed to disable device interrupts\n"); + goto Unlock; + } + + error = device_power_down(PMSG_SUSPEND); + if (error) { printk(KERN_ERR "PM: Some devices failed to power down\n"); goto Done; } + arch_suspend_disable_irqs(); + BUG_ON(!irqs_disabled()); + error = sysdev_suspend(PMSG_SUSPEND); if (!error) { if (!suspend_test(TEST_CORE)) @@ -305,11 +314,17 @@ static int suspend_enter(suspend_state_t sysdev_resume(); } - device_power_up(PMSG_RESUME); - Done: arch_suspend_enable_irqs(); BUG_ON(irqs_disabled()); + + device_power_up(PMSG_RESUME); + + Done: + enable_device_irqs(); + + Unlock: device_pm_unlock(); + return error; } Index: linux-2.6/kernel/power/disk.c =================================================================== --- linux-2.6.orig/kernel/power/disk.c +++ linux-2.6/kernel/power/disk.c @@ -22,6 +22,7 @@ #include <linux/console.h> #include <linux/cpu.h> #include <linux/freezer.h> +#include <linux/interrupt.h> #include "power.h" @@ -214,7 +215,13 @@ static int create_image(int platform_mod return error; device_pm_lock(); - local_irq_disable(); + + error = disable_device_irqs(); + if (error) { + printk(KERN_ERR "PM: Failed to disable device interrupts\n"); + goto Unlock; + } + /* At this point, device_suspend() has been called, but *not* * device_power_down(). We *must* call device_power_down() now. * Otherwise, drivers for some devices (e.g. interrupt controllers) @@ -227,6 +234,9 @@ static int create_image(int platform_mod "aborting hibernation\n"); goto Enable_irqs; } + + local_irq_disable(); + sysdev_suspend(PMSG_FREEZE); if (error) { printk(KERN_ERR "PM: Some devices failed to power down, " @@ -252,11 +262,17 @@ static int create_image(int platform_mod /* NOTE: device_power_up() is just a resume() for devices * that suspended with irqs off ... no overall powerup. */ + Power_up_devices: + local_irq_enable(); + device_power_up(in_suspend ? (error ? 
PMSG_RECOVER : PMSG_THAW) : PMSG_RESTORE); + Enable_irqs: - local_irq_enable(); + enable_device_irqs(); + + Unlock: device_pm_unlock(); return error; } @@ -336,13 +352,22 @@ static int resume_target_kernel(void) int error; device_pm_lock(); - local_irq_disable(); + + error = disable_device_irqs(); + if (error) { + printk(KERN_ERR "PM: Failed to disable device interrupts\n"); + goto Unlock; + } + error = device_power_down(PMSG_QUIESCE); if (error) { printk(KERN_ERR "PM: Some devices failed to power down, " "aborting resume\n"); goto Enable_irqs; } + + local_irq_disable(); + sysdev_suspend(PMSG_QUIESCE); /* We'll ignore saved state, but this gets preempt count (etc) right */ save_processor_state(); @@ -366,11 +391,19 @@ static int resume_target_kernel(void) swsusp_free(); restore_processor_state(); touch_softlockup_watchdog(); + sysdev_resume(); + + local_irq_enable(); + device_power_up(PMSG_RECOVER); + Enable_irqs: - local_irq_enable(); + enable_device_irqs(); + + Unlock: device_pm_unlock(); + return error; } @@ -447,15 +480,23 @@ int hibernation_platform_enter(void) goto Finish; device_pm_lock(); - local_irq_disable(); + + error = disable_device_irqs(); + if (error) + goto Unlock; + error = device_power_down(PMSG_HIBERNATE); if (!error) { + local_irq_disable(); sysdev_suspend(PMSG_HIBERNATE); hibernation_ops->enter(); /* We should never get here */ while (1); } - local_irq_enable(); + + enable_device_irqs(); + + Unlock: device_pm_unlock(); /* @@ -464,12 +505,15 @@ int hibernation_platform_enter(void) */ Finish: hibernation_ops->finish(); + Resume_devices: entering_platform_hibernation = false; device_resume(PMSG_RESTORE); resume_console(); + Close: hibernation_ops->end(); + return error; } Index: linux-2.6/arch/x86/kernel/apm_32.c =================================================================== --- linux-2.6.orig/arch/x86/kernel/apm_32.c +++ linux-2.6/arch/x86/kernel/apm_32.c @@ -228,6 +228,7 @@ #include <linux/suspend.h> #include <linux/kthread.h> #include 
<linux/jiffies.h> +#include <linux/interrupt.h> #include <asm/system.h> #include <asm/uaccess.h> @@ -1190,8 +1191,11 @@ static int suspend(int vetoable) struct apm_user *as; device_suspend(PMSG_SUSPEND); - local_irq_disable(); + + disable_device_irqs(); device_power_down(PMSG_SUSPEND); + + local_irq_disable(); sysdev_suspend(PMSG_SUSPEND); local_irq_enable(); @@ -1209,9 +1213,13 @@ static int suspend(int vetoable) if (err != APM_SUCCESS) apm_error("suspend", err); err = (err == APM_SUCCESS) ? 0 : -EIO; + sysdev_resume(); - device_power_up(PMSG_RESUME); local_irq_enable(); + + device_power_up(PMSG_RESUME); + enable_device_irqs(); + device_resume(PMSG_RESUME); queue_event(APM_NORMAL_RESUME, NULL); spin_lock(&user_list_lock); @@ -1228,8 +1236,10 @@ static void standby(void) { int err; - local_irq_disable(); + disable_device_irqs(); device_power_down(PMSG_SUSPEND); + + local_irq_disable(); sysdev_suspend(PMSG_SUSPEND); local_irq_enable(); @@ -1239,8 +1249,10 @@ static void standby(void) local_irq_disable(); sysdev_resume(); - device_power_up(PMSG_RESUME); local_irq_enable(); + + device_power_up(PMSG_RESUME); + enable_device_irqs(); } static apm_event_t get_event(void) Index: linux-2.6/drivers/xen/manage.c =================================================================== --- linux-2.6.orig/drivers/xen/manage.c +++ linux-2.6/drivers/xen/manage.c @@ -39,12 +39,6 @@ static int xen_suspend(void *data) BUG_ON(!irqs_disabled()); - err = device_power_down(PMSG_SUSPEND); - if (err) { - printk(KERN_ERR "xen_suspend: device_power_down failed: %d\n", - err); - return err; - } err = sysdev_suspend(PMSG_SUSPEND); if (err) { printk(KERN_ERR "xen_suspend: sysdev_suspend failed: %d\n", @@ -69,13 +63,6 @@ static int xen_suspend(void *data) xen_mm_unpin_all(); sysdev_resume(); - device_power_up(PMSG_RESUME); - - if (!*cancelled) { - xen_irq_resume(); - xen_console_resume(); - xen_timer_resume(); - } return 0; } @@ -108,6 +95,18 @@ static void do_suspend(void) /* XXX use normal device 
tree? */ xenbus_suspend(); + err = disable_device_irqs(); + if (err) { + printk(KERN_ERR "disable_device_irqs failed: %d\n", err); + goto resume_devices; + } + + err = device_power_down(PMSG_SUSPEND); + if (err) { + printk(KERN_ERR "device_power_down failed: %d\n", err); + goto enable_irqs; + } + err = stop_machine(xen_suspend, &cancelled, &cpumask_of_cpu(0)); if (err) { printk(KERN_ERR "failed to start xen_suspend: %d\n", err); @@ -120,6 +119,18 @@ static void do_suspend(void) } else xenbus_suspend_cancel(); + device_power_up(PMSG_RESUME); + + if (!cancelled) { + xen_irq_resume(); + xen_console_resume(); + xen_timer_resume(); + } + +enable_irqs: + enable_device_irqs(); + +resume_devices: device_resume(PMSG_RESUME); /* Make sure timer events get retriggered on all CPUs */ Index: linux-2.6/kernel/kexec.c =================================================================== --- linux-2.6.orig/kernel/kexec.c +++ linux-2.6/kernel/kexec.c @@ -1454,7 +1454,11 @@ int kernel_kexec(void) if (error) goto Resume_devices; device_pm_lock(); - local_irq_disable(); + + error = disable_device_irqs(); + if (error) + goto Unlock_pm; + /* At this point, device_suspend() has been called, * but *not* device_power_down(). We *must* * device_power_down() now. Otherwise, drivers for @@ -1466,6 +1470,7 @@ int kernel_kexec(void) if (error) goto Enable_irqs; + local_irq_disable(); /* Suspend system devices */ error = sysdev_suspend(PMSG_FREEZE); if (error) @@ -1484,9 +1489,11 @@ int kernel_kexec(void) if (kexec_image->preserve_context) { sysdev_resume(); Power_up_devices: + local_irq_enable(); device_power_up(PMSG_RESTORE); Enable_irqs: - local_irq_enable(); + enable_device_irqs(); + Unlock_pm: device_pm_unlock(); enable_nonboot_cpus(); Resume_devices: ^ permalink raw reply [flat|nested] 373+ messages in thread
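The reordering that the patch above makes in suspend_enter() can be sketched as a small userspace trace. This is only a model under stated assumptions: every demo_* name is hypothetical, the kernel calls are replaced by strings recorded in an array, disable_device_irqs() and sysdev_suspend() are assumed to succeed, and only the device_power_down() failure path is modelled.

```c
#include <assert.h>
#include <string.h>

#define DEMO_MAX_STEPS 16

/* Recorded call order; stands in for the real kernel calls. */
static const char *demo_trace[DEMO_MAX_STEPS];
static int demo_nsteps;

static void demo_step(const char *name)
{
	if (demo_nsteps < DEMO_MAX_STEPS)
		demo_trace[demo_nsteps++] = name;
}

/*
 * Model of the reworked suspend_enter().
 * power_down_error is the value the stubbed device_power_down() returns.
 */
int demo_suspend_enter(int power_down_error)
{
	int error;

	demo_nsteps = 0;

	/* Device lines are masked, but the CPU still takes timer interrupts. */
	demo_step("disable_device_irqs");

	/* "Late" suspend callbacks run here with timer interrupts working. */
	demo_step("device_power_down");
	error = power_down_error;
	if (error)
		goto done;

	/* Only now are interrupts disabled on the CPU itself. */
	demo_step("arch_suspend_disable_irqs");
	demo_step("sysdev_suspend");
	/* (the platform would enter the sleep state here) */
	demo_step("sysdev_resume");
	demo_step("arch_suspend_enable_irqs");

	/* "Early" resume callbacks run with timer interrupts working again. */
	demo_step("device_power_up");
done:
	demo_step("enable_device_irqs");
	return error;
}

/* Self-check of the two paths modelled above. */
void demo_check_order(void)
{
	assert(demo_suspend_enter(0) == 0);
	assert(demo_nsteps == 8);
	assert(strcmp(demo_trace[0], "disable_device_irqs") == 0);
	assert(strcmp(demo_trace[2], "arch_suspend_disable_irqs") == 0);
	assert(strcmp(demo_trace[6], "device_power_up") == 0);
	assert(strcmp(demo_trace[7], "enable_device_irqs") == 0);

	assert(demo_suspend_enter(-1) == -1);
	assert(demo_nsteps == 3);
	assert(strcmp(demo_trace[2], "enable_device_irqs") == 0);
}
```

Calling demo_check_order() from a main() exercises both paths. The point of the sketch is only the bracketing: the device-IRQ disable/enable pair is the outermost one, and the CPU-level interrupt disable is pushed inward so that it covers just the sysdev stage.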
* Re: [RFC][PATCH 2/2] PM: Rework handling of interrupts during suspend-resume
  2009-02-22 17:39 ` [RFC][PATCH 2/2] PM: Rework handling of interrupts during suspend-resume Rafael J. Wysocki
@ 2009-02-22 18:01   ` Linus Torvalds
  2009-02-22 22:42     ` Rafael J. Wysocki
  2009-02-22 22:42     ` Rafael J. Wysocki
  2009-02-22 18:01   ` Linus Torvalds
                     ` (2 subsequent siblings)
  3 siblings, 2 replies; 373+ messages in thread
From: Linus Torvalds @ 2009-02-22 18:01 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: LKML, Ingo Molnar, Eric W. Biederman, Benjamin Herrenschmidt,
	Jeremy Fitzhardinge, pm list, Len Brown, Jesse Barnes,
	Thomas Gleixner

On Sun, 22 Feb 2009, Rafael J. Wysocki wrote:
>
> Use these functions to rework the handling of interrupts during
> suspend (hibernation) and resume. Namely, interrupts will only be
> disabled on the CPU right before suspending sysdevs, while device
> interrupts will be disabled (at the IO-APIC level), with the help of
> the new helper function, before calling "late" suspend callbacks
> provided by device drivers and analogously during resume.

I think this patch is actually a bit too complicated.

> +struct disabled_irq {
> +	struct list_head list;
> +	int irq;
> +};
> +
> +static LIST_HEAD(resume_irqs_list);
> +
> +/**
> + * enable_device_irqs - enable interrupts disabled by disable_device_irqs()
> + *
> + * Enable all interrupt lines previously disabled by disable_device_irqs()
> + * that are on resume_irqs_list.
> + */
> +void enable_device_irqs(void)
> +{
> +	struct disabled_irq *resume_irq, *tmp;
> +
> +	list_for_each_entry_safe(resume_irq, tmp, &resume_irqs_list, list) {
> +		enable_irq(resume_irq->irq);
> +		list_del(&resume_irq->list);
> +		kfree(resume_irq);
> +	}
> +}

Don't do this whole separate list. Instead, just add a per-irq-descriptor
flag to the desc->status field that says "suspended". IOW, just do
something like

	diff --git a/include/linux/irq.h b/include/linux/irq.h
	index f899b50..7bc2a31 100644
	--- a/include/linux/irq.h
	+++ b/include/linux/irq.h
	@@ -65,6 +65,7 @@ typedef void (*irq_flow_handler_t)(unsigned int irq,
	 #define IRQ_SPURIOUS_DISABLED 0x00800000 /* IRQ was disabled by the spurious trap */
	 #define IRQ_MOVE_PCNTXT 0x01000000 /* IRQ migration from process context */
	 #define IRQ_AFFINITY_SET 0x02000000 /* IRQ affinity was set from userspace*/
	+#define IRQ_SUSPENDED 0x04000000 /* IRQ has gone through suspend sequence */
	 
	 #ifdef CONFIG_IRQ_PER_CPU
	 # define CHECK_IRQ_PER_CPU(var) ((var) & IRQ_PER_CPU)

and then just make the suspend sequence do

	for_each_irq_desc(irq, desc) {
		.. check desc if we should disable it ..
		disable_irq(irq);
		desc->status |= IRQ_SUSPENDED;
	}

and the resume sequence do

	for_each_irq_desc(irq, desc) {
		if (!(desc->status & IRQ_SUSPENDED))
			continue;
		desc->status &= ~IRQ_SUSPENDED;
		enable_irq(irq);
	}

And that simplification then gets rid of

> +/**
> + * disable_device_irqs - disable all enabled interrupt lines
> + *
> + * During system-wide suspend or hibernation device interrupts need to be
> + * disabled at the chip level and this function is provided for this
> + * purpose. It disables all interrupt lines that are enabled at the
> + * moment and saves their numbers for enable_device_irqs().
> + */
> +int disable_device_irqs(void)
> +{
> +	struct irq_desc *desc;
> +	int irq;
> +
> +	for_each_irq_desc(irq, desc) {
> +		unsigned long flags;
> +		struct disabled_irq *resume_irq;
> +		struct irqaction *action;
> +		bool is_timer_irq;
> +
> +		resume_irq = kzalloc(sizeof(*resume_irq), GFP_NOIO);
> +		if (!resume_irq) {
> +			enable_device_irqs();
> +			return -ENOMEM;
> +		}

this just goes away.

> +		is_timer_irq = false;
> +		action = desc->action;
> +		while (action) {
> +			if (action->flags | IRQF_TIMER) {
> +				is_timer_irq = true;
> +				break;
> +			}
> +			action = action->next;
> +		}

This is also pointless and wrong (and buggy). You should use '&' to
test that flag, not '|', but more importantly, if you share interrupts
with a timer irq, there's nothing sane the irq layer can do ANYWAY, so
just ignore the whole problem. Just look at the first one, don't try to
be clever, because your clever code doesn't buy anything at all.

So get rid of the loop, and just do

	if (desc->action && !(desc->action->flags & IRQF_TIMER)) {
		desc->depth++;
		desc->status |= IRQ_DISABLED | IRQ_SUSPENDED;
		desc->chip->disable(irq);
	}
	spin_unlock_irqrestore(&desc->lock, flags);

and you're done.

Also, I'd actually suggest that the whole "synchronize_irq()" be handled
in a separate loop after the main one, so make that one just be

	for_each_irq_desc(irq, desc) {
		if (desc->status & IRQ_SUSPENDED)
			synchronize_irq(irq);
	}

at the end. No need for desc->lock, since the IRQ_SUSPENDED bit is stable.

Finally:

> +extern int disable_device_irqs(void);
> +extern void enable_device_irqs(void);

I think the naming is not great. It's not about disable/enable, it's very
much about suspend/resume. In your version, it had that global
"disabled_irq" list, and in mine it has that IRQ_SUSPENDED bit - and in
both cases you can't nest things, and you can't consider them in any way
"generic" enable/disable things, they are very specialized "shut up
everything but the timer irq".

I also don't think there is any reasonable error case, so just make the
"suspend" thing return 'void', and don't complicate the caller. We don't
error out on the simple "disable_irq()" either. It's an imperative
statement, not a "please can you try to do that" thing.

		Linus

^ permalink raw reply	[flat|nested] 373+ messages in thread
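The '&' versus '|' point in the review above is easy to check in isolation. The following is a minimal userspace sketch, not kernel code: DEMO_IRQF_TIMER is a made-up stand-in value (the real IRQF_TIMER lives in <linux/interrupt.h>), and the function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Stand-in flag bit for this sketch only; not the kernel's definition. */
#define DEMO_IRQF_TIMER 0x00000200UL
#define DEMO_IRQF_OTHER 0x00000080UL

/*
 * Form used in the first version of the patch: OR-ing in a nonzero
 * constant always gives a nonzero result, so every interrupt line
 * looks like a timer interrupt and nothing gets disabled.
 */
bool is_timer_buggy(unsigned long flags)
{
	return (flags | DEMO_IRQF_TIMER) != 0;
}

/* Corrected form: AND keeps only the timer bit, so the test is real. */
bool is_timer_fixed(unsigned long flags)
{
	return (flags & DEMO_IRQF_TIMER) != 0;
}

/* Self-check covering a plain device line and a timer line. */
void demo_flag_tests(void)
{
	/* A handler with no timer flag is still reported as a timer. */
	assert(is_timer_buggy(0UL));
	assert(is_timer_buggy(DEMO_IRQF_OTHER));

	/* The masked test tells the two cases apart. */
	assert(!is_timer_fixed(0UL));
	assert(!is_timer_fixed(DEMO_IRQF_OTHER));
	assert(is_timer_fixed(DEMO_IRQF_TIMER));
	assert(is_timer_fixed(DEMO_IRQF_TIMER | DEMO_IRQF_OTHER));
}
```

Calling demo_flag_tests() from a main() runs the checks. With the '|' form the first patch would have classified every line as a timer interrupt and skipped it, which is why the test has to be a mask.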
* Re: [RFC][PATCH 2/2] PM: Rework handling of interrupts during suspend-resume 2009-02-22 18:01 ` Linus Torvalds @ 2009-02-22 22:42 ` Rafael J. Wysocki 2009-02-22 23:48 ` Rafael J. Wysocki 2009-02-22 23:48 ` Rafael J. Wysocki 2009-02-22 22:42 ` Rafael J. Wysocki 1 sibling, 2 replies; 373+ messages in thread From: Rafael J. Wysocki @ 2009-02-22 22:42 UTC (permalink / raw) To: Linus Torvalds Cc: LKML, Ingo Molnar, Eric W. Biederman, Benjamin Herrenschmidt, Jeremy Fitzhardinge, pm list, Len Brown, Jesse Barnes, Thomas Gleixner On Sunday 22 February 2009, Linus Torvalds wrote: > > On Sun, 22 Feb 2009, Rafael J. Wysocki wrote: > > > > Use these functions to rework the handling of interrupts during > > suspend (hibernation) and resume. Namely, interrupts will only be > > disabled on the CPU right before suspending sysdevs, while device > > interrupts will be disabled (at the IO-APIC level), with the help of > > the new helper function, before calling "late" suspend callbacks > > provided by device drivers and analogously during resume. > > I think this patch is actually a bit too complicated. > > > +struct disabled_irq { > > + struct list_head list; > > + int irq; > > +}; > > + > > +static LIST_HEAD(resume_irqs_list); > > + > > +/** > > + * enable_device_irqs - enable interrupts disabled by disable_device_irqs() > > + * > > + * Enable all interrupt lines previously disabled by disable_device_irqs() > > + * that are on resume_irqs_list. > > + */ > > +void enable_device_irqs(void) > > +{ > > + struct disabled_irq *resume_irq, *tmp; > > + > > + list_for_each_entry_safe(resume_irq, tmp, &resume_irqs_list, list) { > > + enable_irq(resume_irq->irq); > > + list_del(&resume_irq->list); > > + kfree(resume_irq); > > + } > > +} > > Don't do this whole separate list. Instead, just add a per-irq-descriptor > flag to the desc->status field that says "suspended". 
IOW, just do > something like OK > diff --git a/include/linux/irq.h b/include/linux/irq.h > index f899b50..7bc2a31 100644 > --- a/include/linux/irq.h > +++ b/include/linux/irq.h > @@ -65,6 +65,7 @@ typedef void (*irq_flow_handler_t)(unsigned int irq, > #define IRQ_SPURIOUS_DISABLED 0x00800000 /* IRQ was disabled by the spurious trap */ > #define IRQ_MOVE_PCNTXT 0x01000000 /* IRQ migration from process context */ > #define IRQ_AFFINITY_SET 0x02000000 /* IRQ affinity was set from userspace*/ > +#define IRQ_SUSPENDED 0x04000000 /* IRQ has gone through suspend sequence */ > > #ifdef CONFIG_IRQ_PER_CPU > # define CHECK_IRQ_PER_CPU(var) ((var) & IRQ_PER_CPU) > > and then just make the suspend sequence do > > for_each_irq_desc(irq, desc) { > .. check desc if we should disable it .. > disable_irq(irq); > desc->status |= IRQ_SUSPENDED; > } > > and the resume sequence do > > for_each_irq_desc(irq, desc) { > if (!(desc->status & IRQ_SUSPENDED)) > continue; > desc->status &= ~IRQ_SUSPENDED; > enabled_irq(irq); > } > > And that simplifcation then gets rid of > > > +/** > > + * disable_device_irqs - disable all enabled interrupt lines > > + * > > + * During system-wide suspend or hibernation device interrupts need to be > > + * disabled at the chip level and this function is provided for this > > + * purpose. It disables all interrupt lines that are enabled at the > > + * moment and saves their numbers for enable_device_irqs(). > > + */ > > +int disable_device_irqs(void) > > +{ > > + struct irq_desc *desc; > > + int irq; > > + > > + for_each_irq_desc(irq, desc) { > > + unsigned long flags; > > + struct disabled_irq *resume_irq; > > + struct irqaction *action; > > + bool is_timer_irq; > > + > > + resume_irq = kzalloc(sizeof(*resume_irq), GFP_NOIO); > > + if (!resume_irq) { > > + enable_device_irqs(); > > + return -ENOMEM; > > + } > > this just goes away. 
> > > + is_timer_irq = false; > > + action = desc->action; > > + while (action) { > > + if (action->flags | IRQF_TIMER) { > > + is_timer_irq = true; > > + break; > > + } > > + action = action->next; > > + } > > This is also pointless and wrong (and buggy). You should use '&' to > test that flag, not '|', Ouch, sorry. > but more importantly, if you share interrupts with a timer irq, there's > nothing sane the irq layer can do ANYWAY, so just ignore the whole problem. > Just look at the first one, don't try to be clever, because your clever code > doesn't buy anything at all. > > So get rid of the loop, and just do > > if (desc->action && !(desc->action->flags & IRQF_TIMER)) { > desc->depth++; > desc->status |= IRQ_DISABLED | IRQ_SUSPENDED; > desc->chip->disable(irq); > } > spin_unlock_irqrestore(&desc->lock, flags); > > and you're done. OK > Also, I'd actually suggest that the whole "synchronize_irq()" be handled > in a separate loop after the main one, so make that one just be > > for_each_irq_desc(irq, desc) { > if (desc->status & IRQ_SUSPENDED) > serialize_irq(irq); > } > > at the end. No need for desc->lock, since the IRQ_SUSPENDED bit is stable. OK > Finally: > > > +extern int disable_device_irqs(void); > > +extern void enable_device_irqs(void); > > I think the naming is not great. It's not about disable/enable, it's very > much about suspend/resume. In your version, it had that global > "disabled_irq" list, and in mine it has that IRQ_SUSPENDED bit - and in > both cases you can't nest things, and you can't consider them in any way > "generic" enable/disable things, they are very specialized "shut up > everything but the timer irq". OK, would extern void suspend_device_irqs(void); extern void resume_device_irqs(void); be better? > I also don't think there is any reasonable error case, so just make the > "suspend" thing return 'void', and don't complicate the caller. We don't > error out on the simple "disable_irq()" either. 
It's a imperative > statement, not a "please can you try to do that" thing. The error is there just because the memory allocation can fail. With the IRQ_SUSPENDED flag as per your suggestion it won't be necessary any more. Thanks a lot for your comments, I'll send an updated patch shortly. Rafael ^ permalink raw reply [flat|nested] 373+ messages in thread
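Why the flag-based scheme cannot fail can be seen in a small userspace model. This is a sketch under assumptions, not the kernel implementation: all demo_* and DEMO_* names are hypothetical, the descriptor is reduced to four fields, locking and the chip-level disable call are left out, and the "only touch lines that are enabled at the moment" rule is taken from the kerneldoc comment in the original patch rather than from the review.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in bits for this model only. */
#define DEMO_IRQ_DISABLED  0x1u
#define DEMO_IRQ_SUSPENDED 0x2u
#define DEMO_IRQF_TIMER    0x1u

struct demo_irq_desc {
	unsigned int status;       /* DEMO_IRQ_* bits */
	unsigned int depth;        /* disable nesting count, 0 means enabled */
	int has_action;            /* nonzero if a handler is installed */
	unsigned int action_flags; /* flags of the first handler only */
};

/* No allocation anywhere, so there is nothing to report: returns void. */
void demo_suspend_irqs(struct demo_irq_desc *descs, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		struct demo_irq_desc *d = &descs[i];

		if (d->depth == 0 && d->has_action &&
		    !(d->action_flags & DEMO_IRQF_TIMER)) {
			d->depth++;
			d->status |= DEMO_IRQ_DISABLED | DEMO_IRQ_SUSPENDED;
		}
	}
}

/* Re-enable exactly the lines that the suspend pass marked. */
void demo_resume_irqs(struct demo_irq_desc *descs, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		struct demo_irq_desc *d = &descs[i];

		if (!(d->status & DEMO_IRQ_SUSPENDED))
			continue;
		d->status &= ~DEMO_IRQ_SUSPENDED;
		/* Stands in for enable_irq(): drop one nesting level. */
		if (d->depth > 0 && --d->depth == 0)
			d->status &= ~DEMO_IRQ_DISABLED;
	}
}

/* Self-check: device line, timer line, line a driver disabled earlier. */
void demo_check_flags(void)
{
	struct demo_irq_desc descs[3] = {
		{ 0, 0, 1, 0 },                 /* ordinary device IRQ */
		{ 0, 0, 1, DEMO_IRQF_TIMER },   /* timer IRQ */
		{ DEMO_IRQ_DISABLED, 1, 1, 0 }, /* already disabled */
	};

	demo_suspend_irqs(descs, 3);
	assert(descs[0].status == (DEMO_IRQ_DISABLED | DEMO_IRQ_SUSPENDED));
	assert(descs[0].depth == 1);
	assert(descs[1].status == 0 && descs[1].depth == 0);
	assert(descs[2].status == DEMO_IRQ_DISABLED && descs[2].depth == 1);

	demo_resume_irqs(descs, 3);
	assert(descs[0].status == 0 && descs[0].depth == 0);
	assert(descs[1].status == 0 && descs[1].depth == 0);
	assert(descs[2].status == DEMO_IRQ_DISABLED && descs[2].depth == 1);
}
```

Calling demo_check_flags() from a main() runs the model. Because the "suspended" state lives in the descriptor itself, the line that a driver had already disabled is left alone on resume, and the suspend side needs neither a list nor an error return.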
* Re: [RFC][PATCH 2/2] PM: Rework handling of interrupts during suspend-resume 2009-02-22 22:42 ` Rafael J. Wysocki @ 2009-02-22 23:48 ` Rafael J. Wysocki 2009-02-23 0:05 ` Linus Torvalds ` (5 more replies) 2009-02-22 23:48 ` Rafael J. Wysocki 1 sibling, 6 replies; 373+ messages in thread From: Rafael J. Wysocki @ 2009-02-22 23:48 UTC (permalink / raw) To: Linus Torvalds Cc: LKML, Ingo Molnar, Eric W. Biederman, Benjamin Herrenschmidt, Jeremy Fitzhardinge, pm list, Len Brown, Jesse Barnes, Thomas Gleixner On Sunday 22 February 2009, Rafael J. Wysocki wrote: > On Sunday 22 February 2009, Linus Torvalds wrote: > > > > On Sun, 22 Feb 2009, Rafael J. Wysocki wrote: [--snip--] > > Thanks a lot for your comments, I'll send an updated patch shortly. The updated patch is appended. It has been initially tested, but requires more testing, especially with APM, XEN, kexec jump etc. Thanks, Rafael --- From: Rafael J. Wysocki <rjw@sisk.pl> Subject: PM: Rework handling of interrupts during suspend-resume (rev. 2) Introduce two helper functions allowing us to disable device interrupts (at the IO-APIC level) during suspend or hibernation and enable them during the subsequent resume, respectively, so that the timer interrupts are enabled while "late" suspend callbacks and "early" resume callbacks provided by device drivers are being executed. Use these functions to rework the handling of interrupts during suspend (hibernation) and resume. Namely, interrupts will only be disabled on the CPU right before suspending sysdevs, while device interrupts will be disabled (at the IO-APIC level), with the help of the new helper function, before calling "late" suspend callbacks provided by device drivers and analogously during resume. Signed-off-by: Rafael J. 
Wysocki <rjw@sisk.pl> --- arch/x86/kernel/apm_32.c | 20 ++++++++++++---- drivers/xen/manage.c | 32 +++++++++++++++---------- include/linux/interrupt.h | 3 ++ include/linux/irq.h | 1 kernel/irq/manage.c | 57 ++++++++++++++++++++++++++++++++++++++++++++++ kernel/kexec.c | 10 ++++---- kernel/power/disk.c | 46 +++++++++++++++++++++++++++++-------- kernel/power/main.c | 20 +++++++++++----- 8 files changed, 152 insertions(+), 37 deletions(-) Index: linux-2.6/kernel/irq/manage.c =================================================================== --- linux-2.6.orig/kernel/irq/manage.c +++ linux-2.6/kernel/irq/manage.c @@ -746,3 +746,60 @@ int request_irq(unsigned int irq, irq_ha return retval; } EXPORT_SYMBOL(request_irq); + +#ifdef CONFIG_PM_SLEEP +/** + * suspend_device_irqs - disable all currently enabled interrupt lines + * + * During system-wide suspend or hibernation device interrupts need to be + * disabled at the chip level and this function is provided for this + * purpose. It disables all interrupt lines that are enabled at the + * moment and sets the IRQ_SUSPENDED flag for them. + */ +void suspend_device_irqs(void) +{ + struct irq_desc *desc; + int irq; + + for_each_irq_desc(irq, desc) { + unsigned long flags; + + spin_lock_irqsave(&desc->lock, flags); + + if (!desc->depth && desc->action + && !(desc->action->flags & IRQF_TIMER)) { + desc->depth++; + desc->status |= IRQ_DISABLED | IRQ_SUSPENDED; + desc->chip->disable(irq); + } + + spin_unlock_irqrestore(&desc->lock, flags); + } + + for_each_irq_desc(irq, desc) { + if (desc->status & IRQ_SUSPENDED) + synchronize_irq(irq); + } +} +EXPORT_SYMBOL_GPL(suspend_device_irqs); + +/** + * resume_device_irqs - enable interrupts disabled by suspend_device_irqs() + * + * Enable all interrupt lines previously disabled by suspend_device_irqs() + * that have the IRQ_SUSPENDED flag set. 
+ */ +void resume_device_irqs(void) +{ + struct irq_desc *desc; + int irq; + + for_each_irq_desc(irq, desc) { + if (!(desc->status & IRQ_SUSPENDED)) + continue; + desc->status &= ~IRQ_SUSPENDED; + enable_irq(irq); + } +} +EXPORT_SYMBOL_GPL(resume_device_irqs); +#endif /* CONFIG_PM_SLEEP */ Index: linux-2.6/include/linux/interrupt.h =================================================================== --- linux-2.6.orig/include/linux/interrupt.h +++ linux-2.6/include/linux/interrupt.h @@ -470,4 +470,7 @@ extern int early_irq_init(void); extern int arch_early_irq_init(void); extern int arch_init_chip_data(struct irq_desc *desc, int cpu); +extern void suspend_device_irqs(void); +extern void resume_device_irqs(void); + #endif Index: linux-2.6/kernel/power/main.c =================================================================== --- linux-2.6.orig/kernel/power/main.c +++ linux-2.6/kernel/power/main.c @@ -22,6 +22,7 @@ #include <linux/freezer.h> #include <linux/vmstat.h> #include <linux/syscalls.h> +#include <linux/interrupt.h> #include "power.h" @@ -287,17 +288,20 @@ void __attribute__ ((weak)) arch_suspend */ static int suspend_enter(suspend_state_t state) { - int error = 0; + int error; device_pm_lock(); - arch_suspend_disable_irqs(); - BUG_ON(!irqs_disabled()); + suspend_device_irqs(); - if ((error = device_power_down(PMSG_SUSPEND))) { + error = device_power_down(PMSG_SUSPEND); + if (error) { printk(KERN_ERR "PM: Some devices failed to power down\n"); goto Done; } + arch_suspend_disable_irqs(); + BUG_ON(!irqs_disabled()); + error = sysdev_suspend(PMSG_SUSPEND); if (!error) { if (!suspend_test(TEST_CORE)) @@ -305,11 +309,15 @@ static int suspend_enter(suspend_state_t sysdev_resume(); } - device_power_up(PMSG_RESUME); - Done: arch_suspend_enable_irqs(); BUG_ON(irqs_disabled()); + + device_power_up(PMSG_RESUME); + + Done: + resume_device_irqs(); device_pm_unlock(); + return error; } Index: linux-2.6/kernel/power/disk.c 
=================================================================== --- linux-2.6.orig/kernel/power/disk.c +++ linux-2.6/kernel/power/disk.c @@ -22,6 +22,7 @@ #include <linux/console.h> #include <linux/cpu.h> #include <linux/freezer.h> +#include <linux/interrupt.h> #include "power.h" @@ -214,7 +215,8 @@ static int create_image(int platform_mod return error; device_pm_lock(); - local_irq_disable(); + suspend_device_irqs(); + /* At this point, device_suspend() has been called, but *not* * device_power_down(). We *must* call device_power_down() now. * Otherwise, drivers for some devices (e.g. interrupt controllers) @@ -225,8 +227,11 @@ static int create_image(int platform_mod if (error) { printk(KERN_ERR "PM: Some devices failed to power down, " "aborting hibernation\n"); - goto Enable_irqs; + goto Unlock; } + + local_irq_disable(); + sysdev_suspend(PMSG_FREEZE); if (error) { printk(KERN_ERR "PM: Some devices failed to power down, " @@ -252,12 +257,17 @@ static int create_image(int platform_mod /* NOTE: device_power_up() is just a resume() for devices * that suspended with irqs off ... no overall powerup. */ + Power_up_devices: + local_irq_enable(); + device_power_up(in_suspend ? (error ? 
PMSG_RECOVER : PMSG_THAW) : PMSG_RESTORE); - Enable_irqs: - local_irq_enable(); + + Unlock: + resume_device_irqs(); device_pm_unlock(); + return error; } @@ -336,13 +346,17 @@ static int resume_target_kernel(void) int error; device_pm_lock(); - local_irq_disable(); + suspend_device_irqs(); + error = device_power_down(PMSG_QUIESCE); if (error) { printk(KERN_ERR "PM: Some devices failed to power down, " "aborting resume\n"); - goto Enable_irqs; + goto Unlock; } + + local_irq_disable(); + sysdev_suspend(PMSG_QUIESCE); /* We'll ignore saved state, but this gets preempt count (etc) right */ save_processor_state(); @@ -366,11 +380,17 @@ static int resume_target_kernel(void) swsusp_free(); restore_processor_state(); touch_softlockup_watchdog(); + sysdev_resume(); - device_power_up(PMSG_RECOVER); - Enable_irqs: + local_irq_enable(); + + device_power_up(PMSG_RECOVER); + + Unlock: + resume_device_irqs(); device_pm_unlock(); + return error; } @@ -447,15 +467,18 @@ int hibernation_platform_enter(void) goto Finish; device_pm_lock(); - local_irq_disable(); + suspend_device_irqs(); + error = device_power_down(PMSG_HIBERNATE); if (!error) { + local_irq_disable(); sysdev_suspend(PMSG_HIBERNATE); hibernation_ops->enter(); /* We should never get here */ while (1); } - local_irq_enable(); + + resume_device_irqs(); device_pm_unlock(); /* @@ -464,12 +487,15 @@ int hibernation_platform_enter(void) */ Finish: hibernation_ops->finish(); + Resume_devices: entering_platform_hibernation = false; device_resume(PMSG_RESTORE); resume_console(); + Close: hibernation_ops->end(); + return error; } Index: linux-2.6/arch/x86/kernel/apm_32.c =================================================================== --- linux-2.6.orig/arch/x86/kernel/apm_32.c +++ linux-2.6/arch/x86/kernel/apm_32.c @@ -228,6 +228,7 @@ #include <linux/suspend.h> #include <linux/kthread.h> #include <linux/jiffies.h> +#include <linux/interrupt.h> #include <asm/system.h> #include <asm/uaccess.h> @@ -1190,8 +1191,11 @@ static int 
suspend(int vetoable) struct apm_user *as; device_suspend(PMSG_SUSPEND); - local_irq_disable(); + + suspend_device_irqs(); device_power_down(PMSG_SUSPEND); + + local_irq_disable(); sysdev_suspend(PMSG_SUSPEND); local_irq_enable(); @@ -1209,9 +1213,13 @@ static int suspend(int vetoable) if (err != APM_SUCCESS) apm_error("suspend", err); err = (err == APM_SUCCESS) ? 0 : -EIO; + sysdev_resume(); - device_power_up(PMSG_RESUME); local_irq_enable(); + + device_power_up(PMSG_RESUME); + resume_device_irqs(); + device_resume(PMSG_RESUME); queue_event(APM_NORMAL_RESUME, NULL); spin_lock(&user_list_lock); @@ -1228,8 +1236,10 @@ static void standby(void) { int err; - local_irq_disable(); + suspend_device_irqs(); device_power_down(PMSG_SUSPEND); + + local_irq_disable(); sysdev_suspend(PMSG_SUSPEND); local_irq_enable(); @@ -1239,8 +1249,10 @@ static void standby(void) local_irq_disable(); sysdev_resume(); - device_power_up(PMSG_RESUME); local_irq_enable(); + + device_power_up(PMSG_RESUME); + resume_device_irqs(); } static apm_event_t get_event(void) Index: linux-2.6/drivers/xen/manage.c =================================================================== --- linux-2.6.orig/drivers/xen/manage.c +++ linux-2.6/drivers/xen/manage.c @@ -39,12 +39,6 @@ static int xen_suspend(void *data) BUG_ON(!irqs_disabled()); - err = device_power_down(PMSG_SUSPEND); - if (err) { - printk(KERN_ERR "xen_suspend: device_power_down failed: %d\n", - err); - return err; - } err = sysdev_suspend(PMSG_SUSPEND); if (err) { printk(KERN_ERR "xen_suspend: sysdev_suspend failed: %d\n", @@ -69,13 +63,6 @@ static int xen_suspend(void *data) xen_mm_unpin_all(); sysdev_resume(); - device_power_up(PMSG_RESUME); - - if (!*cancelled) { - xen_irq_resume(); - xen_console_resume(); - xen_timer_resume(); - } return 0; } @@ -108,6 +95,14 @@ static void do_suspend(void) /* XXX use normal device tree? 
*/ xenbus_suspend(); + suspend_device_irqs(); + + err = device_power_down(PMSG_SUSPEND); + if (err) { + printk(KERN_ERR "device_power_down failed: %d\n", err); + goto resume_devices; + } + err = stop_machine(xen_suspend, &cancelled, &cpumask_of_cpu(0)); if (err) { printk(KERN_ERR "failed to start xen_suspend: %d\n", err); @@ -120,6 +115,17 @@ static void do_suspend(void) } else xenbus_suspend_cancel(); + device_power_up(PMSG_RESUME); + + if (!cancelled) { + xen_irq_resume(); + xen_console_resume(); + xen_timer_resume(); + } + +resume_devices: + resume_device_irqs(); + device_resume(PMSG_RESUME); /* Make sure timer events get retriggered on all CPUs */ Index: linux-2.6/kernel/kexec.c =================================================================== --- linux-2.6.orig/kernel/kexec.c +++ linux-2.6/kernel/kexec.c @@ -1454,7 +1454,7 @@ int kernel_kexec(void) if (error) goto Resume_devices; device_pm_lock(); - local_irq_disable(); + suspend_device_irqs(); /* At this point, device_suspend() has been called, * but *not* device_power_down(). We *must* * device_power_down() now. 
Otherwise, drivers for @@ -1464,8 +1464,9 @@ int kernel_kexec(void) */ error = device_power_down(PMSG_FREEZE); if (error) - goto Enable_irqs; + goto Resume_irqs; + local_irq_disable(); /* Suspend system devices */ error = sysdev_suspend(PMSG_FREEZE); if (error) @@ -1484,9 +1485,10 @@ int kernel_kexec(void) if (kexec_image->preserve_context) { sysdev_resume(); Power_up_devices: - device_power_up(PMSG_RESTORE); - Enable_irqs: local_irq_enable(); + device_power_up(PMSG_RESTORE); + Resume_irqs: + resume_device_irqs(); device_pm_unlock(); enable_nonboot_cpus(); Resume_devices: Index: linux-2.6/include/linux/irq.h =================================================================== --- linux-2.6.orig/include/linux/irq.h +++ linux-2.6/include/linux/irq.h @@ -65,6 +65,7 @@ typedef void (*irq_flow_handler_t)(unsig #define IRQ_SPURIOUS_DISABLED 0x00800000 /* IRQ was disabled by the spurious trap */ #define IRQ_MOVE_PCNTXT 0x01000000 /* IRQ migration from process context */ #define IRQ_AFFINITY_SET 0x02000000 /* IRQ affinity was set from userspace*/ +#define IRQ_SUSPENDED 0x04000000 /* IRQ has gone through suspend sequence */ #ifdef CONFIG_IRQ_PER_CPU # define CHECK_IRQ_PER_CPU(var) ((var) & IRQ_PER_CPU) ^ permalink raw reply [flat|nested] 373+ messages in thread
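As a rough illustration of the helper pair added to kernel/irq/manage.c in the patch above, here is a userspace sketch only, not kernel code. struct model_irq and the model_* functions are hypothetical stand-ins for struct irq_desc and the two helpers; locking, synchronize_irq() and the irq chip callbacks are deliberately left out. It shows the selection rule: only lines that are currently enabled, have a handler and are not timer interrupts get disabled and flagged, and only flagged lines are re-enabled on resume.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for struct irq_desc; not the kernel type. */
struct model_irq {
	unsigned int depth;  /* disable nesting count, 0 means enabled */
	bool has_action;     /* a handler is registered on this line */
	bool is_timer;       /* handler was registered with IRQF_TIMER */
	bool suspended;      /* models the IRQ_SUSPENDED status bit */
};

/* Disable every enabled, non-timer line that has a handler; return the count. */
size_t model_suspend_device_irqs(struct model_irq *irqs, size_t n)
{
	size_t i, count = 0;

	assert(irqs != NULL || n == 0);
	for (i = 0; i < n; i++) {
		struct model_irq *d = &irqs[i];

		if (d->depth == 0 && d->has_action && !d->is_timer) {
			d->depth++;
			d->suspended = true;
			count++;
		}
	}
	return count;
}

/* Re-enable only the lines flagged by model_suspend_device_irqs(). */
size_t model_resume_device_irqs(struct model_irq *irqs, size_t n)
{
	size_t i, count = 0;

	assert(irqs != NULL || n == 0);
	for (i = 0; i < n; i++) {
		struct model_irq *d = &irqs[i];

		if (!d->suspended)
			continue;
		d->suspended = false;
		if (d->depth > 0)
			d->depth--;
		count++;
	}
	return count;
}
```

A line that a driver had already disabled before suspend is never flagged, so the resume pass leaves its disable depth alone.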
* Re: [RFC][PATCH 2/2] PM: Rework handling of interrupts during suspend-resume 2009-02-22 23:48 ` Rafael J. Wysocki @ 2009-02-23 0:05 ` Linus Torvalds 2009-02-23 0:05 ` Linus Torvalds ` (4 subsequent siblings) 5 siblings, 0 replies; 373+ messages in thread From: Linus Torvalds @ 2009-02-23 0:05 UTC (permalink / raw) To: Rafael J. Wysocki Cc: Jeremy Fitzhardinge, LKML, Jesse Barnes, Thomas Gleixner, Eric W. Biederman, Ingo Molnar, pm list On Mon, 23 Feb 2009, Rafael J. Wysocki wrote: > > The updated patch is appended. Ok, looks sane to me. I'll try it on my poor eeepc, although right now Fedora-11 rawhide (that poor laptop gets _all_ the crazy stuff thrown at it, and runs btrfs to boot) has broken something in X by enabling DRI2, so I may not get around to it today. Linus ^ permalink raw reply [flat|nested] 373+ messages in thread
* Re: [RFC][PATCH 2/2] PM: Rework handling of interrupts during suspend-resume 2009-02-23 0:05 ` Linus Torvalds @ 2009-02-23 1:23 ` Linus Torvalds 0 siblings, 0 replies; 373+ messages in thread From: Linus Torvalds @ 2009-02-23 1:23 UTC (permalink / raw) To: Rafael J. Wysocki Cc: LKML, Ingo Molnar, Eric W. Biederman, Benjamin Herrenschmidt, Jeremy Fitzhardinge, pm list, Len Brown, Jesse Barnes, Thomas Gleixner On Sun, 22 Feb 2009, Linus Torvalds wrote: > > Ok, looks sane to me. I'll try it on my poor eeepc, although right now > Fedora-11 rawhide (that poor laptop gets _all_ the crazy stuff thrown at > it, and runs btrfs to boot) has broken something in X by enabling DRI2, so > I may not get around to it today. Well, the suspend/resume part seems to work for me. My X issues keep me from testing it with compiz, but here's an ack for v2 of the 2/2 patch at least on my EeePC. Linus ^ permalink raw reply [flat|nested] 373+ messages in thread
* Re: [RFC][PATCH 2/2] PM: Rework handling of interrupts during suspend-resume 2009-02-23 1:23 ` Linus Torvalds (?) @ 2009-02-23 10:52 ` Rafael J. Wysocki -1 siblings, 0 replies; 373+ messages in thread From: Rafael J. Wysocki @ 2009-02-23 10:52 UTC (permalink / raw) To: Linus Torvalds Cc: LKML, Ingo Molnar, Eric W. Biederman, Benjamin Herrenschmidt, Jeremy Fitzhardinge, pm list, Len Brown, Jesse Barnes, Thomas Gleixner On Monday 23 February 2009, Linus Torvalds wrote: > > On Sun, 22 Feb 2009, Linus Torvalds wrote: > > > > Ok, looks sane to me. I'll try it on my poor eeepc, although right now > > Fedora-11 rawhide (that poor laptop gets _all_ the crazy stuff thrown at > > it, and runs btrfs to boot) has broken something in X by enabling DRI2, so > > I may not get around to it today. > > Well, the suspend/resume part seems to work for me. Great, thanks for testing. Rafael ^ permalink raw reply [flat|nested] 373+ messages in thread
* Re: [RFC][PATCH 2/2] PM: Rework handling of interrupts during suspend-resume 2009-02-22 23:48 ` Rafael J. Wysocki 2009-02-23 0:05 ` Linus Torvalds 2009-02-23 0:05 ` Linus Torvalds @ 2009-02-23 3:04 ` Eric W. Biederman 2009-02-23 8:44 ` Ingo Molnar 2009-02-23 8:44 ` Ingo Molnar 2009-02-23 3:04 ` Eric W. Biederman ` (2 subsequent siblings) 5 siblings, 2 replies; 373+ messages in thread From: Eric W. Biederman @ 2009-02-23 3:04 UTC (permalink / raw) To: Rafael J. Wysocki Cc: Linus Torvalds, LKML, Ingo Molnar, Benjamin Herrenschmidt, Jeremy Fitzhardinge, pm list, Len Brown, Jesse Barnes, Thomas Gleixner "Rafael J. Wysocki" <rjw@sisk.pl> writes: > On Sunday 22 February 2009, Rafael J. Wysocki wrote: >> On Sunday 22 February 2009, Linus Torvalds wrote: >> > >> > On Sun, 22 Feb 2009, Rafael J. Wysocki wrote: > [--snip--] >> >> Thanks a lot for your comments, I'll send an updated patch shortly. > > The updated patch is appended. > > It has been initially tested, but requires more testing, especially with APM, > XEN, kexec jump etc. > > Thanks, > Rafael > > --- > From: Rafael J. Wysocki <rjw@sisk.pl> > Subject: PM: Rework handling of interrupts during suspend-resume (rev. 2) > > Introduce two helper functions allowing us to disable device > interrupts (at the IO-APIC level) during suspend or hibernation > and enable them during the subsequent resume, respectively, so that > the timer interrupts are enabled while "late" suspend callbacks and > "early" resume callbacks provided by device drivers are being > executed. > > Use these functions to rework the handling of interrupts during > suspend (hibernation) and resume. Namely, interrupts will only be > disabled on the CPU right before suspending sysdevs, while device > interrupts will be disabled (at the IO-APIC level), with the help of > the new helper function, before calling "late" suspend callbacks > provided by device drivers and analogously during resume. 
I don't have an issue with the code, but I do have an issue with this description of it. Calling disable especially for ioapics does nothing directly. It simply arranges for the irq to be marked pending and for the irq to be masked if the irq happens. So what you are doing is arranging so that no interrupts will be delivered to drivers. Not really disabling interrupts at the IO-APIC level. In addition not all interrupts (even on x86) go through an IO-APIC anymore so describing the patch in terms of an IO-APIC makes it a bit hard to understand what your intent actually is. Eric ^ permalink raw reply [flat|nested] 373+ messages in thread
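The behaviour Eric describes here can be modelled roughly as below. This is a simplified userspace sketch under stated assumptions: struct lazy_irq and the lazy_* functions are hypothetical names, and the replay of a pending interrupt on enable is a simplification of what the generic IRQ layer does. The point it illustrates is that "disable" by itself touches no hardware state; the line is only masked, and the event remembered, if an interrupt actually arrives while the line is logically disabled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of one interrupt line with lazy disabling. */
struct lazy_irq {
	bool disabled;           /* software state: handlers must not run */
	bool masked;             /* models the hardware mask bit */
	bool pending;            /* an interrupt arrived while disabled */
	unsigned int delivered;  /* number of times the handler ran */
};

/* Lazy disable: record the intent only, do not touch the mask bit. */
void lazy_disable(struct lazy_irq *l)
{
	assert(l != NULL);
	l->disabled = true;
}

/* The hardware raises the line; returns true if the handler ran. */
bool lazy_raise(struct lazy_irq *l)
{
	assert(l != NULL);
	if (l->masked)
		return false;        /* a masked line never reaches the CPU */
	if (l->disabled) {
		l->masked = true;    /* mask only now, on the first arrival */
		l->pending = true;   /* remember it for a later replay */
		return false;
	}
	l->delivered++;
	return true;
}

/* Enable: unmask and replay one pending interrupt, if there was any. */
void lazy_enable(struct lazy_irq *l)
{
	assert(l != NULL);
	l->disabled = false;
	l->masked = false;
	if (l->pending) {
		l->pending = false;
		l->delivered++;
	}
}
```

In this model no interrupt reaches a handler between lazy_disable() and lazy_enable(), which matches the stated intent of the patch even though nothing is switched off at the controller up front.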
* Re: [RFC][PATCH 2/2] PM: Rework handling of interrupts during suspend-resume 2009-02-23 3:04 ` Eric W. Biederman @ 2009-02-23 8:44 ` Ingo Molnar 2009-02-23 8:44 ` Ingo Molnar 1 sibling, 0 replies; 373+ messages in thread From: Ingo Molnar @ 2009-02-23 8:44 UTC (permalink / raw) To: Eric W. Biederman Cc: Jeremy Fitzhardinge, LKML, Jesse Barnes, pm list, Linus Torvalds, Thomas Gleixner * Eric W. Biederman <ebiederm@xmission.com> wrote: > "Rafael J. Wysocki" <rjw@sisk.pl> writes: > > > On Sunday 22 February 2009, Rafael J. Wysocki wrote: > >> On Sunday 22 February 2009, Linus Torvalds wrote: > >> > > >> > On Sun, 22 Feb 2009, Rafael J. Wysocki wrote: > > [--snip--] > >> > >> Thanks a lot for your comments, I'll send an updated patch shortly. > > > > The updated patch is appended. > > > > It has been initially tested, but requires more testing, especially with APM, > > XEN, kexec jump etc. > > > > Thanks, > > Rafael > > > > --- > > From: Rafael J. Wysocki <rjw@sisk.pl> > > Subject: PM: Rework handling of interrupts during suspend-resume (rev. 2) > > > > Introduce two helper functions allowing us to disable device > > interrupts (at the IO-APIC level) during suspend or > > hibernation and enable them during the subsequent resume, > > respectively, so that the timer interrupts are enabled while > > "late" suspend callbacks and "early" resume callbacks > > provided by device drivers are being executed. > > > > Use these functions to rework the handling of interrupts > > during suspend (hibernation) and resume. Namely, interrupts > > will only be disabled on the CPU right before suspending > > sysdevs, while device interrupts will be disabled (at the > > IO-APIC level), with the help of the new helper function, > > before calling "late" suspend callbacks provided by device > > drivers and analogously during resume. > > I don't have an issue with the code, but I do have an issue > with this description of it. 
> > Calling disable especially for ioapics does nothing directly. > It simply arranges for the irq to be marked pending and for > the irq to be masked if the irq happens. > > So what you are doing is arranging so that no interrupts will > be delivered to drivers. Not really disabling interrupts at > the IO-APIC level. > > In addition not all interrupts (even on x86) go through an > IO-APIC anymore so describing the patch in terms of an IO-APIC > makes it a bit hard to understand what your intent actually > is. I think this aspect has been well-understood during the discussion of this topic and it's just a slightly misleading changelog. The new suspend code does not rely on truly disabling IRQs on the low level. The purpose is to not get IRQs to drivers - which might crash/hang/race/misbehave. Still, it might make sense to not just use the ->disable sequence but primarily the ->shutdown irqchip method (when it's available in the irqchip). While we obviously cannot turn off the PIC that delivers timer IRQs at this stage - there's no theoretical reason why the suspend sequence couldn't power down some secondary PICs as well - in some arch code, or maybe even in the generic driver suspend sequence if the device tree is structured carefully enough so that the PIC gets turned off last. So turning off all device IRQs in the most low-level way possible would be prudent. I.e. the suspend stage should do: if (desc->chip->shutdown) desc->chip->shutdown(irq); else desc->chip->disable(irq); (there's no change needed for the resume stage) Ingo ^ permalink raw reply [flat|nested] 373+ messages in thread
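The fallback Ingo proposes can be tried out in isolation with a small userspace sketch. Everything here is hypothetical scaffolding: struct model_chip only mimics the two relevant irq chip callbacks, and the counters exist purely to make the chosen path observable. The assumption, taken from the message above, is that ->shutdown is optional while ->disable is always provided.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the two irq chip callbacks discussed above. */
struct model_chip {
	void (*shutdown)(unsigned int irq);  /* optional, may be NULL */
	void (*disable)(unsigned int irq);   /* assumed to always be set */
};

/* Counters that make the chosen path visible to the caller. */
unsigned int shutdown_calls;
unsigned int disable_calls;

void count_shutdown(unsigned int irq)
{
	(void)irq;
	shutdown_calls++;
}

void count_disable(unsigned int irq)
{
	(void)irq;
	disable_calls++;
}

/* Prefer the stronger ->shutdown method, fall back to ->disable. */
void model_quiesce_irq(const struct model_chip *chip, unsigned int irq)
{
	assert(chip != NULL);
	if (chip->shutdown) {
		chip->shutdown(irq);
	} else {
		assert(chip->disable != NULL);
		chip->disable(irq);
	}
}
```

Whether the stronger method is actually wanted is exactly what the rest of the thread debates; the sketch only shows the dispatch itself.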
* Re: [RFC][PATCH 2/2] PM: Rework handling of interrupts during suspend-resume 2009-02-23 8:44 ` Ingo Molnar @ 2009-02-23 9:22 ` Eric W. Biederman 2009-02-23 9:22 ` Eric W. Biederman 1 sibling, 0 replies; 373+ messages in thread From: Eric W. Biederman @ 2009-02-23 9:22 UTC (permalink / raw) To: Ingo Molnar Cc: Jeremy Fitzhardinge, LKML, Jesse Barnes, pm list, Linus Torvalds, Thomas Gleixner Ingo Molnar <mingo@elte.hu> writes: > I think this aspect has been well-understood during the > discussion of this topic and it's just a slightly misleading > changelog. As I was a member of that discussion I did not see that. It took me several passes through the patches to realize the goal is to allow drivers to be able to sleep while they are in their late pm shutdown routines. Why we want this I don't know. But it seems simple enough to implement, and it makes it harder to get the late pm suspend routines wrong, which is always good. > The new suspend code does not rely on truly disabling IRQs on > the low level. The purpose is to not get IRQs to drivers - which > might crash/hang/race/misbehave. Reasonable. I expect one of the problems with drivers getting it wrong is that the interface is too complex for mortal humans to understand. > Still, it might make sense to not just use the ->disable > sequence but primarily the ->shutdown irqchip method (when it's > available in the irqchip). Disable seems fine to me. This is interesting in the context of all of the irqs that will, when masked, show up somewhere else (think boot interrupts). > While we obviously cannot turn off the PIC that delivers timer > IRQs at this stage - there's no theoretical reason why the > suspend sequence couldn't power down some secondary PICs as well > - in some arch code, or maybe even in the generic driver suspend > sequence if the device tree is structured carefully enough so > that the PIC gets turned off last.
If the point is simply to prevent delivery of irqs to the drivers I don't see the point of anything more than what the patch does now. Eric ^ permalink raw reply [flat|nested] 373+ messages in thread
* Re: [RFC][PATCH 2/2] PM: Rework handling of interrupts during suspend-resume 2009-02-23 9:22 ` Eric W. Biederman @ 2009-02-23 9:44 ` Ingo Molnar 2009-02-23 10:42 ` Eric W. Biederman 2009-02-23 10:42 ` Eric W. Biederman 2009-02-23 9:44 ` Ingo Molnar ` (2 subsequent siblings) 3 siblings, 2 replies; 373+ messages in thread From: Ingo Molnar @ 2009-02-23 9:44 UTC (permalink / raw) To: Eric W. Biederman Cc: Rafael J. Wysocki, Linus Torvalds, LKML, Benjamin Herrenschmidt, Jeremy Fitzhardinge, pm list, Len Brown, Jesse Barnes, Thomas Gleixner * Eric W. Biederman <ebiederm@xmission.com> wrote: > Ingo Molnar <mingo@elte.hu> writes: > > > I think this aspect has been well-understood during the > > discussion of this topic and it's just a slightly misleading > > changelog. > > As I was a member of that discussion I did not see that. > > It took me several passes through the patches to realize the > goal is to allow drivers to be able to sleep while they are in > their late pm shutdown routines. > > Why we want this I don't know. But it seems simple enough to > implement, and it makes it harder to get the late pm suspend > routines wrong, which is always good. That's not the only goal. The other goal is to further shrink a particular window of suspend fragility: the irqs-disabled stage of the suspend/resume sequence. Since suspend/resume is a mini-reboot sequence, there's a large amount of code executed - and the variety of code is large as well. We had repeat cases of random drivers re-enabling interrupts and thus breaking other drivers - and these are nasty to debug. So this patchset disables device IRQs centrally and serializes with pending work - so there's no races with pending IRQs anymore. The fact that we keep the timer irq running is two-fold: firstly the timer code is special and not really part of the regular suspend/resume sequence. Drivers want to take timestamps, sometimes they even want to do a small usleep(), etc. 
Ideally the suspend/resume code is pretty much _the same_ as a regular bootup (and shutdown) code - so we want to provide a similar environment to how drivers initialize and deinitialize, and we want to enable them to share code between bootup/shutdown and suspend/resume aggressively. So the more generic kernel environment we give these fragile handlers, the better we are off in the end. Since we already had IRQF_TIMER, that was just the natural thing to do. > > The new suspend code does not rely on truly disabling IRQs > > on the low level. The purpose is to not get IRQs to drivers > > - which might crash/hang/race/misbehave. > > Reasonable. I expect one of the problems with drivers getting > it wrong is that the interface is too complex for mortal > humans to understand. The suspend/resume state machine certainly used to be a piece of code that makes a seasoned kernel developer weep in fear. That has changed drastically in the past few months. The suspend+hibernation logic got unified (at least as far as driver methods go), and all the flow and ordering has been cleaned up and has been made more robust. What makes s2ram fragile is not human failure but the combination of a handful of physical properties: 1) Psychology: shutting the lid or pushing the suspend button is a deceivingly 'simple' action to the user. But under the hood, a ton of stuff happens: we deinitialize a lot of things, we go through _all hardware state_, and we do so in a serial fashion. If just one piece fails to do the right thing, the box might not resume. Still, the user expects this 'simple' thing to just work, all the time. No excuses accepted. 2) Length of code: To get a successful s2ram sequence the kernel runs through tens of thousands of lines of code. Code which never gets executed on a normal box - only if we s2ram. If just one step fails, we get a hung box. 3) Debuggability: a lot of s2ram code runs with the console off, making any bugs hard to debug.
Furthermore we have no meaningful persistent storage either for kernel bug messages. The RTC trick of PM_DEBUG works but is a very narrow channel of information and it takes a lot of time to debug a bug via that method. The combination of these factors really makes for a perfect storm in terms of kernel technology: we have this very-deceivingly-simple-looking but complex-and-rarely-executed piece of code, which is very hard to debug. Even just one of these factors would be enough to make an otherwise healthy subsystem fragile - no wonder s2ram has been a problem ever since it existed in the upstream kernel. So now we need just one thing: patience and more of the same good stuff that happened lately. > > Still, it might make sense to not just use the ->disable > > sequence but primarily the ->shutdown irqchip method (when > > it's available in the irqchip). > > Disable seems fine to me. This is interesting in the context > of all of the irqs that will when masked show up somewhere > else (think boot interrupts). > > > While we obviously cannot turn off the PIC that delivers > > timer IRQs at this stage - there's no theoretical reason why > > the suspend sequence couldn't power down some secondary PICs > > as well - in some arch code, or maybe even in the generic > > driver suspend sequence if the device tree is structured > > carefully enough so that the PIC gets turned off last. > > If the point is simply to prevent delivery of irqs to the > drivers I don't see the point of anything more than what the > patch does now. ... except for the usecase I described above. Say some PIC sits on a piece of silicon which gets turned off. I'm not talking about x86 but some custom device. We really don't want that IRQ line to send half of an IRQ message (un-ACK-ed) when it gets turned off. So physically 'suspending' all IRQ lines does make a certain level of long-term sense. Especially if it's just 3 extra lines of code to the existing patch.
There _might_ be one downside: overhead of ->shutdown() methods. With a typical IRQ count on the typical netbook i doubt it's more than ~50 usecs combined. Ingo ^ permalink raw reply [flat|nested] 373+ messages in thread
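[Editorial sketch] The policy discussed in this message (stop delivering device interrupts to drivers during the late suspend stage, but keep lines marked as timer interrupts running) can be illustrated in user-space C. This is a minimal sketch under stated assumptions, not the patch itself: the `sim_` names, the flag values and the plain array of descriptors are all hypothetical and exist only for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical flag values for this sketch; not the kernel's definitions. */
#define SIM_IRQF_TIMER    0x1u /* line must keep running during suspend */
#define SIM_IRQ_DISABLED  0x2u /* line is masked from the driver's view  */
#define SIM_IRQ_SUSPENDED 0x4u /* masked by the suspend pass itself      */

struct sim_irq {
    unsigned int flags;
};

/*
 * Disable every line except timer lines and lines a driver has already
 * disabled on its own. Returns how many lines this pass disabled.
 */
size_t sim_suspend_device_irqs(struct sim_irq *irqs, size_t n)
{
    size_t count = 0;

    assert(irqs != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (irqs[i].flags & (SIM_IRQF_TIMER | SIM_IRQ_DISABLED))
            continue;
        irqs[i].flags |= SIM_IRQ_DISABLED | SIM_IRQ_SUSPENDED;
        count++;
    }
    return count;
}

/*
 * Re-enable only the lines that the suspend pass disabled, so a line the
 * driver had disabled beforehand stays disabled after resume.
 */
size_t sim_resume_device_irqs(struct sim_irq *irqs, size_t n)
{
    size_t count = 0;

    assert(irqs != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (!(irqs[i].flags & SIM_IRQ_SUSPENDED))
            continue;
        irqs[i].flags &= ~(SIM_IRQ_DISABLED | SIM_IRQ_SUSPENDED);
        count++;
    }
    return count;
}
```

The separate "suspended" marker is a design choice of this sketch: it lets the resume pass undo exactly what the suspend pass did without touching driver-owned state.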
* Re: [RFC][PATCH 2/2] PM: Rework handling of interrupts during suspend-resume 2009-02-23 9:44 ` Ingo Molnar @ 2009-02-23 10:42 ` Eric W. Biederman 2009-02-23 11:03 ` Rafael J. Wysocki ` (3 more replies) 2009-02-23 10:42 ` Eric W. Biederman 1 sibling, 4 replies; 373+ messages in thread From: Eric W. Biederman @ 2009-02-23 10:42 UTC (permalink / raw) To: Ingo Molnar Cc: Rafael J. Wysocki, Linus Torvalds, LKML, Benjamin Herrenschmidt, Jeremy Fitzhardinge, pm list, Len Brown, Jesse Barnes, Thomas Gleixner Ingo Molnar <mingo@elte.hu> writes: > * Eric W. Biederman <ebiederm@xmission.com> wrote: > >> Ingo Molnar <mingo@elte.hu> writes: >> >> > I think this aspect has been well-understood during the >> > discussion of this topic and it's just a slightly misleading >> > changelog. >> >> As I was a member of that discussion I did not see that. >> >> It took me several passes through the patches to realize the >> goal is to allow drivers to be able to sleep while they are in >> their late pm shutdown routines. >> >> Why we want this I don't know. But it seems simple enough to >> implement, and it makes it harder to get the late pm suspend >> routines wrong, which is always good. > > That's not the only goal. The other goal is to further shrink a > particular window of suspend fragility: the irqs-disabled stage > of the suspend/resume sequence. > > Since suspend/resume is a mini-reboot sequence, there's a large > amount of code executed - and the variety of code is large as > well. We had repeat cases of random drivers re-enabling > interrupts and thus breaking other drivers - and these are nasty > to debug. > > So this patchset disables device IRQs centrally and serializes > with pending work - so there's no races with pending IRQs > anymore. > > The fact that we keep the timer irq running is two-fold: firstly > the timer code is special and not really part of the regular > suspend/resume sequence. 
> > Drivers want to take timestamps, sometimes they even want to do > a small usleep(), etc. Ideally the suspend/resume code is pretty > much _the same_ as a regular bootup (and shutdown) code - so we > want to provide a similar environment to how drivers initialize > and deinitialize, and we want to enable them to share code > between bootup/shutdown and suspend/resume aggressively. > > So the more generic kernel environment we give these fragile > handlers, the better off we are in the end. Since we already had > IRQF_TIMER, that was just the natural thing to do. I am all for sharing code, especially if we can find common factors that do the same thing. I don't know how many times I have found drivers doing something weird in their shutdown routines that they don't know how to get the device out of. The e1000 driver has shown up several times because it likes to suspend the device on shutdown. The fact that the methods exposed to drivers were only defined to be usable on the s2ram/hibernate path is something I have brought up on more than one occasion as a bad choice. I'm really not convinced that the rationale for separating out the shutdown methods from the remove methods has been very good: namely, that we don't need to clean up the in-kernel data structures on reboot, so why do something extra that can introduce instability? So, having watched a smaller form of this drama on the reboot path for several years, with a device method that had fixed semantics and not the DWIM semantics of the historical suspend routine, I expect there is still a ways to go before it is simple and easy for drivers to figure out what they need to implement out of the confusing variety of possible device methods. >> > The new suspend code does not rely on truly disabling IRQs >> > on the low level. The purpose is to not get IRQs to drivers >> > - which might crash/hang/race/misbehave. >> >> Reasonable.
I expect one of the problems with drivers getting >> it wrong is that the interface is too complex for mortal >> humans to understand. > > The suspend/resume state machine certainly used to be a piece of > code that makes a seasoned kernel developer weep in fear. > > That has changed drastically in the past few months. The > suspend+hibernation logic got unified (at least as far as driver > methods go), and all the flow and ordering has been cleaned up > and has been made more robust. I will have to look again. My impression is that overloading a single method is part of what got us into this mess in the first place. Not that I don't see things getting better. > What makes s2ram fragile is not human failure but the > combination of a handful of physical properties: > > 1) Psychology: shutting the lid or pushing the suspend button is > a deceptively 'simple' action to the user. But under the > hood, a ton of stuff happens: we deinitialize a lot of > things, we go through _all hardware state_, and we do so in a > serial fashion. If just one piece fails to do the right > thing, the box might not resume. Still, the user expects this > 'simple' thing to just work, all the time. No excuses > accepted. > > 2) Length of code: To get a successful s2ram sequence the kernel > runs through tens of thousands of lines of code. Code which > never gets executed on a normal box - only if we s2ram. If > just one step fails, we get a hung box. > > 3) Debuggability: a lot of s2ram code runs with the console off, > making any bugs hard to debug. Furthermore we have no > meaningful persistent storage either for kernel bug messages. > The RTC trick of PM_DEBUG works but is a very narrow channel > of information and it takes a lot of time to debug a bug via > that method. Yep that is an issue.
> The combination of these factors really makes for a perfect > storm in terms of kernel technology: we have this > very-deceptively-simple-looking but complex-and-rarely-executed > piece of code, which is very hard to debug. And much of this, as you are finding with this piece of code, is how the software was designed rather than how the software needed to be. > Even just one of these factors would be enough to make an > otherwise healthy subsystem fragile - no wonder s2ram has been a > problem ever since it existed in the upstream kernel. > > So now we need just one thing: patience and more of the same > good stuff that happened lately. I think there has been some good progress, and so I am happy to be patient. I will still mention on occasion what it seems we are doing wrong. Unfortunately I don't have time to do a lot more than that. >> > Still, it might make sense to not just use the ->disable >> > sequence but primarily the ->shutdown irqchip method (when >> > it's available in the irqchip). >> >> Disable seems fine to me. This is interesting in the context >> of all of the irqs that will when masked show up somewhere >> else (think boot interrupts). >> >> > While we obviously cannot turn off the PIC that delivers >> > timer IRQs at this stage - there's no theoretical reason why >> > the suspend sequence couldnt power down some secondary PICs >> > as well - in some arch code, or maybe even in the generic >> > driver suspend sequence if the device tree is structured >> > carefully enough so that the PIC gets turned off last. >> >> If the point is simply to prevent delivery of irqs to the >> drivers I don't see the point of anything more than what the >> patch does now. > > ... except for the usecase i described above. Say some PIC sits > on a piece of silicon which gets turned off. I'm not talking > about x86 but some custom device. We really dont want that IRQ > line to send half of an IRQ message (un-ACK-ed) when it gets > turned off.
So physically 'suspending' all IRQ lines does make a > certain level of long-term sense. Good point. We will lose both level and edge triggered events that occur between suspending the irqs and restoring them but that is inevitable. So we might as well call shutdown and totally turn off the irqs if we can. I don't know where in the state machine this is getting called but I would suggest doing this before we shut down cpus. We are quickly reaching the point where laptops will exceed the 8-core limit of lowest priority delivery mode. And only in lowest priority delivery mode is it possible to migrate irqs outside of the interrupt handlers. That, plus if we suspend the irqs before shutting down the cpus, it means we can safely support more vectors than a single cpu can catch. I was a little worried about the shutdown code path because it requires in the worst case acking a level triggered irq when we have it disabled, but looking at ack_apic_level that appears to be a well tested code path. We just can't reprogram the vector. > There _might_ be one downside: overhead of ->shutdown() methods. > With a typical IRQ count on the typical netbook i doubt it's > more than ~50 usecs combined. Eric ^ permalink raw reply [flat|nested] 373+ messages in thread
* Re: [RFC][PATCH 2/2] PM: Rework handling of interrupts during suspend-resume 2009-02-23 10:42 ` Eric W. Biederman @ 2009-02-23 11:03 ` Rafael J. Wysocki 2009-02-23 15:28 ` Eric W. Biederman 2009-02-23 15:28 ` Eric W. Biederman 2009-02-23 11:03 ` Rafael J. Wysocki ` (2 subsequent siblings) 3 siblings, 2 replies; 373+ messages in thread From: Rafael J. Wysocki @ 2009-02-23 11:03 UTC (permalink / raw) To: Eric W. Biederman Cc: Ingo Molnar, Linus Torvalds, LKML, Benjamin Herrenschmidt, Jeremy Fitzhardinge, pm list, Len Brown, Jesse Barnes, Thomas Gleixner On Monday 23 February 2009, Eric W. Biederman wrote: > Ingo Molnar <mingo@elte.hu> writes: > > > * Eric W. Biederman <ebiederm@xmission.com> wrote: > > > >> Ingo Molnar <mingo@elte.hu> writes: > >> > >> > I think this aspect has been well-understood during the > >> > discussion of this topic and it's just a slightly misleading > >> > changelog. > >> > >> As I was a member of that discussion I did not see that. > >> > >> It took me several passes through the patches to realize the > >> goal is to allow drivers to be able to sleep while they are in > >> their late pm shutdown routines. > >> > >> Why we want this I don't know. But it seems simple enough to > >> implement, and it makes it harder to get the late pm suspend > >> routines wrong, which is always good. > > > > That's not the only goal. The other goal is to further shrink a > > particular window of suspend fragility: the irqs-disabled stage > > of the suspend/resume sequence. > > > > Since suspend/resume is a mini-reboot sequence, there's a large > > amount of code executed - and the variety of code is large as > > well. We had repeat cases of random drivers re-enabling > > interrupts and thus breaking other drivers - and these are nasty > > to debug. > > > > So this patchset disables device IRQs centrally and serializes > > with pending work - so there's no races with pending IRQs > > anymore. 
> > > > The fact that we keep the timer irq running is two-fold: firstly > > the timer code is special and not really part of the regular > > suspend/resume sequence. > > > > Drivers want to take timestamps, sometimes they even want to do > > a small usleep(), etc. Ideally the suspend/resume code is pretty > > much _the same_ as a regular bootup (and shutdown) code - so we > > want to provide a similar environment to how drivers initialize > > and deinitialize, and we want to enable them to share code > > between bootup/shutdown and suspend/resume aggressively. > > > > So the more generic kernel environment we give these fragile > > handlers, the better off we are in the end. Since we already had > > IRQF_TIMER, that was just the natural thing to do. > > I am all for sharing code, especially if we can find common > factors that do the same thing. > > I don't know how many times I have found drivers doing something > weird in their shutdown routines that they don't know how > to get the device out of. The e1000 driver has shown up several > times because it likes to suspend the device on shutdown. > > The fact that the methods exposed to drivers were only defined > to be usable on the s2ram/hibernate path is something I have > brought up on more than one occasion as a bad choice. > > I'm really not convinced that the rationale for separating > out the shutdown methods from the remove methods has > been very good: namely, that we don't need to clean up the in-kernel > data structures on reboot, so why do something extra that can > introduce instability? > > So, having watched a smaller form of this drama on the > reboot path for several years, with a device method > that had fixed semantics and not the DWIM semantics of the historical > suspend routine, I expect there is still a ways to go before > it is simple and easy for drivers to figure out what they need > to implement out of the confusing variety of possible device > methods.
> > >> > The new suspend code does not rely on truly disabling IRQs > >> > on the low level. The purpose is to not get IRQs to drivers > >> > - which might crash/hang/race/misbehave. > >> > >> Reasonable. I expect one of the problems with drivers getting > >> it wrong is that the interface is too complex for mortal > >> humans to understand. > > > > The suspend/resume state machine certainly used to be a piece of > > code that makes a seasoned kernel developer weep in fear. > > > > That has changed drastically in the past few months. The > > suspend+hibernation logic got unified (at least as far as driver > > methods go), and all the flow and ordering has been cleaned up > > and has been made more robust. > > I will have to look again. My impression is that overloading > a single method is part of what got us into this mess in the > first place. > > Not that I don't see things getting better. > > > What makes s2ram fragile is not human failure but the > > combination of a handful of physical properties: > > > > 1) Psychology: shutting the lid or pushing the suspend button is > > a deceptively 'simple' action to the user. But under the > > hood, a ton of stuff happens: we deinitialize a lot of > > things, we go through _all hardware state_, and we do so in a > > serial fashion. If just one piece fails to do the right > > thing, the box might not resume. Still, the user expects this > > 'simple' thing to just work, all the time. No excuses > > accepted. > > > > 2) Length of code: To get a successful s2ram sequence the kernel > > runs through tens of thousands of lines of code. Code which > > never gets executed on a normal box - only if we s2ram. If > > just one step fails, we get a hung box. > > > > 3) Debuggability: a lot of s2ram code runs with the console off, > > making any bugs hard to debug. Furthermore we have no > > meaningful persistent storage either for kernel bug messages.
> > The RTC trick of PM_DEBUG works but is a very narrow channel > > of information and it takes a lot of time to debug a bug via > > that method. > > Yep that is an issue. > > > The combination of these factors really makes for a perfect > > storm in terms of kernel technology: we have this > > very-deceptively-simple-looking but complex-and-rarely-executed > > piece of code, which is very hard to debug. > > And much of this, as you are finding with this piece of code, > is how the software was designed rather than how the software > needed to be. > > > Even just one of these factors would be enough to make an > > otherwise healthy subsystem fragile - no wonder s2ram has been a > > problem ever since it existed in the upstream kernel. > > > > So now we need just one thing: patience and more of the same > > good stuff that happened lately. > > I think there has been some good progress, and so I am happy > to be patient. I will still mention on occasion what it > seems we are doing wrong. Unfortunately I don't have time > to do a lot more than that. > > >> > Still, it might make sense to not just use the ->disable > >> > sequence but primarily the ->shutdown irqchip method (when > >> > it's available in the irqchip). > >> > >> Disable seems fine to me. This is interesting in the context > >> of all of the irqs that will when masked show up somewhere > >> else (think boot interrupts). > >> > >> > While we obviously cannot turn off the PIC that delivers > >> > timer IRQs at this stage - there's no theoretical reason why > >> > the suspend sequence couldnt power down some secondary PICs > >> > as well - in some arch code, or maybe even in the generic > >> > driver suspend sequence if the device tree is structured > >> > carefully enough so that the PIC gets turned off last. > >> > >> If the point is simply to prevent delivery of irqs to the > >> drivers I don't see the point of anything more than what the > >> patch does now. > > > > ...
except for the usecase i described above. Say some PIC sits > > on a piece of silicon which gets turned off. I'm not talking > > about x86 but some custom device. We really dont want that IRQ > > line to send half of an IRQ message (un-ACK-ed) when it gets > > turned off. So physically 'suspending' all IRQ lines does make a > > certain level of long-term sense. > > Good point. We will lose both level and edge triggered events > that occur between suspending the irqs and restoring them but > that is inevitable. So we might as well call shutdown and totally > turn off the irqs if we can. > > I don't know where in the state machine this is getting called but > I would suggest doing this before we shut down cpus. This is the plan. In fact, I'm going to do this in the next patch after the $subject one has been tested and found acceptable. Thanks, Rafael ^ permalink raw reply [flat|nested] 373+ messages in thread
* Re: [RFC][PATCH 2/2] PM: Rework handling of interrupts during suspend-resume 2009-02-23 11:03 ` Rafael J. Wysocki @ 2009-02-23 15:28 ` Eric W. Biederman 2009-02-23 15:28 ` Eric W. Biederman 1 sibling, 0 replies; 373+ messages in thread From: Eric W. Biederman @ 2009-02-23 15:28 UTC (permalink / raw) To: Rafael J. Wysocki Cc: Jeremy Fitzhardinge, LKML, Jesse Barnes, Thomas Gleixner, Ingo Molnar, Linus Torvalds, pm list "Rafael J. Wysocki" <rjw@sisk.pl> writes: >> I don't know where in the state machine this is getting called but >> I would suggest doing this before we shut down cpus. > > This is the plan. In fact, I'm going to do this in the next patch after the > $subject one has been tested and found acceptable. Good to hear. Then let's please get a version of the irq disable that calls shutdown, so we can be certain we don't have hardware irqs in flight. For the drivers it should not matter; for clean cpu shutdown it will. Eric ^ permalink raw reply [flat|nested] 373+ messages in thread
* Re: [RFC][PATCH 2/2] PM: Rework handling of interrupts during suspend-resume 2009-02-23 15:28 ` Eric W. Biederman @ 2009-02-23 21:39 ` Rafael J. Wysocki 2009-02-23 21:39 ` Rafael J. Wysocki 1 sibling, 0 replies; 373+ messages in thread From: Rafael J. Wysocki @ 2009-02-23 21:39 UTC (permalink / raw) To: Eric W. Biederman Cc: Jeremy Fitzhardinge, LKML, Jesse Barnes, Thomas Gleixner, Ingo Molnar, Linus Torvalds, pm list On Monday 23 February 2009, Eric W. Biederman wrote: > "Rafael J. Wysocki" <rjw@sisk.pl> writes: > > >> I don't know where in the state machine this is getting called but > >> I would suggest doing this before we shut down cpus. > > > > This is the plan. In fact, I'm going to do this in the next patch after the > > $subject one has been tested and found acceptable. > > Good to hear. Then let's please get a version of the irq disable that calls > shutdown, so we can be certain we don't have hardware irqs in flight. > > For the drivers it should not matter; for clean cpu shutdown it will. OK, I will. Thanks, Rafael ^ permalink raw reply [flat|nested] 373+ messages in thread
* Re: [RFC][PATCH 2/2] PM: Rework handling of interrupts during suspend-resume 2009-02-23 21:39 ` Rafael J. Wysocki @ 2009-02-24 3:30 ` Eric W. Biederman 2009-02-24 22:42 ` Rafael J. Wysocki 2009-02-24 22:42 ` Rafael J. Wysocki 2009-02-24 3:30 ` Eric W. Biederman 1 sibling, 2 replies; 373+ messages in thread From: Eric W. Biederman @ 2009-02-24 3:30 UTC (permalink / raw) To: Rafael J. Wysocki Cc: Ingo Molnar, Linus Torvalds, LKML, Benjamin Herrenschmidt, Jeremy Fitzhardinge, pm list, Len Brown, Jesse Barnes, Thomas Gleixner "Rafael J. Wysocki" <rjw@sisk.pl> writes: > On Monday 23 February 2009, Eric W. Biederman wrote: >> "Rafael J. Wysocki" <rjw@sisk.pl> writes: >> >> >> I don't know where in the state machine this is getting called but >> >> I would suggest doing this before we shutdown cpus. >> > >> > This is the plan. In fact, I'm going to do this in the next patch after the >> > $subject one has been tested and found acceptable. >> >> Good to hear. Then let's please get a version of the irq disable that calls >> shutdown, so we can be certain we don't have hardware irqs in flight. >> >> For the drivers it should not matter for clean cpu shutdown it will. > > OK, I will. My apologies I was wrong. Calling shutdown is not safe. I just remembered that masking an ioapic from anywhere besides the irq handler can lock the ioapic state machine, and lead to non-recoverable interrupts. It is rare but I have seen it happen. I wanted to figure out how to migrate interrupts outside of interrupt context and this was what prevented me. A suspend/resume cycle might be enough of a reset to get the ioapic out of that state but I don't know. The only safe way on x86 to shutdown a level triggered ioapic irq outside of irq context is for the driver to program the hardware to not generate an irq. Therefore doing anything with the irqs at the point where we are suspending them is a formality, and perhaps simply code that ensures in-flight irqs don't make it past a certain point. 
I believe we just need to call disable() and print a big nasty warning if any irq comes in after the suspend stage. Eric ^ permalink raw reply [flat|nested] 373+ messages in thread
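[Editorial sketch] The suggestion above (disable the line, then complain loudly if an interrupt still shows up after the suspend stage) can be sketched in user-space C as follows. This is an illustration under stated assumptions only: `struct sim_line` and `sim_deliver_irq()` are hypothetical names invented here, and the warning is a plain message on stderr rather than any kernel facility.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical per-line bookkeeping for this sketch. */
struct sim_line {
    bool suspended;       /* set once the suspend stage has disabled the line */
    unsigned int handled; /* deliveries passed on to the driver */
    unsigned int stray;   /* deliveries that arrived while suspended */
};

/*
 * Decide whether a delivery may reach the driver's handler.
 * Returns true if the handler should run; otherwise counts the
 * event as stray and prints a warning.
 */
bool sim_deliver_irq(struct sim_line *line, unsigned int irq)
{
    assert(line != NULL);
    if (line->suspended) {
        line->stray++;
        fprintf(stderr, "WARNING: irq %u arrived after the suspend stage\n",
                irq);
        return false;
    }
    line->handled++;
    return true;
}
```

Counting stray deliveries instead of silently dropping them is the point of the sketch: a driver that failed to quiesce its hardware becomes visible without the event ever reaching its handler.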
* Re: [RFC][PATCH 2/2] PM: Rework handling of interrupts during suspend-resume 2009-02-24 3:30 ` Eric W. Biederman @ 2009-02-24 22:42 ` Rafael J. Wysocki 2009-02-24 22:51 ` Linus Torvalds ` (3 more replies) 2009-02-24 22:42 ` Rafael J. Wysocki 1 sibling, 4 replies; 373+ messages in thread From: Rafael J. Wysocki @ 2009-02-24 22:42 UTC (permalink / raw) To: Eric W. Biederman Cc: Ingo Molnar, Linus Torvalds, LKML, Benjamin Herrenschmidt, Jeremy Fitzhardinge, pm list, Len Brown, Jesse Barnes, Thomas Gleixner On Tuesday 24 February 2009, Eric W. Biederman wrote: > "Rafael J. Wysocki" <rjw@sisk.pl> writes: > > > On Monday 23 February 2009, Eric W. Biederman wrote: > >> "Rafael J. Wysocki" <rjw@sisk.pl> writes: > >> > >> >> I don't know where in the state machine this is getting called but > >> >> I would suggest doing this before we shutdown cpus. > >> > > >> > This is the plan. In fact, I'm going to do this in the next patch after the > >> > $subject one has been tested and found acceptable. > >> > >> Good to hear. Then let's please get a version of the irq disable that calls > >> shutdown, so we can be certain we don't have hardware irqs in flight. > >> > >> For the drivers it should not matter for clean cpu shutdown it will. > > > > OK, I will. > > My apologies I was wrong. Calling shutdown is not safe. > > I just remembered that masking an ioapic from anywhere besides the > irq handler can lock the ioapic state machine, and lead to non-recoverable > interrupts. It is rare but I have seen it happen. I wanted to figure out > how to migrate interrupts outside of interrupt context and this was what > prevented me. A suspend/resume cycle might be enough of a reset to > get the ioapic out of that state but I don't know. > > The only safe way on x86 to shutdown a level triggered ioapic irq > outside of irq context is for the driver to program the hardware to > not generate an irq. 
Well, that changes things quite a bit, because it means we can't change the suspend-resume sequence in a way we thought we could without fixing all drivers first, but this is exactly what we'd like to avoid by changing the core. I think the most important source of level triggered interrupts is PCI devices, so perhaps we can make the PCI PM core use bit 10 (Interrupt Disable) of the PCI Command register to prevent devices from generating INTx after the drivers' suspend routines have been executed? > Therefore doing anything with the irqs at the point where we are > suspending them is a formality, and perhaps simply code that ensures > in-flight irqs don't make it past a certain point. > > I believe we just need to call disable() and print a big nasty warning > if any irq comes in after the suspend stage. At the moment we're safe, since PCI devices are put into low power states in the suspend stage. However, we'd like to make that happen in the "late suspend" stage to avoid a problem with a shared interrupt occurring after one of the devices using it has been suspended and its driver's irq handler can't cope with that. Thanks, Rafael ^ permalink raw reply [flat|nested] 373+ messages in thread
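[Editorial sketch] The "bit 10" idea in this message refers to the Interrupt Disable bit, which the PCI 2.3 specification places in the 16-bit Command register. The read-modify-write it implies can be sketched on a plain integer, without touching real configuration space. The `sim_` names are hypothetical; the 0x400 mask matches the value of `PCI_COMMAND_INTX_DISABLE` in the kernel's register definitions, but nothing here is kernel code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Bit 10 of the PCI Command register: Interrupt Disable (PCI 2.3). */
#define SIM_PCI_COMMAND_INTX_DISABLE 0x400

static_assert(SIM_PCI_COMMAND_INTX_DISABLE == (1 << 10),
              "Interrupt Disable is bit 10");

/*
 * Return the Command register value with INTx assertion enabled or
 * disabled, leaving every other bit untouched.
 */
uint16_t sim_pci_intx(uint16_t command, bool enable)
{
    if (enable)
        return (uint16_t)(command & ~SIM_PCI_COMMAND_INTX_DISABLE);
    return (uint16_t)(command | SIM_PCI_COMMAND_INTX_DISABLE);
}

/* True if the device is currently prevented from asserting INTx. */
bool sim_pci_intx_disabled(uint16_t command)
{
    return (command & SIM_PCI_COMMAND_INTX_DISABLE) != 0;
}
```

A real implementation would read the register from configuration space, apply the same mask, and write it back only if the value changed; devices that predate PCI 2.3 may not implement the bit at all, which is one reason to treat this as a sketch of the idea rather than a drop-in fix.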
* Re: [RFC][PATCH 2/2] PM: Rework handling of interrupts during suspend-resume 2009-02-24 22:42 ` Rafael J. Wysocki @ 2009-02-24 22:51 ` Linus Torvalds 2009-02-24 22:51 ` Linus Torvalds ` (2 subsequent siblings) 3 siblings, 0 replies; 373+ messages in thread From: Linus Torvalds @ 2009-02-24 22:51 UTC (permalink / raw) To: Rafael J. Wysocki Cc: Jeremy Fitzhardinge, LKML, Jesse Barnes, Thomas Gleixner, Eric W. Biederman, Ingo Molnar, pm list On Tue, 24 Feb 2009, Rafael J. Wysocki wrote: > > > The only safe way on x86 to shutdown a level triggered ioapic irq > > outside of irq context is for the driver to program the hardware to > > not generate an irq. > > Well, that changes things quite a bit, because it means we can't change the > suspend-resume sequence in a way we thought we could without fixing all > drivers first, but this is exactly what we'd like to avoid by changing the > core. Calling "disable_irq()" is perfectly fine. What is not possible on that broken IO-APIC (among other things) is to actually turn the interrupts off at the apic (ie the whole ->shutdown() thing). But that's not what we even want to do. What we care about is just disabling the interrupt from a driver perspective. IOW, the patches I have seen are fine, and all the comments from Eric are just confusion about what we want done. WE DO NOT WANT TO TURN OFF THE IO-APIC. That may or may not happen later, but that's totally unrelated to this whole "suspend_device_irq()" thing. Linus ^ permalink raw reply [flat|nested] 373+ messages in thread
* Re: [RFC][PATCH 2/2] PM: Rework handling of interrupts during suspend-resume 2009-02-24 22:51 ` Linus Torvalds @ 2009-02-24 23:07 ` Rafael J. Wysocki 2009-02-24 23:09 ` Ingo Molnar 2009-02-24 23:09 ` Ingo Molnar 2009-02-24 23:07 ` Rafael J. Wysocki ` (2 subsequent siblings) 3 siblings, 2 replies; 373+ messages in thread From: Rafael J. Wysocki @ 2009-02-24 23:07 UTC (permalink / raw) To: Linus Torvalds Cc: Eric W. Biederman, Ingo Molnar, LKML, Benjamin Herrenschmidt, Jeremy Fitzhardinge, pm list, Len Brown, Jesse Barnes, Thomas Gleixner On Tuesday 24 February 2009, Linus Torvalds wrote: > > On Tue, 24 Feb 2009, Rafael J. Wysocki wrote: > > > > > The only safe way on x86 to shutdown a level triggered ioapic irq > > > outside of irq context is for the driver to program the hardware to > > > not generate an irq. > > > > Well, that changes things quite a bit, because it means we can't change the > > suspend-resume sequence in a way we thought we could without fixing all > > drivers first, but this is exactly what we'd like to avoid by changing the > > core. > > Calling "disable_irq()" is perfectly fine. > > What is not possible on that broken IO-APIC (among other things) is to > actually turn the interrupts off at the apic (ie the whole ->shutdown() > thing). But that's not what we even want to do. What we care about is > just disabling the interrupt from a driver perspective. > > IOW, the patches I have seen are fine, and all the comments from Eric are > just confusion about what we want done. Ah, OK. Thanks for the explanation, I got confused too. > WE DO NOT WANT TO TURN OFF THE IO-APIC. That may or may not happen later, but > that's totally unrelated to this whole "suspend_device_irq()" thing. Yeah. Rafael ^ permalink raw reply [flat|nested] 373+ messages in thread
* Re: [RFC][PATCH 2/2] PM: Rework handling of interrupts during suspend-resume
  2009-02-24 23:07 ` Rafael J. Wysocki
@ 2009-02-24 23:09   ` Ingo Molnar
  2009-02-24 23:29     ` Rafael J. Wysocki
  2009-02-24 23:29     ` Rafael J. Wysocki
  2009-02-24 23:09   ` Ingo Molnar
  1 sibling, 2 replies; 373+ messages in thread
From: Ingo Molnar @ 2009-02-24 23:09 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Linus Torvalds, Eric W. Biederman, LKML, Benjamin Herrenschmidt,
      Jeremy Fitzhardinge, pm list, Len Brown, Jesse Barnes,
      Thomas Gleixner

* Rafael J. Wysocki <rjw@sisk.pl> wrote:

> On Tuesday 24 February 2009, Linus Torvalds wrote:
> >
> > On Tue, 24 Feb 2009, Rafael J. Wysocki wrote:
> > >
> > > > The only safe way on x86 to shutdown a level triggered ioapic irq
> > > > outside of irq context is for the driver to program the hardware to
> > > > not generate an irq.
> > >
> > > Well, that changes things quite a bit, because it means we can't change the
> > > suspend-resume sequence in a way we thought we could without fixing all
> > > drivers first, but this is exactly what we'd like to avoid by changing the
> > > core.
> >
> > Calling "disable_irq()" is perfectly fine.
> >
> > What is not possible on that broken IO-APIC (among other
> > things) is to actually turn the interrupts off at the apic
> > (ie the whole ->shutdown() thing). But that's not what we
> > even want to do. What we care about is just disabling the
> > interrupt from a driver perspective.
> >
> > IOW, the patches I have seen are fine, and all the comments
> > from Eric are just confusion about what we want done.
>
> Ah, OK.  Thanks for the explanation, I got confused too.
>
> > WE DO NOT WANT TO TURN OFF THE IO-APIC. That may or may not
> > happen later, but that's totally unrelated to this whole
> > "suspend_device_irq()" thing.
>
> Yeah.

We definitely don't want to turn off x86 IO-APICs - the timer IRQ
goes via one of them.

	Ingo

^ permalink raw reply	[flat|nested] 373+ messages in thread
* Re: [RFC][PATCH 2/2] PM: Rework handling of interrupts during suspend-resume
  2009-02-24 23:09 ` Ingo Molnar
@ 2009-02-24 23:29   ` Rafael J. Wysocki
  2009-02-24 23:29   ` Rafael J. Wysocki
  1 sibling, 0 replies; 373+ messages in thread
From: Rafael J. Wysocki @ 2009-02-24 23:29 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Jeremy Fitzhardinge, LKML, Jesse Barnes, Eric W. Biederman,
      pm list, Linus Torvalds, Thomas Gleixner

On Wednesday 25 February 2009, Ingo Molnar wrote:
>
> * Rafael J. Wysocki <rjw@sisk.pl> wrote:
>
> > On Tuesday 24 February 2009, Linus Torvalds wrote:
> > >
> > > On Tue, 24 Feb 2009, Rafael J. Wysocki wrote:
> > > >
> > > > > The only safe way on x86 to shutdown a level triggered ioapic irq
> > > > > outside of irq context is for the driver to program the hardware to
> > > > > not generate an irq.
> > > >
> > > > Well, that changes things quite a bit, because it means we can't change the
> > > > suspend-resume sequence in a way we thought we could without fixing all
> > > > drivers first, but this is exactly what we'd like to avoid by changing the
> > > > core.
> > >
> > > Calling "disable_irq()" is perfectly fine.
> > >
> > > What is not possible on that broken IO-APIC (among other
> > > things) is to actually turn the interrupts off at the apic
> > > (ie the whole ->shutdown() thing). But that's not what we
> > > even want to do. What we care about is just disabling the
> > > interrupt from a driver perspective.
> > >
> > > IOW, the patches I have seen are fine, and all the comments
> > > from Eric are just confusion about what we want done.
> >
> > Ah, OK.  Thanks for the explanation, I got confused too.
> >
> > > WE DO NOT WANT TO TURN OFF THE IO-APIC. That may or may not
> > > happen later, but that's totally unrelated to this whole
> > > "suspend_device_irq()" thing.
> >
> > Yeah.
>
> We definitely don't want to turn off x86 IO-APICs - the timer IRQ
> goes via one of them.

No, we don't.  At least not at this point.

BTW, appended is the current (3rd) version of the $subject patch with some
of your comments taken into account.  In particular, I did the following:
- moved [suspend|resume]_device_irqs() to a separate file (pm.c)
- fixed interrupt.h so that their headers are at a better place
- made enable_irq() clear IRQ_SUSPENDED
- made device_power_down() and device_power_up() call suspend_device_irqs()
  and resume_device_irqs(), respectively, which simplified the callers quite
  a bit (it changed the Xen code ordering, though, but I _think_ it still
  should work).

Please have a look.

Thanks,
Rafael

---
From: Rafael J. Wysocki <rjw@sisk.pl>
Subject: PM: Rework handling of interrupts during suspend-resume (rev. 3)

Introduce two helper functions allowing us to prevent device drivers
from getting any interrupts (without disabling interrupts on the CPU)
during suspend (or hibernation) and to make them start to receive
interrupts again during the subsequent resume, respectively.  These
functions make it possible to keep timer interrupts enabled while the
"late" suspend and "early" resume callbacks provided by device
drivers are being executed.

Use these functions to rework the handling of interrupts during
suspend (hibernation) and resume.  Namely, interrupts will only be
disabled on the CPU right before suspending sysdevs, while device
drivers will be prevented from receiving interrupts, with the help
of the new helper function, before their "late" suspend callbacks run
(and analogously during resume).

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
---
 arch/x86/kernel/apm_32.c  |   15 ++++++++--
 drivers/base/power/main.c |   20 ++++++++------
 drivers/xen/manage.c      |   16 ++++++-----
 include/linux/interrupt.h |    4 ++
 include/linux/irq.h       |    1
 kernel/irq/Makefile       |    1
 kernel/irq/manage.c       |    3 +-
 kernel/irq/pm.c           |   63 ++++++++++++++++++++++++++++++++++++++++++++++
 kernel/kexec.c            |    8 ++---
 kernel/power/disk.c       |   39 +++++++++++++++++++++-------
 kernel/power/main.c       |   17 ++++++++----
 11 files changed, 146 insertions(+), 41 deletions(-)

Index: linux-2.6/include/linux/interrupt.h
===================================================================
--- linux-2.6.orig/include/linux/interrupt.h
+++ linux-2.6/include/linux/interrupt.h
@@ -106,6 +106,10 @@ extern void disable_irq_nosync(unsigned
 extern void disable_irq(unsigned int irq);
 extern void enable_irq(unsigned int irq);

+/* The following two functions are for the core kernel use only. */
+extern void suspend_device_irqs(void);
+extern void resume_device_irqs(void);
+
 #if defined(CONFIG_SMP) && defined(CONFIG_GENERIC_HARDIRQS)

 extern cpumask_var_t irq_default_affinity;
Index: linux-2.6/kernel/power/main.c
===================================================================
--- linux-2.6.orig/kernel/power/main.c
+++ linux-2.6/kernel/power/main.c
@@ -287,17 +287,19 @@ void __attribute__ ((weak)) arch_suspend
  */
 static int suspend_enter(suspend_state_t state)
 {
-	int error = 0;
+	int error;

 	device_pm_lock();
-	arch_suspend_disable_irqs();
-	BUG_ON(!irqs_disabled());

-	if ((error = device_power_down(PMSG_SUSPEND))) {
+	error = device_power_down(PMSG_SUSPEND);
+	if (error) {
 		printk(KERN_ERR "PM: Some devices failed to power down\n");
 		goto Done;
 	}

+	arch_suspend_disable_irqs();
+	BUG_ON(!irqs_disabled());
+
 	error = sysdev_suspend(PMSG_SUSPEND);
 	if (!error) {
 		if (!suspend_test(TEST_CORE))
@@ -305,11 +307,14 @@ static int suspend_enter(suspend_state_t
 		sysdev_resume();
 	}

-	device_power_up(PMSG_RESUME);
- Done:
 	arch_suspend_enable_irqs();
 	BUG_ON(irqs_disabled());
+
+	device_power_up(PMSG_RESUME);
+
+ Done:
 	device_pm_unlock();
+
 	return error;
 }

Index: linux-2.6/kernel/power/disk.c
===================================================================
--- linux-2.6.orig/kernel/power/disk.c
+++ linux-2.6/kernel/power/disk.c
@@ -214,7 +214,7 @@ static int create_image(int platform_mod
 		return error;

 	device_pm_lock();
-	local_irq_disable();
+
 	/* At this point, device_suspend() has been called, but *not*
 	 * device_power_down(). We *must* call device_power_down() now.
 	 * Otherwise, drivers for some devices (e.g. interrupt controllers)
@@ -225,8 +225,11 @@ static int create_image(int platform_mod
 	if (error) {
 		printk(KERN_ERR "PM: Some devices failed to power down, "
 			"aborting hibernation\n");
-		goto Enable_irqs;
+		goto Unlock;
 	}
+
+	local_irq_disable();
+
 	sysdev_suspend(PMSG_FREEZE);
 	if (error) {
 		printk(KERN_ERR "PM: Some devices failed to power down, "
@@ -252,12 +255,16 @@ static int create_image(int platform_mod
 	/* NOTE:  device_power_up() is just a resume() for devices
 	 * that suspended with irqs off ... no overall powerup.
 	 */
+
  Power_up_devices:
+	local_irq_enable();
+
 	device_power_up(in_suspend ?
 		(error ? PMSG_RECOVER : PMSG_THAW) : PMSG_RESTORE);
- Enable_irqs:
-	local_irq_enable();
+
+ Unlock:
 	device_pm_unlock();
+
 	return error;
 }

@@ -336,13 +343,16 @@ static int resume_target_kernel(void)
 	int error;

 	device_pm_lock();
-	local_irq_disable();
+
 	error = device_power_down(PMSG_QUIESCE);
 	if (error) {
 		printk(KERN_ERR "PM: Some devices failed to power down, "
 			"aborting resume\n");
-		goto Enable_irqs;
+		goto Unlock;
 	}
+
+	local_irq_disable();
+
 	sysdev_suspend(PMSG_QUIESCE);
 	/* We'll ignore saved state, but this gets preempt count (etc) right */
 	save_processor_state();
@@ -366,11 +376,16 @@ static int resume_target_kernel(void)
 	swsusp_free();
 	restore_processor_state();
 	touch_softlockup_watchdog();
+
 	sysdev_resume();
-	device_power_up(PMSG_RECOVER);
- Enable_irqs:
+
 	local_irq_enable();
+
+	device_power_up(PMSG_RECOVER);
+
+ Unlock:
 	device_pm_unlock();
+
 	return error;
 }

@@ -447,15 +462,16 @@ int hibernation_platform_enter(void)
 		goto Finish;

 	device_pm_lock();
-	local_irq_disable();
+
 	error = device_power_down(PMSG_HIBERNATE);
 	if (!error) {
+		local_irq_disable();
 		sysdev_suspend(PMSG_HIBERNATE);
 		hibernation_ops->enter();
 		/* We should never get here */
 		while (1);
 	}
-	local_irq_enable();
+
 	device_pm_unlock();

 	/*
@@ -464,12 +480,15 @@ int hibernation_platform_enter(void)
 	 */
  Finish:
 	hibernation_ops->finish();
+
  Resume_devices:
 	entering_platform_hibernation = false;
 	device_resume(PMSG_RESTORE);
 	resume_console();
+
  Close:
 	hibernation_ops->end();
+
 	return error;
 }

Index: linux-2.6/arch/x86/kernel/apm_32.c
===================================================================
--- linux-2.6.orig/arch/x86/kernel/apm_32.c
+++ linux-2.6/arch/x86/kernel/apm_32.c
@@ -1190,8 +1190,10 @@ static int suspend(int vetoable)
 	struct apm_user	*as;

 	device_suspend(PMSG_SUSPEND);
-	local_irq_disable();
+
 	device_power_down(PMSG_SUSPEND);
+
+	local_irq_disable();
 	sysdev_suspend(PMSG_SUSPEND);

 	local_irq_enable();
@@ -1209,9 +1211,12 @@ static int suspend(int vetoable)
 	if (err != APM_SUCCESS)
 		apm_error("suspend", err);
 	err = (err == APM_SUCCESS) ? 0 : -EIO;
+
 	sysdev_resume();
-	device_power_up(PMSG_RESUME);
 	local_irq_enable();
+
+	device_power_up(PMSG_RESUME);
+
 	device_resume(PMSG_RESUME);
 	queue_event(APM_NORMAL_RESUME, NULL);
 	spin_lock(&user_list_lock);
@@ -1228,8 +1233,9 @@ static void standby(void)
 {
 	int err;

-	local_irq_disable();
 	device_power_down(PMSG_SUSPEND);
+
+	local_irq_disable();
 	sysdev_suspend(PMSG_SUSPEND);
 	local_irq_enable();

@@ -1239,8 +1245,9 @@ static void standby(void)

 	local_irq_disable();
 	sysdev_resume();
-	device_power_up(PMSG_RESUME);
 	local_irq_enable();
+
+	device_power_up(PMSG_RESUME);
 }

 static apm_event_t get_event(void)
Index: linux-2.6/drivers/xen/manage.c
===================================================================
--- linux-2.6.orig/drivers/xen/manage.c
+++ linux-2.6/drivers/xen/manage.c
@@ -39,12 +39,6 @@ static int xen_suspend(void *data)

 	BUG_ON(!irqs_disabled());

-	err = device_power_down(PMSG_SUSPEND);
-	if (err) {
-		printk(KERN_ERR "xen_suspend: device_power_down failed: %d\n",
-		       err);
-		return err;
-	}
 	err = sysdev_suspend(PMSG_SUSPEND);
 	if (err) {
 		printk(KERN_ERR "xen_suspend: sysdev_suspend failed: %d\n",
@@ -69,7 +63,6 @@ static int xen_suspend(void *data)
 	xen_mm_unpin_all();

 	sysdev_resume();
-	device_power_up(PMSG_RESUME);

 	if (!*cancelled) {
 		xen_irq_resume();
@@ -108,6 +101,12 @@ static void do_suspend(void)
 	/* XXX use normal device tree? */
 	xenbus_suspend();

+	err = device_power_down(PMSG_SUSPEND);
+	if (err) {
+		printk(KERN_ERR "device_power_down failed: %d\n", err);
+		goto resume_devices;
+	}
+
 	err = stop_machine(xen_suspend, &cancelled, &cpumask_of_cpu(0));
 	if (err) {
 		printk(KERN_ERR "failed to start xen_suspend: %d\n", err);
@@ -120,6 +119,9 @@ static void do_suspend(void)
 	} else
 		xenbus_suspend_cancel();

+	device_power_up(PMSG_RESUME);
+
+resume_devices:
 	device_resume(PMSG_RESUME);

 	/* Make sure timer events get retriggered on all CPUs */
Index: linux-2.6/kernel/kexec.c
===================================================================
--- linux-2.6.orig/kernel/kexec.c
+++ linux-2.6/kernel/kexec.c
@@ -1454,7 +1454,6 @@ int kernel_kexec(void)
 		if (error)
 			goto Resume_devices;
 		device_pm_lock();
-		local_irq_disable();
 		/* At this point, device_suspend() has been called,
 		 * but *not* device_power_down(). We *must*
 		 * device_power_down() now.  Otherwise, drivers for
@@ -1464,8 +1463,9 @@ int kernel_kexec(void)
 		 */
 		error = device_power_down(PMSG_FREEZE);
 		if (error)
-			goto Enable_irqs;
+			goto Unlock_pm;

+		local_irq_disable();
 		/* Suspend system devices */
 		error = sysdev_suspend(PMSG_FREEZE);
 		if (error)
@@ -1484,9 +1484,9 @@ int kernel_kexec(void)
 	if (kexec_image->preserve_context) {
 		sysdev_resume();
  Power_up_devices:
-		device_power_up(PMSG_RESTORE);
- Enable_irqs:
 		local_irq_enable();
+		device_power_up(PMSG_RESTORE);
+ Unlock_pm:
 		device_pm_unlock();
 		enable_nonboot_cpus();
  Resume_devices:
Index: linux-2.6/include/linux/irq.h
===================================================================
--- linux-2.6.orig/include/linux/irq.h
+++ linux-2.6/include/linux/irq.h
@@ -65,6 +65,7 @@ typedef	void (*irq_flow_handler_t)(unsig
 #define IRQ_SPURIOUS_DISABLED	0x00800000	/* IRQ was disabled by the spurious trap */
 #define IRQ_MOVE_PCNTXT		0x01000000	/* IRQ migration from process context */
 #define IRQ_AFFINITY_SET	0x02000000	/* IRQ affinity was set from userspace*/
+#define IRQ_SUSPENDED		0x04000000	/* IRQ has gone through suspend sequence */

 #ifdef CONFIG_IRQ_PER_CPU
 # define CHECK_IRQ_PER_CPU(var) ((var) & IRQ_PER_CPU)
Index: linux-2.6/kernel/irq/pm.c
===================================================================
--- /dev/null
+++ linux-2.6/kernel/irq/pm.c
@@ -0,0 +1,63 @@
+/*
+ * linux/kernel/irq/pm.c
+ *
+ * Copyright (C) 2009 Rafael J. Wysocki <rjw@sisk.pl>, Novell Inc.
+ *
+ * This file contains power management functions related to interrupts.
+ */
+
+#include <linux/irq.h>
+#include <linux/module.h>
+#include <linux/interrupt.h>
+
+/**
+ * suspend_device_irqs - disable all currently enabled interrupt lines
+ *
+ * During system-wide suspend or hibernation device interrupts need to be
+ * disabled at the chip level and this function is provided for this
+ * purpose.  It disables all interrupt lines that are enabled at the
+ * moment and sets the IRQ_SUSPENDED flag for them.
+ */
+void suspend_device_irqs(void)
+{
+	struct irq_desc *desc;
+	int irq;
+
+	for_each_irq_desc(irq, desc) {
+		unsigned long flags;
+
+		spin_lock_irqsave(&desc->lock, flags);
+
+		if (!desc->depth && !(desc->status & IRQ_WAKEUP)
+		    && desc->action && !(desc->action->flags & IRQF_TIMER)) {
+			desc->depth++;
+			desc->status |= IRQ_DISABLED | IRQ_SUSPENDED;
+			desc->chip->disable(irq);
+		}
+
+		spin_unlock_irqrestore(&desc->lock, flags);
+	}
+
+	for_each_irq_desc(irq, desc) {
+		if (desc->status & IRQ_SUSPENDED)
+			synchronize_irq(irq);
+	}
+}
+EXPORT_SYMBOL_GPL(suspend_device_irqs);
+
+/**
+ * resume_device_irqs - enable interrupts disabled by suspend_device_irqs()
+ *
+ * Enable all interrupt lines previously disabled by suspend_device_irqs()
+ * that have the IRQ_SUSPENDED flag set.
+ */
+void resume_device_irqs(void)
+{
+	struct irq_desc *desc;
+	int irq;
+
+	for_each_irq_desc(irq, desc)
+		if (desc->status & IRQ_SUSPENDED)
+			enable_irq(irq);
+}
+EXPORT_SYMBOL_GPL(resume_device_irqs);
Index: linux-2.6/kernel/irq/Makefile
===================================================================
--- linux-2.6.orig/kernel/irq/Makefile
+++ linux-2.6/kernel/irq/Makefile
@@ -4,3 +4,4 @@ obj-$(CONFIG_GENERIC_IRQ_PROBE) += autop
 obj-$(CONFIG_PROC_FS) += proc.o
 obj-$(CONFIG_GENERIC_PENDING_IRQ) += migration.o
 obj-$(CONFIG_NUMA_MIGRATE_IRQ_DESC) += numa_migrate.o
+obj-$(CONFIG_PM_SLEEP) += pm.o
Index: linux-2.6/kernel/irq/manage.c
===================================================================
--- linux-2.6.orig/kernel/irq/manage.c
+++ linux-2.6/kernel/irq/manage.c
@@ -222,8 +222,9 @@ static void __enable_irq(struct irq_desc
 		WARN(1, KERN_WARNING "Unbalanced enable for IRQ %d\n", irq);
 		break;
 	case 1: {
-		unsigned int status = desc->status & ~IRQ_DISABLED;
+		unsigned int status;

+		status = desc->status & ~(IRQ_DISABLED | IRQ_SUSPENDED);
 		/* Prevent probing on this irq: */
 		desc->status = status | IRQ_NOPROBE;
 		check_irq_resend(desc, irq);
Index: linux-2.6/drivers/base/power/main.c
===================================================================
--- linux-2.6.orig/drivers/base/power/main.c
+++ linux-2.6/drivers/base/power/main.c
@@ -23,6 +23,7 @@
 #include <linux/pm.h>
 #include <linux/resume-trace.h>
 #include <linux/rwsem.h>
+#include <linux/interrupt.h>

 #include "../base.h"
 #include "power.h"
@@ -305,7 +306,8 @@ static int resume_device_noirq(struct de
  *	Execute the appropriate "noirq resume" callback for all devices marked
  *	as DPM_OFF_IRQ.
  *
- *	Must be called with interrupts disabled and only one CPU running.
+ *	Must be called under dpm_list_mtx.  Device drivers should not receive
+ *	interrupts while it's being executed.
  */
 static void dpm_power_up(pm_message_t state)
 {
@@ -326,14 +328,13 @@ static void dpm_power_up(pm_message_t st
  *	device_power_up - Turn on all devices that need special attention.
  *	@state: PM transition of the system being carried out.
  *
- *	Power on system devices, then devices that required we shut them down
- *	with interrupts disabled.
- *
- *	Must be called with interrupts disabled.
+ *	Call the "early" resume handlers and enable device drivers to receive
+ *	interrupts.
  */
 void device_power_up(pm_message_t state)
 {
 	dpm_power_up(state);
+	resume_device_irqs();
 }
 EXPORT_SYMBOL_GPL(device_power_up);

@@ -558,16 +559,17 @@ static int suspend_device_noirq(struct d
  *	device_power_down - Shut down special devices.
  *	@state: PM transition of the system being carried out.
  *
- *	Power down devices that require interrupts to be disabled.
- *	Then power down system devices.
+ *	Prevent device drivers from receiving interrupts and call the "late"
+ *	suspend handlers.
  *
- *	Must be called with interrupts disabled and only one CPU running.
+ *	Must be called under dpm_list_mtx.
  */
 int device_power_down(pm_message_t state)
 {
 	struct device *dev;
 	int error = 0;

+	suspend_device_irqs();
 	list_for_each_entry_reverse(dev, &dpm_list, power.entry) {
 		error = suspend_device_noirq(dev, state);
 		if (error) {
@@ -577,7 +579,7 @@ int device_power_down(pm_message_t state
 		dev->power.status = DPM_OFF_IRQ;
 	}
 	if (error)
-		dpm_power_up(resume_event(state));
+		device_power_up(resume_event(state));
 	return error;
 }
 EXPORT_SYMBOL_GPL(device_power_down);

^ permalink raw reply	[flat|nested] 373+ messages in thread
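The selection logic of suspend_device_irqs() and resume_device_irqs() in the patch above can be modelled in user space as follows. This is a minimal sketch under stated assumptions, not the kernel implementation: the `M_*` flags, `struct model_irq` and the `model_*` functions are hypothetical names invented for illustration, the presence of a handler is folded into a status bit, and locking, chip callbacks and synchronize_irq() are left out entirely.

```c
#include <assert.h>
#include <stddef.h>

#define M_DISABLED	0x01u	/* line is disabled */
#define M_SUSPENDED	0x02u	/* disabled by model_suspend_device_irqs() */
#define M_WAKEUP	0x04u	/* wakeup source, left alone */
#define M_TIMER		0x08u	/* timer interrupt, left alone */
#define M_HAS_ACTION	0x10u	/* a handler is installed */

/* Hypothetical, simplified stand-in for one interrupt descriptor. */
struct model_irq {
	unsigned int depth;	/* nesting count of disables */
	unsigned int status;	/* M_* flags */
};

/* Disable every line that is enabled, has a handler, and is neither
 * a wakeup source nor a timer; remember which ones were touched. */
static void model_suspend_device_irqs(struct model_irq *irqs, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		struct model_irq *d = &irqs[i];

		if (d->depth == 0 && (d->status & M_HAS_ACTION) &&
		    !(d->status & (M_WAKEUP | M_TIMER))) {
			d->depth++;
			d->status |= M_DISABLED | M_SUSPENDED;
		}
	}
}

/* Re-enable only the lines marked during suspend, so lines a driver
 * had disabled on its own stay disabled. */
static void model_resume_device_irqs(struct model_irq *irqs, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		struct model_irq *d = &irqs[i];

		if (!(d->status & M_SUSPENDED))
			continue;
		assert(d->depth > 0);
		if (--d->depth == 0)
			d->status &= ~(M_DISABLED | M_SUSPENDED);
	}
}
```

The point of the separate "suspended" mark in this model is that resume can undo exactly what suspend did, without disturbing lines that were already disabled for other reasons and without ever touching the timer line.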
* Re: [RFC][PATCH 2/2] PM: Rework handling of interrupts during suspend-resume 2009-02-24 23:09 ` Ingo Molnar 2009-02-24 23:29 ` Rafael J. Wysocki @ 2009-02-24 23:29 ` Rafael J. Wysocki 2009-02-25 13:23 ` Ingo Molnar ` (3 more replies) 1 sibling, 4 replies; 373+ messages in thread From: Rafael J. Wysocki @ 2009-02-24 23:29 UTC (permalink / raw) To: Ingo Molnar Cc: Linus Torvalds, Eric W. Biederman, LKML, Benjamin Herrenschmidt, Jeremy Fitzhardinge, pm list, Len Brown, Jesse Barnes, Thomas Gleixner On Wednesday 25 February 2009, Ingo Molnar wrote: > > * Rafael J. Wysocki <rjw@sisk.pl> wrote: > > > On Tuesday 24 February 2009, Linus Torvalds wrote: > > > > > > On Tue, 24 Feb 2009, Rafael J. Wysocki wrote: > > > > > > > > > The only safe way on x86 to shutdown a level triggered ioapic irq > > > > > outside of irq context is for the driver to program the hardware to > > > > > not generate an irq. > > > > > > > > Well, that changes things quite a bit, because it means we can't change the > > > > suspend-resume sequence in a way we thought we could without fixing all > > > > drivers first, but this is exactly what we'd like to avoid by changing the > > > > core. > > > > > > Calling "disable_irq()" is perfectly fine. > > > > > > What is not possible on that broken IO-APIC (among other > > > things) is to actually turn the interrupts off at the apic > > > (ie the whole ->shutdown() thing). But that's not what we > > > even want to do. What we care about is just disabling the > > > interrupt from a drievr perspective. > > > > > > IOW, the patches I have seen are fine, and all the comments > > > from Eric are just confusion about what we want done. > > > > Ah, OK. Thanks for the explanation, I got confused too. > > > > > WE DO NOT WANT TO TURN OFF THE IO-APIC. That may or may > > > happen later, but that's totally unrelated to this whole > > > "suspend_device_irq()" thing. > > > > Yeah. 
> > We definitely dont want to turn off x86 IO-APICs - the timer IRQ > goes via one of them. No, we don't. At least not at this point. BTW, appended is the current (3rd) version of the $subject patch with some of your comments taken into account. In particular, I did the following: - moved [suspend|resume]_device_irqs() to a separate file (pm.c) - fixed interrupt.h so that their headers are at a better place - made enable_irq() clear IRQ_SUSPENDED - made device_power_down() and device_power_up() call suspend_device_irqs() and resume_device_irqs(), respectively, which simplified the callers quite a bit (it changed the Xen code ordering, though, but I _think_ it still should work). Please have a look. Thanks, Rafael --- From: Rafael J. Wysocki <rjw@sisk.pl> Subject: PM: Rework handling of interrupts during suspend-resume (rev. 3) Introduce two helper functions allowing us to prevent device drivers from getting any interrupts (without disabling interrupts on the CPU) during suspend (or hibernation) and to make them start to receive interrupts again during the subsequent resume, respectively. These functions make it possible to keep timer interrupts enabled while the "late" suspend and "early" resume callbacks provided by device drivers are being executed. Use these functions to rework the handling of interrupts during suspend (hibernation) and resume. Namely, interrupts will only be disabled on the CPU right before suspending sysdevs, while device drivers will be prevented from receiving interrupts, with the help of the new helper function, before their "late" suspend callbacks run (and analogously during resume). Signed-off-by: Rafael J. 
Wysocki <rjw@sisk.pl> --- arch/x86/kernel/apm_32.c | 15 ++++++++-- drivers/base/power/main.c | 20 ++++++++------ drivers/xen/manage.c | 16 ++++++----- include/linux/interrupt.h | 4 ++ include/linux/irq.h | 1 kernel/irq/Makefile | 1 kernel/irq/manage.c | 3 +- kernel/irq/pm.c | 63 ++++++++++++++++++++++++++++++++++++++++++++++ kernel/kexec.c | 8 ++--- kernel/power/disk.c | 39 +++++++++++++++++++++------- kernel/power/main.c | 17 ++++++++---- 11 files changed, 146 insertions(+), 41 deletions(-) Index: linux-2.6/include/linux/interrupt.h =================================================================== --- linux-2.6.orig/include/linux/interrupt.h +++ linux-2.6/include/linux/interrupt.h @@ -106,6 +106,10 @@ extern void disable_irq_nosync(unsigned extern void disable_irq(unsigned int irq); extern void enable_irq(unsigned int irq); +/* The following two functions are for the core kernel use only. */ +extern void suspend_device_irqs(void); +extern void resume_device_irqs(void); + #if defined(CONFIG_SMP) && defined(CONFIG_GENERIC_HARDIRQS) extern cpumask_var_t irq_default_affinity; Index: linux-2.6/kernel/power/main.c =================================================================== --- linux-2.6.orig/kernel/power/main.c +++ linux-2.6/kernel/power/main.c @@ -287,17 +287,19 @@ void __attribute__ ((weak)) arch_suspend */ static int suspend_enter(suspend_state_t state) { - int error = 0; + int error; device_pm_lock(); - arch_suspend_disable_irqs(); - BUG_ON(!irqs_disabled()); - if ((error = device_power_down(PMSG_SUSPEND))) { + error = device_power_down(PMSG_SUSPEND); + if (error) { printk(KERN_ERR "PM: Some devices failed to power down\n"); goto Done; } + arch_suspend_disable_irqs(); + BUG_ON(!irqs_disabled()); + error = sysdev_suspend(PMSG_SUSPEND); if (!error) { if (!suspend_test(TEST_CORE)) @@ -305,11 +307,14 @@ static int suspend_enter(suspend_state_t sysdev_resume(); } - device_power_up(PMSG_RESUME); - Done: arch_suspend_enable_irqs(); BUG_ON(irqs_disabled()); + + 
device_power_up(PMSG_RESUME); + + Done: device_pm_unlock(); + return error; } Index: linux-2.6/kernel/power/disk.c =================================================================== --- linux-2.6.orig/kernel/power/disk.c +++ linux-2.6/kernel/power/disk.c @@ -214,7 +214,7 @@ static int create_image(int platform_mod return error; device_pm_lock(); - local_irq_disable(); + /* At this point, device_suspend() has been called, but *not* * device_power_down(). We *must* call device_power_down() now. * Otherwise, drivers for some devices (e.g. interrupt controllers) @@ -225,8 +225,11 @@ static int create_image(int platform_mod if (error) { printk(KERN_ERR "PM: Some devices failed to power down, " "aborting hibernation\n"); - goto Enable_irqs; + goto Unlock; } + + local_irq_disable(); + sysdev_suspend(PMSG_FREEZE); if (error) { printk(KERN_ERR "PM: Some devices failed to power down, " @@ -252,12 +255,16 @@ static int create_image(int platform_mod /* NOTE: device_power_up() is just a resume() for devices * that suspended with irqs off ... no overall powerup. */ + Power_up_devices: + local_irq_enable(); + device_power_up(in_suspend ? (error ? 
PMSG_RECOVER : PMSG_THAW) : PMSG_RESTORE); - Enable_irqs: - local_irq_enable(); + + Unlock: device_pm_unlock(); + return error; } @@ -336,13 +343,16 @@ static int resume_target_kernel(void) int error; device_pm_lock(); - local_irq_disable(); + error = device_power_down(PMSG_QUIESCE); if (error) { printk(KERN_ERR "PM: Some devices failed to power down, " "aborting resume\n"); - goto Enable_irqs; + goto Unlock; } + + local_irq_disable(); + sysdev_suspend(PMSG_QUIESCE); /* We'll ignore saved state, but this gets preempt count (etc) right */ save_processor_state(); @@ -366,11 +376,16 @@ static int resume_target_kernel(void) swsusp_free(); restore_processor_state(); touch_softlockup_watchdog(); + sysdev_resume(); - device_power_up(PMSG_RECOVER); - Enable_irqs: + local_irq_enable(); + + device_power_up(PMSG_RECOVER); + + Unlock: device_pm_unlock(); + return error; } @@ -447,15 +462,16 @@ int hibernation_platform_enter(void) goto Finish; device_pm_lock(); - local_irq_disable(); + error = device_power_down(PMSG_HIBERNATE); if (!error) { + local_irq_disable(); sysdev_suspend(PMSG_HIBERNATE); hibernation_ops->enter(); /* We should never get here */ while (1); } - local_irq_enable(); + device_pm_unlock(); /* @@ -464,12 +480,15 @@ int hibernation_platform_enter(void) */ Finish: hibernation_ops->finish(); + Resume_devices: entering_platform_hibernation = false; device_resume(PMSG_RESTORE); resume_console(); + Close: hibernation_ops->end(); + return error; } Index: linux-2.6/arch/x86/kernel/apm_32.c =================================================================== --- linux-2.6.orig/arch/x86/kernel/apm_32.c +++ linux-2.6/arch/x86/kernel/apm_32.c @@ -1190,8 +1190,10 @@ static int suspend(int vetoable) struct apm_user *as; device_suspend(PMSG_SUSPEND); - local_irq_disable(); + device_power_down(PMSG_SUSPEND); + + local_irq_disable(); sysdev_suspend(PMSG_SUSPEND); local_irq_enable(); @@ -1209,9 +1211,12 @@ static int suspend(int vetoable) if (err != APM_SUCCESS) 
apm_error("suspend", err); err = (err == APM_SUCCESS) ? 0 : -EIO; + sysdev_resume(); - device_power_up(PMSG_RESUME); local_irq_enable(); + + device_power_up(PMSG_RESUME); + device_resume(PMSG_RESUME); queue_event(APM_NORMAL_RESUME, NULL); spin_lock(&user_list_lock); @@ -1228,8 +1233,9 @@ static void standby(void) { int err; - local_irq_disable(); device_power_down(PMSG_SUSPEND); + + local_irq_disable(); sysdev_suspend(PMSG_SUSPEND); local_irq_enable(); @@ -1239,8 +1245,9 @@ static void standby(void) local_irq_disable(); sysdev_resume(); - device_power_up(PMSG_RESUME); local_irq_enable(); + + device_power_up(PMSG_RESUME); } static apm_event_t get_event(void) Index: linux-2.6/drivers/xen/manage.c =================================================================== --- linux-2.6.orig/drivers/xen/manage.c +++ linux-2.6/drivers/xen/manage.c @@ -39,12 +39,6 @@ static int xen_suspend(void *data) BUG_ON(!irqs_disabled()); - err = device_power_down(PMSG_SUSPEND); - if (err) { - printk(KERN_ERR "xen_suspend: device_power_down failed: %d\n", - err); - return err; - } err = sysdev_suspend(PMSG_SUSPEND); if (err) { printk(KERN_ERR "xen_suspend: sysdev_suspend failed: %d\n", @@ -69,7 +63,6 @@ static int xen_suspend(void *data) xen_mm_unpin_all(); sysdev_resume(); - device_power_up(PMSG_RESUME); if (!*cancelled) { xen_irq_resume(); @@ -108,6 +101,12 @@ static void do_suspend(void) /* XXX use normal device tree? 
*/
 	xenbus_suspend();
 
+	err = device_power_down(PMSG_SUSPEND);
+	if (err) {
+		printk(KERN_ERR "device_power_down failed: %d\n", err);
+		goto resume_devices;
+	}
+
 	err = stop_machine(xen_suspend, &cancelled, &cpumask_of_cpu(0));
 	if (err) {
 		printk(KERN_ERR "failed to start xen_suspend: %d\n", err);
@@ -120,6 +119,9 @@ static void do_suspend(void)
 	} else
 		xenbus_suspend_cancel();
 
+	device_power_up(PMSG_RESUME);
+
+resume_devices:
 	device_resume(PMSG_RESUME);
 
 	/* Make sure timer events get retriggered on all CPUs */
Index: linux-2.6/kernel/kexec.c
===================================================================
--- linux-2.6.orig/kernel/kexec.c
+++ linux-2.6/kernel/kexec.c
@@ -1454,7 +1454,6 @@ int kernel_kexec(void)
 		if (error)
 			goto Resume_devices;
 		device_pm_lock();
-		local_irq_disable();
 		/* At this point, device_suspend() has been called,
 		 * but *not* device_power_down(). We *must*
 		 * device_power_down() now.  Otherwise, drivers for
@@ -1464,8 +1463,9 @@ int kernel_kexec(void)
 		 */
 		error = device_power_down(PMSG_FREEZE);
 		if (error)
-			goto Enable_irqs;
+			goto Unlock_pm;
 
+		local_irq_disable();
 		/* Suspend system devices */
 		error = sysdev_suspend(PMSG_FREEZE);
 		if (error)
@@ -1484,9 +1484,9 @@ int kernel_kexec(void)
 	if (kexec_image->preserve_context) {
 		sysdev_resume();
  Power_up_devices:
-		device_power_up(PMSG_RESTORE);
- Enable_irqs:
 		local_irq_enable();
+		device_power_up(PMSG_RESTORE);
+ Unlock_pm:
 		device_pm_unlock();
 		enable_nonboot_cpus();
  Resume_devices:
Index: linux-2.6/include/linux/irq.h
===================================================================
--- linux-2.6.orig/include/linux/irq.h
+++ linux-2.6/include/linux/irq.h
@@ -65,6 +65,7 @@ typedef void (*irq_flow_handler_t)(unsig
 #define IRQ_SPURIOUS_DISABLED	0x00800000	/* IRQ was disabled by the spurious trap */
 #define IRQ_MOVE_PCNTXT		0x01000000	/* IRQ migration from process context */
 #define IRQ_AFFINITY_SET	0x02000000	/* IRQ affinity was set from userspace*/
+#define IRQ_SUSPENDED		0x04000000	/* IRQ has gone through suspend sequence */
 
 #ifdef CONFIG_IRQ_PER_CPU
 # define CHECK_IRQ_PER_CPU(var) ((var) & IRQ_PER_CPU)
Index: linux-2.6/kernel/irq/pm.c
===================================================================
--- /dev/null
+++ linux-2.6/kernel/irq/pm.c
@@ -0,0 +1,63 @@
+/*
+ * linux/kernel/irq/pm.c
+ *
+ * Copyright (C) 2009 Rafael J. Wysocki <rjw@sisk.pl>, Novell Inc.
+ *
+ * This file contains power management functions related to interrupts.
+ */
+
+#include <linux/irq.h>
+#include <linux/module.h>
+#include <linux/interrupt.h>
+
+/**
+ * suspend_device_irqs - disable all currently enabled interrupt lines
+ *
+ * During system-wide suspend or hibernation device interrupts need to be
+ * disabled at the chip level and this function is provided for this
+ * purpose.  It disables all interrupt lines that are enabled at the
+ * moment and sets the IRQ_SUSPENDED flag for them.
+ */
+void suspend_device_irqs(void)
+{
+	struct irq_desc *desc;
+	int irq;
+
+	for_each_irq_desc(irq, desc) {
+		unsigned long flags;
+
+		spin_lock_irqsave(&desc->lock, flags);
+
+		if (!desc->depth && !(desc->status & IRQ_WAKEUP)
+		    && desc->action && !(desc->action->flags & IRQF_TIMER)) {
+			desc->depth++;
+			desc->status |= IRQ_DISABLED | IRQ_SUSPENDED;
+			desc->chip->disable(irq);
+		}
+
+		spin_unlock_irqrestore(&desc->lock, flags);
+	}
+
+	for_each_irq_desc(irq, desc) {
+		if (desc->status & IRQ_SUSPENDED)
+			synchronize_irq(irq);
+	}
+}
+EXPORT_SYMBOL_GPL(suspend_device_irqs);
+
+/**
+ * resume_device_irqs - enable interrupts disabled by suspend_device_irqs()
+ *
+ * Enable all interrupt lines previously disabled by suspend_device_irqs()
+ * that have the IRQ_SUSPENDED flag set.
+ */
+void resume_device_irqs(void)
+{
+	struct irq_desc *desc;
+	int irq;
+
+	for_each_irq_desc(irq, desc)
+		if (desc->status & IRQ_SUSPENDED)
+			enable_irq(irq);
+}
+EXPORT_SYMBOL_GPL(resume_device_irqs);
Index: linux-2.6/kernel/irq/Makefile
===================================================================
--- linux-2.6.orig/kernel/irq/Makefile
+++ linux-2.6/kernel/irq/Makefile
@@ -4,3 +4,4 @@ obj-$(CONFIG_GENERIC_IRQ_PROBE) += autop
 obj-$(CONFIG_PROC_FS) += proc.o
 obj-$(CONFIG_GENERIC_PENDING_IRQ) += migration.o
 obj-$(CONFIG_NUMA_MIGRATE_IRQ_DESC) += numa_migrate.o
+obj-$(CONFIG_PM_SLEEP) += pm.o
Index: linux-2.6/kernel/irq/manage.c
===================================================================
--- linux-2.6.orig/kernel/irq/manage.c
+++ linux-2.6/kernel/irq/manage.c
@@ -222,8 +222,9 @@ static void __enable_irq(struct irq_desc
 		WARN(1, KERN_WARNING "Unbalanced enable for IRQ %d\n", irq);
 		break;
 	case 1: {
-		unsigned int status = desc->status & ~IRQ_DISABLED;
+		unsigned int status;
 
+		status = desc->status & ~(IRQ_DISABLED | IRQ_SUSPENDED);
 		/* Prevent probing on this irq: */
 		desc->status = status | IRQ_NOPROBE;
 		check_irq_resend(desc, irq);
Index: linux-2.6/drivers/base/power/main.c
===================================================================
--- linux-2.6.orig/drivers/base/power/main.c
+++ linux-2.6/drivers/base/power/main.c
@@ -23,6 +23,7 @@
 #include <linux/pm.h>
 #include <linux/resume-trace.h>
 #include <linux/rwsem.h>
+#include <linux/interrupt.h>
 
 #include "../base.h"
 #include "power.h"
@@ -305,7 +306,8 @@ static int resume_device_noirq(struct de
  *	Execute the appropriate "noirq resume" callback for all devices marked
  *	as DPM_OFF_IRQ.
  *
- *	Must be called with interrupts disabled and only one CPU running.
+ *	Must be called under dpm_list_mtx.  Device drivers should not receive
+ *	interrupts while it's being executed.
  */
 static void dpm_power_up(pm_message_t state)
 {
@@ -326,14 +328,13 @@ static void dpm_power_up(pm_message_t st
  *	device_power_up - Turn on all devices that need special attention.
  *	@state: PM transition of the system being carried out.
  *
- *	Power on system devices, then devices that required we shut them down
- *	with interrupts disabled.
- *
- *	Must be called with interrupts disabled.
+ *	Call the "early" resume handlers and enable device drivers to receive
+ *	interrupts.
  */
 void device_power_up(pm_message_t state)
 {
 	dpm_power_up(state);
+	resume_device_irqs();
 }
 EXPORT_SYMBOL_GPL(device_power_up);
 
@@ -558,16 +559,17 @@ static int suspend_device_noirq(struct d
  *	device_power_down - Shut down special devices.
  *	@state: PM transition of the system being carried out.
  *
- *	Power down devices that require interrupts to be disabled.
- *	Then power down system devices.
+ *	Prevent device drivers from receiving interrupts and call the "late"
+ *	suspend handlers.
  *
- *	Must be called with interrupts disabled and only one CPU running.
+ *	Must be called under dpm_list_mtx.
  */
 int device_power_down(pm_message_t state)
 {
 	struct device *dev;
 	int error = 0;
 
+	suspend_device_irqs();
 	list_for_each_entry_reverse(dev, &dpm_list, power.entry) {
 		error = suspend_device_noirq(dev, state);
 		if (error) {
@@ -577,7 +579,7 @@ int device_power_down(pm_message_t state
 		dev->power.status = DPM_OFF_IRQ;
 	}
 	if (error)
-		dpm_power_up(resume_event(state));
+		device_power_up(resume_event(state));
 	return error;
 }
 EXPORT_SYMBOL_GPL(device_power_down);

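The selection logic of suspend_device_irqs() and resume_device_irqs() in the patch above can be tried out in user space. What follows is only a rough model, not kernel code: struct model_irq, the model_* function names and the small bit values are made up for illustration, and locking, chip->disable() and synchronize_irq() are left out. It shows which lines get the IRQ_SUSPENDED mark (enabled, not a wakeup source, has a handler, not a timer) and that resume only touches the marked lines.

```c
#include <stddef.h>

/* Model-only bit values; the real definitions live in the kernel headers. */
#define IRQ_DISABLED	0x01
#define IRQ_WAKEUP	0x02
#define IRQ_SUSPENDED	0x04
#define IRQF_TIMER	0x200

/* Hypothetical stand-in for the fields of struct irq_desc used by the patch. */
struct model_irq {
	unsigned int depth;		/* nesting count of disables */
	unsigned int status;		/* IRQ_* status bits */
	int has_action;			/* non-zero if a handler is registered */
	unsigned long action_flags;	/* IRQF_* flags of that handler */
};

/* Mark and "disable" every line that the patch would disable. */
static void model_suspend_device_irqs(struct model_irq *irqs, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		struct model_irq *d = &irqs[i];

		if (!d->depth && !(d->status & IRQ_WAKEUP)
		    && d->has_action && !(d->action_flags & IRQF_TIMER)) {
			d->depth++;
			d->status |= IRQ_DISABLED | IRQ_SUSPENDED;
		}
	}
}

/* Re-enable only the lines carrying the IRQ_SUSPENDED mark. */
static void model_resume_device_irqs(struct model_irq *irqs, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		struct model_irq *d = &irqs[i];

		if (!(d->status & IRQ_SUSPENDED))
			continue;
		/* Mirrors the depth == 1 case of __enable_irq() in the patch. */
		if (d->depth == 1)
			d->status &= ~(IRQ_DISABLED | IRQ_SUSPENDED);
		if (d->depth > 0)
			d->depth--;
	}
}
```

A line that a driver had already disabled before suspend (depth of 1 or more) is skipped on the way down and therefore stays disabled after resume, which is the point of tracking IRQ_SUSPENDED separately from IRQ_DISABLED.
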
* Re: [RFC][PATCH 2/2] PM: Rework handling of interrupts during suspend-resume
From: Ingo Molnar @ 2009-02-25 13:23 UTC
To: Rafael J. Wysocki
Cc: Jeremy Fitzhardinge, LKML, Jesse Barnes, Eric W. Biederman, pm list, Linus Torvalds, Thomas Gleixner

* Rafael J. Wysocki <rjw@sisk.pl> wrote:

> On Wednesday 25 February 2009, Ingo Molnar wrote:
> >
> > * Rafael J. Wysocki <rjw@sisk.pl> wrote:
> >
> > > On Tuesday 24 February 2009, Linus Torvalds wrote:
> > > >
> > > > On Tue, 24 Feb 2009, Rafael J. Wysocki wrote:
> > > > > >
> > > > > > The only safe way on x86 to shutdown a level triggered ioapic irq
> > > > > > outside of irq context is for the driver to program the hardware to
> > > > > > not generate an irq.
> > > > >
> > > > > Well, that changes things quite a bit, because it means we can't change the
> > > > > suspend-resume sequence in a way we thought we could without fixing all
> > > > > drivers first, but this is exactly what we'd like to avoid by changing the
> > > > > core.
> > > >
> > > > Calling "disable_irq()" is perfectly fine.
> > > >
> > > > What is not possible on that broken IO-APIC (among other
> > > > things) is to actually turn the interrupts off at the apic
> > > > (ie the whole ->shutdown() thing). But that's not what we
> > > > even want to do. What we care about is just disabling the
> > > > interrupt from a driver perspective.
> > > >
> > > > IOW, the patches I have seen are fine, and all the comments
> > > > from Eric are just confusion about what we want done.
> > >
> > > Ah, OK.  Thanks for the explanation, I got confused too.
> > >
> > > > WE DO NOT WANT TO TURN OFF THE IO-APIC. That may or may not
> > > > happen later, but that's totally unrelated to this whole
> > > > "suspend_device_irq()" thing.
> > >
> > > Yeah.
> >
> > We definitely don't want to turn off x86 IO-APICs - the timer IRQ
> > goes via one of them.
>
> No, we don't.  At least not at this point.
>
> BTW, appended is the current (3rd) version of the $subject patch with some
> of your comments taken into account.  In particular, I did the following:
> - moved [suspend|resume]_device_irqs() to a separate file (pm.c)
> - fixed interrupt.h so that their headers are at a better place
> - made enable_irq() clear IRQ_SUSPENDED
> - made device_power_down() and device_power_up() call
>   suspend_device_irqs() and resume_device_irqs(), respectively, which
>   simplified the callers quite a bit (it changed the Xen code ordering, though,
>   but I _think_ it still should work).
>
> Please have a look.

Looks good, thanks Rafael!

Acked-by: Ingo Molnar <mingo@elte.hu>

	Ingo

* Re: [RFC][PATCH 2/2] PM: Rework handling of interrupts during suspend-resume
From: Arve Hjønnevåg @ 2009-02-26 1:17 UTC
To: Rafael J. Wysocki
Cc: Ingo Molnar, Linus Torvalds, Eric W. Biederman, LKML, Benjamin Herrenschmidt, Jeremy Fitzhardinge, pm list, Len Brown, Jesse Barnes, Thomas Gleixner

On Tue, Feb 24, 2009 at 3:29 PM, Rafael J. Wysocki <rjw@sisk.pl> wrote:
> BTW, appended is the current (3rd) version of the $subject patch with some
> of your comments taken into account.  In particular, I did the following:
> - moved [suspend|resume]_device_irqs() to a separate file (pm.c)
> - fixed interrupt.h so that their headers are at a better place
> - made enable_irq() clear IRQ_SUSPENDED
> - made device_power_down() and device_power_up() call
>   suspend_device_irqs() and resume_device_irqs(), respectively, which
>   simplified the callers quite a bit (it changed the Xen code ordering, though,
>   but I _think_ it still should work).

Do you plan to fix edge triggered wakeup interrupts? It still looks
like edge triggered wakeup interrupts that occur between
suspend_device_irqs and local_irq_disable will not cause a wakeup.

-- 
Arve Hjønnevåg

* Re: [RFC][PATCH 2/2] PM: Rework handling of interrupts during suspend-resume
From: Linus Torvalds @ 2009-02-26 1:27 UTC
To: Arve Hjønnevåg
Cc: Jeremy Fitzhardinge, LKML, Jesse Barnes, Thomas Gleixner, Eric W. Biederman, Ingo Molnar, pm list

On Wed, 25 Feb 2009, Arve Hjønnevåg wrote:
>
> Do you plan to fix edge triggered wakeup interrupts? It still looks
> like edge triggered wakeup interrupts that occur between
> suspend_device_irqs and local_irq_disable will not cause a wakeup.

IF we ever see this as a real issue, we can either see it in the
IRQ_PENDING flag, or we can mark such interrupts specially. So it would be
solvable. That said, I haven't actually heard any real usage cases. Normal
wakeup events are _not_ interrupts in the regular "device interrupt
controller" sense.

So can you actually point to an explicit example of something where this
is a real issue?

		Linus

* Re: [RFC][PATCH 2/2] PM: Rework handling of interrupts during suspend-resume
From: Arve Hjønnevåg @ 2009-02-26 2:13 UTC
To: Linus Torvalds
Cc: Rafael J. Wysocki, Ingo Molnar, Eric W. Biederman, LKML, Benjamin Herrenschmidt, Jeremy Fitzhardinge, pm list, Len Brown, Jesse Barnes, Thomas Gleixner

On Wed, Feb 25, 2009 at 5:27 PM, Linus Torvalds <torvalds@linux-foundation.org> wrote:
>
> On Wed, 25 Feb 2009, Arve Hjønnevåg wrote:
>>
>> Do you plan to fix edge triggered wakeup interrupts? It still looks
>> like edge triggered wakeup interrupts that occur between
>> suspend_device_irqs and local_irq_disable will not cause a wakeup.
>
> IF we ever see this as a real issue, we can either see it in the
> IRQ_PENDING flag, or we can mark such interrupts specially. So it would be
> solvable. That said, I haven't actually heard any real usage cases. Normal
> wakeup events are _not_ interrupts in the regular "device interrupt
> controller" sense.
>
> So can you actually point to an explicit example of something where this
> is a real issue?

On the msm platform the keyboard driver currently leaves the interrupts
enabled when suspended. If the interrupt handler is called, we use a
wakelock to abort suspend (without wakelocks you would need to set a
flag and abort in suspend_late instead). If the interrupt occurs after
local_irq_disable, it will still be pending when we get to the suspend
enter hook and suspend will be aborted there.

As far as I can tell, this change breaks this. If you press a key at
the right time, it will be ignored.

-- 
Arve Hjønnevåg

* Re: [RFC][PATCH 2/2] PM: Rework handling of interrupts during suspend-resume
From: Linus Torvalds @ 2009-02-26 2:51 UTC
To: Arve Hjønnevåg
Cc: Rafael J. Wysocki, Ingo Molnar, Eric W. Biederman, LKML, Benjamin Herrenschmidt, Jeremy Fitzhardinge, pm list, Len Brown, Jesse Barnes, Thomas Gleixner

On Wed, 25 Feb 2009, Arve Hjønnevåg wrote:
>
> On the msm platform the keyboard driver currently leaves the interrupts
> enabled when suspended. If the interrupt handler is called, we use a
> wakelock to abort suspend (without wakelocks you would need to set a
> flag and abort in suspend_late instead). If the interrupt occurs after
> local_irq_disable, it will still be pending when we get to the suspend
> enter hook and suspend will be aborted there.
>
> As far as I can tell, this change breaks this. If you press a key at
> the right time, it will be ignored.

Is the irq on a private non-shared interrupt line? If so, you could just
mark it as IRQF_TIMER, and the irq disable logic won't touch it.

What keyboard driver does this msm thing, btw?

		Linus

* Re: [RFC][PATCH 2/2] PM: Rework handling of interrupts during suspend-resume
From: Ingo Molnar @ 2009-02-26 3:00 UTC
To: Linus Torvalds
Cc: Jeremy Fitzhardinge, LKML, Jesse Barnes, Eric W. Biederman, pm list, Thomas Gleixner

* Linus Torvalds <torvalds@linux-foundation.org> wrote:

> On Wed, 25 Feb 2009, Arve Hjønnevåg wrote:
> >
> > On the msm platform the keyboard driver currently leaves the interrupts
> > enabled when suspended. If the interrupt handler is called, we use a
> > wakelock to abort suspend (without wakelocks you would need to set a
> > flag and abort in suspend_late instead). If the interrupt occurs after
> > local_irq_disable, it will still be pending when we get to the suspend
> > enter hook and suspend will be aborted there.
> >
> > As far as I can tell, this change breaks this. If you press a key at
> > the right time, it will be ignored.
>
> Is the irq on a private non-shared interrupt line? If so, you
> could just mark it as IRQF_TIMER, and the irq disable logic
> won't touch it.

Hm, if that solves the problem then it would be nice to have a
new IRQF_NO_SUSPEND flag for it, in addition to IRQF_TIMER:

./interrupt.h: * IRQF_TIMER - Flag to mark this interrupt as timer interrupt
./interrupt.h:#define IRQF_TIMER		0x00000200

to express such quirks cleanly.

And the suspend code can check the (IRQF_TIMER | IRQF_NO_SUSPEND)
mask - so no extra cost. Right now we have a clean enumeration
of timer interrupts, would be nice to keep that.

	Ingo

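The "no extra cost" point about checking a combined mask can be sketched in a few lines. This is an illustration only: IRQF_NO_SUSPEND is merely proposed in the message above, so its bit value here is an assumption picked for the example, and IRQF_KEEP_ENABLED and irq_action_suspendable() are hypothetical names. Only the IRQF_TIMER value is taken from the quoted interrupt.h line.

```c
#include <stdbool.h>

/* Value quoted from interrupt.h in the message above. */
#define IRQF_TIMER		0x00000200
/* Assumed value: the flag is only being proposed, so this bit is a placeholder. */
#define IRQF_NO_SUSPEND		0x00004000

/* One constant mask, so the test is still a single AND, as with IRQF_TIMER alone. */
#define IRQF_KEEP_ENABLED	(IRQF_TIMER | IRQF_NO_SUSPEND)

/* Hypothetical helper: true if a handler with these flags may be disabled on suspend. */
static bool irq_action_suspendable(unsigned long action_flags)
{
	return !(action_flags & IRQF_KEEP_ENABLED);
}
```

With a helper like this, the condition in suspend_device_irqs() would test the combined mask instead of IRQF_TIMER alone, while drivers that register timer interrupts would keep using IRQF_TIMER, so the enumeration of timer interrupts stays clean.
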
* Re: [RFC][PATCH 2/2] PM: Rework handling of interrupts during suspend-resume
From: Arve Hjønnevåg @ 2009-02-26 3:31 UTC
To: Ingo Molnar
Cc: Jeremy Fitzhardinge, LKML, Jesse Barnes, Eric W. Biederman, pm list, Linus Torvalds, Thomas Gleixner

On Wed, Feb 25, 2009 at 7:00 PM, Ingo Molnar <mingo@elte.hu> wrote:
>
> * Linus Torvalds <torvalds@linux-foundation.org> wrote:
>
>> On Wed, 25 Feb 2009, Arve Hjønnevåg wrote:
>> >
>> > On the msm platform the keyboard driver currently leaves the interrupts
>> > enabled when suspended. If the interrupt handler is called, we use a
>> > wakelock to abort suspend (without wakelocks you would need to set a
>> > flag and abort in suspend_late instead). If the interrupt occurs after
>> > local_irq_disable, it will still be pending when we get to the suspend
>> > enter hook and suspend will be aborted there.
>> >
>> > As far as I can tell, this change breaks this. If you press a key at
>> > the right time, it will be ignored.
>>
>> Is the irq on a private non-shared interrupt line? If so, you
>> could just mark it as IRQF_TIMER, and the irq disable logic
>> won't touch it.

That would not work without wakelocks support, since the interrupt
could occur after suspend_late which is the last chance for the driver
to abort sleep. (The patch also breaks my current wakelock
implementation since I use a suspend_late hook to abort sleep, but
this should be easy to fix.)

> Hm, if that solves the problem then it would be nice to have a
> new IRQF_NO_SUSPEND flag for it, in addition to IRQF_TIMER:

I think the right fix is for any interrupt that has IRQ_WAKEUP set to
abort suspend if it is pending. I don't know if anyone relies on these
interrupts being dropped now though.

-- 
Arve Hjønnevåg

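The proposal to abort suspend when a wakeup interrupt is pending could look roughly like the check below. It is a user-space sketch under stated assumptions, not a patch: struct model_irq_desc and model_check_wakeup_irqs() are hypothetical names, the bit values are placeholders rather than the kernel's, and the thread does not say where in the suspend sequence such a check would run.

```c
#include <errno.h>
#include <stddef.h>

/* Placeholder bit values for the model; not copied from kernel headers. */
#define IRQ_PENDING	0x01
#define IRQ_WAKEUP	0x02

/* Hypothetical stand-in for the status word of struct irq_desc. */
struct model_irq_desc {
	unsigned int status;
};

/*
 * Return -EBUSY if any wakeup-capable line has an interrupt pending,
 * so that the caller can abort the suspend; return 0 otherwise.
 */
static int model_check_wakeup_irqs(const struct model_irq_desc *descs, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		unsigned int s = descs[i].status;

		if ((s & IRQ_WAKEUP) && (s & IRQ_PENDING))
			return -EBUSY;
	}
	return 0;
}
```

A pending interrupt on a line without IRQ_WAKEUP does not abort anything in this model, which matches the wording of the proposal: only wakeup interrupts are treated as a reason to back out.
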
* Re: [RFC][PATCH 2/2] PM: Rework handling of interrupts during suspend-resume
From: Linus Torvalds @ 2009-02-26 3:37 UTC
To: Arve Hjønnevåg
Cc: Ingo Molnar, Rafael J. Wysocki, Eric W. Biederman, LKML, Benjamin Herrenschmidt, Jeremy Fitzhardinge, pm list, Len Brown, Jesse Barnes, Thomas Gleixner

On Wed, 25 Feb 2009, Arve Hjønnevåg wrote:
>
> That would not work without wakelocks support, since the interrupt
> could occur after suspend_late which is the last chance for the driver
> to abort sleep. (The patch also breaks my current wakelock
> implementation since I use a suspend_late hook to abort sleep, but
> this should be easy to fix)

Since this must be some very deep arch-specific thing anyway, just make
the dang thing be a "sysdev". At that point, its "suspend" function gets
called way later (at which point CPU interrupts are off).

> > Hm, if that solves the problem then it would be nice to have a
> > new IRQF_NO_SUSPEND flag for it, in addition to IRQF_TIMER:
>
> I think the right fix is for any interrupt that has IRQ_WAKEUP set to
> abort suspend if it is pending. I don't know if anyone relies on these
> interrupts being dropped now though.

We could add something like that, but quite frankly, I'd hate to unless
there is some seriously common case. If it's just an oddball hacky special
case, it's easier to just say "hey, you have that crazy system device, you
handle it yourself".

		Linus

* Re: [RFC][PATCH 2/2] PM: Rework handling of interrupts during suspend-resume
From: Arve Hjønnevåg @ 2009-02-26 3:50 UTC
To: Linus Torvalds
Cc: Jeremy Fitzhardinge, LKML, Jesse Barnes, Thomas Gleixner, Eric W. Biederman, Ingo Molnar, pm list

On Wed, Feb 25, 2009 at 7:37 PM, Linus Torvalds <torvalds@linux-foundation.org> wrote:
>
> On Wed, 25 Feb 2009, Arve Hjønnevåg wrote:
>>
>> That would not work without wakelocks support, since the interrupt
>> could occur after suspend_late which is the last chance for the driver
>> to abort sleep. (The patch also breaks my current wakelock
>> implementation since I use a suspend_late hook to abort sleep, but
>> this should be easy to fix)
>
> Since this must be some very deep arch-specific thing anyway, just make
> the dang thing be a "sysdev". At that point, its "suspend" function gets
> called way later (at which point CPU interrupts are off).

Wakelocks can use a sysdev, but I don't think a keyboard driver should
be a sysdev.

>> > Hm, if that solves the problem then it would be nice to have a
>> > new IRQF_NO_SUSPEND flag for it, in addition to IRQF_TIMER:
>>
>> I think the right fix is for any interrupt that has IRQ_WAKEUP set to
>> abort suspend if it is pending. I don't know if anyone relies on these
>> interrupts being dropped now though.
>
> We could add something like that, but quite frankly, I'd hate to unless
> there is some seriously common case. If it's just an oddball hacky special
> case, it's easier to just say "hey, you have that crazy system device, you
> handle it yourself".

I don't think this is an oddball case. It is very common to connect
keys or keypads to gpios. If these keys are wakeup keys, it is not OK
to lose interrupts during the suspend phase.

-- 
Arve Hjønnevåg

* Re: [RFC][PATCH 2/2] PM: Rework handling of interrupts during suspend-resume
From: Linus Torvalds @ 2009-02-26 3:57 UTC
To: Arve Hjønnevåg
Cc: Jeremy Fitzhardinge, LKML, Jesse Barnes, Thomas Gleixner, Eric W. Biederman, Ingo Molnar, pm list

On Wed, 25 Feb 2009, Arve Hjønnevåg wrote:
>
> I don't think this is an oddball case. It is very common to connect
> keys or keypads to gpios. If these keys are wakeup keys, it is not OK
> to lose interrupts during the suspend phase.

.. and how many drivers is that? Is it one or two "gpio input drivers" or
is it a hundred?

The "common" is not so much about "how many machines", but "in how many
drivers would you actually do this".

		Linus

* Re: [RFC][PATCH 2/2] PM: Rework handling of interrupts during suspend-resume 2009-02-26 3:57 ` Linus Torvalds @ 2009-02-26 4:13 ` Arve Hjønnevåg 2009-02-26 4:13 ` Arve Hjønnevåg 1 sibling, 0 replies; 373+ messages in thread From: Arve Hjønnevåg @ 2009-02-26 4:13 UTC (permalink / raw) To: Linus Torvalds Cc: Jeremy Fitzhardinge, LKML, Jesse Barnes, Thomas Gleixner, Eric W. Biederman, Ingo Molnar, pm list On Wed, Feb 25, 2009 at 7:57 PM, Linus Torvalds <torvalds@linux-foundation.org> wrote: > > > On Wed, 25 Feb 2009, Arve Hjønnevåg wrote: >> >> I don't think this is an oddball case. It is very common to connect >> keys or keypads to gpios. If these keys are wakeup keys, it is not OK >> to lose interrupts during the suspend phase. > > .. and how many drivers is that? Is it one or two "gpio input drivers" or > is it a hundred? > > The "common" is not so much about "how many machines", but "in how many > drivers would you actually do this". We only have one gpio input driver, but I don't think it is good to lose any wakeup interrupts. Any driver that needs an edge triggered wakeup interrupt will have problems if the hardware does not regenerate the interrupt when the host does not respond. It is not hard to work around this problem in the platform specific interrupt code, but I think it is a generic problem worth fixing for every platform. -- Arve Hjønnevåg ^ permalink raw reply [flat|nested] 373+ messages in thread
* Re: [RFC][PATCH 2/2] PM: Rework handling of interrupts during suspend-resume 2009-02-26 4:13 ` Arve Hjønnevåg @ 2009-02-26 4:20 ` Eric W. Biederman 2009-02-26 4:20 ` Eric W. Biederman 1 sibling, 0 replies; 373+ messages in thread From: Eric W. Biederman @ 2009-02-26 4:20 UTC (permalink / raw) To: Arve Hjønnevåg Cc: Jeremy Fitzhardinge, LKML, Jesse Barnes, Thomas Gleixner, Ingo Molnar, Linus Torvalds, pm list Arve Hjønnevåg <arve@android.com> writes: > We only have one gpio input driver, but I don't think it is good to lose > any wakeup interrupts. Any driver that needs an edge triggered wakeup > interrupt will have problems if the hardware does not regenerate the > interrupt when the host does not respond. We are not losing interrupts. The normal implementation of disable is a software disable and sets IRQ_PENDING to ensure we don't lose interrupts when the interrupt is disabled. Eric ^ permalink raw reply [flat|nested] 373+ messages in thread
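The software-disable behaviour Eric describes can be illustrated with a small userspace model. This is only a sketch under stated assumptions: the `sim_` names and flag values are hypothetical and are not the kernel's genirq structures. It shows the idea that an event arriving while the line is disabled is recorded as pending and replayed on enable, rather than dropped.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical flag bits for this model only (not the kernel's values). */
#define SIM_IRQ_DISABLED 0x1u
#define SIM_IRQ_PENDING  0x2u

struct sim_irq_desc {
	unsigned int status;   /* SIM_IRQ_* bits */
	unsigned int handled;  /* how many times the handler has run */
};

/* Software disable: only a status bit changes, the line itself is untouched. */
static void sim_disable_irq(struct sim_irq_desc *desc)
{
	assert(desc != NULL);
	desc->status |= SIM_IRQ_DISABLED;
}

/* Models the hardware raising the line once. */
static void sim_irq_arrives(struct sim_irq_desc *desc)
{
	assert(desc != NULL);
	if (desc->status & SIM_IRQ_DISABLED) {
		/* Do not run the handler now, but remember the event. */
		desc->status |= SIM_IRQ_PENDING;
		return;
	}
	desc->handled++;
}

/* Re-enable and replay an event that arrived while disabled. */
static void sim_enable_irq(struct sim_irq_desc *desc)
{
	assert(desc != NULL);
	desc->status &= ~SIM_IRQ_DISABLED;
	if (desc->status & SIM_IRQ_PENDING) {
		desc->status &= ~SIM_IRQ_PENDING;
		desc->handled++;
	}
}
```

Note that the model also shows Arve's objection in the follow-up: the event is delivered only once something calls the enable path, so nothing in it wakes a system that has already gone to sleep.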
* Re: [RFC][PATCH 2/2] PM: Rework handling of interrupts during suspend-resume 2009-02-26 4:20 ` Eric W. Biederman @ 2009-02-26 4:24 ` Arve Hjønnevåg 2009-02-26 4:24 ` Arve Hjønnevåg 1 sibling, 0 replies; 373+ messages in thread From: Arve Hjønnevåg @ 2009-02-26 4:24 UTC (permalink / raw) To: Eric W. Biederman Cc: Jeremy Fitzhardinge, LKML, Jesse Barnes, Thomas Gleixner, Ingo Molnar, Linus Torvalds, pm list On Wed, Feb 25, 2009 at 8:20 PM, Eric W. Biederman <ebiederm@xmission.com> wrote: > Arve Hjønnevåg <arve@android.com> writes: > >> We only have one gpio input driver, but I don't think it is good to lose >> any wakeup interrupts. Any driver that needs an edge triggered wakeup >> interrupt will have problems if the hardware does not regenerate the >> interrupt when the host does not respond. > > We are not losing interrupts. The normal implementation of disable > is a software disable and sets IRQ_PENDING to ensure we don't lose > interrupts when the interrupt is disabled. We lose the wakeup, but yes, the interrupt will be delivered if the system wakes up for any other reason. -- Arve Hjønnevåg ^ permalink raw reply [flat|nested] 373+ messages in thread
* Re: [RFC][PATCH 2/2] PM: Rework handling of interrupts during suspend-resume 2009-02-26 2:13 ` Arve Hjønnevåg 2009-02-26 2:51 ` Linus Torvalds @ 2009-02-26 2:51 ` Linus Torvalds 1 sibling, 0 replies; 373+ messages in thread From: Linus Torvalds @ 2009-02-26 2:51 UTC (permalink / raw) To: Arve Hjønnevåg Cc: Jeremy Fitzhardinge, LKML, Jesse Barnes, Thomas Gleixner, Eric W. Biederman, Ingo Molnar, pm list On Wed, 25 Feb 2009, Arve Hjønnevåg wrote: > > On the msm platform the keyboard driver currently leaves the interrupts > enabled when suspended. If the interrupt handler is called, we use a > wakelock to abort suspend (without wakelocks you would need to set a > flag and abort in suspend_late instead). If the interrupt occurs after > local_irq_disable, it will still be pending when we get to the suspend > enter hook and suspend will be aborted there. > > As far as I can tell, this change breaks this. If you press a key at > the right time, it will be ignored. Is the irq on a private non-shared interrupt line? If so, you could just mark it as IRQF_TIMER, and the irq disable logic won't touch it. What keyboard driver does this msm thing, btw? Linus ^ permalink raw reply [flat|nested] 373+ messages in thread
* Re: [RFC][PATCH 2/2] PM: Rework handling of interrupts during suspend-resume 2009-02-26 1:27 ` Linus Torvalds 2009-02-26 2:13 ` Arve Hjønnevåg @ 2009-02-26 2:13 ` Arve Hjønnevåg 1 sibling, 0 replies; 373+ messages in thread From: Arve Hjønnevåg @ 2009-02-26 2:13 UTC (permalink / raw) To: Linus Torvalds Cc: Jeremy Fitzhardinge, LKML, Jesse Barnes, Thomas Gleixner, Eric W. Biederman, Ingo Molnar, pm list On Wed, Feb 25, 2009 at 5:27 PM, Linus Torvalds <torvalds@linux-foundation.org> wrote: > > > On Wed, 25 Feb 2009, Arve Hjønnevåg wrote: >> >> Do you plan to fix edge triggered wakeup interrupts? It still looks >> like edge triggered wakeup interrupts that occur between >> suspend_device_irqs and local_irq_disable will not cause a wakeup. > > IF we ever see this as a real issue, we can either see it in the > IRQ_PENDING flag, or we can mark such interrupts specially. So it would be > solvable. That said, I haven't actually heard any real usage cases. Normal > wakeup events are _not_ interrupts in the regular "device interrupt > controller" sense. > > So can you actually point to an explicit example of something where this > is a real issue? On the msm platform the keyboard driver currently leaves the interrupts enabled when suspended. If the interrupt handler is called, we use a wakelock to abort suspend (without wakelocks you would need to set a flag and abort in suspend_late instead). If the interrupt occurs after local_irq_disable, it will still be pending when we get to the suspend enter hook and suspend will be aborted there. As far as I can tell, this change breaks this. If you press a key at the right time, it will be ignored. -- Arve Hjønnevåg ^ permalink raw reply [flat|nested] 373+ messages in thread
* Re: [RFC][PATCH 2/2] PM: Rework handling of interrupts during suspend-resume 2009-02-26 1:17 ` Arve Hjønnevåg 2009-02-26 1:27 ` Linus Torvalds 2009-02-26 1:27 ` Linus Torvalds @ 2009-02-26 9:50 ` Rafael J. Wysocki 2009-02-26 20:34 ` Arve Hjønnevåg 2009-02-26 20:34 ` Arve Hjønnevåg 2009-02-26 9:50 ` Rafael J. Wysocki 3 siblings, 2 replies; 373+ messages in thread From: Rafael J. Wysocki @ 2009-02-26 9:50 UTC (permalink / raw) To: Arve Hjønnevåg Cc: Ingo Molnar, Linus Torvalds, Eric W. Biederman, LKML, Benjamin Herrenschmidt, Jeremy Fitzhardinge, pm list, Len Brown, Jesse Barnes, Thomas Gleixner On Thursday 26 February 2009, Arve Hjønnevåg wrote: > On Tue, Feb 24, 2009 at 3:29 PM, Rafael J. Wysocki <rjw@sisk.pl> wrote: > > BTW, appended is the current (3rd) version of the $subject patch with some > > of your comments taken into account. In particular, I did the following: > > - moved [suspend|resume]_device_irqs() to a separate file (pm.c) > > - fixed interrupt.h so that their headers are at a better place > > - made enable_irq() clear IRQ_SUSPENDED > > - made device_power_down() and device_power_up() call > > suspend_device_irqs() and resume_device_irqs(), respectively, which > > simplified the callers quite a bit (it changed the Xen code ordering, though, > > but I _think_ it still should work). > > Do you plan to fix edge triggered wakeup interrupts? It still looks > like edge triggered wakeup interrupts that occur between > suspend_device_irqs and local_irq_disable will not cause a wakeup. In the current version of the patch the interrupts that have IRQ_WAKEUP set in status are not disabled. Is this not enough? Rafael ^ permalink raw reply [flat|nested] 373+ messages in thread
* Re: [RFC][PATCH 2/2] PM: Rework handling of interrupts during suspend-resume 2009-02-26 9:50 ` Rafael J. Wysocki @ 2009-02-26 20:34 ` Arve Hjønnevåg 2009-02-26 20:57 ` Benjamin Herrenschmidt ` (2 more replies) 2009-02-26 20:34 ` Arve Hjønnevåg 1 sibling, 3 replies; 373+ messages in thread From: Arve Hjønnevåg @ 2009-02-26 20:34 UTC (permalink / raw) To: Rafael J. Wysocki Cc: Ingo Molnar, Linus Torvalds, Eric W. Biederman, LKML, Benjamin Herrenschmidt, Jeremy Fitzhardinge, pm list, Len Brown, Jesse Barnes, Thomas Gleixner On Thu, Feb 26, 2009 at 1:50 AM, Rafael J. Wysocki <rjw@sisk.pl> wrote: > On Thursday 26 February 2009, Arve Hjønnevåg wrote: >> On Tue, Feb 24, 2009 at 3:29 PM, Rafael J. Wysocki <rjw@sisk.pl> wrote: >> > BTW, appended is the current (3rd) version of the $subject patch with some >> > of your comments taken into account. In particular, I did the following: >> > - moved [suspend|resume]_device_irqs() to a separate file (pm.c) >> > - fixed interrupt.h so that their headers are at a better place >> > - made enable_irq() clear IRQ_SUSPENDED >> > - made device_power_down() and device_power_up() call >> > suspend_device_irqs() and resume_device_irqs(), respectively, which >> > simplified the callers quite a bit (it changed the Xen code ordering, though, >> > but I _think_ it still should work). >> >> Do you plan to fix edge triggered wakeup interrupts? It still looks >> like edge triggered wakeup interrupts that occur between >> suspend_device_irqs and local_irq_disable will not cause a wakeup. > > In the current version of the patch the interrupts that have IRQ_WAKEUP set > in status are not disabled. Is this not enough? That is enough for drivers that use wakelocks to abort suspend (if I fix the wakelock code to not use a platform device as its last abort point). It is not enough if you don't have wakelocks, since the interrupt can occur after suspend_late has been called and the driver has no way to abort suspend. 
-- Arve Hjønnevåg ^ permalink raw reply [flat|nested] 373+ messages in thread
* Re: [RFC][PATCH 2/2] PM: Rework handling of interrupts during suspend-resume 2009-02-26 20:34 ` Arve Hjønnevåg @ 2009-02-26 20:57 ` Benjamin Herrenschmidt 2009-02-26 21:58 ` Rafael J. Wysocki 2009-02-26 21:58 ` Rafael J. Wysocki 2 siblings, 0 replies; 373+ messages in thread From: Benjamin Herrenschmidt @ 2009-02-26 20:57 UTC (permalink / raw) To: Arve Hjønnevåg Cc: Rafael J. Wysocki, Ingo Molnar, Linus Torvalds, Eric W. Biederman, LKML, Jeremy Fitzhardinge, pm list, Len Brown, Jesse Barnes, Thomas Gleixner On Thu, 2009-02-26 at 12:34 -0800, Arve Hjønnevåg wrote: > That is enough for drivers that use wakelocks to abort suspend (if I > fix the wakelock code to not use a platform device as its last abort > point). It is not enough if you don't have wakelocks, since the > interrupt can occur after suspend_late has been called and the driver > has no way to abort suspend. > I still don't quite see how you deal with the race anyway. I.e. even without Rafael's patch, what if the interrupt occurs after your sysdev suspend? In general, unless they are level sensitive, wakeup interrupts tend to always be somewhat racy. Cheers, Ben. ^ permalink raw reply [flat|nested] 373+ messages in thread
* Re: [RFC][PATCH 2/2] PM: Rework handling of interrupts during suspend-resume 2009-02-26 20:57 ` Benjamin Herrenschmidt (?) @ 2009-02-26 21:20 ` Arve Hjønnevåg 2009-02-26 21:49 ` Benjamin Herrenschmidt 2009-02-26 21:49 ` Benjamin Herrenschmidt -1 siblings, 2 replies; 373+ messages in thread From: Arve Hjønnevåg @ 2009-02-26 21:20 UTC (permalink / raw) To: Benjamin Herrenschmidt Cc: Rafael J. Wysocki, Ingo Molnar, Linus Torvalds, Eric W. Biederman, LKML, Jeremy Fitzhardinge, pm list, Len Brown, Jesse Barnes, Thomas Gleixner On Thu, Feb 26, 2009 at 12:57 PM, Benjamin Herrenschmidt <benh@kernel.crashing.org> wrote: > On Thu, 2009-02-26 at 12:34 -0800, Arve Hjønnevåg wrote: >> That is enough for drivers that use wakelocks to abort suspend (if I >> fix the wakelock code to not use a platform device as its last abort >> point). It is not enough if you don't have wakelocks, since the >> interrupt can occur after suspend_late has been called and the driver >> has no way to abort suspend. >> > I still don't quite see how you deal with the race anyway. I.e. even > without Rafael's patch, what if the interrupt occurs after your sysdev > suspend? After local_irq_disable has been called, the interrupt will no longer be cleared by Linux when it occurs. This means that it is still pending when you get to the low level suspend code, which will prevent suspend. > In general, unless they are level sensitive, wakeup interrupts tend to > always be somewhat racy. They don't have to be. If you have a separate hardware component that tracks wakeup interrupts, you need to start this before you stop the main interrupt controller. If any interrupts are pending at this time you abort suspend. After a wakeup you do the reverse. -- Arve Hjønnevåg ^ permalink raw reply [flat|nested] 373+ messages in thread
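The ordering Arve describes for platforms with a separate wakeup tracker can be sketched as follows. This is a minimal userspace model, not platform code: `struct sim_platform` and both functions are hypothetical names introduced for illustration, and a single boolean stands in for whatever the real hardware would latch.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a platform with two interrupt paths. */
struct sim_platform {
	bool wake_tracker_armed;  /* separate component that tracks wakeup interrupts */
	bool main_pic_running;    /* main interrupt controller */
	bool wake_event_pending;  /* a wakeup interrupt is currently pending */
};

/* Suspend side: arm the tracker, check for pending events, then stop the PIC. */
static int sim_platform_suspend(struct sim_platform *p)
{
	assert(p != NULL);

	/* Start the wakeup tracker before the main controller is stopped. */
	p->wake_tracker_armed = true;

	/* If anything is pending at this point, abort the suspend. */
	if (p->wake_event_pending) {
		p->wake_tracker_armed = false;
		return -EBUSY;
	}

	/* Only now is it safe to stop the main interrupt controller. */
	p->main_pic_running = false;
	return 0;
}

/* Resume side: the reverse order. */
static void sim_platform_resume(struct sim_platform *p)
{
	assert(p != NULL);
	p->main_pic_running = true;
	p->wake_tracker_armed = false;
}
```

The point of the ordering is that there is no window in which neither component is watching the wakeup line; an event is either seen by the main controller, caught by the pending check, or latched by the tracker.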
* Re: [RFC][PATCH 2/2] PM: Rework handling of interrupts during suspend-resume 2009-02-26 21:20 ` Arve Hjønnevåg @ 2009-02-26 21:49 ` Benjamin Herrenschmidt 2009-02-26 21:49 ` Benjamin Herrenschmidt 1 sibling, 0 replies; 373+ messages in thread From: Benjamin Herrenschmidt @ 2009-02-26 21:49 UTC (permalink / raw) To: Arve Hjønnevåg Cc: Jeremy Fitzhardinge, LKML, Jesse Barnes, Thomas Gleixner, Eric W. Biederman, Ingo Molnar, Linus Torvalds, pm list On Thu, 2009-02-26 at 13:20 -0800, Arve Hjønnevåg wrote: > On Thu, Feb 26, 2009 at 12:57 PM, Benjamin Herrenschmidt > <benh@kernel.crashing.org> wrote: > > On Thu, 2009-02-26 at 12:34 -0800, Arve Hjønnevåg wrote: > >> That is enough for drivers that use wakelocks to abort suspend (if I > >> fix the wakelock code to not use a platform device as its last abort > >> point). It is not enough if you don't have wakelocks, since the > >> interrupt can occur after suspend_late has been called and the driver > >> has no way to abort suspend. > >> > > I still don't quite see how you deal with the race anyway. I.e. even > > without Rafael's patch, what if the interrupt occurs after your sysdev > > suspend? > > After local_irq_disable has been called, the interrupt will no longer > be cleared by Linux when it occurs. This means that it is still pending > when you get to the low level suspend code, which will prevent suspend. OK, so you want this interrupt to stay pending at the PIC level? So just marking it so the kernel doesn't disable it should do the trick. > > In general, unless they are level sensitive, wakeup interrupts tend to > > always be somewhat racy. > > They don't have to be. If you have a separate hardware component that > tracks wakeup interrupts, you need to start this before you stop the > main interrupt controller. If any interrupts are pending at this time > you abort suspend. After a wakeup you do the reverse. Right, but then you can start this earlier and there is no problem. 
But if you do want the interrupt to remain pending in the PIC, then you probably need to set that magic flag so we don't disable it; that should do the trick just fine, no? It's hard to tell without more detailed HW specs of course... Ben. ^ permalink raw reply [flat|nested] 373+ messages in thread
* Re: [RFC][PATCH 2/2] PM: Rework handling of interrupts during suspend-resume 2009-02-26 20:34 ` Arve Hjønnevåg 2009-02-26 20:57 ` Benjamin Herrenschmidt @ 2009-02-26 21:58 ` Rafael J. Wysocki 2009-02-26 22:10 ` Linus Torvalds 2009-02-26 22:10 ` Linus Torvalds 2009-02-26 21:58 ` Rafael J. Wysocki 2 siblings, 2 replies; 373+ messages in thread From: Rafael J. Wysocki @ 2009-02-26 21:58 UTC (permalink / raw) To: Arve Hjønnevåg Cc: Ingo Molnar, Linus Torvalds, Eric W. Biederman, LKML, Benjamin Herrenschmidt, Jeremy Fitzhardinge, pm list, Len Brown, Jesse Barnes, Thomas Gleixner On Thursday 26 February 2009, Arve Hjønnevåg wrote: > On Thu, Feb 26, 2009 at 1:50 AM, Rafael J. Wysocki <rjw@sisk.pl> wrote: > > On Thursday 26 February 2009, Arve Hjønnevåg wrote: > >> On Tue, Feb 24, 2009 at 3:29 PM, Rafael J. Wysocki <rjw@sisk.pl> wrote: > >> > BTW, appended is the current (3rd) version of the $subject patch with some > >> > of your comments taken into account. In particular, I did the following: > >> > - moved [suspend|resume]_device_irqs() to a separate file (pm.c) > >> > - fixed interrupt.h so that their headers are at a better place > >> > - made enable_irq() clear IRQ_SUSPENDED > >> > - made device_power_down() and device_power_up() call > >> > suspend_device_irqs() and resume_device_irqs(), respectively, which > >> > simplified the callers quite a bit (it changed the Xen code ordering, though, > >> > but I _think_ it still should work). > >> > >> Do you plan to fix edge triggered wakeup interrupts? It still looks > >> like edge triggered wakeup interrupts that occur between > >> suspend_device_irqs and local_irq_disable will not cause a wakeup. > > > > In the current version of the patch the interrupts that have IRQ_WAKEUP set > > in status are not disabled. Is this not enough? > > That is enough for drivers that use wakelocks to abort suspend (if I > fix the wakelock code to not use a platform device as its last abort > point). 
It is not enough if you don't have wakelocks, since the > interrupt can occur after suspend_late has been called and the driver > has no way to abort suspend. Well, how exactly the $subject patch does cause this problem to happen? Rafael ^ permalink raw reply [flat|nested] 373+ messages in thread
* Re: [RFC][PATCH 2/2] PM: Rework handling of interrupts during suspend-resume 2009-02-26 21:58 ` Rafael J. Wysocki @ 2009-02-26 22:10 ` Linus Torvalds 2009-02-26 22:10 ` Linus Torvalds 1 sibling, 0 replies; 373+ messages in thread From: Linus Torvalds @ 2009-02-26 22:10 UTC (permalink / raw) To: Rafael J. Wysocki Cc: Jeremy Fitzhardinge, LKML, Jesse Barnes, Thomas Gleixner, Eric W. Biederman, Ingo Molnar, pm list On Thu, 26 Feb 2009, Rafael J. Wysocki wrote: > > Well, how exactly the $subject patch does cause this problem to happen? Rafael, the problem is that if an interrupt happens while it's disabled - but before the CPU has actually turned all interrupts off - the CPU will ACK the interrupt (but just set a flag for it being PENDING), so now the chipset logic around it will not see it as pending any more, so now the chipset won't auto-wake the CPU immediately (or more likely, it won't even suspend it). It's trivial to fix multiple ways, so I wouldn't worry. The most trivial way is to just have some sysdev driver code simply do something like

	static int sysdev_suspend()
	{
		for_each_irq(irq,desc) {
			if (!(desc->flags & IRQF_WAKE))
				continue;
			if (desc->flags & IRQ_PENDING)
				return -EBUSY;
		}
		return 0;
	}

and that should automatically mean that if any irq is pending, the suspend will fail and we'll immediately wake up again. It looks trivial, and I don't understand why Arve can't just do the sysdev thing. Linus ^ permalink raw reply [flat|nested] 373+ messages in thread
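Linus's snippet is pseudocode (`for_each_irq`, `IRQF_WAKE` and a `flags` field are used loosely there). A self-contained userspace rendering of the same check might look like the sketch below; the `sim_` names and flag values are assumptions made for this illustration and do not correspond to kernel definitions.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical status bits for this model only. */
#define SIM_IRQ_WAKEUP  0x1u  /* line is configured as a wakeup source */
#define SIM_IRQ_PENDING 0x2u  /* an event arrived while the line was disabled */

struct sim_irq_desc {
	unsigned int status;
};

/*
 * Walk all descriptors and refuse to suspend if any wakeup-capable
 * line has an event recorded as pending. Returns 0 if suspend may
 * proceed, -EBUSY otherwise.
 */
static int sim_check_wakeup_irqs(const struct sim_irq_desc *descs, size_t n)
{
	size_t i;

	assert(descs != NULL || n == 0);

	for (i = 0; i < n; i++) {
		/* Non-wakeup lines never block suspend. */
		if (!(descs[i].status & SIM_IRQ_WAKEUP))
			continue;
		if (descs[i].status & SIM_IRQ_PENDING)
			return -EBUSY;
	}
	return 0;
}
```

In this model a pending event on a non-wakeup line is ignored on purpose, which matches the `continue` in the snippet above: only lines marked as wakeup sources are allowed to abort the suspend.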
* Re: [RFC][PATCH 2/2] PM: Rework handling of interrupts during suspend-resume 2009-02-26 22:10 ` Linus Torvalds @ 2009-02-26 22:30 ` Arve Hjønnevåg 2009-02-26 22:30 ` Arve Hjønnevåg ` (2 subsequent siblings) 3 siblings, 0 replies; 373+ messages in thread From: Arve Hjønnevåg @ 2009-02-26 22:30 UTC (permalink / raw) To: Linus Torvalds Cc: Jeremy Fitzhardinge, LKML, Jesse Barnes, Thomas Gleixner, Eric W. Biederman, Ingo Molnar, pm list On Thu, Feb 26, 2009 at 2:10 PM, Linus Torvalds <torvalds@linux-foundation.org> wrote: > > > On Thu, 26 Feb 2009, Rafael J. Wysocki wrote: >> >> Well, how exactly the $subject patch does cause this problem to happen? > > Rafael, the problem is that if an interrupt happens while it's disabled - > but before the CPU has actually turned all interrupts off - the CPU will > ACK the interrupt (but just set a flag for it being PENDING), so now the > chipset logic around it will not see it as pending any more, so now the > chipset won't auto-wake the CPU immediately (or more likely, it won't > even suspend it). > > It's trivial to fix multiple ways, so I wouldn't worry. The most trivial > way is to just have some sysdev driver code simply do something like > > static int sysdev_suspend() > { > for_each_irq(irq,desc) { > if (!(desc->flags & IRQF_WAKE)) > continue; > if (desc->flags & IRQ_PENDING) > return -EBUSY; > } > return 0; > } > > and that should automatically mean that if any irq is pending, the suspend > will fail and we'll immediately wake up again. > > It looks trivial, and I don't understand why Arve can't just do the sysdev > thing. I can. My point is that the patch breaks our existing code. If anyone else uses edge triggered wakeup interrupts, it may break for them as well. The main question is whether this should be fixed separately for every platform that needs it, or whether pending wakeup interrupts should always abort sleep. -- Arve Hjønnevåg ^ permalink raw reply [flat|nested] 373+ messages in thread
* Re: [RFC][PATCH 2/2] PM: Rework handling of interrupts during suspend-resume 2009-02-26 22:30 ` Arve Hjønnevåg @ 2009-02-26 23:10 ` Rafael J. Wysocki 2009-02-26 23:10 ` Rafael J. Wysocki 1 sibling, 0 replies; 373+ messages in thread From: Rafael J. Wysocki @ 2009-02-26 23:10 UTC (permalink / raw) To: Arve Hjønnevåg Cc: Jeremy Fitzhardinge, LKML, Jesse Barnes, Thomas Gleixner, Eric W. Biederman, Ingo Molnar, Linus Torvalds, pm list

On Thursday 26 February 2009, Arve Hjønnevåg wrote:
> On Thu, Feb 26, 2009 at 2:10 PM, Linus Torvalds
> <torvalds@linux-foundation.org> wrote:
> >
> >
> > On Thu, 26 Feb 2009, Rafael J. Wysocki wrote:
> >>
> >> Well, how exactly does the $subject patch cause this problem to happen?
> >
> > Rafael, the problem is that if an interrupt happens while it's disabled -
> > but before the CPU has actually turned all interrupts off - the CPU will
> > ACK the interrupt (but just set a flag for it being PENDING), so now the
> > chipset logic around it will not see it as pending any more, so now the
> > chipset won't auto-wake the CPU immediately (or more likely, it won't
> > even suspend it).
> >
> > It's trivial to fix multiple ways, so I wouldn't worry. The most trivial
> > way is to just have some sysdev driver code simply do something like
> >
> >        static int sysdev_suspend()
> >        {
> >                for_each_irq(irq,desc) {
> >                        if (!(desc->flags & IRQF_WAKE))
> >                                continue;
> >                        if (desc->flags & IRQ_PENDING)
> >                                return -EBUSY;
> >                }
> >                return 0;
> >        }
> >
> > and that should automatically mean that if any irq is pending, the suspend
> > will fail and we'll immediately wake up again.
> >
> > It looks trivial, and I don't understand why Arve can't just do the sysdev
> > thing.
>
> I can. My point is that the patch breaks our existing code.

Is that mainline kernel code?

> If anyone else uses edge-triggered wakeup interrupts it may break for them as
> well. The main question is whether this should be fixed separately for every
> platform that needs it, or whether pending wakeup interrupts should always
> abort sleep.

Well, I'm not really sure if this is the problem. In fact the problem is that you have a regular device the interrupt of which can be a wake-up one. I think the problem wouldn't have existed at all if it had been a sysdev. Is that correct?

Rafael

^ permalink raw reply [flat|nested] 373+ messages in thread
* Re: [RFC][PATCH 2/2] PM: Rework handling of interrupts during suspend-resume 2009-02-26 23:10 ` Rafael J. Wysocki @ 2009-02-27 0:00 ` Arve Hjønnevåg 2009-02-27 0:27 ` Linus Torvalds 2009-02-27 0:27 ` Linus Torvalds 2009-02-27 0:00 ` Arve Hjønnevåg 1 sibling, 2 replies; 373+ messages in thread From: Arve Hjønnevåg @ 2009-02-27 0:00 UTC (permalink / raw) To: Rafael J. Wysocki Cc: Linus Torvalds, Ingo Molnar, Eric W. Biederman, LKML, Benjamin Herrenschmidt, Jeremy Fitzhardinge, pm list, Len Brown, Jesse Barnes, Thomas Gleixner

On Thu, Feb 26, 2009 at 3:10 PM, Rafael J. Wysocki <rjw@sisk.pl> wrote:
> On Thursday 26 February 2009, Arve Hjønnevåg wrote:
>> On Thu, Feb 26, 2009 at 2:10 PM, Linus Torvalds
>> <torvalds@linux-foundation.org> wrote:
>> >
>> >
>> > On Thu, 26 Feb 2009, Rafael J. Wysocki wrote:
>> >>
>> >> Well, how exactly does the $subject patch cause this problem to happen?
>> >
>> > Rafael, the problem is that if an interrupt happens while it's disabled -
>> > but before the CPU has actually turned all interrupts off - the CPU will
>> > ACK the interrupt (but just set a flag for it being PENDING), so now the
>> > chipset logic around it will not see it as pending any more, so now the
>> > chipset won't auto-wake the CPU immediately (or more likely, it won't
>> > even suspend it).
>> >
>> > It's trivial to fix multiple ways, so I wouldn't worry. The most trivial
>> > way is to just have some sysdev driver code simply do something like
>> >
>> >        static int sysdev_suspend()
>> >        {
>> >                for_each_irq(irq,desc) {
>> >                        if (!(desc->flags & IRQF_WAKE))
>> >                                continue;
>> >                        if (desc->flags & IRQ_PENDING)
>> >                                return -EBUSY;
>> >                }
>> >                return 0;
>> >        }
>> >
>> > and that should automatically mean that if any irq is pending, the suspend
>> > will fail and we'll immediately wake up again.
>> >
>> > It looks trivial, and I don't understand why Arve can't just do the sysdev
>> > thing.
>>
>> I can. My point is that the patch breaks our existing code.
>
> Is that mainline kernel code?

No, the msm suspend support has not been merged.

>
>> If anyone else uses edge-triggered wakeup interrupts it may break for them as
>> well. The main question is whether this should be fixed separately for every
>> platform that needs it, or whether pending wakeup interrupts should always
>> abort sleep.
>
> Well, I'm not really sure if this is the problem. In fact the problem is that
> you have a regular device the interrupt of which can be a wake-up one. I think

Is that not a common case, and what enable_irq_wake is for?

> the problem wouldn't have existed at all if it had been a sysdev. Is that
> correct?

How many sysdevs use interrupts?

I found many drivers in the mainline kernel that use enable_irq_wake, but I did not see any that handle this race condition.

-- 
Arve Hjønnevåg

^ permalink raw reply [flat|nested] 373+ messages in thread
* Re: [RFC][PATCH 2/2] PM: Rework handling of interrupts during suspend-resume 2009-02-27 0:00 ` Arve Hjønnevåg @ 2009-02-27 0:27 ` Linus Torvalds 2009-02-27 3:20 ` [linux-pm] " Alan Stern 2009-02-27 3:20 ` Alan Stern 2009-02-27 0:27 ` Linus Torvalds 1 sibling, 2 replies; 373+ messages in thread From: Linus Torvalds @ 2009-02-27 0:27 UTC (permalink / raw) To: Arve Hjønnevåg Cc: Rafael J. Wysocki, Ingo Molnar, Eric W. Biederman, LKML, Benjamin Herrenschmidt, Jeremy Fitzhardinge, pm list, Len Brown, Jesse Barnes, Thomas Gleixner

On Thu, 26 Feb 2009, Arve Hjønnevåg wrote:
>
> How many sysdevs use interrupts?
>
> I found many drivers in the mainline kernel that use enable_irq_wake,
> but I did not see any that handle this race condition.

The _only_ driver that does enable_irq_wake() on x86 is the cmos timer driver, and even there it actually doesn't use irq_wake, but ACPI. Why? Because I don't think irq wakeup even _works_ on x86.

So the whole enable_irq_wake is largely some embedded ARM platform issue, and a very special case, and doesn't exist anywhere else.

Maybe I'm missing something, but it's definitely not the normal case.

                Linus

^ permalink raw reply [flat|nested] 373+ messages in thread
* Re: [linux-pm] [RFC][PATCH 2/2] PM: Rework handling of interrupts during suspend-resume 2009-02-27 0:27 ` Linus Torvalds @ 2009-02-27 3:20 ` Alan Stern 2009-02-27 4:43 ` Linus Torvalds 2009-02-27 3:20 ` Alan Stern 1 sibling, 1 reply; 373+ messages in thread From: Alan Stern @ 2009-02-27 3:20 UTC (permalink / raw) To: Linus Torvalds Cc: Arve Hjønnevåg, Jeremy Fitzhardinge, LKML, Jesse Barnes, Thomas Gleixner, Eric W. Biederman, Ingo Molnar, pm list

On Thu, 26 Feb 2009, Linus Torvalds wrote:
> The _only_ driver that does enable_irq_wake() on x86 is the cmos timer
> driver, and even there it actually doesn't use irq_wake, but ACPI. Why?
> Because I don't think irq wakeup even _works_ on x86.
>
> So the whole enable_irq_wake is largely some embedded ARM platform issue,
> and a very special case, and doesn't exist anywhere else.
>
> Maybe I'm missing something, but it's definitely not the normal case.

What you're missing is that the embedded world is quite a large one. As any member of CELF will tell you, there are lots more embedded systems around than there are desktop/laptop computers. (I admit, I don't know what the ratio is if you restrict your attention to systems running Linux.) We can't afford to regard them as second-class citizens.

Plenty of embedded systems use normal interrupts from GPIO lines as wakeup sources. Don't discount the need for this just because desktop systems don't use them that way. It may not be "normal" in the circles you're accustomed to, but it _is_ normal elsewhere.

Alan Stern

^ permalink raw reply [flat|nested] 373+ messages in thread
* Re: [linux-pm] [RFC][PATCH 2/2] PM: Rework handling of interrupts during suspend-resume 2009-02-27 3:20 ` [linux-pm] " Alan Stern @ 2009-02-27 4:43 ` Linus Torvalds 0 siblings, 0 replies; 373+ messages in thread From: Linus Torvalds @ 2009-02-27 4:43 UTC (permalink / raw) To: Alan Stern Cc: Arve Hjønnevåg, Jeremy Fitzhardinge, LKML, Jesse Barnes, Thomas Gleixner, Eric W. Biederman, Ingo Molnar, pm list

On Thu, 26 Feb 2009, Alan Stern wrote:
>
> What you're missing is that the embedded world is quite a large one.

I'm going to give you one more clue, and if you don't stop sending out these IDIOTIC emails, I'm going to put you into my killfile.

Got it?

So listen up:
 - the number of ARM chips sold doesn't matter one F*CKING WHIT.
 - You need to add ONE SINGLE "sysdev" entry for ARM to take care of this FOR EVERY DAMN SINGLE ONE.
 - Your inane whining about this AFTER I HAVE TOLD YOU MULTIPLE TIMES HOW TO DO IT, AND AFTER I HAVE TOLD YOU THAT IT'S A SPECIAL CASE, IS F*CKING IRRITATING.

Got it?

I _grepped_ for that enable_irq_wake() use. It looks like it's only used on ARM and maybe BF. Add the five lines of code (just cut and paste them from my earlier email) to your architecture already, AND STOP WHINING.

It's not a generic case. It's not a problem. You can damn well fix it in the ONE SINGLE ARCHITECTURE (or maybe two) that cares. I've told you how.

Why is it so damn hard for you to just accept?

                Linus

^ permalink raw reply [flat|nested] 373+ messages in thread
* Re: [linux-pm] [RFC][PATCH 2/2] PM: Rework handling of interrupts during suspend-resume 2009-02-27 4:43 ` Linus Torvalds (?) @ 2009-02-27 14:59 ` Alan Stern 2009-02-27 20:30 ` Linus Torvalds 2009-02-27 20:30 ` [linux-pm] " Linus Torvalds -1 siblings, 2 replies; 373+ messages in thread From: Alan Stern @ 2009-02-27 14:59 UTC (permalink / raw) To: Linus Torvalds Cc: Arve Hjønnevåg, Jeremy Fitzhardinge, LKML, Jesse Barnes, Thomas Gleixner, Eric W. Biederman, Ingo Molnar, pm list

On Thu, 26 Feb 2009, Linus Torvalds wrote:
> On Thu, 26 Feb 2009, Alan Stern wrote:
> >
> > What you're missing is that the embedded world is quite a large one.
>
> I'm going to give you one more clue, and if you don't stop sending out
> these IDIOTIC emails, I'm going to put you into my killfile.
>
> Got it?

Whoa!! Hold on there! You got too angry too quickly. I'm Alan Stern, not Arve Hjønnevåg; that was the first email I've sent on this topic. And while perhaps it was idiotic, you shouldn't put the blame for it on Arve.

> So listen up:
>  - the number of ARM chips sold doesn't matter one F*CKING WHIT.
>  - You need to add ONE SINGLE "sysdev" entry for ARM to take care of this
>    FOR EVERY DAMN SINGLE ONE.
>  - Your inane whining about this AFTER I HAVE TOLD YOU MULTIPLE TIMES HOW
>    TO DO IT, AND AFTER I HAVE TOLD YOU THAT IT'S A SPECIAL CASE, IS
>    F*CKING IRRITATING.
>
> Got it?
>
> I _grepped_ for that enable_irq_wake() use. It looks like it's only used
> on ARM and maybe BF. Add the five lines of code (just cut and paste them
> from my earlier email) to your architecture already, AND STOP WHINING.

Really? Let's see (this is using Greg KH's development tree):

$ find . -name '*.[ch]' | xargs grep enable_irq_wake
./drivers/serial/serial_core.c: enable_irq_wake(port->irq);
./drivers/usb/gadget/at91_udc.c: enable_irq_wake(udc->udp_irq);
./drivers/usb/gadget/at91_udc.c: enable_irq_wake(udc->board.vbus_pin);
./drivers/usb/musb/musb_core.c: if (enable_irq_wake(nIrq) == 0) {
./drivers/usb/host/ohci-at91.c: enable_irq_wake(hcd->irq);
./drivers/input/serio/sa1111ps2.c: enable_irq_wake(ps2if->dev->irq[0]);
./drivers/input/keyboard/gpio_keys.c: enable_irq_wake(irq);
./drivers/input/keyboard/pxa27x_keypad.c: enable_irq_wake(keypad->irq);
./drivers/input/keyboard/bf54x-keys.c: enable_irq_wake(bf54x_kpad->irq);
./drivers/pcmcia/at91_cf.c: enable_irq_wake(board->det_pin);
./drivers/pcmcia/at91_cf.c: enable_irq_wake(board->irq_pin);
./drivers/mmc/host/at91_mci.c: enable_irq_wake(host->board->det_pin);
./drivers/mfd/htc-egpio.c: enable_irq_wake(ei->chained_irq);
./drivers/mfd/pcf50633-core.c: if (enable_irq_wake(client->irq) < 0)
./drivers/rtc/rtc-sa1100.c: enable_irq_wake(IRQ_RTCAlrm);
./drivers/rtc/rtc-omap.c: enable_irq_wake(omap_rtc_alarm);
./drivers/rtc/rtc-s3c.c: enable_irq_wake(s3c_rtc_alarmno);
./drivers/rtc/rtc-at91rm9200.c: enable_irq_wake(AT91_ID_SYS);
./drivers/rtc/rtc-cmos.c: enable_irq_wake(cmos->irq);
./drivers/rtc/rtc-bfin.c: enable_irq_wake(IRQ_RTC);
./drivers/rtc/rtc-ds1374.c: enable_irq_wake(client->irq);
./drivers/rtc/rtc-at91sam9.c: enable_irq_wake(AT91_ID_SYS);
./drivers/rtc/rtc-pxa.c: enable_irq_wake(pxa_rtc->irq_Alrm);
./drivers/power/pda_power.c: ac_wakeup_enabled = !enable_irq_wake(ac_irq->start);
./drivers/power/pda_power.c: usb_wakeup_enabled = !enable_irq_wake(usb_irq->start);
./arch/arm/mach-sa1100/neponset.c: enable_irq_wake(IRQ_GPIO25);
./arch/arm/mach-s3c2410/mach-amlm5900.c: enable_irq_wake(IRQ_EINT9);
./arch/arm/mach-omap1/board-osk.c: enable_irq_wake(irq);
./arch/arm/mach-omap1/serial.c: enable_irq_wake(gpio_to_irq(gpio_nr));
./arch/arm/plat-omap/gpio.c: enable_irq_wake(bank->irq);
./arch/arm/plat-omap/gpio.c: enable_irq_wake(bank->irq);
./arch/arm/plat-omap/gpio.c:/* Use disable_irq_wake() and enable_irq_wake() functions from drivers */
./include/linux/interrupt.h:static inline int enable_irq_wake(unsigned int irq)
./include/linux/interrupt.h:static inline int enable_irq_wake(unsigned int irq)

Perhaps these aren't all the sort of usage you're talking about, but I bet most of them are. It certainly looks like more than just ARM. Maybe not all that much more, but definitely more. And the number will only grow in the future.

> It's not a generic case. It's not a problem. You can damn well fix it in
> the ONE SINGLE ARCHITECTURE (or maybe two) that cares. I've told you how.

I'm not arguing with your suggestion; I'm merely disagreeing with your statement that wakeup interrupts are "definitely not the normal case".

Alan Stern

^ permalink raw reply [flat|nested] 373+ messages in thread
* Re: [RFC][PATCH 2/2] PM: Rework handling of interrupts during suspend-resume 2009-02-27 14:59 ` [linux-pm] " Alan Stern @ 2009-02-27 20:30 ` Linus Torvalds 2009-02-27 20:30 ` [linux-pm] " Linus Torvalds 1 sibling, 0 replies; 373+ messages in thread From: Linus Torvalds @ 2009-02-27 20:30 UTC (permalink / raw) To: Alan Stern Cc: Jeremy Fitzhardinge, LKML, Jesse Barnes, Eric W. Biederman, pm list, Thomas Gleixner, Ingo Molnar

On Fri, 27 Feb 2009, Alan Stern wrote:
>
> Perhaps these aren't all the sort of usage you're talking about, but I
> bet most of them are. It certainly looks like more than just ARM.
> Maybe not all that much more, but definitely more. And the number will
> only grow in the future.

Are you really sure? Because it can't be x86. I'm pretty sure that that is simply not how x86 wake events _work_ - they're not interrupts.

And that's the big point that people seem to be missing here: the whole "wake up interrupt" thing is not some generic model in the first place. I strongly suspect that it literally only works on certain architectures.

In other words, I'm getting damn tired of people who CLEARLY DON'T EVEN KNOW HOW THE HARDWARE WORKS arguing over this.

Here's another hint: that whole "enable_irq_wake()" - have you possibly spent even five seconds to look at what it actually does? I bet you haven't. Because what it does is to call the irq controller "set_wake" function.

Now, grep for that. What do you find? Like maybe ARM and BlackFin? Oh, and I note one MIPS platform.

The point is, that whole "irq wake" really is system dependent. We can do helper functions for it, but anybody who thinks it's anything "generic" is totally mistaken.

In other words, making it a sysdev thing is the CORRECT thing to do. It really is not just "here's how you work around something". It really is "this is how the hardware FUNDAMENTALLY WORKS".

Please, stop arguing. At least argue only after you understand what the physical hardware actually does.

                Linus

^ permalink raw reply [flat|nested] 373+ messages in thread
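The dependence of enable_irq_wake() on a controller-supplied "set_wake" hook, as described in the message above, can be pictured with a small user-space model. This is a sketch under stated assumptions, not the kernel's implementation: every `sketch_*` name and `SKETCH_IRQ_WAKEUP` is hypothetical, and the choice of -ENXIO for a controller without the hook is an assumption made only for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical status bit: irq has been armed as a wakeup source. */
#define SKETCH_IRQ_WAKEUP (1u << 0)

/* Hypothetical interrupt-controller operations. */
struct sketch_irq_chip {
    /* Optional hook; NULL models a controller that cannot arm wakeups. */
    int (*set_wake)(unsigned int irq, unsigned int on);
};

/* Hypothetical per-interrupt state. */
struct sketch_irq {
    unsigned int irq;
    unsigned int status;
    const struct sketch_irq_chip *chip;
};

/*
 * Ask the controller to arm the interrupt as a wakeup source. The wakeup
 * bit is set only if the controller provides set_wake and it succeeds.
 */
static int sketch_enable_irq_wake(struct sketch_irq *d)
{
    int ret;

    assert(d != NULL);

    if (d->chip == NULL || d->chip->set_wake == NULL)
        return -ENXIO;          /* no platform support for irq wakeup */

    ret = d->chip->set_wake(d->irq, 1);
    if (ret == 0)
        d->status |= SKETCH_IRQ_WAKEUP;
    return ret;
}

/* Example hook for a controller that can always arm a wakeup. */
static int sketch_gpio_set_wake(unsigned int irq, unsigned int on)
{
    (void)irq;
    (void)on;
    return 0;
}
```

On a controller without the hook the wakeup bit is never set in this model, so any later check that filters on that bit has nothing to report there.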
* Re: [RFC][PATCH 2/2] PM: Rework handling of interrupts during suspend-resume 2009-02-27 20:30 ` [linux-pm] " Linus Torvalds @ 2009-02-28 3:54 ` Arve Hjønnevåg 2009-02-28 3:54 ` [linux-pm] " Arve Hjønnevåg 1 sibling, 0 replies; 373+ messages in thread From: Arve Hjønnevåg @ 2009-02-28 3:54 UTC (permalink / raw) To: Linus Torvalds Cc: Jeremy Fitzhardinge, LKML, Jesse Barnes, Eric W. Biederman, pm list, Thomas Gleixner, Ingo Molnar

On Fri, Feb 27, 2009 at 12:30 PM, Linus Torvalds <torvalds@linux-foundation.org> wrote:
>
>
> On Fri, 27 Feb 2009, Alan Stern wrote:
>>
>> Perhaps these aren't all the sort of usage you're talking about, but I
>> bet most of them are. It certainly looks like more than just ARM.
>> Maybe not all that much more, but definitely more. And the number will
>> only grow in the future.
>
> Are you really sure? Because it can't be x86. I'm pretty sure that that is
> simply not how x86 wake events _work_ - they're not interrupts.

They are not interrupts on every arm platform that implements set_wake either, but it is useful to pretend that they are. If the platform code reads the wakeup status and marks the corresponding interrupt pending, the driver does not need to know if the event occurred before or after the system entered the low power state. I don't know if this can be implemented on x86, but it might be worth looking into.

>
> And that's the big point that people seem to be missing here: the whole
> "wake up interrupt" thing is not some generic model in the first place. I
> strongly suspect that it literally only works on certain architectures.

My point was that it was not specific to our platform. I don't have a problem fixing our platform if this patch is merged, but this is a case where a change to the generic code breaks some platforms. I don't think there is a good reason to make the fix arm specific, trivial or not, since any platform implementing set_wake may run into the race condition that this patch introduced. If the platform does not implement set_wake, IRQ_WAKEUP never gets set, and the fix should not have any effect.

-- 
Arve Hjønnevåg

^ permalink raw reply [flat|nested] 373+ messages in thread
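The idea of platform code reading the wakeup status and marking the corresponding interrupt pending could look roughly like the following user-space model. It is a sketch only: the 32-bit `wake_status` latch, the one-bit-per-interrupt mapping and all `sketch_*` / `SKETCH_*` names are assumptions made for illustration and do not describe the msm code or any other real platform.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical status bit: the irq has an event waiting to be handled. */
#define SKETCH_IRQ_PENDING     (1u << 1)
/* Hypothetical width of the wakeup status latch. */
#define SKETCH_NR_WAKE_SOURCES 32

/* Hypothetical per-interrupt state. */
struct sketch_irq_state {
    unsigned int status;
};

/*
 * wake_status models a latch register in which bit i is set if wake
 * source i fired. Each set bit marks the matching interrupt pending so
 * that its handler runs later, whether the event arrived before or after
 * the system went to sleep. Returns the number of interrupts marked.
 */
static unsigned int sketch_replay_wake_events(uint32_t wake_status,
                                              struct sketch_irq_state *irqs,
                                              size_t n)
{
    unsigned int marked = 0;

    assert(irqs != NULL || n == 0);

    for (size_t i = 0; i < n && i < SKETCH_NR_WAKE_SOURCES; i++) {
        if (wake_status & (UINT32_C(1) << i)) {
            irqs[i].status |= SKETCH_IRQ_PENDING;
            marked++;
        }
    }
    return marked;
}
```

A driver built on such a scheme only ever sees "my interrupt is pending" and never has to inspect the wakeup latch itself.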
* Re: [RFC][PATCH 2/2] PM: Rework handling of interrupts during suspend-resume 2009-02-28 3:54 ` [linux-pm] " Arve Hjønnevåg @ 2009-02-28 10:06 ` Rafael J. Wysocki 2009-02-28 10:06 ` [linux-pm] " Rafael J. Wysocki 1 sibling, 0 replies; 373+ messages in thread From: Rafael J. Wysocki @ 2009-02-28 10:06 UTC (permalink / raw) To: Arve Hjønnevåg, Linus Torvalds Cc: Jeremy Fitzhardinge, LKML, Jesse Barnes, Eric W. Biederman, pm list, Thomas Gleixner, Ingo Molnar

On Saturday 28 February 2009, Arve Hjønnevåg wrote:
> On Fri, Feb 27, 2009 at 12:30 PM, Linus Torvalds
> <torvalds@linux-foundation.org> wrote:
> >
> >
> > On Fri, 27 Feb 2009, Alan Stern wrote:
> >>
> >> Perhaps these aren't all the sort of usage you're talking about, but I
> >> bet most of them are. It certainly looks like more than just ARM.
> >> Maybe not all that much more, but definitely more. And the number will
> >> only grow in the future.
> >
> > Are you really sure? Because it can't be x86. I'm pretty sure that that is
> > simply not how x86 wake events _work_ - they're not interrupts.
>
> They are not interrupts on every arm platform that implements set_wake
> either, but it is useful to pretend that they are. If the platform
> code reads the wakeup status and marks the corresponding interrupt
> pending, the driver does not need to know if the event occurred before
> or after the system entered the low power state. I don't know if this
> can be implemented on x86, but it might be worth looking into.

That would have been a new feature, no? And I don't think anyone except for you does it.

So, what you're saying boils down to "please don't break my new feature that hasn't been merged yet".

> > And that's the big point that people seem to be missing here: the whole
> > "wake up interrupt" thing is not some generic model in the first place. I
> > strongly suspect that it literally only works on certain architectures.
>
> My point was that it was not specific to our platform. I don't have a
> problem fixing our platform if this patch is merged, but this is a case
> where a change to the generic code breaks some platforms.

Quite frankly, I don't really think it will break anything other than your platform.

Still, if Linus agrees, I can put the loop suggested by him directly into sysdev_suspend(). Linus?

> I don't think there is a good reason to make the fix arm specific, trivial or
> not, since any platform implementing set_wake may run into the race
> condition that this patch introduced. If the platform does not
> implement set_wake, IRQ_WAKEUP never gets set, and the fix should not
> have any effect.

The point is, if we put anything like this into the generic code, platforms start to rely on this and it will become more and more difficult to change at the generic level if need be. The fact that your platform relies on the generic code to disable IRQs on the CPU at a particular point shows the mechanism very well. :-)

Thanks,
Rafael

^ permalink raw reply [flat|nested] 373+ messages in thread
* Re: [linux-pm] [RFC][PATCH 2/2] PM: Rework handling of interrupts during suspend-resume 2009-02-28 10:06 ` [linux-pm] " Rafael J. Wysocki @ 2009-02-28 17:03 ` Linus Torvalds 2009-02-28 22:15 ` [linux-pm] " Arve Hjønnevåg 2009-02-28 22:15 ` Arve Hjønnevåg 2 siblings, 0 replies; 373+ messages in thread From: Linus Torvalds @ 2009-02-28 17:03 UTC (permalink / raw) To: Rafael J. Wysocki Cc: Arve Hjønnevåg, Alan Stern, Jeremy Fitzhardinge, LKML, Jesse Barnes, Thomas Gleixner, Eric W. Biederman, Ingo Molnar, pm list On Sat, 28 Feb 2009, Rafael J. Wysocki wrote: > > Still, if Linus agrees, I can put the loop suggested by him directly into > sysdev_suspend(). Linus? I don't much care - it's going to be a no-op on architectures that don't have that kind of "turn an interrupt into a wakeup event" capability. So it's not going to break for things like x86, and it's not like going over the irq list one more time is going to be so expensive as to be noticeable, even if that architecture doesn't ever get any advantage of it. However - my main worry is that we will notice that different architectures (and possibly even different platforms _within_ the same architecture - depending on which kind of interrupt/pm controller they have) will want to do different things, and actually do something to the interrupt controller itself too at that point. But we can certainly try starting out with just the generic "if a wakeup interrupt is pending, sysdev_suspend() returns an error immediately". Linus ^ permalink raw reply [flat|nested] 373+ messages in thread
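The check Linus describes above ("if a wakeup interrupt is pending, sysdev_suspend() returns an error immediately") can be sketched as a small user-space model. This is only an illustration under stated assumptions: `model_irq_desc`, the `MODEL_IRQ_*` bits and `model_check_wakeup_irqs()` are hypothetical names made up for this sketch and are not kernel API.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical flag bits for this model only. The thread talks about
 * IRQ_WAKEUP and IRQ_PENDING; the values below are invented. */
#define MODEL_IRQ_WAKEUP  (1u << 0)  /* line was marked as a wake-up source */
#define MODEL_IRQ_PENDING (1u << 1)  /* fired while disabled, not handled yet */

/* Hypothetical, stripped-down stand-in for an interrupt descriptor. */
struct model_irq_desc {
    unsigned int status;
};

/* Walk all descriptors once. Return -EBUSY as soon as a wake-up
 * interrupt is found pending, so the caller can abort the suspend;
 * return 0 if nothing stands in the way. */
static int model_check_wakeup_irqs(const struct model_irq_desc *descs, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if (!(descs[i].status & MODEL_IRQ_WAKEUP))
            continue;           /* not a wake-up source: ignore it */
        if (descs[i].status & MODEL_IRQ_PENDING)
            return -EBUSY;      /* a wake-up event is already waiting */
    }
    return 0;
}
```

On a platform where no line is ever marked as a wake-up source, the loop never returns an error, which matches the "no-op on architectures that don't have that capability" remark.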
* Re: [linux-pm] [RFC][PATCH 2/2] PM: Rework handling of interrupts during suspend-resume 2009-02-28 10:06 ` [linux-pm] " Rafael J. Wysocki 2009-02-28 17:03 ` Linus Torvalds @ 2009-02-28 22:15 ` Arve Hjønnevåg 2009-02-28 22:15 ` Arve Hjønnevåg 2 siblings, 0 replies; 373+ messages in thread From: Arve Hjønnevåg @ 2009-02-28 22:15 UTC (permalink / raw) To: Rafael J. Wysocki Cc: Linus Torvalds, Alan Stern, Jeremy Fitzhardinge, LKML, Jesse Barnes, Thomas Gleixner, Eric W. Biederman, Ingo Molnar, pm list

On Sat, Feb 28, 2009 at 2:06 AM, Rafael J. Wysocki <rjw@sisk.pl> wrote:
> On Saturday 28 February 2009, Arve Hjønnevåg wrote:
>> On Fri, Feb 27, 2009 at 12:30 PM, Linus Torvalds
>> <torvalds@linux-foundation.org> wrote:
>> >
>> >
>> > On Fri, 27 Feb 2009, Alan Stern wrote:
>> >>
>> >> Perhaps these aren't all the sort of usage you're talking about, but I
>> >> bet most of them are. It certainly looks like more than just ARM.
>> >> Maybe not all that much more, but definitely more. And the number will
>> >> only grow in the future.
>> >
>> > Are you really sure? Because it can't be x86. I'm pretty sure that that is
>> > simply not how x86 wake events _work_ - they're not interrupts.
>>
>> They are not interrupts on every arm platform that implements set_wake
>> either, but it is useful to pretend that they are. If the platform
>> code reads the wakeup status and marks the corresponding interrupt
>> pending, the driver does not need to know if the event occurred before
>> or after the system entered the low power state. I don't know if this
>> can be implemented on x86, but it might be worth looking into.
>
> That would have been a new feature, no? And I don't think anyone except for
> you does it. So, what you're saying boils down to "please don't break my new
> feature that hasn't been merged yet".

I was not referring to our platform, so no, this is not a new feature.

>> > And that's the big point that people seem to be missing here: the whole
>> > "wake up interrupt" thing is not some generic model in the first place. I
>> > strongly suspect that it literally only works on certain architectures.
>>
>> My point was that it was not specific to our platform. I don't have a
>> problem fixing our platform if this patch is merged, but this is a case
>> where a change to the generic code breaks some platforms.
>
> Quite frankly, I don't really think it will break anything else than your
> platform.

That is quite possible, but do other platforms not break because they
are already broken? I saw no attempt to avoid race conditions on
suspend in the drivers I looked at.

> Still, if Linus agrees, I can put the loop suggested by him directly into
> sysdev_suspend(). Linus?

I vote for this.

>
>> I don't think there is a good reason to make the fix arm specific, trivial or
>> not, since any platform implementing set_wake may run into the race
>> condition that this patch introduced. If the platform does not
>> implement set_wake, IRQ_WAKEUP never gets set, and the fix should not
>> have any effect.
>
> The point is, if we put anything like this into the generic code, platforms
> start to rely on this and it will become more and more difficult to change at
> the generic level if need be.
>
> The fact that your platform relies on the generic code to disable IRQs on
> the CPU at a particular point shows the mechanism very well. :-)

If the generic code did not clear the interrupt I think this would be
a stronger point. Since our hardware does not have a mask register,
only enable, I find this feature (leave the interrupt enabled, and
mask it only if it triggers) of the generic interrupt code quite
useful. If the generic code did not do this, the platform code would
have to.

-- 
Arve Hjønnevåg

^ permalink raw reply [flat|nested] 373+ messages in thread
* Re: [RFC][PATCH 2/2] PM: Rework handling of interrupts during suspend-resume 2009-02-27 4:43 ` Linus Torvalds (?) (?) @ 2009-02-27 14:59 ` Alan Stern -1 siblings, 0 replies; 373+ messages in thread From: Alan Stern @ 2009-02-27 14:59 UTC (permalink / raw) To: Linus Torvalds Cc: Jeremy Fitzhardinge, LKML, Jesse Barnes, Eric W. Biederman, pm list, Thomas Gleixner, Ingo Molnar

On Thu, 26 Feb 2009, Linus Torvalds wrote:

> On Thu, 26 Feb 2009, Alan Stern wrote:
> >
> > What you're missing is that the embedded world is quite a large one.
>
> I'm going to give you one more clue, and if you don't stop sending out
> these IDIOTIC emails, I'm going to put you into my killfile.
>
> Got it?

Whoa!! Hold on there! You got too angry too quickly. I'm Alan Stern,
not Arve Hjønnevåg; that was the first email I've sent on this topic.
And while perhaps it was idiotic, you shouldn't put the blame for it on
Arve.

> So listen up:
>  - the number of ARM chips sold doesn't matter one F*CKING WHIT.
>  - You need to add ONE SINGLE "sysdev" entry for ARM to take care of this
>    FOR EVERY DAMN SINGLE ONE.
>  - Your inane whining about this AFTER I HAVE TOLD YOU MULTIPLE TIMES HOW
>    TO DO IT, AND AFTER I HAVE TOLD YOU THAT IT'S A SPECIAL CASE, IS
>    F*CKING IRRITATING.
>
> Got it?
>
> I _grepped_ for that enable_irq_wake() use. It looks like it's only used
> on ARM and maybe BF. Add the five lines of code (just cut and paste them
> from my earlier email) to your architecture already, AND STOP WHINING.

Really? Let's see (this is using Greg KH's development tree):

$ find . -name '*.[ch]' | xargs grep enable_irq_wake
./drivers/serial/serial_core.c: enable_irq_wake(port->irq);
./drivers/usb/gadget/at91_udc.c: enable_irq_wake(udc->udp_irq);
./drivers/usb/gadget/at91_udc.c: enable_irq_wake(udc->board.vbus_pin);
./drivers/usb/musb/musb_core.c: if (enable_irq_wake(nIrq) == 0) {
./drivers/usb/host/ohci-at91.c: enable_irq_wake(hcd->irq);
./drivers/input/serio/sa1111ps2.c: enable_irq_wake(ps2if->dev->irq[0]);
./drivers/input/keyboard/gpio_keys.c: enable_irq_wake(irq);
./drivers/input/keyboard/pxa27x_keypad.c: enable_irq_wake(keypad->irq);
./drivers/input/keyboard/bf54x-keys.c: enable_irq_wake(bf54x_kpad->irq);
./drivers/pcmcia/at91_cf.c: enable_irq_wake(board->det_pin);
./drivers/pcmcia/at91_cf.c: enable_irq_wake(board->irq_pin);
./drivers/mmc/host/at91_mci.c: enable_irq_wake(host->board->det_pin);
./drivers/mfd/htc-egpio.c: enable_irq_wake(ei->chained_irq);
./drivers/mfd/pcf50633-core.c: if (enable_irq_wake(client->irq) < 0)
./drivers/rtc/rtc-sa1100.c: enable_irq_wake(IRQ_RTCAlrm);
./drivers/rtc/rtc-omap.c: enable_irq_wake(omap_rtc_alarm);
./drivers/rtc/rtc-s3c.c: enable_irq_wake(s3c_rtc_alarmno);
./drivers/rtc/rtc-at91rm9200.c: enable_irq_wake(AT91_ID_SYS);
./drivers/rtc/rtc-cmos.c: enable_irq_wake(cmos->irq);
./drivers/rtc/rtc-bfin.c: enable_irq_wake(IRQ_RTC);
./drivers/rtc/rtc-ds1374.c: enable_irq_wake(client->irq);
./drivers/rtc/rtc-at91sam9.c: enable_irq_wake(AT91_ID_SYS);
./drivers/rtc/rtc-pxa.c: enable_irq_wake(pxa_rtc->irq_Alrm);
./drivers/power/pda_power.c: ac_wakeup_enabled = !enable_irq_wake(ac_irq->start);
./drivers/power/pda_power.c: usb_wakeup_enabled = !enable_irq_wake(usb_irq->start);
./arch/arm/mach-sa1100/neponset.c: enable_irq_wake(IRQ_GPIO25);
./arch/arm/mach-s3c2410/mach-amlm5900.c: enable_irq_wake(IRQ_EINT9);
./arch/arm/mach-omap1/board-osk.c: enable_irq_wake(irq);
./arch/arm/mach-omap1/serial.c: enable_irq_wake(gpio_to_irq(gpio_nr));
./arch/arm/plat-omap/gpio.c: enable_irq_wake(bank->irq);
./arch/arm/plat-omap/gpio.c: enable_irq_wake(bank->irq);
./arch/arm/plat-omap/gpio.c:/* Use disable_irq_wake() and enable_irq_wake() functions from drivers */
./include/linux/interrupt.h:static inline int enable_irq_wake(unsigned int irq)
./include/linux/interrupt.h:static inline int enable_irq_wake(unsigned int irq)

Perhaps these aren't all the sort of usage you're talking about, but I
bet most of them are. It certainly looks like more than just ARM.
Maybe not all that much more, but definitely more. And the number will
only grow in the future.

> It's not a generic case. It's not a problem. You can damn well fix it in
> the ONE SINGLE ARCHITECTURE (or maybe two) that cares. I've told you how.

I'm not arguing with your suggestion; I'm merely disagreeing with your
statement that wakeup interrupts are "definitely not the normal case".

Alan Stern

^ permalink raw reply [flat|nested] 373+ messages in thread
* Re: [RFC][PATCH 2/2] PM: Rework handling of interrupts during suspend-resume 2009-02-27 0:27 ` Linus Torvalds 2009-02-27 3:20 ` [linux-pm] " Alan Stern @ 2009-02-27 3:20 ` Alan Stern 1 sibling, 0 replies; 373+ messages in thread From: Alan Stern @ 2009-02-27 3:20 UTC (permalink / raw) To: Linus Torvalds Cc: Jeremy Fitzhardinge, LKML, Jesse Barnes, Eric W. Biederman, pm list, Thomas Gleixner, Ingo Molnar On Thu, 26 Feb 2009, Linus Torvalds wrote: > The _only_ driver that does enable_irq_wake() on x86 is the cmos timer > driver, and even there it actually doesn't use irq_wake, but ACPI. Why? > Because I don't think irq wakeup even _works_ on x86. > > So the whole enable_irq_wake is largely some embedded ARM platform issue, > and a very special case, and doesn't exist anywhere else. > > Maybe I'm missing something, but it's definitely not the normal case. What you're missing is that the embedded world is quite a large one. As any member of CELF will tell you, there are lots more embedded systems around than there are desktop/laptop computers. (I admit, I don't know what the ratio is if you restrict your attention to systems running Linux.) We can't afford to regard them as second-class citizens. Plenty of embedded systems use normal interrupts from GPIO lines as wakeup sources. Don't discount the need for this just because desktop systems don't use them that way. It may not be "normal" in the circles you're accustomed to, but it _is_ normal elsewhere. Alan Stern ^ permalink raw reply [flat|nested] 373+ messages in thread
* Re: [RFC][PATCH 2/2] PM: Rework handling of interrupts during suspend-resume 2009-02-27 0:00 ` Arve Hjønnevåg 2009-02-27 0:27 ` Linus Torvalds @ 2009-02-27 0:27 ` Linus Torvalds 1 sibling, 0 replies; 373+ messages in thread From: Linus Torvalds @ 2009-02-27 0:27 UTC (permalink / raw) To: Arve Hjønnevåg Cc: Jeremy Fitzhardinge, LKML, Jesse Barnes, Thomas Gleixner, Eric W. Biederman, Ingo Molnar, pm list

On Thu, 26 Feb 2009, Arve Hjønnevåg wrote:
>
> How many sysdevs use interrupts?
>
> I found many drivers in the mainline kernel that use enable_irq_wake,
> but I did not see any that handle this race condition.

The _only_ driver that does enable_irq_wake() on x86 is the cmos timer
driver, and even there it actually doesn't use irq_wake, but ACPI. Why?
Because I don't think irq wakeup even _works_ on x86.

So the whole enable_irq_wake is largely some embedded ARM platform issue,
and a very special case, and doesn't exist anywhere else.

Maybe I'm missing something, but it's definitely not the normal case.

Linus

^ permalink raw reply [flat|nested] 373+ messages in thread
* Re: [RFC][PATCH 2/2] PM: Rework handling of interrupts during suspend-resume 2009-02-26 23:10 ` Rafael J. Wysocki 2009-02-27 0:00 ` Arve Hjønnevåg @ 2009-02-27 0:00 ` Arve Hjønnevåg 1 sibling, 0 replies; 373+ messages in thread From: Arve Hjønnevåg @ 2009-02-27 0:00 UTC (permalink / raw) To: Rafael J. Wysocki Cc: Jeremy Fitzhardinge, LKML, Jesse Barnes, Thomas Gleixner, Eric W. Biederman, Ingo Molnar, Linus Torvalds, pm list

On Thu, Feb 26, 2009 at 3:10 PM, Rafael J. Wysocki <rjw@sisk.pl> wrote:
> On Thursday 26 February 2009, Arve Hjønnevåg wrote:
>> On Thu, Feb 26, 2009 at 2:10 PM, Linus Torvalds
>> <torvalds@linux-foundation.org> wrote:
>> >
>> >
>> > On Thu, 26 Feb 2009, Rafael J. Wysocki wrote:
>> >>
>> >> Well, how exactly the $subject patch does cause this problem to happen?
>> >
>> > Rafael, the problem is that if an interrupt happens while it's disabled -
>> > but before the CPU has actually turned all interrupts off - the CPU will
>> > ACK the interrupt (but just set a flag for it being PENDING), so now the
>> > chipset logic around it will not see it as pending any more, so now the
>> > chipset won't auto-wake the CPU immediately (or more likely, it won't
>> > even suspend it).
>> >
>> > It's trivial to fix multiple ways, so I wouldn't worry. The most trivial
>> > way is to just have some sysdev driver code simply do something like
>> >
>> >        static int sysdev_suspend()
>> >        {
>> >                for_each_irq(irq,desc) {
>> >                        if (!(desc->flags & IRQF_WAKE))
>> >                                continue;
>> >                        if (desc->flags & IRQ_PENDING)
>> >                                return -EBUSY;
>> >                }
>> >                return 0;
>> >        }
>> >
>> > and that should automatically mean that if any irq is pending, the suspend
>> > will fail and we'll immediately wake up again.
>> >
>> > It looks trivial, and I don't understand why Arve can't just do the sysdev
>> > thing.
>>
>> I can. My point is that the patch breaks our existing code.
>
> Is that a mainline kernel code?

No, the msm suspend support has not been merged.

>
>> If anyone else uses edge triggered wakeup interrupt it may break for them as
>> well. The main question is if this should be fixed separately for every
>> platform that needs it, or if pending wakeup interrupts should always
>> abort sleep.
>
> Well, I'm not really sure if this is the problem. In fact the problem is that
> you have a regular device the interrupt of which can be a wake-up one. I think

Is that not a common case and what enable_irq_wake is for?

> the problem wouldn't have existed at all if it had been a sysdev. Is that
> correct?

How many sysdevs use interrupts?

I found many drivers in the mainline kernel that use enable_irq_wake,
but I did not see any that handle this race condition.

-- 
Arve Hjønnevåg

^ permalink raw reply [flat|nested] 373+ messages in thread
* Re: [RFC][PATCH 2/2] PM: Rework handling of interrupts during suspend-resume 2009-02-26 22:10 ` Linus Torvalds 2009-02-26 22:30 ` Arve Hjønnevåg 2009-02-26 22:30 ` Arve Hjønnevåg @ 2009-02-26 22:30 ` Rafael J. Wysocki 2009-02-26 22:30 ` Rafael J. Wysocki 3 siblings, 0 replies; 373+ messages in thread From: Rafael J. Wysocki @ 2009-02-26 22:30 UTC (permalink / raw) To: Linus Torvalds Cc: Arve Hjønnevåg, Ingo Molnar, Eric W. Biederman, LKML, Benjamin Herrenschmidt, Jeremy Fitzhardinge, pm list, Len Brown, Jesse Barnes, Thomas Gleixner

On Thursday 26 February 2009, Linus Torvalds wrote:
>
> On Thu, 26 Feb 2009, Rafael J. Wysocki wrote:
> >
> > Well, how exactly the $subject patch does cause this problem to happen?
>
> Rafael, the problem is that if an interrupt happens while it's disabled -
> but before the CPU has actually turned all interrupts off - the CPU will
> ACK the interrupt (but just set a flag for it being PENDING), so now the
> chipset logic around it will not see it as pending any more, so now the
> chipset won't auto-wake the CPU immediately (or more likely, it won't
> even suspend it).

Ah, I see now, thanks.

> It's trivial to fix multiple ways, so I wouldn't worry. The most trivial
> way is to just have some sysdev driver code simply do something like
>
>        static int sysdev_suspend()
>        {
>                for_each_irq(irq,desc) {
>                        if (!(desc->flags & IRQF_WAKE))
>                                continue;
>                        if (desc->flags & IRQ_PENDING)
>                                return -EBUSY;
>                }
>                return 0;
>        }
>
> and that should automatically mean that if any irq is pending, the suspend
> will fail and we'll immediately wake up again.

Yeah.

Thanks,
Rafael

^ permalink raw reply [flat|nested] 373+ messages in thread
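The race Linus explains above can be walked through with a tiny user-space model. It is a sketch only, assuming a single edge-triggered line; `model_irq`, `model_raise()` and `model_cpu_take()` are hypothetical names invented here, and real interrupt controllers differ in detail.

```c
#include <stdbool.h>

/* Hypothetical model of one edge-triggered interrupt line. */
struct model_irq {
    bool disabled;     /* line disabled in software during suspend */
    bool hw_latched;   /* edge still visible to the interrupt controller */
    bool sw_pending;   /* software-only "PENDING" flag */
    unsigned handled;  /* how many times the handler actually ran */
};

/* A device raises an edge: the controller latches it. */
static void model_raise(struct model_irq *irq)
{
    irq->hw_latched = true;
}

/* The CPU still has interrupts enabled and takes the interrupt.
 * It always ACKs the controller; if the line is disabled in software,
 * it only records PENDING instead of running the handler. */
static void model_cpu_take(struct model_irq *irq)
{
    if (!irq->hw_latched)
        return;
    irq->hw_latched = false;        /* ACK: controller forgets the edge */
    if (irq->disabled)
        irq->sw_pending = true;     /* remembered only in software */
    else
        irq->handled++;
}
```

In this model, an edge that arrives after the line was disabled but before CPU interrupts are turned off ends with `hw_latched == false` and `sw_pending == true`: the hardware no longer has anything to wake the system with, so only a software check of the pending flag can notice the event.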
* Re: [RFC][PATCH 2/2] PM: Rework handling of interrupts during suspend-resume 2009-02-26 20:34 ` Arve Hjønnevåg 2009-02-26 20:57 ` Benjamin Herrenschmidt 2009-02-26 21:58 ` Rafael J. Wysocki @ 2009-02-26 21:58 ` Rafael J. Wysocki 2 siblings, 0 replies; 373+ messages in thread From: Rafael J. Wysocki @ 2009-02-26 21:58 UTC (permalink / raw) To: Arve Hjønnevåg Cc: Jeremy Fitzhardinge, LKML, Jesse Barnes, Thomas Gleixner, Eric W. Biederman, Ingo Molnar, Linus Torvalds, pm list On Thursday 26 February 2009, Arve Hjønnevåg wrote: > On Thu, Feb 26, 2009 at 1:50 AM, Rafael J. Wysocki <rjw@sisk.pl> wrote: > > On Thursday 26 February 2009, Arve Hjønnevåg wrote: > >> On Tue, Feb 24, 2009 at 3:29 PM, Rafael J. Wysocki <rjw@sisk.pl> wrote: > >> > BTW, appended is the current (3rd) version of the $subject patch with some > >> > of your comments taken into account. In particular, I did the following: > >> > - moved [suspend|resume]_device_irqs() to a separate file (pm.c) > >> > - fixed interrupt.h so that their headers are at a better place > >> > - made enable_irq() clear IRQ_SUSPENDED > >> > - made device_power_down() and device_power_up() call > >> > suspend_device_irqs() and resume_device_irqs(), respectively, which > >> > simplified the callers quite a bit (it changed the Xen code ordering, though, > >> > but I _think_ it still should work). > >> > >> Do you plan to fix edge triggered wakeup interrupts? It still looks > >> like edge triggered wakeup interrupts that occur between > >> suspend_device_irqs and local_irq_disable will not cause a wakeup. > > > > In the current version of the patch the interrupts that have IRQ_WAKEUP set > > in status are not disabled. Is this not enough? > > That is enough for drivers that use wakelocks to abort suspend (if I > fix the wakelock code to not use a platform device as its last abort > point). 
It is not enough if you don't have wakelocks, since the > interrupt can occur after suspend_late has been called and the driver > has no way to abort suspend. Well, how exactly the $subject patch does cause this problem to happen? Rafael ^ permalink raw reply [flat|nested] 373+ messages in thread
* Re: [RFC][PATCH 2/2] PM: Rework handling of interrupts during suspend-resume 2009-02-26 9:50 ` Rafael J. Wysocki 2009-02-26 20:34 ` Arve Hjønnevåg @ 2009-02-26 20:34 ` Arve Hjønnevåg 1 sibling, 0 replies; 373+ messages in thread From: Arve Hjønnevåg @ 2009-02-26 20:34 UTC (permalink / raw) To: Rafael J. Wysocki Cc: Jeremy Fitzhardinge, LKML, Jesse Barnes, Thomas Gleixner, Eric W. Biederman, Ingo Molnar, Linus Torvalds, pm list On Thu, Feb 26, 2009 at 1:50 AM, Rafael J. Wysocki <rjw@sisk.pl> wrote: > On Thursday 26 February 2009, Arve Hjønnevåg wrote: >> On Tue, Feb 24, 2009 at 3:29 PM, Rafael J. Wysocki <rjw@sisk.pl> wrote: >> > BTW, appended is the current (3rd) version of the $subject patch with some >> > of your comments taken into account. In particular, I did the following: >> > - moved [suspend|resume]_device_irqs() to a separate file (pm.c) >> > - fixed interrupt.h so that their headers are at a better place >> > - made enable_irq() clear IRQ_SUSPENDED >> > - made device_power_down() and device_power_up() call >> > suspend_device_irqs() and resume_device_irqs(), respectively, which >> > simplified the callers quite a bit (it changed the Xen code ordering, though, >> > but I _think_ it still should work). >> >> Do you plan to fix edge triggered wakeup interrupts? It still looks >> like edge triggered wakeup interrupts that occur between >> suspend_device_irqs and local_irq_disable will not cause a wakeup. > > In the current version of the patch the interrupts that have IRQ_WAKEUP set > in status are not disabled. Is this not enough? That is enough for drivers that use wakelocks to abort suspend (if I fix the wakelock code to not use a platform device as its last abort point). It is not enough if you don't have wakelocks, since the interrupt can occur after suspend_late has been called and the driver has no way to abort suspend. -- Arve Hjønnevåg ^ permalink raw reply [flat|nested] 373+ messages in thread
* Re: [RFC][PATCH 2/2] PM: Rework handling of interrupts during suspend-resume 2009-02-26 1:17 ` Arve Hjønnevåg ` (2 preceding siblings ...) 2009-02-26 9:50 ` Rafael J. Wysocki @ 2009-02-26 9:50 ` Rafael J. Wysocki 3 siblings, 0 replies; 373+ messages in thread From: Rafael J. Wysocki @ 2009-02-26 9:50 UTC (permalink / raw) To: Arve Hjønnevåg Cc: Jeremy Fitzhardinge, LKML, Jesse Barnes, Thomas Gleixner, Eric W. Biederman, Ingo Molnar, Linus Torvalds, pm list On Thursday 26 February 2009, Arve Hjønnevåg wrote: > On Tue, Feb 24, 2009 at 3:29 PM, Rafael J. Wysocki <rjw@sisk.pl> wrote: > > BTW, appended is the current (3rd) version of the $subject patch with some > > of your comments taken into account. In particular, I did the following: > > - moved [suspend|resume]_device_irqs() to a separate file (pm.c) > > - fixed interrupt.h so that their headers are at a better place > > - made enable_irq() clear IRQ_SUSPENDED > > - made device_power_down() and device_power_up() call > > suspend_device_irqs() and resume_device_irqs(), respectively, which > > simplified the callers quite a bit (it changed the Xen code ordering, though, > > but I _think_ it still should work). > > Do you plan to fix edge triggered wakeup interrupts? It still looks > like edge triggered wakeup interrupts that occur between > suspend_device_irqs and local_irq_disable will not cause a wakeup. In the current version of the patch the interrupts that have IRQ_WAKEUP set in status are not disabled. Is this not enough? Rafael ^ permalink raw reply [flat|nested] 373+ messages in thread
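The behaviour discussed in this exchange (device interrupts are disabled for suspend, but lines with IRQ_WAKEUP set are left alone, and only lines marked IRQ_SUSPENDED are re-enabled on resume) can be modelled in user space roughly as follows. This is a sketch under assumptions, not the patch itself: the `MODEL_IRQ_*` bits and both functions are hypothetical, and the rule that lines already disabled by a driver are skipped is an assumption of this model.

```c
#include <stddef.h>

/* Hypothetical status bits for this model only. */
#define MODEL_IRQ_WAKEUP    (1u << 0)  /* wake-up source, must stay enabled */
#define MODEL_IRQ_SUSPENDED (1u << 1)  /* disabled by the suspend code */
#define MODEL_IRQ_DISABLED  (1u << 2)  /* line is currently disabled */

struct model_irq_desc {
    unsigned int status;
};

/* Disable every device interrupt except wake-up sources, and remember
 * which ones were disabled here so that resume can undo exactly those. */
static void model_suspend_device_irqs(struct model_irq_desc *d, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if (d[i].status & MODEL_IRQ_WAKEUP)
            continue;   /* leave wake-up interrupts enabled */
        if (d[i].status & MODEL_IRQ_DISABLED)
            continue;   /* assumption: already disabled by its driver */
        d[i].status |= MODEL_IRQ_DISABLED | MODEL_IRQ_SUSPENDED;
    }
}

/* Re-enable only the interrupts that the suspend code disabled. */
static void model_resume_device_irqs(struct model_irq_desc *d, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if (!(d[i].status & MODEL_IRQ_SUSPENDED))
            continue;
        d[i].status &= ~(MODEL_IRQ_DISABLED | MODEL_IRQ_SUSPENDED);
    }
}
```

Leaving wake-up lines enabled keeps them able to fire, but as the follow-up messages point out, it does not by itself give a driver a way to abort a suspend that is already past its late callbacks.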
* Re: [RFC][PATCH 2/2] PM: Rework handling of interrupts during suspend-resume 2009-02-24 23:29 ` Rafael J. Wysocki ` (2 preceding siblings ...) 2009-02-26 1:17 ` Arve Hjønnevåg @ 2009-02-26 1:17 ` Arve Hjønnevåg 3 siblings, 0 replies; 373+ messages in thread From: Arve Hjønnevåg @ 2009-02-26 1:17 UTC (permalink / raw) To: Rafael J. Wysocki Cc: Jeremy Fitzhardinge, LKML, Jesse Barnes, Thomas Gleixner, Eric W. Biederman, Ingo Molnar, Linus Torvalds, pm list On Tue, Feb 24, 2009 at 3:29 PM, Rafael J. Wysocki <rjw@sisk.pl> wrote: > BTW, appended is the current (3rd) version of the $subject patch with some > of your comments taken into account. In particular, I did the following: > - moved [suspend|resume]_device_irqs() to a separate file (pm.c) > - fixed interrupt.h so that their headers are at a better place > - made enable_irq() clear IRQ_SUSPENDED > - made device_power_down() and device_power_up() call > suspend_device_irqs() and resume_device_irqs(), respectively, which > simplified the callers quite a bit (it changed the Xen code ordering, though, > but I _think_ it still should work). Do you plan to fix edge triggered wakeup interrupts? It still looks like edge triggered wakeup interrupts that occur between suspend_device_irqs and local_irq_disable will not cause a wakeup. -- Arve Hjønnevåg ^ permalink raw reply [flat|nested] 373+ messages in thread
* Re: [RFC][PATCH 2/2] PM: Rework handling of interrupts during suspend-resume 2009-02-24 23:07 ` Rafael J. Wysocki 2009-02-24 23:09 ` Ingo Molnar @ 2009-02-24 23:09 ` Ingo Molnar 1 sibling, 0 replies; 373+ messages in thread From: Ingo Molnar @ 2009-02-24 23:09 UTC (permalink / raw) To: Rafael J. Wysocki Cc: Jeremy Fitzhardinge, LKML, Jesse Barnes, Eric W. Biederman, pm list, Linus Torvalds, Thomas Gleixner

* Rafael J. Wysocki <rjw@sisk.pl> wrote:

> On Tuesday 24 February 2009, Linus Torvalds wrote:
> >
> > On Tue, 24 Feb 2009, Rafael J. Wysocki wrote:
> > >
> > > > The only safe way on x86 to shutdown a level triggered ioapic irq
> > > > outside of irq context is for the driver to program the hardware to
> > > > not generate an irq.
> > >
> > > Well, that changes things quite a bit, because it means we can't change the
> > > suspend-resume sequence in a way we thought we could without fixing all
> > > drivers first, but this is exactly what we'd like to avoid by changing the
> > > core.
> >
> > Calling "disable_irq()" is perfectly fine.
> >
> > What is not possible on that broken IO-APIC (among other
> > things) is to actually turn the interrupts off at the apic
> > (ie the whole ->shutdown() thing). But that's not what we
> > even want to do. What we care about is just disabling the
> > interrupt from a driver perspective.
> >
> > IOW, the patches I have seen are fine, and all the comments
> > from Eric are just confusion about what we want done.
>
> Ah, OK. Thanks for the explanation, I got confused too.
>
> > WE DO NOT WANT TO TURN OFF THE IO-APIC. That may or may not
> > happen later, but that's totally unrelated to this whole
> > "suspend_device_irq()" thing.
>
> Yeah.

We definitely don't want to turn off x86 IO-APICs - the timer IRQ
goes via one of them.

	Ingo

^ permalink raw reply [flat|nested] 373+ messages in thread
* Re: [RFC][PATCH 2/2] PM: Rework handling of interrupts during suspend-resume 2009-02-24 22:51 ` Linus Torvalds 2009-02-24 23:07 ` Rafael J. Wysocki @ 2009-02-24 23:07 ` Rafael J. Wysocki 2009-02-25 4:16 ` Eric W. Biederman 2009-02-25 4:16 ` Eric W. Biederman 3 siblings, 0 replies; 373+ messages in thread From: Rafael J. Wysocki @ 2009-02-24 23:07 UTC (permalink / raw) To: Linus Torvalds Cc: Jeremy Fitzhardinge, LKML, Jesse Barnes, Thomas Gleixner, Eric W. Biederman, Ingo Molnar, pm list On Tuesday 24 February 2009, Linus Torvalds wrote: > > On Tue, 24 Feb 2009, Rafael J. Wysocki wrote: > > > > > The only safe way on x86 to shutdown a level triggered ioapic irq > > > outside of irq context is for the driver to program the hardware to > > > not generate an irq. > > > > Well, that changes things quite a bit, because it means we can't change the > > suspend-resume sequence in a way we thought we could without fixing all > > drivers first, but this is exactly what we'd like to avoid by changing the > > core. > > Calling "disable_irq()" is perfectly fine. > > What is not possible on that broken IO-APIC (among other things) is to > actually turn the interrupts off at the apic (ie the whole ->shutdown() > thing). But that's not what we even want to do. What we care about is > just disabling the interrupt from a driver perspective. > > IOW, the patches I have seen are fine, and all the comments from Eric are > just confusion about what we want done. Ah, OK. Thanks for the explanation, I got confused too. > WE DO NOT WANT TO TURN OFF THE IO-APIC. That may or may not happen later, but > that's totally unrelated to this whole "suspend_device_irq()" thing. Yeah. Rafael ^ permalink raw reply [flat|nested] 373+ messages in thread
* Re: [RFC][PATCH 2/2] PM: Rework handling of interrupts during suspend-resume 2009-02-24 22:51 ` Linus Torvalds 2009-02-24 23:07 ` Rafael J. Wysocki 2009-02-24 23:07 ` Rafael J. Wysocki @ 2009-02-25 4:16 ` Eric W. Biederman 2009-02-25 4:26 ` Linus Torvalds 2009-02-25 4:16 ` Eric W. Biederman 3 siblings, 1 reply; 373+ messages in thread From: Eric W. Biederman @ 2009-02-25 4:16 UTC (permalink / raw) To: Linus Torvalds Cc: Rafael J. Wysocki, Ingo Molnar, LKML, Benjamin Herrenschmidt, Jeremy Fitzhardinge, pm list, Len Brown, Jesse Barnes, Thomas Gleixner Linus Torvalds <torvalds@linux-foundation.org> writes: > On Tue, 24 Feb 2009, Rafael J. Wysocki wrote: >> >> > The only safe way on x86 to shutdown a level triggered ioapic irq >> > outside of irq context is for the driver to program the hardware to >> > not generate an irq. >> >> Well, that changes things quite a bit, because it means we can't change the >> suspend-resume sequence in a way we thought we could without fixing all >> drivers first, but this is exactly what we'd like to avoid by changing the >> core. > > Calling "disable_irq()" is perfectly fine. Agreed, I did not mean to indicate otherwise. > What is not possible on that broken IO-APIC (among other things) is to > actually turn the interrupts off at the apic (ie the whole ->shutdown() > thing). But that's not what we even want to do. What we care about is > just disabling the interrupt from a driver perspective. > > IOW, the patches I have seen are fine, and all the comments from Eric are > just confusion about what we want done. Largely yes. > WE DO NOT WANT TO TURN OFF THE IO-APIC. That may or may not happen later, but > that's totally unrelated to this whole "suspend_device_irq()" thing. Right. The question I was asking is: Can we get the broken cpu hotunplug code out of the suspend path? 
If we can get the devices into a low power state and not generating interrupts by the time we disable cpus then we do not need to migrate irqs from process context and risk hitting the ioapic bugs. While related, safely suspending cpus is a different problem and a different patch. Eric ^ permalink raw reply [flat|nested] 373+ messages in thread
* Re: [RFC][PATCH 2/2] PM: Rework handling of interrupts during suspend-resume 2009-02-25 4:16 ` Eric W. Biederman @ 2009-02-25 4:26 ` Linus Torvalds 0 siblings, 0 replies; 373+ messages in thread From: Linus Torvalds @ 2009-02-25 4:26 UTC (permalink / raw) To: Eric W. Biederman Cc: Rafael J. Wysocki, Ingo Molnar, LKML, Benjamin Herrenschmidt, Jeremy Fitzhardinge, pm list, Len Brown, Jesse Barnes, Thomas Gleixner On Tue, 24 Feb 2009, Eric W. Biederman wrote: > The question I was asking is: > Can we get the broken cpu hotunplug code out of the suspend path? I think we can move it around. I don't think we can get rid of it. > If we can get the devices into a low power state and not generating > interrupts by the time we disable cpus then we do not need to migrate > irqs from process context and risk hitting the ioapic bugs. At least one issue is that the actual final "go to sleep" is something that has to happen on just one CPU. And I'm pretty sure the others have to have gone through the shutdown sequence before that. And knowing ACPI, the ordering requirements will boil down to something insane, like "you have to turn off the other CPU's _before_ you turn off some of the core devices, because turning off the other CPU's may involve them". So if what you would _want_ to do is to move the "turn off CPU's" into the very innermost layer, so that different architectures can then decide whether they even need to go through that whole thing or not (because turning off one core will automatically turn off all the others, simply because the power was turned off), I suspect the answer is "no". So you were probably hoping to never have to have that whole horrible issue with moving interrupts around. I'm afraid I'm not seeing it happen. But maybe we can have it happen after we've disabled all the non-system devices, so that in practice there simply won't be any new interrupts coming in any more. Linus ^ permalink raw reply [flat|nested] 373+ messages in thread
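The ordering point in this exchange (quiet the device interrupts first, take the nonboot CPUs down afterwards) can be written down as a small check. The sketch below is purely illustrative: the `toy_step` names are hypothetical labels for the stages discussed in the thread, not the actual call sequence of any kernel version.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical labels for the suspend stages discussed in the thread. */
enum toy_step {
    TOY_SUSPEND_DEVICES,
    TOY_SUSPEND_DEVICE_IRQS,
    TOY_DISABLE_NONBOOT_CPUS,
    TOY_LOCAL_IRQ_DISABLE,
    TOY_ENTER_SLEEP
};

/*
 * Returns true if, in the given sequence, device interrupts are suspended
 * before the nonboot CPUs are disabled, so that no device interrupt would
 * still need migrating when the CPUs go away.
 */
static bool toy_irqs_quiet_before_cpu_down(const enum toy_step *seq, size_t n)
{
    bool irqs_quiet = false;

    assert(seq != NULL);
    for (size_t i = 0; i < n; i++) {
        if (seq[i] == TOY_SUSPEND_DEVICE_IRQS)
            irqs_quiet = true;
        if (seq[i] == TOY_DISABLE_NONBOOT_CPUS)
            return irqs_quiet;
    }
    /* No CPU-disable step at all: nothing would need migrating. */
    return true;
}
```

A sequence that disables the nonboot CPUs without a preceding `TOY_SUSPEND_DEVICE_IRQS` step fails the check; inserting that step ahead of the CPU step makes it pass.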
* Re: [RFC][PATCH 2/2] PM: Rework handling of interrupts during suspend-resume 2009-02-25 4:26 ` Linus Torvalds (?) @ 2009-02-25 4:59 ` Eric W. Biederman -1 siblings, 0 replies; 373+ messages in thread From: Eric W. Biederman @ 2009-02-25 4:59 UTC (permalink / raw) To: Linus Torvalds Cc: Rafael J. Wysocki, Ingo Molnar, LKML, Benjamin Herrenschmidt, Jeremy Fitzhardinge, pm list, Len Brown, Jesse Barnes, Thomas Gleixner Linus Torvalds <torvalds@linux-foundation.org> writes: > On Tue, 24 Feb 2009, Eric W. Biederman wrote: >> The question I was asking is: >> Can we get the broken cpu hotunplug code out of the suspend path? > > I think we can move it around. I don't think we can get rid of it. > >> If we can get the devices into a low power state and not generating >> interrupts by the time we disable cpus then we do not need to migrate >> irqs from process context and risk hitting the ioapic bugs. > > At least one issue is that the actual final "go to sleep" is something > that has to happen on just one CPU. And I'm pretty sure the others have to > have gone through the shutdown sequence before that. > > And knowing ACPI, the ordering requirements will boil down to something > insane, like "you have to turn off the other CPU's _before_ you turn off > some of the core devices, because turning off the other CPU's may involve > them". > > So if what you would _want_ to do is to move the "turn off CPU's" into the > very innermost layer, so that different architectures can then decide > whether they even need to go through that whole thing or not (because > turning off one core will automatically turn off all the others, simply > because the power was turned off), I suspect the answer is "no". > > So you were probably hoping to never have to have that whole horrible > issue with moving interrupts around. I'm afraid I'm not seeing it happen. 
> But maybe we can have it happen after we've disabled all the non-system > devices, so that in practice there simply won't be any new interrupts > coming in any more. Right. That is what I am hoping for. No device interrupts coming into the cpus at the time we turn them off. We can disable the devices and thus disable the interrupts the devices are sending before we disable the cpus. That should make cpu disable on suspend much easier to get solid than general x86 cpu hot-unplug. Eric ^ permalink raw reply [flat|nested] 373+ messages in thread
* Re: [linux-pm] [RFC][PATCH 2/2] PM: Rework handling of interrupts during suspend-resume 2009-02-24 22:42 ` Rafael J. Wysocki ` (2 preceding siblings ...) 2009-02-25 15:32 ` Alan Stern @ 2009-02-25 15:32 ` Alan Stern 2009-02-25 16:19 ` Linus Torvalds 3 siblings, 1 reply; 373+ messages in thread From: Alan Stern @ 2009-02-25 15:32 UTC (permalink / raw) To: Rafael J. Wysocki Cc: Eric W. Biederman, Jeremy Fitzhardinge, LKML, Jesse Barnes, Thomas Gleixner, Ingo Molnar, Linus Torvalds, pm list On Tue, 24 Feb 2009, Rafael J. Wysocki wrote: > I think the most important source of level triggered interrupts are PCI > devices, so perhaps we can make the PCI PM core use bit 10 of the PCI Device > Control register to prevent devices from generating INTx after the drivers' > suspend routines have been executed? I wish that were true. As I recall, the original PCI specification did not define this bit, and older PCI devices don't support it. So you can't count on being able to suppress interrupt generation this way. Alan Stern ^ permalink raw reply [flat|nested] 373+ messages in thread
* Re: [linux-pm] [RFC][PATCH 2/2] PM: Rework handling of interrupts during suspend-resume 2009-02-25 15:32 ` [linux-pm] " Alan Stern @ 2009-02-25 16:19 ` Linus Torvalds 0 siblings, 0 replies; 373+ messages in thread From: Linus Torvalds @ 2009-02-25 16:19 UTC (permalink / raw) To: Alan Stern Cc: Rafael J. Wysocki, Eric W. Biederman, Jeremy Fitzhardinge, LKML, Jesse Barnes, Thomas Gleixner, Ingo Molnar, pm list On Wed, 25 Feb 2009, Alan Stern wrote: > On Tue, 24 Feb 2009, Rafael J. Wysocki wrote: > > > I think the most important source of level triggered interrupts are PCI > > devices, so perhaps we can make the PCI PM core use bit 10 of the PCI Device > > Control register to prevent devices from generating INTx after the drivers' > > suspend routines have been executed? > > I wish that were true. As I recall, the original PCI specification did > not define this bit, and older PCI devices don't support it. So you > can't count on being able to suppress interrupt generation this way. It's definitely a new feature. In fact, I think even the current one makes it optional, so even for "new" devices it's very unclear how many of them actually support that bit. Linus ^ permalink raw reply [flat|nested] 373+ messages in thread
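Since the bit may or may not be implemented, any code relying on it would have to probe for it. A minimal sketch of such a probe follows, run against a simulated config space rather than real hardware. The two register constants match the names Linux uses for the Interrupt Disable bit (bit 10 of the 16-bit Command register at offset 0x04); everything prefixed `toy_` is hypothetical, and the model assumes that an unimplemented bit ignores writes and reads back as zero.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Values as defined in the Linux PCI register header. */
#define PCI_COMMAND              0x04   /* offset of the 16-bit Command register */
#define PCI_COMMAND_INTX_DISABLE 0x400  /* bit 10: INTx emulation disable */

/* Hypothetical stand-in for one device's Command register. */
struct toy_pci_dev {
    uint16_t command;        /* current register contents */
    uint16_t writable_mask;  /* bits this device actually implements */
};

static uint16_t toy_read_command(const struct toy_pci_dev *dev)
{
    assert(dev != NULL);
    return dev->command;
}

/* Assumption: writes to unimplemented bits are silently dropped. */
static void toy_write_command(struct toy_pci_dev *dev, uint16_t val)
{
    assert(dev != NULL);
    dev->command = val & dev->writable_mask;
}

/*
 * Try to set the INTx disable bit and report whether it stuck.
 * A false return means the device cannot be silenced this way and
 * some other mechanism is needed.
 */
static bool toy_try_disable_intx(struct toy_pci_dev *dev)
{
    uint16_t cmd = toy_read_command(dev);

    toy_write_command(dev, cmd | PCI_COMMAND_INTX_DISABLE);
    return (toy_read_command(dev) & PCI_COMMAND_INTX_DISABLE) != 0;
}
```

The read-back is the point of the sketch: the caller learns per device whether INTx generation was really masked, rather than assuming it, which is the uncertainty Alan and Linus raise.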
* Re: [RFC][PATCH 2/2] PM: Rework handling of interrupts during suspend-resume 2009-02-24 3:30 ` Eric W. Biederman 2009-02-24 22:42 ` Rafael J. Wysocki @ 2009-02-24 22:42 ` Rafael J. Wysocki 1 sibling, 0 replies; 373+ messages in thread From: Rafael J. Wysocki @ 2009-02-24 22:42 UTC (permalink / raw) To: Eric W. Biederman Cc: Jeremy Fitzhardinge, LKML, Jesse Barnes, Thomas Gleixner, Ingo Molnar, Linus Torvalds, pm list On Tuesday 24 February 2009, Eric W. Biederman wrote: > "Rafael J. Wysocki" <rjw@sisk.pl> writes: > > > On Monday 23 February 2009, Eric W. Biederman wrote: > >> "Rafael J. Wysocki" <rjw@sisk.pl> writes: > >> > >> >> I don't know where in the state machine this is getting called but > >> >> I would suggest doing this before we shutdown cpus. > >> > > >> > This is the plan. In fact, I'm going to do this in the next patch after the > >> > $subject one has been tested and found acceptable. > >> > >> Good to hear. Then let's please get a version of the irq disable that calls > >> shutdown, so we can be certain we don't have hardware irqs in flight. > >> > >> For the drivers it should not matter for clean cpu shutdown it will. > > > > OK, I will. > > My apologies I was wrong. Calling shutdown is not safe. > > I just remembered that masking an ioapic from anywhere besides the > irq handler can lock the ioapic state machine, and lead to non-recoverable > interrupts. It is rare but I have seen it happen. I wanted to figure out > how to migrate interrupts outside of interrupt context and this was what > prevented me. A suspend/resume cycle might be enough of a reset to > get the ioapic out of that state but I don't know. > > The only safe way on x86 to shutdown a level triggered ioapic irq > outside of irq context is for the driver to program the hardware to > not generate an irq. 
Well, that changes things quite a bit, because it means we can't change the suspend-resume sequence in a way we thought we could without fixing all drivers first, but this is exactly what we'd like to avoid by changing the core. I think the most important source of level triggered interrupts are PCI devices, so perhaps we can make the PCI PM core use bit 10 of the PCI Device Control register to prevent devices from generating INTx after the drivers' suspend routines have been executed? > Therefore doing anything with the irqs at the point where we are > suspending them is a formality, and perhaps simply code that ensures > in-flight irqs don't make it past a certain point. > > I believe we just need to call disable() and print a big nasty warning > if any irq comes in after the suspend stage. At the moment we're safe, since PCI devices are put into low power states in the suspend stage. However, we'd like to make that happen in the "late suspend" stage to avoid a problem with a shared interrupt occurring after one of the devices using it has been suspended and its driver's irq handler can't cope with that. Thanks, Rafael ^ permalink raw reply [flat|nested] 373+ messages in thread
* Re: [RFC][PATCH 2/2] PM: Rework handling of interrupts during suspend-resume 2009-02-23 21:39 ` Rafael J. Wysocki 2009-02-24 3:30 ` Eric W. Biederman @ 2009-02-24 3:30 ` Eric W. Biederman 1 sibling, 0 replies; 373+ messages in thread From: Eric W. Biederman @ 2009-02-24 3:30 UTC (permalink / raw) To: Rafael J. Wysocki Cc: Jeremy Fitzhardinge, LKML, Jesse Barnes, Thomas Gleixner, Ingo Molnar, Linus Torvalds, pm list "Rafael J. Wysocki" <rjw@sisk.pl> writes: > On Monday 23 February 2009, Eric W. Biederman wrote: >> "Rafael J. Wysocki" <rjw@sisk.pl> writes: >> >> >> I don't know where in the state machine this is getting called but >> >> I would suggest doing this before we shutdown cpus. >> > >> > This is the plan. In fact, I'm going to do this in the next patch after the >> > $subject one has been tested and found acceptable. >> >> Good to hear. Then let's please get a version of the irq disable that calls >> shutdown, so we can be certain we don't have hardware irqs in flight. >> >> For the drivers it should not matter for clean cpu shutdown it will. > > OK, I will. My apologies I was wrong. Calling shutdown is not safe. I just remembered that masking an ioapic from anywhere besides the irq handler can lock the ioapic state machine, and lead to non-recoverable interrupts. It is rare but I have seen it happen. I wanted to figure out how to migrate interrupts outside of interrupt context and this was what prevented me. A suspend/resume cycle might be enough of a reset to get the ioapic out of that state but I don't know. The only safe way on x86 to shutdown a level triggered ioapic irq outside of irq context is for the driver to program the hardware to not generate an irq. Therefore doing anything with the irqs at the point where we are suspending them is a formality, and perhaps simply code that ensures in-flight irqs don't make it past a certain point. 
I believe we just need to call disable() and print a big nasty warning if any irq comes in after the suspend stage. Eric ^ permalink raw reply [flat|nested] 373+ messages in thread
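Eric's suggestion (disable the line, then complain loudly about anything that still arrives) might look roughly like the following user-space model. It is a sketch under stated assumptions, not the genirq implementation: `toy_irq_desc` and the `toy_*` functions are hypothetical, and counting stray interrupts is an addition made here so the behavior can be checked.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical per-line state, loosely modeled on an irq descriptor. */
struct toy_irq_desc {
    unsigned int irq;        /* line number, used only for the message */
    bool suspended;          /* set once the suspend stage has run */
    unsigned int delivered;  /* interrupts passed on to the driver */
    unsigned int stray;      /* interrupts seen after suspend */
};

/* Models disabling the line from the driver's point of view. */
static void toy_disable_for_suspend(struct toy_irq_desc *desc)
{
    assert(desc != NULL);
    desc->suspended = true;
}

/* Models re-enabling the line during resume. */
static void toy_enable_after_resume(struct toy_irq_desc *desc)
{
    assert(desc != NULL);
    desc->suspended = false;
}

/*
 * Entry point for an arriving interrupt. Returns true if the driver's
 * handler would have been called. After suspend the handler is never
 * reached; the event is only counted and reported.
 */
static bool toy_handle_irq(struct toy_irq_desc *desc)
{
    assert(desc != NULL);
    if (desc->suspended) {
        desc->stray++;
        fprintf(stderr, "WARNING: irq %u arrived after the suspend stage\n",
                desc->irq);
        return false;
    }
    desc->delivered++;
    return true;
}
```

The driver never sees an interrupt for a suspended line in this model, which matches the stated goal of disabling interrupts "from a driver perspective" without touching the interrupt controller itself.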
* Re: [RFC][PATCH 2/2] PM: Rework handling of interrupts during suspend-resume 2009-02-23 10:42 ` Eric W. Biederman 2009-02-23 11:03 ` Rafael J. Wysocki @ 2009-02-23 11:03 ` Rafael J. Wysocki 2009-02-23 11:04 ` Ingo Molnar 2009-02-23 11:04 ` Ingo Molnar 3 siblings, 0 replies; 373+ messages in thread From: Rafael J. Wysocki @ 2009-02-23 11:03 UTC (permalink / raw) To: Eric W. Biederman Cc: Jeremy Fitzhardinge, LKML, Jesse Barnes, Thomas Gleixner, Ingo Molnar, Linus Torvalds, pm list On Monday 23 February 2009, Eric W. Biederman wrote: > Ingo Molnar <mingo@elte.hu> writes: > > > * Eric W. Biederman <ebiederm@xmission.com> wrote: > > > >> Ingo Molnar <mingo@elte.hu> writes: > >> > >> > I think this aspect has been well-understood during the > >> > discussion of this topic and it's just a slightly misleading > >> > changelog. > >> > >> As I was a member of that discussion I did not see that. > >> > >> It took me several passes through the patches to realize the > >> goal is to allow drivers to be able to sleep while they are in > >> their late pm shutdown routines. > >> > >> Why we want this I don't know. But it seems simple enough to > >> implement, and it makes it harder to get the late pm suspend > >> routines wrong, which is always good. > > > > That's not the only goal. The other goal is to further shrink a > > particular window of suspend fragility: the irqs-disabled stage > > of the suspend/resume sequence. > > > > Since suspend/resume is a mini-reboot sequence, there's a large > > amount of code executed - and the variety of code is large as > > well. We had repeat cases of random drivers re-enabling > > interrupts and thus breaking other drivers - and these are nasty > > to debug. > > > > So this patchset disables device IRQs centrally and serializes > > with pending work - so there's no races with pending IRQs > > anymore. 
> > > > The fact that we keep the timer irq running is two-fold: firstly > > the timer code is special and not really part of the regular > > suspend/resume sequence. > > > > Drivers want to take timestamps, sometimes they even want to do > > a small usleep(), etc. Ideally the suspend/resume code is pretty > > much _the same_ as a regular bootup (and shutdown) code - so we > > want to provide a similar environment to how drivers initialize > > and deinitialize, and we want to enable them to share code > > between bootup/shutdown and suspend/resume agressively. > > > > So the more generic kernel environment we give these fragile > > handlers, the better we are off in the end. Since we already had > > IRQS_TIMER, that was just the natural thing to do. > > I am all for sharing code, especially if we can factor if > we can find common factors that do the same thing. > > I don't know how many times I have found drivers doing something > weird in their shutdown routines that they don't know how > to get the device out of. The e1000 driver has shown up several > times because it likes to suspend the device on shutdown. > > The fact that the methods exposed to drivers were only defined > to be usable on the s2ram/hibernate path is something I have > brought up on more than one occasion as a bad choice. > > I'm really not convinced that the rational for separating > out the shutdown methods from the remove methods has > been very good. That of we don't need to clean up the in-kernel > data structures on reboot so why do something extra that can > introduce instability. > > So having been watching a smaller form of this drama on the > reboot path for several years. Having had a device method > with fixed semantics, and not the dwm sematics of the historical > suspend routing. I expect there is still a ways to go before > it is simple and easy for drivers to figure out what they need > to implement out of the confusing variety of possible device > methods. 
> > >> > The new suspend code does not rely on truly disabling IRQs > >> > on the low level. The purpose is to not get IRQs to drivers > >> > - which might crash/hang/race/misbehave. > >> > >> Reasonable. I expect one of the problems with drivers getting > >> it wrong is that the interface is too complex for mortal > >> humans to understand. > > > > The suspend/resume state machine certainly used to be a piece of > > code that makes a seasoned kernel developer weep in fear. > > > > That has changed drastically in the past few months. The > > suspend+hibernation logic got unified (at least as far as driver > > methods go), and all the flow and ordering has been cleaned up > > and has been made more robust. > > I will have to look again. My impression is that overloading > a single method is part of what got us into this mess in the > first place. > > No that I don't see things getting better. > > > What makes s2ram fragile is not human failure but the > > combination of a handful of physical property: > > > > 1) Psychology: shutting the lid or pushing the suspend button is > > a deceivingly 'simple' action to the user. But under the > > hood, a ton of stuff happens: we deinitialize a lot of > > things, we go through _all hardware state_, and we do so in a > > serial fashion. If just one piece fails to do the right > > thing, the box might not resume. Still, the user expects this > > 'simple' thing to just work, all the time. No excuses > > accepted. > > > > 2) Length of code: To get a successful s2ram sequence the kernel > > runs through tens of thousands of lines of code. Code which > > never gets executed on a normal box - only if we s2ram. If > > just one step fails, we get a hung box. > > > > 3) Debuggability: a lot of s2ram code runs with the console off, > > making any bugs hard to debug. Furthermore we have no > > meaningful persistent storage either for kernel bug messages. 
> > The RTC trick of PM_DEBUG works but is a very narrow channel > > of information and it takes a lot of time to debug a bug via > > that method. > > Yep that is an issue. > > > The combination of these factors really makes up for a perfect > > storm in terms of kernel technology: we have this > > very-deceivingly-simple-looking but complex-and-rarely-executed > > piece of code, which is very hard to debug. > > And much of this as you are finding with this piece of code > is how the software was designed rather then how the software > needed to be. > > > Even just one of these factors would be enough to make an > > otherwise healthy subsystem fragile - no wonder s2ram has been a > > problem ever since it existed in the upstream kernel. > > > > So now we need just one thing: patience and more of the same > > good stuff that happened lately. > > I think there has been some good progress, and so I am happy > to be patient. I will still mention on occasion what it > seems we are doing wrong. Unfortunately I don't have time > to do a lot more than that. > > >> > Still, it might make sense to not just use the ->disable > >> > sequence but primarily the ->shutdown irqchip method (when > >> > it's available in the irqchip). > >> > >> Disable seems fine to me. This is interesting in the context > >> of all of the irqs that will when masked show up somewhere > >> else (think boot interrupts). > >> > >> > While we obviously cannot turn off the PIC that delivers > >> > timer IRQs at this stage - there's no theoretical reason why > >> > the suspend sequence couldnt power down some secondary PICs > >> > as well - in some arch code, or maybe even in the generic > >> > driver suspend sequence if the device tree is structured > >> > carefully enough so that the PIC gets turned off last. > >> > >> If the point is simply to prevent deliver of irqs to the > >> drivers I don't see the point of anything more than what the > >> patch does now. > > > > ... 
except for the use case I described above. Say some PIC sits
> > on a piece of silicon which gets turned off. I'm not talking
> > about x86 but some custom device. We really don't want that IRQ
> > line to send half of an IRQ message (un-ACK-ed) when it gets
> > turned off. So physically 'suspending' all IRQ lines does make a
> > certain level of long-term sense.
>
> Good point. We will lose both level and edge triggered events
> that occur between suspending the irqs and restoring them but
> that is inevitable. So we might as well call shutdown and totally
> turn off the irqs if we can.
>
> I don't know where in the state machine this is getting called but
> I would suggest doing this before we shut down CPUs.

This is the plan. In fact, I'm going to do this in the next patch after the
$subject one has been tested and found acceptable.

Thanks,
Rafael

^ permalink raw reply [flat|nested] 373+ messages in thread
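[Editorial sketch, not part of the posted patches.] The idea discussed above (quiesce every device IRQ line centrally, prefer the irqchip's ->shutdown over ->disable where the chip provides it, and leave the timer IRQ alone) can be modelled in a few lines of user-space C. Everything named `toy_*` or `TOY_*` is hypothetical and exists only for this illustration; it is not the kernel's `irq_desc`, its irqchip interface, or the code in the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical flag, named after the kernel's IRQF_TIMER. */
#define TOY_IRQF_TIMER 0x1u

enum toy_irq_state { TOY_IRQ_ACTIVE, TOY_IRQ_DISABLED, TOY_IRQ_SHUT_DOWN };

/* Toy stand-in for an interrupt descriptor; not the kernel's irq_desc. */
struct toy_irq_desc {
    unsigned int flags;        /* TOY_IRQF_TIMER marks the timer line */
    bool chip_has_shutdown;    /* does the imaginary irqchip offer ->shutdown? */
    enum toy_irq_state state;
};

/*
 * Quiesce every active non-timer line: prefer "shutdown" when the chip has
 * it, fall back to "disable" otherwise. Returns the number of lines quiesced.
 */
size_t toy_suspend_device_irqs(struct toy_irq_desc *descs, size_t n)
{
    size_t quiesced = 0;

    assert(descs != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        struct toy_irq_desc *d = &descs[i];

        if (d->flags & TOY_IRQF_TIMER)
            continue;           /* the timer IRQ keeps running */
        if (d->state != TOY_IRQ_ACTIVE)
            continue;           /* already quiesced, nothing to do */
        d->state = d->chip_has_shutdown ? TOY_IRQ_SHUT_DOWN : TOY_IRQ_DISABLED;
        quiesced++;
    }
    return quiesced;
}

/* Undo toy_suspend_device_irqs(); returns the number of lines re-enabled. */
size_t toy_resume_device_irqs(struct toy_irq_desc *descs, size_t n)
{
    size_t resumed = 0;

    assert(descs != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (descs[i].state != TOY_IRQ_ACTIVE) {
            descs[i].state = TOY_IRQ_ACTIVE;
            resumed++;
        }
    }
    return resumed;
}
```

With one timer line and two device lines, a suspend pass in this model touches only the two device lines, and a resume pass brings exactly those two back. Events arriving on a quiesced line in between are simply lost, which is the trade-off Eric calls inevitable above.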
* Re: [RFC][PATCH 2/2] PM: Rework handling of interrupts during suspend-resume 2009-02-23 10:42 ` Eric W. Biederman 2009-02-23 11:03 ` Rafael J. Wysocki 2009-02-23 11:03 ` Rafael J. Wysocki @ 2009-02-23 11:04 ` Ingo Molnar 2009-02-23 14:45 ` Rafael J. Wysocki 2009-02-23 14:45 ` Rafael J. Wysocki 2009-02-23 11:04 ` Ingo Molnar 3 siblings, 2 replies; 373+ messages in thread From: Ingo Molnar @ 2009-02-23 11:04 UTC (permalink / raw) To: Eric W. Biederman Cc: Rafael J. Wysocki, Linus Torvalds, LKML, Benjamin Herrenschmidt, Jeremy Fitzhardinge, pm list, Len Brown, Jesse Barnes, Thomas Gleixner * Eric W. Biederman <ebiederm@xmission.com> wrote: > > What makes s2ram fragile is not human failure but the > > combination of a handful of physical property: > > > > 1) Psychology: shutting the lid or pushing the suspend button is > > a deceivingly 'simple' action to the user. But under the > > hood, a ton of stuff happens: we deinitialize a lot of > > things, we go through _all hardware state_, and we do so in a > > serial fashion. If just one piece fails to do the right > > thing, the box might not resume. Still, the user expects this > > 'simple' thing to just work, all the time. No excuses > > accepted. > > > > 2) Length of code: To get a successful s2ram sequence the kernel > > runs through tens of thousands of lines of code. Code which > > never gets executed on a normal box - only if we s2ram. If > > just one step fails, we get a hung box. > > > > 3) Debuggability: a lot of s2ram code runs with the console off, > > making any bugs hard to debug. Furthermore we have no > > meaningful persistent storage either for kernel bug messages. > > The RTC trick of PM_DEBUG works but is a very narrow channel > > of information and it takes a lot of time to debug a bug via > > that method. > > Yep that is an issue. 
I'd also like to add #4:

 4) One more thing that makes s2ram special is that the
    resume path often finds the hardware in an even more
    deinitialized state than during normal bootup. During
    normal bootup the BIOS/firmware has at least done some
    minimal bootstrap (to get the kernel loaded), which
    makes life easier for the kernel.

    At the s2ram stage we've got a completely pure hardware
    init state, with very minimal firmware activation.

    So we hit many of the init and deinit problems and bugs
    only in the s2ram path - a dynamic which is again not
    helpful.

> > The combination of these factors really makes up for a
> > perfect storm in terms of kernel technology: we have this
> > very-deceivingly-simple-looking but
> > complex-and-rarely-executed piece of code, which is very
> > hard to debug.
>
> And much of this as you are finding with this piece of code is
> how the software was designed rather than how the software
> needed to be.

Well, most of the 4 problems above are externalities and cannot
go away just by fixing the kernel.

 #1 will always be with us.
 #3 needs the hardware to change. It's happening, but slowly.
 #4 will be with us as long as there are non-Linux BIOSes.

 #2 is the only thing where we can make a realistic difference,
 but there's only so much we can do there.

And that still leaves the other three items, each of which is
a powerful enough force to give a bad name to any normal
subsystem.

	Ingo

^ permalink raw reply [flat|nested] 373+ messages in thread
* Re: [RFC][PATCH 2/2] PM: Rework handling of interrupts during suspend-resume 2009-02-23 11:04 ` Ingo Molnar @ 2009-02-23 14:45 ` Rafael J. Wysocki 2009-02-23 15:06 ` Ingo Molnar 2009-02-23 14:45 ` Rafael J. Wysocki 1 sibling, 1 reply; 373+ messages in thread From: Rafael J. Wysocki @ 2009-02-23 14:45 UTC (permalink / raw) To: Ingo Molnar Cc: Eric W. Biederman, Linus Torvalds, LKML, Benjamin Herrenschmidt, Jeremy Fitzhardinge, pm list, Len Brown, Jesse Barnes, Thomas Gleixner On Monday 23 February 2009, Ingo Molnar wrote: > > * Eric W. Biederman <ebiederm@xmission.com> wrote: > > > > What makes s2ram fragile is not human failure but the > > > combination of a handful of physical property: > > > > > > 1) Psychology: shutting the lid or pushing the suspend button is > > > a deceivingly 'simple' action to the user. But under the > > > hood, a ton of stuff happens: we deinitialize a lot of > > > things, we go through _all hardware state_, and we do so in a > > > serial fashion. If just one piece fails to do the right > > > thing, the box might not resume. Still, the user expects this > > > 'simple' thing to just work, all the time. No excuses > > > accepted. > > > > > > 2) Length of code: To get a successful s2ram sequence the kernel > > > runs through tens of thousands of lines of code. Code which > > > never gets executed on a normal box - only if we s2ram. If > > > just one step fails, we get a hung box. > > > > > > 3) Debuggability: a lot of s2ram code runs with the console off, > > > making any bugs hard to debug. Furthermore we have no > > > meaningful persistent storage either for kernel bug messages. > > > The RTC trick of PM_DEBUG works but is a very narrow channel > > > of information and it takes a lot of time to debug a bug via > > > that method. > > > > Yep that is an issue. 
>
> I'd also like to add #4:
>
>  4) One more thing that makes s2ram special is that the
>     resume path often finds the hardware in an even more
>     deinitialized state than during normal bootup. During
>     normal bootup the BIOS/firmware has at least done some
>     minimal bootstrap (to get the kernel loaded), which
>     makes life easier for the kernel.
>
>     At the s2ram stage we've got a completely pure hardware
>     init state, with very minimal firmware activation.

This is very true and at least in some cases done on purpose, AFAICS, due to
some timing constraints forced on HW vendors by M$, for example.

>     So we hit many of the init and deinit problems and bugs
>     only in the s2ram path - a dynamic which is again not
>     helpful.

Plus ACPI requires us to do additional things during suspend-resume that are
not done on boot-shutdown and which have their own ordering requirements (not
necessarily stated directly, but such that we have to discover experimentally).
Those also change from one BIOS to another.

> > > The combination of these factors really makes up for a
> > > perfect storm in terms of kernel technology: we have this
> > > very-deceivingly-simple-looking but
> > > complex-and-rarely-executed piece of code, which is very
> > > hard to debug.
> >
> > And much of this as you are finding with this piece of code is
> > how the software was designed rather than how the software
> > needed to be.
>
> Well, most of the 4 problems above are externalities and cannot
> go away just by fixing the kernel.
>
>  #1 will always be with us.
>  #3 needs the hardware to change. It's happening, but slowly.
>  #4 will be with us as long as there are non-Linux BIOSes.
>
>  #2 is the only thing where we can make a realistic difference,
>  but there's only so much we can do there.
>
> And that still leaves the other three items, each of which is
> a powerful enough force to give a bad name to any normal
> subsystem.

Agreed.

Thanks,
Rafael

^ permalink raw reply [flat|nested] 373+ messages in thread
* Re: [RFC][PATCH 2/2] PM: Rework handling of interrupts during suspend-resume 2009-02-23 14:45 ` Rafael J. Wysocki @ 2009-02-23 15:06 ` Ingo Molnar 0 siblings, 0 replies; 373+ messages in thread From: Ingo Molnar @ 2009-02-23 15:06 UTC (permalink / raw) To: Rafael J. Wysocki Cc: Eric W. Biederman, Linus Torvalds, LKML, Benjamin Herrenschmidt, Jeremy Fitzhardinge, pm list, Len Brown, Jesse Barnes, Thomas Gleixner * Rafael J. Wysocki <rjw@sisk.pl> wrote: > On Monday 23 February 2009, Ingo Molnar wrote: > > > > * Eric W. Biederman <ebiederm@xmission.com> wrote: > > > > > > What makes s2ram fragile is not human failure but the > > > > combination of a handful of physical property: > > > > > > > > 1) Psychology: shutting the lid or pushing the suspend button is > > > > a deceivingly 'simple' action to the user. But under the > > > > hood, a ton of stuff happens: we deinitialize a lot of > > > > things, we go through _all hardware state_, and we do so in a > > > > serial fashion. If just one piece fails to do the right > > > > thing, the box might not resume. Still, the user expects this > > > > 'simple' thing to just work, all the time. No excuses > > > > accepted. > > > > > > > > 2) Length of code: To get a successful s2ram sequence the kernel > > > > runs through tens of thousands of lines of code. Code which > > > > never gets executed on a normal box - only if we s2ram. If > > > > just one step fails, we get a hung box. > > > > > > > > 3) Debuggability: a lot of s2ram code runs with the console off, > > > > making any bugs hard to debug. Furthermore we have no > > > > meaningful persistent storage either for kernel bug messages. > > > > The RTC trick of PM_DEBUG works but is a very narrow channel > > > > of information and it takes a lot of time to debug a bug via > > > > that method. > > > > > > Yep that is an issue. 
> >
> > I'd also like to add #4:
> >
> >  4) One more thing that makes s2ram special is that the
> >     resume path often finds the hardware in an even more
> >     deinitialized state than during normal bootup. During
> >     normal bootup the BIOS/firmware has at least done some
> >     minimal bootstrap (to get the kernel loaded), which
> >     makes life easier for the kernel.
> >
> >     At the s2ram stage we've got a completely pure hardware
> >     init state, with very minimal firmware activation.
>
> This is very true and at least in some cases done on purpose,
> AFAICS, due to some timing constraints forced on HW vendors by
> M$, for example.

IMHO it's the technically sane thing to do. Personally I
trust the quirks of bare metal much more than the combined
quirks of firmware _and_ bare metal.

> >     So we hit many of the init and deinit problems and bugs
> >     only in the s2ram path - a dynamic which is again not
> >     helpful.
>
> Plus ACPI requires us to do additional things during
> suspend-resume that are not done on boot-shutdown and which
> have their own ordering requirements (not necessarily stated
> directly, but such that we have to discover experimentally).
> Those also change from one BIOS to another.

We could perhaps do a few things here to trigger bugs sooner.

For example at driver init, instead of executing just
->driver_open(), we could execute:

   ->driver_open()
   ->driver_suspend()
   ->driver_resume()

I.e. we'd simulate a suspend+resume mini-step. This makes sure
that the basic driver callbacks are sane. It is also supposed
to work because the driver is just being initialized.

This way certain types of bugs would not show up as
difficult-to-debug s2ram regressions - but would show up as
'boot hang' or 'boot crash' bugs.

This does not simulate the "big picture" resume machinery (the
dependencies, etc.), nor does it trigger any of the "hardware
got really turned off" effects that true resume will trigger -
but at least it offloads a portion of the testing space from
's2ram' to 'bootup' testing.

What's your feeling - what percentage of all s2ram regressions
in the last year or so could have been triggered this way? Let's
assume we had 100 regressions in that timeframe - would it be in
the 10 bugs range? Or much lower or much higher?

	Ingo

^ permalink raw reply [flat|nested] 373+ messages in thread
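[Editorial sketch, not code from the thread.] The open/suspend/resume mini-step proposed above can be illustrated with a small user-space model. The names `toy_dev`, `toy_driver_ops` and `toy_driver_init()` are hypothetical, as is the choice to skip drivers that lack PM callbacks; a real implementation would have to go through the bus type and the PM core rather than call the driver directly.

```c
#include <assert.h>
#include <stddef.h>

/* Toy device: one register plus the copy saved across suspend. */
struct toy_dev {
    int reg;
    int saved_reg;
    int has_saved;
};

/* Hypothetical callback table, loosely modelled on the names used above. */
struct toy_driver_ops {
    int (*open)(struct toy_dev *dev);
    int (*suspend)(struct toy_dev *dev);
    int (*resume)(struct toy_dev *dev);
};

enum toy_init_result {
    TOY_INIT_OK,
    TOY_OPEN_FAILED,
    TOY_SUSPEND_FAILED,
    TOY_RESUME_FAILED
};

/* Run open(), then immediately exercise suspend() and resume() once. */
enum toy_init_result toy_driver_init(const struct toy_driver_ops *ops,
                                     struct toy_dev *dev)
{
    assert(ops != NULL && ops->open != NULL && dev != NULL);

    if (ops->open(dev) != 0)
        return TOY_OPEN_FAILED;
    if (ops->suspend == NULL || ops->resume == NULL)
        return TOY_INIT_OK;     /* no PM callbacks: nothing to exercise */
    if (ops->suspend(dev) != 0)
        return TOY_SUSPEND_FAILED;
    if (ops->resume(dev) != 0)
        return TOY_RESUME_FAILED;
    return TOY_INIT_OK;
}

static int toy_open(struct toy_dev *dev)
{
    dev->reg = 42;
    dev->has_saved = 0;
    return 0;
}

static int toy_suspend_good(struct toy_dev *dev)
{
    dev->saved_reg = dev->reg;  /* save state before powering down */
    dev->has_saved = 1;
    dev->reg = 0;
    return 0;
}

static int toy_suspend_buggy(struct toy_dev *dev)
{
    dev->reg = 0;               /* bug: register state is never saved */
    return 0;
}

static int toy_resume(struct toy_dev *dev)
{
    if (!dev->has_saved)
        return -1;              /* nothing to restore from */
    dev->reg = dev->saved_reg;
    dev->has_saved = 0;
    return 0;
}

const struct toy_driver_ops toy_good_ops = { toy_open, toy_suspend_good, toy_resume };
const struct toy_driver_ops toy_buggy_ops = { toy_open, toy_suspend_buggy, toy_resume };
```

In this model the buggy driver fails at init time with `TOY_RESUME_FAILED` instead of failing later during a real suspend cycle, which is the "boot crash instead of s2ram regression" effect the proposal is after. It does not capture the firmware-dependent hardware state that a true resume sees.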
* Re: [RFC][PATCH 2/2] PM: Rework handling of interrupts during suspend-resume 2009-02-23 15:06 ` Ingo Molnar @ 2009-02-23 21:59 ` Rafael J. Wysocki -1 siblings, 0 replies; 373+ messages in thread From: Rafael J. Wysocki @ 2009-02-23 21:59 UTC (permalink / raw) To: Ingo Molnar Cc: Eric W. Biederman, Linus Torvalds, LKML, Benjamin Herrenschmidt, Jeremy Fitzhardinge, pm list, Len Brown, Jesse Barnes, Thomas Gleixner On Monday 23 February 2009, Ingo Molnar wrote: > > * Rafael J. Wysocki <rjw@sisk.pl> wrote: > > > On Monday 23 February 2009, Ingo Molnar wrote: > > > > > > * Eric W. Biederman <ebiederm@xmission.com> wrote: > > > > > > > > What makes s2ram fragile is not human failure but the > > > > > combination of a handful of physical property: > > > > > > > > > > 1) Psychology: shutting the lid or pushing the suspend button is > > > > > a deceivingly 'simple' action to the user. But under the > > > > > hood, a ton of stuff happens: we deinitialize a lot of > > > > > things, we go through _all hardware state_, and we do so in a > > > > > serial fashion. If just one piece fails to do the right > > > > > thing, the box might not resume. Still, the user expects this > > > > > 'simple' thing to just work, all the time. No excuses > > > > > accepted. > > > > > > > > > > 2) Length of code: To get a successful s2ram sequence the kernel > > > > > runs through tens of thousands of lines of code. Code which > > > > > never gets executed on a normal box - only if we s2ram. If > > > > > just one step fails, we get a hung box. > > > > > > > > > > 3) Debuggability: a lot of s2ram code runs with the console off, > > > > > making any bugs hard to debug. Furthermore we have no > > > > > meaningful persistent storage either for kernel bug messages. > > > > > The RTC trick of PM_DEBUG works but is a very narrow channel > > > > > of information and it takes a lot of time to debug a bug via > > > > > that method. > > > > > > > > Yep that is an issue. 
> > > > > > I'd also like to add #4: > > > > > > 4) One more thing that makes s2ram special is that when the > > > resume path finds hardware often in an even more > > > deinitialized form than during normal bootup. During > > > normal bootup the BIOS/firmware has at least done some > > > minimal bootstrap (to get the kernel loaded), which > > > makes life easier for the kernel. > > > > > > At s2ram stage we've got a completely pure hardware > > > init state, with very minimal firmware activation. > > > > This is very true and at least in some cases done on purpose, > > AFAICS, due to some timing constraints forced on HW vendors by > > M$, for example. > > IMHO i think it's the technically sane thing to do. Personally i > trust the quirks of bare metal much more than the combined > quirks of firmware _and_ bare metal. > > > > So many of the init and deinit problems and bugs we > > > only hit in the s2ram path - which dynamics is again > > > not helpful. > > > > Plus ACPI requires us to do additional things during > > suspend-resume that are not done on boot-shutdown and which > > have their own ordering requirements (not necessarily stated > > directly, but such that we have do discover experimentally). > > That also change from one BIOS to another. > > We could perhaps do a few things here to trigger bugs sooner. > > For example at driver init, instead of executing just > ->driver_open(), we could execute: > > ->driver_open() > ->driver_suspend() > ->driver_resume() I'm not sure. On PCI we run some code apart from the driver's suspend and resume callbacks, especially in the new framework, and the bus type executes the driver callbacks. > I.e. we'd simulate a suspend+resume mini-step. This makes it > sure that the basic driver callbacks are sane. It is also > supposed to work because the driver is just being initialized. 
>
> This way certain types of bugs would not show up as
> difficult-to-debug s2ram regressions - but would show up as
> 'boot hang' or 'boot crash' bugs.

There is a testing facility exactly for this (/sys/power/pm_test) that allows
you to simulate the entire suspend sequence without suspending, as well as some
separate pieces of it. Still, it doesn't work very well, because the
conditions in which the resume callbacks are run differ substantially from
the conditions right after we get control from the BIOS.

For example, if ->suspend() puts the device into D3, then your simulated
->resume() will get the device in D3, while the BIOS would probably put it into
D0 (at least as far as PCI devices are concerned).

> This does not simulate the "big picture" resume machinery (the
> dependencies, etc.), nor does it trigger any of the "hardware
> got really turned off" effects that true resume will trigger -
> but at least it offloads a portion of the testing space from
> 's2ram' to 'bootup' testing.
>
> What's your feeling - what percentage of all s2ram regressions
> in the last year or so could have been triggered this way? Let's
> assume we had 100 regressions in that timeframe - would it be in
> the 10 bugs range? Or much lower or much higher?

A very small number of actual bugs, with rather a lot of false positives.

IMO there are three basic sources of recent suspend regressions:

1) Arch-dependent changes (x86 mostly) and low-level changes affecting suspend
   (like PCI bus enumeration, IOMMU etc.), where people didn't realize their
   modifications would have a broader effect.

2) PM core changes where we weren't sure what was the best way to go (probably
   I'm to blame for the majority of these).

3) Changes related to graphics (this has always been difficult, but is getting
   much better now).

Driver regressions, other than the graphics-related ones, are really a very
small fraction. Well, there still are some known unsolved problems, but
that's a different matter.
Thanks, Rafael ^ permalink raw reply [flat|nested] 373+ messages in thread
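[Editorial sketch, not code from the thread.] The D3-versus-D0 mismatch described above is why a simulated resume can report failures that a real resume would not. A toy model follows, assuming (as the message says is probable for PCI devices) that firmware hands the device back in D0 on a real resume. All `toy_*` and `TOY_*` names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

enum toy_pci_power { TOY_D0, TOY_D3 };

struct toy_pci_dev {
    enum toy_pci_power state;
};

/* Toy ->suspend(): put the device into D3. */
void toy_suspend(struct toy_pci_dev *dev)
{
    assert(dev != NULL);
    dev->state = TOY_D3;
}

/* Toy ->resume() written against real resume: it expects to find D0. */
int toy_resume_expects_d0(const struct toy_pci_dev *dev)
{
    assert(dev != NULL);
    return dev->state == TOY_D0 ? 0 : -1;
}

/* Simulated cycle: nothing sits between suspend and resume. */
int toy_simulated_cycle(struct toy_pci_dev *dev)
{
    toy_suspend(dev);
    return toy_resume_expects_d0(dev);  /* sees D3, left by suspend */
}

/* Real cycle in this model: firmware is assumed to restore D0 first. */
int toy_real_cycle(struct toy_pci_dev *dev)
{
    toy_suspend(dev);
    dev->state = TOY_D0;                /* assumption: BIOS hands back D0 */
    return toy_resume_expects_d0(dev);
}
```

Here `toy_simulated_cycle()` fails while `toy_real_cycle()` succeeds for the same callbacks, so the simulated failure is a false positive. The opposite case, a resume callback that only works when it finds D3, would pass the simulation and break on real hardware.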
* Re: [RFC][PATCH 2/2] PM: Rework handling of interrupts during suspend-resume 2009-02-23 9:44 ` Ingo Molnar 2009-02-23 10:42 ` Eric W. Biederman @ 2009-02-23 10:42 ` Eric W. Biederman 1 sibling, 0 replies; 373+ messages in thread From: Eric W. Biederman @ 2009-02-23 10:42 UTC (permalink / raw) To: Ingo Molnar Cc: Jeremy Fitzhardinge, LKML, Jesse Barnes, pm list, Linus Torvalds, Thomas Gleixner Ingo Molnar <mingo@elte.hu> writes: > * Eric W. Biederman <ebiederm@xmission.com> wrote: > >> Ingo Molnar <mingo@elte.hu> writes: >> >> > I think this aspect has been well-understood during the >> > discussion of this topic and it's just a slightly misleading >> > changelog. >> >> As I was a member of that discussion I did not see that. >> >> It took me several passes through the patches to realize the >> goal is to allow drivers to be able to sleep while they are in >> their late pm shutdown routines. >> >> Why we want this I don't know. But it seems simple enough to >> implement, and it makes it harder to get the late pm suspend >> routines wrong, which is always good. > > That's not the only goal. The other goal is to further shrink a > particular window of suspend fragility: the irqs-disabled stage > of the suspend/resume sequence. > > Since suspend/resume is a mini-reboot sequence, there's a large > amount of code executed - and the variety of code is large as > well. We had repeat cases of random drivers re-enabling > interrupts and thus breaking other drivers - and these are nasty > to debug. > > So this patchset disables device IRQs centrally and serializes > with pending work - so there's no races with pending IRQs > anymore. > > The fact that we keep the timer irq running is two-fold: firstly > the timer code is special and not really part of the regular > suspend/resume sequence. > > Drivers want to take timestamps, sometimes they even want to do > a small usleep(), etc. 
Ideally the suspend/resume code is pretty > much _the same_ as a regular bootup (and shutdown) code - so we > want to provide a similar environment to how drivers initialize > and deinitialize, and we want to enable them to share code > between bootup/shutdown and suspend/resume agressively. > > So the more generic kernel environment we give these fragile > handlers, the better we are off in the end. Since we already had > IRQS_TIMER, that was just the natural thing to do. I am all for sharing code, especially if we can find common factors that do the same thing. I don't know how many times I have found drivers doing something weird in their shutdown routines that they don't know how to get the device out of. The e1000 driver has shown up several times because it likes to suspend the device on shutdown. The fact that the methods exposed to drivers were only defined to be usable on the s2ram/hibernate path is something I have brought up on more than one occasion as a bad choice. I'm really not convinced that the rationale for separating out the shutdown methods from the remove methods has been very good. That rationale being: we don't need to clean up the in-kernel data structures on reboot, so why do something extra that can introduce instability? So I have been watching a smaller form of this drama on the reboot path for several years, and that was with a device method with fixed semantics, not the dwm semantics of the historical suspend routine. I expect there is still a way to go before it is simple and easy for drivers to figure out what they need to implement out of the confusing variety of possible device methods. >> > The new suspend code does not rely on truly disabling IRQs >> > on the low level. The purpose is to not get IRQs to drivers >> > - which might crash/hang/race/misbehave. >> >> Reasonable. I expect one of the problems with drivers getting >> it wrong is that the interface is too complex for mortal >> humans to understand.
> > The suspend/resume state machine certainly used to be a piece of > code that makes a seasoned kernel developer weep in fear. > > That has changed drastically in the past few months. The > suspend+hibernation logic got unified (at least as far as driver > methods go), and all the flow and ordering has been cleaned up > and has been made more robust. I will have to look again. My impression is that overloading a single method is part of what got us into this mess in the first place. Not that I don't see things getting better. > What makes s2ram fragile is not human failure but the > combination of a handful of physical property: > > 1) Psychology: shutting the lid or pushing the suspend button is > a deceivingly 'simple' action to the user. But under the > hood, a ton of stuff happens: we deinitialize a lot of > things, we go through _all hardware state_, and we do so in a > serial fashion. If just one piece fails to do the right > thing, the box might not resume. Still, the user expects this > 'simple' thing to just work, all the time. No excuses > accepted. > > 2) Length of code: To get a successful s2ram sequence the kernel > runs through tens of thousands of lines of code. Code which > never gets executed on a normal box - only if we s2ram. If > just one step fails, we get a hung box. > > 3) Debuggability: a lot of s2ram code runs with the console off, > making any bugs hard to debug. Furthermore we have no > meaningful persistent storage either for kernel bug messages. > The RTC trick of PM_DEBUG works but is a very narrow channel > of information and it takes a lot of time to debug a bug via > that method. Yep that is an issue. > The combination of these factors really makes up for a perfect > storm in terms of kernel technology: we have this > very-deceivingly-simple-looking but complex-and-rarely-executed > piece of code, which is very hard to debug.
And much of this as you are finding with this piece of code is how the software was designed rather than how the software needed to be. > Even just one of these factors would be enough to make an > otherwise healthy subsystem fragile - no wonder s2ram has been a > problem ever since it existed in the upstream kernel. > > So now we need just one thing: patience and more of the same > good stuff that happened lately. I think there has been some good progress, and so I am happy to be patient. I will still mention on occasion what it seems we are doing wrong. Unfortunately I don't have time to do a lot more than that. >> > Still, it might make sense to not just use the ->disable >> > sequence but primarily the ->shutdown irqchip method (when >> > it's available in the irqchip). >> >> Disable seems fine to me. This is interesting in the context >> of all of the irqs that will when masked show up somewhere >> else (think boot interrupts). >> >> > While we obviously cannot turn off the PIC that delivers >> > timer IRQs at this stage - there's no theoretical reason why >> > the suspend sequence couldnt power down some secondary PICs >> > as well - in some arch code, or maybe even in the generic >> > driver suspend sequence if the device tree is structured >> > carefully enough so that the PIC gets turned off last. >> >> If the point is simply to prevent deliver of irqs to the >> drivers I don't see the point of anything more than what the >> patch does now. > > ... except for the usecase i described above. Say some PIC sits > on a piece of silicon which gets turned off. I'm not talking > about x86 but some custom device. We really dont want that IRQ > line to send half of an IRQ message (un-ACK-ed) when it gets > turned off. So physically 'suspending' all IRQ lines does make a > certain level of long-term sense. Good point. We will lose both level and edge triggered events that occur between suspending the irqs and restoring them but that is inevitable.
So we might as well call shutdown and totally turn off the irqs if we can. I don't know where in the state machine this is getting called but I would suggest doing this before we shut down cpus. We are quickly reaching the point where laptops will exceed the 8 core limit of lowest priority delivery mode. And only in lowest priority delivery mode is it possible to migrate irqs outside of the interrupt handlers. That, plus the fact that suspending the irqs before shutting down the cpus means we can safely support more vectors than a single cpu can catch. I was a little worried about the shutdown code path because it requires in the worst case acking a level triggered irq when we have it disabled, but looking at ack_apic_level that appears to be a well tested code path. We just can't reprogram the vector. > There _might_ be one downside: overhead of ->shutdown() methods. > With a typical IRQ count on the typical netbook i doubt it's > more than ~50 usecs combined. Eric ^ permalink raw reply [flat|nested] 373+ messages in thread
* Re: [RFC][PATCH 2/2] PM: Rework handling of interrupts during suspend-resume 2009-02-23 9:22 ` Eric W. Biederman 2009-02-23 9:44 ` Ingo Molnar @ 2009-02-23 9:44 ` Ingo Molnar 2009-02-23 10:13 ` Benjamin Herrenschmidt 2009-02-23 10:13 ` Benjamin Herrenschmidt 3 siblings, 0 replies; 373+ messages in thread From: Ingo Molnar @ 2009-02-23 9:44 UTC (permalink / raw) To: Eric W. Biederman Cc: Jeremy Fitzhardinge, LKML, Jesse Barnes, pm list, Linus Torvalds, Thomas Gleixner * Eric W. Biederman <ebiederm@xmission.com> wrote: > Ingo Molnar <mingo@elte.hu> writes: > > > I think this aspect has been well-understood during the > > discussion of this topic and it's just a slightly misleading > > changelog. > > As I was a member of that discussion I did not see that. > > It took me several passes through the patches to realize the > goal is to allow drivers to be able to sleep while they are in > their late pm shutdown routines. > > Why we want this I don't know. But it seems simple enough to > implement, and it makes it harder to get the late pm suspend > routines wrong, which is always good. That's not the only goal. The other goal is to further shrink a particular window of suspend fragility: the irqs-disabled stage of the suspend/resume sequence. Since suspend/resume is a mini-reboot sequence, there's a large amount of code executed - and the variety of code is large as well. We had repeat cases of random drivers re-enabling interrupts and thus breaking other drivers - and these are nasty to debug. So this patchset disables device IRQs centrally and serializes with pending work - so there's no races with pending IRQs anymore. The fact that we keep the timer irq running is two-fold: firstly the timer code is special and not really part of the regular suspend/resume sequence. Drivers want to take timestamps, sometimes they even want to do a small usleep(), etc. 
Ideally the suspend/resume code is pretty much _the same_ as a regular bootup (and shutdown) code - so we want to provide a similar environment to how drivers initialize and deinitialize, and we want to enable them to share code between bootup/shutdown and suspend/resume agressively. So the more generic kernel environment we give these fragile handlers, the better we are off in the end. Since we already had IRQS_TIMER, that was just the natural thing to do. > > The new suspend code does not rely on truly disabling IRQs > > on the low level. The purpose is to not get IRQs to drivers > > - which might crash/hang/race/misbehave. > > Reasonable. I expect one of the problems with drivers getting > it wrong is that the interface is too complex for mortal > humans to understand. The suspend/resume state machine certainly used to be a piece of code that makes a seasoned kernel developer weep in fear. That has changed drastically in the past few months. The suspend+hibernation logic got unified (at least as far as driver methods go), and all the flow and ordering has been cleaned up and has been made more robust. What makes s2ram fragile is not human failure but the combination of a handful of physical property: 1) Psychology: shutting the lid or pushing the suspend button is a deceivingly 'simple' action to the user. But under the hood, a ton of stuff happens: we deinitialize a lot of things, we go through _all hardware state_, and we do so in a serial fashion. If just one piece fails to do the right thing, the box might not resume. Still, the user expects this 'simple' thing to just work, all the time. No excuses accepted. 2) Length of code: To get a successful s2ram sequence the kernel runs through tens of thousands of lines of code. Code which never gets executed on a normal box - only if we s2ram. If just one step fails, we get a hung box. 3) Debuggability: a lot of s2ram code runs with the console off, making any bugs hard to debug. 
Furthermore we have no meaningful persistent storage either for kernel bug messages. The RTC trick of PM_DEBUG works but is a very narrow channel of information and it takes a lot of time to debug a bug via that method. The combination of these factors really makes up for a perfect storm in terms of kernel technology: we have this very-deceivingly-simple-looking but complex-and-rarely-executed piece of code, which is very hard to debug. Even just one of these factors would be enough to make an otherwise healthy subsystem fragile - no wonder s2ram has been a problem ever since it existed in the upstream kernel. So now we need just one thing: patience and more of the same good stuff that happened lately. > > Still, it might make sense to not just use the ->disable > > sequence but primarily the ->shutdown irqchip method (when > > it's available in the irqchip). > > Disable seems fine to me. This is interesting in the context > of all of the irqs that will when masked show up somewhere > else (think boot interrupts). > > > While we obviously cannot turn off the PIC that delivers > > timer IRQs at this stage - there's no theoretical reason why > > the suspend sequence couldnt power down some secondary PICs > > as well - in some arch code, or maybe even in the generic > > driver suspend sequence if the device tree is structured > > carefully enough so that the PIC gets turned off last. > > If the point is simply to prevent deliver of irqs to the > drivers I don't see the point of anything more than what the > patch does now. ... except for the usecase i described above. Say some PIC sits on a piece of silicon which gets turned off. I'm not talking about x86 but some custom device. We really dont want that IRQ line to send half of an IRQ message (un-ACK-ed) when it gets turned off. So physically 'suspending' all IRQ lines does make a certain level of long-term sense. Especially if it's just 3 extra lines of code to the existing patch. 
There _might_ be one downside: overhead of ->shutdown() methods. With a typical IRQ count on the typical netbook i doubt it's more than ~50 usecs combined. Ingo ^ permalink raw reply [flat|nested] 373+ messages in thread
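[Editorial sketch, not part of the thread] The "prefer ->shutdown when the irqchip provides it, else fall back to ->disable" idea discussed above can be modelled in a few lines of userspace C. Everything here is hypothetical: struct model_chip, model_quiesce_line() and the counting callbacks are illustration names, not the kernel's struct irq_chip or anything from the patch.

```c
#include <stddef.h>

/* Hypothetical stand-in for an irqchip: ->shutdown is optional. */
struct model_chip {
	void (*disable)(unsigned int irq);
	void (*shutdown)(unsigned int irq);	/* may be NULL */
};

/* Counters so the effect of each callback can be observed. */
static int model_disable_calls;
static int model_shutdown_calls;

static void model_count_disable(unsigned int irq)
{
	(void)irq;
	model_disable_calls++;
}

static void model_count_shutdown(unsigned int irq)
{
	(void)irq;
	model_shutdown_calls++;
}

/*
 * Quiesce one line for suspend: use the stronger ->shutdown when the
 * chip implements it, otherwise fall back to ->disable.
 */
static void model_quiesce_line(const struct model_chip *chip, unsigned int irq)
{
	if (chip->shutdown)
		chip->shutdown(irq);
	else
		chip->disable(irq);
}
```

Whether the real suspend path should do this is exactly what the thread is debating; the sketch only shows how small the fallback logic is.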
* Re: [RFC][PATCH 2/2] PM: Rework handling of interrupts during suspend-resume 2009-02-23 9:22 ` Eric W. Biederman 2009-02-23 9:44 ` Ingo Molnar 2009-02-23 9:44 ` Ingo Molnar @ 2009-02-23 10:13 ` Benjamin Herrenschmidt 2009-02-23 10:13 ` Benjamin Herrenschmidt 3 siblings, 0 replies; 373+ messages in thread From: Benjamin Herrenschmidt @ 2009-02-23 10:13 UTC (permalink / raw) To: Eric W. Biederman Cc: Jeremy Fitzhardinge, LKML, Jesse Barnes, Thomas Gleixner, Ingo Molnar, Linus Torvalds, pm list On Mon, 2009-02-23 at 01:22 -0800, Eric W. Biederman wrote: > Ingo Molnar <mingo@elte.hu> writes: > > > > I think this aspect has been well-understood during the > > discussion of this topic and it's just a slightly misleading > > changelog. > > As I was a member of that discussion I did not see that. > > It took me several passes through the patches to realize > the goal is to allow drivers to be able to sleep while they > are in their late pm shutdown routines. > > Why we want this I don't know. But it seems simple enough > to implement, and it makes it harder to get the late pm > suspend routines wrong, which is always good. To simplify (it's really all in the discussion we had the last few weeks) It boils down to being able to do the proper ACPI calls (which require core interrupts to be on, ie, ACPI uses mutexes, sleeps, etc...) after we have saved and before we restore the PCI config space, in the late suspend or early resume stages of devices. Cheers, Ben. ^ permalink raw reply [flat|nested] 373+ messages in thread
* Re: [RFC][PATCH 2/2] PM: Rework handling of interrupts during suspend-resume 2009-02-22 23:48 ` Rafael J. Wysocki ` (2 preceding siblings ...) 2009-02-23 3:04 ` Eric W. Biederman @ 2009-02-23 3:04 ` Eric W. Biederman 2009-02-23 8:36 ` Ingo Molnar 2009-02-23 8:36 ` Ingo Molnar 5 siblings, 0 replies; 373+ messages in thread From: Eric W. Biederman @ 2009-02-23 3:04 UTC (permalink / raw) To: Rafael J. Wysocki Cc: Jeremy Fitzhardinge, LKML, Jesse Barnes, Thomas Gleixner, Ingo Molnar, Linus Torvalds, pm list "Rafael J. Wysocki" <rjw@sisk.pl> writes: > On Sunday 22 February 2009, Rafael J. Wysocki wrote: >> On Sunday 22 February 2009, Linus Torvalds wrote: >> > >> > On Sun, 22 Feb 2009, Rafael J. Wysocki wrote: > [--snip--] >> >> Thanks a lot for your comments, I'll send an updated patch shortly. > > The updated patch is appended. > > It has been initially tested, but requires more testing, especially with APM, > XEN, kexec jump etc. > > Thanks, > Rafael > > --- > From: Rafael J. Wysocki <rjw@sisk.pl> > Subject: PM: Rework handling of interrupts during suspend-resume (rev. 2) > > Introduce two helper functions allowing us to disable device > interrupts (at the IO-APIC level) during suspend or hibernation > and enable them during the subsequent resume, respectively, so that > the timer interrupts are enabled while "late" suspend callbacks and > "early" resume callbacks provided by device drivers are being > executed. > > Use these functions to rework the handling of interrupts during > suspend (hibernation) and resume. Namely, interrupts will only be > disabled on the CPU right before suspending sysdevs, while device > interrupts will be disabled (at the IO-APIC level), with the help of > the new helper function, before calling "late" suspend callbacks > provided by device drivers and analogously during resume. I don't have an issue with the code, but I do have an issue with this description of it. Calling disable especially for ioapics does nothing directly. 
It simply arranges for the irq to be marked pending and for the irq to be masked if the irq happens. So what you are doing is arranging so that no interrupts will be delivered to drivers. Not really disabling interrupts at the IO-APIC level. In addition not all interrupts (even on x86) go through an IO-APIC anymore so describing the patch in terms of an IO-APIC makes it a bit hard to understand what your intent actually is. Eric ^ permalink raw reply [flat|nested] 373+ messages in thread
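[Editorial sketch, not part of the thread] Eric's point, that "disable" touches no hardware and the line is only masked if an interrupt actually arrives, can be illustrated with a small userspace model. This is an assumption-laden toy: struct model_irq and the model_* functions are invented for illustration and do not reproduce the genirq code.

```c
#include <stdbool.h>

/* Hypothetical userspace model of one interrupt line; not kernel code. */
struct model_irq {
	bool disabled;	/* software asked for no delivery */
	bool hw_masked;	/* line actually masked at the controller */
	bool pending;	/* an interrupt arrived while disabled */
	int delivered;	/* number of times the handler ran */
};

/* "Disable" touches no hardware: it only records the request. */
static void model_disable(struct model_irq *irq)
{
	irq->disabled = true;
}

/* Called when the device raises the line. */
static void model_raise(struct model_irq *irq)
{
	if (irq->hw_masked)
		return;			/* controller drops it */
	if (irq->disabled) {
		irq->pending = true;	/* remember it for later */
		irq->hw_masked = true;	/* mask only now, lazily */
		return;
	}
	irq->delivered++;		/* normal delivery to the driver */
}

/* Enable unmasks the line and replays one remembered interrupt. */
static void model_enable(struct model_irq *irq)
{
	irq->disabled = false;
	irq->hw_masked = false;
	if (irq->pending) {
		irq->pending = false;
		irq->delivered++;
	}
}
```

In this model the driver never sees an interrupt between model_disable() and model_enable(), which matches the intent Eric describes ("no interrupts will be delivered to drivers") without claiming anything is switched off at the controller up front.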
* Re: [RFC][PATCH 2/2] PM: Rework handling of interrupts during suspend-resume 2009-02-22 23:48 ` Rafael J. Wysocki ` (3 preceding siblings ...) 2009-02-23 3:04 ` Eric W. Biederman @ 2009-02-23 8:36 ` Ingo Molnar 2009-02-23 8:36 ` Ingo Molnar 5 siblings, 0 replies; 373+ messages in thread From: Ingo Molnar @ 2009-02-23 8:36 UTC (permalink / raw) To: Rafael J. Wysocki Cc: Jeremy Fitzhardinge, LKML, Jesse Barnes, Eric W. Biederman, pm list, Linus Torvalds, Thomas Gleixner * Rafael J. Wysocki <rjw@sisk.pl> wrote: > On Sunday 22 February 2009, Rafael J. Wysocki wrote: > > On Sunday 22 February 2009, Linus Torvalds wrote: > > > > > > On Sun, 22 Feb 2009, Rafael J. Wysocki wrote: > [--snip--] > > > > Thanks a lot for your comments, I'll send an updated patch shortly. > > The updated patch is appended. > > It has been initially tested, but requires more testing, > especially with APM, XEN, kexec jump etc. > arch/x86/kernel/apm_32.c | 20 ++++++++++++---- > drivers/xen/manage.c | 32 +++++++++++++++---------- > include/linux/interrupt.h | 3 ++ > include/linux/irq.h | 1 > kernel/irq/manage.c | 57 ++++++++++++++++++++++++++++++++++++++++++++++ > kernel/kexec.c | 10 ++++---- > kernel/power/disk.c | 46 +++++++++++++++++++++++++++++-------- > kernel/power/main.c | 20 +++++++++++----- > 8 files changed, 152 insertions(+), 37 deletions(-) > > Index: linux-2.6/kernel/irq/manage.c > =================================================================== > --- linux-2.6.orig/kernel/irq/manage.c > +++ linux-2.6/kernel/irq/manage.c > @@ -746,3 +746,60 @@ int request_irq(unsigned int irq, irq_ha > return retval; > } > EXPORT_SYMBOL(request_irq); > + > +#ifdef CONFIG_PM_SLEEP > +/** > + * suspend_device_irqs - disable all currently enabled interrupt lines Code placement nit: please dont put new #ifdef blocks into the core IRQ code, add a kernel/irq/power.c file instead and make the kbuild rule depend on PM_SLEEP. 
The new suspend_device_irqs() and resume_device_irqs() doesnt use any manage.c internals so this should work straight away. > + * > + * During system-wide suspend or hibernation device interrupts need to be > + * disabled at the chip level and this function is provided for this > + * purpose. It disables all interrupt lines that are enabled at the > + * moment and sets the IRQ_SUSPENDED flag for them. > + */ > +void suspend_device_irqs(void) > +{ > + struct irq_desc *desc; > + int irq; > + > + for_each_irq_desc(irq, desc) { > + unsigned long flags; > + > + spin_lock_irqsave(&desc->lock, flags); > + > + if (!desc->depth && desc->action > + && !(desc->action->flags & IRQF_TIMER)) { > + desc->depth++; > + desc->status |= IRQ_DISABLED | IRQ_SUSPENDED; > + desc->chip->disable(irq); > + } > + > + spin_unlock_irqrestore(&desc->lock, flags); > + } > + > + for_each_irq_desc(irq, desc) { > + if (desc->status & IRQ_SUSPENDED) > + synchronize_irq(irq); > + } Optimization/code-flow nit: a possibility might be to do a single loop, i.e. i think it's safe to couple the disable+sync bits [as in 99.99% of the cases there will be no in-execution irq handlers when we execute this.] Something like: int do_sync = 0; spin_lock_irqsave(&desc->lock, flags); if (!desc->depth && desc->action && !(desc->action->flags & IRQF_TIMER)) { desc->depth++; desc->status |= IRQ_DISABLED | IRQ_SUSPENDED; desc->chip->disable(irq); do_sync = 1; } spin_unlock_irqrestore(&desc->lock, flags); if (do_sync) synchronize_irq(irq); In fact i'd suggest to factor out this logic into a separate __suspend_irq(irq) / __resume_irq(irq) inline helper functions. (They should be inline for the time being as they are not shared-irq-safe so they shouldnt really be exposed to drivers in such a singular capacity.) Doing so will also fix the line-break ugliness of the first branch - as in a standalone function the condition fits into a single line. 
There's a performance reason as well: especially when we have a lot of IRQ descriptors that will be about twice as fast. (with a large iteration scope this function is cachemiss-limited and doing this passes doubles the cachemiss rate.) > +} > +EXPORT_SYMBOL_GPL(suspend_device_irqs); > + > +/** > + * resume_device_irqs - enable interrupts disabled by suspend_device_irqs() > + * > + * Enable all interrupt lines previously disabled by suspend_device_irqs() > + * that have the IRQ_SUSPENDED flag set. > + */ > +void resume_device_irqs(void) > +{ > + struct irq_desc *desc; > + int irq; > + > + for_each_irq_desc(irq, desc) { > + if (!(desc->status & IRQ_SUSPENDED)) > + continue; > + desc->status &= ~IRQ_SUSPENDED; > + enable_irq(irq); > + } Robustness+optimization nit: this will work but could be done in a nicer way: enable_irq() should auto-clear IRQ_SUSPENDED. (We already clear flags there so it's even a tiny bit faster this way.) We definitely dont want IRQ_SUSPENDED to 'leak' out into an enabled line, should something call enable_irq() on a suspended line. So either make it auto-unsuspend in enable_irq(), or add an extra WARN_ON() to enable_irq(), to make sure IRQ_SUSPENDED is always off by that time. > + arch_suspend_disable_irqs(); > + BUG_ON(!irqs_disabled()); Please. We just disabled all devices - a BUG_ON() is a very counter-productive thing to do here - chances are the user will never see anything but a hang. So please turn this into a nice WARN_ONCE(). > --- linux-2.6.orig/include/linux/interrupt.h > +++ linux-2.6/include/linux/interrupt.h > @@ -470,4 +470,7 @@ extern int early_irq_init(void); > extern int arch_early_irq_init(void); > extern int arch_init_chip_data(struct irq_desc *desc, int cpu); > > +extern void suspend_device_irqs(void); > +extern void resume_device_irqs(void); Header cleanliness nit: please dont just throw new prototypes to the tail of headers, but think about where they fit in best, logically. 
These two new prototypes should go straight after the normal irq line state management functions: extern void disable_irq_nosync(unsigned int irq); extern void disable_irq(unsigned int irq); extern void enable_irq(unsigned int irq); Perhaps also with a comment like this: /* * Note: dont use these functions in driver code - they are for * core kernel use only. */ > +++ linux-2.6/kernel/power/main.c [...] > + > + Unlock: > + resume_device_irqs(); Small drive-by style nit: while at it could you please fix the capitalization and the naming of the label (and all labels in this file)? The standard label is "out_unlock". [and "err_unlock" for failure cases - but this isnt a failure case.] There's 43 such bad label names in kernel/power/*.c, see the output of: git grep '^ [A-Z][a-z].*:$' kernel/power/ > Index: linux-2.6/arch/x86/kernel/apm_32.c > =================================================================== > --- linux-2.6.orig/arch/x86/kernel/apm_32.c > +++ linux-2.6/arch/x86/kernel/apm_32.c > + > + suspend_device_irqs(); > device_power_down(PMSG_SUSPEND); > + > + local_irq_disable(); hm, this is a very repetitive pattern, all around the various suspend/resume variants. Might make sense to make: device_power_down(PMSG_SUSPEND); do the irq line disabling plus the local irq disabling automatically. That also means it cannot be forgotten. The symmetric action should happen for PMSG_RESUME. Is there ever a case where we want a different pattern? > Index: linux-2.6/drivers/xen/manage.c > =================================================================== > --- linux-2.6.orig/drivers/xen/manage.c > +++ linux-2.6/drivers/xen/manage.c > @@ -39,12 +39,6 @@ static int xen_suspend(void *data) > - if (!*cancelled) { > - xen_irq_resume(); > - xen_console_resume(); > - xen_timer_resume(); This change needs a second look. xen_suspend() is a stop_machine() handler and as such executes on specific CPUs, and your change modifies this. 
OTOH, i had a look at these handlers and it all looks safe. Jeremy? > +resume_devices: > + resume_device_irqs(); Small style nit: labels should start with a space character. I.e. it should be: > + resume_devices: > + resume_device_irqs(); > +++ linux-2.6/kernel/kexec.c > @@ -1454,7 +1454,7 @@ int kernel_kexec(void) > if (error) > goto Resume_devices; > device_pm_lock(); > - local_irq_disable(); > + suspend_device_irqs(); > /* At this point, device_suspend() has been called, > * but *not* device_power_down(). We *must* > * device_power_down() now. Otherwise, drivers for > @@ -1464,8 +1464,9 @@ int kernel_kexec(void) > */ > error = device_power_down(PMSG_FREEZE); > if (error) > - goto Enable_irqs; > + goto Resume_irqs; > > + local_irq_disable(); > /* Suspend system devices */ > error = sysdev_suspend(PMSG_FREEZE); > if (error) > @@ -1484,9 +1485,10 @@ int kernel_kexec(void) > if (kexec_image->preserve_context) { > sysdev_resume(); > Power_up_devices: > - device_power_up(PMSG_RESTORE); > - Enable_irqs: > local_irq_enable(); > + device_power_up(PMSG_RESTORE); > + Resume_irqs: > + resume_device_irqs(); > device_pm_unlock(); > enable_nonboot_cpus(); > Resume_devices: (same comment about label style applies here too.) 
> Index: linux-2.6/include/linux/irq.h > =================================================================== > --- linux-2.6.orig/include/linux/irq.h > +++ linux-2.6/include/linux/irq.h > @@ -65,6 +65,7 @@ typedef void (*irq_flow_handler_t)(unsig > #define IRQ_SPURIOUS_DISABLED 0x00800000 /* IRQ was disabled by the spurious trap */ > #define IRQ_MOVE_PCNTXT 0x01000000 /* IRQ migration from process context */ > #define IRQ_AFFINITY_SET 0x02000000 /* IRQ affinity was set from userspace*/ > +#define IRQ_SUSPENDED 0x04000000 /* IRQ has gone through suspend sequence */ > > #ifdef CONFIG_IRQ_PER_CPU > # define CHECK_IRQ_PER_CPU(var) ((var) & IRQ_PER_CPU) Note, you should probably make PM_SLEEP depend on GENERIC_HARDIRQS - as this change will break the build on all non-genirq architectures. (sparc, alpha, etc.) Ingo ^ permalink raw reply [flat|nested] 373+ messages in thread
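[Editorial sketch, not part of the thread] Two of the review suggestions above, a single loop that couples disable and sync per line, and an enable path that clears the suspended flag itself, can be modelled in userspace C. All names (struct model_desc, the MODEL_* flags, the model_* functions) are hypothetical; locking is omitted and the synchronize_irq() call is replaced by a counter, so this is a shape sketch under those assumptions rather than the patch.

```c
#include <stddef.h>

#define MODEL_IRQ_DISABLED	0x1u
#define MODEL_IRQ_SUSPENDED	0x2u
#define MODEL_IRQF_TIMER	0x4u

/* Hypothetical, simplified descriptor; not the kernel's struct irq_desc. */
struct model_desc {
	unsigned int status;
	unsigned int depth;		/* nested disable count */
	unsigned int action_flags;	/* flags of the installed handler */
	int has_action;			/* a handler is installed */
	int synced;			/* counts stand-in synchronize calls */
};

/* Per-line helper: returns 1 if the line was suspended by this call. */
static int model_suspend_irq(struct model_desc *d)
{
	if (d->depth || !d->has_action || (d->action_flags & MODEL_IRQF_TIMER))
		return 0;
	d->depth++;
	d->status |= MODEL_IRQ_DISABLED | MODEL_IRQ_SUSPENDED;
	return 1;
}

/* Single pass: suspend a line, then immediately wait for its handlers. */
static void model_suspend_device_irqs(struct model_desc *descs, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		if (model_suspend_irq(&descs[i]))
			descs[i].synced++;	/* stands in for synchronize_irq() */
	}
}

/* Enable clears SUSPENDED itself, so the flag cannot leak to a live line. */
static void model_enable_irq(struct model_desc *d)
{
	if (d->depth == 0)
		return;		/* unbalanced enable: ignored in this model */
	if (--d->depth == 0)
		d->status &= ~(MODEL_IRQ_DISABLED | MODEL_IRQ_SUSPENDED);
}

/* Resume only the lines that the suspend pass marked. */
static void model_resume_device_irqs(struct model_desc *descs, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		if (descs[i].status & MODEL_IRQ_SUSPENDED)
			model_enable_irq(&descs[i]);
	}
}
```

Lines that were already disabled before suspend, and timer lines, are left alone by both passes, which is the property the IRQ_SUSPENDED flag exists to guarantee in the patch under review.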
* Re: [RFC][PATCH 2/2] PM: Rework handling of interrupts during suspend-resume 2009-02-22 23:48 ` Rafael J. Wysocki ` (4 preceding siblings ...) 2009-02-23 8:36 ` Ingo Molnar @ 2009-02-23 8:36 ` Ingo Molnar 2009-02-23 11:29 ` Rafael J. Wysocki 2009-02-23 11:29 ` Rafael J. Wysocki 5 siblings, 2 replies; 373+ messages in thread From: Ingo Molnar @ 2009-02-23 8:36 UTC (permalink / raw) To: Rafael J. Wysocki Cc: Linus Torvalds, LKML, Eric W. Biederman, Benjamin Herrenschmidt, Jeremy Fitzhardinge, pm list, Len Brown, Jesse Barnes, Thomas Gleixner * Rafael J. Wysocki <rjw@sisk.pl> wrote: > On Sunday 22 February 2009, Rafael J. Wysocki wrote: > > On Sunday 22 February 2009, Linus Torvalds wrote: > > > > > > On Sun, 22 Feb 2009, Rafael J. Wysocki wrote: > [--snip--] > > > > Thanks a lot for your comments, I'll send an updated patch shortly. > > The updated patch is appended. > > It has been initially tested, but requires more testing, > especially with APM, XEN, kexec jump etc. > arch/x86/kernel/apm_32.c | 20 ++++++++++++---- > drivers/xen/manage.c | 32 +++++++++++++++---------- > include/linux/interrupt.h | 3 ++ > include/linux/irq.h | 1 > kernel/irq/manage.c | 57 ++++++++++++++++++++++++++++++++++++++++++++++ > kernel/kexec.c | 10 ++++---- > kernel/power/disk.c | 46 +++++++++++++++++++++++++++++-------- > kernel/power/main.c | 20 +++++++++++----- > 8 files changed, 152 insertions(+), 37 deletions(-) > > Index: linux-2.6/kernel/irq/manage.c > =================================================================== > --- linux-2.6.orig/kernel/irq/manage.c > +++ linux-2.6/kernel/irq/manage.c > @@ -746,3 +746,60 @@ int request_irq(unsigned int irq, irq_ha > return retval; > } > EXPORT_SYMBOL(request_irq); > + > +#ifdef CONFIG_PM_SLEEP > +/** > + * suspend_device_irqs - disable all currently enabled interrupt lines Code placement nit: please dont put new #ifdef blocks into the core IRQ code, add a kernel/irq/power.c file instead and make the kbuild rule depend on PM_SLEEP. 
The new suspend_device_irqs() and resume_device_irqs() doesnt use any manage.c internals so this should work straight away. > + * > + * During system-wide suspend or hibernation device interrupts need to be > + * disabled at the chip level and this function is provided for this > + * purpose. It disables all interrupt lines that are enabled at the > + * moment and sets the IRQ_SUSPENDED flag for them. > + */ > +void suspend_device_irqs(void) > +{ > + struct irq_desc *desc; > + int irq; > + > + for_each_irq_desc(irq, desc) { > + unsigned long flags; > + > + spin_lock_irqsave(&desc->lock, flags); > + > + if (!desc->depth && desc->action > + && !(desc->action->flags & IRQF_TIMER)) { > + desc->depth++; > + desc->status |= IRQ_DISABLED | IRQ_SUSPENDED; > + desc->chip->disable(irq); > + } > + > + spin_unlock_irqrestore(&desc->lock, flags); > + } > + > + for_each_irq_desc(irq, desc) { > + if (desc->status & IRQ_SUSPENDED) > + synchronize_irq(irq); > + } Optimization/code-flow nit: a possibility might be to do a single loop, i.e. i think it's safe to couple the disable+sync bits [as in 99.99% of the cases there will be no in-execution irq handlers when we execute this.] Something like: int do_sync = 0; spin_lock_irqsave(&desc->lock, flags); if (!desc->depth && desc->action && !(desc->action->flags & IRQF_TIMER)) { desc->depth++; desc->status |= IRQ_DISABLED | IRQ_SUSPENDED; desc->chip->disable(irq); do_sync = 1; } spin_unlock_irqrestore(&desc->lock, flags); if (do_sync) synchronize_irq(irq); In fact i'd suggest to factor out this logic into a separate __suspend_irq(irq) / __resume_irq(irq) inline helper functions. (They should be inline for the time being as they are not shared-irq-safe so they shouldnt really be exposed to drivers in such a singular capacity.) Doing so will also fix the line-break ugliness of the first branch - as in a standalone function the condition fits into a single line. 
There's a performance reason as well: especially when we have a lot of IRQ descriptors that will be about twice as fast. (with a large iteration scope this function is cachemiss-limited and doing this passes doubles the cachemiss rate.) > +} > +EXPORT_SYMBOL_GPL(suspend_device_irqs); > + > +/** > + * resume_device_irqs - enable interrupts disabled by suspend_device_irqs() > + * > + * Enable all interrupt lines previously disabled by suspend_device_irqs() > + * that have the IRQ_SUSPENDED flag set. > + */ > +void resume_device_irqs(void) > +{ > + struct irq_desc *desc; > + int irq; > + > + for_each_irq_desc(irq, desc) { > + if (!(desc->status & IRQ_SUSPENDED)) > + continue; > + desc->status &= ~IRQ_SUSPENDED; > + enable_irq(irq); > + } Robustness+optimization nit: this will work but could be done in a nicer way: enable_irq() should auto-clear IRQ_SUSPENDED. (We already clear flags there so it's even a tiny bit faster this way.) We definitely dont want IRQ_SUSPENDED to 'leak' out into an enabled line, should something call enable_irq() on a suspended line. So either make it auto-unsuspend in enable_irq(), or add an extra WARN_ON() to enable_irq(), to make sure IRQ_SUSPENDED is always off by that time. > + arch_suspend_disable_irqs(); > + BUG_ON(!irqs_disabled()); Please. We just disabled all devices - a BUG_ON() is a very counter-productive thing to do here - chances are the user will never see anything but a hang. So please turn this into a nice WARN_ONCE(). > --- linux-2.6.orig/include/linux/interrupt.h > +++ linux-2.6/include/linux/interrupt.h > @@ -470,4 +470,7 @@ extern int early_irq_init(void); > extern int arch_early_irq_init(void); > extern int arch_init_chip_data(struct irq_desc *desc, int cpu); > > +extern void suspend_device_irqs(void); > +extern void resume_device_irqs(void); Header cleanliness nit: please dont just throw new prototypes to the tail of headers, but think about where they fit in best, logically. 
These two new prototypes should go straight after the normal irq line state management functions: extern void disable_irq_nosync(unsigned int irq); extern void disable_irq(unsigned int irq); extern void enable_irq(unsigned int irq); Perhaps also with a comment like this: /* * Note: dont use these functions in driver code - they are for * core kernel use only. */ > +++ linux-2.6/kernel/power/main.c [...] > + > + Unlock: > + resume_device_irqs(); Small drive-by style nit: while at it could you please fix the capitalization and the naming of the label (and all labels in this file)? The standard label is "out_unlock". [and "err_unlock" for failure cases - but this isnt a failure case.] There's 43 such bad label names in kernel/power/*.c, see the output of: git grep '^ [A-Z][a-z].*:$' kernel/power/ > Index: linux-2.6/arch/x86/kernel/apm_32.c > =================================================================== > --- linux-2.6.orig/arch/x86/kernel/apm_32.c > +++ linux-2.6/arch/x86/kernel/apm_32.c > + > + suspend_device_irqs(); > device_power_down(PMSG_SUSPEND); > + > + local_irq_disable(); hm, this is a very repetitive pattern, all around the various suspend/resume variants. Might make sense to make: device_power_down(PMSG_SUSPEND); do the irq line disabling plus the local irq disabling automatically. That also means it cannot be forgotten. The symmetric action should happen for PMSG_RESUME. Is there ever a case where we want a different pattern? > Index: linux-2.6/drivers/xen/manage.c > =================================================================== > --- linux-2.6.orig/drivers/xen/manage.c > +++ linux-2.6/drivers/xen/manage.c > @@ -39,12 +39,6 @@ static int xen_suspend(void *data) > - if (!*cancelled) { > - xen_irq_resume(); > - xen_console_resume(); > - xen_timer_resume(); This change needs a second look. xen_suspend() is a stop_machine() handler and as such executes on specific CPUs, and your change modifies this. 
OTOH, i had a look at these handlers and it all looks safe. Jeremy? > +resume_devices: > + resume_device_irqs(); Small style nit: labels should start with a space character. I.e. it should be: > + resume_devices: > + resume_device_irqs(); > +++ linux-2.6/kernel/kexec.c > @@ -1454,7 +1454,7 @@ int kernel_kexec(void) > if (error) > goto Resume_devices; > device_pm_lock(); > - local_irq_disable(); > + suspend_device_irqs(); > /* At this point, device_suspend() has been called, > * but *not* device_power_down(). We *must* > * device_power_down() now. Otherwise, drivers for > @@ -1464,8 +1464,9 @@ int kernel_kexec(void) > */ > error = device_power_down(PMSG_FREEZE); > if (error) > - goto Enable_irqs; > + goto Resume_irqs; > > + local_irq_disable(); > /* Suspend system devices */ > error = sysdev_suspend(PMSG_FREEZE); > if (error) > @@ -1484,9 +1485,10 @@ int kernel_kexec(void) > if (kexec_image->preserve_context) { > sysdev_resume(); > Power_up_devices: > - device_power_up(PMSG_RESTORE); > - Enable_irqs: > local_irq_enable(); > + device_power_up(PMSG_RESTORE); > + Resume_irqs: > + resume_device_irqs(); > device_pm_unlock(); > enable_nonboot_cpus(); > Resume_devices: (same comment about label style applies here too.) 
> Index: linux-2.6/include/linux/irq.h > =================================================================== > --- linux-2.6.orig/include/linux/irq.h > +++ linux-2.6/include/linux/irq.h > @@ -65,6 +65,7 @@ typedef void (*irq_flow_handler_t)(unsig > #define IRQ_SPURIOUS_DISABLED 0x00800000 /* IRQ was disabled by the spurious trap */ > #define IRQ_MOVE_PCNTXT 0x01000000 /* IRQ migration from process context */ > #define IRQ_AFFINITY_SET 0x02000000 /* IRQ affinity was set from userspace*/ > +#define IRQ_SUSPENDED 0x04000000 /* IRQ has gone through suspend sequence */ > > #ifdef CONFIG_IRQ_PER_CPU > # define CHECK_IRQ_PER_CPU(var) ((var) & IRQ_PER_CPU) Note, you should probably make PM_SLEEP depend on GENERIC_HARDIRQS - as this change will break the build on all non-genirq architectures. (sparc, alpha, etc.) Ingo ^ permalink raw reply [flat|nested] 373+ messages in thread
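[Editor's sketch] The single-pass disable+sync loop and the "enable_irq() clears IRQ_SUSPENDED" idea from the review above can be modelled in plain userspace C. This is only an illustration under stated assumptions: the model_* names, the descriptor struct, the bit values other than IRQ_SUSPENDED, and the sync counter are all hypothetical stand-ins, locking is omitted, and none of this is the code that was eventually merged.

```c
/*
 * Userspace model only. The names imitate the genirq identifiers quoted
 * in the thread, but every model_* symbol here is hypothetical.
 */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define IRQ_DISABLED   0x00000002u  /* arbitrary value for this model */
#define IRQ_SUSPENDED  0x04000000u  /* value taken from the quoted patch */
#define IRQF_TIMER     0x00000200u  /* arbitrary value for this model */

#define NR_MODEL_IRQS 4

struct model_irq_desc {
	unsigned int status;        /* IRQ_* bits */
	unsigned int depth;         /* nested disable count */
	bool has_action;            /* stands in for desc->action != NULL */
	unsigned int action_flags;  /* stands in for desc->action->flags */
	unsigned int sync_calls;    /* counts modelled synchronize_irq() calls */
};

static struct model_irq_desc model_descs[NR_MODEL_IRQS];

/* Disable one line and, only if this call disabled it, wait for handlers. */
static void model_suspend_irq(struct model_irq_desc *desc)
{
	bool do_sync = false;

	/* Kernel code would take desc->lock around this block. */
	if (!desc->depth && desc->has_action && !(desc->action_flags & IRQF_TIMER)) {
		desc->depth++;
		desc->status |= IRQ_DISABLED | IRQ_SUSPENDED;
		do_sync = true;
	}

	if (do_sync)
		desc->sync_calls++;  /* placeholder for synchronize_irq(irq) */
}

/* Enable path that clears IRQ_SUSPENDED itself, so the flag cannot leak. */
static void model_enable_irq(struct model_irq_desc *desc)
{
	assert(desc->depth > 0);  /* an unbalanced enable is a caller bug */
	desc->status &= ~IRQ_SUSPENDED;
	if (--desc->depth == 0)
		desc->status &= ~IRQ_DISABLED;
}

/* Single pass: disable and sync are coupled per descriptor. */
static void model_suspend_device_irqs(void)
{
	for (size_t irq = 0; irq < NR_MODEL_IRQS; irq++)
		model_suspend_irq(&model_descs[irq]);
}

/* Re-enable only the lines that the suspend pass marked. */
static void model_resume_device_irqs(void)
{
	for (size_t irq = 0; irq < NR_MODEL_IRQS; irq++) {
		if (model_descs[irq].status & IRQ_SUSPENDED)
			model_enable_irq(&model_descs[irq]);
	}
}
```

In this model, timer lines, lines without a handler, and lines that were already disabled before suspend are left untouched, so resume cannot accidentally enable a line that a driver had disabled on its own.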
* Re: [RFC][PATCH 2/2] PM: Rework handling of interrupts during suspend-resume 2009-02-23 8:36 ` Ingo Molnar @ 2009-02-23 11:29 ` Rafael J. Wysocki 2009-02-23 12:28 ` Ingo Molnar ` (7 more replies) 2009-02-23 11:29 ` Rafael J. Wysocki 1 sibling, 8 replies; 373+ messages in thread From: Rafael J. Wysocki @ 2009-02-23 11:29 UTC (permalink / raw) To: Ingo Molnar, Johannes Berg Cc: Linus Torvalds, LKML, Eric W. Biederman, Benjamin Herrenschmidt, Jeremy Fitzhardinge, pm list, Len Brown, Jesse Barnes, Thomas Gleixner On Monday 23 February 2009, Ingo Molnar wrote: > > * Rafael J. Wysocki <rjw@sisk.pl> wrote: > > > On Sunday 22 February 2009, Rafael J. Wysocki wrote: > > > On Sunday 22 February 2009, Linus Torvalds wrote: > > > > > > > > On Sun, 22 Feb 2009, Rafael J. Wysocki wrote: > > [--snip--] > > > > > > Thanks a lot for your comments, I'll send an updated patch shortly. > > > > The updated patch is appended. > > > > It has been initially tested, but requires more testing, > > especially with APM, XEN, kexec jump etc. 
> > > arch/x86/kernel/apm_32.c | 20 ++++++++++++---- > > drivers/xen/manage.c | 32 +++++++++++++++---------- > > include/linux/interrupt.h | 3 ++ > > include/linux/irq.h | 1 > > kernel/irq/manage.c | 57 ++++++++++++++++++++++++++++++++++++++++++++++ > > kernel/kexec.c | 10 ++++---- > > kernel/power/disk.c | 46 +++++++++++++++++++++++++++++-------- > > kernel/power/main.c | 20 +++++++++++----- > > 8 files changed, 152 insertions(+), 37 deletions(-) > > > > Index: linux-2.6/kernel/irq/manage.c > > =================================================================== > > --- linux-2.6.orig/kernel/irq/manage.c > > +++ linux-2.6/kernel/irq/manage.c > > @@ -746,3 +746,60 @@ int request_irq(unsigned int irq, irq_ha > > return retval; > > } > > EXPORT_SYMBOL(request_irq); > > + > > +#ifdef CONFIG_PM_SLEEP > > +/** > > + * suspend_device_irqs - disable all currently enabled interrupt lines > > Code placement nit: please dont put new #ifdef blocks into the > core IRQ code, add a kernel/irq/power.c file instead and make > the kbuild rule depend on PM_SLEEP. > > The new suspend_device_irqs() and resume_device_irqs() doesnt > use any manage.c internals so this should work straight away. OK, I'll do that. > > + * > > + * During system-wide suspend or hibernation device interrupts need to be > > + * disabled at the chip level and this function is provided for this > > + * purpose. It disables all interrupt lines that are enabled at the > > + * moment and sets the IRQ_SUSPENDED flag for them. 
> > + */ > > +void suspend_device_irqs(void) > > +{ > > + struct irq_desc *desc; > > + int irq; > > + > > + for_each_irq_desc(irq, desc) { > > + unsigned long flags; > > + > > + spin_lock_irqsave(&desc->lock, flags); > > + > > + if (!desc->depth && desc->action > > + && !(desc->action->flags & IRQF_TIMER)) { > > + desc->depth++; > > + desc->status |= IRQ_DISABLED | IRQ_SUSPENDED; > > + desc->chip->disable(irq); > > + } > > + > > + spin_unlock_irqrestore(&desc->lock, flags); > > + } > > + > > + for_each_irq_desc(irq, desc) { > > + if (desc->status & IRQ_SUSPENDED) > > + synchronize_irq(irq); > > + } > > Optimization/code-flow nit: a possibility might be to do a > single loop, i.e. i think it's safe to couple the disable+sync > bits [as in 99.99% of the cases there will be no in-execution > irq handlers when we execute this.] Well, Linus suggested to do it in a separate loop. I'm fine with both ways. > Something like: > > int do_sync = 0; > > spin_lock_irqsave(&desc->lock, flags); > > if (!desc->depth && desc->action > && !(desc->action->flags & IRQF_TIMER)) { > > desc->depth++; > desc->status |= IRQ_DISABLED | IRQ_SUSPENDED; > desc->chip->disable(irq); > do_sync = 1; > } > > spin_unlock_irqrestore(&desc->lock, flags); > > if (do_sync) > synchronize_irq(irq); > > In fact i'd suggest to factor out this logic into a separate > __suspend_irq(irq) / __resume_irq(irq) inline helper functions. > (They should be inline for the time being as they are not > shared-irq-safe so they shouldnt really be exposed to drivers in > such a singular capacity.) Good idea, I'll do it. > Doing so will also fix the line-break ugliness of the first > branch - as in a standalone function the condition fits into a > single line. > > There's a performance reason as well: especially when we have a > lot of IRQ descriptors that will be about twice as fast. (with a > large iteration scope this function is cachemiss-limited and > doing this passes doubles the cachemiss rate.) 
> > > +} > > +EXPORT_SYMBOL_GPL(suspend_device_irqs); > > + > > +/** > > + * resume_device_irqs - enable interrupts disabled by suspend_device_irqs() > > + * > > + * Enable all interrupt lines previously disabled by suspend_device_irqs() > > + * that have the IRQ_SUSPENDED flag set. > > + */ > > +void resume_device_irqs(void) > > +{ > > + struct irq_desc *desc; > > + int irq; > > + > > + for_each_irq_desc(irq, desc) { > > + if (!(desc->status & IRQ_SUSPENDED)) > > + continue; > > + desc->status &= ~IRQ_SUSPENDED; > > + enable_irq(irq); > > + } > > Robustness+optimization nit: this will work but could be done in > a nicer way: enable_irq() should auto-clear IRQ_SUSPENDED. (We > already clear flags there so it's even a tiny bit faster this > way.) OK > We definitely dont want IRQ_SUSPENDED to 'leak' out into an > enabled line, should something call enable_irq() on a suspended > line. So either make it auto-unsuspend in enable_irq(), or add > an extra WARN_ON() to enable_irq(), to make sure IRQ_SUSPENDED > is always off by that time. > > > + arch_suspend_disable_irqs(); > > + BUG_ON(!irqs_disabled()); > > Please. We just disabled all devices - a BUG_ON() is a very > counter-productive thing to do here - chances are the user will > never see anything but a hang. So please turn this into a nice > WARN_ONCE(). This is just moving code. Also, the BUG_ON() can only affect powerpc and it's there on purpose AFAICS (Johannes?). Anyway, changing that would be a separate patch. > > --- linux-2.6.orig/include/linux/interrupt.h > > +++ linux-2.6/include/linux/interrupt.h > > @@ -470,4 +470,7 @@ extern int early_irq_init(void); > > extern int arch_early_irq_init(void); > > extern int arch_init_chip_data(struct irq_desc *desc, int cpu); > > > > +extern void suspend_device_irqs(void); > > +extern void resume_device_irqs(void); > > Header cleanliness nit: please dont just throw new prototypes to > the tail of headers, but think about where they fit in best, > logically. 
> > These two new prototypes should go straight after the normal irq > line state management functions: > > extern void disable_irq_nosync(unsigned int irq); > extern void disable_irq(unsigned int irq); > extern void enable_irq(unsigned int irq); > > Perhaps also with a comment like this: > > /* > * Note: dont use these functions in driver code - they are for > * core kernel use only. > */ OK, I'll put them in there. > > +++ linux-2.6/kernel/power/main.c > [...] > > + > > + Unlock: > > + resume_device_irqs(); > > Small drive-by style nit: while at it could you please fix the > capitalization and the naming of the label (and all labels in > this file)? I don't think they are wrong. They are uniform across the file and it's clear what they mean. > The standard label is "out_unlock". [and "err_unlock" for failure cases > - but this isnt a failure case.] Where exactly is this standard defined? > There's 43 such bad label names in kernel/power/*.c, see the > output of: > > git grep '^ [A-Z][a-z].*:$' kernel/power/ If you think they are bad, please send a patch to change them. > > Index: linux-2.6/arch/x86/kernel/apm_32.c > > =================================================================== > > --- linux-2.6.orig/arch/x86/kernel/apm_32.c > > +++ linux-2.6/arch/x86/kernel/apm_32.c > > > + > > + suspend_device_irqs(); > > device_power_down(PMSG_SUSPEND); > > + > > + local_irq_disable(); > > hm, this is a very repetitive pattern, all around the various > suspend/resume variants. Might make sense to make: > > device_power_down(PMSG_SUSPEND); > > do the irq line disabling plus the local irq disabling > automatically. That also means it cannot be forgotten. The > symmetric action should happen for PMSG_RESUME. > > Is there ever a case where we want a different pattern? Even if there's no such case, I prefer to call local_irq_disable() explicitly in here, so that it's clearly known where it happens to anyone reading this code.
Doing the "late" suspend of devices and disabling interrupts on the CPU are separate logical steps. > > Index: linux-2.6/drivers/xen/manage.c > > =================================================================== > > --- linux-2.6.orig/drivers/xen/manage.c > > +++ linux-2.6/drivers/xen/manage.c > > @@ -39,12 +39,6 @@ static int xen_suspend(void *data) > > > - if (!*cancelled) { > > - xen_irq_resume(); > > - xen_console_resume(); > > - xen_timer_resume(); > > This change needs a second look. xen_suspend() is a > stop_machine() handler and as such executes on specific CPUs, > and your change modifies this. OTOH, i had a look at these > handlers and it all looks safe. Jeremy? > > > +resume_devices: > > + resume_device_irqs(); > > Small style nit: labels should start with a space character. > I.e. it should be: I know, but the second label in there starts without a space character and IMO keeping a uniform coding style in a single file is more important than trying to adjust it to a broader set of rules FWIW. I also think that coding style changes shouldn't be mixed with functional changes as far as reasonably possible. > > + resume_devices: > > + resume_device_irqs(); > > > +++ linux-2.6/kernel/kexec.c > > @@ -1454,7 +1454,7 @@ int kernel_kexec(void) > > if (error) > > goto Resume_devices; > > device_pm_lock(); > > - local_irq_disable(); > > + suspend_device_irqs(); > > /* At this point, device_suspend() has been called, > > * but *not* device_power_down(). We *must* > > * device_power_down() now.
Otherwise, drivers for > > @@ -1464,8 +1464,9 @@ int kernel_kexec(void) > > */ > > error = device_power_down(PMSG_FREEZE); > > if (error) > > - goto Enable_irqs; > > + goto Resume_irqs; > > > > + local_irq_disable(); > > /* Suspend system devices */ > > error = sysdev_suspend(PMSG_FREEZE); > > if (error) > > @@ -1484,9 +1485,10 @@ int kernel_kexec(void) > > if (kexec_image->preserve_context) { > > sysdev_resume(); > > Power_up_devices: > > - device_power_up(PMSG_RESTORE); > > - Enable_irqs: > > local_irq_enable(); > > + device_power_up(PMSG_RESTORE); > > + Resume_irqs: > > + resume_device_irqs(); > > device_pm_unlock(); > > enable_nonboot_cpus(); > > Resume_devices: > > (same comment about label style applies here too.) > > > Index: linux-2.6/include/linux/irq.h > > =================================================================== > > --- linux-2.6.orig/include/linux/irq.h > > +++ linux-2.6/include/linux/irq.h > > @@ -65,6 +65,7 @@ typedef void (*irq_flow_handler_t)(unsig > > #define IRQ_SPURIOUS_DISABLED 0x00800000 /* IRQ was disabled by the spurious trap */ > > #define IRQ_MOVE_PCNTXT 0x01000000 /* IRQ migration from process context */ > > #define IRQ_AFFINITY_SET 0x02000000 /* IRQ affinity was set from userspace*/ > > +#define IRQ_SUSPENDED 0x04000000 /* IRQ has gone through suspend sequence */ > > > > #ifdef CONFIG_IRQ_PER_CPU > > # define CHECK_IRQ_PER_CPU(var) ((var) & IRQ_PER_CPU) > > Note, you should probably make PM_SLEEP depend on > GENERIC_HARDIRQS - as this change will break the build on all > non-genirq architectures. (sparc, alpha, etc.) PM_SLEEP depends on ARCH_SUSPEND_POSSIBLE || ARCH_HIBERNATION_POSSIBLE, which I don't think is set on these architectures. Thanks a lot for your comments. Rafael ^ permalink raw reply [flat|nested] 373+ messages in thread
* Re: [RFC][PATCH 2/2] PM: Rework handling of interrupts during suspend-resume 2009-02-23 11:29 ` Rafael J. Wysocki @ 2009-02-23 12:28 ` Ingo Molnar 2009-02-23 14:48 ` Rafael J. Wysocki ` (2 more replies) 2009-02-23 12:28 ` Ingo Molnar ` (6 subsequent siblings) 7 siblings, 3 replies; 373+ messages in thread From: Ingo Molnar @ 2009-02-23 12:28 UTC (permalink / raw) To: Rafael J. Wysocki Cc: Johannes Berg, Linus Torvalds, LKML, Eric W. Biederman, Benjamin Herrenschmidt, Jeremy Fitzhardinge, pm list, Len Brown, Jesse Barnes, Thomas Gleixner * Rafael J. Wysocki <rjw@sisk.pl> wrote: > > > Index: linux-2.6/arch/x86/kernel/apm_32.c > > > =================================================================== > > > --- linux-2.6.orig/arch/x86/kernel/apm_32.c > > > +++ linux-2.6/arch/x86/kernel/apm_32.c > > > > > + > > > + suspend_device_irqs(); > > > device_power_down(PMSG_SUSPEND); > > > + > > > + local_irq_disable(); > > > > hm, this is a very repetitive pattern, all around the various > > suspend/resume variants. Might make sense to make: > > > > device_power_down(PMSG_SUSPEND); > > > > do the irq line disabling plus the local irq disabling > > automatically. That also means it cannot be forgotten. The > > symmetric action should happen for PMSG_RESUME. > > > > Is there ever a case where we want a different pattern? > > Even if there's no such case, I prefer to call > local_irq_disable() explicitly in here, so that it's clearly > known where it happens to anyone reading this code. That property can be implied in the function name: device_power_down_irq_disable(PMSG_SUSPEND); Open-coding it, if it looks the same in all the cases just increases the chances that someone somewhere copies them incorrectly. Ingo ^ permalink raw reply [flat|nested] 373+ messages in thread
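[Editor's sketch] To make the trade-off concrete, here is a userspace C model of what a combined helper in the spirit of device_power_down_irq_disable() might look like. Everything in it is an assumption for illustration: the stub_* functions only record their names, model_power_down_irq_disable() is a hypothetical name, and the error unwinding is patterned on the Resume_irqs path of the kernel_kexec() hunk quoted earlier in the thread, not on any merged code.

```c
/*
 * Userspace model only: the stubs record call order instead of touching
 * hardware, and all names below are hypothetical.
 */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define TRACE_MAX 16

static const char *trace[TRACE_MAX];
static size_t trace_len;
static int power_down_error;  /* nonzero models a failing "late" suspend */

static void record(const char *step)
{
	assert(step != NULL);
	if (trace_len < TRACE_MAX)
		trace[trace_len++] = step;
}

/* True if the i-th recorded step has the given name. */
static bool trace_is(size_t i, const char *name)
{
	return i < trace_len && strcmp(trace[i], name) == 0;
}

static void stub_suspend_device_irqs(void) { record("suspend_device_irqs"); }
static void stub_resume_device_irqs(void)  { record("resume_device_irqs"); }
static void stub_local_irq_disable(void)   { record("local_irq_disable"); }

static int stub_device_power_down(void)
{
	record("device_power_down");
	return power_down_error;
}

/*
 * Combined helper: device IRQ lines off, "late" suspend callbacks, then
 * local interrupts off. On failure the device IRQ lines are re-enabled
 * and local interrupts are never disabled.
 */
static int model_power_down_irq_disable(void)
{
	int error;

	stub_suspend_device_irqs();
	error = stub_device_power_down();
	if (error) {
		stub_resume_device_irqs();
		return error;
	}
	stub_local_irq_disable();
	return 0;
}
```

A helper like this fixes the ordering in one place, but it also hides the point at which local interrupts go off, which matters if an architecture needs its own hook there (the patch itself calls arch_suspend_disable_irqs() on the main suspend path).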
* Re: [RFC][PATCH 2/2] PM: Rework handling of interrupts during suspend-resume 2009-02-23 12:28 ` Ingo Molnar @ 2009-02-23 14:48 ` Rafael J. Wysocki 2009-02-23 20:49 ` Benjamin Herrenschmidt 2009-02-23 20:49 ` Benjamin Herrenschmidt 2 siblings, 0 replies; 373+ messages in thread From: Rafael J. Wysocki @ 2009-02-23 14:48 UTC (permalink / raw) To: Ingo Molnar Cc: Johannes Berg, Linus Torvalds, LKML, Eric W. Biederman, Benjamin Herrenschmidt, Jeremy Fitzhardinge, pm list, Len Brown, Jesse Barnes, Thomas Gleixner On Monday 23 February 2009, Ingo Molnar wrote: > > * Rafael J. Wysocki <rjw@sisk.pl> wrote: > > > > > Index: linux-2.6/arch/x86/kernel/apm_32.c > > > > =================================================================== > > > > --- linux-2.6.orig/arch/x86/kernel/apm_32.c > > > > +++ linux-2.6/arch/x86/kernel/apm_32.c > > > > > > > + > > > > + suspend_device_irqs(); > > > > device_power_down(PMSG_SUSPEND); > > > > + > > > > + local_irq_disable(); > > > > > > hm, this is a very repetitive pattern, all around the various > > > suspend/resume variants. Might make sense to make: > > > > > > device_power_down(PMSG_SUSPEND); > > > > > > do the irq line disabling plus the local irq disabling > > > automatically. That also means it cannot be forgotten. The > > > symmetric action should happen for PMSG_RESUME. > > > > > > Is there ever a case where we want a different pattern? > > > > Even if there's no such case, I prefer to call > > local_irq_disable() explicitly in here, so that it's clearly > > known where it happens to anyone reading this code. > > That property can be implied in the function name: > > device_power_down_irq_disable(PMSG_SUSPEND); > > Open-coding it, if it looks the same in all the cases just > increases the chances that someone somewhere copies them > incorrectly. 
Well, I see your point, but in that case I'd rather couple the disabling of local interrupts on the CPU with sysdev_suspend and the disabling (or whatever Eric would like to call that) of device interrupts with device_power_down(). Thanks, Rafael ^ permalink raw reply [flat|nested] 373+ messages in thread
* Re: [RFC][PATCH 2/2] PM: Rework handling of interrupts during suspend-resume 2009-02-23 12:28 ` Ingo Molnar 2009-02-23 14:48 ` Rafael J. Wysocki @ 2009-02-23 20:49 ` Benjamin Herrenschmidt 2009-02-23 20:49 ` Benjamin Herrenschmidt 2 siblings, 0 replies; 373+ messages in thread From: Benjamin Herrenschmidt @ 2009-02-23 20:49 UTC (permalink / raw) To: Ingo Molnar Cc: Jeremy Fitzhardinge, LKML, Jesse Barnes, Thomas Gleixner, Eric W. Biederman, Johannes Berg, Linus Torvalds, pm list > That property can be implied in the function name: > > device_power_down_irq_disable(PMSG_SUSPEND); > > Open-coding it, if it looks the same in all the cases just > increases the chances that someone somewhere copies them > incorrectly. No. Some archs need to do "special" things at the irq disable point, leave it open coded in the caller please. Cheers, Ben. ^ permalink raw reply [flat|nested] 373+ messages in thread
* Re: [RFC][PATCH 2/2] PM: Rework handling of interrupts during suspend-resume 2009-02-23 11:29 ` Rafael J. Wysocki 2009-02-23 12:28 ` Ingo Molnar 2009-02-23 12:28 ` Ingo Molnar @ 2009-02-23 12:45 ` Ingo Molnar 2009-02-23 15:07 ` Rafael J. Wysocki 2009-02-23 15:07 ` Rafael J. Wysocki 2009-02-23 12:45 ` Ingo Molnar ` (4 subsequent siblings) 7 siblings, 2 replies; 373+ messages in thread From: Ingo Molnar @ 2009-02-23 12:45 UTC (permalink / raw) To: Rafael J. Wysocki Cc: Johannes Berg, Linus Torvalds, LKML, Eric W. Biederman, Benjamin Herrenschmidt, Jeremy Fitzhardinge, pm list, Len Brown, Jesse Barnes, Thomas Gleixner * Rafael J. Wysocki <rjw@sisk.pl> wrote: > > > +resume_devices: > > > + resume_device_irqs(); > > > > Small style nit: labels should start with a space character. > > I.e. it should be: > > I know, but the second label in there starts without a space > character and IMO keeping a uniform coding style i a single > file is more important than trying to adjust it to a broader > set of rules FWIW. [...] Even though it's just a very small and insignificant detail (nowhere described in the CodingStyle), barely worth the mention (and i already regret having brought it up at all), what you say is wrong on a conceptual level and that alarms me a bit ;-) It is exactly these kinds of "my code, my style!" world view that results in a crappy overall kernel style. For a single file to look consistent is just the first (and required) step, what matters even more is for files to have similar coding patterns, to make the style as helpful to the general kernel developer/reviewer/bug-fixer/maintainer as possible. "code with a helpful style" here means two things: 1) it should understand and adhere to basic style principles. This is just an (often arbitrary) subset of the infinite set of reasonable style guides. The most common-sense ones are written down in Documentation/CodingStyle. 
There's a lot of leeway, as long as the basic principle of "be helpful" is understood and followed. 2) it should carry meta information outside of the language syntax and it should build expectations about a code's purpose and general structure. That is essential so that we can find bugs during review. If each file has a slightly different style to express labels then that means we insert extra entropy and degrades and obfuscates the true meat of the code and hurts the overall reviewability of the code. In practical terms: i noticed that weird label - otherwise i would not have commented on it. I noticed it because it had the pattern of a comment block (most comment blocks start with capital letters, and for that good reason). It was completely unnecessary for me to notice that label - it carries no information about the patch itself. Ergo, it would be better in the long run if code does not raise unnecessary mental exceptions. We have a limited set of exceptions we are able to handle during review, lets make sure we use them sparingly. Sure, there will always be borderline cases where we'll have to agree to disagree, even if we agree about the general principle. But this is not one of those cases - having a "Suspend:" capitalized label is not something you added to enhance the basic coding style - it is something very uncommon and self-serving which you added in _spite_ of the general principles i believe. It has no other message beyond "I do this because i can!". I.e. it is not helpful at all. When it comes to coding style the kernel is not a democracy at all. > [...] I also think that coding style changes shouldn't be > mixed with functional changes as far as reasonably possible. Sure, you got that drive-by review for free, by virtue of context diffs ;-) Ingo ^ permalink raw reply [flat|nested] 373+ messages in thread
* Re: [RFC][PATCH 2/2] PM: Rework handling of interrupts during suspend-resume
From: Rafael J. Wysocki @ 2009-02-23 15:07 UTC
To: Ingo Molnar
Cc: Jeremy Fitzhardinge, LKML, Jesse Barnes, Thomas Gleixner, Eric W. Biederman, Johannes Berg, Linus Torvalds, pm list

On Monday 23 February 2009, Ingo Molnar wrote:
>
> * Rafael J. Wysocki <rjw@sisk.pl> wrote:
>
> > > > +resume_devices:
> > > > +	resume_device_irqs();
> > >
> > > Small style nit: labels should start with a space character.
> > > I.e. it should be:
> >
> > I know, but the second label in there starts without a space
> > character and IMO keeping a uniform coding style in a single
> > file is more important than trying to adjust it to a broader
> > set of rules FWIW. [...]
>
> Even though it's just a very small and insignificant detail
> (nowhere described in the CodingStyle), barely worth the mention
> (and i already regret having brought it up at all), what you say
> is wrong on a conceptual level and that alarms me a bit ;-)
>
> It is exactly this kind of "my code, my style!" world view
> that results in a crappy overall kernel style.
>
> For a single file to look consistent is just the first (and
> required) step, what matters even more is for files to have
> similar coding patterns, to make the style as helpful to the
> general kernel developer/reviewer/bug-fixer/maintainer as
> possible.
>
> "code with a helpful style" here means two things:
>
>  1) it should understand and adhere to basic style principles.
>     This is just an (often arbitrary) subset of the infinite set
>     of reasonable style guides. The most common-sense ones are
>     written down in Documentation/CodingStyle. There's a lot of
>     leeway, as long as the basic principle of "be helpful" is
>     understood and followed.
>
>  2) it should carry meta information outside of the language
>     syntax and it should build expectations about a code's
>     purpose and general structure.
>
> That is essential so that we can find bugs during review.
>
> If each file has a slightly different style to express labels
> then that means we insert extra entropy that degrades and
> obfuscates the true meat of the code and hurts the overall
> reviewability of the code.
>
> In practical terms: i noticed that weird label - otherwise i
> would not have commented on it. I noticed it because it had the
> pattern of a comment block (most comment blocks start with
> capital letters, and for that good reason).
>
> It was completely unnecessary for me to notice that label - it
> carries no information about the patch itself. Ergo, it would be
> better in the long run if code does not raise unnecessary mental
> exceptions. We have a limited set of exceptions we are able to
> handle during review, let's make sure we use them sparingly.

Just to clarify, I have nothing against labels that are not capitalized
etc., actually I can live with whatever style of labels is considered
appropriate and/or helpful.

However, if a specific style of labels was chosen for a given file in the
past and it is consistent over the entire file, I don't think it should be
changed in a patch that does a different thing, regardless of who's
maintaining the file or who's written the code in question. It should be
changed in a separate patch with a changelog describing why this change
is being made.

I don't really have the time to write such a patch at the moment and I
don't really think it's _that_ important. YMMV.

> Sure, there will always be borderline cases where we'll have to
> agree to disagree, even if we agree about the general principle.
>
> But this is not one of those cases - having a "Suspend:"
> capitalized label is not something you added to enhance the
> basic coding style - it is something very uncommon and
> self-serving which you added in _spite_ of the general
> principles i believe. It has no other message beyond "I do this
> because i can!".
>
> I.e. it is not helpful at all. When it comes to coding style the
> kernel is not a democracy at all.
>
> > [...] I also think that coding style changes shouldn't be
> > mixed with functional changes as far as reasonably possible.
>
> Sure, you got that drive-by review for free, by virtue of
> context diffs ;-)

Well, OK. :-)

Still, IMHO it's more helpful if the comments related to the changes that
belong to the patch in question are not mixed with comments related to the
coding style of the files being modified. Perhaps I'm picky ...

Thanks,
Rafael
* Re: [RFC][PATCH 2/2] PM: Rework handling of interrupts during suspend-resume
From: Johannes Berg @ 2009-02-23 15:52 UTC
To: Rafael J. Wysocki
Cc: Jeremy Fitzhardinge, LKML, Jesse Barnes, Thomas Gleixner, Eric W. Biederman, Ingo Molnar, Linus Torvalds, pm list

On Mon, 2009-02-23 at 12:29 +0100, Rafael J. Wysocki wrote:

> > > +	arch_suspend_disable_irqs();
> > > +	BUG_ON(!irqs_disabled());
> >
> > Please. We just disabled all devices - a BUG_ON() is a very
> > counter-productive thing to do here - chances are the user will
> > never see anything but a hang. So please turn this into a nice
> > WARN_ONCE().
>
> This is just moving code. Also, the BUG_ON() can only affect powerpc and it's
> there on purpose AFAICS (Johannes?). Anyway, changing that would be a separate
> patch.

It can affect any platform that overrides the weak symbol
arch_suspend_disable_irqs(), and I think that if you're writing this
low-level code you'd better have a way to debug. As such, I don't think it
needs changing, because you can only ever see that while implementing
arch_suspend_disable_irqs(). OTOH, since it can only trigger then, a
WARN_ON is probably fine as well since you'll be getting your machine into
inconsistent states all the time while implementing this ;)

johannes
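The trade-off discussed above (halt on a broken arch hook versus warn once and carry on) can be sketched in user space. This is only a rough model, not kernel code: `model_irqs_enabled`, `simulate_broken_hook`, `warn_once()` and `suspend_enter_model()` are hypothetical names invented for illustration, and the weak attribute is the GCC/Clang spelling of what the kernel wraps in its own macro.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-in for the CPU's interrupt-enable state. */
static bool model_irqs_enabled = true;

/* Test knob (assumption): pretend an arch override forgot to disable IRQs. */
static bool simulate_broken_hook;

static bool irqs_disabled(void)
{
	return !model_irqs_enabled;
}

/*
 * Weak default. An architecture can supply a strong definition of the
 * same name; a buggy override is the only way the check below can fire.
 */
__attribute__((weak)) void arch_suspend_disable_irqs(void)
{
	if (!simulate_broken_hook)
		model_irqs_enabled = false;
}

static int warn_count;

/*
 * Simplified warn-once helper: reports the first failure only and lets
 * the caller keep running. Unlike a per-call-site macro, this function
 * shares one "warned" flag across all callers.
 */
static bool warn_once(bool cond, const char *msg)
{
	static bool warned;

	if (cond && !warned) {
		warned = true;
		warn_count++;
		fprintf(stderr, "WARNING: %s\n", msg);
	}
	return cond;
}

/* Returns 0 on success, -1 if the hook left interrupts enabled. */
static int suspend_enter_model(void)
{
	arch_suspend_disable_irqs();
	if (warn_once(!irqs_disabled(),
		      "arch_suspend_disable_irqs() left IRQs enabled"))
		return -1;
	return 0;
}
```

With a warning instead of a hard stop, the caller gets an error code and can unwind the suspend attempt, which is the behaviour the WARN_ONCE() request is after; whether that is preferable for code only arch implementers can trip is exactly the point under debate.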
* Re: [RFC][PATCH 2/2] PM: Rework handling of interrupts during suspend-resume
From: Ingo Molnar @ 2009-02-23 17:16 UTC
To: Rafael J. Wysocki
Cc: Jeremy Fitzhardinge, LKML, Jesse Barnes, Thomas Gleixner, Eric W. Biederman, Johannes Berg, Linus Torvalds, pm list

* Rafael J. Wysocki <rjw@sisk.pl> wrote:

> > > +void suspend_device_irqs(void)
> > > +{
> > > +	struct irq_desc *desc;
> > > +	int irq;
> > > +
> > > +	for_each_irq_desc(irq, desc) {
> > > +		unsigned long flags;
> > > +
> > > +		spin_lock_irqsave(&desc->lock, flags);
> > > +
> > > +		if (!desc->depth && desc->action
> > > +		    && !(desc->action->flags & IRQF_TIMER)) {
> > > +			desc->depth++;
> > > +			desc->status |= IRQ_DISABLED | IRQ_SUSPENDED;
> > > +			desc->chip->disable(irq);
> > > +		}
> > > +
> > > +		spin_unlock_irqrestore(&desc->lock, flags);
> > > +	}
> > > +
> > > +	for_each_irq_desc(irq, desc) {
> > > +		if (desc->status & IRQ_SUSPENDED)
> > > +			synchronize_irq(irq);
> > > +	}
> >
> > Optimization/code-flow nit: a possibility might be to do a
> > single loop, i.e. i think it's safe to couple the
> > disable+sync bits [as in 99.99% of the cases there will be
> > no in-execution irq handlers when we execute this.]
>
> Well, Linus suggested to do it in a separate loop. I'm fine
> with both ways.

Linus, do you have a strong opinion about which variant we
should use?

The two approaches are not completely equivalent, the variant
suggested by Linus is a bit more 'atomic' - in that it first
turns off everything, then looks for everything that needs to be
synchronized.

OTOH, it _shouldnt_ make much of a difference on a correctly
working system - we ought to be able to disable the irqs one by
one and wait on any pending ones on the spot. Maybe if there was
some implicit dependency between irq sources it would be more
robust to do Linus's version. Dunno ... no strong feelings
either way.

	Ingo
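To make the two variants concrete, here is a minimal user-space sketch, assuming a much-simplified descriptor. Everything prefixed `model_`/`MODEL_` is hypothetical and only mirrors the fields used in the quoted patch; locking is omitted and `model_synchronize()` merely counts calls instead of waiting for a handler.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_IRQF_TIMER	0x1u	/* handler belongs to a timer */
#define MODEL_IRQ_DISABLED	0x1u
#define MODEL_IRQ_SUSPENDED	0x2u

struct model_irq_desc {
	unsigned int depth;		/* nested disable count, 0 = enabled */
	bool has_action;		/* a handler is installed */
	unsigned int action_flags;
	unsigned int status;
	unsigned int syncs;		/* times we "waited" on this line */
};

/* Same condition as in the patch: enabled, in use, and not a timer. */
static bool model_should_suspend(const struct model_irq_desc *d)
{
	return d->depth == 0 && d->has_action &&
	       !(d->action_flags & MODEL_IRQF_TIMER);
}

static void model_disable(struct model_irq_desc *d)
{
	d->depth++;
	d->status |= MODEL_IRQ_DISABLED | MODEL_IRQ_SUSPENDED;
}

/* Stand-in for synchronize_irq(): only records that a wait happened. */
static void model_synchronize(struct model_irq_desc *d)
{
	d->syncs++;
}

/* Variant in the patch: disable every line first, then wait. */
static void model_suspend_two_pass(struct model_irq_desc *descs, size_t n)
{
	size_t i;

	for (i = 0; i < n; i++)
		if (model_should_suspend(&descs[i]))
			model_disable(&descs[i]);

	for (i = 0; i < n; i++)
		if (descs[i].status & MODEL_IRQ_SUSPENDED)
			model_synchronize(&descs[i]);
}

/* Variant from the review: disable and wait line by line. */
static void model_suspend_single_pass(struct model_irq_desc *descs, size_t n)
{
	size_t i;

	for (i = 0; i < n; i++) {
		if (model_should_suspend(&descs[i])) {
			model_disable(&descs[i]);
			model_synchronize(&descs[i]);
		}
	}
}
```

In this single-threaded model both variants leave every descriptor in the same state. The difference raised in the thread only appears when handlers run concurrently on other CPUs, which the sketch does not attempt to model.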
* Re: [RFC][PATCH 2/2] PM: Rework handling of interrupts during suspend-resume
From: Linus Torvalds @ 2009-02-23 17:28 UTC
To: Ingo Molnar
Cc: Rafael J. Wysocki, Johannes Berg, LKML, Eric W. Biederman, Benjamin Herrenschmidt, Jeremy Fitzhardinge, pm list, Len Brown, Jesse Barnes, Thomas Gleixner

On Mon, 23 Feb 2009, Ingo Molnar wrote:
>
> Linus, do you have a strong opinion about which variant we
> should use?

Strong? No. I think mine is better just because _if_ another CPU is busy
handling an interrupt that we're just now disabling, we'll just go on to
the next interrupt. Waiting for them all at the end is always more
efficient.

But does it really matter? No. In this case I think we've shut down all
other CPUs anyway, so the whole "synchronize_irq()" should probably not
even be needed.

		Linus
* Re: [RFC][PATCH 2/2] PM: Rework handling of interrupts during suspend-resume
From: Rafael J. Wysocki @ 2009-02-23 22:11 UTC
To: Linus Torvalds
Cc: Ingo Molnar, Johannes Berg, LKML, Eric W. Biederman, Benjamin Herrenschmidt, Jeremy Fitzhardinge, pm list, Len Brown, Jesse Barnes, Thomas Gleixner

On Monday 23 February 2009, Linus Torvalds wrote:
>
> On Mon, 23 Feb 2009, Ingo Molnar wrote:
> >
> > Linus, do you have a strong opinion about which variant we
> > should use?
>
> Strong? No. I think mine is better just because _if_ another CPU is busy
> handling an interrupt that we're just now disabling, we'll just go on to
> the next interrupt. Waiting for them all at the end is always more
> efficient.
>
> But does it really matter? No. In this case I think we've shut down all
> other CPUs anyway, so the whole "synchronize_irq()" should probably not
> even be needed.

But we're going to move the shutting down of the other CPUs after this
point. Eventually, the sequence is going to be:

- "normal" suspend of devices
- disable device interrupts
- "late" suspend of devices
- _PTS
- disable nonboot CPUs
- local_irq_disable
- sysdev_suspend

[This is because ACPI wants us to put devices into low power states before
doing the _PTS, which in turn is supposed to be done before the disabling
of nonboot CPUs, and we want to put devices into low power states during
"late" suspend. Of course, analogously for the resume part.]

So, I think your version is _really_ better. :-)

Thanks,
Rafael
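The planned ordering can be written down as a small table, which makes the two consequences discussed in the thread easy to check: nonboot CPUs are still online when device interrupts get disabled (so waiting for running handlers still matters), and the "late" callbacks run before interrupts are disabled on the CPU. This is a sketch under stated assumptions: the enum, the names and the helper functions are hypothetical, and treating resume as the exact mirror image is only a reading of "analogously for the resume part".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Planned suspend order, as listed in the message above. */
enum model_suspend_step {
	STEP_SUSPEND_DEVICES,		/* "normal" suspend of devices */
	STEP_DISABLE_DEVICE_IRQS,
	STEP_SUSPEND_DEVICES_LATE,	/* "late" suspend of devices */
	STEP_ACPI_PTS,
	STEP_DISABLE_NONBOOT_CPUS,
	STEP_LOCAL_IRQ_DISABLE,
	STEP_SYSDEV_SUSPEND,
	STEP_COUNT
};

static const char *const model_step_names[STEP_COUNT] = {
	"\"normal\" suspend of devices",
	"disable device interrupts",
	"\"late\" suspend of devices",
	"_PTS",
	"disable nonboot CPUs",
	"local_irq_disable",
	"sysdev_suspend",
};

/* Other CPUs can still be running handlers before they are taken down. */
static bool model_nonboot_cpus_online(enum model_suspend_step step)
{
	return step < STEP_DISABLE_NONBOOT_CPUS;
}

/* Interrupts on the CPU are on until local_irq_disable runs. */
static bool model_cpu_irqs_enabled(enum model_suspend_step step)
{
	return step < STEP_LOCAL_IRQ_DISABLE;
}

/* Assumption: resume undoes the steps in reverse order. */
static enum model_suspend_step model_resume_step(size_t i)
{
	return (enum model_suspend_step)(STEP_COUNT - 1 - i);
}
```

Read this way, `model_nonboot_cpus_online(STEP_DISABLE_DEVICE_IRQS)` being true is the reason the wait for in-flight handlers cannot simply be dropped once the CPU shutdown moves later in the sequence.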
* Re: [RFC][PATCH 2/2] PM: Rework handling of interrupts during suspend-resume 2009-02-23 8:36 ` Ingo Molnar 2009-02-23 11:29 ` Rafael J. Wysocki @ 2009-02-23 11:29 ` Rafael J. Wysocki 1 sibling, 0 replies; 373+ messages in thread From: Rafael J. Wysocki @ 2009-02-23 11:29 UTC (permalink / raw) To: Ingo Molnar, Johannes Berg Cc: Jeremy Fitzhardinge, LKML, Jesse Barnes, Eric W. Biederman, pm list, Linus Torvalds, Thomas Gleixner On Monday 23 February 2009, Ingo Molnar wrote: > > * Rafael J. Wysocki <rjw@sisk.pl> wrote: > > > On Sunday 22 February 2009, Rafael J. Wysocki wrote: > > > On Sunday 22 February 2009, Linus Torvalds wrote: > > > > > > > > On Sun, 22 Feb 2009, Rafael J. Wysocki wrote: > > [--snip--] > > > > > > Thanks a lot for your comments, I'll send an updated patch shortly. > > > > The updated patch is appended. > > > > It has been initially tested, but requires more testing, > > especially with APM, XEN, kexec jump etc. > > > arch/x86/kernel/apm_32.c | 20 ++++++++++++---- > > drivers/xen/manage.c | 32 +++++++++++++++---------- > > include/linux/interrupt.h | 3 ++ > > include/linux/irq.h | 1 > > kernel/irq/manage.c | 57 ++++++++++++++++++++++++++++++++++++++++++++++ > > kernel/kexec.c | 10 ++++---- > > kernel/power/disk.c | 46 +++++++++++++++++++++++++++++-------- > > kernel/power/main.c | 20 +++++++++++----- > > 8 files changed, 152 insertions(+), 37 deletions(-) > > > > Index: linux-2.6/kernel/irq/manage.c > > =================================================================== > > --- linux-2.6.orig/kernel/irq/manage.c > > +++ linux-2.6/kernel/irq/manage.c > > @@ -746,3 +746,60 @@ int request_irq(unsigned int irq, irq_ha > > return retval; > > } > > EXPORT_SYMBOL(request_irq); > > + > > +#ifdef CONFIG_PM_SLEEP > > +/** > > + * suspend_device_irqs - disable all currently enabled interrupt lines > > Code placement nit: please dont put new #ifdef blocks into the > core IRQ code, add a kernel/irq/power.c file instead and make > the kbuild rule depend on 
PM_SLEEP. > > The new suspend_device_irqs() and resume_device_irqs() doesnt > use any manage.c internals so this should work straight away. OK, I'll do that. > > + * > > + * During system-wide suspend or hibernation device interrupts need to be > > + * disabled at the chip level and this function is provided for this > > + * purpose. It disables all interrupt lines that are enabled at the > > + * moment and sets the IRQ_SUSPENDED flag for them. > > + */ > > +void suspend_device_irqs(void) > > +{ > > + struct irq_desc *desc; > > + int irq; > > + > > + for_each_irq_desc(irq, desc) { > > + unsigned long flags; > > + > > + spin_lock_irqsave(&desc->lock, flags); > > + > > + if (!desc->depth && desc->action > > + && !(desc->action->flags & IRQF_TIMER)) { > > + desc->depth++; > > + desc->status |= IRQ_DISABLED | IRQ_SUSPENDED; > > + desc->chip->disable(irq); > > + } > > + > > + spin_unlock_irqrestore(&desc->lock, flags); > > + } > > + > > + for_each_irq_desc(irq, desc) { > > + if (desc->status & IRQ_SUSPENDED) > > + synchronize_irq(irq); > > + } > > Optimization/code-flow nit: a possibility might be to do a > single loop, i.e. i think it's safe to couple the disable+sync > bits [as in 99.99% of the cases there will be no in-execution > irq handlers when we execute this.] Well, Linus suggested to do it in a separate loop. I'm fine with both ways. > Something like: > > int do_sync = 0; > > spin_lock_irqsave(&desc->lock, flags); > > if (!desc->depth && desc->action > && !(desc->action->flags & IRQF_TIMER)) { > > desc->depth++; > desc->status |= IRQ_DISABLED | IRQ_SUSPENDED; > desc->chip->disable(irq); > do_sync = 1; > } > > spin_unlock_irqrestore(&desc->lock, flags); > > if (do_sync) > synchronize_irq(irq); > > In fact i'd suggest to factor out this logic into a separate > __suspend_irq(irq) / __resume_irq(irq) inline helper functions. 
> (They should be inline for the time being as they are not > shared-irq-safe so they shouldnt really be exposed to drivers in > such a singular capacity.) Good idea, I'll do it. > Doing so will also fix the line-break ugliness of the first > branch - as in a standalone function the condition fits into a > single line. > > There's a performance reason as well: especially when we have a > lot of IRQ descriptors that will be about twice as fast. (with a > large iteration scope this function is cachemiss-limited and > doing this passes doubles the cachemiss rate.) > > > +} > > +EXPORT_SYMBOL_GPL(suspend_device_irqs); > > + > > +/** > > + * resume_device_irqs - enable interrupts disabled by suspend_device_irqs() > > + * > > + * Enable all interrupt lines previously disabled by suspend_device_irqs() > > + * that have the IRQ_SUSPENDED flag set. > > + */ > > +void resume_device_irqs(void) > > +{ > > + struct irq_desc *desc; > > + int irq; > > + > > + for_each_irq_desc(irq, desc) { > > + if (!(desc->status & IRQ_SUSPENDED)) > > + continue; > > + desc->status &= ~IRQ_SUSPENDED; > > + enable_irq(irq); > > + } > > Robustness+optimization nit: this will work but could be done in > a nicer way: enable_irq() should auto-clear IRQ_SUSPENDED. (We > already clear flags there so it's even a tiny bit faster this > way.) OK > We definitely dont want IRQ_SUSPENDED to 'leak' out into an > enabled line, should something call enable_irq() on a suspended > line. So either make it auto-unsuspend in enable_irq(), or add > an extra WARN_ON() to enable_irq(), to make sure IRQ_SUSPENDED > is always off by that time. > > > + arch_suspend_disable_irqs(); > > + BUG_ON(!irqs_disabled()); > > Please. We just disabled all devices - a BUG_ON() is a very > counter-productive thing to do here - chances are the user will > never see anything but a hang. So please turn this into a nice > WARN_ONCE(). This is just moving code. 
Also, the BUG_ON() can only affect powerpc and it's there on purpose AFAICS (Johannes?). Anyway, changing that would be a separate patch. > > --- linux-2.6.orig/include/linux/interrupt.h > > +++ linux-2.6/include/linux/interrupt.h > > @@ -470,4 +470,7 @@ extern int early_irq_init(void); > > extern int arch_early_irq_init(void); > > extern int arch_init_chip_data(struct irq_desc *desc, int cpu); > > > > +extern void suspend_device_irqs(void); > > +extern void resume_device_irqs(void); > > Header cleanliness nit: please dont just throw new prototypes to > the tail of headers, but think about where they fit in best, > logically. > > These two new prototypes should go straight after the normal irq > line state management functions: > > extern void disable_irq_nosync(unsigned int irq); > extern void disable_irq(unsigned int irq); > extern void enable_irq(unsigned int irq); > > Perhaps also with a comment like this: > > /* > * Note: dont use these functions in driver code - they are for > * core kernel use only. > */ OK, I'll put them in there. > > +++ linux-2.6/kernel/power/main.c > [...] > > + > > + Unlock: > > + resume_device_irqs(); > > Small drive-by style nit: while at it could you please fix the > capitalization and the naming of the label (and all labels in > this file)? I don't think they are wrong. They are uniform across the file and it's clear what they mean. > The standard label is "out_unlock". [and "err_unlock" for failure cases > - but this isnt a failure case.] Where exactly is this standard defined? > There's 43 such bad label names in kernel/power/*.c, see the > output of: > > git grep '^ [A-Z][a-z].*:$' kernel/power/ If you think they are bad, please send a patch to change them.
> > Index: linux-2.6/arch/x86/kernel/apm_32.c > > =================================================================== > > --- linux-2.6.orig/arch/x86/kernel/apm_32.c > > +++ linux-2.6/arch/x86/kernel/apm_32.c > > > + > > + suspend_device_irqs(); > > device_power_down(PMSG_SUSPEND); > > + > > + local_irq_disable(); > > hm, this is a very repetitive pattern, all around the various > suspend/resume variants. Might make sense to make: > > device_power_down(PMSG_SUSPEND); > > do the irq line disabling plus the local irq disabling > automatically. That also means it cannot be forgotten. The > symmetric action should happen for PMSG_RESUME. > > Is there ever a case where we want a different pattern? Even if there's no such case, I prefer to call local_irq_disable() explicitly in here, so that it's clearly known where it happens to anyone reading this code. Doing the "late" suspend of devices and disabling interrupts on the CPU are separate logical steps. > > Index: linux-2.6/drivers/xen/manage.c > > =================================================================== > > --- linux-2.6.orig/drivers/xen/manage.c > > +++ linux-2.6/drivers/xen/manage.c > > @@ -39,12 +39,6 @@ static int xen_suspend(void *data) > > > - if (!*cancelled) { > > - xen_irq_resume(); > > - xen_console_resume(); > > - xen_timer_resume(); > > This change needs a second look. xen_suspend() is a > stop_machine() handler and as such executes on specific CPUs, > and your change modifies this. OTOH, i had a look at these > handlers and it all looks safe. Jeremy? > > > +resume_devices: > > + resume_device_irqs(); > > Small style nit: labels should start with a space character. > I.e. it should be: I know, but the second label in there starts without a space character and IMO keeping a uniform coding style in a single file is more important than trying to adjust it to a broader set of rules FWIW. I also think that coding style changes shouldn't be mixed with functional changes as far as reasonably possible.
> > + resume_devices: > > + resume_device_irqs(); > > > +++ linux-2.6/kernel/kexec.c > > @@ -1454,7 +1454,7 @@ int kernel_kexec(void) > > if (error) > > goto Resume_devices; > > device_pm_lock(); > > - local_irq_disable(); > > + suspend_device_irqs(); > > /* At this point, device_suspend() has been called, > > * but *not* device_power_down(). We *must* > > * device_power_down() now. Otherwise, drivers for > > @@ -1464,8 +1464,9 @@ int kernel_kexec(void) > > */ > > error = device_power_down(PMSG_FREEZE); > > if (error) > > - goto Enable_irqs; > > + goto Resume_irqs; > > > > + local_irq_disable(); > > /* Suspend system devices */ > > error = sysdev_suspend(PMSG_FREEZE); > > if (error) > > @@ -1484,9 +1485,10 @@ int kernel_kexec(void) > > if (kexec_image->preserve_context) { > > sysdev_resume(); > > Power_up_devices: > > - device_power_up(PMSG_RESTORE); > > - Enable_irqs: > > local_irq_enable(); > > + device_power_up(PMSG_RESTORE); > > + Resume_irqs: > > + resume_device_irqs(); > > device_pm_unlock(); > > enable_nonboot_cpus(); > > Resume_devices: > > (same comment about label style applies here too.) > > > Index: linux-2.6/include/linux/irq.h > > =================================================================== > > --- linux-2.6.orig/include/linux/irq.h > > +++ linux-2.6/include/linux/irq.h > > @@ -65,6 +65,7 @@ typedef void (*irq_flow_handler_t)(unsig > > #define IRQ_SPURIOUS_DISABLED 0x00800000 /* IRQ was disabled by the spurious trap */ > > #define IRQ_MOVE_PCNTXT 0x01000000 /* IRQ migration from process context */ > > #define IRQ_AFFINITY_SET 0x02000000 /* IRQ affinity was set from userspace*/ > > +#define IRQ_SUSPENDED 0x04000000 /* IRQ has gone through suspend sequence */ > > > > #ifdef CONFIG_IRQ_PER_CPU > > # define CHECK_IRQ_PER_CPU(var) ((var) & IRQ_PER_CPU) > > Note, you should probably make PM_SLEEP depend on > GENERIC_HARDIRQS - as this change will break the build on all > non-genirq architectures. (sparc, alpha, etc.) 
PM_SLEEP depends on ARCH_SUSPEND_POSSIBLE || ARCH_HIBERNATION_POSSIBLE, which I don't think is set on these architectures. Thanks a lot for your comments. Rafael ^ permalink raw reply [flat|nested] 373+ messages in thread
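[Editorial sketch, not part of the thread.] Two of the review suggestions above, the single-pass disable+sync helper and an enable path that clears IRQ_SUSPENDED on its own, can be modelled in user space roughly as follows. This is only a sketch under stated assumptions: `struct model_desc`, `model_suspend_irq()`, `model_enable_irq()` and the `MODEL_*` bit values are hypothetical names invented for illustration, the `syncs` counter merely stands in for synchronize_irq(), and no locking is modelled. Clearing the flag on every enable call is one possible reading of the "auto-unsuspend" suggestion, not something the thread settles.

```c
#include <assert.h>
#include <stdbool.h>

/* Placeholder bit values for this model only; not copied from a kernel header. */
#define MODEL_IRQF_TIMER    0x1u
#define MODEL_IRQ_DISABLED  0x1u
#define MODEL_IRQ_SUSPENDED 0x2u

/* Hypothetical, heavily reduced stand-in for struct irq_desc. */
struct model_desc {
    unsigned int status;       /* MODEL_IRQ_* bits */
    unsigned int depth;        /* nested-disable count */
    bool has_action;           /* stands in for desc->action != NULL */
    unsigned int action_flags; /* flags of the first handler only */
    unsigned int syncs;        /* how often we "called" synchronize_irq() */
};

/*
 * Per-line helper in the spirit of the suggested __suspend_irq():
 * disable the line and synchronize in the same pass.
 * Returns true if the line was suspended by this call.
 */
bool model_suspend_irq(struct model_desc *d)
{
    bool do_sync = false;

    /* desc->lock would be taken here in the real code */
    if (!d->depth && d->has_action &&
        !(d->action_flags & MODEL_IRQF_TIMER)) {
        d->depth++;
        d->status |= MODEL_IRQ_DISABLED | MODEL_IRQ_SUSPENDED;
        do_sync = true;
    }
    /* ...and released here, before waiting for running handlers */

    if (do_sync)
        d->syncs++; /* stands in for synchronize_irq(irq) */

    return do_sync;
}

/*
 * Enable path that clears the suspended flag by itself, so the flag
 * cannot leak into an enabled line. This model asserts that enables
 * are balanced against earlier disables.
 */
void model_enable_irq(struct model_desc *d)
{
    assert(d->depth > 0);

    d->status &= ~MODEL_IRQ_SUSPENDED;
    if (--d->depth == 0)
        d->status &= ~MODEL_IRQ_DISABLED;
}
```

With a helper like this, the resume loop would only need to test the flag and call the enable function; it would no longer clear the flag itself.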
* Re: [RFC][PATCH 2/2] PM: Rework handling of interrupts during suspend-resume 2009-02-22 22:42 ` Rafael J. Wysocki 2009-02-22 23:48 ` Rafael J. Wysocki @ 2009-02-22 23:48 ` Rafael J. Wysocki 1 sibling, 0 replies; 373+ messages in thread From: Rafael J. Wysocki @ 2009-02-22 23:48 UTC (permalink / raw) To: Linus Torvalds Cc: Jeremy Fitzhardinge, LKML, Jesse Barnes, Thomas Gleixner, Eric W. Biederman, Ingo Molnar, pm list On Sunday 22 February 2009, Rafael J. Wysocki wrote: > On Sunday 22 February 2009, Linus Torvalds wrote: > > > > On Sun, 22 Feb 2009, Rafael J. Wysocki wrote: [--snip--] > > Thanks a lot for your comments, I'll send an updated patch shortly. The updated patch is appended. It has been initially tested, but requires more testing, especially with APM, XEN, kexec jump etc. Thanks, Rafael --- From: Rafael J. Wysocki <rjw@sisk.pl> Subject: PM: Rework handling of interrupts during suspend-resume (rev. 2) Introduce two helper functions allowing us to disable device interrupts (at the IO-APIC level) during suspend or hibernation and enable them during the subsequent resume, respectively, so that the timer interrupts are enabled while "late" suspend callbacks and "early" resume callbacks provided by device drivers are being executed. Use these functions to rework the handling of interrupts during suspend (hibernation) and resume. Namely, interrupts will only be disabled on the CPU right before suspending sysdevs, while device interrupts will be disabled (at the IO-APIC level), with the help of the new helper function, before calling "late" suspend callbacks provided by device drivers and analogously during resume. Signed-off-by: Rafael J. 
Wysocki <rjw@sisk.pl> --- arch/x86/kernel/apm_32.c | 20 ++++++++++++---- drivers/xen/manage.c | 32 +++++++++++++++---------- include/linux/interrupt.h | 3 ++ include/linux/irq.h | 1 kernel/irq/manage.c | 57 ++++++++++++++++++++++++++++++++++++++++++++++ kernel/kexec.c | 10 ++++---- kernel/power/disk.c | 46 +++++++++++++++++++++++++++++-------- kernel/power/main.c | 20 +++++++++++----- 8 files changed, 152 insertions(+), 37 deletions(-) Index: linux-2.6/kernel/irq/manage.c =================================================================== --- linux-2.6.orig/kernel/irq/manage.c +++ linux-2.6/kernel/irq/manage.c @@ -746,3 +746,60 @@ int request_irq(unsigned int irq, irq_ha return retval; } EXPORT_SYMBOL(request_irq); + +#ifdef CONFIG_PM_SLEEP +/** + * suspend_device_irqs - disable all currently enabled interrupt lines + * + * During system-wide suspend or hibernation device interrupts need to be + * disabled at the chip level and this function is provided for this + * purpose. It disables all interrupt lines that are enabled at the + * moment and sets the IRQ_SUSPENDED flag for them. + */ +void suspend_device_irqs(void) +{ + struct irq_desc *desc; + int irq; + + for_each_irq_desc(irq, desc) { + unsigned long flags; + + spin_lock_irqsave(&desc->lock, flags); + + if (!desc->depth && desc->action + && !(desc->action->flags & IRQF_TIMER)) { + desc->depth++; + desc->status |= IRQ_DISABLED | IRQ_SUSPENDED; + desc->chip->disable(irq); + } + + spin_unlock_irqrestore(&desc->lock, flags); + } + + for_each_irq_desc(irq, desc) { + if (desc->status & IRQ_SUSPENDED) + synchronize_irq(irq); + } +} +EXPORT_SYMBOL_GPL(suspend_device_irqs); + +/** + * resume_device_irqs - enable interrupts disabled by suspend_device_irqs() + * + * Enable all interrupt lines previously disabled by suspend_device_irqs() + * that have the IRQ_SUSPENDED flag set. 
+ */ +void resume_device_irqs(void) +{ + struct irq_desc *desc; + int irq; + + for_each_irq_desc(irq, desc) { + if (!(desc->status & IRQ_SUSPENDED)) + continue; + desc->status &= ~IRQ_SUSPENDED; + enable_irq(irq); + } +} +EXPORT_SYMBOL_GPL(resume_device_irqs); +#endif /* CONFIG_PM_SLEEP */ Index: linux-2.6/include/linux/interrupt.h =================================================================== --- linux-2.6.orig/include/linux/interrupt.h +++ linux-2.6/include/linux/interrupt.h @@ -470,4 +470,7 @@ extern int early_irq_init(void); extern int arch_early_irq_init(void); extern int arch_init_chip_data(struct irq_desc *desc, int cpu); +extern void suspend_device_irqs(void); +extern void resume_device_irqs(void); + #endif Index: linux-2.6/kernel/power/main.c =================================================================== --- linux-2.6.orig/kernel/power/main.c +++ linux-2.6/kernel/power/main.c @@ -22,6 +22,7 @@ #include <linux/freezer.h> #include <linux/vmstat.h> #include <linux/syscalls.h> +#include <linux/interrupt.h> #include "power.h" @@ -287,17 +288,20 @@ void __attribute__ ((weak)) arch_suspend */ static int suspend_enter(suspend_state_t state) { - int error = 0; + int error; device_pm_lock(); - arch_suspend_disable_irqs(); - BUG_ON(!irqs_disabled()); + suspend_device_irqs(); - if ((error = device_power_down(PMSG_SUSPEND))) { + error = device_power_down(PMSG_SUSPEND); + if (error) { printk(KERN_ERR "PM: Some devices failed to power down\n"); goto Done; } + arch_suspend_disable_irqs(); + BUG_ON(!irqs_disabled()); + error = sysdev_suspend(PMSG_SUSPEND); if (!error) { if (!suspend_test(TEST_CORE)) @@ -305,11 +309,15 @@ static int suspend_enter(suspend_state_t sysdev_resume(); } - device_power_up(PMSG_RESUME); - Done: arch_suspend_enable_irqs(); BUG_ON(irqs_disabled()); + + device_power_up(PMSG_RESUME); + + Done: + resume_device_irqs(); device_pm_unlock(); + return error; } Index: linux-2.6/kernel/power/disk.c 
=================================================================== --- linux-2.6.orig/kernel/power/disk.c +++ linux-2.6/kernel/power/disk.c @@ -22,6 +22,7 @@ #include <linux/console.h> #include <linux/cpu.h> #include <linux/freezer.h> +#include <linux/interrupt.h> #include "power.h" @@ -214,7 +215,8 @@ static int create_image(int platform_mod return error; device_pm_lock(); - local_irq_disable(); + suspend_device_irqs(); + /* At this point, device_suspend() has been called, but *not* * device_power_down(). We *must* call device_power_down() now. * Otherwise, drivers for some devices (e.g. interrupt controllers) @@ -225,8 +227,11 @@ static int create_image(int platform_mod if (error) { printk(KERN_ERR "PM: Some devices failed to power down, " "aborting hibernation\n"); - goto Enable_irqs; + goto Unlock; } + + local_irq_disable(); + sysdev_suspend(PMSG_FREEZE); if (error) { printk(KERN_ERR "PM: Some devices failed to power down, " @@ -252,12 +257,17 @@ static int create_image(int platform_mod /* NOTE: device_power_up() is just a resume() for devices * that suspended with irqs off ... no overall powerup. */ + Power_up_devices: + local_irq_enable(); + device_power_up(in_suspend ? (error ? 
PMSG_RECOVER : PMSG_THAW) : PMSG_RESTORE); - Enable_irqs: - local_irq_enable(); + + Unlock: + resume_device_irqs(); device_pm_unlock(); + return error; } @@ -336,13 +346,17 @@ static int resume_target_kernel(void) int error; device_pm_lock(); - local_irq_disable(); + suspend_device_irqs(); + error = device_power_down(PMSG_QUIESCE); if (error) { printk(KERN_ERR "PM: Some devices failed to power down, " "aborting resume\n"); - goto Enable_irqs; + goto Unlock; } + + local_irq_disable(); + sysdev_suspend(PMSG_QUIESCE); /* We'll ignore saved state, but this gets preempt count (etc) right */ save_processor_state(); @@ -366,11 +380,17 @@ static int resume_target_kernel(void) swsusp_free(); restore_processor_state(); touch_softlockup_watchdog(); + sysdev_resume(); - device_power_up(PMSG_RECOVER); - Enable_irqs: + local_irq_enable(); + + device_power_up(PMSG_RECOVER); + + Unlock: + resume_device_irqs(); device_pm_unlock(); + return error; } @@ -447,15 +467,18 @@ int hibernation_platform_enter(void) goto Finish; device_pm_lock(); - local_irq_disable(); + suspend_device_irqs(); + error = device_power_down(PMSG_HIBERNATE); if (!error) { + local_irq_disable(); sysdev_suspend(PMSG_HIBERNATE); hibernation_ops->enter(); /* We should never get here */ while (1); } - local_irq_enable(); + + resume_device_irqs(); device_pm_unlock(); /* @@ -464,12 +487,15 @@ int hibernation_platform_enter(void) */ Finish: hibernation_ops->finish(); + Resume_devices: entering_platform_hibernation = false; device_resume(PMSG_RESTORE); resume_console(); + Close: hibernation_ops->end(); + return error; } Index: linux-2.6/arch/x86/kernel/apm_32.c =================================================================== --- linux-2.6.orig/arch/x86/kernel/apm_32.c +++ linux-2.6/arch/x86/kernel/apm_32.c @@ -228,6 +228,7 @@ #include <linux/suspend.h> #include <linux/kthread.h> #include <linux/jiffies.h> +#include <linux/interrupt.h> #include <asm/system.h> #include <asm/uaccess.h> @@ -1190,8 +1191,11 @@ static int 
suspend(int vetoable) struct apm_user *as; device_suspend(PMSG_SUSPEND); - local_irq_disable(); + + suspend_device_irqs(); device_power_down(PMSG_SUSPEND); + + local_irq_disable(); sysdev_suspend(PMSG_SUSPEND); local_irq_enable(); @@ -1209,9 +1213,13 @@ static int suspend(int vetoable) if (err != APM_SUCCESS) apm_error("suspend", err); err = (err == APM_SUCCESS) ? 0 : -EIO; + sysdev_resume(); - device_power_up(PMSG_RESUME); local_irq_enable(); + + device_power_up(PMSG_RESUME); + resume_device_irqs(); + device_resume(PMSG_RESUME); queue_event(APM_NORMAL_RESUME, NULL); spin_lock(&user_list_lock); @@ -1228,8 +1236,10 @@ static void standby(void) { int err; - local_irq_disable(); + suspend_device_irqs(); device_power_down(PMSG_SUSPEND); + + local_irq_disable(); sysdev_suspend(PMSG_SUSPEND); local_irq_enable(); @@ -1239,8 +1249,10 @@ static void standby(void) local_irq_disable(); sysdev_resume(); - device_power_up(PMSG_RESUME); local_irq_enable(); + + device_power_up(PMSG_RESUME); + resume_device_irqs(); } static apm_event_t get_event(void) Index: linux-2.6/drivers/xen/manage.c =================================================================== --- linux-2.6.orig/drivers/xen/manage.c +++ linux-2.6/drivers/xen/manage.c @@ -39,12 +39,6 @@ static int xen_suspend(void *data) BUG_ON(!irqs_disabled()); - err = device_power_down(PMSG_SUSPEND); - if (err) { - printk(KERN_ERR "xen_suspend: device_power_down failed: %d\n", - err); - return err; - } err = sysdev_suspend(PMSG_SUSPEND); if (err) { printk(KERN_ERR "xen_suspend: sysdev_suspend failed: %d\n", @@ -69,13 +63,6 @@ static int xen_suspend(void *data) xen_mm_unpin_all(); sysdev_resume(); - device_power_up(PMSG_RESUME); - - if (!*cancelled) { - xen_irq_resume(); - xen_console_resume(); - xen_timer_resume(); - } return 0; } @@ -108,6 +95,14 @@ static void do_suspend(void) /* XXX use normal device tree? 
*/ xenbus_suspend(); + suspend_device_irqs(); + + err = device_power_down(PMSG_SUSPEND); + if (err) { + printk(KERN_ERR "device_power_down failed: %d\n", err); + goto resume_devices; + } + err = stop_machine(xen_suspend, &cancelled, &cpumask_of_cpu(0)); if (err) { printk(KERN_ERR "failed to start xen_suspend: %d\n", err); @@ -120,6 +115,17 @@ static void do_suspend(void) } else xenbus_suspend_cancel(); + device_power_up(PMSG_RESUME); + + if (!cancelled) { + xen_irq_resume(); + xen_console_resume(); + xen_timer_resume(); + } + +resume_devices: + resume_device_irqs(); + device_resume(PMSG_RESUME); /* Make sure timer events get retriggered on all CPUs */ Index: linux-2.6/kernel/kexec.c =================================================================== --- linux-2.6.orig/kernel/kexec.c +++ linux-2.6/kernel/kexec.c @@ -1454,7 +1454,7 @@ int kernel_kexec(void) if (error) goto Resume_devices; device_pm_lock(); - local_irq_disable(); + suspend_device_irqs(); /* At this point, device_suspend() has been called, * but *not* device_power_down(). We *must* * device_power_down() now. 
Otherwise, drivers for @@ -1464,8 +1464,9 @@ int kernel_kexec(void) */ error = device_power_down(PMSG_FREEZE); if (error) - goto Enable_irqs; + goto Resume_irqs; + local_irq_disable(); /* Suspend system devices */ error = sysdev_suspend(PMSG_FREEZE); if (error) @@ -1484,9 +1485,10 @@ int kernel_kexec(void) if (kexec_image->preserve_context) { sysdev_resume(); Power_up_devices: - device_power_up(PMSG_RESTORE); - Enable_irqs: local_irq_enable(); + device_power_up(PMSG_RESTORE); + Resume_irqs: + resume_device_irqs(); device_pm_unlock(); enable_nonboot_cpus(); Resume_devices: Index: linux-2.6/include/linux/irq.h =================================================================== --- linux-2.6.orig/include/linux/irq.h +++ linux-2.6/include/linux/irq.h @@ -65,6 +65,7 @@ typedef void (*irq_flow_handler_t)(unsig #define IRQ_SPURIOUS_DISABLED 0x00800000 /* IRQ was disabled by the spurious trap */ #define IRQ_MOVE_PCNTXT 0x01000000 /* IRQ migration from process context */ #define IRQ_AFFINITY_SET 0x02000000 /* IRQ affinity was set from userspace*/ +#define IRQ_SUSPENDED 0x04000000 /* IRQ has gone through suspend sequence */ #ifdef CONFIG_IRQ_PER_CPU # define CHECK_IRQ_PER_CPU(var) ((var) & IRQ_PER_CPU) ^ permalink raw reply [flat|nested] 373+ messages in thread
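[Editorial sketch, not part of the thread.] The ordering that the rev. 2 patch establishes in suspend_enter() can be summarised with a small user-space trace model. It is a simplification under stated assumptions: `model_suspend_enter()`, `model_trace()` and the step names are hypothetical, "local_irq_disable"/"local_irq_enable" stand in for the arch_suspend_disable_irqs()/arch_suspend_enable_irqs() calls, "enter" stands in for the platform suspend step, and only the device_power_down() failure path is modelled.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Records the order of steps as "name;name;..." for inspection. */
static char trace[256];

static void step(const char *name)
{
    size_t used = strlen(trace);

    assert(used < sizeof(trace));
    snprintf(trace + used, sizeof(trace) - used, "%s;", name);
}

const char *model_trace(void)
{
    return trace;
}

/*
 * Hypothetical model of the reworked suspend_enter() ordering.
 * power_down_error simulates the return value of device_power_down().
 */
int model_suspend_enter(int power_down_error)
{
    int error;

    trace[0] = '\0';

    /* Device interrupt lines are masked first; the timer keeps running. */
    step("suspend_device_irqs");

    /* "Late" suspend callbacks run with CPU interrupts still enabled. */
    step("device_power_down");
    error = power_down_error;
    if (error)
        goto done;

    /* Only now are interrupts disabled on the CPU itself. */
    step("local_irq_disable");
    step("sysdev_suspend");
    step("enter");

    /* Resume mirrors the suspend path in reverse order. */
    step("sysdev_resume");
    step("local_irq_enable");
    step("device_power_up");

done:
    step("resume_device_irqs");
    return error;
}
```

The point of the model is the error path: when the "late" suspend fails, CPU interrupts were never disabled, so only the device interrupt lines need to be resumed.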
* Re: [RFC][PATCH 2/2] PM: Rework handling of interrupts during suspend-resume 2009-02-22 18:01 ` Linus Torvalds 2009-02-22 22:42 ` Rafael J. Wysocki @ 2009-02-22 22:42 ` Rafael J. Wysocki 1 sibling, 0 replies; 373+ messages in thread From: Rafael J. Wysocki @ 2009-02-22 22:42 UTC (permalink / raw) To: Linus Torvalds Cc: Jeremy Fitzhardinge, LKML, Jesse Barnes, Thomas Gleixner, Eric W. Biederman, Ingo Molnar, pm list On Sunday 22 February 2009, Linus Torvalds wrote: > > On Sun, 22 Feb 2009, Rafael J. Wysocki wrote: > > > > Use these functions to rework the handling of interrupts during > > suspend (hibernation) and resume. Namely, interrupts will only be > > disabled on the CPU right before suspending sysdevs, while device > > interrupts will be disabled (at the IO-APIC level), with the help of > > the new helper function, before calling "late" suspend callbacks > > provided by device drivers and analogously during resume. > > I think this patch is actually a bit too complicated. > > > +struct disabled_irq { > > + struct list_head list; > > + int irq; > > +}; > > + > > +static LIST_HEAD(resume_irqs_list); > > + > > +/** > > + * enable_device_irqs - enable interrupts disabled by disable_device_irqs() > > + * > > + * Enable all interrupt lines previously disabled by disable_device_irqs() > > + * that are on resume_irqs_list. > > + */ > > +void enable_device_irqs(void) > > +{ > > + struct disabled_irq *resume_irq, *tmp; > > + > > + list_for_each_entry_safe(resume_irq, tmp, &resume_irqs_list, list) { > > + enable_irq(resume_irq->irq); > > + list_del(&resume_irq->list); > > + kfree(resume_irq); > > + } > > +} > > Don't do this whole separate list. Instead, just add a per-irq-descriptor > flag to the desc->status field that says "suspended". 
IOW, just do > something like OK > diff --git a/include/linux/irq.h b/include/linux/irq.h > index f899b50..7bc2a31 100644 > --- a/include/linux/irq.h > +++ b/include/linux/irq.h > @@ -65,6 +65,7 @@ typedef void (*irq_flow_handler_t)(unsigned int irq, > #define IRQ_SPURIOUS_DISABLED 0x00800000 /* IRQ was disabled by the spurious trap */ > #define IRQ_MOVE_PCNTXT 0x01000000 /* IRQ migration from process context */ > #define IRQ_AFFINITY_SET 0x02000000 /* IRQ affinity was set from userspace*/ > +#define IRQ_SUSPENDED 0x04000000 /* IRQ has gone through suspend sequence */ > > #ifdef CONFIG_IRQ_PER_CPU > # define CHECK_IRQ_PER_CPU(var) ((var) & IRQ_PER_CPU) > > and then just make the suspend sequence do > > for_each_irq_desc(irq, desc) { > .. check desc if we should disable it .. > disable_irq(irq); > desc->status |= IRQ_SUSPENDED; > } > > and the resume sequence do > > for_each_irq_desc(irq, desc) { > if (!(desc->status & IRQ_SUSPENDED)) > continue; > desc->status &= ~IRQ_SUSPENDED; > enable_irq(irq); > } > > And that simplification then gets rid of > > > +/** > > + * disable_device_irqs - disable all enabled interrupt lines > > + * > > + * During system-wide suspend or hibernation device interrupts need to be > > + * disabled at the chip level and this function is provided for this > > + * purpose. It disables all interrupt lines that are enabled at the > > + * moment and saves their numbers for enable_device_irqs(). > > + */ > > +int disable_device_irqs(void) > > +{ > > + struct irq_desc *desc; > > + int irq; > > + > > + for_each_irq_desc(irq, desc) { > > + unsigned long flags; > > + struct disabled_irq *resume_irq; > > + struct irqaction *action; > > + bool is_timer_irq; > > + > > + resume_irq = kzalloc(sizeof(*resume_irq), GFP_NOIO); > > + if (!resume_irq) { > > + enable_device_irqs(); > > + return -ENOMEM; > > + } > > this just goes away.
> > > + is_timer_irq = false; > > + action = desc->action; > > + while (action) { > > + if (action->flags | IRQF_TIMER) { > > + is_timer_irq = true; > > + break; > > + } > > + action = action->next; > > + } > > This is also pointless and wrong (and buggy). You should use '&' to > test that flag, not '|', Ouch, sorry. > but more importantly, if you share interrupts with a timer irq, there's > nothing sane the irq layer can do ANYWAY, so just ignore the whole problem. > Just look at the first one, don't try to be clever, because your clever code > doesn't buy anything at all. > > So get rid of the loop, and just do > > if (desc->action && !(desc->action->flags & IRQF_TIMER)) { > desc->depth++; > desc->status |= IRQ_DISABLED | IRQ_SUSPENDED; > desc->chip->disable(irq); > } > spin_unlock_irqrestore(&desc->lock, flags); > > and you're done. OK > Also, I'd actually suggest that the whole "synchronize_irq()" be handled > in a separate loop after the main one, so make that one just be > > for_each_irq_desc(irq, desc) { > if (desc->status & IRQ_SUSPENDED) > synchronize_irq(irq); > } > > at the end. No need for desc->lock, since the IRQ_SUSPENDED bit is stable. OK > Finally: > > > +extern int disable_device_irqs(void); > > +extern void enable_device_irqs(void); > > I think the naming is not great. It's not about disable/enable, it's very > much about suspend/resume. In your version, it had that global > "disabled_irq" list, and in mine it has that IRQ_SUSPENDED bit - and in > both cases you can't nest things, and you can't consider them in any way > "generic" enable/disable things, they are very specialized "shut up > everything but the timer irq". OK, would extern void suspend_device_irqs(void); extern void resume_device_irqs(void); be better? > I also don't think there is any reasonable error case, so just make the > "suspend" thing return 'void', and don't complicate the caller. We don't > error out on the simple "disable_irq()" either.
It's an imperative > statement, not a "please can you try to do that" thing. The error is there just because the memory allocation can fail. With the IRQ_SUSPENDED flag as per your suggestion it won't be necessary any more. Thanks a lot for your comments, I'll send an updated patch shortly. Rafael ^ permalink raw reply [flat|nested] 373+ messages in thread
* Re: [RFC][PATCH 2/2] PM: Rework handling of interrupts during suspend-resume 2009-02-22 17:39 ` [RFC][PATCH 2/2] PM: Rework handling of interrupts during suspend-resume Rafael J. Wysocki 2009-02-22 18:01 ` Linus Torvalds @ 2009-02-22 18:01 ` Linus Torvalds 2009-02-23 22:11 ` Arve Hjønnevåg 2009-02-23 22:11 ` Arve Hjønnevåg 3 siblings, 0 replies; 373+ messages in thread From: Linus Torvalds @ 2009-02-22 18:01 UTC (permalink / raw) To: Rafael J. Wysocki Cc: Jeremy Fitzhardinge, LKML, Jesse Barnes, Thomas Gleixner, Eric W. Biederman, Ingo Molnar, pm list On Sun, 22 Feb 2009, Rafael J. Wysocki wrote: > > Use these functions to rework the handling of interrupts during > suspend (hibernation) and resume. Namely, interrupts will only be > disabled on the CPU right before suspending sysdevs, while device > interrupts will be disabled (at the IO-APIC level), with the help of > the new helper function, before calling "late" suspend callbacks > provided by device drivers and analogously during resume. I think this patch is actually a bit too complicated. > +struct disabled_irq { > + struct list_head list; > + int irq; > +}; > + > +static LIST_HEAD(resume_irqs_list); > + > +/** > + * enable_device_irqs - enable interrupts disabled by disable_device_irqs() > + * > + * Enable all interrupt lines previously disabled by disable_device_irqs() > + * that are on resume_irqs_list. > + */ > +void enable_device_irqs(void) > +{ > + struct disabled_irq *resume_irq, *tmp; > + > + list_for_each_entry_safe(resume_irq, tmp, &resume_irqs_list, list) { > + enable_irq(resume_irq->irq); > + list_del(&resume_irq->list); > + kfree(resume_irq); > + } > +} Don't do this whole separate list. Instead, just add a per-irq-descriptor flag to the desc->status field that says "suspended". 
IOW, just do something like diff --git a/include/linux/irq.h b/include/linux/irq.h index f899b50..7bc2a31 100644 --- a/include/linux/irq.h +++ b/include/linux/irq.h @@ -65,6 +65,7 @@ typedef void (*irq_flow_handler_t)(unsigned int irq, #define IRQ_SPURIOUS_DISABLED 0x00800000 /* IRQ was disabled by the spurious trap */ #define IRQ_MOVE_PCNTXT 0x01000000 /* IRQ migration from process context */ #define IRQ_AFFINITY_SET 0x02000000 /* IRQ affinity was set from userspace*/ +#define IRQ_SUSPENDED 0x04000000 /* IRQ has gone through suspend sequence */ #ifdef CONFIG_IRQ_PER_CPU # define CHECK_IRQ_PER_CPU(var) ((var) & IRQ_PER_CPU) and then just make the suspend sequence do for_each_irq_desc(irq, desc) { .. check desc if we should disable it .. disable_irq(irq); desc->status |= IRQ_SUSPENDED; } and the resume sequence do for_each_irq_desc(irq, desc) { if (!(desc->status & IRQ_SUSPENDED)) continue; desc->status &= ~IRQ_SUSPENDED; enable_irq(irq); } And that simplification then gets rid of > +/** > + * disable_device_irqs - disable all enabled interrupt lines > + * > + * During system-wide suspend or hibernation device interrupts need to be > + * disabled at the chip level and this function is provided for this > + * purpose. It disables all interrupt lines that are enabled at the > + * moment and saves their numbers for enable_device_irqs(). > + */ > +int disable_device_irqs(void) > +{ > + struct irq_desc *desc; > + int irq; > + > + for_each_irq_desc(irq, desc) { > + unsigned long flags; > + struct disabled_irq *resume_irq; > + struct irqaction *action; > + bool is_timer_irq; > + > + resume_irq = kzalloc(sizeof(*resume_irq), GFP_NOIO); > + if (!resume_irq) { > + enable_device_irqs(); > + return -ENOMEM; > + } this just goes away. > + is_timer_irq = false; > + action = desc->action; > + while (action) { > + if (action->flags | IRQF_TIMER) { > + is_timer_irq = true; > + break; > + } > + action = action->next; > + } This is also pointless and wrong (and buggy).
You should use '&' to test that flag, not '|', but more importantly, if you share interrupts with a timer irq, there's nothing sane the irq layer can do ANYWAY, so just ignore the whole problem. Just look at the first one, don't try to be clever, because your clever code doesn't buy anything at all. So get rid of the loop, and just do if (desc->action && !(desc->action->flags & IRQF_TIMER)) { desc->depth++; desc->status |= IRQ_DISABLED | IRQ_SUSPENDED; desc->chip->disable(irq); } spin_unlock_irqrestore(&desc->lock, flags); and you're done. Also, I'd actually suggest that the whole "synchronize_irq()" be handled in a separate loop after the main one, so make that one just be for_each_irq_desc(irq, desc) { if (desc->status & IRQ_SUSPENDED) synchronize_irq(irq); } at the end. No need for desc->lock, since the IRQ_SUSPENDED bit is stable. Finally: > +extern int disable_device_irqs(void); > +extern void enable_device_irqs(void); I think the naming is not great. It's not about disable/enable, it's very much about suspend/resume. In your version, it had that global "disabled_irq" list, and in mine it has that IRQ_SUSPENDED bit - and in both cases you can't nest things, and you can't consider them in any way "generic" enable/disable things, they are very specialized "shut up everything but the timer irq". I also don't think there is any reasonable error case, so just make the "suspend" thing return 'void', and don't complicate the caller. We don't error out on the simple "disable_irq()" either. It's an imperative statement, not a "please can you try to do that" thing. Linus ^ permalink raw reply [flat|nested] 373+ messages in thread
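[Editorial sketch, not part of the thread.] The flag-based scheme described in this message, and the `&` versus `|` mistake it points out, can be tried in user space with a reduced model. Everything here is an assumption made for illustration: `struct model_desc`, `struct model_action`, the `model_*()` functions and the `MODEL_IRQF_TIMER`/`MODEL_IRQ_DISABLED` values are hypothetical; only the 0x04000000 value for the suspended bit is taken from the diff quoted above. The `!depth` test follows the rev. 2 patch rather than the snippet in this message, and `chip_masked` stands in for desc->chip->disable()/enable().

```c
#include <assert.h>
#include <stddef.h>

/* Placeholder values for this model, except MODEL_IRQ_SUSPENDED,
 * which reuses the bit proposed for IRQ_SUSPENDED in the thread. */
#define MODEL_IRQF_TIMER    0x00000001u
#define MODEL_IRQ_DISABLED  0x00000002u
#define MODEL_IRQ_SUSPENDED 0x04000000u

struct model_action {
    unsigned int flags;
};

struct model_desc {
    unsigned int status;
    unsigned int depth;
    const struct model_action *action; /* first handler, NULL if none */
    int chip_masked;                   /* 1 while masked at the "chip" */
};

/* The check from the first patch: true for every possible flags value. */
int model_is_timer_buggy(unsigned int flags)
{
    return (flags | MODEL_IRQF_TIMER) != 0;
}

/* The corrected check: true only if the timer bit is actually set. */
int model_is_timer(unsigned int flags)
{
    return (flags & MODEL_IRQF_TIMER) != 0;
}

/* Mask every enabled non-timer line and remember that via a status bit. */
void model_suspend_device_irqs(struct model_desc *descs, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        struct model_desc *d = &descs[i];

        if (!d->depth && d->action && !model_is_timer(d->action->flags)) {
            d->depth++;
            d->status |= MODEL_IRQ_DISABLED | MODEL_IRQ_SUSPENDED;
            d->chip_masked = 1;
        }
    }
}

/* Re-enable exactly the lines that the suspend pass marked. */
void model_resume_device_irqs(struct model_desc *descs, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        struct model_desc *d = &descs[i];

        if (!(d->status & MODEL_IRQ_SUSPENDED))
            continue;

        d->status &= ~MODEL_IRQ_SUSPENDED;
        assert(d->depth > 0);
        if (--d->depth == 0) {
            d->status &= ~MODEL_IRQ_DISABLED;
            d->chip_masked = 0;
        }
    }
}
```

Because the state lives in the descriptor itself, no allocation is needed and the suspend side has no failure case, which is the argument made above for returning void.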
* Re: [RFC][PATCH 2/2] PM: Rework handling of interrupts during suspend-resume 2009-02-22 17:39 ` [RFC][PATCH 2/2] PM: Rework handling of interrupts during suspend-resume Rafael J. Wysocki 2009-02-22 18:01 ` Linus Torvalds 2009-02-22 18:01 ` Linus Torvalds @ 2009-02-23 22:11 ` Arve Hjønnevåg 2009-02-23 22:11 ` Arve Hjønnevåg 3 siblings, 0 replies; 373+ messages in thread From: Arve Hjønnevåg @ 2009-02-23 22:11 UTC (permalink / raw) To: Rafael J. Wysocki Cc: Jeremy Fitzhardinge, LKML, Jesse Barnes, Thomas Gleixner, Eric W. Biederman, Ingo Molnar, Linus Torvalds, pm list On Sun, Feb 22, 2009 at 9:39 AM, Rafael J. Wysocki <rjw@sisk.pl> wrote: > From: Rafael J. Wysocki <rjw@sisk.pl> > > Introduce two helper functions allowing us to disable device > interrupts (at the IO-APIC level) during suspend or hibernation > and enable them during the subsequent resume, respectively, so that > the timer interrupts are enabled while "late" suspend callbacks and > "early" resume callbacks provided by device drivers are being > executed. > > Use these functions to rework the handling of interrupts during > suspend (hibernation) and resume. Namely, interrupts will only be > disabled on the CPU right before suspending sysdevs, while device > interrupts will be disabled (at the IO-APIC level), with the help of > the new helper function, before calling "late" suspend callbacks > provided by device drivers and analogously during resume. > What impact does this have on wakeup interrupts? Unless you add a check, after masking all interrupts at the CPU, to abort suspend if any wakeup interrupt has IRQ_PENDING set I think you will lose wakeup interrupts (at least for irqs that use default_disable). -- Arve Hjønnevåg ^ permalink raw reply [flat|nested] 373+ messages in thread
* Re: [RFC][PATCH 2/2] PM: Rework handling of interrupts during suspend-resume 2009-02-23 22:11 ` Arve Hjønnevåg @ 2009-02-23 22:23 ` Rafael J. Wysocki 0 siblings, 0 replies; 373+ messages in thread From: Rafael J. Wysocki @ 2009-02-23 22:23 UTC (permalink / raw) To: Arve Hjønnevåg Cc: LKML, Linus Torvalds, Ingo Molnar, Eric W. Biederman, Benjamin Herrenschmidt, Jeremy Fitzhardinge, pm list, Len Brown, Jesse Barnes, Thomas Gleixner On Monday 23 February 2009, Arve Hjønnevåg wrote: > On Sun, Feb 22, 2009 at 9:39 AM, Rafael J. Wysocki <rjw@sisk.pl> wrote: > > From: Rafael J. Wysocki <rjw@sisk.pl> > > > > Introduce two helper functions allowing us to disable device > > interrupts (at the IO-APIC level) during suspend or hibernation > > and enable them during the subsequent resume, respectively, so that > > the timer interrupts are enabled while "late" suspend callbacks and > > "early" resume callbacks provided by device drivers are being > > executed. > > > > Use these functions to rework the handling of interrupts during > > suspend (hibernation) and resume. Namely, interrupts will only be > > disabled on the CPU right before suspending sysdevs, while device > > interrupts will be disabled (at the IO-APIC level), with the help of > > the new helper function, before calling "late" suspend callbacks > > provided by device drivers and analogously during resume. > > > > What impact does this have on wakeup interrupts? Unless you add a > check, after masking all interrupts at the CPU, to abort suspend if any > wakeup interrupt has IRQ_PENDING set, I think you will lose wakeup > interrupts (at least for irqs that use default_disable). I _think_ they would have to be reenabled after we've called local_irq_disable(). Thanks, Rafael ^ permalink raw reply [flat|nested] 373+ messages in thread
* Re: [RFC][PATCH 2/2] PM: Rework handling of interrupts during suspend-resume 2009-02-23 22:23 ` Rafael J. Wysocki (?) @ 2009-02-23 22:44 ` Arve Hjønnevåg -1 siblings, 0 replies; 373+ messages in thread From: Arve Hjønnevåg @ 2009-02-23 22:44 UTC (permalink / raw) To: Rafael J. Wysocki Cc: LKML, Linus Torvalds, Ingo Molnar, Eric W. Biederman, Benjamin Herrenschmidt, Jeremy Fitzhardinge, pm list, Len Brown, Jesse Barnes, Thomas Gleixner On Mon, Feb 23, 2009 at 2:23 PM, Rafael J. Wysocki <rjw@sisk.pl> wrote: > On Monday 23 February 2009, Arve Hjønnevåg wrote: >> On Sun, Feb 22, 2009 at 9:39 AM, Rafael J. Wysocki <rjw@sisk.pl> wrote: >> > From: Rafael J. Wysocki <rjw@sisk.pl> >> > >> > Introduce two helper functions allowing us to disable device >> > interrupts (at the IO-APIC level) during suspend or hibernation >> > and enable them during the subsequent resume, respectively, so that >> > the timer interrupts are enabled while "late" suspend callbacks and >> > "early" resume callbacks provided by device drivers are being >> > executed. >> > >> > Use these functions to rework the handling of interrupts during >> > suspend (hibernation) and resume. Namely, interrupts will only be >> > disabled on the CPU right before suspending sysdevs, while device >> > interrupts will be disabled (at the IO-APIC level), with the help of >> > the new helper function, before calling "late" suspend callbacks >> > provided by device drivers and analogously during resume. >> > >> >> What impact does this have on wakeup interrupts? Unless you add a >> check, after masking all interrupts at the CPU, to abort suspend if any >> wakeup interrupt has IRQ_PENDING set, I think you will lose wakeup >> interrupts (at least for irqs that use default_disable). > > I _think_ they would have to be reenabled after we've called > local_irq_disable(). Are you talking about the irq_chip switching from enabled interrupts to wake interrupts?
It is not enough for the irq_chip to reenable the hardware interrupt. If the interrupt is edge triggered and occurred after you disabled it, but before local_irq_disable, the only record of it is the IRQ_PENDING flag. -- Arve Hjønnevåg ^ permalink raw reply [flat|nested] 373+ messages in thread
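The check Arve describes can be pictured with a small userspace model. This is only a sketch and not kernel code: the MODEL_IRQ_PENDING and MODEL_IRQ_WAKEUP bits, struct model_irq and both functions are hypothetical stand-ins, assuming the check would run once interrupts are masked on the CPU and would abort suspend if any wakeup-capable line recorded an event while it was disabled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical status bits, loosely modelled on IRQ_PENDING and wakeup state. */
#define MODEL_IRQ_PENDING 0x1u
#define MODEL_IRQ_WAKEUP  0x2u

/* Hypothetical per-line state; the real irq_desc carries much more. */
struct model_irq {
	unsigned int status;
};

/* Return true if any wakeup-capable line latched an event while disabled. */
static bool wakeup_irq_pending(const struct model_irq *irqs, size_t n)
{
	assert(irqs != NULL || n == 0);

	for (size_t i = 0; i < n; i++) {
		unsigned int s = irqs[i].status;

		if ((s & MODEL_IRQ_WAKEUP) && (s & MODEL_IRQ_PENDING))
			return true;
	}
	return false;
}

/* Return 0 to go ahead with suspend, -1 to abort and let the event be handled. */
static int model_suspend_check(const struct model_irq *irqs, size_t n)
{
	return wakeup_irq_pending(irqs, n) ? -1 : 0;
}
```

In this model a pending event on a line that is not a wakeup source does not block suspend; only the combination of both bits does, which mirrors the case of an edge-triggered wakeup interrupt whose only remaining record is the pending flag.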
* [RFC][PATCH 2/2] PM: Rework handling of interrupts during suspend-resume 2009-02-22 17:37 ` Rafael J. Wysocki ` (3 preceding siblings ...) (?) @ 2009-02-22 17:39 ` Rafael J. Wysocki -1 siblings, 0 replies; 373+ messages in thread From: Rafael J. Wysocki @ 2009-02-22 17:39 UTC (permalink / raw) To: LKML Cc: Jeremy Fitzhardinge, Jesse Barnes, Thomas Gleixner, Eric W. Biederman, Ingo Molnar, Linus Torvalds, pm list From: Rafael J. Wysocki <rjw@sisk.pl> Introduce two helper functions allowing us to disable device interrupts (at the IO-APIC level) during suspend or hibernation and enable them during the subsequent resume, respectively, so that the timer interrupts are enabled while "late" suspend callbacks and "early" resume callbacks provided by device drivers are being executed. Use these functions to rework the handling of interrupts during suspend (hibernation) and resume. Namely, interrupts will only be disabled on the CPU right before suspending sysdevs, while device interrupts will be disabled (at the IO-APIC level), with the help of the new helper function, before calling "late" suspend callbacks provided by device drivers and analogously during resume. Signed-off-by: Rafael J. 
Wysocki <rjw@sisk.pl> --- arch/x86/kernel/apm_32.c | 20 ++++++++-- drivers/xen/manage.c | 37 ++++++++++++-------- include/linux/interrupt.h | 3 + kernel/irq/manage.c | 85 ++++++++++++++++++++++++++++++++++++++++++++++ kernel/kexec.c | 11 ++++- kernel/power/disk.c | 56 +++++++++++++++++++++++++++--- kernel/power/main.c | 27 +++++++++++--- 7 files changed, 208 insertions(+), 31 deletions(-) Index: linux-2.6/kernel/irq/manage.c =================================================================== --- linux-2.6.orig/kernel/irq/manage.c +++ linux-2.6/kernel/irq/manage.c @@ -746,3 +746,88 @@ int request_irq(unsigned int irq, irq_ha return retval; } EXPORT_SYMBOL(request_irq); + +#ifdef CONFIG_PM_SLEEP +struct disabled_irq { + struct list_head list; + int irq; +}; + +static LIST_HEAD(resume_irqs_list); + +/** + * enable_device_irqs - enable interrupts disabled by disable_device_irqs() + * + * Enable all interrupt lines previously disabled by disable_device_irqs() + * that are on resume_irqs_list. + */ +void enable_device_irqs(void) +{ + struct disabled_irq *resume_irq, *tmp; + + list_for_each_entry_safe(resume_irq, tmp, &resume_irqs_list, list) { + enable_irq(resume_irq->irq); + list_del(&resume_irq->list); + kfree(resume_irq); + } +} + +/** + * disable_device_irqs - disable all enabled interrupt lines + * + * During system-wide suspend or hibernation device interrupts need to be + * disabled at the chip level and this function is provided for this + * purpose. It disables all interrupt lines that are enabled at the + * moment and saves their numbers for enable_device_irqs(). 
+ */ +int disable_device_irqs(void) +{ + struct irq_desc *desc; + int irq; + + for_each_irq_desc(irq, desc) { + unsigned long flags; + struct disabled_irq *resume_irq; + struct irqaction *action; + bool is_timer_irq; + + resume_irq = kzalloc(sizeof(*resume_irq), GFP_NOIO); + if (!resume_irq) { + enable_device_irqs(); + return -ENOMEM; + } + + spin_lock_irqsave(&desc->lock, flags); + + is_timer_irq = false; + action = desc->action; + while (action) { + if (action->flags & IRQF_TIMER) { + is_timer_irq = true; + break; + } + action = action->next; + } + + if (!is_timer_irq && !desc->depth) { + desc->depth++; + desc->status |= IRQ_DISABLED; + desc->chip->disable(irq); + } else { + spin_unlock_irqrestore(&desc->lock, flags); + kfree(resume_irq); + continue; + } + + spin_unlock_irqrestore(&desc->lock, flags); + + if (desc->action) + synchronize_irq(irq); + + resume_irq->irq = irq; + list_add(&resume_irq->list, &resume_irqs_list); + } + + return 0; +} +#endif /* CONFIG_PM_SLEEP */ Index: linux-2.6/include/linux/interrupt.h =================================================================== --- linux-2.6.orig/include/linux/interrupt.h +++ linux-2.6/include/linux/interrupt.h @@ -470,4 +470,7 @@ extern int early_irq_init(void); extern int arch_early_irq_init(void); extern int arch_init_chip_data(struct irq_desc *desc, int cpu); +extern int disable_device_irqs(void); +extern void enable_device_irqs(void); + #endif Index: linux-2.6/kernel/power/main.c =================================================================== --- linux-2.6.orig/kernel/power/main.c +++ linux-2.6/kernel/power/main.c @@ -22,6 +22,7 @@ #include <linux/freezer.h> #include <linux/vmstat.h> #include <linux/syscalls.h> +#include <linux/interrupt.h> #include "power.h" @@ -287,17 +288,25 @@ void __attribute__ ((weak)) arch_suspend */ static int suspend_enter(suspend_state_t state) { - int error = 0; + int error; device_pm_lock(); - arch_suspend_disable_irqs(); - BUG_ON(!irqs_disabled()); - if ((error =
device_power_down(PMSG_SUSPEND))) { + error = disable_device_irqs(); + if (error) { + printk(KERN_ERR "PM: Failed to disable device interrupts\n"); + goto Unlock; + } + + error = device_power_down(PMSG_SUSPEND); + if (error) { printk(KERN_ERR "PM: Some devices failed to power down\n"); goto Done; } + arch_suspend_disable_irqs(); + BUG_ON(!irqs_disabled()); + error = sysdev_suspend(PMSG_SUSPEND); if (!error) { if (!suspend_test(TEST_CORE)) @@ -305,11 +314,17 @@ static int suspend_enter(suspend_state_t sysdev_resume(); } - device_power_up(PMSG_RESUME); - Done: arch_suspend_enable_irqs(); BUG_ON(irqs_disabled()); + + device_power_up(PMSG_RESUME); + + Done: + enable_device_irqs(); + + Unlock: device_pm_unlock(); + return error; } Index: linux-2.6/kernel/power/disk.c =================================================================== --- linux-2.6.orig/kernel/power/disk.c +++ linux-2.6/kernel/power/disk.c @@ -22,6 +22,7 @@ #include <linux/console.h> #include <linux/cpu.h> #include <linux/freezer.h> +#include <linux/interrupt.h> #include "power.h" @@ -214,7 +215,13 @@ static int create_image(int platform_mod return error; device_pm_lock(); - local_irq_disable(); + + error = disable_device_irqs(); + if (error) { + printk(KERN_ERR "PM: Failed to disable device interrupts\n"); + goto Unlock; + } + /* At this point, device_suspend() has been called, but *not* * device_power_down(). We *must* call device_power_down() now. * Otherwise, drivers for some devices (e.g. interrupt controllers) @@ -227,6 +234,9 @@ static int create_image(int platform_mod "aborting hibernation\n"); goto Enable_irqs; } + + local_irq_disable(); + sysdev_suspend(PMSG_FREEZE); if (error) { printk(KERN_ERR "PM: Some devices failed to power down, " @@ -252,11 +262,17 @@ static int create_image(int platform_mod /* NOTE: device_power_up() is just a resume() for devices * that suspended with irqs off ... no overall powerup. */ + Power_up_devices: + local_irq_enable(); + device_power_up(in_suspend ? (error ? 
PMSG_RECOVER : PMSG_THAW) : PMSG_RESTORE); + Enable_irqs: - local_irq_enable(); + enable_device_irqs(); + + Unlock: device_pm_unlock(); return error; } @@ -336,13 +352,22 @@ static int resume_target_kernel(void) int error; device_pm_lock(); - local_irq_disable(); + + error = disable_device_irqs(); + if (error) { + printk(KERN_ERR "PM: Failed to disable device interrupts\n"); + goto Unlock; + } + error = device_power_down(PMSG_QUIESCE); if (error) { printk(KERN_ERR "PM: Some devices failed to power down, " "aborting resume\n"); goto Enable_irqs; } + + local_irq_disable(); + sysdev_suspend(PMSG_QUIESCE); /* We'll ignore saved state, but this gets preempt count (etc) right */ save_processor_state(); @@ -366,11 +391,19 @@ static int resume_target_kernel(void) swsusp_free(); restore_processor_state(); touch_softlockup_watchdog(); + sysdev_resume(); + + local_irq_enable(); + device_power_up(PMSG_RECOVER); + Enable_irqs: - local_irq_enable(); + enable_device_irqs(); + + Unlock: device_pm_unlock(); + return error; } @@ -447,15 +480,23 @@ int hibernation_platform_enter(void) goto Finish; device_pm_lock(); - local_irq_disable(); + + error = disable_device_irqs(); + if (error) + goto Unlock; + error = device_power_down(PMSG_HIBERNATE); if (!error) { + local_irq_disable(); sysdev_suspend(PMSG_HIBERNATE); hibernation_ops->enter(); /* We should never get here */ while (1); } - local_irq_enable(); + + enable_device_irqs(); + + Unlock: device_pm_unlock(); /* @@ -464,12 +505,15 @@ int hibernation_platform_enter(void) */ Finish: hibernation_ops->finish(); + Resume_devices: entering_platform_hibernation = false; device_resume(PMSG_RESTORE); resume_console(); + Close: hibernation_ops->end(); + return error; } Index: linux-2.6/arch/x86/kernel/apm_32.c =================================================================== --- linux-2.6.orig/arch/x86/kernel/apm_32.c +++ linux-2.6/arch/x86/kernel/apm_32.c @@ -228,6 +228,7 @@ #include <linux/suspend.h> #include <linux/kthread.h> #include 
<linux/jiffies.h> +#include <linux/interrupt.h> #include <asm/system.h> #include <asm/uaccess.h> @@ -1190,8 +1191,11 @@ static int suspend(int vetoable) struct apm_user *as; device_suspend(PMSG_SUSPEND); - local_irq_disable(); + + disable_device_irqs(); device_power_down(PMSG_SUSPEND); + + local_irq_disable(); sysdev_suspend(PMSG_SUSPEND); local_irq_enable(); @@ -1209,9 +1213,13 @@ static int suspend(int vetoable) if (err != APM_SUCCESS) apm_error("suspend", err); err = (err == APM_SUCCESS) ? 0 : -EIO; + sysdev_resume(); - device_power_up(PMSG_RESUME); local_irq_enable(); + + device_power_up(PMSG_RESUME); + enable_device_irqs(); + device_resume(PMSG_RESUME); queue_event(APM_NORMAL_RESUME, NULL); spin_lock(&user_list_lock); @@ -1228,8 +1236,10 @@ static void standby(void) { int err; - local_irq_disable(); + disable_device_irqs(); device_power_down(PMSG_SUSPEND); + + local_irq_disable(); sysdev_suspend(PMSG_SUSPEND); local_irq_enable(); @@ -1239,8 +1249,10 @@ static void standby(void) local_irq_disable(); sysdev_resume(); - device_power_up(PMSG_RESUME); local_irq_enable(); + + device_power_up(PMSG_RESUME); + enable_device_irqs(); } static apm_event_t get_event(void) Index: linux-2.6/drivers/xen/manage.c =================================================================== --- linux-2.6.orig/drivers/xen/manage.c +++ linux-2.6/drivers/xen/manage.c @@ -39,12 +39,6 @@ static int xen_suspend(void *data) BUG_ON(!irqs_disabled()); - err = device_power_down(PMSG_SUSPEND); - if (err) { - printk(KERN_ERR "xen_suspend: device_power_down failed: %d\n", - err); - return err; - } err = sysdev_suspend(PMSG_SUSPEND); if (err) { printk(KERN_ERR "xen_suspend: sysdev_suspend failed: %d\n", @@ -69,13 +63,6 @@ static int xen_suspend(void *data) xen_mm_unpin_all(); sysdev_resume(); - device_power_up(PMSG_RESUME); - - if (!*cancelled) { - xen_irq_resume(); - xen_console_resume(); - xen_timer_resume(); - } return 0; } @@ -108,6 +95,18 @@ static void do_suspend(void) /* XXX use normal device 
tree? */ xenbus_suspend(); + err = disable_device_irqs(); + if (err) { + printk(KERN_ERR "disable_device_irqs failed: %d\n", err); + goto resume_devices; + } + + err = device_power_down(PMSG_SUSPEND); + if (err) { + printk(KERN_ERR "device_power_down failed: %d\n", err); + goto enable_irqs; + } + err = stop_machine(xen_suspend, &cancelled, &cpumask_of_cpu(0)); if (err) { printk(KERN_ERR "failed to start xen_suspend: %d\n", err); @@ -120,6 +119,18 @@ static void do_suspend(void) } else xenbus_suspend_cancel(); + device_power_up(PMSG_RESUME); + + if (!cancelled) { + xen_irq_resume(); + xen_console_resume(); + xen_timer_resume(); + } + +enable_irqs: + enable_device_irqs(); + +resume_devices: device_resume(PMSG_RESUME); /* Make sure timer events get retriggered on all CPUs */ Index: linux-2.6/kernel/kexec.c =================================================================== --- linux-2.6.orig/kernel/kexec.c +++ linux-2.6/kernel/kexec.c @@ -1454,7 +1454,11 @@ int kernel_kexec(void) if (error) goto Resume_devices; device_pm_lock(); - local_irq_disable(); + + error = disable_device_irqs(); + if (error) + goto Unlock_pm; + /* At this point, device_suspend() has been called, * but *not* device_power_down(). We *must* * device_power_down() now. Otherwise, drivers for @@ -1466,6 +1470,7 @@ int kernel_kexec(void) if (error) goto Enable_irqs; + local_irq_disable(); /* Suspend system devices */ error = sysdev_suspend(PMSG_FREEZE); if (error) @@ -1484,9 +1489,11 @@ int kernel_kexec(void) if (kexec_image->preserve_context) { sysdev_resume(); Power_up_devices: + local_irq_enable(); device_power_up(PMSG_RESTORE); Enable_irqs: - local_irq_enable(); + enable_device_irqs(); + Unlock_pm: device_pm_unlock(); enable_nonboot_cpus(); Resume_devices: ^ permalink raw reply [flat|nested] 373+ messages in thread
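The bookkeeping done by the two helpers in kernel/irq/manage.c can be illustrated with a userspace model. It is a sketch under stated assumptions, not the patch itself: struct model_line, MODEL_IRQF_TIMER (with an arbitrary value) and both functions are hypothetical, a per-line "saved" flag replaces the allocated resume_irqs_list, and the timer test is written as a bitwise AND on the assumption that the intent is to skip only lines that carry the timer flag.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical flag value standing in for IRQF_TIMER. */
#define MODEL_IRQF_TIMER 0x200u

/* Hypothetical per-line state. */
struct model_line {
	unsigned int flags; /* OR of the flags of all handlers on the line */
	unsigned int depth; /* disable nesting depth, 0 means enabled */
	bool saved;         /* set if the suspend path disabled this line */
};

/* Disable every enabled non-timer line and remember which ones we touched. */
static void model_disable_device_irqs(struct model_line *l, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		bool is_timer = (l[i].flags & MODEL_IRQF_TIMER) != 0;

		if (!is_timer && l[i].depth == 0) {
			l[i].depth++;
			l[i].saved = true;
		}
	}
}

/* Re-enable only the lines that model_disable_device_irqs() disabled. */
static void model_enable_device_irqs(struct model_line *l, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		if (l[i].saved) {
			assert(l[i].depth > 0);
			l[i].depth--;
			l[i].saved = false;
		}
	}
}
```

The point of remembering which lines were touched is that a line a driver had already disabled before suspend must stay disabled after resume, and a timer line must never be masked at all.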
* Re: [RFC][PATCH 0/2] Rework disabling of interrupts during suspend-resume 2009-02-22 17:37 ` Rafael J. Wysocki @ 2009-02-22 18:13 ` Linus Torvalds -1 siblings, 0 replies; 373+ messages in thread From: Linus Torvalds @ 2009-02-22 18:13 UTC (permalink / raw) To: Rafael J. Wysocki Cc: LKML, Ingo Molnar, Eric W. Biederman, Benjamin Herrenschmidt, Jeremy Fitzhardinge, pm list, Len Brown, Jesse Barnes, Thomas Gleixner On Sun, 22 Feb 2009, Rafael J. Wysocki wrote: > > However, x86 currently doesn't set the IRQF_TIMER flag and I need to > make it do so before going further in this direction and changing the > PCI PM framework to take advantage of the $subject changes, for example. Actually, you don't. The modern form of timer interrupt on x86 is the local apic timer, and it doesn't go through the io-apic at all, and is not even visible to the irq subsystem. So it stays enabled through this all. But for old-style timer interrupts, something like the appended should do it. Untested, of course, but it looks obvious enough. Linus --- arch/x86/kernel/time_64.c | 2 +- arch/x86/kernel/vmiclock_32.c | 3 ++- arch/x86/mach-default/setup.c | 2 +- arch/x86/mach-voyager/setup.c | 2 +- 4 files changed, 5 insertions(+), 4 deletions(-) diff --git a/arch/x86/kernel/time_64.c b/arch/x86/kernel/time_64.c index e6e695a..241ec39 100644 --- a/arch/x86/kernel/time_64.c +++ b/arch/x86/kernel/time_64.c @@ -115,7 +115,7 @@ unsigned long __init calibrate_cpu(void) static struct irqaction irq0 = { .handler = timer_interrupt, - .flags = IRQF_DISABLED | IRQF_IRQPOLL | IRQF_NOBALANCING, + .flags = IRQF_DISABLED | IRQF_IRQPOLL | IRQF_NOBALANCING | IRQF_TIMER, .mask = CPU_MASK_NONE, .name = "timer" }; diff --git a/arch/x86/kernel/vmiclock_32.c b/arch/x86/kernel/vmiclock_32.c index bde106c..7a29d5c 100644 --- a/arch/x86/kernel/vmiclock_32.c +++ b/arch/x86/kernel/vmiclock_32.c @@ -1,3 +1,4 @@ + /* * VMI paravirtual timer support routines. 
* @@ -202,7 +203,7 @@ static irqreturn_t vmi_timer_interrupt(int irq, void *dev_id) static struct irqaction vmi_clock_action = { .name = "vmi-timer", .handler = vmi_timer_interrupt, - .flags = IRQF_DISABLED | IRQF_NOBALANCING, + .flags = IRQF_DISABLED | IRQF_NOBALANCING, IRQF_TIMER, .mask = CPU_MASK_ALL, }; diff --git a/arch/x86/mach-default/setup.c b/arch/x86/mach-default/setup.c index a265a7c..d737542 100644 --- a/arch/x86/mach-default/setup.c +++ b/arch/x86/mach-default/setup.c @@ -96,7 +96,7 @@ void __init trap_init_hook(void) static struct irqaction irq0 = { .handler = timer_interrupt, - .flags = IRQF_DISABLED | IRQF_NOBALANCING | IRQF_IRQPOLL, + .flags = IRQF_DISABLED | IRQF_NOBALANCING | IRQF_IRQPOLL, IRQF_TIMER, .mask = CPU_MASK_NONE, .name = "timer" }; diff --git a/arch/x86/mach-voyager/setup.c b/arch/x86/mach-voyager/setup.c index d914a79..4de9e08 100644 --- a/arch/x86/mach-voyager/setup.c +++ b/arch/x86/mach-voyager/setup.c @@ -56,7 +56,7 @@ void __init trap_init_hook(void) static struct irqaction irq0 = { .handler = timer_interrupt, - .flags = IRQF_DISABLED | IRQF_NOBALANCING | IRQF_IRQPOLL, + .flags = IRQF_DISABLED | IRQF_NOBALANCING | IRQF_IRQPOLL, IRQF_TIMER, .mask = CPU_MASK_NONE, .name = "timer" }; ^ permalink raw reply related [flat|nested] 373+ messages in thread
* Re: [RFC][PATCH 0/2] Rework disabling of interrupts during suspend-resume 2009-02-22 18:13 ` Linus Torvalds (?) @ 2009-02-22 18:18 ` Ingo Molnar 2009-02-22 18:25 ` Linus Torvalds -1 siblings, 1 reply; 373+ messages in thread From: Ingo Molnar @ 2009-02-22 18:18 UTC (permalink / raw) To: Linus Torvalds Cc: Rafael J. Wysocki, LKML, Eric W. Biederman, Benjamin Herrenschmidt, Jeremy Fitzhardinge, pm list, Len Brown, Jesse Barnes, Thomas Gleixner > + .flags = IRQF_DISABLED | IRQF_NOBALANCING, IRQF_TIMER, > + .flags = IRQF_DISABLED | IRQF_NOBALANCING | IRQF_IRQPOLL, IRQF_TIMER, > + .flags = IRQF_DISABLED | IRQF_NOBALANCING | IRQF_IRQPOLL, IRQF_TIMER, s/, IRQF_TIMER/ | IRQF_TIMER i guess. Ingo ^ permalink raw reply [flat|nested] 373+ messages in thread
* Re: [RFC][PATCH 0/2] Rework disabling of interrupts during suspend-resume 2009-02-22 18:18 ` Ingo Molnar @ 2009-02-22 18:25 ` Linus Torvalds 0 siblings, 0 replies; 373+ messages in thread From: Linus Torvalds @ 2009-02-22 18:25 UTC (permalink / raw) To: Ingo Molnar Cc: Rafael J. Wysocki, LKML, Eric W. Biederman, Benjamin Herrenschmidt, Jeremy Fitzhardinge, pm list, Len Brown, Jesse Barnes, Thomas Gleixner On Sun, 22 Feb 2009, Ingo Molnar wrote: > > > + .flags = IRQF_DISABLED | IRQF_NOBALANCING, IRQF_TIMER, > > + .flags = IRQF_DISABLED | IRQF_NOBALANCING | IRQF_IRQPOLL, IRQF_TIMER, > > + .flags = IRQF_DISABLED | IRQF_NOBALANCING | IRQF_IRQPOLL, IRQF_TIMER, > > s/, IRQF_TIMER/ | IRQF_TIMER > > i guess. Oops yes. I got one of them right. I guess that's the same one I happened to compile in my config. Duh. Linus ^ permalink raw reply [flat|nested] 373+ messages in thread
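Why the comma matters can be shown with a standalone C model. This is an illustrative sketch only: struct model_irqaction and the MODEL_IRQF_* values are hypothetical and do not reproduce the kernel's struct irqaction. It relies on the standard C rule that a positional initializer following a designated one applies to the next member in declaration order.

```c
#include <assert.h>

/* Hypothetical flag values; only their distinctness matters here. */
#define MODEL_IRQF_DISABLED    0x20u
#define MODEL_IRQF_NOBALANCING 0x800u
#define MODEL_IRQF_TIMER       0x200u

/* Hypothetical stand-in for an irqaction-like structure. */
struct model_irqaction {
	unsigned int flags;
	unsigned int mask; /* stand-in for whatever member follows flags */
	const char *name;
};

/* Comma typo: the stray value initializes the next member, not flags. */
static const struct model_irqaction with_comma = {
	.flags = MODEL_IRQF_DISABLED | MODEL_IRQF_NOBALANCING, MODEL_IRQF_TIMER,
	.name = "timer",
};

/* Intended form: the timer bit is ORed into flags. */
static const struct model_irqaction with_or = {
	.flags = MODEL_IRQF_DISABLED | MODEL_IRQF_NOBALANCING | MODEL_IRQF_TIMER,
	.name = "timer",
};

static int has_timer_flag(const struct model_irqaction *a)
{
	return (a->flags & MODEL_IRQF_TIMER) != 0;
}

/* Both initializers are valid C, but only one sets the timer bit. */
static void model_check(void)
{
	assert(!has_timer_flag(&with_comma));
	assert(has_timer_flag(&with_or));
}
```

In the model the comma version compiles cleanly and silently leaves the timer bit out of flags, which is the kind of mistake that is easy to miss when the affected file is not part of the configuration being built.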
* Re: [RFC][PATCH 0/2] Rework disabling of interrupts during suspend-resume 2009-02-22 18:25 ` Linus Torvalds @ 2009-02-22 18:35 ` Linus Torvalds -1 siblings, 0 replies; 373+ messages in thread From: Linus Torvalds @ 2009-02-22 18:35 UTC (permalink / raw) To: Ingo Molnar Cc: Rafael J. Wysocki, LKML, Eric W. Biederman, Benjamin Herrenschmidt, Jeremy Fitzhardinge, pm list, Len Brown, Jesse Barnes, Thomas Gleixner On Sun, 22 Feb 2009, Linus Torvalds wrote: > > Oops yes. I got one of them right. I guess that's the same one I happened > to compile in my config. Duh. I committed the trivially fixed version. I also committed Rafael's patch 1/2 (the one that doesn't actually change anything). Even if we don't do this in 2.6.29, I want to make it easy to test, and get the infrastructure unified. Linus ^ permalink raw reply [flat|nested] 373+ messages in thread
* Re: [RFC][PATCH 0/2] Rework disabling of interrupts during suspend-resume 2009-02-22 17:37 ` Rafael J. Wysocki ` (5 preceding siblings ...) (?) @ 2009-02-22 22:37 ` Eric W. Biederman -1 siblings, 0 replies; 373+ messages in thread From: Eric W. Biederman @ 2009-02-22 22:37 UTC (permalink / raw) To: Rafael J. Wysocki Cc: Jeremy Fitzhardinge, LKML, Jesse Barnes, Thomas Gleixner, Ingo Molnar, Linus Torvalds, pm list "Rafael J. Wysocki" <rjw@sisk.pl> writes: > Moreover, the real purpose of these changes is to be able to execute the > "late" suspend and "early" resume device callbacks with timer interrupts > enabled, so that they can use mutexes etc. However, x86 currently doesn't set > the IRQF_TIMER flag and I need to make it do so before going further in this > direction and changing the PCI PM framework to take advantage of the $subject > changes, for example. So, I need to know how to modify x86 timer code so that > the IRQF_TIMER flag is set by it. How does this sync with the ACPI requirement that its late suspend MUST happen with irqs disabled? Eric ^ permalink raw reply [flat|nested] 373+ messages in thread
* Re: [RFC][PATCH 0/2] Rework disabling of interrupts during suspend-resume 2009-02-22 22:37 ` Eric W. Biederman @ 2009-02-22 22:56 ` Benjamin Herrenschmidt 2009-02-22 22:56 ` Benjamin Herrenschmidt 2009-02-22 23:02 ` Linus Torvalds 2 siblings, 0 replies; 373+ messages in thread From: Benjamin Herrenschmidt @ 2009-02-22 22:56 UTC (permalink / raw) To: Eric W. Biederman Cc: Jeremy Fitzhardinge, LKML, Jesse Barnes, Thomas Gleixner, Ingo Molnar, Linus Torvalds, pm list On Sun, 2009-02-22 at 14:37 -0800, Eric W. Biederman wrote: > "Rafael J. Wysocki" <rjw@sisk.pl> writes: > > > Moreover, the real purpose of these changes is to be able to execute the > > "late" suspend and "early" resume device callbacks with timer interrupts > > enabled, so that they can use mutexes etc. However, x86 currently doesn't set > > the IRQF_TIMER flag and I need to make it do so before going further in this > > direction and changing the PCI PM framework to take advantage of the $subject > > changes, for example. So, I need to know how to modify x86 timer code so that > > the IRQF_TIMER flag is set by it. > > How does this sync with the ACPI requirement that its late suspend MUST > happen with irqs disabled? If I understand properly what the intention here is, the sysdev suspend and later still happen with hard irqs off. This is purely the layer between suspend and suspend_late at the driver level that uses the above instead of hard IRQs off in order to be able to properly order the ACPI calls vs. the driver calls. Ben. ^ permalink raw reply [flat|nested] 373+ messages in thread
* Re: [RFC][PATCH 0/2] Rework disabling of interrupts during suspend-resume 2009-02-22 22:37 ` Eric W. Biederman @ 2009-02-22 23:02 ` Linus Torvalds 2009-02-22 22:56 ` Benjamin Herrenschmidt 2009-02-22 23:02 ` Linus Torvalds 2 siblings, 0 replies; 373+ messages in thread From: Linus Torvalds @ 2009-02-22 23:02 UTC (permalink / raw) To: Eric W. Biederman Cc: Rafael J. Wysocki, LKML, Ingo Molnar, Benjamin Herrenschmidt, Jeremy Fitzhardinge, pm list, Len Brown, Jesse Barnes, Thomas Gleixner On Sun, 22 Feb 2009, Eric W. Biederman wrote: > > How does this sync with the ACPI requirement that its late suspend MUST > happen with irqs disabled? All the system device suspend and the actual CPU power-off still happen with CPU interrupts disabled. It's just that the regular two-phase device suspend code now runs first with interrupts enabled (the regular "->suspend()" callback), and then the second phase runs with the CPU still having interrupts on (and taking timer interrupts), but with the actual device interrupts disabled. Linus ^ permalink raw reply [flat|nested] 373+ messages in thread
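The ordering described above can be sketched as a small userspace model. Nothing here is kernel code: struct model_state, phase() and model_suspend() are made-up names, and the model only records which interrupt sources are live when each phase runs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Userspace model only; these are not kernel structures or functions. */
struct model_state {
	bool cpu_irqs_on;    /* local CPU interrupt flag */
	bool device_irqs_on; /* non-timer lines enabled at the chip level */
	char log[256];       /* phases in the order they ran */
};

/* Append "<name>:<cpu><dev> " to the log, where 1 = enabled, 0 = disabled. */
static void phase(struct model_state *s, const char *name)
{
	size_t n = strlen(s->log);

	assert(n < sizeof(s->log));
	snprintf(s->log + n, sizeof(s->log) - n, "%s:%d%d ",
		 name, s->cpu_irqs_on, s->device_irqs_on);
}

/* Walk the suspend path in the order the thread describes. */
void model_suspend(struct model_state *s)
{
	s->cpu_irqs_on = true;
	s->device_irqs_on = true;
	s->log[0] = '\0';

	phase(s, "suspend");        /* regular ->suspend(): everything on */

	s->device_irqs_on = false;  /* device lines off, timer still ticking */
	phase(s, "suspend_late");   /* second phase may sleep */

	s->cpu_irqs_on = false;     /* only now does the CPU stop taking IRQs */
	phase(s, "sysdev_suspend");
	phase(s, "enter_sleep");
}
```

After model_suspend(&s), s.log reads "suspend:11 suspend_late:10 sysdev_suspend:00 enter_sleep:00 ", where the two digits are CPU interrupts and device interrupts. The point of the proposal is the middle state, 10, which did not exist before.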
* [RFC][PATCH 0/4] Rework disabling of interrupts during suspend-resume 2009-02-22 17:37 ` Rafael J. Wysocki ` (7 preceding siblings ...) (?) @ 2009-03-01 22:21 ` Rafael J. Wysocki -1 siblings, 0 replies; 373+ messages in thread From: Rafael J. Wysocki @ 2009-03-01 22:21 UTC (permalink / raw) To: LKML Cc: Arve, Jeremy Fitzhardinge, Jesse Barnes, Johannes Berg, Thomas Gleixner, Eric W. Biederman, Ingo Molnar, Linus Torvalds, pm list Hi, The following patches modify the way in which we handle disabling interrupts during suspend and enabling them during resume. They also change the ordering of the core suspend and hibernation code. Namely, interrupts are currently disabled on the boot CPU as soon as the nonboot CPUs have been disabled, which doesn't allow device drivers' "late" suspend and "early" resume callbacks to sleep. Among other things this means they cannot execute ACPI AML routines, which leads to problems with suspend-resume of PCI devices, as recently discussed. 1/4 modifies the [suspend|hibernation] and resume code, as well as the other code using the device PM framework, so that device drivers will not receive interrupts during the "late" suspend phase, although interrupts will only be disabled on the CPU right before calling sysdev_suspend() (and analogously during resume). [Ingo, I didn't add your ACK to the patch, because it's changed since you saw it last time.] 2/4 - 4/4 modify the suspend, hibernation and kexec jump code, respectively, so that the "late" phase of suspending devices will happen before the platform "prepare" callback and the disabling of nonboot CPUs (and analogously during resume). Comments welcome. Thanks, Rafael ^ permalink raw reply [flat|nested] 373+ messages in thread
* [RFC][PATCH 1/4] PM: Rework handling of interrupts during suspend-resume (rev. 4) 2009-03-01 22:21 ` Rafael J. Wysocki @ 2009-03-01 22:24 ` Rafael J. Wysocki 2009-03-02 23:01 ` Arve Hjønnevåg 2009-03-02 23:01 ` Arve Hjønnevåg 2009-03-01 22:24 ` Rafael J. Wysocki ` (7 subsequent siblings) 8 siblings, 2 replies; 373+ messages in thread From: Rafael J. Wysocki @ 2009-03-01 22:24 UTC (permalink / raw) To: LKML Cc: Linus Torvalds, Ingo Molnar, Eric W. Biederman, Benjamin Herrenschmidt, Jeremy Fitzhardinge, pm list, Len Brown, Jesse Barnes, Thomas Gleixner, Arve Hjønnevåg, Alan Stern, Johannes Berg From: Rafael J. Wysocki <rjw@sisk.pl> Introduce two helper functions allowing us to prevent device drivers from getting any interrupts (without disabling interrupts on the CPU) during suspend (or hibernation) and to make them start to receive interrupts again during the subsequent resume, respectively. These functions make it possible to keep timer interrupts enabled while the "late" suspend and "early" resume callbacks provided by device drivers are being executed. Use these functions to rework the handling of interrupts during suspend (hibernation) and resume. Namely, interrupts will only be disabled on the CPU right before suspending sysdevs, while device drivers will be prevented from receiving interrupts, with the help of the new helper function, before their "late" suspend callbacks run (and analogously during resume). In addition, since the device interrupts are now disabled before the CPU has turned all interrupts off and the CPU will ACK the interrupts setting the IRQ_PENDING bit for them, check in sysdev_suspend() if any wake-up interrupts are pending and abort suspend if that's the case. Signed-off-by: Rafael J. 
Wysocki <rjw@sisk.pl> --- arch/x86/kernel/apm_32.c | 15 ++++++-- drivers/base/power/main.c | 20 ++++++----- drivers/base/sys.c | 8 ++++ drivers/xen/manage.c | 16 +++++---- include/linux/interrupt.h | 5 ++ include/linux/irq.h | 1 kernel/irq/Makefile | 1 kernel/irq/manage.c | 3 + kernel/irq/pm.c | 78 ++++++++++++++++++++++++++++++++++++++++++++++ kernel/kexec.c | 8 ++-- kernel/power/disk.c | 39 +++++++++++++++++------ kernel/power/main.c | 17 ++++++---- 12 files changed, 170 insertions(+), 41 deletions(-) Index: linux-2.6/include/linux/interrupt.h =================================================================== --- linux-2.6.orig/include/linux/interrupt.h +++ linux-2.6/include/linux/interrupt.h @@ -106,6 +106,11 @@ extern void disable_irq_nosync(unsigned extern void disable_irq(unsigned int irq); extern void enable_irq(unsigned int irq); +/* The following three functions are for the core kernel use only. */ +extern void suspend_device_irqs(void); +extern void resume_device_irqs(void); +extern int check_wakeup_irqs(void); + #if defined(CONFIG_SMP) && defined(CONFIG_GENERIC_HARDIRQS) extern cpumask_var_t irq_default_affinity; Index: linux-2.6/kernel/power/main.c =================================================================== --- linux-2.6.orig/kernel/power/main.c +++ linux-2.6/kernel/power/main.c @@ -287,17 +287,19 @@ void __attribute__ ((weak)) arch_suspend */ static int suspend_enter(suspend_state_t state) { - int error = 0; + int error; device_pm_lock(); - arch_suspend_disable_irqs(); - BUG_ON(!irqs_disabled()); - if ((error = device_power_down(PMSG_SUSPEND))) { + error = device_power_down(PMSG_SUSPEND); + if (error) { printk(KERN_ERR "PM: Some devices failed to power down\n"); goto Done; } + arch_suspend_disable_irqs(); + BUG_ON(!irqs_disabled()); + error = sysdev_suspend(PMSG_SUSPEND); if (!error) { if (!suspend_test(TEST_CORE)) @@ -305,11 +307,14 @@ static int suspend_enter(suspend_state_t sysdev_resume(); } - device_power_up(PMSG_RESUME); - Done: 
arch_suspend_enable_irqs(); BUG_ON(irqs_disabled()); + + device_power_up(PMSG_RESUME); + + Done: device_pm_unlock(); + return error; } Index: linux-2.6/kernel/power/disk.c =================================================================== --- linux-2.6.orig/kernel/power/disk.c +++ linux-2.6/kernel/power/disk.c @@ -214,7 +214,7 @@ static int create_image(int platform_mod return error; device_pm_lock(); - local_irq_disable(); + /* At this point, device_suspend() has been called, but *not* * device_power_down(). We *must* call device_power_down() now. * Otherwise, drivers for some devices (e.g. interrupt controllers) @@ -225,8 +225,11 @@ static int create_image(int platform_mod if (error) { printk(KERN_ERR "PM: Some devices failed to power down, " "aborting hibernation\n"); - goto Enable_irqs; + goto Unlock; } + + local_irq_disable(); + sysdev_suspend(PMSG_FREEZE); if (error) { printk(KERN_ERR "PM: Some devices failed to power down, " @@ -252,12 +255,16 @@ static int create_image(int platform_mod /* NOTE: device_power_up() is just a resume() for devices * that suspended with irqs off ... no overall powerup. */ + Power_up_devices: + local_irq_enable(); + device_power_up(in_suspend ? (error ? 
PMSG_RECOVER : PMSG_THAW) : PMSG_RESTORE); - Enable_irqs: - local_irq_enable(); + + Unlock: device_pm_unlock(); + return error; } @@ -336,13 +343,16 @@ static int resume_target_kernel(void) int error; device_pm_lock(); - local_irq_disable(); + error = device_power_down(PMSG_QUIESCE); if (error) { printk(KERN_ERR "PM: Some devices failed to power down, " "aborting resume\n"); - goto Enable_irqs; + goto Unlock; } + + local_irq_disable(); + sysdev_suspend(PMSG_QUIESCE); /* We'll ignore saved state, but this gets preempt count (etc) right */ save_processor_state(); @@ -366,11 +376,16 @@ static int resume_target_kernel(void) swsusp_free(); restore_processor_state(); touch_softlockup_watchdog(); + sysdev_resume(); - device_power_up(PMSG_RECOVER); - Enable_irqs: + local_irq_enable(); + + device_power_up(PMSG_RECOVER); + + Unlock: device_pm_unlock(); + return error; } @@ -447,15 +462,16 @@ int hibernation_platform_enter(void) goto Finish; device_pm_lock(); - local_irq_disable(); + error = device_power_down(PMSG_HIBERNATE); if (!error) { + local_irq_disable(); sysdev_suspend(PMSG_HIBERNATE); hibernation_ops->enter(); /* We should never get here */ while (1); } - local_irq_enable(); + device_pm_unlock(); /* @@ -464,12 +480,15 @@ int hibernation_platform_enter(void) */ Finish: hibernation_ops->finish(); + Resume_devices: entering_platform_hibernation = false; device_resume(PMSG_RESTORE); resume_console(); + Close: hibernation_ops->end(); + return error; } Index: linux-2.6/arch/x86/kernel/apm_32.c =================================================================== --- linux-2.6.orig/arch/x86/kernel/apm_32.c +++ linux-2.6/arch/x86/kernel/apm_32.c @@ -1190,8 +1190,10 @@ static int suspend(int vetoable) struct apm_user *as; device_suspend(PMSG_SUSPEND); - local_irq_disable(); + device_power_down(PMSG_SUSPEND); + + local_irq_disable(); sysdev_suspend(PMSG_SUSPEND); local_irq_enable(); @@ -1209,9 +1211,12 @@ static int suspend(int vetoable) if (err != APM_SUCCESS) 
apm_error("suspend", err); err = (err == APM_SUCCESS) ? 0 : -EIO; + sysdev_resume(); - device_power_up(PMSG_RESUME); local_irq_enable(); + + device_power_up(PMSG_RESUME); + device_resume(PMSG_RESUME); queue_event(APM_NORMAL_RESUME, NULL); spin_lock(&user_list_lock); @@ -1228,8 +1233,9 @@ static void standby(void) { int err; - local_irq_disable(); device_power_down(PMSG_SUSPEND); + + local_irq_disable(); sysdev_suspend(PMSG_SUSPEND); local_irq_enable(); @@ -1239,8 +1245,9 @@ static void standby(void) local_irq_disable(); sysdev_resume(); - device_power_up(PMSG_RESUME); local_irq_enable(); + + device_power_up(PMSG_RESUME); } static apm_event_t get_event(void) Index: linux-2.6/drivers/xen/manage.c =================================================================== --- linux-2.6.orig/drivers/xen/manage.c +++ linux-2.6/drivers/xen/manage.c @@ -39,12 +39,6 @@ static int xen_suspend(void *data) BUG_ON(!irqs_disabled()); - err = device_power_down(PMSG_SUSPEND); - if (err) { - printk(KERN_ERR "xen_suspend: device_power_down failed: %d\n", - err); - return err; - } err = sysdev_suspend(PMSG_SUSPEND); if (err) { printk(KERN_ERR "xen_suspend: sysdev_suspend failed: %d\n", @@ -69,7 +63,6 @@ static int xen_suspend(void *data) xen_mm_unpin_all(); sysdev_resume(); - device_power_up(PMSG_RESUME); if (!*cancelled) { xen_irq_resume(); @@ -108,6 +101,12 @@ static void do_suspend(void) /* XXX use normal device tree? 
*/ xenbus_suspend(); + err = device_power_down(PMSG_SUSPEND); + if (err) { + printk(KERN_ERR "device_power_down failed: %d\n", err); + goto resume_devices; + } + err = stop_machine(xen_suspend, &cancelled, &cpumask_of_cpu(0)); if (err) { printk(KERN_ERR "failed to start xen_suspend: %d\n", err); @@ -120,6 +119,9 @@ static void do_suspend(void) } else xenbus_suspend_cancel(); + device_power_up(PMSG_RESUME); + +resume_devices: device_resume(PMSG_RESUME); /* Make sure timer events get retriggered on all CPUs */ Index: linux-2.6/kernel/kexec.c =================================================================== --- linux-2.6.orig/kernel/kexec.c +++ linux-2.6/kernel/kexec.c @@ -1454,7 +1454,6 @@ int kernel_kexec(void) if (error) goto Resume_devices; device_pm_lock(); - local_irq_disable(); /* At this point, device_suspend() has been called, * but *not* device_power_down(). We *must* * device_power_down() now. Otherwise, drivers for @@ -1464,8 +1463,9 @@ int kernel_kexec(void) */ error = device_power_down(PMSG_FREEZE); if (error) - goto Enable_irqs; + goto Unlock_pm; + local_irq_disable(); /* Suspend system devices */ error = sysdev_suspend(PMSG_FREEZE); if (error) @@ -1484,9 +1484,9 @@ int kernel_kexec(void) if (kexec_image->preserve_context) { sysdev_resume(); Power_up_devices: - device_power_up(PMSG_RESTORE); - Enable_irqs: local_irq_enable(); + device_power_up(PMSG_RESTORE); + Unlock_pm: device_pm_unlock(); enable_nonboot_cpus(); Resume_devices: Index: linux-2.6/include/linux/irq.h =================================================================== --- linux-2.6.orig/include/linux/irq.h +++ linux-2.6/include/linux/irq.h @@ -65,6 +65,7 @@ typedef void (*irq_flow_handler_t)(unsig #define IRQ_SPURIOUS_DISABLED 0x00800000 /* IRQ was disabled by the spurious trap */ #define IRQ_MOVE_PCNTXT 0x01000000 /* IRQ migration from process context */ #define IRQ_AFFINITY_SET 0x02000000 /* IRQ affinity was set from userspace*/ +#define IRQ_SUSPENDED 0x04000000 /* IRQ has gone through 
suspend sequence */ #ifdef CONFIG_IRQ_PER_CPU # define CHECK_IRQ_PER_CPU(var) ((var) & IRQ_PER_CPU) Index: linux-2.6/kernel/irq/pm.c =================================================================== --- /dev/null +++ linux-2.6/kernel/irq/pm.c @@ -0,0 +1,78 @@ +/* + * linux/kernel/irq/pm.c + * + * Copyright (C) 2009 Rafael J. Wysocki <rjw@sisk.pl>, Novell Inc. + * + * This file contains power management functions related to interrupts. + */ + +#include <linux/irq.h> +#include <linux/module.h> +#include <linux/interrupt.h> + +/** + * suspend_device_irqs - disable all currently enabled interrupt lines + * + * During system-wide suspend or hibernation device interrupts need to be + * disabled at the chip level and this function is provided for this purpose. + * It disables all interrupt lines that are enabled at the moment and sets the + * IRQ_SUSPENDED flag for them. + */ +void suspend_device_irqs(void) +{ + struct irq_desc *desc; + int irq; + + for_each_irq_desc(irq, desc) { + unsigned long flags; + + spin_lock_irqsave(&desc->lock, flags); + + if (!desc->depth && desc->action + && !(desc->action->flags & IRQF_TIMER)) { + desc->depth++; + desc->status |= IRQ_DISABLED | IRQ_SUSPENDED; + desc->chip->disable(irq); + } + + spin_unlock_irqrestore(&desc->lock, flags); + } + + for_each_irq_desc(irq, desc) { + if (desc->status & IRQ_SUSPENDED) + synchronize_irq(irq); + } +} +EXPORT_SYMBOL_GPL(suspend_device_irqs); + +/** + * resume_device_irqs - enable interrupt lines disabled by suspend_device_irqs() + * + * Enable all interrupt lines previously disabled by suspend_device_irqs() that + * have the IRQ_SUSPENDED flag set. 
+ */ +void resume_device_irqs(void) +{ + struct irq_desc *desc; + int irq; + + for_each_irq_desc(irq, desc) + if (desc->status & IRQ_SUSPENDED) + enable_irq(irq); +} +EXPORT_SYMBOL_GPL(resume_device_irqs); + +/** + * check_wakeup_irqs - check if any wake-up interrupts are pending + */ +int check_wakeup_irqs(void) +{ + struct irq_desc *desc; + int irq; + + for_each_irq_desc(irq, desc) + if ((desc->status & IRQ_WAKEUP) && (desc->status & IRQ_PENDING)) + return -EBUSY; + + return 0; +} Index: linux-2.6/kernel/irq/Makefile =================================================================== --- linux-2.6.orig/kernel/irq/Makefile +++ linux-2.6/kernel/irq/Makefile @@ -4,3 +4,4 @@ obj-$(CONFIG_GENERIC_IRQ_PROBE) += autop obj-$(CONFIG_PROC_FS) += proc.o obj-$(CONFIG_GENERIC_PENDING_IRQ) += migration.o obj-$(CONFIG_NUMA_MIGRATE_IRQ_DESC) += numa_migrate.o +obj-$(CONFIG_PM_SLEEP) += pm.o Index: linux-2.6/kernel/irq/manage.c =================================================================== --- linux-2.6.orig/kernel/irq/manage.c +++ linux-2.6/kernel/irq/manage.c @@ -222,8 +222,9 @@ static void __enable_irq(struct irq_desc WARN(1, KERN_WARNING "Unbalanced enable for IRQ %d\n", irq); break; case 1: { - unsigned int status = desc->status & ~IRQ_DISABLED; + unsigned int status; + status = desc->status & ~(IRQ_DISABLED | IRQ_SUSPENDED); /* Prevent probing on this irq: */ desc->status = status | IRQ_NOPROBE; check_irq_resend(desc, irq); Index: linux-2.6/drivers/base/power/main.c =================================================================== --- linux-2.6.orig/drivers/base/power/main.c +++ linux-2.6/drivers/base/power/main.c @@ -23,6 +23,7 @@ #include <linux/pm.h> #include <linux/resume-trace.h> #include <linux/rwsem.h> +#include <linux/interrupt.h> #include "../base.h" #include "power.h" @@ -305,7 +306,8 @@ static int resume_device_noirq(struct de * Execute the appropriate "noirq resume" callback for all devices marked * as DPM_OFF_IRQ. 
* - * Must be called with interrupts disabled and only one CPU running. + * Must be called under dpm_list_mtx. Device drivers should not receive + * interrupts while it's being executed. */ static void dpm_power_up(pm_message_t state) { @@ -326,14 +328,13 @@ static void dpm_power_up(pm_message_t st * device_power_up - Turn on all devices that need special attention. * @state: PM transition of the system being carried out. * - * Power on system devices, then devices that required we shut them down - * with interrupts disabled. - * - * Must be called with interrupts disabled. + * Call the "early" resume handlers and enable device drivers to receive + * interrupts. */ void device_power_up(pm_message_t state) { dpm_power_up(state); + resume_device_irqs(); } EXPORT_SYMBOL_GPL(device_power_up); @@ -558,16 +559,17 @@ static int suspend_device_noirq(struct d * device_power_down - Shut down special devices. * @state: PM transition of the system being carried out. * - * Power down devices that require interrupts to be disabled. - * Then power down system devices. + * Prevent device drivers from receiving interrupts and call the "late" + * suspend handlers. * - * Must be called with interrupts disabled and only one CPU running. + * Must be called under dpm_list_mtx. 
*/ int device_power_down(pm_message_t state) { struct device *dev; int error = 0; + suspend_device_irqs(); list_for_each_entry_reverse(dev, &dpm_list, power.entry) { error = suspend_device_noirq(dev, state); if (error) { @@ -577,7 +579,7 @@ int device_power_down(pm_message_t state dev->power.status = DPM_OFF_IRQ; } if (error) - dpm_power_up(resume_event(state)); + device_power_up(resume_event(state)); return error; } EXPORT_SYMBOL_GPL(device_power_down); Index: linux-2.6/drivers/base/sys.c =================================================================== --- linux-2.6.orig/drivers/base/sys.c +++ linux-2.6/drivers/base/sys.c @@ -22,6 +22,7 @@ #include <linux/pm.h> #include <linux/device.h> #include <linux/mutex.h> +#include <linux/interrupt.h> #include "base.h" @@ -369,6 +370,13 @@ int sysdev_suspend(pm_message_t state) struct sysdev_driver *drv, *err_drv; int ret; + pr_debug("Checking wake-up interrupts\n"); + + /* Return error code if there are any wake-up interrupts pending */ + ret = check_wakeup_irqs(); + if (ret) + return ret; + pr_debug("Suspending System Devices\n"); list_for_each_entry_reverse(cls, &system_kset->list, kset.kobj.entry) { ^ permalink raw reply [flat|nested] 373+ messages in thread
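As a rough illustration of the selection logic in suspend_device_irqs() and check_wakeup_irqs() from the patch above, the following userspace model walks a plain array instead of the kernel's irq_desc list. The M_ flag values, struct model_desc and the model_ function names are assumptions made for the sketch; locking, chip->disable() and synchronize_irq() are left out.

```c
#include <assert.h>
#include <stddef.h>

/* Model flags: the names mirror the patch, the values are made up. */
#define M_DISABLED  0x1u
#define M_SUSPENDED 0x2u
#define M_WAKEUP    0x4u
#define M_PENDING   0x8u

struct model_desc {
	unsigned int status;
	unsigned int depth; /* nested disable count */
	int has_action;     /* stands in for desc->action != NULL */
	int is_timer;       /* stands in for action->flags & IRQF_TIMER */
};

/* Disable every line that is currently enabled, has a handler and is
 * not a timer line, and mark it so that resume can find it again. */
void model_suspend_device_irqs(struct model_desc *d, size_t n)
{
	assert(d != NULL);
	for (size_t i = 0; i < n; i++) {
		if (d[i].depth == 0 && d[i].has_action && !d[i].is_timer) {
			d[i].depth++;
			d[i].status |= M_DISABLED | M_SUSPENDED;
		}
	}
}

/* Nonzero means "abort suspend"; the patch returns -EBUSY here. */
int model_check_wakeup_irqs(const struct model_desc *d, size_t n)
{
	assert(d != NULL);
	for (size_t i = 0; i < n; i++)
		if ((d[i].status & M_WAKEUP) && (d[i].status & M_PENDING))
			return -1;
	return 0;
}
```

In this model a timer line and a line that a driver had already disabled are both left alone, so only lines the suspend code itself turned off carry the suspended mark. A wake-up line that fires after being turned off shows up as pending, and the check run at the start of sysdev suspend then refuses to continue.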
* Re: [RFC][PATCH 1/4] PM: Rework handling of interrupts during suspend-resume (rev. 4) 2009-03-01 22:24 ` [RFC][PATCH 1/4] PM: Rework handling of interrupts during suspend-resume (rev. 4) Rafael J. Wysocki @ 2009-03-02 23:01 ` Arve Hjønnevåg 2009-03-02 23:01 ` Arve Hjønnevåg 1 sibling, 0 replies; 373+ messages in thread From: Arve Hjønnevåg @ 2009-03-02 23:01 UTC (permalink / raw) To: Rafael J. Wysocki Cc: Jeremy Fitzhardinge, LKML, Jesse Barnes, Johannes Berg, Thomas Gleixner, Eric W. Biederman, Ingo Molnar, Linus Torvalds, pm list On Sun, Mar 1, 2009 at 2:24 PM, Rafael J. Wysocki <rjw@sisk.pl> wrote: > From: Rafael J. Wysocki <rjw@sisk.pl> > > Introduce two helper functions allowing us to prevent device drivers > from getting any interrupts (without disabling interrupts on the CPU) > during suspend (or hibernation) and to make them start to receive > interrupts again during the subsequent resume, respectively. These > functions make it possible to keep timer interrupts enabled while the > "late" suspend and "early" resume callbacks provided by device > drivers are being executed. > > Use these functions to rework the handling of interrupts during > suspend (hibernation) and resume. Namely, interrupts will only be > disabled on the CPU right before suspending sysdevs, while device > drivers will be prevented from receiving interrupts, with the help of > the new helper function, before their "late" suspend callbacks run > (and analogously during resume). > > In addition, since the device interrupts are now disabled before the > CPU has turned all interrupts off and the CPU will ACK the interrupts > setting the IRQ_PENDING bit for them, check in sysdev_suspend() if > any wake-up interrupts are pending and abort suspend if that's the > case. 
> > +void resume_device_irqs(void) > +{ > + struct irq_desc *desc; > + int irq; > + > + for_each_irq_desc(irq, desc) > + if (desc->status & IRQ_SUSPENDED) > + enable_irq(irq); > +} I think you need to clear IRQ_SUSPENDED here, not in enable_irq. > @@ -222,8 +222,9 @@ static void __enable_irq(struct irq_desc > WARN(1, KERN_WARNING "Unbalanced enable for IRQ %d\n", irq); > break; > case 1: { > - unsigned int status = desc->status & ~IRQ_DISABLED; > + unsigned int status; > > + status = desc->status & ~(IRQ_DISABLED | IRQ_SUSPENDED); > /* Prevent probing on this irq: */ > desc->status = status | IRQ_NOPROBE; > check_irq_resend(desc, irq); This only clears IRQ_SUSPENDED if the interrupt was not disabled elsewhere. If a driver calls interrupt_disable in suspend_late, but calls interrupt_enable lazily, resume_device_irqs will reenable the interrupt even though the driver has a disable reference. The rest of the patch looks good. -- Arve Hjønnevåg ^ permalink raw reply [flat|nested] 373+ messages in thread
* Re: [RFC][PATCH 1/4] PM: Rework handling of interrupts during suspend-resume (rev. 4)
  2009-03-02 23:01   ` Arve Hjønnevåg
@ 2009-03-02 23:13     ` Rafael J. Wysocki
  2009-03-02 23:18       ` Arve Hjønnevåg
  2009-03-02 23:18       ` Arve Hjønnevåg
  2009-03-02 23:13     ` Rafael J. Wysocki
  1 sibling, 2 replies; 373+ messages in thread
From: Rafael J. Wysocki @ 2009-03-02 23:13 UTC (permalink / raw)
  To: Arve Hjønnevåg
  Cc: LKML, Linus Torvalds, Ingo Molnar, Eric W. Biederman, Benjamin Herrenschmidt, Jeremy Fitzhardinge, pm list, Len Brown, Jesse Barnes, Thomas Gleixner, Alan Stern, Johannes Berg

On Tuesday 03 March 2009, Arve Hjønnevåg wrote:
> On Sun, Mar 1, 2009 at 2:24 PM, Rafael J. Wysocki <rjw@sisk.pl> wrote:
> > From: Rafael J. Wysocki <rjw@sisk.pl>
> >
> > Introduce two helper functions allowing us to prevent device drivers
> > from getting any interrupts (without disabling interrupts on the CPU)
> > during suspend (or hibernation) and to make them start to receive
> > interrupts again during the subsequent resume, respectively. These
> > functions make it possible to keep timer interrupts enabled while the
> > "late" suspend and "early" resume callbacks provided by device
> > drivers are being executed.
> >
> > Use these functions to rework the handling of interrupts during
> > suspend (hibernation) and resume. Namely, interrupts will only be
> > disabled on the CPU right before suspending sysdevs, while device
> > drivers will be prevented from receiving interrupts, with the help of
> > the new helper function, before their "late" suspend callbacks run
> > (and analogously during resume).
> >
> > In addition, since the device interrupts are now disabled before the
> > CPU has turned all interrupts off and the CPU will ACK the interrupts
> > setting the IRQ_PENDING bit for them, check in sysdev_suspend() if
> > any wake-up interrupts are pending and abort suspend if that's the
> > case.
> >
>
>
> > +void resume_device_irqs(void)
> > +{
> > +	struct irq_desc *desc;
> > +	int irq;
> > +
> > +	for_each_irq_desc(irq, desc)
> > +		if (desc->status & IRQ_SUSPENDED)
> > +			enable_irq(irq);
> > +}
>
> I think you need to clear IRQ_SUSPENDED here, not in enable_irq.

enable_irq() clears IRQ_SUSPENDED. This has already been discussed btw.

> > @@ -222,8 +222,9 @@ static void __enable_irq(struct irq_desc
> >  		WARN(1, KERN_WARNING "Unbalanced enable for IRQ %d\n", irq);
> >  		break;
> >  	case 1: {
> > -		unsigned int status = desc->status & ~IRQ_DISABLED;
> > +		unsigned int status;
> >
> > +		status = desc->status & ~(IRQ_DISABLED | IRQ_SUSPENDED);
> >  		/* Prevent probing on this irq: */
> >  		desc->status = status | IRQ_NOPROBE;
> >  		check_irq_resend(desc, irq);
>
> This only clears IRQ_SUSPENDED if the interrupt was not disabled
> elsewhere. If a driver calls interrupt_disable in suspend_late, but
> calls interrupt_enable lazily, resume_device_irqs will reenable the
> interrupt even though the driver has a disable reference.

Then I'd regard the driver as buggy.

> The rest of the patch looks good.

I'm glad you like it.

Thanks,
Rafael
* Re: [RFC][PATCH 1/4] PM: Rework handling of interrupts during suspend-resume (rev. 4)
  2009-03-02 23:13     ` Rafael J. Wysocki
@ 2009-03-02 23:18       ` Arve Hjønnevåg
  2009-03-02 23:18       ` Arve Hjønnevåg
  1 sibling, 0 replies; 373+ messages in thread
From: Arve Hjønnevåg @ 2009-03-02 23:18 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Jeremy Fitzhardinge, LKML, Jesse Barnes, Johannes Berg, Thomas Gleixner, Eric W. Biederman, Ingo Molnar, Linus Torvalds, pm list

On Mon, Mar 2, 2009 at 3:13 PM, Rafael J. Wysocki <rjw@sisk.pl> wrote:
> On Tuesday 03 March 2009, Arve Hjønnevåg wrote:
>> On Sun, Mar 1, 2009 at 2:24 PM, Rafael J. Wysocki <rjw@sisk.pl> wrote:
>> > From: Rafael J. Wysocki <rjw@sisk.pl>
>> >
>> > Introduce two helper functions allowing us to prevent device drivers
>> > from getting any interrupts (without disabling interrupts on the CPU)
>> > during suspend (or hibernation) and to make them start to receive
>> > interrupts again during the subsequent resume, respectively. These
>> > functions make it possible to keep timer interrupts enabled while the
>> > "late" suspend and "early" resume callbacks provided by device
>> > drivers are being executed.
>> >
>> > Use these functions to rework the handling of interrupts during
>> > suspend (hibernation) and resume. Namely, interrupts will only be
>> > disabled on the CPU right before suspending sysdevs, while device
>> > drivers will be prevented from receiving interrupts, with the help of
>> > the new helper function, before their "late" suspend callbacks run
>> > (and analogously during resume).
>> >
>> > In addition, since the device interrupts are now disabled before the
>> > CPU has turned all interrupts off and the CPU will ACK the interrupts
>> > setting the IRQ_PENDING bit for them, check in sysdev_suspend() if
>> > any wake-up interrupts are pending and abort suspend if that's the
>> > case.
>> >
>>
>>
>> > +void resume_device_irqs(void)
>> > +{
>> > +	struct irq_desc *desc;
>> > +	int irq;
>> > +
>> > +	for_each_irq_desc(irq, desc)
>> > +		if (desc->status & IRQ_SUSPENDED)
>> > +			enable_irq(irq);
>> > +}
>>
>> I think you need to clear IRQ_SUSPENDED here, not in enable_irq.
>
> enable_irq() clears IRQ_SUSPENDED. This has already been discussed btw.
>

I'm sorry if I missed that discussion, but enable_irq cannot know who is
calling it and therefore cannot know if IRQ_SUSPENDED should be
cleared.

>> > @@ -222,8 +222,9 @@ static void __enable_irq(struct irq_desc
>> >  		WARN(1, KERN_WARNING "Unbalanced enable for IRQ %d\n", irq);
>> >  		break;
>> >  	case 1: {
>> > -		unsigned int status = desc->status & ~IRQ_DISABLED;
>> > +		unsigned int status;
>> >
>> > +		status = desc->status & ~(IRQ_DISABLED | IRQ_SUSPENDED);
>> >  		/* Prevent probing on this irq: */
>> >  		desc->status = status | IRQ_NOPROBE;
>> >  		check_irq_resend(desc, irq);
>>
>> This only clears IRQ_SUSPENDED if the interrupt was not disabled
>> elsewhere. If a driver calls interrupt_disable in suspend_late, but
>> calls interrupt_enable lazily, resume_device_irqs will reenable the
>> interrupt even though the driver has a disable reference.
>
> Then I'd regard the driver as buggy.

The bug is not in the driver. The driver called disable_irq once. You
called disable_irq once, but enable_irq twice.

--
Arve Hjønnevåg
* Re: [RFC][PATCH 1/4] PM: Rework handling of interrupts during suspend-resume (rev. 4)
  2009-03-02 23:18       ` Arve Hjønnevåg
@ 2009-03-02 23:27         ` Rafael J. Wysocki
  2009-03-02 23:27         ` Rafael J. Wysocki
  2009-03-02 23:32         ` Linus Torvalds
  2 siblings, 0 replies; 373+ messages in thread
From: Rafael J. Wysocki @ 2009-03-02 23:27 UTC (permalink / raw)
  To: Arve Hjønnevåg, Ingo Molnar
  Cc: Jeremy Fitzhardinge, LKML, Jesse Barnes, Johannes Berg, Eric W. Biederman, pm list, Linus Torvalds, Thomas Gleixner

On Tuesday 03 March 2009, Arve Hjønnevåg wrote:
> On Mon, Mar 2, 2009 at 3:13 PM, Rafael J. Wysocki <rjw@sisk.pl> wrote:
> > On Tuesday 03 March 2009, Arve Hjønnevåg wrote:
> >> On Sun, Mar 1, 2009 at 2:24 PM, Rafael J. Wysocki <rjw@sisk.pl> wrote:
> >> > From: Rafael J. Wysocki <rjw@sisk.pl>
> >> >
> >> > Introduce two helper functions allowing us to prevent device drivers
> >> > from getting any interrupts (without disabling interrupts on the CPU)
> >> > during suspend (or hibernation) and to make them start to receive
> >> > interrupts again during the subsequent resume, respectively. These
> >> > functions make it possible to keep timer interrupts enabled while the
> >> > "late" suspend and "early" resume callbacks provided by device
> >> > drivers are being executed.
> >> >
> >> > Use these functions to rework the handling of interrupts during
> >> > suspend (hibernation) and resume. Namely, interrupts will only be
> >> > disabled on the CPU right before suspending sysdevs, while device
> >> > drivers will be prevented from receiving interrupts, with the help of
> >> > the new helper function, before their "late" suspend callbacks run
> >> > (and analogously during resume).
> >> >
> >> > In addition, since the device interrupts are now disabled before the
> >> > CPU has turned all interrupts off and the CPU will ACK the interrupts
> >> > setting the IRQ_PENDING bit for them, check in sysdev_suspend() if
> >> > any wake-up interrupts are pending and abort suspend if that's the
> >> > case.
> >> >
> >>
> >>
> >> > +void resume_device_irqs(void)
> >> > +{
> >> > +	struct irq_desc *desc;
> >> > +	int irq;
> >> > +
> >> > +	for_each_irq_desc(irq, desc)
> >> > +		if (desc->status & IRQ_SUSPENDED)
> >> > +			enable_irq(irq);
> >> > +}
> >>
> >> I think you need to clear IRQ_SUSPENDED here, not in enable_irq.
> >
> > enable_irq() clears IRQ_SUSPENDED. This has already been discussed btw.
> >
>
> I'm sorry if I missed that discussion, but enable_irq cannot know who is
> calling it and therefore cannot know if IRQ_SUSPENDED should be
> cleared.

This change has been requested by Ingo and for a reason.

Ingo, what's your opinion?

> >> > @@ -222,8 +222,9 @@ static void __enable_irq(struct irq_desc
> >> >  		WARN(1, KERN_WARNING "Unbalanced enable for IRQ %d\n", irq);
> >> >  		break;
> >> >  	case 1: {
> >> > -		unsigned int status = desc->status & ~IRQ_DISABLED;
> >> > +		unsigned int status;
> >> >
> >> > +		status = desc->status & ~(IRQ_DISABLED | IRQ_SUSPENDED);
> >> >  		/* Prevent probing on this irq: */
> >> >  		desc->status = status | IRQ_NOPROBE;
> >> >  		check_irq_resend(desc, irq);
> >>
> >> This only clears IRQ_SUSPENDED if the interrupt was not disabled
> >> elsewhere. If a driver calls interrupt_disable in suspend_late, but
> >> calls interrupt_enable lazily, resume_device_irqs will reenable the
> >> interrupt even though the driver has a disable reference.
> >
> > Then I'd regard the driver as buggy.
>
> The bug is not in the driver. The driver called disable_irq once. You
> called disable_irq once, but enable_irq twice.

Please.

Can you show me a _single_ _driver_ currently in the tree doing something
like you describe in suspend_late and resume_early? If you can't, then please
give up.

Thanks,
Rafael
* Re: [RFC][PATCH 1/4] PM: Rework handling of interrupts during suspend-resume (rev. 4)
  2009-03-02 23:27         ` Rafael J. Wysocki
@ 2009-03-03 22:56           ` Arve Hjønnevåg
  2009-03-04 22:03             ` [Update, rev. 5] " Rafael J. Wysocki
  2009-03-04 22:03             ` Rafael J. Wysocki
  2009-03-03 22:56           ` Arve Hjønnevåg
  1 sibling, 2 replies; 373+ messages in thread
From: Arve Hjønnevåg @ 2009-03-03 22:56 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Ingo Molnar, LKML, Linus Torvalds, Eric W. Biederman, Benjamin Herrenschmidt, Jeremy Fitzhardinge, pm list, Len Brown, Jesse Barnes, Thomas Gleixner, Alan Stern, Johannes Berg

On Mon, Mar 2, 2009 at 3:27 PM, Rafael J. Wysocki <rjw@sisk.pl> wrote:
> On Tuesday 03 March 2009, Arve Hjønnevåg wrote:
>> On Mon, Mar 2, 2009 at 3:13 PM, Rafael J. Wysocki <rjw@sisk.pl> wrote:
>> > On Tuesday 03 March 2009, Arve Hjønnevåg wrote:
>> >> On Sun, Mar 1, 2009 at 2:24 PM, Rafael J. Wysocki <rjw@sisk.pl> wrote:
>> >> > From: Rafael J. Wysocki <rjw@sisk.pl>
>> >> >
>> >> > Introduce two helper functions allowing us to prevent device drivers
>> >> > from getting any interrupts (without disabling interrupts on the CPU)
>> >> > during suspend (or hibernation) and to make them start to receive
>> >> > interrupts again during the subsequent resume, respectively. These
>> >> > functions make it possible to keep timer interrupts enabled while the
>> >> > "late" suspend and "early" resume callbacks provided by device
>> >> > drivers are being executed.
>> >> >
>> >> > Use these functions to rework the handling of interrupts during
>> >> > suspend (hibernation) and resume. Namely, interrupts will only be
>> >> > disabled on the CPU right before suspending sysdevs, while device
>> >> > drivers will be prevented from receiving interrupts, with the help of
>> >> > the new helper function, before their "late" suspend callbacks run
>> >> > (and analogously during resume).
>> >> >
>> >> > In addition, since the device interrupts are now disabled before the
>> >> > CPU has turned all interrupts off and the CPU will ACK the interrupts
>> >> > setting the IRQ_PENDING bit for them, check in sysdev_suspend() if
>> >> > any wake-up interrupts are pending and abort suspend if that's the
>> >> > case.
>> >> >
>> >>
>> >>
>> >> > +void resume_device_irqs(void)
>> >> > +{
>> >> > +	struct irq_desc *desc;
>> >> > +	int irq;
>> >> > +
>> >> > +	for_each_irq_desc(irq, desc)
>> >> > +		if (desc->status & IRQ_SUSPENDED)
>> >> > +			enable_irq(irq);
>> >> > +}
>> >>
>> >> I think you need to clear IRQ_SUSPENDED here, not in enable_irq.
>> >
>> > enable_irq() clears IRQ_SUSPENDED. This has already been discussed btw.
>> >
>>
>> I'm sorry if I missed that discussion, but enable_irq cannot know who is
>> calling it and therefore cannot know if IRQ_SUSPENDED should be
>> cleared.
>
> This change has been requested by Ingo and for a reason.
>
> Ingo, what's your opinion?
>
>> >> > @@ -222,8 +222,9 @@ static void __enable_irq(struct irq_desc
>> >> >  		WARN(1, KERN_WARNING "Unbalanced enable for IRQ %d\n", irq);
>> >> >  		break;
>> >> >  	case 1: {
>> >> > -		unsigned int status = desc->status & ~IRQ_DISABLED;
>> >> > +		unsigned int status;
>> >> >
>> >> > +		status = desc->status & ~(IRQ_DISABLED | IRQ_SUSPENDED);
>> >> >  		/* Prevent probing on this irq: */
>> >> >  		desc->status = status | IRQ_NOPROBE;
>> >> >  		check_irq_resend(desc, irq);
>> >>
>> >> This only clears IRQ_SUSPENDED if the interrupt was not disabled
>> >> elsewhere. If a driver calls interrupt_disable in suspend_late, but
>> >> calls interrupt_enable lazily, resume_device_irqs will reenable the
>> >> interrupt even though the driver has a disable reference.
>> >
>> > Then I'd regard the driver as buggy.
>>
>> The bug is not in the driver. The driver called disable_irq once. You
>> called disable_irq once, but enable_irq twice.
>
> Please.
>
> Can you show me a _single_ _driver_ currently in the tree doing something
> like you describe in suspend_late and resume_early? If you can't, then please
> give up.

I don't know if any drivers call disable_irq or enable_irq in their
suspend hooks, but your change also allows timers, and I assume kernel
threads, to run during this phase.

There are several drivers (keypad drivers in particular), in tree and
out of tree, that call enable_irq from timers, and disable_irq from
their interrupt handler. If you also apply your later change to
disable non boot cpus after suspend_device_irqs, then on smp systems
the interrupt handler may run at the same time as suspend_device_irqs.
If suspend_device_irqs gets the spinlock first, then IRQ_SUSPENDED
gets set. If another suspend/resume cycle happens before the timer
runs, you will incorrectly enable the interrupt.

--
Arve Hjønnevåg
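
[Illustration, not part of the thread]

The sequence Arve describes above can be walked through with a small
user-space model of the disable-depth counter. This is only a sketch under
stated assumptions, not code from the patch: every toy_* name and both flag
values are hypothetical, and the model assumes that the suspend pass skips
(and does not flag) a line that is already disabled, which is how his
"if suspend_device_irqs gets the spinlock first" remark reads. The
clear_in_resume switch selects between the rev. 4 behaviour (flag cleared
only in "case 1" of the enable path) and one reading of Arve's suggestion
(flag cleared by the resume pass itself).

```c
#include <stdbool.h>

/* Toy flag values for illustration only; not the kernel's definitions. */
#define TOY_IRQ_DISABLED	0x1u
#define TOY_IRQ_SUSPENDED	0x2u

struct toy_irq_desc {
	unsigned int status;
	unsigned int depth;		/* nested disable count */
	unsigned int unbalanced;	/* "Unbalanced enable" events seen */
};

struct toy_result {
	unsigned int depth_before_timer; /* depth just before the driver's own enable */
	unsigned int unbalanced;	 /* unbalanced enables after the driver's enable */
};

static void toy_disable_irq(struct toy_irq_desc *d)
{
	if (!d->depth++)
		d->status |= TOY_IRQ_DISABLED;
}

/* With clear_suspended set, the flag is dropped only when depth is 1 (rev. 4). */
static void toy_enable_irq(struct toy_irq_desc *d, bool clear_suspended)
{
	switch (d->depth) {
	case 0:
		d->unbalanced++;
		break;
	case 1:
		d->status &= ~TOY_IRQ_DISABLED;
		if (clear_suspended)
			d->status &= ~TOY_IRQ_SUSPENDED;
		/* fall through */
	default:
		d->depth--;
	}
}

/* Assumption: a line that is already disabled is skipped and not flagged. */
static void toy_suspend_device_irqs(struct toy_irq_desc *d)
{
	if (d->depth)
		return;
	toy_disable_irq(d);
	d->status |= TOY_IRQ_SUSPENDED;
}

static void toy_resume_device_irqs(struct toy_irq_desc *d, bool clear_in_resume)
{
	if (!(d->status & TOY_IRQ_SUSPENDED))
		return;
	if (clear_in_resume)
		d->status &= ~TOY_IRQ_SUSPENDED;
	toy_enable_irq(d, !clear_in_resume);
}

/* Two suspend/resume cycles with a driver disable held across both. */
static struct toy_result toy_run_scenario(bool clear_in_resume)
{
	struct toy_irq_desc d = { 0, 0, 0 };
	struct toy_result r;

	toy_suspend_device_irqs(&d);			/* cycle 1: line disabled and flagged */
	toy_disable_irq(&d);				/* driver's interrupt handler */
	toy_resume_device_irqs(&d, clear_in_resume);	/* cycle 1 resume */
	toy_suspend_device_irqs(&d);			/* cycle 2: already disabled, skipped */
	toy_resume_device_irqs(&d, clear_in_resume);	/* cycle 2 resume */
	r.depth_before_timer = d.depth;
	toy_enable_irq(&d, !clear_in_resume);		/* driver's timer finally runs */
	r.unbalanced = d.unbalanced;
	return r;
}
```

Calling toy_run_scenario(false) models rev. 4: the stale flag survives the
first resume, the second resume drops the driver's own disable reference, and
the driver's later enable is counted as unbalanced. Calling
toy_run_scenario(true) keeps the driver's reference intact until its timer
runs. Whether the real suspend pass behaves like toy_suspend_device_irqs()
depends on code that is not quoted in these messages.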
* [Update, rev. 5] Re: [RFC][PATCH 1/4] PM: Rework handling of interrupts during suspend-resume (rev. 4)
  2009-03-03 22:56           ` Arve Hjønnevåg
@ 2009-03-04 22:03             ` Rafael J. Wysocki
  2009-03-05 10:35               ` Ingo Molnar
  2009-03-05 10:35               ` Ingo Molnar
  2009-03-04 22:03             ` Rafael J. Wysocki
  1 sibling, 2 replies; 373+ messages in thread
From: Rafael J. Wysocki @ 2009-03-04 22:03 UTC (permalink / raw)
  To: Arve Hjønnevåg, Ingo Molnar
  Cc: LKML, Linus Torvalds, Eric W. Biederman, Benjamin Herrenschmidt, Jeremy Fitzhardinge, pm list, Len Brown, Jesse Barnes, Thomas Gleixner, Alan Stern, Johannes Berg

On Tuesday 03 March 2009, Arve Hjønnevåg wrote:
> On Mon, Mar 2, 2009 at 3:27 PM, Rafael J. Wysocki <rjw@sisk.pl> wrote:
> > On Tuesday 03 March 2009, Arve Hjønnevåg wrote:
> >> On Mon, Mar 2, 2009 at 3:13 PM, Rafael J. Wysocki <rjw@sisk.pl> wrote:
> >> > On Tuesday 03 March 2009, Arve Hjønnevåg wrote:
> >> >> On Sun, Mar 1, 2009 at 2:24 PM, Rafael J. Wysocki <rjw@sisk.pl> wrote:
[--snip--]
> > Can you show me a _single_ _driver_ currently in the tree doing something
> > like you describe in suspend_late and resume_early? If you can't, then please
> > give up.
>
> I don't know if any drivers call disable_irq or enable_irq in their
> suspend hooks, but your change also allows timers, and I assume kernel
> threads, to run during this phase.
>
> There are several drivers (keypad drivers in particular), in tree and
> out of tree, that call enable_irq from timers, and disable_irq from
> their interrupt handler. If you also apply your later change to
> disable non boot cpus after suspend_device_irqs, then on smp systems
> the interrupt handler may run at the same time as suspend_device_irqs.
> If suspend_device_irqs gets the spinlock first, then IRQ_SUSPENDED
> gets set. If another suspend/resume cycle happens before the timer
> runs, you will incorrectly enable the interrupt.

Well, unfortunately this is a valid point IMO.
I've been thinking for quite a while how to fix it nicely, but I'm not sure if
there is a nice fix. Below is an updated patch, hopefully everyone will be fine
with it.

Ingo, is making __enable_irq() an extern function acceptable?

Rafael

---
From: Rafael J. Wysocki <rjw@sisk.pl>
Subject: PM: Rework handling of interrupts during suspend-resume (rev. 5)

Introduce two helper functions allowing us to prevent device drivers
from getting any interrupts (without disabling interrupts on the CPU)
during suspend (or hibernation) and to make them start to receive
interrupts again during the subsequent resume, respectively. These
functions make it possible to keep timer interrupts enabled while the
"late" suspend and "early" resume callbacks provided by device
drivers are being executed.

Use these functions to rework the handling of interrupts during
suspend (hibernation) and resume. Namely, interrupts will only be
disabled on the CPU right before suspending sysdevs, while device
drivers will be prevented from receiving interrupts, with the help of
the new helper function, before their "late" suspend callbacks run
(and analogously during resume).

In addition, since the device interrupts are now disabled before the
CPU has turned all interrupts off and the CPU will ACK the interrupts
setting the IRQ_PENDING bit for them, check in sysdev_suspend() if
any wake-up interrupts are pending and abort suspend if that's the
case.

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
---
 arch/x86/kernel/apm_32.c  |   15 +++++--
 drivers/base/power/main.c |   20 +++++-----
 drivers/base/sys.c        |    8 ++++
 drivers/xen/manage.c      |   16 ++++----
 include/linux/interrupt.h |    5 ++
 include/linux/irq.h       |    1
 kernel/irq/Makefile       |    1
 kernel/irq/internals.h    |    1
 kernel/irq/manage.c       |    2 -
 kernel/irq/pm.c           |   89 ++++++++++++++++++++++++++++++++++++++++++++++
 kernel/kexec.c            |    8 ++--
 kernel/power/disk.c       |   39 ++++++++++++++------
 kernel/power/main.c       |   17 +++++---
 13 files changed, 181 insertions(+), 41 deletions(-)

Index: linux-2.6/include/linux/interrupt.h
===================================================================
--- linux-2.6.orig/include/linux/interrupt.h
+++ linux-2.6/include/linux/interrupt.h
@@ -106,6 +106,11 @@ extern void disable_irq_nosync(unsigned
 extern void disable_irq(unsigned int irq);
 extern void enable_irq(unsigned int irq);

+/* The following three functions are for the core kernel use only. */
+extern void suspend_device_irqs(void);
+extern void resume_device_irqs(void);
+extern int check_wakeup_irqs(void);
+
 #if defined(CONFIG_SMP) && defined(CONFIG_GENERIC_HARDIRQS)

 extern cpumask_var_t irq_default_affinity;
Index: linux-2.6/kernel/power/main.c
===================================================================
--- linux-2.6.orig/kernel/power/main.c
+++ linux-2.6/kernel/power/main.c
@@ -287,17 +287,19 @@ void __attribute__ ((weak)) arch_suspend
  */
 static int suspend_enter(suspend_state_t state)
 {
-	int error = 0;
+	int error;

 	device_pm_lock();
-	arch_suspend_disable_irqs();
-	BUG_ON(!irqs_disabled());

-	if ((error = device_power_down(PMSG_SUSPEND))) {
+	error = device_power_down(PMSG_SUSPEND);
+	if (error) {
 		printk(KERN_ERR "PM: Some devices failed to power down\n");
 		goto Done;
 	}

+	arch_suspend_disable_irqs();
+	BUG_ON(!irqs_disabled());
+
 	error = sysdev_suspend(PMSG_SUSPEND);
 	if (!error) {
 		if (!suspend_test(TEST_CORE))
@@ -305,11 +307,14 @@ static int suspend_enter(suspend_state_t
 		sysdev_resume();
 	}

-	device_power_up(PMSG_RESUME);
- Done:
 	arch_suspend_enable_irqs();
 	BUG_ON(irqs_disabled());
+
+	device_power_up(PMSG_RESUME);
+
+ Done:
 	device_pm_unlock();
+
 	return error;
 }

Index: linux-2.6/kernel/power/disk.c
===================================================================
--- linux-2.6.orig/kernel/power/disk.c
+++ linux-2.6/kernel/power/disk.c
@@ -214,7 +214,7 @@ static int create_image(int platform_mod
 		return error;

 	device_pm_lock();
-	local_irq_disable();
+
 	/* At this point, device_suspend() has been called, but *not*
 	 * device_power_down(). We *must* call device_power_down() now.
 	 * Otherwise, drivers for some devices (e.g. interrupt controllers)
@@ -225,8 +225,11 @@ static int create_image(int platform_mod
 	if (error) {
 		printk(KERN_ERR "PM: Some devices failed to power down, "
 			"aborting hibernation\n");
-		goto Enable_irqs;
+		goto Unlock;
 	}
+
+	local_irq_disable();
+
 	sysdev_suspend(PMSG_FREEZE);
 	if (error) {
 		printk(KERN_ERR "PM: Some devices failed to power down, "
@@ -252,12 +255,16 @@ static int create_image(int platform_mod
 	/* NOTE: device_power_up() is just a resume() for devices
 	 * that suspended with irqs off ... no overall powerup.
 	 */
+
  Power_up_devices:
+	local_irq_enable();
+
 	device_power_up(in_suspend ?
 		(error ? PMSG_RECOVER : PMSG_THAW) : PMSG_RESTORE);
- Enable_irqs:
-	local_irq_enable();
+
+ Unlock:
 	device_pm_unlock();
+
 	return error;
 }

@@ -336,13 +343,16 @@ static int resume_target_kernel(void)
 	int error;

 	device_pm_lock();
-	local_irq_disable();
+
 	error = device_power_down(PMSG_QUIESCE);
 	if (error) {
 		printk(KERN_ERR "PM: Some devices failed to power down, "
 			"aborting resume\n");
-		goto Enable_irqs;
+		goto Unlock;
 	}
+
+	local_irq_disable();
+
 	sysdev_suspend(PMSG_QUIESCE);
 	/* We'll ignore saved state, but this gets preempt count (etc) right */
 	save_processor_state();
@@ -366,11 +376,16 @@ static int resume_target_kernel(void)
 	swsusp_free();
 	restore_processor_state();
 	touch_softlockup_watchdog();
+
 	sysdev_resume();
-	device_power_up(PMSG_RECOVER);
- Enable_irqs:
+
 	local_irq_enable();
+
+	device_power_up(PMSG_RECOVER);
+
+ Unlock:
 	device_pm_unlock();
+
 	return error;
 }

@@ -447,15 +462,16 @@ int hibernation_platform_enter(void)
 		goto Finish;

 	device_pm_lock();
-	local_irq_disable();
+
 	error = device_power_down(PMSG_HIBERNATE);
 	if (!error) {
+		local_irq_disable();
 		sysdev_suspend(PMSG_HIBERNATE);
 		hibernation_ops->enter();
 		/* We should never get here */
 		while (1);
 	}
-	local_irq_enable();
+
 	device_pm_unlock();

 	/*
@@ -464,12 +480,15 @@ int hibernation_platform_enter(void)
 	 */
  Finish:
 	hibernation_ops->finish();
+
  Resume_devices:
 	entering_platform_hibernation = false;
 	device_resume(PMSG_RESTORE);
 	resume_console();
+
  Close:
 	hibernation_ops->end();
+
 	return error;
 }

Index: linux-2.6/arch/x86/kernel/apm_32.c
===================================================================
--- linux-2.6.orig/arch/x86/kernel/apm_32.c
+++ linux-2.6/arch/x86/kernel/apm_32.c
@@ -1190,8 +1190,10 @@ static int suspend(int vetoable)
 	struct apm_user	*as;

 	device_suspend(PMSG_SUSPEND);
-	local_irq_disable();
+
 	device_power_down(PMSG_SUSPEND);
+
+	local_irq_disable();
 	sysdev_suspend(PMSG_SUSPEND);

 	local_irq_enable();
@@ -1209,9 +1211,12 @@ static int suspend(int vetoable)
 	if (err != APM_SUCCESS)
 		apm_error("suspend", err);
 	err = (err == APM_SUCCESS) ? 0 : -EIO;
+
 	sysdev_resume();
-	device_power_up(PMSG_RESUME);
 	local_irq_enable();
+
+	device_power_up(PMSG_RESUME);
+
 	device_resume(PMSG_RESUME);
 	queue_event(APM_NORMAL_RESUME, NULL);
 	spin_lock(&user_list_lock);
@@ -1228,8 +1233,9 @@ static void standby(void)
 {
 	int err;

-	local_irq_disable();
 	device_power_down(PMSG_SUSPEND);
+
+	local_irq_disable();
 	sysdev_suspend(PMSG_SUSPEND);
 	local_irq_enable();

@@ -1239,8 +1245,9 @@ static void standby(void)

 	local_irq_disable();
 	sysdev_resume();
-	device_power_up(PMSG_RESUME);
 	local_irq_enable();
+
+	device_power_up(PMSG_RESUME);
 }

 static apm_event_t get_event(void)
Index: linux-2.6/drivers/xen/manage.c
===================================================================
--- linux-2.6.orig/drivers/xen/manage.c
+++ linux-2.6/drivers/xen/manage.c
@@ -39,12 +39,6 @@ static int xen_suspend(void *data)

 	BUG_ON(!irqs_disabled());

-	err = device_power_down(PMSG_SUSPEND);
-	if (err) {
-		printk(KERN_ERR "xen_suspend: device_power_down failed: %d\n",
-		       err);
-		return err;
-	}
 	err = sysdev_suspend(PMSG_SUSPEND);
 	if (err) {
 		printk(KERN_ERR "xen_suspend: sysdev_suspend failed: %d\n",
@@ -69,7 +63,6 @@ static int xen_suspend(void *data)
 	xen_mm_unpin_all();

 	sysdev_resume();
-	device_power_up(PMSG_RESUME);

 	if (!*cancelled) {
 		xen_irq_resume();
@@ -108,6 +101,12 @@ static void do_suspend(void)
 	/* XXX use normal device tree? */
 	xenbus_suspend();

+	err = device_power_down(PMSG_SUSPEND);
+	if (err) {
+		printk(KERN_ERR "device_power_down failed: %d\n", err);
+		goto resume_devices;
+	}
+
 	err = stop_machine(xen_suspend, &cancelled, &cpumask_of_cpu(0));
 	if (err) {
 		printk(KERN_ERR "failed to start xen_suspend: %d\n", err);
@@ -120,6 +119,9 @@ static void do_suspend(void)
 	} else
 		xenbus_suspend_cancel();

+	device_power_up(PMSG_RESUME);
+
+resume_devices:
 	device_resume(PMSG_RESUME);

 	/* Make sure timer events get retriggered on all CPUs */
Index: linux-2.6/kernel/kexec.c
===================================================================
--- linux-2.6.orig/kernel/kexec.c
+++ linux-2.6/kernel/kexec.c
@@ -1454,7 +1454,6 @@ int kernel_kexec(void)
 		if (error)
 			goto Resume_devices;
 		device_pm_lock();
-		local_irq_disable();
 		/* At this point, device_suspend() has been called,
 		 * but *not* device_power_down(). We *must*
 		 * device_power_down() now. Otherwise, drivers for
@@ -1464,8 +1463,9 @@ int kernel_kexec(void)
 		 */
 		error = device_power_down(PMSG_FREEZE);
 		if (error)
-			goto Enable_irqs;
+			goto Unlock_pm;

+		local_irq_disable();
 		/* Suspend system devices */
 		error = sysdev_suspend(PMSG_FREEZE);
 		if (error)
@@ -1484,9 +1484,9 @@ int kernel_kexec(void)
 	if (kexec_image->preserve_context) {
 		sysdev_resume();
  Power_up_devices:
-		device_power_up(PMSG_RESTORE);
- Enable_irqs:
 		local_irq_enable();
+		device_power_up(PMSG_RESTORE);
+ Unlock_pm:
 		device_pm_unlock();
 		enable_nonboot_cpus();
  Resume_devices:
Index: linux-2.6/include/linux/irq.h
===================================================================
--- linux-2.6.orig/include/linux/irq.h
+++ linux-2.6/include/linux/irq.h
@@ -65,6 +65,7 @@ typedef void (*irq_flow_handler_t)(unsig
 #define IRQ_SPURIOUS_DISABLED	0x00800000	/* IRQ was disabled by the spurious trap */
 #define IRQ_MOVE_PCNTXT		0x01000000	/* IRQ migration from process context */
 #define IRQ_AFFINITY_SET	0x02000000	/* IRQ affinity was set from userspace*/
+#define IRQ_SUSPENDED		0x04000000	/* IRQ has gone through
suspend sequence */
 
 #ifdef CONFIG_IRQ_PER_CPU
 # define CHECK_IRQ_PER_CPU(var) ((var) & IRQ_PER_CPU)
Index: linux-2.6/kernel/irq/pm.c
===================================================================
--- /dev/null
+++ linux-2.6/kernel/irq/pm.c
@@ -0,0 +1,89 @@
+/*
+ * linux/kernel/irq/pm.c
+ *
+ * Copyright (C) 2009 Rafael J. Wysocki <rjw@sisk.pl>, Novell Inc.
+ *
+ * This file contains power management functions related to interrupts.
+ */
+
+#include <linux/irq.h>
+#include <linux/module.h>
+#include <linux/interrupt.h>
+
+#include "internals.h"
+
+/**
+ * suspend_device_irqs - disable all currently enabled interrupt lines
+ *
+ * During system-wide suspend or hibernation device interrupts need to be
+ * disabled at the chip level and this function is provided for this purpose.
+ * It disables all interrupt lines that are enabled at the moment and sets the
+ * IRQ_SUSPENDED flag for them.
+ */
+void suspend_device_irqs(void)
+{
+	struct irq_desc *desc;
+	int irq;
+
+	for_each_irq_desc(irq, desc) {
+		unsigned long flags;
+		bool sync = false;
+
+		spin_lock_irqsave(&desc->lock, flags);
+
+		if (desc->action && !(desc->action->flags & IRQF_TIMER)) {
+			if (!desc->depth++) {
+				desc->status |= IRQ_DISABLED;
+				desc->chip->disable(irq);
+				sync = true;
+			}
+			desc->status |= IRQ_SUSPENDED;
+		}
+
+		spin_unlock_irqrestore(&desc->lock, flags);
+
+		if (sync)
+			synchronize_irq(irq);
+	}
+}
+EXPORT_SYMBOL_GPL(suspend_device_irqs);
+
+/**
+ * resume_device_irqs - enable interrupt lines disabled by suspend_device_irqs()
+ *
+ * Enable all interrupt lines previously disabled by suspend_device_irqs() that
+ * have the IRQ_SUSPENDED flag set.
+ */
+void resume_device_irqs(void)
+{
+	struct irq_desc *desc;
+	int irq;
+
+	for_each_irq_desc(irq, desc) {
+		unsigned long flags;
+
+		if (!(desc->status & IRQ_SUSPENDED))
+			continue;
+
+		spin_lock_irqsave(&desc->lock, flags);
+		desc->status &= ~IRQ_SUSPENDED;
+		__enable_irq(desc, irq);
+		spin_unlock_irqrestore(&desc->lock, flags);
+	}
+}
+EXPORT_SYMBOL_GPL(resume_device_irqs);
+
+/**
+ * check_wakeup_irqs - check if any wake-up interrupts are pending
+ */
+int check_wakeup_irqs(void)
+{
+	struct irq_desc *desc;
+	int irq;
+
+	for_each_irq_desc(irq, desc)
+		if ((desc->status & IRQ_WAKEUP) && (desc->status & IRQ_PENDING))
+			return -EBUSY;
+
+	return 0;
+}
Index: linux-2.6/kernel/irq/Makefile
===================================================================
--- linux-2.6.orig/kernel/irq/Makefile
+++ linux-2.6/kernel/irq/Makefile
@@ -4,3 +4,4 @@ obj-$(CONFIG_GENERIC_IRQ_PROBE) += autop
 obj-$(CONFIG_PROC_FS) += proc.o
 obj-$(CONFIG_GENERIC_PENDING_IRQ) += migration.o
 obj-$(CONFIG_NUMA_MIGRATE_IRQ_DESC) += numa_migrate.o
+obj-$(CONFIG_PM_SLEEP) += pm.o
Index: linux-2.6/kernel/irq/manage.c
===================================================================
--- linux-2.6.orig/kernel/irq/manage.c
+++ linux-2.6/kernel/irq/manage.c
@@ -215,7 +215,7 @@ void disable_irq(unsigned int irq)
 }
 EXPORT_SYMBOL(disable_irq);
 
-static void __enable_irq(struct irq_desc *desc, unsigned int irq)
+void __enable_irq(struct irq_desc *desc, unsigned int irq)
 {
 	switch (desc->depth) {
 	case 0:
Index: linux-2.6/drivers/base/power/main.c
===================================================================
--- linux-2.6.orig/drivers/base/power/main.c
+++ linux-2.6/drivers/base/power/main.c
@@ -23,6 +23,7 @@
 #include <linux/pm.h>
 #include <linux/resume-trace.h>
 #include <linux/rwsem.h>
+#include <linux/interrupt.h>
 
 #include "../base.h"
 #include "power.h"
@@ -305,7 +306,8 @@ static int resume_device_noirq(struct de
  * Execute the appropriate "noirq resume" callback for all devices marked
  * as DPM_OFF_IRQ.
  *
- * Must be called with interrupts disabled and only one CPU running.
+ * Must be called under dpm_list_mtx.  Device drivers should not receive
+ * interrupts while it's being executed.
  */
 static void dpm_power_up(pm_message_t state)
 {
@@ -326,14 +328,13 @@ static void dpm_power_up(pm_message_t st
  * device_power_up - Turn on all devices that need special attention.
  * @state: PM transition of the system being carried out.
  *
- * Power on system devices, then devices that required we shut them down
- * with interrupts disabled.
- *
- * Must be called with interrupts disabled.
+ * Call the "early" resume handlers and enable device drivers to receive
+ * interrupts.
  */
 void device_power_up(pm_message_t state)
 {
 	dpm_power_up(state);
+	resume_device_irqs();
 }
 EXPORT_SYMBOL_GPL(device_power_up);
 
@@ -558,16 +559,17 @@ static int suspend_device_noirq(struct d
  * device_power_down - Shut down special devices.
  * @state: PM transition of the system being carried out.
  *
- * Power down devices that require interrupts to be disabled.
- * Then power down system devices.
+ * Prevent device drivers from receiving interrupts and call the "late"
+ * suspend handlers.
  *
- * Must be called with interrupts disabled and only one CPU running.
+ * Must be called under dpm_list_mtx.
  */
 int device_power_down(pm_message_t state)
 {
 	struct device *dev;
 	int error = 0;
 
+	suspend_device_irqs();
 	list_for_each_entry_reverse(dev, &dpm_list, power.entry) {
 		error = suspend_device_noirq(dev, state);
 		if (error) {
@@ -577,7 +579,7 @@ int device_power_down(pm_message_t state
 		dev->power.status = DPM_OFF_IRQ;
 	}
 	if (error)
-		dpm_power_up(resume_event(state));
+		device_power_up(resume_event(state));
 	return error;
 }
 EXPORT_SYMBOL_GPL(device_power_down);
Index: linux-2.6/drivers/base/sys.c
===================================================================
--- linux-2.6.orig/drivers/base/sys.c
+++ linux-2.6/drivers/base/sys.c
@@ -22,6 +22,7 @@
 #include <linux/pm.h>
 #include <linux/device.h>
 #include <linux/mutex.h>
+#include <linux/interrupt.h>
 
 #include "base.h"
 
@@ -369,6 +370,13 @@ int sysdev_suspend(pm_message_t state)
 	struct sysdev_driver *drv, *err_drv;
 	int ret;
 
+	pr_debug("Checking wake-up interrupts\n");
+
+	/* Return error code if there are any wake-up interrupts pending */
+	ret = check_wakeup_irqs();
+	if (ret)
+		return ret;
+
 	pr_debug("Suspending System Devices\n");
 
 	list_for_each_entry_reverse(cls, &system_kset->list, kset.kobj.entry) {
Index: linux-2.6/kernel/irq/internals.h
===================================================================
--- linux-2.6.orig/kernel/irq/internals.h
+++ linux-2.6/kernel/irq/internals.h
@@ -12,6 +12,7 @@ extern void compat_irq_chip_set_default_
 
 extern int __irq_set_trigger(struct irq_desc *desc, unsigned int irq,
 		unsigned long flags);
+extern void __enable_irq(struct irq_desc *desc, unsigned int irq);
 
 extern struct lock_class_key irq_desc_lock_class;
 extern void init_kstat_irqs(struct irq_desc *desc, int cpu, int nr);

^ permalink raw reply	[flat|nested] 373+ messages in thread
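The wake-up check that the patch adds to sysdev_suspend() can be tried outside the kernel. What follows is a minimal userspace sketch, not kernel code: the wake_ names, the flag values and the plain status array are assumptions made for illustration, standing in for the irq_desc walk done by check_wakeup_irqs().

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative flag values only; the kernel's IRQ_PENDING/IRQ_WAKEUP differ. */
#define WAKE_IRQ_PENDING 0x1u
#define WAKE_IRQ_WAKEUP  0x2u
#define WAKE_EBUSY       16	/* stand-in for the kernel's EBUSY */

/*
 * Model of check_wakeup_irqs(): return -WAKE_EBUSY as soon as one line is
 * both a wake-up source and pending, otherwise return 0.
 */
static int wake_check_irqs(const unsigned int *status, size_t nr)
{
	size_t i;

	assert(status != NULL || nr == 0);

	for (i = 0; i < nr; i++)
		if ((status[i] & WAKE_IRQ_WAKEUP) && (status[i] & WAKE_IRQ_PENDING))
			return -WAKE_EBUSY;

	return 0;
}
```

In this model a pending interrupt on a line that is not a wake-up source does not abort suspend; only a line carrying both flags does.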
* Re: [Update, rev. 5] Re: [RFC][PATCH 1/4] PM: Rework handling of interrupts during suspend-resume (rev. 4)
  2009-03-04 22:03 ` [Update, rev. 5] " Rafael J. Wysocki
@ 2009-03-05 10:35 ` Ingo Molnar
  2009-03-05 10:35 ` Ingo Molnar
  1 sibling, 0 replies; 373+ messages in thread
From: Ingo Molnar @ 2009-03-05 10:35 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Jeremy Fitzhardinge, LKML, Jesse Barnes, Johannes Berg,
	Eric W. Biederman, pm list, Linus Torvalds, Thomas Gleixner

* Rafael J. Wysocki <rjw@sisk.pl> wrote:

> On Tuesday 03 March 2009, Arve Hjønnevåg wrote:
> > On Mon, Mar 2, 2009 at 3:27 PM, Rafael J. Wysocki <rjw@sisk.pl> wrote:
> > > On Tuesday 03 March 2009, Arve Hjønnevåg wrote:
> > >> On Mon, Mar 2, 2009 at 3:13 PM, Rafael J. Wysocki <rjw@sisk.pl> wrote:
> > >> > On Tuesday 03 March 2009, Arve Hjønnevåg wrote:
> > >> >> On Sun, Mar 1, 2009 at 2:24 PM, Rafael J. Wysocki <rjw@sisk.pl> wrote:
> [--snip--]
> > > Can you show me a _single_ _driver_ currently in the tree doing something
> > > like you describe in suspend_late and resume_early?  If you can't, then please
> > > give up.
> >
> > I don't know if any drivers call disable_irq or enable_irq in their
> > suspend hooks, but your change also allows timers, and I assume kernel
> > threads, to run during this phase.
> >
> > There are several drivers (keypad drivers in particular), in tree and
> > out of tree, that call enable_irq from timers, and disable_irq from
> > their interrupt handler. If you also apply your later change to
> > disable non boot cpus after suspend_device_irqs, then on smp systems
> > the interrupt handler may run at the same time as suspend_device_irqs.
> > If suspend_device_irqs gets the spinlock first, then IRQ_SUSPENDED
> > gets set. If another suspend/resume cycle happens before the timer
> > runs, you will incorrectly enable the interrupt.
>
> Well, unfortunately this is a valid point IMO.  I've been thinking for quite a
> while how to fix it nicely, but I'm not sure if there is a nice fix.
>
> Below is an updated patch, hopefully everyone will be fine with it.
>
> Ingo, is making __enable_irq() an extern function acceptable?

Sure, that's fine - it's a genirq internal function still between
kernel/irq/manage.c and kernel/irq/pm.c.

	Ingo

^ permalink raw reply	[flat|nested] 373+ messages in thread
* [Update, rev. 5] Re: [RFC][PATCH 1/4] PM: Rework handling of interrupts during suspend-resume (rev. 4)
  2009-03-03 22:56 ` Arve Hjønnevåg
  2009-03-04 22:03 ` [Update, rev. 5] " Rafael J. Wysocki
@ 2009-03-04 22:03 ` Rafael J. Wysocki
  1 sibling, 0 replies; 373+ messages in thread
From: Rafael J. Wysocki @ 2009-03-04 22:03 UTC (permalink / raw)
  To: Arve Hjønnevåg, Ingo Molnar
  Cc: Jeremy Fitzhardinge, LKML, Jesse Barnes, Johannes Berg,
	Eric W. Biederman, pm list, Linus Torvalds, Thomas Gleixner

On Tuesday 03 March 2009, Arve Hjønnevåg wrote:
> On Mon, Mar 2, 2009 at 3:27 PM, Rafael J. Wysocki <rjw@sisk.pl> wrote:
> > On Tuesday 03 March 2009, Arve Hjønnevåg wrote:
> >> On Mon, Mar 2, 2009 at 3:13 PM, Rafael J. Wysocki <rjw@sisk.pl> wrote:
> >> > On Tuesday 03 March 2009, Arve Hjønnevåg wrote:
> >> >> On Sun, Mar 1, 2009 at 2:24 PM, Rafael J. Wysocki <rjw@sisk.pl> wrote:
[--snip--]
> > Can you show me a _single_ _driver_ currently in the tree doing something
> > like you describe in suspend_late and resume_early?  If you can't, then please
> > give up.
>
> I don't know if any drivers call disable_irq or enable_irq in their
> suspend hooks, but your change also allows timers, and I assume kernel
> threads, to run during this phase.
>
> There are several drivers (keypad drivers in particular), in tree and
> out of tree, that call enable_irq from timers, and disable_irq from
> their interrupt handler. If you also apply your later change to
> disable non boot cpus after suspend_device_irqs, then on smp systems
> the interrupt handler may run at the same time as suspend_device_irqs.
> If suspend_device_irqs gets the spinlock first, then IRQ_SUSPENDED
> gets set. If another suspend/resume cycle happens before the timer
> runs, you will incorrectly enable the interrupt.

Well, unfortunately this is a valid point IMO.  I've been thinking for quite a
while how to fix it nicely, but I'm not sure if there is a nice fix.

Below is an updated patch, hopefully everyone will be fine with it.

Ingo, is making __enable_irq() an extern function acceptable?

Rafael

---
From: Rafael J. Wysocki <rjw@sisk.pl>
Subject: PM: Rework handling of interrupts during suspend-resume (rev. 5)

Introduce two helper functions allowing us to prevent device drivers
from getting any interrupts (without disabling interrupts on the CPU)
during suspend (or hibernation) and to make them start to receive
interrupts again during the subsequent resume, respectively.  These
functions make it possible to keep timer interrupts enabled while the
"late" suspend and "early" resume callbacks provided by device
drivers are being executed.

Use these functions to rework the handling of interrupts during
suspend (hibernation) and resume.  Namely, interrupts will only be
disabled on the CPU right before suspending sysdevs, while device
drivers will be prevented from receiving interrupts, with the help of
the new helper function, before their "late" suspend callbacks run
(and analogously during resume).

In addition, since the device interrupts are now disabled before the
CPU has turned all interrupts off and the CPU will ACK the interrupts
setting the IRQ_PENDING bit for them, check in sysdev_suspend() if
any wake-up interrupts are pending and abort suspend if that's the
case.

Signed-off-by: Rafael J.
Wysocki <rjw@sisk.pl> --- arch/x86/kernel/apm_32.c | 15 +++++-- drivers/base/power/main.c | 20 +++++----- drivers/base/sys.c | 8 ++++ drivers/xen/manage.c | 16 ++++---- include/linux/interrupt.h | 5 ++ include/linux/irq.h | 1 kernel/irq/Makefile | 1 kernel/irq/internals.h | 1 kernel/irq/manage.c | 2 - kernel/irq/pm.c | 89 ++++++++++++++++++++++++++++++++++++++++++++++ kernel/kexec.c | 8 ++-- kernel/power/disk.c | 39 ++++++++++++++------ kernel/power/main.c | 17 +++++--- 13 files changed, 181 insertions(+), 41 deletions(-) Index: linux-2.6/include/linux/interrupt.h =================================================================== --- linux-2.6.orig/include/linux/interrupt.h +++ linux-2.6/include/linux/interrupt.h @@ -106,6 +106,11 @@ extern void disable_irq_nosync(unsigned extern void disable_irq(unsigned int irq); extern void enable_irq(unsigned int irq); +/* The following three functions are for the core kernel use only. */ +extern void suspend_device_irqs(void); +extern void resume_device_irqs(void); +extern int check_wakeup_irqs(void); + #if defined(CONFIG_SMP) && defined(CONFIG_GENERIC_HARDIRQS) extern cpumask_var_t irq_default_affinity; Index: linux-2.6/kernel/power/main.c =================================================================== --- linux-2.6.orig/kernel/power/main.c +++ linux-2.6/kernel/power/main.c @@ -287,17 +287,19 @@ void __attribute__ ((weak)) arch_suspend */ static int suspend_enter(suspend_state_t state) { - int error = 0; + int error; device_pm_lock(); - arch_suspend_disable_irqs(); - BUG_ON(!irqs_disabled()); - if ((error = device_power_down(PMSG_SUSPEND))) { + error = device_power_down(PMSG_SUSPEND); + if (error) { printk(KERN_ERR "PM: Some devices failed to power down\n"); goto Done; } + arch_suspend_disable_irqs(); + BUG_ON(!irqs_disabled()); + error = sysdev_suspend(PMSG_SUSPEND); if (!error) { if (!suspend_test(TEST_CORE)) @@ -305,11 +307,14 @@ static int suspend_enter(suspend_state_t sysdev_resume(); } - 
device_power_up(PMSG_RESUME); - Done: arch_suspend_enable_irqs(); BUG_ON(irqs_disabled()); + + device_power_up(PMSG_RESUME); + + Done: device_pm_unlock(); + return error; } Index: linux-2.6/kernel/power/disk.c =================================================================== --- linux-2.6.orig/kernel/power/disk.c +++ linux-2.6/kernel/power/disk.c @@ -214,7 +214,7 @@ static int create_image(int platform_mod return error; device_pm_lock(); - local_irq_disable(); + /* At this point, device_suspend() has been called, but *not* * device_power_down(). We *must* call device_power_down() now. * Otherwise, drivers for some devices (e.g. interrupt controllers) @@ -225,8 +225,11 @@ static int create_image(int platform_mod if (error) { printk(KERN_ERR "PM: Some devices failed to power down, " "aborting hibernation\n"); - goto Enable_irqs; + goto Unlock; } + + local_irq_disable(); + sysdev_suspend(PMSG_FREEZE); if (error) { printk(KERN_ERR "PM: Some devices failed to power down, " @@ -252,12 +255,16 @@ static int create_image(int platform_mod /* NOTE: device_power_up() is just a resume() for devices * that suspended with irqs off ... no overall powerup. */ + Power_up_devices: + local_irq_enable(); + device_power_up(in_suspend ? (error ? 
PMSG_RECOVER : PMSG_THAW) : PMSG_RESTORE); - Enable_irqs: - local_irq_enable(); + + Unlock: device_pm_unlock(); + return error; } @@ -336,13 +343,16 @@ static int resume_target_kernel(void) int error; device_pm_lock(); - local_irq_disable(); + error = device_power_down(PMSG_QUIESCE); if (error) { printk(KERN_ERR "PM: Some devices failed to power down, " "aborting resume\n"); - goto Enable_irqs; + goto Unlock; } + + local_irq_disable(); + sysdev_suspend(PMSG_QUIESCE); /* We'll ignore saved state, but this gets preempt count (etc) right */ save_processor_state(); @@ -366,11 +376,16 @@ static int resume_target_kernel(void) swsusp_free(); restore_processor_state(); touch_softlockup_watchdog(); + sysdev_resume(); - device_power_up(PMSG_RECOVER); - Enable_irqs: + local_irq_enable(); + + device_power_up(PMSG_RECOVER); + + Unlock: device_pm_unlock(); + return error; } @@ -447,15 +462,16 @@ int hibernation_platform_enter(void) goto Finish; device_pm_lock(); - local_irq_disable(); + error = device_power_down(PMSG_HIBERNATE); if (!error) { + local_irq_disable(); sysdev_suspend(PMSG_HIBERNATE); hibernation_ops->enter(); /* We should never get here */ while (1); } - local_irq_enable(); + device_pm_unlock(); /* @@ -464,12 +480,15 @@ int hibernation_platform_enter(void) */ Finish: hibernation_ops->finish(); + Resume_devices: entering_platform_hibernation = false; device_resume(PMSG_RESTORE); resume_console(); + Close: hibernation_ops->end(); + return error; } Index: linux-2.6/arch/x86/kernel/apm_32.c =================================================================== --- linux-2.6.orig/arch/x86/kernel/apm_32.c +++ linux-2.6/arch/x86/kernel/apm_32.c @@ -1190,8 +1190,10 @@ static int suspend(int vetoable) struct apm_user *as; device_suspend(PMSG_SUSPEND); - local_irq_disable(); + device_power_down(PMSG_SUSPEND); + + local_irq_disable(); sysdev_suspend(PMSG_SUSPEND); local_irq_enable(); @@ -1209,9 +1211,12 @@ static int suspend(int vetoable) if (err != APM_SUCCESS) 
apm_error("suspend", err); err = (err == APM_SUCCESS) ? 0 : -EIO; + sysdev_resume(); - device_power_up(PMSG_RESUME); local_irq_enable(); + + device_power_up(PMSG_RESUME); + device_resume(PMSG_RESUME); queue_event(APM_NORMAL_RESUME, NULL); spin_lock(&user_list_lock); @@ -1228,8 +1233,9 @@ static void standby(void) { int err; - local_irq_disable(); device_power_down(PMSG_SUSPEND); + + local_irq_disable(); sysdev_suspend(PMSG_SUSPEND); local_irq_enable(); @@ -1239,8 +1245,9 @@ static void standby(void) local_irq_disable(); sysdev_resume(); - device_power_up(PMSG_RESUME); local_irq_enable(); + + device_power_up(PMSG_RESUME); } static apm_event_t get_event(void) Index: linux-2.6/drivers/xen/manage.c =================================================================== --- linux-2.6.orig/drivers/xen/manage.c +++ linux-2.6/drivers/xen/manage.c @@ -39,12 +39,6 @@ static int xen_suspend(void *data) BUG_ON(!irqs_disabled()); - err = device_power_down(PMSG_SUSPEND); - if (err) { - printk(KERN_ERR "xen_suspend: device_power_down failed: %d\n", - err); - return err; - } err = sysdev_suspend(PMSG_SUSPEND); if (err) { printk(KERN_ERR "xen_suspend: sysdev_suspend failed: %d\n", @@ -69,7 +63,6 @@ static int xen_suspend(void *data) xen_mm_unpin_all(); sysdev_resume(); - device_power_up(PMSG_RESUME); if (!*cancelled) { xen_irq_resume(); @@ -108,6 +101,12 @@ static void do_suspend(void) /* XXX use normal device tree? 
*/ xenbus_suspend(); + err = device_power_down(PMSG_SUSPEND); + if (err) { + printk(KERN_ERR "device_power_down failed: %d\n", err); + goto resume_devices; + } + err = stop_machine(xen_suspend, &cancelled, &cpumask_of_cpu(0)); if (err) { printk(KERN_ERR "failed to start xen_suspend: %d\n", err); @@ -120,6 +119,9 @@ static void do_suspend(void) } else xenbus_suspend_cancel(); + device_power_up(PMSG_RESUME); + +resume_devices: device_resume(PMSG_RESUME); /* Make sure timer events get retriggered on all CPUs */ Index: linux-2.6/kernel/kexec.c =================================================================== --- linux-2.6.orig/kernel/kexec.c +++ linux-2.6/kernel/kexec.c @@ -1454,7 +1454,6 @@ int kernel_kexec(void) if (error) goto Resume_devices; device_pm_lock(); - local_irq_disable(); /* At this point, device_suspend() has been called, * but *not* device_power_down(). We *must* * device_power_down() now. Otherwise, drivers for @@ -1464,8 +1463,9 @@ int kernel_kexec(void) */ error = device_power_down(PMSG_FREEZE); if (error) - goto Enable_irqs; + goto Unlock_pm; + local_irq_disable(); /* Suspend system devices */ error = sysdev_suspend(PMSG_FREEZE); if (error) @@ -1484,9 +1484,9 @@ int kernel_kexec(void) if (kexec_image->preserve_context) { sysdev_resume(); Power_up_devices: - device_power_up(PMSG_RESTORE); - Enable_irqs: local_irq_enable(); + device_power_up(PMSG_RESTORE); + Unlock_pm: device_pm_unlock(); enable_nonboot_cpus(); Resume_devices: Index: linux-2.6/include/linux/irq.h =================================================================== --- linux-2.6.orig/include/linux/irq.h +++ linux-2.6/include/linux/irq.h @@ -65,6 +65,7 @@ typedef void (*irq_flow_handler_t)(unsig #define IRQ_SPURIOUS_DISABLED 0x00800000 /* IRQ was disabled by the spurious trap */ #define IRQ_MOVE_PCNTXT 0x01000000 /* IRQ migration from process context */ #define IRQ_AFFINITY_SET 0x02000000 /* IRQ affinity was set from userspace*/ +#define IRQ_SUSPENDED 0x04000000 /* IRQ has gone through 
suspend sequence */
 
 #ifdef CONFIG_IRQ_PER_CPU
 # define CHECK_IRQ_PER_CPU(var) ((var) & IRQ_PER_CPU)
Index: linux-2.6/kernel/irq/pm.c
===================================================================
--- /dev/null
+++ linux-2.6/kernel/irq/pm.c
@@ -0,0 +1,89 @@
+/*
+ * linux/kernel/irq/pm.c
+ *
+ * Copyright (C) 2009 Rafael J. Wysocki <rjw@sisk.pl>, Novell Inc.
+ *
+ * This file contains power management functions related to interrupts.
+ */
+
+#include <linux/irq.h>
+#include <linux/module.h>
+#include <linux/interrupt.h>
+
+#include "internals.h"
+
+/**
+ * suspend_device_irqs - disable all currently enabled interrupt lines
+ *
+ * During system-wide suspend or hibernation device interrupts need to be
+ * disabled at the chip level and this function is provided for this purpose.
+ * It disables all interrupt lines that are enabled at the moment and sets the
+ * IRQ_SUSPENDED flag for them.
+ */
+void suspend_device_irqs(void)
+{
+	struct irq_desc *desc;
+	int irq;
+
+	for_each_irq_desc(irq, desc) {
+		unsigned long flags;
+		bool sync = false;
+
+		spin_lock_irqsave(&desc->lock, flags);
+
+		if (desc->action && !(desc->action->flags & IRQF_TIMER)) {
+			if (!desc->depth++) {
+				desc->status |= IRQ_DISABLED;
+				desc->chip->disable(irq);
+				sync = true;
+			}
+			desc->status |= IRQ_SUSPENDED;
+		}
+
+		spin_unlock_irqrestore(&desc->lock, flags);
+
+		if (sync)
+			synchronize_irq(irq);
+	}
+}
+EXPORT_SYMBOL_GPL(suspend_device_irqs);
+
+/**
+ * resume_device_irqs - enable interrupt lines disabled by suspend_device_irqs()
+ *
+ * Enable all interrupt lines previously disabled by suspend_device_irqs() that
+ * have the IRQ_SUSPENDED flag set.
+ */
+void resume_device_irqs(void)
+{
+	struct irq_desc *desc;
+	int irq;
+
+	for_each_irq_desc(irq, desc) {
+		unsigned long flags;
+
+		if (!(desc->status & IRQ_SUSPENDED))
+			continue;
+
+		spin_lock_irqsave(&desc->lock, flags);
+		desc->status &= ~IRQ_SUSPENDED;
+		__enable_irq(desc, irq);
+		spin_unlock_irqrestore(&desc->lock, flags);
+	}
+}
+EXPORT_SYMBOL_GPL(resume_device_irqs);
+
+/**
+ * check_wakeup_irqs - check if any wake-up interrupts are pending
+ */
+int check_wakeup_irqs(void)
+{
+	struct irq_desc *desc;
+	int irq;
+
+	for_each_irq_desc(irq, desc)
+		if ((desc->status & IRQ_WAKEUP) && (desc->status & IRQ_PENDING))
+			return -EBUSY;
+
+	return 0;
+}
Index: linux-2.6/kernel/irq/Makefile
===================================================================
--- linux-2.6.orig/kernel/irq/Makefile
+++ linux-2.6/kernel/irq/Makefile
@@ -4,3 +4,4 @@ obj-$(CONFIG_GENERIC_IRQ_PROBE) += autop
 obj-$(CONFIG_PROC_FS) += proc.o
 obj-$(CONFIG_GENERIC_PENDING_IRQ) += migration.o
 obj-$(CONFIG_NUMA_MIGRATE_IRQ_DESC) += numa_migrate.o
+obj-$(CONFIG_PM_SLEEP) += pm.o
Index: linux-2.6/kernel/irq/manage.c
===================================================================
--- linux-2.6.orig/kernel/irq/manage.c
+++ linux-2.6/kernel/irq/manage.c
@@ -215,7 +215,7 @@ void disable_irq(unsigned int irq)
 }
 EXPORT_SYMBOL(disable_irq);
 
-static void __enable_irq(struct irq_desc *desc, unsigned int irq)
+void __enable_irq(struct irq_desc *desc, unsigned int irq)
 {
 	switch (desc->depth) {
 	case 0:
Index: linux-2.6/drivers/base/power/main.c
===================================================================
--- linux-2.6.orig/drivers/base/power/main.c
+++ linux-2.6/drivers/base/power/main.c
@@ -23,6 +23,7 @@
 #include <linux/pm.h>
 #include <linux/resume-trace.h>
 #include <linux/rwsem.h>
+#include <linux/interrupt.h>
 
 #include "../base.h"
 #include "power.h"
@@ -305,7 +306,8 @@ static int resume_device_noirq(struct de
  * Execute the appropriate "noirq resume" callback for all devices marked
  * as DPM_OFF_IRQ.
  *
- * Must be called with interrupts disabled and only one CPU running.
+ * Must be called under dpm_list_mtx.  Device drivers should not receive
+ * interrupts while it's being executed.
  */
 static void dpm_power_up(pm_message_t state)
 {
@@ -326,14 +328,13 @@ static void dpm_power_up(pm_message_t st
  * device_power_up - Turn on all devices that need special attention.
  * @state: PM transition of the system being carried out.
  *
- * Power on system devices, then devices that required we shut them down
- * with interrupts disabled.
- *
- * Must be called with interrupts disabled.
+ * Call the "early" resume handlers and enable device drivers to receive
+ * interrupts.
  */
 void device_power_up(pm_message_t state)
 {
 	dpm_power_up(state);
+	resume_device_irqs();
 }
 EXPORT_SYMBOL_GPL(device_power_up);
 
@@ -558,16 +559,17 @@ static int suspend_device_noirq(struct d
  * device_power_down - Shut down special devices.
  * @state: PM transition of the system being carried out.
  *
- * Power down devices that require interrupts to be disabled.
- * Then power down system devices.
+ * Prevent device drivers from receiving interrupts and call the "late"
+ * suspend handlers.
  *
- * Must be called with interrupts disabled and only one CPU running.
+ * Must be called under dpm_list_mtx.
  */
 int device_power_down(pm_message_t state)
 {
 	struct device *dev;
 	int error = 0;
 
+	suspend_device_irqs();
 	list_for_each_entry_reverse(dev, &dpm_list, power.entry) {
 		error = suspend_device_noirq(dev, state);
 		if (error) {
@@ -577,7 +579,7 @@ int device_power_down(pm_message_t state
 		dev->power.status = DPM_OFF_IRQ;
 	}
 	if (error)
-		dpm_power_up(resume_event(state));
+		device_power_up(resume_event(state));
 	return error;
 }
 EXPORT_SYMBOL_GPL(device_power_down);
Index: linux-2.6/drivers/base/sys.c
===================================================================
--- linux-2.6.orig/drivers/base/sys.c
+++ linux-2.6/drivers/base/sys.c
@@ -22,6 +22,7 @@
 #include <linux/pm.h>
 #include <linux/device.h>
 #include <linux/mutex.h>
+#include <linux/interrupt.h>
 
 #include "base.h"
 
@@ -369,6 +370,13 @@ int sysdev_suspend(pm_message_t state)
 	struct sysdev_driver *drv, *err_drv;
 	int ret;
 
+	pr_debug("Checking wake-up interrupts\n");
+
+	/* Return error code if there are any wake-up interrupts pending */
+	ret = check_wakeup_irqs();
+	if (ret)
+		return ret;
+
 	pr_debug("Suspending System Devices\n");
 
 	list_for_each_entry_reverse(cls, &system_kset->list, kset.kobj.entry) {
Index: linux-2.6/kernel/irq/internals.h
===================================================================
--- linux-2.6.orig/kernel/irq/internals.h
+++ linux-2.6/kernel/irq/internals.h
@@ -12,6 +12,7 @@ extern void compat_irq_chip_set_default_
 
 extern int __irq_set_trigger(struct irq_desc *desc, unsigned int irq,
 		unsigned long flags);
+extern void __enable_irq(struct irq_desc *desc, unsigned int irq);
 
 extern struct lock_class_key irq_desc_lock_class;
 extern void init_kstat_irqs(struct irq_desc *desc, int cpu, int nr);

^ permalink raw reply	[flat|nested] 373+ messages in thread
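The ordering that the patch gives suspend_enter() is easier to see when the steps are logged one after another. The sketch below is a userspace model written under stated assumptions: every seq_ name is hypothetical, the step labels are invented for the trace, and the real device_power_down() stops at the first failing device rather than taking a single error argument.

```c
#include <assert.h>
#include <string.h>

#define SEQ_EBUSY 16	/* stand-in for the kernel's EBUSY */

static char seq_trace[512];

/* Append "name;" to the trace so the order of the steps can be inspected. */
static void seq_step(const char *name)
{
	assert(strlen(seq_trace) + strlen(name) + 2 <= sizeof(seq_trace));
	strcat(seq_trace, name);
	strcat(seq_trace, ";");
}

/* device_power_up(): "early" resume callbacks, then device IRQs back on. */
static void seq_device_power_up(void)
{
	seq_step("early_resume");
	seq_step("resume_device_irqs");
}

/* device_power_down(): device IRQs off first, then "late" suspend callbacks. */
static int seq_device_power_down(int late_error)
{
	seq_step("suspend_device_irqs");
	seq_step("late_suspend");
	if (late_error)
		seq_device_power_up();	/* roll back, as the patch does */
	return late_error;
}

/* sysdev_suspend(): refuse to go on if a wake-up interrupt is pending. */
static int seq_sysdev_suspend(int wakeup_pending)
{
	seq_step("check_wakeup_irqs");
	if (wakeup_pending)
		return -SEQ_EBUSY;
	seq_step("sysdev_suspend");
	return 0;
}

/* Mirrors the control flow of the patched suspend_enter(). */
static int seq_suspend_enter(int late_error, int wakeup_pending)
{
	int error;

	seq_trace[0] = '\0';

	error = seq_device_power_down(late_error);
	if (error)
		return error;

	seq_step("cpu_irqs_off");
	error = seq_sysdev_suspend(wakeup_pending);
	if (!error) {
		seq_step("enter");
		seq_step("sysdev_resume");
	}
	seq_step("cpu_irqs_on");

	seq_device_power_up();
	return error;
}
```

Calling seq_suspend_enter(0, 0) and reading seq_trace shows the CPU-level interrupt disable sitting strictly inside the window bracketed by suspend_device_irqs and resume_device_irqs, which is the point of the rework.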
* Re: [RFC][PATCH 1/4] PM: Rework handling of interrupts during suspend-resume (rev. 4) 2009-03-02 23:27 ` Rafael J. Wysocki 2009-03-03 22:56 ` Arve Hjønnevåg @ 2009-03-03 22:56 ` Arve Hjønnevåg 1 sibling, 0 replies; 373+ messages in thread From: Arve Hjønnevåg @ 2009-03-03 22:56 UTC (permalink / raw) To: Rafael J. Wysocki Cc: Jeremy Fitzhardinge, LKML, Jesse Barnes, Johannes Berg, Thomas Gleixner, Eric W. Biederman, Ingo Molnar, Linus Torvalds, pm list On Mon, Mar 2, 2009 at 3:27 PM, Rafael J. Wysocki <rjw@sisk.pl> wrote: > On Tuesday 03 March 2009, Arve Hjønnevåg wrote: >> On Mon, Mar 2, 2009 at 3:13 PM, Rafael J. Wysocki <rjw@sisk.pl> wrote: >> > On Tuesday 03 March 2009, Arve Hjønnevåg wrote: >> >> On Sun, Mar 1, 2009 at 2:24 PM, Rafael J. Wysocki <rjw@sisk.pl> wrote: >> >> > From: Rafael J. Wysocki <rjw@sisk.pl> >> >> > >> >> > Introduce two helper functions allowing us to prevent device drivers >> >> > from getting any interrupts (without disabling interrupts on the CPU) >> >> > during suspend (or hibernation) and to make them start to receive >> >> > interrupts again during the subsequent resume, respectively. These >> >> > functions make it possible to keep timer interrupts enabled while the >> >> > "late" suspend and "early" resume callbacks provided by device >> >> > drivers are being executed. >> >> > >> >> > Use these functions to rework the handling of interrupts during >> >> > suspend (hibernation) and resume. Namely, interrupts will only be >> >> > disabled on the CPU right before suspending sysdevs, while device >> >> > drivers will be prevented from receiving interrupts, with the help of >> >> > the new helper function, before their "late" suspend callbacks run >> >> > (and analogously during resume). 
>> >> > >> >> > In addition, since the device interrups are now disabled before the >> >> > CPU has turned all interrupts off and the CPU will ACK the interrupts >> >> > setting the IRQ_PENDING bit for them, check in sysdev_suspend() if >> >> > any wake-up interrupts are pending and abort suspend if that's the >> >> > case. >> >> > >> >> >> >> >> >> > +void resume_device_irqs(void) >> >> > +{ >> >> > + struct irq_desc *desc; >> >> > + int irq; >> >> > + >> >> > + for_each_irq_desc(irq, desc) >> >> > + if (desc->status & IRQ_SUSPENDED) >> >> > + enable_irq(irq); >> >> > +} >> >> >> >> I think you need to clear IRQ_SUSPENDED here, not in enable_irq. >> > >> > enable_irq() clears IRQ_SUSPENDED. This has already been discussed btw. >> > >> >> I'm if I missed that discussion, but enable_irq cannot know who is >> calling it and therefore cannot know if IRQ_SUSPENDED should be >> cleared. > > This change has been requested by Ingo and for a reason. > > Ingo, what's your opinion? > >> >> > @@ -222,8 +222,9 @@ static void __enable_irq(struct irq_desc >> >> > WARN(1, KERN_WARNING "Unbalanced enable for IRQ %d\n", irq); >> >> > break; >> >> > case 1: { >> >> > - unsigned int status = desc->status & ~IRQ_DISABLED; >> >> > + unsigned int status; >> >> > >> >> > + status = desc->status & ~(IRQ_DISABLED | IRQ_SUSPENDED); >> >> > /* Prevent probing on this irq: */ >> >> > desc->status = status | IRQ_NOPROBE; >> >> > check_irq_resend(desc, irq); >> >> >> >> This only clears IRQ_SUSPENDED if the interrupt was not disabled >> >> elsewhere. If a driver calls interrupt_disable in suspend_late, but >> >> calls interrupt_enable lazily, resume_device_irqs will reenable the >> >> interrupt even though the driver has a disable reference. >> > >> > Then I'd regard the driver as buggy. >> >> The bug is not in the driver. The driver called disable_irq once. You >> called disable_irq once, but enable_irq twice. > > Please. 
> > Can you show me a _single_ _driver_ currently in the tree doing something > like you describe in suspend_late and resume_early? If you can't, then please > give up. I don't know if any drivers call disable_irq or enable_irq in their suspend hooks, but your change also allows timers, and I assume kernel threads, to run during this phase. There are several drivers (keypad drivers in particular), in tree and out of tree, that call enable_irq from timers and disable_irq from their interrupt handlers. If you also apply your later change to disable non-boot CPUs after suspend_device_irqs, then on SMP systems the interrupt handler may run at the same time as suspend_device_irqs. If suspend_device_irqs gets the spinlock first, then IRQ_SUSPENDED gets set. If another suspend/resume cycle happens before the timer runs, you will incorrectly enable the interrupt. -- Arve Hjønnevåg ^ permalink raw reply [flat|nested] 373+ messages in thread
* Re: [RFC][PATCH 1/4] PM: Rework handling of interrupts during suspend-resume (rev. 4) 2009-03-02 23:18 ` Arve Hjønnevåg @ 2009-03-02 23:32 ` Linus Torvalds 2009-03-02 23:27 ` Rafael J. Wysocki 2009-03-02 23:32 ` Linus Torvalds 2 siblings, 0 replies; 373+ messages in thread From: Linus Torvalds @ 2009-03-02 23:32 UTC (permalink / raw) To: Arve Hjønnevåg Cc: Rafael J. Wysocki, LKML, Ingo Molnar, Eric W. Biederman, Benjamin Herrenschmidt, Jeremy Fitzhardinge, pm list, Len Brown, Jesse Barnes, Thomas Gleixner, Alan Stern, Johannes Berg On Mon, 2 Mar 2009, Arve Hjønnevåg wrote: > > > enable_irq() clears IRQ_SUSPENDED. This has already been discussed btw. > > I'm if I missed that discussion, but enable_irq cannot know who is > calling it and therefore cannot know if IRQ_SUSPENDED should be > cleared. Sure it can. If IRQ_SUSPENDED is not set, then clearing it is a no-op, so that's fine. If IRQ_SUSPENDED _is_ set, then that means that we're after the suspend_late() sequence and before the resume_early() sequence, and no device driver is possibly called in between, so they'd sure better not be doing anything that does an enable_irq(). IOW, we know who the caller is, simply because there can be no other valid caller! Linus ^ permalink raw reply [flat|nested] 373+ messages in thread
* Re: [RFC][PATCH 1/4] PM: Rework handling of interrupts during suspend-resume (rev. 4) 2009-03-02 23:32 ` Linus Torvalds @ 2009-03-02 23:35 ` Linus Torvalds -1 siblings, 0 replies; 373+ messages in thread From: Linus Torvalds @ 2009-03-02 23:35 UTC (permalink / raw) To: Arve Hjønnevåg Cc: Rafael J. Wysocki, LKML, Ingo Molnar, Eric W. Biederman, Benjamin Herrenschmidt, Jeremy Fitzhardinge, pm list, Len Brown, Jesse Barnes, Thomas Gleixner, Alan Stern, Johannes Berg On Mon, 2 Mar 2009, Linus Torvalds wrote: > > If IRQ_SUSPENDED _is_ set, then that means that we're after the > suspend_late() sequence and before the resume_early() sequence Sorry, after the suspend, and before the resume. We could be _in_ the suspend_late/resume_early sequence, but a driver that were to try to play with interrupts at that stage would be broken. It can't very well do an enable_irq(), because that would be a MAJOR BUG - it would make the whole irq suspend thing pointless, since now interrupts would start to happen exactly where they must not happen! Linus ^ permalink raw reply [flat|nested] 373+ messages in thread
* Re: [RFC][PATCH 1/4] PM: Rework handling of interrupts during suspend-resume (rev. 4) 2009-03-02 23:35 ` Linus Torvalds (?) @ 2009-03-03 0:08 ` Arve Hjønnevåg -1 siblings, 0 replies; 373+ messages in thread From: Arve Hjønnevåg @ 2009-03-03 0:08 UTC (permalink / raw) To: Linus Torvalds Cc: Jeremy Fitzhardinge, LKML, Jesse Barnes, Johannes Berg, Thomas Gleixner, Eric W. Biederman, Ingo Molnar, pm list On Mon, Mar 2, 2009 at 3:35 PM, Linus Torvalds <torvalds@linux-foundation.org> wrote: > > > On Mon, 2 Mar 2009, Linus Torvalds wrote: >> >> If IRQ_SUSPENDED _is_ set, then that means that we're after the >> suspend_late() sequence and before the resume_early() sequence > > Sorry, after the suspend, and before the resume. > > We could be _in_ the suspend_late/resume_early sequence, but a driver that > were to try to play with interrupts at that stage would be broken. It > can't very well do a enable_irq(), because that would be a MAJOR BUG - it > would make the whole irq suspend thing pointless, since now interrupts > would start to happen exactly where they must not happen! It may be pointless for a driver to call disable_irq and enable_irq from suspend_late or resume_early (instead of suspend and resume), but I would not call it a bug. Since disable_irq and enable_irq are reference counted all this is doing is indicating that this driver can or cannot accept interrupts. If you want to make an additional restriction that drivers are not allowed to call disable_irq or enable_irq from suspend_late and resume_early, then yes you can tell that enable_irq was called from resume_device_irqs. I don't know of any drivers that do this, I was just pointing out the danger of releasing a reference without knowing if you acquired that reference. -- Arve Hjønnevåg ^ permalink raw reply [flat|nested] 373+ messages in thread
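The concern about releasing a reference that was never acquired can be traced with a small userspace model. This is only a sketch under stated assumptions: `struct model_irq` and every `model_*` function are hypothetical stand-ins, not kernel code, and they copy just the depth counting and the flag handling visible in the posted `__enable_irq()`, `suspend_device_irqs()` and `resume_device_irqs()` hunks. Locking, the chip callbacks and `IRQ_NOPROBE` are left out.

```c
#include <assert.h>
#include <stdbool.h>

/* Arbitrary bit values for the model; the real flags live in linux/irq.h. */
#define IRQ_DISABLED  0x1u
#define IRQ_SUSPENDED 0x2u

/* Hypothetical stand-in for the few struct irq_desc fields that matter here. */
struct model_irq {
	unsigned int depth;   /* nesting count of outstanding disables */
	unsigned int status;  /* IRQ_DISABLED / IRQ_SUSPENDED bits */
	int unbalanced;       /* number of "Unbalanced enable" events */
};

/* Nested disable: every call takes one reference. */
static void model_disable_irq(struct model_irq *d)
{
	if (!d->depth++)
		d->status |= IRQ_DISABLED;
}

/* Follows the patched __enable_irq(): flags are cleared only at depth 1. */
static void model_enable_irq(struct model_irq *d)
{
	switch (d->depth) {
	case 0:
		d->unbalanced++;
		break;
	case 1:
		d->status &= ~(IRQ_DISABLED | IRQ_SUSPENDED);
		/* fall through */
	default:
		d->depth--;
	}
}

/* Follows the posted suspend_device_irqs(): only lines with depth 0 are touched. */
static void model_suspend_device_irqs(struct model_irq *d)
{
	if (!d->depth) {
		d->depth++;
		d->status |= IRQ_DISABLED | IRQ_SUSPENDED;
	}
}

/* Follows the posted resume_device_irqs(): enable whatever still carries the flag. */
static void model_resume_device_irqs(struct model_irq *d)
{
	if (d->status & IRQ_SUSPENDED)
		model_enable_irq(d);
}

/*
 * A driver takes its own disable reference after suspend_device_irqs() and
 * keeps it across two suspend/resume cycles. Returns true if the line ends
 * up enabled although the driver never released that reference.
 */
static bool model_stale_flag_scenario(struct model_irq *d)
{
	model_suspend_device_irqs(d);  /* depth 1, IRQ_SUSPENDED set */
	model_disable_irq(d);          /* driver's reference: depth 2 */
	model_resume_device_irqs(d);   /* depth 1, flag still set */

	model_suspend_device_irqs(d);  /* second cycle: skipped, depth != 0 */
	model_resume_device_irqs(d);   /* stale flag: depth 0, line enabled */

	return d->depth == 0;
}
```

In this model the scenario returns true: the first resume only lowers depth from 2 to 1 and leaves IRQ_SUSPENDED set, so the second resume performs an enable that no suspend paid for. When the driver later issues its own enable, it lands in the depth == 0 branch, which the model counts in `unbalanced`.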
* Re: [RFC][PATCH 1/4] PM: Rework handling of interrupts during suspend-resume (rev. 4) 2009-03-03 0:08 ` Arve Hjønnevåg @ 2009-03-03 8:41 ` Arve Hjønnevåg 2009-03-03 8:41 ` Arve Hjønnevåg 1 sibling, 0 replies; 373+ messages in thread From: Arve Hjønnevåg @ 2009-03-03 8:41 UTC (permalink / raw) To: Linus Torvalds Cc: Jeremy Fitzhardinge, LKML, Jesse Barnes, Johannes Berg, Thomas Gleixner, Eric W. Biederman, Ingo Molnar, pm list On Mon, Mar 2, 2009 at 4:08 PM, Arve Hjønnevåg <arve@android.com> wrote: > On Mon, Mar 2, 2009 at 3:35 PM, Linus Torvalds > <torvalds@linux-foundation.org> wrote: >> >> >> On Mon, 2 Mar 2009, Linus Torvalds wrote: >>> >>> If IRQ_SUSPENDED _is_ set, then that means that we're after the >>> suspend_late() sequence and before the resume_early() sequence >> >> Sorry, after the suspend, and before the resume. >> >> We could be _in_ the suspend_late/resume_early sequence, but a driver that >> were to try to play with interrupts at that stage would be broken. It >> can't very well do a enable_irq(), because that would be a MAJOR BUG - it >> would make the whole irq suspend thing pointless, since now interrupts >> would start to happen exactly where they must not happen! > > It may be pointless for a driver to call disable_irq and enable_irq > from suspend_late or resume_early (instead of suspend and resume), but > I would not call it a bug. Since disable_irq and enable_irq are > reference counted all this is doing is indicating that this driver can > or cannot accept interrupts. If you want to make an additional > restriction that drivers are not allowed to call disable_irq or > enable_irq from suspend_late and resume_early, then yes you can tell > that enable_irq was called from resume_device_irqs. > > I don't know of any drivers that do this, I was just pointing out the > danger of releasing a reference without knowing if you acquired that > reference. I did think of a driver that can call enable_irq during the suspend_late phase with this patch. 
This will not cause an extra enable_irq, but it will enable the interrupt since suspend_device_irqs never incremented depth. Our keypad driver disables its interrupt(s) as soon as you press a key and starts a timer to scan the keypad. When the timer detects that no keys are pressed, it re-enables the interrupt. Since timers now run during suspend_late, this enable_irq call can happen after suspend_device_irqs. If suspend_device_irqs increments depth even if it is not zero, this can be avoided. -- Arve Hjønnevåg ^ permalink raw reply [flat|nested] 373+ messages in thread
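The keypad case above can be sketched the same way. The following is a toy userspace model, not kernel code: `struct kp_irq`, the `kp_*` functions and the `take_ref_always` switch are all hypothetical names introduced for illustration. With `take_ref_always` set to false the model follows the posted `suspend_device_irqs()`, which skips lines whose depth is already non-zero; with true it follows the suggestion of incrementing depth unconditionally. Locking and the chip-level disable are ignored.

```c
#include <assert.h>
#include <stdbool.h>

/* Arbitrary bit values for the model; the real flags live in linux/irq.h. */
#define IRQ_DISABLED  0x1u
#define IRQ_SUSPENDED 0x2u

/* Hypothetical stand-in for the irq_desc fields used in the discussion. */
struct kp_irq {
	unsigned int depth;   /* nesting count of outstanding disables */
	unsigned int status;  /* IRQ_DISABLED / IRQ_SUSPENDED bits */
};

static void kp_disable_irq(struct kp_irq *d)
{
	if (!d->depth++)
		d->status |= IRQ_DISABLED;
}

/* Follows the patched __enable_irq(): flags are cleared only at depth 1. */
static void kp_enable_irq(struct kp_irq *d)
{
	switch (d->depth) {
	case 0:
		break;  /* unbalanced enable, ignored in this model */
	case 1:
		d->status &= ~(IRQ_DISABLED | IRQ_SUSPENDED);
		/* fall through */
	default:
		d->depth--;
	}
}

/* false: behaviour of the posted patch; true: hypothetical unconditional variant. */
static void kp_suspend_device_irqs(struct kp_irq *d, bool take_ref_always)
{
	if (!d->depth || take_ref_always) {
		d->depth++;
		d->status |= IRQ_DISABLED | IRQ_SUSPENDED;
	}
}

static void kp_resume_device_irqs(struct kp_irq *d)
{
	if (d->status & IRQ_SUSPENDED)
		kp_enable_irq(d);
}

/*
 * Returns true if the keypad line becomes enabled between
 * suspend_device_irqs() and resume_device_irqs().
 */
static bool kp_enabled_during_late_suspend(bool take_ref_always)
{
	struct kp_irq d = { 0, 0 };
	bool leaked;

	kp_disable_irq(&d);                           /* key pressed: handler disables the line */
	kp_suspend_device_irqs(&d, take_ref_always);  /* suspend path runs */
	kp_enable_irq(&d);                            /* scan timer fires during "late" suspend */
	leaked = (d.depth == 0);
	kp_resume_device_irqs(&d);                    /* resume path runs */

	return leaked;
}
```

With the posted behaviour the function returns true, because the timer's enable drops depth from 1 to 0 while device interrupts are supposed to be off. With the unconditional variant it returns false: the timer's enable only lowers depth from 2 to 1, and the resume path performs the final enable. The model says nothing about whether the variant is otherwise safe.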
* Re: [RFC][PATCH 1/4] PM: Rework handling of interrupts during suspend-resume (rev. 4) 2009-03-02 23:01 ` Arve Hjønnevåg 2009-03-02 23:13 ` Rafael J. Wysocki @ 2009-03-02 23:13 ` Rafael J. Wysocki 1 sibling, 0 replies; 373+ messages in thread From: Rafael J. Wysocki @ 2009-03-02 23:13 UTC (permalink / raw) To: Arve Hjønnevåg Cc: Jeremy Fitzhardinge, LKML, Jesse Barnes, Johannes Berg, Thomas Gleixner, Eric W. Biederman, Ingo Molnar, Linus Torvalds, pm list On Tuesday 03 March 2009, Arve Hjønnevåg wrote: > On Sun, Mar 1, 2009 at 2:24 PM, Rafael J. Wysocki <rjw@sisk.pl> wrote: > > From: Rafael J. Wysocki <rjw@sisk.pl> > > > > Introduce two helper functions allowing us to prevent device drivers > > from getting any interrupts (without disabling interrupts on the CPU) > > during suspend (or hibernation) and to make them start to receive > > interrupts again during the subsequent resume, respectively. These > > functions make it possible to keep timer interrupts enabled while the > > "late" suspend and "early" resume callbacks provided by device > > drivers are being executed. > > > > Use these functions to rework the handling of interrupts during > > suspend (hibernation) and resume. Namely, interrupts will only be > > disabled on the CPU right before suspending sysdevs, while device > > drivers will be prevented from receiving interrupts, with the help of > > the new helper function, before their "late" suspend callbacks run > > (and analogously during resume). > > > > In addition, since the device interrups are now disabled before the > > CPU has turned all interrupts off and the CPU will ACK the interrupts > > setting the IRQ_PENDING bit for them, check in sysdev_suspend() if > > any wake-up interrupts are pending and abort suspend if that's the > > case. 
> > > > > > +void resume_device_irqs(void) > > +{ > > + struct irq_desc *desc; > > + int irq; > > + > > + for_each_irq_desc(irq, desc) > > + if (desc->status & IRQ_SUSPENDED) > > + enable_irq(irq); > > +} > > I think you need to clear IRQ_SUSPENDED here, not in enable_irq. enable_irq() clears IRQ_SUSPENDED. This has already been discussed btw. > > @@ -222,8 +222,9 @@ static void __enable_irq(struct irq_desc > > WARN(1, KERN_WARNING "Unbalanced enable for IRQ %d\n", irq); > > break; > > case 1: { > > - unsigned int status = desc->status & ~IRQ_DISABLED; > > + unsigned int status; > > > > + status = desc->status & ~(IRQ_DISABLED | IRQ_SUSPENDED); > > /* Prevent probing on this irq: */ > > desc->status = status | IRQ_NOPROBE; > > check_irq_resend(desc, irq); > > This only clears IRQ_SUSPENDED if the interrupt was not disabled > elsewhere. If a driver calls interrupt_disable in suspend_late, but > calls interrupt_enable lazily, resume_device_irqs will reenable the > interrupt even though the driver has a disable reference. Then I'd regard the driver as buggy. > The rest of the patch looks good. I'm glad you like it. Thanks, Rafael ^ permalink raw reply [flat|nested] 373+ messages in thread
* [RFC][PATCH 1/4] PM: Rework handling of interrupts during suspend-resume (rev. 4) 2009-03-01 22:21 ` Rafael J. Wysocki 2009-03-01 22:24 ` [RFC][PATCH 1/4] PM: Rework handling of interrupts during suspend-resume (rev. 4) Rafael J. Wysocki @ 2009-03-01 22:24 ` Rafael J. Wysocki 2009-03-01 22:25 ` [RFC][PATCH 2/4] PM: Change suspend code ordering Rafael J. Wysocki ` (6 subsequent siblings) 8 siblings, 0 replies; 373+ messages in thread From: Rafael J. Wysocki @ 2009-03-01 22:24 UTC (permalink / raw) To: LKML Cc: Arve, Jeremy Fitzhardinge, Jesse Barnes, Johannes Berg, Thomas Gleixner, Eric W. Biederman, Ingo Molnar, Linus Torvalds, pm list From: Rafael J. Wysocki <rjw@sisk.pl> Introduce two helper functions allowing us to prevent device drivers from getting any interrupts (without disabling interrupts on the CPU) during suspend (or hibernation) and to make them start to receive interrupts again during the subsequent resume, respectively. These functions make it possible to keep timer interrupts enabled while the "late" suspend and "early" resume callbacks provided by device drivers are being executed. Use these functions to rework the handling of interrupts during suspend (hibernation) and resume. Namely, interrupts will only be disabled on the CPU right before suspending sysdevs, while device drivers will be prevented from receiving interrupts, with the help of the new helper function, before their "late" suspend callbacks run (and analogously during resume). In addition, since the device interrupts are now disabled before the CPU has turned all interrupts off and the CPU will ACK the interrupts, setting the IRQ_PENDING bit for them, check in sysdev_suspend() if any wake-up interrupts are pending and abort suspend if that's the case. Signed-off-by: Rafael J.
Wysocki <rjw@sisk.pl> --- arch/x86/kernel/apm_32.c | 15 ++++++-- drivers/base/power/main.c | 20 ++++++----- drivers/base/sys.c | 8 ++++ drivers/xen/manage.c | 16 +++++---- include/linux/interrupt.h | 5 ++ include/linux/irq.h | 1 kernel/irq/Makefile | 1 kernel/irq/manage.c | 3 + kernel/irq/pm.c | 78 ++++++++++++++++++++++++++++++++++++++++++++++ kernel/kexec.c | 8 ++-- kernel/power/disk.c | 39 +++++++++++++++++------ kernel/power/main.c | 17 ++++++---- 12 files changed, 170 insertions(+), 41 deletions(-) Index: linux-2.6/include/linux/interrupt.h =================================================================== --- linux-2.6.orig/include/linux/interrupt.h +++ linux-2.6/include/linux/interrupt.h @@ -106,6 +106,11 @@ extern void disable_irq_nosync(unsigned extern void disable_irq(unsigned int irq); extern void enable_irq(unsigned int irq); +/* The following three functions are for the core kernel use only. */ +extern void suspend_device_irqs(void); +extern void resume_device_irqs(void); +extern int check_wakeup_irqs(void); + #if defined(CONFIG_SMP) && defined(CONFIG_GENERIC_HARDIRQS) extern cpumask_var_t irq_default_affinity; Index: linux-2.6/kernel/power/main.c =================================================================== --- linux-2.6.orig/kernel/power/main.c +++ linux-2.6/kernel/power/main.c @@ -287,17 +287,19 @@ void __attribute__ ((weak)) arch_suspend */ static int suspend_enter(suspend_state_t state) { - int error = 0; + int error; device_pm_lock(); - arch_suspend_disable_irqs(); - BUG_ON(!irqs_disabled()); - if ((error = device_power_down(PMSG_SUSPEND))) { + error = device_power_down(PMSG_SUSPEND); + if (error) { printk(KERN_ERR "PM: Some devices failed to power down\n"); goto Done; } + arch_suspend_disable_irqs(); + BUG_ON(!irqs_disabled()); + error = sysdev_suspend(PMSG_SUSPEND); if (!error) { if (!suspend_test(TEST_CORE)) @@ -305,11 +307,14 @@ static int suspend_enter(suspend_state_t sysdev_resume(); } - device_power_up(PMSG_RESUME); - Done: 
arch_suspend_enable_irqs(); BUG_ON(irqs_disabled()); + + device_power_up(PMSG_RESUME); + + Done: device_pm_unlock(); + return error; } Index: linux-2.6/kernel/power/disk.c =================================================================== --- linux-2.6.orig/kernel/power/disk.c +++ linux-2.6/kernel/power/disk.c @@ -214,7 +214,7 @@ static int create_image(int platform_mod return error; device_pm_lock(); - local_irq_disable(); + /* At this point, device_suspend() has been called, but *not* * device_power_down(). We *must* call device_power_down() now. * Otherwise, drivers for some devices (e.g. interrupt controllers) @@ -225,8 +225,11 @@ static int create_image(int platform_mod if (error) { printk(KERN_ERR "PM: Some devices failed to power down, " "aborting hibernation\n"); - goto Enable_irqs; + goto Unlock; } + + local_irq_disable(); + sysdev_suspend(PMSG_FREEZE); if (error) { printk(KERN_ERR "PM: Some devices failed to power down, " @@ -252,12 +255,16 @@ static int create_image(int platform_mod /* NOTE: device_power_up() is just a resume() for devices * that suspended with irqs off ... no overall powerup. */ + Power_up_devices: + local_irq_enable(); + device_power_up(in_suspend ? (error ? 
PMSG_RECOVER : PMSG_THAW) : PMSG_RESTORE); - Enable_irqs: - local_irq_enable(); + + Unlock: device_pm_unlock(); + return error; } @@ -336,13 +343,16 @@ static int resume_target_kernel(void) int error; device_pm_lock(); - local_irq_disable(); + error = device_power_down(PMSG_QUIESCE); if (error) { printk(KERN_ERR "PM: Some devices failed to power down, " "aborting resume\n"); - goto Enable_irqs; + goto Unlock; } + + local_irq_disable(); + sysdev_suspend(PMSG_QUIESCE); /* We'll ignore saved state, but this gets preempt count (etc) right */ save_processor_state(); @@ -366,11 +376,16 @@ static int resume_target_kernel(void) swsusp_free(); restore_processor_state(); touch_softlockup_watchdog(); + sysdev_resume(); - device_power_up(PMSG_RECOVER); - Enable_irqs: + local_irq_enable(); + + device_power_up(PMSG_RECOVER); + + Unlock: device_pm_unlock(); + return error; } @@ -447,15 +462,16 @@ int hibernation_platform_enter(void) goto Finish; device_pm_lock(); - local_irq_disable(); + error = device_power_down(PMSG_HIBERNATE); if (!error) { + local_irq_disable(); sysdev_suspend(PMSG_HIBERNATE); hibernation_ops->enter(); /* We should never get here */ while (1); } - local_irq_enable(); + device_pm_unlock(); /* @@ -464,12 +480,15 @@ int hibernation_platform_enter(void) */ Finish: hibernation_ops->finish(); + Resume_devices: entering_platform_hibernation = false; device_resume(PMSG_RESTORE); resume_console(); + Close: hibernation_ops->end(); + return error; } Index: linux-2.6/arch/x86/kernel/apm_32.c =================================================================== --- linux-2.6.orig/arch/x86/kernel/apm_32.c +++ linux-2.6/arch/x86/kernel/apm_32.c @@ -1190,8 +1190,10 @@ static int suspend(int vetoable) struct apm_user *as; device_suspend(PMSG_SUSPEND); - local_irq_disable(); + device_power_down(PMSG_SUSPEND); + + local_irq_disable(); sysdev_suspend(PMSG_SUSPEND); local_irq_enable(); @@ -1209,9 +1211,12 @@ static int suspend(int vetoable) if (err != APM_SUCCESS) 
apm_error("suspend", err); err = (err == APM_SUCCESS) ? 0 : -EIO; + sysdev_resume(); - device_power_up(PMSG_RESUME); local_irq_enable(); + + device_power_up(PMSG_RESUME); + device_resume(PMSG_RESUME); queue_event(APM_NORMAL_RESUME, NULL); spin_lock(&user_list_lock); @@ -1228,8 +1233,9 @@ static void standby(void) { int err; - local_irq_disable(); device_power_down(PMSG_SUSPEND); + + local_irq_disable(); sysdev_suspend(PMSG_SUSPEND); local_irq_enable(); @@ -1239,8 +1245,9 @@ static void standby(void) local_irq_disable(); sysdev_resume(); - device_power_up(PMSG_RESUME); local_irq_enable(); + + device_power_up(PMSG_RESUME); } static apm_event_t get_event(void) Index: linux-2.6/drivers/xen/manage.c =================================================================== --- linux-2.6.orig/drivers/xen/manage.c +++ linux-2.6/drivers/xen/manage.c @@ -39,12 +39,6 @@ static int xen_suspend(void *data) BUG_ON(!irqs_disabled()); - err = device_power_down(PMSG_SUSPEND); - if (err) { - printk(KERN_ERR "xen_suspend: device_power_down failed: %d\n", - err); - return err; - } err = sysdev_suspend(PMSG_SUSPEND); if (err) { printk(KERN_ERR "xen_suspend: sysdev_suspend failed: %d\n", @@ -69,7 +63,6 @@ static int xen_suspend(void *data) xen_mm_unpin_all(); sysdev_resume(); - device_power_up(PMSG_RESUME); if (!*cancelled) { xen_irq_resume(); @@ -108,6 +101,12 @@ static void do_suspend(void) /* XXX use normal device tree? 
*/ xenbus_suspend(); + err = device_power_down(PMSG_SUSPEND); + if (err) { + printk(KERN_ERR "device_power_down failed: %d\n", err); + goto resume_devices; + } + err = stop_machine(xen_suspend, &cancelled, &cpumask_of_cpu(0)); if (err) { printk(KERN_ERR "failed to start xen_suspend: %d\n", err); @@ -120,6 +119,9 @@ static void do_suspend(void) } else xenbus_suspend_cancel(); + device_power_up(PMSG_RESUME); + +resume_devices: device_resume(PMSG_RESUME); /* Make sure timer events get retriggered on all CPUs */ Index: linux-2.6/kernel/kexec.c =================================================================== --- linux-2.6.orig/kernel/kexec.c +++ linux-2.6/kernel/kexec.c @@ -1454,7 +1454,6 @@ int kernel_kexec(void) if (error) goto Resume_devices; device_pm_lock(); - local_irq_disable(); /* At this point, device_suspend() has been called, * but *not* device_power_down(). We *must* * device_power_down() now. Otherwise, drivers for @@ -1464,8 +1463,9 @@ int kernel_kexec(void) */ error = device_power_down(PMSG_FREEZE); if (error) - goto Enable_irqs; + goto Unlock_pm; + local_irq_disable(); /* Suspend system devices */ error = sysdev_suspend(PMSG_FREEZE); if (error) @@ -1484,9 +1484,9 @@ int kernel_kexec(void) if (kexec_image->preserve_context) { sysdev_resume(); Power_up_devices: - device_power_up(PMSG_RESTORE); - Enable_irqs: local_irq_enable(); + device_power_up(PMSG_RESTORE); + Unlock_pm: device_pm_unlock(); enable_nonboot_cpus(); Resume_devices: Index: linux-2.6/include/linux/irq.h =================================================================== --- linux-2.6.orig/include/linux/irq.h +++ linux-2.6/include/linux/irq.h @@ -65,6 +65,7 @@ typedef void (*irq_flow_handler_t)(unsig #define IRQ_SPURIOUS_DISABLED 0x00800000 /* IRQ was disabled by the spurious trap */ #define IRQ_MOVE_PCNTXT 0x01000000 /* IRQ migration from process context */ #define IRQ_AFFINITY_SET 0x02000000 /* IRQ affinity was set from userspace*/ +#define IRQ_SUSPENDED 0x04000000 /* IRQ has gone through 
suspend sequence */
 
 #ifdef CONFIG_IRQ_PER_CPU
 # define CHECK_IRQ_PER_CPU(var) ((var) & IRQ_PER_CPU)
Index: linux-2.6/kernel/irq/pm.c
===================================================================
--- /dev/null
+++ linux-2.6/kernel/irq/pm.c
@@ -0,0 +1,78 @@
+/*
+ * linux/kernel/irq/pm.c
+ *
+ * Copyright (C) 2009 Rafael J. Wysocki <rjw@sisk.pl>, Novell Inc.
+ *
+ * This file contains power management functions related to interrupts.
+ */
+
+#include <linux/irq.h>
+#include <linux/module.h>
+#include <linux/interrupt.h>
+
+/**
+ * suspend_device_irqs - disable all currently enabled interrupt lines
+ *
+ * During system-wide suspend or hibernation device interrupts need to be
+ * disabled at the chip level and this function is provided for this purpose.
+ * It disables all interrupt lines that are enabled at the moment and sets the
+ * IRQ_SUSPENDED flag for them.
+ */
+void suspend_device_irqs(void)
+{
+	struct irq_desc *desc;
+	int irq;
+
+	for_each_irq_desc(irq, desc) {
+		unsigned long flags;
+
+		spin_lock_irqsave(&desc->lock, flags);
+
+		if (!desc->depth && desc->action
+		    && !(desc->action->flags & IRQF_TIMER)) {
+			desc->depth++;
+			desc->status |= IRQ_DISABLED | IRQ_SUSPENDED;
+			desc->chip->disable(irq);
+		}
+
+		spin_unlock_irqrestore(&desc->lock, flags);
+	}
+
+	for_each_irq_desc(irq, desc) {
+		if (desc->status & IRQ_SUSPENDED)
+			synchronize_irq(irq);
+	}
+}
+EXPORT_SYMBOL_GPL(suspend_device_irqs);
+
+/**
+ * resume_device_irqs - enable interrupt lines disabled by suspend_device_irqs()
+ *
+ * Enable all interrupt lines previously disabled by suspend_device_irqs() that
+ * have the IRQ_SUSPENDED flag set.
+ */
+void resume_device_irqs(void)
+{
+	struct irq_desc *desc;
+	int irq;
+
+	for_each_irq_desc(irq, desc)
+		if (desc->status & IRQ_SUSPENDED)
+			enable_irq(irq);
+}
+EXPORT_SYMBOL_GPL(resume_device_irqs);
+
+/**
+ * check_wakeup_irqs - check if any wake-up interrupts are pending
+ */
+int check_wakeup_irqs(void)
+{
+	struct irq_desc *desc;
+	int irq;
+
+	for_each_irq_desc(irq, desc)
+		if ((desc->status & IRQ_WAKEUP) && (desc->status & IRQ_PENDING))
+			return -EBUSY;
+
+	return 0;
+}
Index: linux-2.6/kernel/irq/Makefile
===================================================================
--- linux-2.6.orig/kernel/irq/Makefile
+++ linux-2.6/kernel/irq/Makefile
@@ -4,3 +4,4 @@ obj-$(CONFIG_GENERIC_IRQ_PROBE) += autop
 obj-$(CONFIG_PROC_FS) += proc.o
 obj-$(CONFIG_GENERIC_PENDING_IRQ) += migration.o
 obj-$(CONFIG_NUMA_MIGRATE_IRQ_DESC) += numa_migrate.o
+obj-$(CONFIG_PM_SLEEP) += pm.o
Index: linux-2.6/kernel/irq/manage.c
===================================================================
--- linux-2.6.orig/kernel/irq/manage.c
+++ linux-2.6/kernel/irq/manage.c
@@ -222,8 +222,9 @@ static void __enable_irq(struct irq_desc
 		WARN(1, KERN_WARNING "Unbalanced enable for IRQ %d\n", irq);
 		break;
 	case 1: {
-		unsigned int status = desc->status & ~IRQ_DISABLED;
+		unsigned int status;
 
+		status = desc->status & ~(IRQ_DISABLED | IRQ_SUSPENDED);
 		/* Prevent probing on this irq: */
 		desc->status = status | IRQ_NOPROBE;
 		check_irq_resend(desc, irq);
Index: linux-2.6/drivers/base/power/main.c
===================================================================
--- linux-2.6.orig/drivers/base/power/main.c
+++ linux-2.6/drivers/base/power/main.c
@@ -23,6 +23,7 @@
 #include <linux/pm.h>
 #include <linux/resume-trace.h>
 #include <linux/rwsem.h>
+#include <linux/interrupt.h>
 
 #include "../base.h"
 #include "power.h"
@@ -305,7 +306,8 @@ static int resume_device_noirq(struct de
  *	Execute the appropriate "noirq resume" callback for all devices marked
  *	as DPM_OFF_IRQ.
  *
- *	Must be called with interrupts disabled and only one CPU running.
+ *	Must be called under dpm_list_mtx.  Device drivers should not receive
+ *	interrupts while it's being executed.
  */
 static void dpm_power_up(pm_message_t state)
 {
@@ -326,14 +328,13 @@ static void dpm_power_up(pm_message_t st
  *	device_power_up - Turn on all devices that need special attention.
  *	@state: PM transition of the system being carried out.
  *
- *	Power on system devices, then devices that required we shut them down
- *	with interrupts disabled.
- *
- *	Must be called with interrupts disabled.
+ *	Call the "early" resume handlers and enable device drivers to receive
+ *	interrupts.
  */
 void device_power_up(pm_message_t state)
 {
 	dpm_power_up(state);
+	resume_device_irqs();
 }
 EXPORT_SYMBOL_GPL(device_power_up);
 
@@ -558,16 +559,17 @@ static int suspend_device_noirq(struct d
  *	device_power_down - Shut down special devices.
  *	@state: PM transition of the system being carried out.
  *
- *	Power down devices that require interrupts to be disabled.
- *	Then power down system devices.
+ *	Prevent device drivers from receiving interrupts and call the "late"
+ *	suspend handlers.
  *
- *	Must be called with interrupts disabled and only one CPU running.
+ *	Must be called under dpm_list_mtx.
  */
 int device_power_down(pm_message_t state)
 {
 	struct device *dev;
 	int error = 0;
 
+	suspend_device_irqs();
 	list_for_each_entry_reverse(dev, &dpm_list, power.entry) {
 		error = suspend_device_noirq(dev, state);
 		if (error) {
@@ -577,7 +579,7 @@ int device_power_down(pm_message_t state
 		dev->power.status = DPM_OFF_IRQ;
 	}
 	if (error)
-		dpm_power_up(resume_event(state));
+		device_power_up(resume_event(state));
 	return error;
 }
 EXPORT_SYMBOL_GPL(device_power_down);
Index: linux-2.6/drivers/base/sys.c
===================================================================
--- linux-2.6.orig/drivers/base/sys.c
+++ linux-2.6/drivers/base/sys.c
@@ -22,6 +22,7 @@
 #include <linux/pm.h>
 #include <linux/device.h>
 #include <linux/mutex.h>
+#include <linux/interrupt.h>
 
 #include "base.h"
 
@@ -369,6 +370,13 @@ int sysdev_suspend(pm_message_t state)
 	struct sysdev_driver *drv, *err_drv;
 	int ret;
 
+	pr_debug("Checking wake-up interrupts\n");
+
+	/* Return error code if there are any wake-up interrupts pending */
+	ret = check_wakeup_irqs();
+	if (ret)
+		return ret;
+
 	pr_debug("Suspending System Devices\n");
 
 	list_for_each_entry_reverse(cls, &system_kset->list, kset.kobj.entry) {

^ permalink raw reply	[flat|nested] 373+ messages in thread
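The bookkeeping done by suspend_device_irqs(), resume_device_irqs() and check_wakeup_irqs() in the patch above can be tried out in user space. What follows is only a minimal sketch under stated assumptions: struct model_irq, the M_IRQ_* flag values and the model_*() helpers are hypothetical stand-ins invented for illustration, there is no locking and no hardware access, and the resume path folds the relevant part of __enable_irq() into a single function.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical flag values for this model only; not the kernel's. */
enum {
	M_IRQ_DISABLED  = 1u << 0,
	M_IRQ_SUSPENDED = 1u << 1,
	M_IRQ_WAKEUP    = 1u << 2,
	M_IRQ_PENDING   = 1u << 3,
};

/* Hypothetical, heavily reduced stand-in for struct irq_desc. */
struct model_irq {
	unsigned int depth;	/* nested disable count; 0 means enabled */
	unsigned int status;	/* M_IRQ_* flags */
	bool has_action;	/* a handler is installed */
	bool is_timer;		/* stands in for IRQF_TIMER */
};

/* Disable every line that is enabled, has a handler and is not a timer. */
void model_suspend_irqs(struct model_irq *irqs, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		struct model_irq *d = &irqs[i];

		if (d->depth == 0 && d->has_action && !d->is_timer) {
			d->depth++;
			d->status |= M_IRQ_DISABLED | M_IRQ_SUSPENDED;
		}
	}
}

/* Re-enable only the lines that the suspend step marked. */
void model_resume_irqs(struct model_irq *irqs, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		struct model_irq *d = &irqs[i];

		if (!(d->status & M_IRQ_SUSPENDED))
			continue;
		/* Like "case 1" in __enable_irq(): last enable clears both flags. */
		if (d->depth == 1)
			d->status &= ~(M_IRQ_DISABLED | M_IRQ_SUSPENDED);
		if (d->depth > 0)
			d->depth--;
	}
}

/* Nonzero if a wake-up line has an interrupt pending (the patch uses -EBUSY). */
int model_check_wakeup(const struct model_irq *irqs, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		unsigned int s = irqs[i].status;

		if ((s & M_IRQ_WAKEUP) && (s & M_IRQ_PENDING))
			return -1;
	}
	return 0;
}
```

The point of the IRQ_SUSPENDED flag is visible in the model: a line that a driver had already disabled before suspend (depth 1, no SUSPENDED flag) is left alone by both passes, so resume does not enable it behind the driver's back.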
* [RFC][PATCH 2/4] PM: Change suspend code ordering
  2009-03-01 22:21 ` Rafael J. Wysocki
  2009-03-01 22:24   ` [RFC][PATCH 1/4] PM: Rework handling of interrupts during suspend-resume (rev. 4) Rafael J. Wysocki
  2009-03-01 22:24   ` Rafael J. Wysocki
@ 2009-03-01 22:25   ` Rafael J. Wysocki
  2009-03-01 22:25   ` Rafael J. Wysocki
                     ` (5 subsequent siblings)
  8 siblings, 0 replies; 373+ messages in thread
From: Rafael J. Wysocki @ 2009-03-01 22:25 UTC (permalink / raw)
  To: LKML
  Cc: Arve, Jeremy Fitzhardinge, Jesse Barnes, Johannes Berg,
	Thomas Gleixner, Eric W. Biederman, Ingo Molnar, Linus Torvalds,
	pm list

From: Rafael J. Wysocki <rjw@sisk.pl>

Change the ordering of the suspend core code so that the platform
"prepare" callback is executed and the nonboot CPUs are disabled
after calling device drivers' "late suspend" methods.

This change will allow us to rework the PCI PM core so that the power
state of devices is changed in the "late" phase of suspend (and
analogously in the "early" phase of resume), which in turn will allow
us to avoid the race condition where a device using shared interrupts
is put into a low power state with interrupts enabled and then an
interrupt (for another device) comes in and confuses its driver.

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
---
 kernel/power/main.c |   38 ++++++++++++++++++++++----------------
 1 file changed, 22 insertions(+), 16 deletions(-)

Index: linux-2.6/kernel/power/main.c
===================================================================
--- linux-2.6.orig/kernel/power/main.c
+++ linux-2.6/kernel/power/main.c
@@ -297,6 +297,19 @@ static int suspend_enter(suspend_state_t
 		goto Done;
 	}
 
+	if (suspend_ops->prepare) {
+		error = suspend_ops->prepare();
+		if (error)
+			goto Power_up_devices;
+	}
+
+	if (suspend_test(TEST_PLATFORM))
+		goto Platfrom_finish;
+
+	error = disable_nonboot_cpus();
+	if (error || suspend_test(TEST_CPUS))
+		goto Enable_cpus;
+
 	arch_suspend_disable_irqs();
 	BUG_ON(!irqs_disabled());
 
@@ -310,6 +323,14 @@ static int suspend_enter(suspend_state_t
 	arch_suspend_enable_irqs();
 	BUG_ON(irqs_disabled());
 
+ Enable_cpus:
+	enable_nonboot_cpus();
+
+ Platfrom_finish:
+	if (suspend_ops->finish)
+		suspend_ops->finish();
+
+ Power_up_devices:
 	device_power_up(PMSG_RESUME);
 
  Done:
@@ -346,23 +367,8 @@ int suspend_devices_and_enter(suspend_st
 	if (suspend_test(TEST_DEVICES))
 		goto Recover_platform;
 
-	if (suspend_ops->prepare) {
-		error = suspend_ops->prepare();
-		if (error)
-			goto Resume_devices;
-	}
-
-	if (suspend_test(TEST_PLATFORM))
-		goto Finish;
+	suspend_enter(state);
 
-	error = disable_nonboot_cpus();
-	if (!error && !suspend_test(TEST_CPUS))
-		suspend_enter(state);
-
-	enable_nonboot_cpus();
- Finish:
-	if (suspend_ops->finish)
-		suspend_ops->finish();
  Resume_devices:
 	suspend_test_start();
 	device_resume(PMSG_RESUME);

^ permalink raw reply	[flat|nested] 373+ messages in thread
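The new ordering inside suspend_enter(), and the way the goto labels unwind it in reverse, can be checked with a small user-space model. This is a sketch only, not the kernel code: struct trace, mark(), model_suspend_enter() and the one-letter step codes are hypothetical names made up for the illustration, the TEST_PLATFORM/TEST_CPUS shortcuts are left out, and everything between disabling and re-enabling interrupts on the boot CPU is collapsed into a single 'E' step.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical trace buffer: one letter per step, in execution order. */
struct trace {
	char buf[32];
	size_t len;
};

static void mark(struct trace *t, char step)
{
	if (t->len + 1 < sizeof(t->buf)) {
		t->buf[t->len++] = step;
		t->buf[t->len] = '\0';
	}
}

/*
 * Model of the reordered suspend_enter().  fail_at names the step that
 * should report an error ('D', 'P' or 'C'), or 0 for a clean run.
 *
 * Step letters (upper case = going down, lower case = coming back up):
 *   D/d  device_power_down() / device_power_up()
 *   P/p  suspend_ops->prepare() / suspend_ops->finish()
 *   C/c  disable_nonboot_cpus() / enable_nonboot_cpus()
 *   I/i  arch_suspend_disable_irqs() / arch_suspend_enable_irqs()
 *   E    everything done with interrupts off, including entering the state
 */
int model_suspend_enter(struct trace *t, char fail_at)
{
	int error = 0;

	mark(t, 'D');
	if (fail_at == 'D') {
		error = -1;
		goto Done;
	}

	mark(t, 'P');
	if (fail_at == 'P') {
		error = -1;
		goto Power_up_devices;
	}

	mark(t, 'C');
	if (fail_at == 'C') {
		error = -1;
		goto Enable_cpus;
	}

	mark(t, 'I');
	mark(t, 'E');
	mark(t, 'i');

 Enable_cpus:
	mark(t, 'c');
	mark(t, 'p');	/* the Platfrom_finish label sits here in the patch */
 Power_up_devices:
	mark(t, 'd');
 Done:
	return error;
}
```

A clean run yields the trace "DPCIEicpd": the "late suspend" callbacks now run first, with all CPUs online and interrupts enabled on them, and the platform and CPU steps nest inside. A failure in disable_nonboot_cpus() yields "DPCcpd", so everything done so far is undone in reverse order; a failure in the platform "prepare" step yields "DPd", because the patch jumps straight to Power_up_devices and skips the "finish" callback in that case.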
* Re: [RFC][PATCH 2/4] PM: Change suspend code ordering
  2009-03-01 22:25   ` Rafael J. Wysocki
@ 2009-03-02 20:48     ` Linus Torvalds
  0 siblings, 0 replies; 373+ messages in thread
From: Linus Torvalds @ 2009-03-02 20:48 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: LKML, Ingo Molnar, Eric W. Biederman, Benjamin Herrenschmidt,
	Jeremy Fitzhardinge, pm list, Len Brown, Jesse Barnes,
	Thomas Gleixner, Arve Hjønnevåg, Alan Stern, Johannes Berg

On Sun, 1 Mar 2009, Rafael J. Wysocki wrote:
>
> From: Rafael J. Wysocki <rjw@sisk.pl>
>
> Change the ordering of the suspend core code so that the platform
> "prepare" callback is executed and the nonboot CPUs are disabled
> after calling device drivers' "late suspend" methods.

Ok, ack on this whole series, looks fine. I'd like to see a 5/4 though:

> This change will allow us to rework the PCI PM core so that the power
> state of devices is changed in the "late" phase of suspend (and
> analogously in the "early" phase of resume)

.. doing this. Right now we have that hacky "avoid ACPI by doing a special
limited form of pci_set_power_state() and pci_enable() in the
early_resume. I'd love to see the actual PCI code cleanup too.

		Linus

^ permalink raw reply	[flat|nested] 373+ messages in thread
* Re: [RFC][PATCH 2/4] PM: Change suspend code ordering
  2009-03-02 20:48     ` Linus Torvalds
  (?)
@ 2009-03-02 22:02     ` Rafael J. Wysocki
  -1 siblings, 0 replies; 373+ messages in thread
From: Rafael J. Wysocki @ 2009-03-02 22:02 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Arve, Jeremy Fitzhardinge, LKML, Jesse Barnes, Johannes Berg,
	Thomas Gleixner, Eric W. Biederman, Ingo Molnar, pm list

On Monday 02 March 2009, Linus Torvalds wrote:
>
> On Sun, 1 Mar 2009, Rafael J. Wysocki wrote:
> >
> > From: Rafael J. Wysocki <rjw@sisk.pl>
> >
> > Change the ordering of the suspend core code so that the platform
> > "prepare" callback is executed and the nonboot CPUs are disabled
> > after calling device drivers' "late suspend" methods.
>
> Ok, ack on this whole series, looks fine.

Thanks!

> I'd like to see a 5/4 though:
>
> > This change will allow us to rework the PCI PM core so that the power
> > state of devices is changed in the "late" phase of suspend (and
> > analogously in the "early" phase of resume)
>
> .. doing this. Right now we have that hacky "avoid ACPI by doing a special
> limited form of pci_set_power_state() and pci_enable() in the
> early_resume. I'd love to see the actual PCI code cleanup too.

Sure, that's the next step, but I wanted to get the ack on the preliminary
patches first. :-)

Rafael

^ permalink raw reply	[flat|nested] 373+ messages in thread
* [RFC][PATCH 3/4] PM: Change hibernation code ordering
  2009-03-01 22:21 ` Rafael J. Wysocki
                     ` (3 preceding siblings ...)
  2009-03-01 22:25   ` Rafael J. Wysocki
@ 2009-03-01 22:26   ` Rafael J. Wysocki
  2009-03-01 22:26   ` Rafael J. Wysocki
                     ` (3 subsequent siblings)
  8 siblings, 0 replies; 373+ messages in thread
From: Rafael J. Wysocki @ 2009-03-01 22:26 UTC (permalink / raw)
  To: LKML
  Cc: Linus Torvalds, Ingo Molnar, Eric W. Biederman,
	Benjamin Herrenschmidt, Jeremy Fitzhardinge, pm list, Len Brown,
	Jesse Barnes, Thomas Gleixner, Arve Hjønnevåg, Alan Stern,
	Johannes Berg

From: Rafael J. Wysocki <rjw@sisk.pl>

Change the ordering of the hibernation core code so that the platform
"prepare" callbacks are executed and the nonboot CPUs are disabled
after calling device drivers' "late suspend" methods.

This change (along with the previous analogous change of the suspend
core code) will allow us to rework the PCI PM core so that the power
state of devices is changed in the "late" phase of suspend (and
analogously in the "early" phase of resume), which in turn will allow
us to avoid the race condition where a device using shared interrupts
is put into a low power state with interrupts enabled and then an
interrupt (for another device) comes in and confuses its driver.

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
---
 kernel/power/disk.c |  109 +++++++++++++++++++++++++++++-----------------------
 1 file changed, 61 insertions(+), 48 deletions(-)

Index: linux-2.6/kernel/power/disk.c
===================================================================
--- linux-2.6.orig/kernel/power/disk.c
+++ linux-2.6/kernel/power/disk.c
@@ -228,13 +228,22 @@ static int create_image(int platform_mod
 		goto Unlock;
 	}
 
+	error = platform_pre_snapshot(platform_mode);
+	if (error || hibernation_test(TEST_PLATFORM))
+		goto Platform_finish;
+
+	error = disable_nonboot_cpus();
+	if (error || hibernation_test(TEST_CPUS)
+	    || hibernation_testmode(HIBERNATION_TEST))
+		goto Enable_cpus;
+
 	local_irq_disable();
 
 	sysdev_suspend(PMSG_FREEZE);
 	if (error) {
 		printk(KERN_ERR "PM: Some devices failed to power down, "
 			"aborting hibernation\n");
-		goto Power_up_devices;
+		goto Enable_irqs;
 	}
 
 	if (hibernation_test(TEST_CORE))
@@ -250,15 +259,22 @@ static int create_image(int platform_mod
 	restore_processor_state();
 	if (!in_suspend)
 		platform_leave(platform_mode);
+
  Power_up:
 	sysdev_resume();
 	/* NOTE:  device_power_up() is just a resume() for devices
 	 * that suspended with irqs off ... no overall powerup.
 	 */
 
- Power_up_devices:
+ Enable_irqs:
 	local_irq_enable();
 
+ Enable_cpus:
+	enable_nonboot_cpus();
+
+ Platform_finish:
+	platform_finish(platform_mode);
+
 	device_power_up(in_suspend ?
 		(error ? PMSG_RECOVER : PMSG_THAW) : PMSG_RESTORE);
 
@@ -298,25 +314,9 @@ int hibernation_snapshot(int platform_mo
 	if (hibernation_test(TEST_DEVICES))
 		goto Recover_platform;
 
-	error = platform_pre_snapshot(platform_mode);
-	if (error || hibernation_test(TEST_PLATFORM))
-		goto Finish;
-
-	error = disable_nonboot_cpus();
-	if (!error) {
-		if (hibernation_test(TEST_CPUS))
-			goto Enable_cpus;
-
-		if (hibernation_testmode(HIBERNATION_TEST))
-			goto Enable_cpus;
+	error = create_image(platform_mode);
+	/* Control returns here after successful restore */
 
-		error = create_image(platform_mode);
-		/* Control returns here after successful restore */
-	}
- Enable_cpus:
-	enable_nonboot_cpus();
- Finish:
-	platform_finish(platform_mode);
  Resume_devices:
 	device_resume(in_suspend ?
 		(error ? PMSG_RECOVER : PMSG_THAW) : PMSG_RESTORE);
@@ -338,7 +338,7 @@ int hibernation_snapshot(int platform_mo
  *	kernel.
  */
 
-static int resume_target_kernel(void)
+static int resume_target_kernel(bool platform_mode)
 {
 	int error;
 
@@ -351,9 +351,20 @@ static int resume_target_kernel(void)
 		goto Unlock;
 	}
 
+	error = platform_pre_restore(platform_mode);
+	if (error)
+		goto Cleanup;
+
+	error = disable_nonboot_cpus();
+	if (error)
+		goto Enable_cpus;
+
 	local_irq_disable();
 
-	sysdev_suspend(PMSG_QUIESCE);
+	error = sysdev_suspend(PMSG_QUIESCE);
+	if (error)
+		goto Enable_irqs;
+
 	/* We'll ignore saved state, but this gets preempt count (etc) right */
 	save_processor_state();
 	error = restore_highmem();
@@ -379,8 +390,15 @@ static int resume_target_kernel(void)
 
 	sysdev_resume();
 
+ Enable_irqs:
 	local_irq_enable();
 
+ Enable_cpus:
+	enable_nonboot_cpus();
+
+ Cleanup:
+	platform_restore_cleanup(platform_mode);
+
 	device_power_up(PMSG_RECOVER);
 
  Unlock:
@@ -405,19 +423,10 @@ int hibernation_restore(int platform_mod
 	pm_prepare_console();
 	suspend_console();
 	error = device_suspend(PMSG_QUIESCE);
-	if (error)
-		goto Finish;
-
-	error = platform_pre_restore(platform_mode);
 	if (!error) {
-		error = disable_nonboot_cpus();
-		if (!error)
-			error = resume_target_kernel();
-		enable_nonboot_cpus();
+		error = resume_target_kernel(platform_mode);
+		device_resume(PMSG_RECOVER);
 	}
-	platform_restore_cleanup(platform_mode);
-	device_resume(PMSG_RECOVER);
- Finish:
 	resume_console();
 	pm_restore_console();
 	return error;
@@ -453,34 +462,38 @@ int hibernation_platform_enter(void)
 		goto Resume_devices;
 	}
 
+	device_pm_lock();
+
+	error = device_power_down(PMSG_HIBERNATE);
+	if (error)
+		goto Unlock;
+
 	error = hibernation_ops->prepare();
 	if (error)
-		goto Resume_devices;
+		goto Platofrm_finish;
 
 	error = disable_nonboot_cpus();
 	if (error)
-		goto Finish;
-
-	device_pm_lock();
-
-	error = device_power_down(PMSG_HIBERNATE);
-	if (!error) {
-		local_irq_disable();
-		sysdev_suspend(PMSG_HIBERNATE);
-		hibernation_ops->enter();
-		/* We should never get here */
-		while (1);
-	}
+		goto Platofrm_finish;
 
-	device_pm_unlock();
+	local_irq_disable();
+	sysdev_suspend(PMSG_HIBERNATE);
+	hibernation_ops->enter();
+	/* We should never get here */
+	while (1);
 
 	/*
 	 * We don't need to reenable the nonboot CPUs or resume consoles, since
 	 * the system is going to be halted anyway.
 	 */
- Finish:
+ Platofrm_finish:
 	hibernation_ops->finish();
 
+	device_power_up(PMSG_RESTORE);
+
+ Unlock:
+	device_pm_unlock();
+
  Resume_devices:
 	entering_platform_hibernation = false;
 	device_resume(PMSG_RESTORE);

^ permalink raw reply	[flat|nested] 373+ messages in thread
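Both create_image() and hibernation_snapshot() in the patch above pick the resume message with the same nested conditional, in_suspend ? (error ? PMSG_RECOVER : PMSG_THAW) : PMSG_RESTORE. Written out as a function, the three cases are easier to see. This is only an illustrative sketch: enum model_pmsg and model_resume_msg() are hypothetical names, and the reading of in_suspend (nonzero while still running in the kernel that created the image, zero once control has come back through a restored image) is an assumption inferred from the "Control returns here after successful restore" comment, not something the patch states.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for PMSG_RECOVER, PMSG_THAW and PMSG_RESTORE. */
enum model_pmsg {
	MODEL_RECOVER,	/* image creation failed, undo the freeze */
	MODEL_THAW,	/* image created, thaw devices to write it out */
	MODEL_RESTORE,	/* running again from a restored image */
};

/* Same decision as: in_suspend ? (error ? RECOVER : THAW) : RESTORE */
enum model_pmsg model_resume_msg(bool in_suspend, int error)
{
	if (!in_suspend)
		return MODEL_RESTORE;
	return error ? MODEL_RECOVER : MODEL_THAW;
}
```

Note that the error value is ignored on the restore path, exactly as in the original expression.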
* [RFC][PATCH 4/4] kexec: Change kexec jump code ordering
  2009-03-01 22:21 ` Rafael J. Wysocki
                     ` (5 preceding siblings ...)
  2009-03-01 22:26   ` Rafael J. Wysocki
@ 2009-03-01 22:27   ` Rafael J. Wysocki
  2009-03-01 22:27   ` Rafael J. Wysocki
  2009-03-05 23:44   ` Linus Torvalds
  8 siblings, 0 replies; 373+ messages in thread
From: Rafael J. Wysocki @ 2009-03-01 22:27 UTC (permalink / raw)
  To: LKML
  Cc: Arve, Jeremy Fitzhardinge, Jesse Barnes, Johannes Berg,
	Thomas Gleixner, Eric W. Biederman, Ingo Molnar, Linus Torvalds,
	pm list

From: Rafael J. Wysocki <rjw@sisk.pl>

Change the ordering of the kexec jump code so that the nonboot CPUs
are disabled after calling device drivers' "late suspend" methods.

This change reflects the recent modifications of the power management
code that is also used by kexec jump.

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
---
 kernel/kexec.c |   19 +++++++++----------
 1 file changed, 9 insertions(+), 10 deletions(-)

Index: linux-2.6/kernel/kexec.c
===================================================================
--- linux-2.6.orig/kernel/kexec.c
+++ linux-2.6/kernel/kexec.c
@@ -1450,9 +1450,6 @@ int kernel_kexec(void)
 		error = device_suspend(PMSG_FREEZE);
 		if (error)
 			goto Resume_console;
-		error = disable_nonboot_cpus();
-		if (error)
-			goto Resume_devices;
 		device_pm_lock();
 		/* At this point, device_suspend() has been called,
 		 * but *not* device_power_down(). We *must*
@@ -1463,13 +1460,15 @@ int kernel_kexec(void)
 		 */
 		error = device_power_down(PMSG_FREEZE);
 		if (error)
-			goto Unlock_pm;
-
+			goto Resume_devices;
+		error = disable_nonboot_cpus();
+		if (error)
+			goto Enable_cpus;
 		local_irq_disable();
 		/* Suspend system devices */
 		error = sysdev_suspend(PMSG_FREEZE);
 		if (error)
-			goto Power_up_devices;
+			goto Enable_irqs;
 	} else
 #endif
 	{
@@ -1483,13 +1482,13 @@ int kernel_kexec(void)
 #ifdef CONFIG_KEXEC_JUMP
 	if (kexec_image->preserve_context) {
 		sysdev_resume();
- Power_up_devices:
+ Enable_irqs:
 		local_irq_enable();
-		device_power_up(PMSG_RESTORE);
- Unlock_pm:
-		device_pm_unlock();
+ Enable_cpus:
 		enable_nonboot_cpus();
+		device_power_up(PMSG_RESTORE);
  Resume_devices:
+		device_pm_unlock();
 		device_resume(PMSG_RESTORE);
  Resume_console:
 		resume_console();

^ permalink raw reply	[flat|nested] 373+ messages in thread
* Re: [RFC][PATCH 0/4] Rework disabling of interrupts during suspend-resume
  2009-03-01 22:21 ` Rafael J. Wysocki
@ 2009-03-05 23:44   ` Linus Torvalds
  2009-03-01 22:24   ` Rafael J. Wysocki
                     ` (7 subsequent siblings)
  8 siblings, 0 replies; 373+ messages in thread
From: Linus Torvalds @ 2009-03-05 23:44 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: LKML, Ingo Molnar, Eric W. Biederman, Benjamin Herrenschmidt,
	Jeremy Fitzhardinge, pm list, Len Brown, Jesse Barnes,
	Thomas Gleixner, Arve Hjønnevåg, Alan Stern, Johannes Berg

On Sun, 1 Mar 2009, Rafael J. Wysocki wrote:
>
> The following patches modify the way in which we handle disabling interrupts
> during suspend and enabling them during resume. They also change the ordering
> of the core suspend and hibernation code.

Side note - I've tested them on the EeePC that had trouble resuming due to
interrupt timings, and it suspends and resumes fine with these patches
(modulo some new X problems, but that's what I get for living with
Fedora-11 testing).

Of course, it also suspends and resumes without them, since the CPU "cli"
was sufficient for that machine, and it doesn't have any ACPI issues. But
it's still an ack that at least nothing breaks that I can tell.

		Linus

^ permalink raw reply	[flat|nested] 373+ messages in thread
* Re: [RFC][PATCH 0/4] Rework disabling of interrupts during suspend-resume
  2009-03-05 23:44 ` Linus Torvalds
  (?)
@ 2009-03-06 6:47 ` Sitsofe Wheeler
  -1 siblings, 0 replies; 373+ messages in thread
From: Sitsofe Wheeler @ 2009-03-06 6:47 UTC (permalink / raw)
To: Linus Torvalds
Cc: Rafael J. Wysocki, LKML, Ingo Molnar, Eric W. Biederman, Benjamin Herrenschmidt, Jeremy Fitzhardinge, pm list, Len Brown, Jesse Barnes, Thomas Gleixner, Arve Hjønnevåg, Alan Stern, Johannes Berg

On Thu, Mar 05, 2009 at 03:44:22PM -0800, Linus Torvalds wrote:
> Side note - I've tested them on the EeePC that had trouble resuming due to
> interrupt timings, and it suspends and resumes fine with these patches
> (modulo some new X problems, but that's what I get for living with
> Fedora-11 testing).

Since you have an EeePC I'm guessing your graphics card is an i9xx of some
variety. The i9xx kernel driver seems to have been recently reworked, so
even those people who aren't using GEM yet are seeing issues here and
there (some with VT switching, some with suspend to RAM and others with
suspend to disk). I actually don't know if the stuff popping up is any more
than usual, but if so I'm hoping most new issues go away once the dust
settles...

--
Sitsofe | http://sucs.org/~sits/

^ permalink raw reply [flat|nested] 373+ messages in thread
* Re: [RFC][PATCH 0/4] Rework disabling of interrupts during suspend-resume
  2009-03-05 23:44 ` Linus Torvalds
  ` (2 preceding siblings ...)
  (?)
@ 2009-03-06 10:19 ` Rafael J. Wysocki
  -1 siblings, 0 replies; 373+ messages in thread
From: Rafael J. Wysocki @ 2009-03-06 10:19 UTC (permalink / raw)
To: Linus Torvalds
Cc: LKML, Ingo Molnar, Eric W. Biederman, Benjamin Herrenschmidt, Jeremy Fitzhardinge, pm list, Len Brown, Jesse Barnes, Thomas Gleixner, Arve Hjønnevåg, Alan Stern, Johannes Berg

On Friday 06 March 2009, Linus Torvalds wrote:
>
> On Sun, 1 Mar 2009, Rafael J. Wysocki wrote:
> >
> > The following patches modify the way in which we handle disabling interrupts
> > during suspend and enabling them during resume. They also change the ordering
> > of the core suspend and hibernation code.
>
> Side note - I've tested them on the EeePC that had trouble resuming due to
> interrupt timings, and it suspends and resumes fine with these patches
> (modulo some new X problems, but that's what I get for living with
> Fedora-11 testing).
>
> Of course, it also suspends and resumes without them, since the CPU "cli"
> was sufficient for that machine, and it doesn't have any ACPI issues. But
> it's still an ack that at least nothing breaks that I can tell.

OK, thanks for testing!

The next-step patches (i.e. the PCI suspend-resume rework) are in the works.

Rafael

^ permalink raw reply [flat|nested] 373+ messages in thread
* [RFC][PATCH][0/8] PM: Rework suspend-resume ordering to avoid problems with shared interrupts
  2009-02-22 17:37 ` Rafael J. Wysocki
  ` (10 preceding siblings ...)
  (?)
@ 2009-03-07 10:19 ` Rafael J. Wysocki
  2009-03-07 10:20 ` Rafael J. Wysocki
  ` (16 more replies)
  -1 siblings, 17 replies; 373+ messages in thread
From: Rafael J. Wysocki @ 2009-03-07 10:19 UTC (permalink / raw)
To: LKML
Cc: Linus Torvalds, Ingo Molnar, Eric W. Biederman, Benjamin Herrenschmidt, Jeremy Fitzhardinge, pm list, Len Brown, Jesse Barnes, Thomas Gleixner, Arve Hjønnevåg

Hi,

The following patches modify the way in which we handle disabling interrupts
during suspend and enabling them during resume. They also change the ordering
of the core suspend and hibernation code to take advantage of the new approach
to the interrupts and modify the PCI PM core to avoid a few problems.

Namely, interrupts are currently disabled on the boot CPU as soon as the
nonboot CPUs have been disabled, which doesn't allow device drivers' "late"
suspend and "early" resume callbacks to sleep. Among other things this means
they cannot execute ACPI AML routines, which leads to problems with
suspend-resume of PCI devices, as recently discussed.

1/8 modifies the [suspend|hibernation] and resume code, as well as the other
code using the device PM framework, so that device drivers will not receive
interrupts during the "late" suspend phase, although interrupts will only be
disabled on the CPU right before calling sysdev_suspend() (and analogously
during resume).

2/8 - 4/8 modify the suspend, hibernation and kexec jump code, respectively,
so that the "late" phase of suspending devices will happen before executing
the platform "prepare" callback and disabling nonboot CPUs (and analogously
during resume).

5/8 is a patch that's already in the PCI linux-next tree and I included it in
the series, because the next patches depend on it.

6/8 makes the PCI PM core use pci_set_power_state() to put devices into D0
during early resume, which allows the platform-specific operations to be
carried out at that time, if necessary.

7/8 uses the opportunity to move pci_restore_standard_config() to
pci-driver.c, where it belongs IMO.

8/8 makes the PCI PM core code put devices into low power states during the
"late" phase of suspend, which allows us to avoid a long-standing race related
to shared interrupts and, at the same time, to appropriately handle devices
that require some platform-specific operations to be put into low power
states.

Comments welcome.

Thanks,
Rafael

^ permalink raw reply [flat|nested] 373+ messages in thread
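[Editorial illustration, not part of the series.] The ordering that the cover letter describes for 1/8 can be modelled in plain userspace C. This is only a sketch under stated assumptions: model_suspend_enter() and trace_step() are hypothetical names, the step labels are shorthand for the phases named above, and nothing here calls real kernel interfaces. It merely records the order in which the phases would run, including the unwind taken when the "late" suspend callbacks fail.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: append "step " to buf if it still fits. */
static void trace_step(char *buf, size_t size, const char *step)
{
	size_t used = strlen(buf);
	size_t need = strlen(step) + 1;	/* step plus a trailing space */

	if (used + need < size) {
		memcpy(buf + used, step, need - 1);
		buf[used + need - 1] = ' ';
		buf[used + need] = '\0';
	}
}

/*
 * Userspace model of the reworked suspend ordering (an assumption-laden
 * sketch, not kernel code). power_down_error simulates a failure of the
 * "late" suspend callbacks. The step order is written into buf and the
 * simulated error code is returned.
 */
static int model_suspend_enter(int power_down_error, char *buf, size_t size)
{
	assert(buf != NULL && size > 0);
	buf[0] = '\0';

	/* Device interrupt lines are masked first; CPU interrupts stay on. */
	trace_step(buf, size, "suspend_device_irqs");
	trace_step(buf, size, "late_suspend");
	if (power_down_error) {
		/* The "late" phase undoes its own work on failure. */
		trace_step(buf, size, "early_resume");
		trace_step(buf, size, "resume_device_irqs");
		return power_down_error;
	}

	/* Only now are interrupts disabled on the CPU itself. */
	trace_step(buf, size, "cpu_irqs_off");
	trace_step(buf, size, "sysdev_suspend");
	trace_step(buf, size, "enter_sleep");
	trace_step(buf, size, "sysdev_resume");
	trace_step(buf, size, "cpu_irqs_on");

	/* "Early" resume callbacks run with CPU interrupts enabled again. */
	trace_step(buf, size, "early_resume");
	trace_step(buf, size, "resume_device_irqs");
	return 0;
}
```

Calling model_suspend_enter(0, buf, sizeof(buf)) with a buffer of a few hundred bytes fills buf with the successful sequence; passing a nonzero first argument shows the shorter unwind path in which CPU interrupts are never turned off.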
* [RFC][PATCH][1/8] PM: Rework handling of interrupts during suspend-resume (rev. 5)
  2009-03-07 10:19 ` Rafael J. Wysocki
@ 2009-03-07 10:20 ` Rafael J. Wysocki
  2009-03-07 10:21 ` [RFC][PATCH][2/8] PM: Change suspend code ordering Rafael J. Wysocki
  ` (15 subsequent siblings)
  16 siblings, 0 replies; 373+ messages in thread
From: Rafael J. Wysocki @ 2009-03-07 10:20 UTC (permalink / raw)
To: LKML
Cc: Linus Torvalds, Ingo Molnar, Eric W. Biederman, Benjamin Herrenschmidt, Jeremy Fitzhardinge, pm list, Len Brown, Jesse Barnes, Thomas Gleixner, Arve Hjønnevåg

From: Rafael J. Wysocki <rjw@sisk.pl>

Introduce two helper functions allowing us to prevent device drivers from
getting any interrupts (without disabling interrupts on the CPU) during
suspend (or hibernation) and to make them start to receive interrupts again
during the subsequent resume, respectively. These functions make it possible
to keep timer interrupts enabled while the "late" suspend and "early" resume
callbacks provided by device drivers are being executed.

Use these functions to rework the handling of interrupts during suspend
(hibernation) and resume. Namely, interrupts will only be disabled on the CPU
right before suspending sysdevs, while device drivers will be prevented from
receiving interrupts, with the help of the new helper function, before their
"late" suspend callbacks run (and analogously during resume).

In addition, since the device interrupts are now disabled before the CPU has
turned all interrupts off and the CPU will ACK the interrupts setting the
IRQ_PENDING bit for them, check in sysdev_suspend() if any wake-up interrupts
are pending and abort suspend if that's the case.

Signed-off-by: Rafael J.
Wysocki <rjw@sisk.pl> --- arch/x86/kernel/apm_32.c | 15 +++++-- drivers/base/power/main.c | 20 +++++----- drivers/base/sys.c | 8 ++++ drivers/xen/manage.c | 16 ++++---- include/linux/interrupt.h | 5 ++ include/linux/irq.h | 1 kernel/irq/Makefile | 1 kernel/irq/internals.h | 1 kernel/irq/manage.c | 2 - kernel/irq/pm.c | 89 ++++++++++++++++++++++++++++++++++++++++++++++ kernel/kexec.c | 8 ++-- kernel/power/disk.c | 39 ++++++++++++++------ kernel/power/main.c | 17 +++++--- 13 files changed, 181 insertions(+), 41 deletions(-) Index: linux-2.6/include/linux/interrupt.h =================================================================== --- linux-2.6.orig/include/linux/interrupt.h +++ linux-2.6/include/linux/interrupt.h @@ -106,6 +106,11 @@ extern void disable_irq_nosync(unsigned extern void disable_irq(unsigned int irq); extern void enable_irq(unsigned int irq); +/* The following three functions are for the core kernel use only. */ +extern void suspend_device_irqs(void); +extern void resume_device_irqs(void); +extern int check_wakeup_irqs(void); + #if defined(CONFIG_SMP) && defined(CONFIG_GENERIC_HARDIRQS) extern cpumask_var_t irq_default_affinity; Index: linux-2.6/kernel/power/main.c =================================================================== --- linux-2.6.orig/kernel/power/main.c +++ linux-2.6/kernel/power/main.c @@ -287,17 +287,19 @@ void __attribute__ ((weak)) arch_suspend */ static int suspend_enter(suspend_state_t state) { - int error = 0; + int error; device_pm_lock(); - arch_suspend_disable_irqs(); - BUG_ON(!irqs_disabled()); - if ((error = device_power_down(PMSG_SUSPEND))) { + error = device_power_down(PMSG_SUSPEND); + if (error) { printk(KERN_ERR "PM: Some devices failed to power down\n"); goto Done; } + arch_suspend_disable_irqs(); + BUG_ON(!irqs_disabled()); + error = sysdev_suspend(PMSG_SUSPEND); if (!error) { if (!suspend_test(TEST_CORE)) @@ -305,11 +307,14 @@ static int suspend_enter(suspend_state_t sysdev_resume(); } - 
device_power_up(PMSG_RESUME); - Done: arch_suspend_enable_irqs(); BUG_ON(irqs_disabled()); + + device_power_up(PMSG_RESUME); + + Done: device_pm_unlock(); + return error; } Index: linux-2.6/kernel/power/disk.c =================================================================== --- linux-2.6.orig/kernel/power/disk.c +++ linux-2.6/kernel/power/disk.c @@ -214,7 +214,7 @@ static int create_image(int platform_mod return error; device_pm_lock(); - local_irq_disable(); + /* At this point, device_suspend() has been called, but *not* * device_power_down(). We *must* call device_power_down() now. * Otherwise, drivers for some devices (e.g. interrupt controllers) @@ -225,8 +225,11 @@ static int create_image(int platform_mod if (error) { printk(KERN_ERR "PM: Some devices failed to power down, " "aborting hibernation\n"); - goto Enable_irqs; + goto Unlock; } + + local_irq_disable(); + sysdev_suspend(PMSG_FREEZE); if (error) { printk(KERN_ERR "PM: Some devices failed to power down, " @@ -252,12 +255,16 @@ static int create_image(int platform_mod /* NOTE: device_power_up() is just a resume() for devices * that suspended with irqs off ... no overall powerup. */ + Power_up_devices: + local_irq_enable(); + device_power_up(in_suspend ? (error ? 
PMSG_RECOVER : PMSG_THAW) : PMSG_RESTORE); - Enable_irqs: - local_irq_enable(); + + Unlock: device_pm_unlock(); + return error; } @@ -336,13 +343,16 @@ static int resume_target_kernel(void) int error; device_pm_lock(); - local_irq_disable(); + error = device_power_down(PMSG_QUIESCE); if (error) { printk(KERN_ERR "PM: Some devices failed to power down, " "aborting resume\n"); - goto Enable_irqs; + goto Unlock; } + + local_irq_disable(); + sysdev_suspend(PMSG_QUIESCE); /* We'll ignore saved state, but this gets preempt count (etc) right */ save_processor_state(); @@ -366,11 +376,16 @@ static int resume_target_kernel(void) swsusp_free(); restore_processor_state(); touch_softlockup_watchdog(); + sysdev_resume(); - device_power_up(PMSG_RECOVER); - Enable_irqs: + local_irq_enable(); + + device_power_up(PMSG_RECOVER); + + Unlock: device_pm_unlock(); + return error; } @@ -447,15 +462,16 @@ int hibernation_platform_enter(void) goto Finish; device_pm_lock(); - local_irq_disable(); + error = device_power_down(PMSG_HIBERNATE); if (!error) { + local_irq_disable(); sysdev_suspend(PMSG_HIBERNATE); hibernation_ops->enter(); /* We should never get here */ while (1); } - local_irq_enable(); + device_pm_unlock(); /* @@ -464,12 +480,15 @@ int hibernation_platform_enter(void) */ Finish: hibernation_ops->finish(); + Resume_devices: entering_platform_hibernation = false; device_resume(PMSG_RESTORE); resume_console(); + Close: hibernation_ops->end(); + return error; } Index: linux-2.6/arch/x86/kernel/apm_32.c =================================================================== --- linux-2.6.orig/arch/x86/kernel/apm_32.c +++ linux-2.6/arch/x86/kernel/apm_32.c @@ -1190,8 +1190,10 @@ static int suspend(int vetoable) struct apm_user *as; device_suspend(PMSG_SUSPEND); - local_irq_disable(); + device_power_down(PMSG_SUSPEND); + + local_irq_disable(); sysdev_suspend(PMSG_SUSPEND); local_irq_enable(); @@ -1209,9 +1211,12 @@ static int suspend(int vetoable) if (err != APM_SUCCESS) 
apm_error("suspend", err); err = (err == APM_SUCCESS) ? 0 : -EIO; + sysdev_resume(); - device_power_up(PMSG_RESUME); local_irq_enable(); + + device_power_up(PMSG_RESUME); + device_resume(PMSG_RESUME); queue_event(APM_NORMAL_RESUME, NULL); spin_lock(&user_list_lock); @@ -1228,8 +1233,9 @@ static void standby(void) { int err; - local_irq_disable(); device_power_down(PMSG_SUSPEND); + + local_irq_disable(); sysdev_suspend(PMSG_SUSPEND); local_irq_enable(); @@ -1239,8 +1245,9 @@ static void standby(void) local_irq_disable(); sysdev_resume(); - device_power_up(PMSG_RESUME); local_irq_enable(); + + device_power_up(PMSG_RESUME); } static apm_event_t get_event(void) Index: linux-2.6/drivers/xen/manage.c =================================================================== --- linux-2.6.orig/drivers/xen/manage.c +++ linux-2.6/drivers/xen/manage.c @@ -39,12 +39,6 @@ static int xen_suspend(void *data) BUG_ON(!irqs_disabled()); - err = device_power_down(PMSG_SUSPEND); - if (err) { - printk(KERN_ERR "xen_suspend: device_power_down failed: %d\n", - err); - return err; - } err = sysdev_suspend(PMSG_SUSPEND); if (err) { printk(KERN_ERR "xen_suspend: sysdev_suspend failed: %d\n", @@ -69,7 +63,6 @@ static int xen_suspend(void *data) xen_mm_unpin_all(); sysdev_resume(); - device_power_up(PMSG_RESUME); if (!*cancelled) { xen_irq_resume(); @@ -108,6 +101,12 @@ static void do_suspend(void) /* XXX use normal device tree? 
*/ xenbus_suspend(); + err = device_power_down(PMSG_SUSPEND); + if (err) { + printk(KERN_ERR "device_power_down failed: %d\n", err); + goto resume_devices; + } + err = stop_machine(xen_suspend, &cancelled, &cpumask_of_cpu(0)); if (err) { printk(KERN_ERR "failed to start xen_suspend: %d\n", err); @@ -120,6 +119,9 @@ static void do_suspend(void) } else xenbus_suspend_cancel(); + device_power_up(PMSG_RESUME); + +resume_devices: device_resume(PMSG_RESUME); /* Make sure timer events get retriggered on all CPUs */ Index: linux-2.6/kernel/kexec.c =================================================================== --- linux-2.6.orig/kernel/kexec.c +++ linux-2.6/kernel/kexec.c @@ -1454,7 +1454,6 @@ int kernel_kexec(void) if (error) goto Resume_devices; device_pm_lock(); - local_irq_disable(); /* At this point, device_suspend() has been called, * but *not* device_power_down(). We *must* * device_power_down() now. Otherwise, drivers for @@ -1464,8 +1463,9 @@ int kernel_kexec(void) */ error = device_power_down(PMSG_FREEZE); if (error) - goto Enable_irqs; + goto Unlock_pm; + local_irq_disable(); /* Suspend system devices */ error = sysdev_suspend(PMSG_FREEZE); if (error) @@ -1484,9 +1484,9 @@ int kernel_kexec(void) if (kexec_image->preserve_context) { sysdev_resume(); Power_up_devices: - device_power_up(PMSG_RESTORE); - Enable_irqs: local_irq_enable(); + device_power_up(PMSG_RESTORE); + Unlock_pm: device_pm_unlock(); enable_nonboot_cpus(); Resume_devices: Index: linux-2.6/include/linux/irq.h =================================================================== --- linux-2.6.orig/include/linux/irq.h +++ linux-2.6/include/linux/irq.h @@ -65,6 +65,7 @@ typedef void (*irq_flow_handler_t)(unsig #define IRQ_SPURIOUS_DISABLED 0x00800000 /* IRQ was disabled by the spurious trap */ #define IRQ_MOVE_PCNTXT 0x01000000 /* IRQ migration from process context */ #define IRQ_AFFINITY_SET 0x02000000 /* IRQ affinity was set from userspace*/ +#define IRQ_SUSPENDED 0x04000000 /* IRQ has gone through 
suspend sequence */ #ifdef CONFIG_IRQ_PER_CPU # define CHECK_IRQ_PER_CPU(var) ((var) & IRQ_PER_CPU) Index: linux-2.6/kernel/irq/pm.c =================================================================== --- /dev/null +++ linux-2.6/kernel/irq/pm.c @@ -0,0 +1,89 @@ +/* + * linux/kernel/irq/pm.c + * + * Copyright (C) 2009 Rafael J. Wysocki <rjw@sisk.pl>, Novell Inc. + * + * This file contains power management functions related to interrupts. + */ + +#include <linux/irq.h> +#include <linux/module.h> +#include <linux/interrupt.h> + +#include "internals.h" + +/** + * suspend_device_irqs - disable all currently enabled interrupt lines + * + * During system-wide suspend or hibernation device interrupts need to be + * disabled at the chip level and this function is provided for this purpose. + * It disables all interrupt lines that are enabled at the moment and sets the + * IRQ_SUSPENDED flag for them. + */ +void suspend_device_irqs(void) +{ + struct irq_desc *desc; + int irq; + + for_each_irq_desc(irq, desc) { + unsigned long flags; + bool sync = false; + + spin_lock_irqsave(&desc->lock, flags); + + if (desc->action && !(desc->action->flags & IRQF_TIMER)) { + if (!desc->depth++) { + desc->status |= IRQ_DISABLED; + desc->chip->disable(irq); + sync = true; + } + desc->status |= IRQ_SUSPENDED; + } + + spin_unlock_irqrestore(&desc->lock, flags); + + if (sync) + synchronize_irq(irq); + } +} +EXPORT_SYMBOL_GPL(suspend_device_irqs); + +/** + * resume_device_irqs - enable interrupt lines disabled by suspend_device_irqs() + * + * Enable all interrupt lines previously disabled by suspend_device_irqs() that + * have the IRQ_SUSPENDED flag set. 
+ */ +void resume_device_irqs(void) +{ + struct irq_desc *desc; + int irq; + + for_each_irq_desc(irq, desc) { + unsigned long flags; + + if (!(desc->status & IRQ_SUSPENDED)) + continue; + + spin_lock_irqsave(&desc->lock, flags); + desc->status &= ~IRQ_SUSPENDED; + __enable_irq(desc, irq); + spin_unlock_irqrestore(&desc->lock, flags); + } +} +EXPORT_SYMBOL_GPL(resume_device_irqs); + +/** + * check_wakeup_irqs - check if any wake-up interrupts are pending + */ +int check_wakeup_irqs(void) +{ + struct irq_desc *desc; + int irq; + + for_each_irq_desc(irq, desc) + if ((desc->status & IRQ_WAKEUP) && (desc->status & IRQ_PENDING)) + return -EBUSY; + + return 0; +} Index: linux-2.6/kernel/irq/Makefile =================================================================== --- linux-2.6.orig/kernel/irq/Makefile +++ linux-2.6/kernel/irq/Makefile @@ -4,3 +4,4 @@ obj-$(CONFIG_GENERIC_IRQ_PROBE) += autop obj-$(CONFIG_PROC_FS) += proc.o obj-$(CONFIG_GENERIC_PENDING_IRQ) += migration.o obj-$(CONFIG_NUMA_MIGRATE_IRQ_DESC) += numa_migrate.o +obj-$(CONFIG_PM_SLEEP) += pm.o Index: linux-2.6/kernel/irq/manage.c =================================================================== --- linux-2.6.orig/kernel/irq/manage.c +++ linux-2.6/kernel/irq/manage.c @@ -215,7 +215,7 @@ void disable_irq(unsigned int irq) } EXPORT_SYMBOL(disable_irq); -static void __enable_irq(struct irq_desc *desc, unsigned int irq) +void __enable_irq(struct irq_desc *desc, unsigned int irq) { switch (desc->depth) { case 0: Index: linux-2.6/drivers/base/power/main.c =================================================================== --- linux-2.6.orig/drivers/base/power/main.c +++ linux-2.6/drivers/base/power/main.c @@ -23,6 +23,7 @@ #include <linux/pm.h> #include <linux/resume-trace.h> #include <linux/rwsem.h> +#include <linux/interrupt.h> #include "../base.h" #include "power.h" @@ -305,7 +306,8 @@ static int resume_device_noirq(struct de * Execute the appropriate "noirq resume" callback for all devices marked * as 
DPM_OFF_IRQ. * - * Must be called with interrupts disabled and only one CPU running. + * Must be called under dpm_list_mtx. Device drivers should not receive + * interrupts while it's being executed. */ static void dpm_power_up(pm_message_t state) { @@ -326,14 +328,13 @@ static void dpm_power_up(pm_message_t st * device_power_up - Turn on all devices that need special attention. * @state: PM transition of the system being carried out. * - * Power on system devices, then devices that required we shut them down - * with interrupts disabled. - * - * Must be called with interrupts disabled. + * Call the "early" resume handlers and enable device drivers to receive + * interrupts. */ void device_power_up(pm_message_t state) { dpm_power_up(state); + resume_device_irqs(); } EXPORT_SYMBOL_GPL(device_power_up); @@ -558,16 +559,17 @@ static int suspend_device_noirq(struct d * device_power_down - Shut down special devices. * @state: PM transition of the system being carried out. * - * Power down devices that require interrupts to be disabled. - * Then power down system devices. + * Prevent device drivers from receiving interrupts and call the "late" + * suspend handlers. * - * Must be called with interrupts disabled and only one CPU running. + * Must be called under dpm_list_mtx. 
*/ int device_power_down(pm_message_t state) { struct device *dev; int error = 0; + suspend_device_irqs(); list_for_each_entry_reverse(dev, &dpm_list, power.entry) { error = suspend_device_noirq(dev, state); if (error) { @@ -577,7 +579,7 @@ int device_power_down(pm_message_t state dev->power.status = DPM_OFF_IRQ; } if (error) - dpm_power_up(resume_event(state)); + device_power_up(resume_event(state)); return error; } EXPORT_SYMBOL_GPL(device_power_down); Index: linux-2.6/drivers/base/sys.c =================================================================== --- linux-2.6.orig/drivers/base/sys.c +++ linux-2.6/drivers/base/sys.c @@ -22,6 +22,7 @@ #include <linux/pm.h> #include <linux/device.h> #include <linux/mutex.h> +#include <linux/interrupt.h> #include "base.h" @@ -369,6 +370,13 @@ int sysdev_suspend(pm_message_t state) struct sysdev_driver *drv, *err_drv; int ret; + pr_debug("Checking wake-up interrupts\n"); + + /* Return error code if there are any wake-up interrupts pending */ + ret = check_wakeup_irqs(); + if (ret) + return ret; + pr_debug("Suspending System Devices\n"); list_for_each_entry_reverse(cls, &system_kset->list, kset.kobj.entry) { Index: linux-2.6/kernel/irq/internals.h =================================================================== --- linux-2.6.orig/kernel/irq/internals.h +++ linux-2.6/kernel/irq/internals.h @@ -12,6 +12,7 @@ extern void compat_irq_chip_set_default_ extern int __irq_set_trigger(struct irq_desc *desc, unsigned int irq, unsigned long flags); +extern void __enable_irq(struct irq_desc *desc, unsigned int irq); extern struct lock_class_key irq_desc_lock_class; extern void init_kstat_irqs(struct irq_desc *desc, int cpu, int nr); ^ permalink raw reply [flat|nested] 373+ messages in thread
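[Editorial illustration, not part of the patch.] The bookkeeping done by the new helpers in kernel/irq/pm.c can be tried out in userspace with a simplified model. The sketch below is an assumption-laden stand-in, not kernel code: struct model_irq, the M_IRQ_* bits and the model_* functions are hypothetical names, locking and the chip-level disable/enable calls are omitted, and the bit values are arbitrary. It mirrors three ideas from the patch: timer interrupts and lines without a handler are skipped, the nested disable depth is preserved across suspend and resume, and a line that is both a wake-up source and pending makes the check fail with -EBUSY.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-ins for the kernel status bits (values are arbitrary). */
#define M_IRQ_DISABLED  0x1u
#define M_IRQ_PENDING   0x2u
#define M_IRQ_WAKEUP    0x4u
#define M_IRQ_SUSPENDED 0x8u

/* Hypothetical, heavily reduced model of an interrupt descriptor. */
struct model_irq {
	bool has_action;	/* a handler is registered on this line */
	bool is_timer;		/* the handler was registered as a timer */
	unsigned int depth;	/* nested disable count, 0 means enabled */
	unsigned int status;	/* M_IRQ_* bits */
};

/* Disable every non-timer line that has a handler and mark it suspended. */
static void model_suspend_device_irqs(struct model_irq *irqs, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		struct model_irq *d = &irqs[i];

		if (!d->has_action || d->is_timer)
			continue;
		if (d->depth++ == 0)
			d->status |= M_IRQ_DISABLED;
		d->status |= M_IRQ_SUSPENDED;
	}
}

/* Undo one level of disabling for each line marked as suspended. */
static void model_resume_device_irqs(struct model_irq *irqs, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		struct model_irq *d = &irqs[i];

		if (!(d->status & M_IRQ_SUSPENDED))
			continue;
		d->status &= ~M_IRQ_SUSPENDED;
		assert(d->depth > 0);	/* unbalanced enable would be a bug */
		if (--d->depth == 0)
			d->status &= ~M_IRQ_DISABLED;
	}
}

/* Report -EBUSY if any wake-up line has an interrupt pending. */
static int model_check_wakeup_irqs(const struct model_irq *irqs, size_t n)
{
	for (size_t i = 0; i < n; i++)
		if ((irqs[i].status & M_IRQ_WAKEUP) &&
		    (irqs[i].status & M_IRQ_PENDING))
			return -EBUSY;
	return 0;
}
```

A line that a driver had already disabled before suspend (depth 1) goes to depth 2 and comes back to depth 1 on resume, so it stays disabled, which is the reason the model keeps a counter rather than a single flag.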
* [RFC][PATCH][1/8] PM: Rework handling of interrupts during suspend-resume (rev. 5)
@ 2009-03-07 10:20 ` Rafael J. Wysocki
  0 siblings, 0 replies; 373+ messages in thread
From: Rafael J. Wysocki @ 2009-03-07 10:20 UTC (permalink / raw)
To: LKML
Cc: Arve, Jeremy Fitzhardinge, Jesse Barnes, Thomas Gleixner, Eric W. Biederman, Ingo Molnar, Linus Torvalds, pm list

From: Rafael J. Wysocki <rjw@sisk.pl>

Introduce two helper functions allowing us to prevent device drivers from
getting any interrupts (without disabling interrupts on the CPU) during
suspend (or hibernation) and to make them start to receive interrupts again
during the subsequent resume, respectively. These functions make it possible
to keep timer interrupts enabled while the "late" suspend and "early" resume
callbacks provided by device drivers are being executed.

Use these functions to rework the handling of interrupts during suspend
(hibernation) and resume. Namely, interrupts will only be disabled on the CPU
right before suspending sysdevs, while device drivers will be prevented from
receiving interrupts, with the help of the new helper function, before their
"late" suspend callbacks run (and analogously during resume).

In addition, since the device interrupts are now disabled before the CPU has
turned all interrupts off and the CPU will ACK the interrupts setting the
IRQ_PENDING bit for them, check in sysdev_suspend() if any wake-up interrupts
are pending and abort suspend if that's the case.

Signed-off-by: Rafael J.
Wysocki <rjw@sisk.pl> --- arch/x86/kernel/apm_32.c | 15 +++++-- drivers/base/power/main.c | 20 +++++----- drivers/base/sys.c | 8 ++++ drivers/xen/manage.c | 16 ++++---- include/linux/interrupt.h | 5 ++ include/linux/irq.h | 1 kernel/irq/Makefile | 1 kernel/irq/internals.h | 1 kernel/irq/manage.c | 2 - kernel/irq/pm.c | 89 ++++++++++++++++++++++++++++++++++++++++++++++ kernel/kexec.c | 8 ++-- kernel/power/disk.c | 39 ++++++++++++++------ kernel/power/main.c | 17 +++++--- 13 files changed, 181 insertions(+), 41 deletions(-) Index: linux-2.6/include/linux/interrupt.h =================================================================== --- linux-2.6.orig/include/linux/interrupt.h +++ linux-2.6/include/linux/interrupt.h @@ -106,6 +106,11 @@ extern void disable_irq_nosync(unsigned extern void disable_irq(unsigned int irq); extern void enable_irq(unsigned int irq); +/* The following three functions are for the core kernel use only. */ +extern void suspend_device_irqs(void); +extern void resume_device_irqs(void); +extern int check_wakeup_irqs(void); + #if defined(CONFIG_SMP) && defined(CONFIG_GENERIC_HARDIRQS) extern cpumask_var_t irq_default_affinity; Index: linux-2.6/kernel/power/main.c =================================================================== --- linux-2.6.orig/kernel/power/main.c +++ linux-2.6/kernel/power/main.c @@ -287,17 +287,19 @@ void __attribute__ ((weak)) arch_suspend */ static int suspend_enter(suspend_state_t state) { - int error = 0; + int error; device_pm_lock(); - arch_suspend_disable_irqs(); - BUG_ON(!irqs_disabled()); - if ((error = device_power_down(PMSG_SUSPEND))) { + error = device_power_down(PMSG_SUSPEND); + if (error) { printk(KERN_ERR "PM: Some devices failed to power down\n"); goto Done; } + arch_suspend_disable_irqs(); + BUG_ON(!irqs_disabled()); + error = sysdev_suspend(PMSG_SUSPEND); if (!error) { if (!suspend_test(TEST_CORE)) @@ -305,11 +307,14 @@ static int suspend_enter(suspend_state_t sysdev_resume(); } - 
device_power_up(PMSG_RESUME); - Done: arch_suspend_enable_irqs(); BUG_ON(irqs_disabled()); + + device_power_up(PMSG_RESUME); + + Done: device_pm_unlock(); + return error; } Index: linux-2.6/kernel/power/disk.c =================================================================== --- linux-2.6.orig/kernel/power/disk.c +++ linux-2.6/kernel/power/disk.c @@ -214,7 +214,7 @@ static int create_image(int platform_mod return error; device_pm_lock(); - local_irq_disable(); + /* At this point, device_suspend() has been called, but *not* * device_power_down(). We *must* call device_power_down() now. * Otherwise, drivers for some devices (e.g. interrupt controllers) @@ -225,8 +225,11 @@ static int create_image(int platform_mod if (error) { printk(KERN_ERR "PM: Some devices failed to power down, " "aborting hibernation\n"); - goto Enable_irqs; + goto Unlock; } + + local_irq_disable(); + sysdev_suspend(PMSG_FREEZE); if (error) { printk(KERN_ERR "PM: Some devices failed to power down, " @@ -252,12 +255,16 @@ static int create_image(int platform_mod /* NOTE: device_power_up() is just a resume() for devices * that suspended with irqs off ... no overall powerup. */ + Power_up_devices: + local_irq_enable(); + device_power_up(in_suspend ? (error ? 
PMSG_RECOVER : PMSG_THAW) : PMSG_RESTORE); - Enable_irqs: - local_irq_enable(); + + Unlock: device_pm_unlock(); + return error; } @@ -336,13 +343,16 @@ static int resume_target_kernel(void) int error; device_pm_lock(); - local_irq_disable(); + error = device_power_down(PMSG_QUIESCE); if (error) { printk(KERN_ERR "PM: Some devices failed to power down, " "aborting resume\n"); - goto Enable_irqs; + goto Unlock; } + + local_irq_disable(); + sysdev_suspend(PMSG_QUIESCE); /* We'll ignore saved state, but this gets preempt count (etc) right */ save_processor_state(); @@ -366,11 +376,16 @@ static int resume_target_kernel(void) swsusp_free(); restore_processor_state(); touch_softlockup_watchdog(); + sysdev_resume(); - device_power_up(PMSG_RECOVER); - Enable_irqs: + local_irq_enable(); + + device_power_up(PMSG_RECOVER); + + Unlock: device_pm_unlock(); + return error; } @@ -447,15 +462,16 @@ int hibernation_platform_enter(void) goto Finish; device_pm_lock(); - local_irq_disable(); + error = device_power_down(PMSG_HIBERNATE); if (!error) { + local_irq_disable(); sysdev_suspend(PMSG_HIBERNATE); hibernation_ops->enter(); /* We should never get here */ while (1); } - local_irq_enable(); + device_pm_unlock(); /* @@ -464,12 +480,15 @@ int hibernation_platform_enter(void) */ Finish: hibernation_ops->finish(); + Resume_devices: entering_platform_hibernation = false; device_resume(PMSG_RESTORE); resume_console(); + Close: hibernation_ops->end(); + return error; } Index: linux-2.6/arch/x86/kernel/apm_32.c =================================================================== --- linux-2.6.orig/arch/x86/kernel/apm_32.c +++ linux-2.6/arch/x86/kernel/apm_32.c @@ -1190,8 +1190,10 @@ static int suspend(int vetoable) struct apm_user *as; device_suspend(PMSG_SUSPEND); - local_irq_disable(); + device_power_down(PMSG_SUSPEND); + + local_irq_disable(); sysdev_suspend(PMSG_SUSPEND); local_irq_enable(); @@ -1209,9 +1211,12 @@ static int suspend(int vetoable) if (err != APM_SUCCESS) 
apm_error("suspend", err); err = (err == APM_SUCCESS) ? 0 : -EIO; + sysdev_resume(); - device_power_up(PMSG_RESUME); local_irq_enable(); + + device_power_up(PMSG_RESUME); + device_resume(PMSG_RESUME); queue_event(APM_NORMAL_RESUME, NULL); spin_lock(&user_list_lock); @@ -1228,8 +1233,9 @@ static void standby(void) { int err; - local_irq_disable(); device_power_down(PMSG_SUSPEND); + + local_irq_disable(); sysdev_suspend(PMSG_SUSPEND); local_irq_enable(); @@ -1239,8 +1245,9 @@ static void standby(void) local_irq_disable(); sysdev_resume(); - device_power_up(PMSG_RESUME); local_irq_enable(); + + device_power_up(PMSG_RESUME); } static apm_event_t get_event(void) Index: linux-2.6/drivers/xen/manage.c =================================================================== --- linux-2.6.orig/drivers/xen/manage.c +++ linux-2.6/drivers/xen/manage.c @@ -39,12 +39,6 @@ static int xen_suspend(void *data) BUG_ON(!irqs_disabled()); - err = device_power_down(PMSG_SUSPEND); - if (err) { - printk(KERN_ERR "xen_suspend: device_power_down failed: %d\n", - err); - return err; - } err = sysdev_suspend(PMSG_SUSPEND); if (err) { printk(KERN_ERR "xen_suspend: sysdev_suspend failed: %d\n", @@ -69,7 +63,6 @@ static int xen_suspend(void *data) xen_mm_unpin_all(); sysdev_resume(); - device_power_up(PMSG_RESUME); if (!*cancelled) { xen_irq_resume(); @@ -108,6 +101,12 @@ static void do_suspend(void) /* XXX use normal device tree? 
*/ xenbus_suspend(); + err = device_power_down(PMSG_SUSPEND); + if (err) { + printk(KERN_ERR "device_power_down failed: %d\n", err); + goto resume_devices; + } + err = stop_machine(xen_suspend, &cancelled, &cpumask_of_cpu(0)); if (err) { printk(KERN_ERR "failed to start xen_suspend: %d\n", err); @@ -120,6 +119,9 @@ static void do_suspend(void) } else xenbus_suspend_cancel(); + device_power_up(PMSG_RESUME); + +resume_devices: device_resume(PMSG_RESUME); /* Make sure timer events get retriggered on all CPUs */ Index: linux-2.6/kernel/kexec.c =================================================================== --- linux-2.6.orig/kernel/kexec.c +++ linux-2.6/kernel/kexec.c @@ -1454,7 +1454,6 @@ int kernel_kexec(void) if (error) goto Resume_devices; device_pm_lock(); - local_irq_disable(); /* At this point, device_suspend() has been called, * but *not* device_power_down(). We *must* * device_power_down() now. Otherwise, drivers for @@ -1464,8 +1463,9 @@ int kernel_kexec(void) */ error = device_power_down(PMSG_FREEZE); if (error) - goto Enable_irqs; + goto Unlock_pm; + local_irq_disable(); /* Suspend system devices */ error = sysdev_suspend(PMSG_FREEZE); if (error) @@ -1484,9 +1484,9 @@ int kernel_kexec(void) if (kexec_image->preserve_context) { sysdev_resume(); Power_up_devices: - device_power_up(PMSG_RESTORE); - Enable_irqs: local_irq_enable(); + device_power_up(PMSG_RESTORE); + Unlock_pm: device_pm_unlock(); enable_nonboot_cpus(); Resume_devices: Index: linux-2.6/include/linux/irq.h =================================================================== --- linux-2.6.orig/include/linux/irq.h +++ linux-2.6/include/linux/irq.h @@ -65,6 +65,7 @@ typedef void (*irq_flow_handler_t)(unsig #define IRQ_SPURIOUS_DISABLED 0x00800000 /* IRQ was disabled by the spurious trap */ #define IRQ_MOVE_PCNTXT 0x01000000 /* IRQ migration from process context */ #define IRQ_AFFINITY_SET 0x02000000 /* IRQ affinity was set from userspace*/ +#define IRQ_SUSPENDED 0x04000000 /* IRQ has gone through 
suspend sequence */ #ifdef CONFIG_IRQ_PER_CPU # define CHECK_IRQ_PER_CPU(var) ((var) & IRQ_PER_CPU) Index: linux-2.6/kernel/irq/pm.c =================================================================== --- /dev/null +++ linux-2.6/kernel/irq/pm.c @@ -0,0 +1,89 @@ +/* + * linux/kernel/irq/pm.c + * + * Copyright (C) 2009 Rafael J. Wysocki <rjw@sisk.pl>, Novell Inc. + * + * This file contains power management functions related to interrupts. + */ + +#include <linux/irq.h> +#include <linux/module.h> +#include <linux/interrupt.h> + +#include "internals.h" + +/** + * suspend_device_irqs - disable all currently enabled interrupt lines + * + * During system-wide suspend or hibernation device interrupts need to be + * disabled at the chip level and this function is provided for this purpose. + * It disables all interrupt lines that are enabled at the moment and sets the + * IRQ_SUSPENDED flag for them. + */ +void suspend_device_irqs(void) +{ + struct irq_desc *desc; + int irq; + + for_each_irq_desc(irq, desc) { + unsigned long flags; + bool sync = false; + + spin_lock_irqsave(&desc->lock, flags); + + if (desc->action && !(desc->action->flags & IRQF_TIMER)) { + if (!desc->depth++) { + desc->status |= IRQ_DISABLED; + desc->chip->disable(irq); + sync = true; + } + desc->status |= IRQ_SUSPENDED; + } + + spin_unlock_irqrestore(&desc->lock, flags); + + if (sync) + synchronize_irq(irq); + } +} +EXPORT_SYMBOL_GPL(suspend_device_irqs); + +/** + * resume_device_irqs - enable interrupt lines disabled by suspend_device_irqs() + * + * Enable all interrupt lines previously disabled by suspend_device_irqs() that + * have the IRQ_SUSPENDED flag set. 
+ */ +void resume_device_irqs(void) +{ + struct irq_desc *desc; + int irq; + + for_each_irq_desc(irq, desc) { + unsigned long flags; + + if (!(desc->status & IRQ_SUSPENDED)) + continue; + + spin_lock_irqsave(&desc->lock, flags); + desc->status &= ~IRQ_SUSPENDED; + __enable_irq(desc, irq); + spin_unlock_irqrestore(&desc->lock, flags); + } +} +EXPORT_SYMBOL_GPL(resume_device_irqs); + +/** + * check_wakeup_irqs - check if any wake-up interrupts are pending + */ +int check_wakeup_irqs(void) +{ + struct irq_desc *desc; + int irq; + + for_each_irq_desc(irq, desc) + if ((desc->status & IRQ_WAKEUP) && (desc->status & IRQ_PENDING)) + return -EBUSY; + + return 0; +} Index: linux-2.6/kernel/irq/Makefile =================================================================== --- linux-2.6.orig/kernel/irq/Makefile +++ linux-2.6/kernel/irq/Makefile @@ -4,3 +4,4 @@ obj-$(CONFIG_GENERIC_IRQ_PROBE) += autop obj-$(CONFIG_PROC_FS) += proc.o obj-$(CONFIG_GENERIC_PENDING_IRQ) += migration.o obj-$(CONFIG_NUMA_MIGRATE_IRQ_DESC) += numa_migrate.o +obj-$(CONFIG_PM_SLEEP) += pm.o Index: linux-2.6/kernel/irq/manage.c =================================================================== --- linux-2.6.orig/kernel/irq/manage.c +++ linux-2.6/kernel/irq/manage.c @@ -215,7 +215,7 @@ void disable_irq(unsigned int irq) } EXPORT_SYMBOL(disable_irq); -static void __enable_irq(struct irq_desc *desc, unsigned int irq) +void __enable_irq(struct irq_desc *desc, unsigned int irq) { switch (desc->depth) { case 0: Index: linux-2.6/drivers/base/power/main.c =================================================================== --- linux-2.6.orig/drivers/base/power/main.c +++ linux-2.6/drivers/base/power/main.c @@ -23,6 +23,7 @@ #include <linux/pm.h> #include <linux/resume-trace.h> #include <linux/rwsem.h> +#include <linux/interrupt.h> #include "../base.h" #include "power.h" @@ -305,7 +306,8 @@ static int resume_device_noirq(struct de * Execute the appropriate "noirq resume" callback for all devices marked * as 
DPM_OFF_IRQ. * - * Must be called with interrupts disabled and only one CPU running. + * Must be called under dpm_list_mtx. Device drivers should not receive + * interrupts while it's being executed. */ static void dpm_power_up(pm_message_t state) { @@ -326,14 +328,13 @@ static void dpm_power_up(pm_message_t st * device_power_up - Turn on all devices that need special attention. * @state: PM transition of the system being carried out. * - * Power on system devices, then devices that required we shut them down - * with interrupts disabled. - * - * Must be called with interrupts disabled. + * Call the "early" resume handlers and enable device drivers to receive + * interrupts. */ void device_power_up(pm_message_t state) { dpm_power_up(state); + resume_device_irqs(); } EXPORT_SYMBOL_GPL(device_power_up); @@ -558,16 +559,17 @@ static int suspend_device_noirq(struct d * device_power_down - Shut down special devices. * @state: PM transition of the system being carried out. * - * Power down devices that require interrupts to be disabled. - * Then power down system devices. + * Prevent device drivers from receiving interrupts and call the "late" + * suspend handlers. * - * Must be called with interrupts disabled and only one CPU running. + * Must be called under dpm_list_mtx. 
*/ int device_power_down(pm_message_t state) { struct device *dev; int error = 0; + suspend_device_irqs(); list_for_each_entry_reverse(dev, &dpm_list, power.entry) { error = suspend_device_noirq(dev, state); if (error) { @@ -577,7 +579,7 @@ int device_power_down(pm_message_t state dev->power.status = DPM_OFF_IRQ; } if (error) - dpm_power_up(resume_event(state)); + device_power_up(resume_event(state)); return error; } EXPORT_SYMBOL_GPL(device_power_down); Index: linux-2.6/drivers/base/sys.c =================================================================== --- linux-2.6.orig/drivers/base/sys.c +++ linux-2.6/drivers/base/sys.c @@ -22,6 +22,7 @@ #include <linux/pm.h> #include <linux/device.h> #include <linux/mutex.h> +#include <linux/interrupt.h> #include "base.h" @@ -369,6 +370,13 @@ int sysdev_suspend(pm_message_t state) struct sysdev_driver *drv, *err_drv; int ret; + pr_debug("Checking wake-up interrupts\n"); + + /* Return error code if there are any wake-up interrupts pending */ + ret = check_wakeup_irqs(); + if (ret) + return ret; + pr_debug("Suspending System Devices\n"); list_for_each_entry_reverse(cls, &system_kset->list, kset.kobj.entry) { Index: linux-2.6/kernel/irq/internals.h =================================================================== --- linux-2.6.orig/kernel/irq/internals.h +++ linux-2.6/kernel/irq/internals.h @@ -12,6 +12,7 @@ extern void compat_irq_chip_set_default_ extern int __irq_set_trigger(struct irq_desc *desc, unsigned int irq, unsigned long flags); +extern void __enable_irq(struct irq_desc *desc, unsigned int irq); extern struct lock_class_key irq_desc_lock_class; extern void init_kstat_irqs(struct irq_desc *desc, int cpu, int nr); ^ permalink raw reply [flat|nested] 373+ messages in thread
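Editorial sketch, not part of the patch: the nesting behaviour of suspend_device_irqs() and resume_device_irqs() is easier to follow in a small user-space model. The struct and the model_* names below are hypothetical stand-ins (booleans replace the IRQ_SUSPENDED and IRQ_DISABLED status bits and the chip callbacks), and the resume side assumes that __enable_irq() unmasks the line only when the disable depth drops from 1 to 0.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model type; field names echo the patch, but this is not kernel code. */
struct model_irq {
	bool has_action;    /* a handler is registered (desc->action != NULL) */
	bool is_timer;      /* the handler was registered with IRQF_TIMER */
	bool suspended;     /* stands in for the IRQ_SUSPENDED status bit */
	bool masked;        /* line is disabled at the chip level */
	unsigned int depth; /* nested disable count, like desc->depth */
};

/* Model of suspend_device_irqs(): skip unused lines and timer interrupts. */
void model_suspend_device_irqs(struct model_irq *irqs, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		struct model_irq *irq = &irqs[i];

		if (!irq->has_action || irq->is_timer)
			continue;
		/* Mask at the chip only on the first (outermost) disable. */
		if (irq->depth++ == 0)
			irq->masked = true;
		irq->suspended = true;
	}
}

/* Model of resume_device_irqs(): undo exactly one level for marked lines. */
void model_resume_device_irqs(struct model_irq *irqs, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		struct model_irq *irq = &irqs[i];

		if (!irq->suspended)
			continue;
		irq->suspended = false;
		/* Unmask only when the nesting count returns to zero. */
		if (irq->depth > 0 && --irq->depth == 0)
			irq->masked = false;
	}
}
```

In this model a line that its driver had already disabled before suspend goes from depth 1 to depth 2 and comes back at depth 1, so it stays masked after resume, while a timer line is never touched.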
* Re: [linux-pm] [RFC][PATCH][1/8] PM: Rework handling of interrupts during suspend-resume (rev. 5) 2009-03-07 10:20 ` Rafael J. Wysocki (?) @ 2009-03-07 16:51 ` Alan Stern 2009-03-07 17:56 ` Rafael J. Wysocki 2009-03-07 17:56 ` [linux-pm] " Rafael J. Wysocki -1 siblings, 2 replies; 373+ messages in thread From: Alan Stern @ 2009-03-07 16:51 UTC (permalink / raw) To: Rafael J. Wysocki Cc: LKML, Arve, Jeremy Fitzhardinge, Jesse Barnes, Thomas Gleixner, Eric W. Biederman, Ingo Molnar, Linus Torvalds, pm list On Sat, 7 Mar 2009, Rafael J. Wysocki wrote: > From: Rafael J. Wysocki <rjw@sisk.pl> > > Introduce two helper functions allowing us to prevent device drivers > from getting any interrupts (without disabling interrupts on the CPU) > during suspend (or hibernation) and to make them start to receive > interrupts again during the subsequent resume, respectively. These > functions make it possible to keep timer interrupts enabled while the > "late" suspend and "early" resume callbacks provided by device > drivers are being executed. > > Use these functions to rework the handling of interrupts during > suspend (hibernation) and resume. Namely, interrupts will only be > disabled on the CPU right before suspending sysdevs, while device > drivers will be prevented from receiving interrupts, with the help of > the new helper function, before their "late" suspend callbacks run > (and analogously during resume). > > In addition, since the device interrupts are now disabled before the > CPU has turned all interrupts off and the CPU will ACK the interrupts > setting the IRQ_PENDING bit for them, check in sysdev_suspend() if > any wake-up interrupts are pending and abort suspend if that's the > case. One thing about this isn't clear: the distinction between "wake-up" interrupts and other interrupts. 
In an ideal world, the only pending interrupts during sysdev_suspend would be wake-up interrupts, because drivers would have prevented their devices from generating any other kind of IRQ and would have done all the necessary synchronization as part of their suspend (_not_ suspend_late) methods. Thus there would be no need to distinguish between wake-up and non-wake-up interrupts. So perhaps you're worried about drivers that aren't sufficiently clever. Or is something deeper going on? Alan Stern ^ permalink raw reply [flat|nested] 373+ messages in thread
* Re: [RFC][PATCH][1/8] PM: Rework handling of interrupts during suspend-resume (rev. 5) 2009-03-07 16:51 ` [linux-pm] " Alan Stern @ 2009-03-07 17:56 ` Rafael J. Wysocki 2009-03-07 17:56 ` [linux-pm] " Rafael J. Wysocki 1 sibling, 0 replies; 373+ messages in thread From: Rafael J. Wysocki @ 2009-03-07 17:56 UTC (permalink / raw) To: Alan Stern, Jeremy Fitzhardinge Cc: Arve, LKML, Jesse Barnes, Eric W. Biederman, pm list, Thomas Gleixner, Linus Torvalds, Ingo Molnar On Saturday 07 March 2009, Alan Stern wrote: > On Sat, 7 Mar 2009, Rafael J. Wysocki wrote: > > > From: Rafael J. Wysocki <rjw@sisk.pl> > > > > Introduce two helper functions allowing us to prevent device drivers > > from getting any interrupts (without disabling interrupts on the CPU) > > during suspend (or hibernation) and to make them start to receive > > interrupts again during the subsequent resume, respectively. These > > functions make it possible to keep timer interrupts enabled while the > > "late" suspend and "early" resume callbacks provided by device > > drivers are being executed. > > > > Use these functions to rework the handling of interrupts during > > suspend (hibernation) and resume. Namely, interrupts will only be > > disabled on the CPU right before suspending sysdevs, while device > > drivers will be prevented from receiving interrupts, with the help of > > the new helper function, before their "late" suspend callbacks run > > (and analogously during resume). > > > > In addition, since the device interrupts are now disabled before the > > CPU has turned all interrupts off and the CPU will ACK the interrupts > > setting the IRQ_PENDING bit for them, check in sysdev_suspend() if > > any wake-up interrupts are pending and abort suspend if that's the > > case. > > One thing about this isn't clear: the distinction between "wake-up" > interrupts and other interrupts. 
> > In an ideal world, the only pending interrupts during sysdev_suspend > would be wake-up interrupts, because drivers would have prevented their > devices from generating any other kind of IRQ and would have done all > the necessary synchronization as part of their suspend (_not_ > suspend_late) methods. Thus there would be no need to distinguish > between wake-up and non-wake-up interrupts. > > So perhaps you're worried about drivers that aren't sufficiently > clever. Or is something deeper going on? Some drivers leave interrupts enabled during suspend on purpose and mark them as "wake-up interrupts" so that the platform can abort suspend if any of them is pending at the time the "enter suspend" hook is called (this doesn't happen on x86 AFAICS). However, after the $subject patch the CPU will ACK those interrupts if they happen between suspend_device_irqs() and local_irq_disable(), so the platform won't see them as pending. Instead, they will have IRQ_PENDING set in desc->status, so we check if this is the case. Thanks, Rafael ^ permalink raw reply [flat|nested] 373+ messages in thread
* Re: [RFC][PATCH][1/8] PM: Rework handling of interrupts during suspend-resume (rev. 5) 2009-03-07 17:56 ` [linux-pm] " Rafael J. Wysocki @ 2009-03-08 3:53 ` Alan Stern 2009-03-08 3:53 ` [linux-pm] " Alan Stern 1 sibling, 0 replies; 373+ messages in thread From: Alan Stern @ 2009-03-08 3:53 UTC (permalink / raw) To: Rafael J. Wysocki Cc: Jeremy Fitzhardinge, LKML, Jesse Barnes, Eric W. Biederman, pm list, Ingo Molnar, Linus Torvalds, Thomas Gleixner On Sat, 7 Mar 2009, Rafael J. Wysocki wrote: > > One thing about this isn't clear: the distinction between "wake-up" > > interrupts and other interrupts. > > > > In an ideal world, the only pending interrupts during sysdev_suspend > > would be wake-up interrupts, because drivers would have prevented their > > devices from generating any other kind of IRQ and would have done all > > the necessary synchronization as part of their suspend (_not_ > > suspend_late) methods. Thus there would be no need to distinguish > > between wake-up and non-wake-up interrupts. > > > > So perhaps you're worried about drivers that aren't sufficiently > > clever. Or is something deeper going on? > > Some drivers leave interrupts enabled during suspend on purpose and mark > them as "wake-up interrupts" so that the platform can abort suspend if any > of them is pending at the time the "enter suspend" hook is called (this doesn't > happen on x86 AFAICS). > > However, after the $subject patch the CPU will ACK those interrupts if they > happen between suspend_device_irqs() and local_irq_disable(), so the platform > won't see them as pending. Instead, they will have IRQ_PENDING set in > desc->status, so we check if this is the case. You didn't answer my question. Why bother to distinguish between "wake-up" interrupts and non-"wake-up" interrupts? In other words, why not simply abort the suspend if IRQ_PENDING is set for _any_ interrupt during sysdev_suspend()? Alan Stern ^ permalink raw reply [flat|nested] 373+ messages in thread
* Re: [RFC][PATCH][1/8] PM: Rework handling of interrupts during suspend-resume (rev. 5) 2009-03-08 3:53 ` [linux-pm] " Alan Stern @ 2009-03-08 10:00 ` Rafael J. Wysocki 2009-03-08 10:00 ` [linux-pm] " Rafael J. Wysocki ` (2 subsequent siblings) 3 siblings, 0 replies; 373+ messages in thread From: Rafael J. Wysocki @ 2009-03-08 10:00 UTC (permalink / raw) To: Alan Stern Cc: Arve, Jeremy Fitzhardinge, LKML, Jesse Barnes, Eric W. Biederman, pm list, Ingo Molnar, Linus Torvalds, Thomas Gleixner On Sunday 08 March 2009, Alan Stern wrote: > On Sat, 7 Mar 2009, Rafael J. Wysocki wrote: > > > > One thing about this isn't clear: the distinction between "wake-up" > > > interrupts and other interrupts. > > > > > > In an ideal world, the only pending interrupts during sysdev_suspend > > > would be wake-up interrupts, because drivers would have prevented their > > > devices from generating any other kind of IRQ and would have done all > > > the necessary synchronization as part of their suspend (_not_ > > > suspend_late) methods. Thus there would be no need to distinguish > > > between wake-up and non-wake-up interrupts. > > > > > > So perhaps you're worried about drivers that aren't sufficiently > > > clever. Or is something deeper going on? > > > > Some drivers leave interrupts enabled during suspend on purpose and mark > > them as "wake-up interrupts" so that the platform can abort suspend if any > > of them is pending at the time the "enter suspend" hook is called (this doesn't > > happen on x86 AFAICS). > > > > However, after the $subject patch the CPU will ACK those interrupts if they > > happen between suspend_device_irqs() and local_irq_disable(), so the platform > > won't see them as pending. Instead, they will have IRQ_PENDING set in > > desc->status, so we check if this is the case. > > You didn't answer my question. Why bother to distinguish between > "wake-up" interrupts and non-"wake-up" interrupts? Sorry, I thought it followed from what I wrote. 
> In other words, why not simply abort the suspend if IRQ_PENDING is set > for _any_ interrupt during sysdev_suspend()? The "wake-up" ones are _intentionally_ left enabled, while the other ones may be left enabled by mistake. The check is intended to prevent the current behavior from changing (ie. suspend is aborted if any "wake-up" interrupts are pending) and since the platforms only check for the "wake-up" interrupts, it doesn't go any further. Moreover, I think it might introduce a regression if it did. Thanks, Rafael ^ permalink raw reply [flat|nested] 373+ messages in thread
* Re: [RFC][PATCH][1/8] PM: Rework handling of interrupts during suspend-resume (rev. 5) 2009-03-08 10:00 ` [linux-pm] " Rafael J. Wysocki @ 2009-03-08 12:37 ` Alan Stern 2009-03-08 12:37 ` [linux-pm] " Alan Stern 1 sibling, 0 replies; 373+ messages in thread From: Alan Stern @ 2009-03-08 12:37 UTC (permalink / raw) To: Rafael J. Wysocki Cc: Jeremy Fitzhardinge, LKML, Jesse Barnes, Eric W. Biederman, pm list, Ingo Molnar, Linus Torvalds, Thomas Gleixner On Sun, 8 Mar 2009, Rafael J. Wysocki wrote: > > > > So perhaps you're worried about drivers that aren't sufficiently > > > > clever. Or is something deeper going on? > > In other words, why not simply abort the suspend if IRQ_PENDING is set > > for _any_ interrupt during sysdev_suspend()? > > The "wake-up" ones are _intentionally_ left enabled, while the other ones may > be left enabled by mistake. The check is intended to prevent the current > behavior from changing (ie. suspend is aborted if any "wake-up" interrupts > are pending) and since the platforms only check for the "wake-up" interrupts, > it doesn't go any further. Moreover, I think it might introduce a regression > if it did. So it _is_ because you are worried about drivers that aren't sufficiently clever. If the drivers did their job correctly then there wouldn't be any pending non-"wake-up" interrupts to confuse matters. Alan Stern ^ permalink raw reply [flat|nested] 373+ messages in thread
* Re: [linux-pm] [RFC][PATCH][1/8] PM: Rework handling of interrupts during suspend-resume (rev. 5) 2009-03-08 3:53 ` [linux-pm] " Alan Stern 2009-03-08 10:00 ` Rafael J. Wysocki 2009-03-08 10:00 ` [linux-pm] " Rafael J. Wysocki @ 2009-03-08 17:20 ` Linus Torvalds 2009-03-08 20:40 ` Alan Stern 2009-03-08 20:40 ` [linux-pm] " Alan Stern 2009-03-08 17:20 ` Linus Torvalds 3 siblings, 2 replies; 373+ messages in thread From: Linus Torvalds @ 2009-03-08 17:20 UTC (permalink / raw) To: Alan Stern Cc: Rafael J. Wysocki, Jeremy Fitzhardinge, LKML, Jesse Barnes, Thomas Gleixner, Eric W. Biederman, Ingo Molnar, pm list, Arve Hjønnevåg On Sat, 7 Mar 2009, Alan Stern wrote: > > You didn't answer my question. Why bother to distinguish between > "wake-up" interrupts and non-"wake-up" interrupts? > > In other words, why not simply abort the suspend if IRQ_PENDING is set > for _any_ interrupt during sysdev_suspend()? .. because some drivers might not actually shut down the hardware until they get to "suspend_late"? If even then, for that matter - a driver may simply not care, knowing that the hardware will be powered off, and will be re-initialized at resume. The thinking that you have to shut your hardware down at "->suspend()" time is a _disease_. There are literally classes of hardware out there where that would be an outright _bug_, like for a PCI bridge device. For many devices, "suspend()" has to be the phase where you shut down the _external_ stuff (eg for a disk controller, it's when you'd flush and stop your disks), but the controller itself may well be alive until later. Linus ^ permalink raw reply [flat|nested] 373+ messages in thread
* Re: [linux-pm] [RFC][PATCH][1/8] PM: Rework handling of interrupts during suspend-resume (rev. 5) 2009-03-08 17:20 ` Linus Torvalds 2009-03-08 20:40 ` Alan Stern @ 2009-03-08 20:40 ` Alan Stern 2009-03-08 21:37 ` Rafael J. Wysocki ` (3 more replies) 1 sibling, 4 replies; 373+ messages in thread From: Alan Stern @ 2009-03-08 20:40 UTC (permalink / raw) To: Linus Torvalds Cc: Rafael J. Wysocki, Jeremy Fitzhardinge, LKML, Jesse Barnes, Thomas Gleixner, Eric W. Biederman, Ingo Molnar, pm list, Arve Hjønnevåg On Sun, 8 Mar 2009, Linus Torvalds wrote: > On Sat, 7 Mar 2009, Alan Stern wrote: > > > > You didn't answer my question. Why bother to distinguish between > > "wake-up" interrupts and non-"wake-up" interrupts? > > > > In other words, why not simply abort the suspend if IRQ_PENDING is set > > for _any_ interrupt during sysdev_suspend()? > > .. because some drivers might not actually shut down the hardware until > they get to "suspend_late"? If even then, for that matter - a driver may > simply not care, knowing that the hardware will be powered off, and will > be re-initialized at resume. > > The thinking that you have to shut your hardware down at "->suspend()" > time is a _disease_. There are literally classes of hardware out there > where that would be an outright _bug_, like for a PCI bridge device. For > many devices, "suspend()" has to be the phase where you shut down the > _external_ stuff (eg for a disk controller, it's when you'd flush and stop > your disks), but the controller itself may well be alive until later. Yes, certainly. I agree completely. But there is a difference between shutting down the hardware and merely preventing it from generating interrupt requests. If a device remains capable of generating IRQs after its driver's suspend method has run, the driver runs the risk of having its handler called at a time when it isn't prepared to cope correctly. Of course, this will depend on the details of how the driver is written. 
There have been examples in the past of devices that, for one reason or another, _did_ generate IRQs at inconvenient times. The hardware or the BIOS may have done improper initialization, for example. On a shared IRQ this led to interrupt storms. IIRC, the solution was to add a PCI quirk routine to disable IRQ generation at an early stage. Didn't e100 have this problem? Alan Stern ^ permalink raw reply [flat|nested] 373+ messages in thread
* Re: [linux-pm] [RFC][PATCH][1/8] PM: Rework handling of interrupts during suspend-resume (rev. 5) 2009-03-08 20:40 ` [linux-pm] " Alan Stern @ 2009-03-08 21:37 ` Rafael J. Wysocki 2009-03-08 21:37 ` Rafael J. Wysocki ` (2 subsequent siblings) 3 siblings, 0 replies; 373+ messages in thread From: Rafael J. Wysocki @ 2009-03-08 21:37 UTC (permalink / raw) To: Alan Stern Cc: Linus Torvalds, Jeremy Fitzhardinge, LKML, Jesse Barnes, Thomas Gleixner, Eric W. Biederman, Ingo Molnar, pm list, Arve Hjønnevåg On Sunday 08 March 2009, Alan Stern wrote: > On Sun, 8 Mar 2009, Linus Torvalds wrote: > > > On Sat, 7 Mar 2009, Alan Stern wrote: > > > > > > You didn't answer my question. Why bother to distinguish between > > > "wake-up" interrupts and non-"wake-up" interrupts? > > > > > > In other words, why not simply abort the suspend if IRQ_PENDING is set > > > for _any_ interrupt during sysdev_suspend()? > > > > .. because some drivers might not actually shut down the hardware until > > they get to "suspend_late"? If even then, for that matter - a driver may > > simply not care, knowing that the hardware will be powered off, and will > > be re-initialized at resume. > > > > The thinking that you have to shut your hardware down at "->suspend()" > > time is a _disease_. There are literally classes of hardware out there > > where that would be an outright _bug_, like for a PCI bridge device. For > > many devices, "suspend()" has to be the phase where you shut down the > > _external_ stuff (eg for a disk controller, it's when you'd flush and stop > > your disks), but the controller itself may well be alive until later. > > Yes, certainly. I agree completely. > > But there is a difference between shutting down the hardware and merely > preventing it from generating interrupt requests. If a device remains > capable of generating IRQs after its driver's suspend method has run, > the driver runs the risk of having its handler called at a time when it > isn't prepared to cope correctly. 
Of course, this will depend on the > details of how the driver is written. > > There have been examples in the past of devices that, for one reason or > another, _did_ generate IRQs at inconvenient times. The hardware or > the BIOS may have done improper initialization, for example. On a > shared IRQ this led to interrupt storms. Well, we're now trying to fix exactly this problem. :-) > IIRC, the solution was to add a PCI quirk routine to disable IRQ generation > at an early stage. Didn't e100 have this problem? I don't remember, sorry. Thanks, Rafael ^ permalink raw reply [flat|nested] 373+ messages in thread
* Re: [linux-pm] [RFC][PATCH][1/8] PM: Rework handling of interrupts during suspend-resume (rev. 5) 2009-03-08 20:40 ` [linux-pm] " Alan Stern ` (2 preceding siblings ...) 2009-03-09 14:59 ` Linus Torvalds @ 2009-03-09 14:59 ` Linus Torvalds 2009-03-09 15:13 ` Alan Stern 2009-03-09 15:13 ` Alan Stern 3 siblings, 2 replies; 373+ messages in thread From: Linus Torvalds @ 2009-03-09 14:59 UTC (permalink / raw) To: Alan Stern Cc: Rafael J. Wysocki, Jeremy Fitzhardinge, LKML, Jesse Barnes, Thomas Gleixner, Eric W. Biederman, Ingo Molnar, pm list, Arve Hjønnevåg On Sun, 8 Mar 2009, Alan Stern wrote: > > There have been examples in the past of devices that, for one reason or > another, _did_ generate IRQs at inconvenient times. The hardware or > the BIOS may have done improper initialization, for example. On a > shared IRQ this led to interrupt storms. IIRC, the solution was to add > a PCI quirk routine to disable IRQ generation at an early stage. > Didn't e100 have this problem? .. and this is exactly the reason why we've done all these changes. There are tons of drivers that are unable to cope with interrupts that happen after they've done their "pci_set_power_state(PCI_D3hot)". With shared interrupts (and _another_ device still live), they do stupid things like read the interrupt status register, getting all-ones (because the device is dead), and then deciding that that means that they need to handle the interrupt. And that goes on in a loop. Forever. Or they do _that_ part right, but their suspend also free'd some data structure, so now the interrupt handler will follow a NULL pointer and/or scribble to freed memory.
The source of bugs is infinite, and not fixable (because, quite frankly, most device driver writers are very focused on the hardware, and have a hard time thinking about it as part of the bigger system - and even if they happen to test suspend/resume, they probably won't be testing it with shared interrupts, so it will work _for_them_ even if it's totally broken). So what all the PCI changes try to do is to basically not have the driver do the "pci_set_power_state(PCI_D3)" at _all_, and do it in the PCI layer. But more importantly, it needs to be done _after_ interrupts have been disabled for this all to work. And, for exactly the same reason, the PCI layer needs to wake the device up and restore its config space _before_ enabling interrupts again, and _before_ doing any ->resume calls. And that, in turn, means that since we have all these ACPI ordering things, and many cases want to use ACPI to wake things up, and/or have delays etc, we end up actually wanting things like timer interrupts working at that time - but not normal "device" interrupts. Because many delays do need them, even as simple delays as the (fairly short, but not "busy loop" short) one for turning the device back into PCI_D0 again. So this literally explains all the re-ordering, and all the interrupt games we now play in Rafael's patch-set. The _whole_ (and only) point is to make it easier for device drivers, while also changing the environment so that we can call ACPI and we can sleep even before the devices have really resumed (or even early_resume'd). Linus ^ permalink raw reply [flat|nested] 373+ messages in thread
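The all-ones failure mode Linus describes can be illustrated with a small user-space model. This is only a sketch under stated assumptions: `struct toy_dev`, `toy_read_status()` and both handlers are hypothetical names invented for illustration and are not kernel API. The model assumes that a register read from a powered-down device returns all ones, as the message above says.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical device model; none of these names are kernel API. */
struct toy_dev {
	bool     powered;     /* false once the device has been put in D3 */
	uint32_t irq_status;  /* pending-interrupt bits while powered */
};

enum toy_irqreturn { TOY_IRQ_NONE, TOY_IRQ_HANDLED };

/* Model assumption: a read from a powered-down device yields all ones. */
static uint32_t toy_read_status(const struct toy_dev *dev)
{
	assert(dev != NULL);
	return dev->powered ? dev->irq_status : 0xffffffffu;
}

/* Naive handler: treats any non-zero status as "this interrupt is mine". */
static enum toy_irqreturn naive_handler(struct toy_dev *dev)
{
	uint32_t status = toy_read_status(dev);

	if (status == 0)
		return TOY_IRQ_NONE;
	dev->irq_status = 0;  /* "acknowledge"; has no effect on a dead device */
	return TOY_IRQ_HANDLED;
}

/* Defensive handler: all ones is taken to mean the device is not there. */
static enum toy_irqreturn careful_handler(struct toy_dev *dev)
{
	uint32_t status = toy_read_status(dev);

	if (status == 0 || status == 0xffffffffu)
		return TOY_IRQ_NONE;
	dev->irq_status = 0;
	return TOY_IRQ_HANDLED;
}
```

On a shared line, the naive handler keeps claiming an interrupt that actually belongs to the other, still-live device, which is the loop described above. The patch set avoids relying on every driver being written like the careful variant by changing the power state only after device interrupts have been disabled.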
* Re: [linux-pm] [RFC][PATCH][1/8] PM: Rework handling of interrupts during suspend-resume (rev. 5) 2009-03-09 14:59 ` [linux-pm] " Linus Torvalds @ 2009-03-09 15:13 ` Alan Stern 2009-03-09 15:40 ` Linus Torvalds 2009-03-09 15:40 ` [linux-pm] " Linus Torvalds 2009-03-09 15:13 ` Alan Stern 1 sibling, 2 replies; 373+ messages in thread From: Alan Stern @ 2009-03-09 15:13 UTC (permalink / raw) To: Linus Torvalds Cc: Rafael J. Wysocki, Jeremy Fitzhardinge, LKML, Jesse Barnes, Thomas Gleixner, Eric W. Biederman, Ingo Molnar, pm list, Arve Hjønnevåg On Mon, 9 Mar 2009, Linus Torvalds wrote: > On Sun, 8 Mar 2009, Alan Stern wrote: > > > > There have been examples in the past of devices that, for one reason or > > another, _did_ generate IRQs at inconvenient times. The hardware or > > the BIOS may have done improper initialization, for example. On a > > shared IRQ this led to interrupt storms. IIRC, the solution was to add > > a PCI quirk routine to disable IRQ generation at an early stage. > > Didn't e100 have this problem? > > .. and this is exactly the reason why we've done all these changes. > > There are tons of drivers that are unable to cope with interrupts that > happen after they've done their "pci_set_power_state(PCI_D3hot)". > > With shared interrupts (and _another_ device still live), they do stupid > things like read the interrupt status register, getting all-ones (because > the device is dead), and then deciding that that means that they need to > handle the interrupt. And that goes on in a loop. Forever. > > Or they do _that_ part right, but their suspend also free'd some data > structure, so now the interrupt handler will follow a NULL pointer and/or > scribble to freed memory.
The source of bugs is infinite, and not fixable > (because, quite frankly, most device driver writers are very focused on > the hardware, and have a hard time thinking about it as part of the bigger > system - and even if they happen to test suspend/resume, they probably won't > be testing it with shared interrupts, so it will work _for_them_ even if > it's totally broken). > > So what all the PCI changes try to do is to basically not have the driver > do the "pci_set_power_state(PCI_D3)" at _all_, and do it in the PCI layer. > But more importantly, it needs to be done _after_ interrupts have been > disabled for this all to work. And, for exactly the same reason, the PCI > layer needs to wake the device up and restore its config space _before_ > enabling interrupts again, and _before_ doing any ->resume calls. > > And that, in turn, means that since we have all these ACPI ordering > things, and many cases want to use ACPI to wake things up, and/or have > delays etc, we end up actually wanting things like timer interrupts > working at that time - but not normal "device" interrupts. Because many > delays do need them, even as simple delays as the (fairly short, but not > "busy loop" short) one for turning the device back into PCI_D0 again. > > So this literally explains all the re-ordering, and all the interrupt > games we now play in Rafael's patch-set. The _whole_ (and only) point is > to make it easier for device drivers, while also changing the environment > so that we can call ACPI and we can sleep even before the devices have > really resumed (or even early_resume'd). I see. The unstated key point is this: Unsophisticated drivers can still be expected to work if they get an interrupt after their suspend method has run, _provided_ the device is still in D0. Likewise, unsophisticated drivers can be expected to fail if they get an interrupt after the device has been put in D3.
Hence you don't require drivers to disable interrupt generation in their suspend methods, and you do prevent interrupts from being delivered to drivers before changing device power states. And hence you also go to some trouble to distinguish between IRQs which might be received merely because the driver didn't bother to suppress them vs. IRQs which indicate a genuine wakeup request. Got it. Alan Stern ^ permalink raw reply [flat|nested] 373+ messages in thread
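The distinction Alan summarizes (only a pending wake-up IRQ counts as a reason to abort) could be sketched roughly as follows. This is a user-space illustration, not the code from the patch: `struct toy_irq_desc`, the `TOY_IRQ_*` flags and `toy_check_wakeup_irqs()` are hypothetical names, and returning `-EBUSY` is an assumption about how an abort might be signalled.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical status bits, loosely modelled on the flags discussed above. */
#define TOY_IRQ_PENDING 0x1u  /* arrived while device IRQs were held off */
#define TOY_IRQ_WAKEUP  0x2u  /* intentionally left armed as a wake-up source */

struct toy_irq_desc {
	unsigned int status;
};

/*
 * Return -EBUSY if any wake-up interrupt is pending, 0 otherwise.
 * Pending non-wake-up interrupts are deliberately ignored: they may only
 * mean that a driver did not bother to quiesce its device.
 */
static int toy_check_wakeup_irqs(const struct toy_irq_desc *desc, size_t n)
{
	size_t i;

	assert(desc != NULL || n == 0);
	for (i = 0; i < n; i++) {
		unsigned int s = desc[i].status;

		if ((s & TOY_IRQ_WAKEUP) && (s & TOY_IRQ_PENDING))
			return -EBUSY;
	}
	return 0;
}
```

Aborting on any pending interrupt instead would make suspend fail whenever a driver merely left its device able to raise IRQs, which is the regression Rafael mentions earlier in the thread.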
* Re: [linux-pm] [RFC][PATCH][1/8] PM: Rework handling of interrupts during suspend-resume (rev. 5) 2009-03-09 15:13 ` Alan Stern 2009-03-09 15:40 ` Linus Torvalds @ 2009-03-09 15:40 ` Linus Torvalds 1 sibling, 0 replies; 373+ messages in thread From: Linus Torvalds @ 2009-03-09 15:40 UTC (permalink / raw) To: Alan Stern Cc: Rafael J. Wysocki, Jeremy Fitzhardinge, LKML, Jesse Barnes, Thomas Gleixner, Eric W. Biederman, Ingo Molnar, pm list, Arve Hjønnevåg On Mon, 9 Mar 2009, Alan Stern wrote: > > I see. The unstated key point is this: > > Unsophisticated drivers [...] Another key point is: - _un_sophisticated is the norm, and anybody who expects otherwise is living in some odd la-la-land together with his or her pink unicorn and endless supplies of quaaludes. The thing is, we have about a metric sh*tload of drivers, and many of them are effectively written by people who don't really do kernel work, and are basically unmaintained in the long run (ie they may be maintained while written, but two years down the line they have a couple of hundred users and nobody who really cares about it, because the original author long since moved on to fancier hardware). Linus ^ permalink raw reply [flat|nested] 373+ messages in thread
* Re: [RFC][PATCH][1/8] PM: Rework handling of interrupts during suspend-resume (rev. 5) 2009-03-07 10:20 ` Rafael J. Wysocki (?) (?) @ 2009-03-07 16:51 ` Alan Stern -1 siblings, 0 replies; 373+ messages in thread From: Alan Stern @ 2009-03-07 16:51 UTC (permalink / raw) To: Rafael J. Wysocki Cc: Jeremy Fitzhardinge, Arve Hjønnevåg, LKML, Jesse Barnes, Eric W. Biederman, pm list, Thomas Gleixner, Linus Torvalds, Ingo Molnar On Sat, 7 Mar 2009, Rafael J. Wysocki wrote: > From: Rafael J. Wysocki <rjw@sisk.pl> > > Introduce two helper functions allowing us to prevent device drivers > from getting any interrupts (without disabling interrupts on the CPU) > during suspend (or hibernation) and to make them start to receive > interrupts again during the subsequent resume, respectively. These > functions make it possible to keep timer interrupts enabled while the > "late" suspend and "early" resume callbacks provided by device > drivers are being executed. > > Use these functions to rework the handling of interrupts during > suspend (hibernation) and resume. Namely, interrupts will only be > disabled on the CPU right before suspending sysdevs, while device > drivers will be prevented from receiving interrupts, with the help of > the new helper function, before their "late" suspend callbacks run > (and analogously during resume). > > In addition, since the device interrupts are now disabled before the > CPU has turned all interrupts off and the CPU will ACK the interrupts > setting the IRQ_PENDING bit for them, check in sysdev_suspend() if > any wake-up interrupts are pending and abort suspend if that's the > case. One thing about this isn't clear: the distinction between "wake-up" interrupts and other interrupts.
In an ideal world, the only pending interrupts during sysdev_suspend would be wake-up interrupts, because drivers would have prevented their devices from generating any other kind of IRQ and would have done all the necessary synchronization as part of their suspend (_not_ suspend_late) methods. Thus there would be no need to distinguish between wake-up and non-wake-up interrupts. So perhaps you're worried about drivers that aren't sufficiently clever. Or is something deeper going on? Alan Stern ^ permalink raw reply [flat|nested] 373+ messages in thread
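For readers trying to picture the two helper functions from the quoted patch description, here is a minimal user-space sketch of the idea: mask every device interrupt line while leaving timer lines alone, and on resume re-enable only the lines that the suspend step itself masked. All names (`struct toy_irq`, `TOY_IRQF_TIMER`, `toy_suspend_device_irqs()`, `toy_resume_device_irqs()`) are hypothetical stand-ins, and the real patch may differ in its details.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_IRQF_TIMER 0x1u  /* hypothetical stand-in for a timer-IRQ flag */

struct toy_irq {
	unsigned int flags;
	bool enabled;    /* line currently delivers interrupts to its handler */
	bool suspended;  /* masked by toy_suspend_device_irqs(), not by a driver */
};

/* Mask every enabled non-timer line and remember that we did so. */
static void toy_suspend_device_irqs(struct toy_irq *irqs, size_t n)
{
	size_t i;

	assert(irqs != NULL || n == 0);
	for (i = 0; i < n; i++) {
		if ((irqs[i].flags & TOY_IRQF_TIMER) || !irqs[i].enabled)
			continue;
		irqs[i].enabled = false;
		irqs[i].suspended = true;
	}
}

/* Unmask only the lines masked above; driver-disabled lines stay off. */
static void toy_resume_device_irqs(struct toy_irq *irqs, size_t n)
{
	size_t i;

	assert(irqs != NULL || n == 0);
	for (i = 0; i < n; i++) {
		if (!irqs[i].suspended)
			continue;
		irqs[i].suspended = false;
		irqs[i].enabled = true;
	}
}
```

Tracking `suspended` separately from `enabled` is a design choice of this sketch: it keeps a line that a driver had already disabled from being switched back on by the resume step.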
* [RFC][PATCH][2/8] PM: Change suspend code ordering
  2009-03-07 10:19 ` Rafael J. Wysocki
  2009-03-07 10:20 ` Rafael J. Wysocki
@ 2009-03-07 10:21 ` Rafael J. Wysocki
  2009-03-07 10:21 ` Rafael J. Wysocki
  ` (14 subsequent siblings)
  16 siblings, 0 replies; 373+ messages in thread
From: Rafael J. Wysocki @ 2009-03-07 10:21 UTC (permalink / raw)
  To: LKML
  Cc: Linus Torvalds, Ingo Molnar, Eric W. Biederman, Benjamin Herrenschmidt, Jeremy Fitzhardinge, pm list, Len Brown, Jesse Barnes, Thomas Gleixner, Arve Hjønnevåg

From: Rafael J. Wysocki <rjw@sisk.pl>

Change the ordering of the suspend core code so that the platform
"prepare" callback is executed and the nonboot CPUs are disabled
after calling device drivers' "late suspend" methods.

This change will allow us to rework the PCI PM core so that the power
state of devices is changed in the "late" phase of suspend (and
analogously in the "early" phase of resume), which in turn will allow
us to avoid the race condition where a device using shared interrupts
is put into a low power state with interrupts enabled and then an
interrupt (for another device) comes in and confuses its driver.

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
---
 kernel/power/main.c | 38 ++++++++++++++++++++++----------------
 1 file changed, 22 insertions(+), 16 deletions(-)

Index: linux-2.6/kernel/power/main.c
===================================================================
--- linux-2.6.orig/kernel/power/main.c
+++ linux-2.6/kernel/power/main.c
@@ -297,6 +297,19 @@ static int suspend_enter(suspend_state_t
 		goto Done;
 	}
 
+	if (suspend_ops->prepare) {
+		error = suspend_ops->prepare();
+		if (error)
+			goto Power_up_devices;
+	}
+
+	if (suspend_test(TEST_PLATFORM))
+		goto Platfrom_finish;
+
+	error = disable_nonboot_cpus();
+	if (error || suspend_test(TEST_CPUS))
+		goto Enable_cpus;
+
 	arch_suspend_disable_irqs();
 	BUG_ON(!irqs_disabled());
 
@@ -310,6 +323,14 @@ static int suspend_enter(suspend_state_t
 	arch_suspend_enable_irqs();
 	BUG_ON(irqs_disabled());
 
+ Enable_cpus:
+	enable_nonboot_cpus();
+
+ Platfrom_finish:
+	if (suspend_ops->finish)
+		suspend_ops->finish();
+
+ Power_up_devices:
 	device_power_up(PMSG_RESUME);
 
  Done:
@@ -346,23 +367,8 @@ int suspend_devices_and_enter(suspend_st
 	if (suspend_test(TEST_DEVICES))
 		goto Recover_platform;
 
-	if (suspend_ops->prepare) {
-		error = suspend_ops->prepare();
-		if (error)
-			goto Resume_devices;
-	}
-
-	if (suspend_test(TEST_PLATFORM))
-		goto Finish;
+	suspend_enter(state);
 
-	error = disable_nonboot_cpus();
-	if (!error && !suspend_test(TEST_CPUS))
-		suspend_enter(state);
-
-	enable_nonboot_cpus();
- Finish:
-	if (suspend_ops->finish)
-		suspend_ops->finish();
  Resume_devices:
 	suspend_test_start();
 	device_resume(PMSG_RESUME);

^ permalink raw reply [flat|nested] 373+ messages in thread
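The reordered suspend_enter() relies on goto-based unwinding: every step that has been attempted is undone in reverse order, and a failure jumps to the label that undoes only what was done so far. The following userspace model is a rough sketch of that control flow and nothing more. `model_suspend_enter()`, `enum fail_at` and the step names written to the log are hypothetical, and the suspend_test() shortcuts from the patch are left out.

```c
#include <assert.h>
#include <string.h>

/* Which modelled step should fail (hypothetical, for illustration only). */
enum fail_at { FAIL_NONE, FAIL_PREPARE, FAIL_CPUS };

/* Append a step name to the space-separated log, truncating if full. */
static void step(char *log, size_t cap, const char *name)
{
	if (log[0] != '\0')
		strncat(log, " ", cap - strlen(log) - 1);
	strncat(log, name, cap - strlen(log) - 1);
}

/*
 * Model of the ordering after the patch: "late" power-down first, then
 * platform prepare, then nonboot CPUs off, then interrupts off.
 * Returns 0 on success, -1 if the selected step fails.
 */
int model_suspend_enter(enum fail_at fail, char *log, size_t cap)
{
	int error = 0;

	assert(log != NULL && cap > 0);
	log[0] = '\0';

	step(log, cap, "power_down");	/* device_power_down() */

	step(log, cap, "prepare");	/* suspend_ops->prepare() */
	if (fail == FAIL_PREPARE) {
		error = -1;
		goto Power_up_devices;	/* finish() is skipped, as in the patch */
	}

	step(log, cap, "cpus_off");	/* disable_nonboot_cpus() attempted */
	if (fail == FAIL_CPUS) {
		error = -1;
		goto Enable_cpus;
	}

	step(log, cap, "irqs_off");
	step(log, cap, "enter");
	step(log, cap, "irqs_on");

 Enable_cpus:
	step(log, cap, "cpus_on");	/* enable_nonboot_cpus() */
	step(log, cap, "finish");	/* suspend_ops->finish() */

 Power_up_devices:
	step(log, cap, "power_up");	/* device_power_up() */
	return error;
}
```

Calling `model_suspend_enter(FAIL_CPUS, log, sizeof(log))` leaves a log in which the CPU re-enable, platform finish and device power-up steps still run, while the interrupt-off section is never entered.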
* [RFC][PATCH][3/8] PM: Change hibernation code ordering
  2009-03-07 10:19 ` Rafael J. Wysocki
  ` (2 preceding siblings ...)
  2009-03-07 10:21 ` Rafael J. Wysocki
@ 2009-03-07 10:22 ` Rafael J. Wysocki
  2009-03-07 10:22 ` Rafael J. Wysocki
  ` (12 subsequent siblings)
  16 siblings, 0 replies; 373+ messages in thread
From: Rafael J. Wysocki @ 2009-03-07 10:22 UTC (permalink / raw)
  To: LKML
  Cc: Arve, Jeremy Fitzhardinge, Jesse Barnes, Thomas Gleixner, Eric W. Biederman, Ingo Molnar, Linus Torvalds, pm list

From: Rafael J. Wysocki <rjw@sisk.pl>

Change the ordering of the hibernation core code so that the platform
"prepare" callbacks are executed and the nonboot CPUs are disabled
after calling device drivers' "late suspend" methods.

This change (along with the previous analogous change of the suspend
core code) will allow us to rework the PCI PM core so that the power
state of devices is changed in the "late" phase of suspend (and
analogously in the "early" phase of resume), which in turn will allow
us to avoid the race condition where a device using shared interrupts
is put into a low power state with interrupts enabled and then an
interrupt (for another device) comes in and confuses its driver.

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
---
 kernel/power/disk.c | 109 +++++++++++++++++++++++++++++-----------------------
 1 file changed, 61 insertions(+), 48 deletions(-)

Index: linux-2.6/kernel/power/disk.c
===================================================================
--- linux-2.6.orig/kernel/power/disk.c
+++ linux-2.6/kernel/power/disk.c
@@ -228,13 +228,22 @@ static int create_image(int platform_mod
 		goto Unlock;
 	}
 
+	error = platform_pre_snapshot(platform_mode);
+	if (error || hibernation_test(TEST_PLATFORM))
+		goto Platform_finish;
+
+	error = disable_nonboot_cpus();
+	if (error || hibernation_test(TEST_CPUS)
+	    || hibernation_testmode(HIBERNATION_TEST))
+		goto Enable_cpus;
+
 	local_irq_disable();
 
 	sysdev_suspend(PMSG_FREEZE);
 	if (error) {
 		printk(KERN_ERR "PM: Some devices failed to power down, "
 			"aborting hibernation\n");
-		goto Power_up_devices;
+		goto Enable_irqs;
 	}
 
 	if (hibernation_test(TEST_CORE))
@@ -250,15 +259,22 @@ static int create_image(int platform_mod
 	restore_processor_state();
 	if (!in_suspend)
 		platform_leave(platform_mode);
+
  Power_up:
 	sysdev_resume();
 	/* NOTE: device_power_up() is just a resume() for devices
 	 * that suspended with irqs off ... no overall powerup.
 	 */
 
- Power_up_devices:
+ Enable_irqs:
 	local_irq_enable();
 
+ Enable_cpus:
+	enable_nonboot_cpus();
+
+ Platform_finish:
+	platform_finish(platform_mode);
+
 	device_power_up(in_suspend ?
 		(error ? PMSG_RECOVER : PMSG_THAW) : PMSG_RESTORE);
 
@@ -298,25 +314,9 @@ int hibernation_snapshot(int platform_mo
 	if (hibernation_test(TEST_DEVICES))
 		goto Recover_platform;
 
-	error = platform_pre_snapshot(platform_mode);
-	if (error || hibernation_test(TEST_PLATFORM))
-		goto Finish;
-
-	error = disable_nonboot_cpus();
-	if (!error) {
-		if (hibernation_test(TEST_CPUS))
-			goto Enable_cpus;
-
-		if (hibernation_testmode(HIBERNATION_TEST))
-			goto Enable_cpus;
+	error = create_image(platform_mode);
+	/* Control returns here after successful restore */
 
-		error = create_image(platform_mode);
-		/* Control returns here after successful restore */
-	}
- Enable_cpus:
-	enable_nonboot_cpus();
- Finish:
-	platform_finish(platform_mode);
  Resume_devices:
 	device_resume(in_suspend ?
 		(error ? PMSG_RECOVER : PMSG_THAW) : PMSG_RESTORE);
@@ -338,7 +338,7 @@ int hibernation_snapshot(int platform_mo
  * kernel.
  */
 
-static int resume_target_kernel(void)
+static int resume_target_kernel(bool platform_mode)
 {
 	int error;
 
@@ -351,9 +351,20 @@ static int resume_target_kernel(void)
 		goto Unlock;
 	}
 
+	error = platform_pre_restore(platform_mode);
+	if (error)
+		goto Cleanup;
+
+	error = disable_nonboot_cpus();
+	if (error)
+		goto Enable_cpus;
+
 	local_irq_disable();
 
-	sysdev_suspend(PMSG_QUIESCE);
+	error = sysdev_suspend(PMSG_QUIESCE);
+	if (error)
+		goto Enable_irqs;
+
 	/* We'll ignore saved state, but this gets preempt count (etc) right */
 	save_processor_state();
 	error = restore_highmem();
@@ -379,8 +390,15 @@ static int resume_target_kernel(void)
 
 	sysdev_resume();
 
+ Enable_irqs:
 	local_irq_enable();
 
+ Enable_cpus:
+	enable_nonboot_cpus();
+
+ Cleanup:
+	platform_restore_cleanup(platform_mode);
+
 	device_power_up(PMSG_RECOVER);
 
  Unlock:
@@ -405,19 +423,10 @@ int hibernation_restore(int platform_mod
 	pm_prepare_console();
 	suspend_console();
 	error = device_suspend(PMSG_QUIESCE);
-	if (error)
-		goto Finish;
-
-	error = platform_pre_restore(platform_mode);
 	if (!error) {
-		error = disable_nonboot_cpus();
-		if (!error)
-			error = resume_target_kernel();
-		enable_nonboot_cpus();
+		error = resume_target_kernel(platform_mode);
+		device_resume(PMSG_RECOVER);
 	}
-	platform_restore_cleanup(platform_mode);
-	device_resume(PMSG_RECOVER);
- Finish:
 	resume_console();
 	pm_restore_console();
 	return error;
@@ -453,34 +462,38 @@ int hibernation_platform_enter(void)
 		goto Resume_devices;
 	}
 
+	device_pm_lock();
+
+	error = device_power_down(PMSG_HIBERNATE);
+	if (error)
+		goto Unlock;
+
 	error = hibernation_ops->prepare();
 	if (error)
-		goto Resume_devices;
+		goto Platofrm_finish;
 
 	error = disable_nonboot_cpus();
 	if (error)
-		goto Finish;
-
-	device_pm_lock();
-
-	error = device_power_down(PMSG_HIBERNATE);
-	if (!error) {
-		local_irq_disable();
-		sysdev_suspend(PMSG_HIBERNATE);
-		hibernation_ops->enter();
-		/* We should never get here */
-		while (1);
-	}
+		goto Platofrm_finish;
 
-	device_pm_unlock();
+	local_irq_disable();
+	sysdev_suspend(PMSG_HIBERNATE);
+	hibernation_ops->enter();
+	/* We should never get here */
+	while (1);
 
 	/*
 	 * We don't need to reenable the nonboot CPUs or resume consoles, since
 	 * the system is going to be halted anyway.
 	 */
- Finish:
+ Platofrm_finish:
 	hibernation_ops->finish();
 
+	device_power_up(PMSG_RESTORE);
+
+ Unlock:
+	device_pm_unlock();
+
  Resume_devices:
 	entering_platform_hibernation = false;
 	device_resume(PMSG_RESTORE);

^ permalink raw reply [flat|nested] 373+ messages in thread
* [RFC][PATCH][4/8] kexec: Change kexec jump code ordering
  2009-03-07 10:19 ` Rafael J. Wysocki
  ` (4 preceding siblings ...)
  2009-03-07 10:22 ` Rafael J. Wysocki
@ 2009-03-07 10:23 ` Rafael J. Wysocki
  2009-03-07 10:23 ` Rafael J. Wysocki
  ` (10 subsequent siblings)
  16 siblings, 0 replies; 373+ messages in thread
From: Rafael J. Wysocki @ 2009-03-07 10:23 UTC (permalink / raw)
  To: LKML
  Cc: Arve, Jeremy Fitzhardinge, Jesse Barnes, Thomas Gleixner, Eric W. Biederman, Ingo Molnar, Linus Torvalds, pm list

From: Rafael J. Wysocki <rjw@sisk.pl>

Change the ordering of the kexec jump code so that the nonboot CPUs
are disabled after calling device drivers' "late suspend" methods.

This change reflects the recent modifications of the power management
code that is also used by kexec jump.

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
---
 kernel/kexec.c | 19 +++++++++----------
 1 file changed, 9 insertions(+), 10 deletions(-)

Index: linux-2.6/kernel/kexec.c
===================================================================
--- linux-2.6.orig/kernel/kexec.c
+++ linux-2.6/kernel/kexec.c
@@ -1450,9 +1450,6 @@ int kernel_kexec(void)
 		error = device_suspend(PMSG_FREEZE);
 		if (error)
 			goto Resume_console;
-		error = disable_nonboot_cpus();
-		if (error)
-			goto Resume_devices;
 		device_pm_lock();
 		/* At this point, device_suspend() has been called,
 		 * but *not* device_power_down(). We *must*
@@ -1463,13 +1460,15 @@ int kernel_kexec(void)
 		 */
 		error = device_power_down(PMSG_FREEZE);
 		if (error)
-			goto Unlock_pm;
-
+			goto Resume_devices;
+		error = disable_nonboot_cpus();
+		if (error)
+			goto Enable_cpus;
 		local_irq_disable();
 		/* Suspend system devices */
 		error = sysdev_suspend(PMSG_FREEZE);
 		if (error)
-			goto Power_up_devices;
+			goto Enable_irqs;
 	} else
 #endif
 	{
@@ -1483,13 +1482,13 @@ int kernel_kexec(void)
 #ifdef CONFIG_KEXEC_JUMP
 	if (kexec_image->preserve_context) {
 		sysdev_resume();
- Power_up_devices:
+ Enable_irqs:
 		local_irq_enable();
-		device_power_up(PMSG_RESTORE);
- Unlock_pm:
-		device_pm_unlock();
+ Enable_cpus:
 		enable_nonboot_cpus();
+		device_power_up(PMSG_RESTORE);
  Resume_devices:
+		device_pm_unlock();
 		device_resume(PMSG_RESTORE);
  Resume_console:
 		resume_console();

^ permalink raw reply [flat|nested] 373+ messages in thread
* [RFC][PATCH][5/8] PCI PM: Consistently use variable name "error" for pm call return values
  2009-03-07 10:19 ` Rafael J. Wysocki
  ` (6 preceding siblings ...)
  2009-03-07 10:23 ` Rafael J. Wysocki
@ 2009-03-07 10:24 ` Rafael J. Wysocki
  2009-03-07 10:24 ` Rafael J. Wysocki
  ` (8 subsequent siblings)
  16 siblings, 0 replies; 373+ messages in thread
From: Rafael J. Wysocki @ 2009-03-07 10:24 UTC (permalink / raw)
  To: LKML
  Cc: Linus Torvalds, Ingo Molnar, Eric W. Biederman, Benjamin Herrenschmidt, Jeremy Fitzhardinge, pm list, Len Brown, Jesse Barnes, Thomas Gleixner, Arve Hjønnevåg, Frans Pop

From: Frans Pop <elendil@planet.nl>

I noticed two functions use a variable "i" to store the return value
of PM function calls while the rest of the file uses "error". As "i"
normally indicates a counter of some sort it seems better to keep
this consistent.

Signed-off-by: Frans Pop <elendil@planet.nl>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
---
 drivers/pci/pci-driver.c | 20 ++++++++++----------
 1 file changed, 10 insertions(+), 10 deletions(-)

Index: linux-2.6/drivers/pci/pci-driver.c
===================================================================
--- linux-2.6.orig/drivers/pci/pci-driver.c
+++ linux-2.6/drivers/pci/pci-driver.c
@@ -352,17 +352,17 @@ static int pci_legacy_suspend(struct dev
 {
 	struct pci_dev * pci_dev = to_pci_dev(dev);
 	struct pci_driver * drv = pci_dev->driver;
-	int i = 0;
+	int error = 0;
 
 	if (drv && drv->suspend) {
 		pci_power_t prev = pci_dev->current_state;
 
 		pci_dev->state_saved = false;
 
-		i = drv->suspend(pci_dev, state);
-		suspend_report_result(drv->suspend, i);
-		if (i)
-			return i;
+		error = drv->suspend(pci_dev, state);
+		suspend_report_result(drv->suspend, error);
+		if (error)
+			return error;
 
 		if (pci_dev->state_saved)
 			goto Fixup;
@@ -385,20 +385,20 @@ static int pci_legacy_suspend(struct dev
  Fixup:
 	pci_fixup_device(pci_fixup_suspend, pci_dev);
 
-	return i;
+	return error;
 }
 
 static int pci_legacy_suspend_late(struct device *dev, pm_message_t state)
 {
 	struct pci_dev * pci_dev = to_pci_dev(dev);
 	struct pci_driver * drv = pci_dev->driver;
-	int i = 0;
+	int error = 0;
 
 	if (drv && drv->suspend_late) {
-		i = drv->suspend_late(pci_dev, state);
-		suspend_report_result(drv->suspend_late, i);
+		error = drv->suspend_late(pci_dev, state);
+		suspend_report_result(drv->suspend_late, error);
 	}
-	return i;
+	return error;
 }
 
 static int pci_legacy_resume_early(struct device *dev)

^ permalink raw reply [flat|nested] 373+ messages in thread
* [RFC][PATCH][6/8] PCI PM: Use pci_set_power_state during early resume
  2009-03-07 10:19 ` Rafael J. Wysocki
  ` (8 preceding siblings ...)
  2009-03-07 10:24 ` Rafael J. Wysocki
@ 2009-03-07 10:25 ` Rafael J. Wysocki
  2009-03-07 10:25 ` Rafael J. Wysocki
  ` (6 subsequent siblings)
  16 siblings, 0 replies; 373+ messages in thread
From: Rafael J. Wysocki @ 2009-03-07 10:25 UTC (permalink / raw)
  To: LKML
  Cc: Linus Torvalds, Ingo Molnar, Eric W. Biederman, Benjamin Herrenschmidt, Jeremy Fitzhardinge, pm list, Len Brown, Jesse Barnes, Thomas Gleixner, Arve Hjønnevåg

From: Rafael J. Wysocki <rjw@sisk.pl>

Once we have allowed timer interrupts to be enabled during the early
phase of resuming devices, we are now able to use the generic
pci_set_power_state() to put PCI devices into D0 at that time. Then,
the platform-specific PM code will have a chance to handle devices
that don't implement the native PCI PM or that require some
additional, platform-specific operations to be carried out to power
them up. Also, by doing this we can simplify the code quite a bit.

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
---
 drivers/pci/pci.c | 48 +++++++++---------------------------------------
 1 file changed, 9 insertions(+), 39 deletions(-)

Index: linux-2.6/drivers/pci/pci.c
===================================================================
--- linux-2.6.orig/drivers/pci/pci.c
+++ linux-2.6/drivers/pci/pci.c
@@ -426,7 +426,6 @@ static inline int platform_pci_sleep_wak
  * given PCI device
  * @dev: PCI device to handle.
  * @state: PCI power state (D0, D1, D2, D3hot) to put the device into.
- * @wait: If 'true', wait for the device to change its power state
  *
  * RETURN VALUE:
  * -EINVAL if the requested state is invalid.
@@ -435,8 +434,7 @@ static inline int platform_pci_sleep_wak
  * 0 if device already is in the requested state.
  * 0 if device's power state has been successfully changed.
  */
-static int
-pci_raw_set_power_state(struct pci_dev *dev, pci_power_t state, bool wait)
+static int pci_raw_set_power_state(struct pci_dev *dev, pci_power_t state)
 {
 	u16 pmcsr;
 	bool need_restore = false;
@@ -481,10 +479,8 @@ pci_raw_set_power_state(struct pci_dev *
 		break;
 	case PCI_UNKNOWN: /* Boot-up */
 		if ((pmcsr & PCI_PM_CTRL_STATE_MASK) == PCI_D3hot
-		 && !(pmcsr & PCI_PM_CTRL_NO_SOFT_RESET)) {
+		 && !(pmcsr & PCI_PM_CTRL_NO_SOFT_RESET))
 			need_restore = true;
-			wait = true;
-		}
 		/* Fall-through: force to D0 */
 	default:
 		pmcsr = 0;
@@ -494,9 +490,6 @@ pci_raw_set_power_state(struct pci_dev *
 	/* enter specified state */
 	pci_write_config_word(dev, dev->pm_cap + PCI_PM_CTRL, pmcsr);
 
-	if (!wait)
-		return 0;
-
 	/* Mandatory power management transition delays */
 	/* see PCI PM 1.1 5.6.1 table 18 */
 	if (state == PCI_D3hot || dev->current_state == PCI_D3hot)
@@ -521,7 +514,7 @@ pci_raw_set_power_state(struct pci_dev *
 	if (need_restore)
 		pci_restore_bars(dev);
 
-	if (wait && dev->bus->self)
+	if (dev->bus->self)
 		pcie_aspm_pm_state_change(dev->bus->self);
 
 	return 0;
@@ -591,7 +584,7 @@ int pci_set_power_state(struct pci_dev *
 	if (state == PCI_D3hot && (dev->dev_flags & PCI_DEV_FLAGS_NO_D3))
 		return 0;
 
-	error = pci_raw_set_power_state(dev, state, true);
+	error = pci_raw_set_power_state(dev, state);
 
 	if (state > PCI_D0 && platform_pci_power_manageable(dev)) {
 		/* Allow the platform to finalize the transition */
@@ -1390,37 +1383,14 @@ void pci_allocate_cap_save_buffers(struc
  */
 int pci_restore_standard_config(struct pci_dev *dev)
 {
-	pci_power_t prev_state;
-	int error;
-
-	pci_update_current_state(dev, PCI_D0);
-
-	prev_state = dev->current_state;
-	if (prev_state == PCI_D0)
-		goto Restore;
-
-	error = pci_raw_set_power_state(dev, PCI_D0, false);
-	if (error)
-		return error;
+	pci_update_current_state(dev, PCI_UNKNOWN);
 
-	/*
-	 * This assumes that we won't get a bus in B2 or B3 from the BIOS, but
-	 * we've made this assumption forever and it appears to be universally
-	 * satisfied.
-	 */
-	switch(prev_state) {
-	case PCI_D3cold:
-	case PCI_D3hot:
-		mdelay(pci_pm_d3_delay);
-		break;
-	case PCI_D2:
-		udelay(PCI_PM_D2_DELAY);
-		break;
+	if (dev->current_state != PCI_D0) {
+		int error = pci_set_power_state(dev, PCI_D0);
+		if (error)
+			return error;
 	}
 
-	pci_update_current_state(dev, PCI_D0);
-
- Restore:
 	return dev->state_saved ? pci_restore_state(dev) : 0;
 }
 

^ permalink raw reply [flat|nested] 373+ messages in thread
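After this patch the restore path no longer open-codes the transition delays; it goes through the generic state-setting routine, which always waits. A small userspace model of that flow might look as follows. It is a sketch under assumptions, not the kernel code: all `model_*` names are hypothetical, the D2 branch of the delay logic is not visible in the quoted hunks, and the 10 ms and 200 us figures are assumed values for the D3hot and D2 transition delays rather than numbers taken from this thread.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified power states (no D3cold, no "unknown"). */
enum model_state { MODEL_D0, MODEL_D1, MODEL_D2, MODEL_D3HOT };

struct model_dev {
	enum model_state current_state;
	int state_saved;	/* config space was saved at suspend time */
	int restored;		/* set once the saved state is written back */
	unsigned int waited_us;	/* total delay the model has "slept" */
};

/* Delay owed for a transition; the figures are assumptions (see above). */
unsigned int model_transition_delay_us(enum model_state from,
				       enum model_state to)
{
	if (from == MODEL_D3HOT || to == MODEL_D3HOT)
		return 10000;
	if (from == MODEL_D2 || to == MODEL_D2)
		return 200;
	return 0;
}

/* Always waits: there is no 'wait' flag any more. */
int model_set_power_state(struct model_dev *dev, enum model_state state)
{
	assert(dev != NULL);

	if (dev->current_state == state)
		return 0;

	dev->waited_us += model_transition_delay_us(dev->current_state, state);
	dev->current_state = state;
	return 0;
}

/* Shape of the simplified restore: force D0, then restore saved state. */
int model_restore_standard_config(struct model_dev *dev)
{
	assert(dev != NULL);

	if (dev->current_state != MODEL_D0) {
		int error = model_set_power_state(dev, MODEL_D0);
		if (error)
			return error;
	}

	if (dev->state_saved)
		dev->restored = 1;
	return 0;
}
```

The design point the model tries to capture is that sleeping for the full delay is acceptable here only because the early resume phase is now allowed to sleep.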
* [RFC][PATCH][6/8] PCI PM: Use pci_set_power_state during early resume 2009-03-07 10:19 ` Rafael J. Wysocki ` (9 preceding siblings ...) 2009-03-07 10:25 ` [RFC][PATCH][6/8] PCI PM: Use pci_set_power_state during early resume Rafael J. Wysocki @ 2009-03-07 10:25 ` Rafael J. Wysocki 2009-03-07 10:26 ` [RFC][PATCH][7/8] PCI PM: Move pci_restore_standard_config to pci-driver.c Rafael J. Wysocki ` (5 subsequent siblings) 16 siblings, 0 replies; 373+ messages in thread From: Rafael J. Wysocki @ 2009-03-07 10:25 UTC (permalink / raw) To: LKML Cc: Arve, Jeremy Fitzhardinge, Jesse Barnes, Thomas Gleixner, Eric W. Biederman, Ingo Molnar, Linus Torvalds, pm list From: Rafael J. Wysocki <rjw@sisk.pl> Once we have allowed timer interrupts to be enabled during the early phase of resuming devices, we are now able to use the generic pci_set_power_state() to put PCI devices into D0 at that time. Then, the platform-specific PM code will have a chance to handle devices that don't implement the native PCI PM or that require some additional, platform-specific operations to be carried out to power them up. Also, by doing this we can simplify the code quite a bit. Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> --- drivers/pci/pci.c | 48 +++++++++--------------------------------------- 1 file changed, 9 insertions(+), 39 deletions(-) Index: linux-2.6/drivers/pci/pci.c =================================================================== --- linux-2.6.orig/drivers/pci/pci.c +++ linux-2.6/drivers/pci/pci.c @@ -426,7 +426,6 @@ static inline int platform_pci_sleep_wak * given PCI device * @dev: PCI device to handle. * @state: PCI power state (D0, D1, D2, D3hot) to put the device into. - * @wait: If 'true', wait for the device to change its power state * * RETURN VALUE: * -EINVAL if the requested state is invalid. @@ -435,8 +434,7 @@ static inline int platform_pci_sleep_wak * 0 if device already is in the requested state. * 0 if device's power state has been successfully changed. 
*/ -static int -pci_raw_set_power_state(struct pci_dev *dev, pci_power_t state, bool wait) +static int pci_raw_set_power_state(struct pci_dev *dev, pci_power_t state) { u16 pmcsr; bool need_restore = false; @@ -481,10 +479,8 @@ pci_raw_set_power_state(struct pci_dev * break; case PCI_UNKNOWN: /* Boot-up */ if ((pmcsr & PCI_PM_CTRL_STATE_MASK) == PCI_D3hot - && !(pmcsr & PCI_PM_CTRL_NO_SOFT_RESET)) { + && !(pmcsr & PCI_PM_CTRL_NO_SOFT_RESET)) need_restore = true; - wait = true; - } /* Fall-through: force to D0 */ default: pmcsr = 0; @@ -494,9 +490,6 @@ pci_raw_set_power_state(struct pci_dev * /* enter specified state */ pci_write_config_word(dev, dev->pm_cap + PCI_PM_CTRL, pmcsr); - if (!wait) - return 0; - /* Mandatory power management transition delays */ /* see PCI PM 1.1 5.6.1 table 18 */ if (state == PCI_D3hot || dev->current_state == PCI_D3hot) @@ -521,7 +514,7 @@ pci_raw_set_power_state(struct pci_dev * if (need_restore) pci_restore_bars(dev); - if (wait && dev->bus->self) + if (dev->bus->self) pcie_aspm_pm_state_change(dev->bus->self); return 0; @@ -591,7 +584,7 @@ int pci_set_power_state(struct pci_dev * if (state == PCI_D3hot && (dev->dev_flags & PCI_DEV_FLAGS_NO_D3)) return 0; - error = pci_raw_set_power_state(dev, state, true); + error = pci_raw_set_power_state(dev, state); if (state > PCI_D0 && platform_pci_power_manageable(dev)) { /* Allow the platform to finalize the transition */ @@ -1390,37 +1383,14 @@ void pci_allocate_cap_save_buffers(struc */ int pci_restore_standard_config(struct pci_dev *dev) { - pci_power_t prev_state; - int error; - - pci_update_current_state(dev, PCI_D0); - - prev_state = dev->current_state; - if (prev_state == PCI_D0) - goto Restore; - - error = pci_raw_set_power_state(dev, PCI_D0, false); - if (error) - return error; + pci_update_current_state(dev, PCI_UNKNOWN); - /* - * This assumes that we won't get a bus in B2 or B3 from the BIOS, but - * we've made this assumption forever and it appears to be universally - * satisfied. 
- */ - switch(prev_state) { - case PCI_D3cold: - case PCI_D3hot: - mdelay(pci_pm_d3_delay); - break; - case PCI_D2: - udelay(PCI_PM_D2_DELAY); - break; + if (dev->current_state != PCI_D0) { + int error = pci_set_power_state(dev, PCI_D0); + if (error) + return error; } - pci_update_current_state(dev, PCI_D0); - - Restore: return dev->state_saved ? pci_restore_state(dev) : 0; } ^ permalink raw reply [flat|nested] 373+ messages in thread
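The control flow that pci_restore_standard_config() is left with after this patch can be sketched in user space as follows. This is only a sketch under stated assumptions: every type and helper below (struct fake_pci_dev, fake_update_current_state(), fake_set_power_state(), fake_restore_state()) is a hypothetical stand-in rather than the kernel API, and the mock "update" step simply copies a simulated hardware state instead of reading the PM control register.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for pci_power_t; not the kernel definition. */
typedef enum { D0 = 0, D1, D2, D3HOT, D3COLD, D_UNKNOWN } power_state_t;

/* Hypothetical stand-in for struct pci_dev, reduced to what the sketch needs. */
struct fake_pci_dev {
	power_state_t hw_state;      /* state the simulated hardware is in */
	power_state_t current_state; /* state the core believes the device is in */
	bool state_saved;            /* config space was saved during suspend */
	bool config_restored;        /* set by the mock restore helper */
	int set_power_calls;         /* how often the mock power-up helper ran */
};

/* Mock of pci_update_current_state(): refresh the cached state (assumption:
 * the real helper reads it back from the device when it supports PCI PM). */
static void fake_update_current_state(struct fake_pci_dev *dev)
{
	dev->current_state = dev->hw_state;
}

/* Mock of pci_set_power_state(): always succeeds in this sketch. The real
 * function also observes the mandatory transition delays and calls into
 * platform code, which is the point of using it during early resume. */
static int fake_set_power_state(struct fake_pci_dev *dev, power_state_t state)
{
	dev->set_power_calls++;
	dev->hw_state = state;
	dev->current_state = state;
	return 0;
}

/* Mock of pci_restore_state(): only records that it was called. */
static int fake_restore_state(struct fake_pci_dev *dev)
{
	dev->config_restored = true;
	return 0;
}

/* Same shape as the simplified pci_restore_standard_config() in the patch. */
static int fake_restore_standard_config(struct fake_pci_dev *dev)
{
	assert(dev != NULL);

	fake_update_current_state(dev);

	if (dev->current_state != D0) {
		int error = fake_set_power_state(dev, D0);
		if (error)
			return error;
	}

	return dev->state_saved ? fake_restore_state(dev) : 0;
}
```

In this model a device found in D3hot is powered up exactly once and then has its saved configuration written back, while a device already in D0 with nothing saved is left alone.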
* [RFC][PATCH][7/8] PCI PM: Move pci_restore_standard_config to pci-driver.c
  2009-03-07 10:19 ` Rafael J. Wysocki
                   ` (10 preceding siblings ...)
  2009-03-07 10:25 ` Rafael J. Wysocki
@ 2009-03-07 10:26 ` Rafael J. Wysocki
  2009-03-07 10:26 ` Rafael J. Wysocki
                   ` (4 subsequent siblings)
  16 siblings, 0 replies; 373+ messages in thread
From: Rafael J. Wysocki @ 2009-03-07 10:26 UTC (permalink / raw)
  To: LKML
  Cc: Linus Torvalds, Ingo Molnar, Eric W. Biederman, Benjamin Herrenschmidt, Jeremy Fitzhardinge, pm list, Len Brown, Jesse Barnes, Thomas Gleixner, Arve Hjønnevåg

From: Rafael J. Wysocki <rjw@sisk.pl>

Move pci_restore_standard_config() from pci.c to pci-driver.c and make it
static.

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
---
 drivers/pci/pci-driver.c |   17 +++++++++++++++++
 drivers/pci/pci.c        |   21 ---------------------
 drivers/pci/pci.h        |    1 -
 3 files changed, 17 insertions(+), 22 deletions(-)

Index: linux-2.6/drivers/pci/pci-driver.c
===================================================================
--- linux-2.6.orig/drivers/pci/pci-driver.c
+++ linux-2.6/drivers/pci/pci-driver.c
@@ -423,6 +423,23 @@ static int pci_legacy_resume(struct devi
 
 /* Auxiliary functions used by the new power management framework */
 
+/**
+ * pci_restore_standard_config - restore standard config registers of PCI device
+ * @pci_dev: PCI device to handle
+ */
+static int pci_restore_standard_config(struct pci_dev *pci_dev)
+{
+	pci_update_current_state(pci_dev, PCI_UNKNOWN);
+
+	if (pci_dev->current_state != PCI_D0) {
+		int error = pci_set_power_state(pci_dev, PCI_D0);
+		if (error)
+			return error;
+	}
+
+	return pci_dev->state_saved ? pci_restore_state(pci_dev) : 0;
+}
+
 static void pci_pm_default_resume_noirq(struct pci_dev *pci_dev)
 {
 	pci_restore_standard_config(pci_dev);
Index: linux-2.6/drivers/pci/pci.c
===================================================================
--- linux-2.6.orig/drivers/pci/pci.c
+++ linux-2.6/drivers/pci/pci.c
@@ -1374,27 +1374,6 @@ void pci_allocate_cap_save_buffers(struc
 }
 
 /**
- * pci_restore_standard_config - restore standard config registers of PCI device
- * @dev: PCI device to handle
- *
- * This function assumes that the device's configuration space is accessible.
- * If the device needs to be powered up, the function will wait for it to
- * change the state.
- */
-int pci_restore_standard_config(struct pci_dev *dev)
-{
-	pci_update_current_state(dev, PCI_UNKNOWN);
-
-	if (dev->current_state != PCI_D0) {
-		int error = pci_set_power_state(dev, PCI_D0);
-		if (error)
-			return error;
-	}
-
-	return dev->state_saved ? pci_restore_state(dev) : 0;
-}
-
-/**
  * pci_enable_ari - enable ARI forwarding if hardware support it
  * @dev: the PCI device
  */
Index: linux-2.6/drivers/pci/pci.h
===================================================================
--- linux-2.6.orig/drivers/pci/pci.h
+++ linux-2.6/drivers/pci/pci.h
@@ -49,7 +49,6 @@ extern void pci_disable_enabled_device(s
 extern void pci_pm_init(struct pci_dev *dev);
 extern void platform_pci_wakeup_init(struct pci_dev *dev);
 extern void pci_allocate_cap_save_buffers(struct pci_dev *dev);
-extern int pci_restore_standard_config(struct pci_dev *dev);
 
 static inline bool pci_is_bridge(struct pci_dev *pci_dev)
 {

^ permalink raw reply	[flat|nested] 373+ messages in thread
* [RFC][PATCH][8/8] PCI PM: Put devices into low power states during late suspend 2009-03-07 10:19 ` Rafael J. Wysocki ` (12 preceding siblings ...) 2009-03-07 10:26 ` Rafael J. Wysocki @ 2009-03-07 10:27 ` Rafael J. Wysocki 2009-03-07 10:27 ` Rafael J. Wysocki ` (2 subsequent siblings) 16 siblings, 0 replies; 373+ messages in thread From: Rafael J. Wysocki @ 2009-03-07 10:27 UTC (permalink / raw) To: LKML Cc: Arve, Jeremy Fitzhardinge, Jesse Barnes, Thomas Gleixner, Eric W. Biederman, Ingo Molnar, Linus Torvalds, pm list From: Rafael J. Wysocki <rjw@sisk.pl> Once we have allowed timer interrupts to be enabled during the late phase of suspending devices, we are now able to use the generic pci_set_power_state() to put PCI devices into low power states at that time. We can also use some related platform callbacks, like the ones preparing devices for wake-up, during the late suspend. Doing this will allow us to avoid the race condition where a device using shared interrupts is put into a low power state with interrupts enabled and then an interrupt (for another device) comes in and confuses its driver. At the same time, devices that don't support the native PCI PM or that require some additional, platform-specific operations to be carried out to put them into low power states will be handled as appropriate. Signed-off-by: Rafael J. 
Wysocki <rjw@sisk.pl> --- drivers/pci/pci-driver.c | 129 ++++++++++++++++++++++++++++------------------- 1 file changed, 77 insertions(+), 52 deletions(-) Index: linux-2.6/drivers/pci/pci-driver.c =================================================================== --- linux-2.6.orig/drivers/pci/pci-driver.c +++ linux-2.6/drivers/pci/pci-driver.c @@ -352,53 +352,60 @@ static int pci_legacy_suspend(struct dev { struct pci_dev * pci_dev = to_pci_dev(dev); struct pci_driver * drv = pci_dev->driver; - int error = 0; + + pci_dev->state_saved = false; if (drv && drv->suspend) { pci_power_t prev = pci_dev->current_state; - - pci_dev->state_saved = false; + int error; error = drv->suspend(pci_dev, state); suspend_report_result(drv->suspend, error); if (error) return error; - if (pci_dev->state_saved) - goto Fixup; - - if (pci_dev->current_state != PCI_D0 + if (!pci_dev->state_saved && pci_dev->current_state != PCI_D0 && pci_dev->current_state != PCI_UNKNOWN) { WARN_ONCE(pci_dev->current_state != prev, "PCI PM: Device state not saved by %pF\n", drv->suspend); - goto Fixup; } } - pci_save_state(pci_dev); - /* - * This is for compatibility with existing code with legacy PM support. 
- */ - pci_pm_set_unknown_state(pci_dev); - - Fixup: pci_fixup_device(pci_fixup_suspend, pci_dev); - return error; + return 0; } static int pci_legacy_suspend_late(struct device *dev, pm_message_t state) { struct pci_dev * pci_dev = to_pci_dev(dev); struct pci_driver * drv = pci_dev->driver; - int error = 0; if (drv && drv->suspend_late) { + pci_power_t prev = pci_dev->current_state; + int error; + error = drv->suspend_late(pci_dev, state); suspend_report_result(drv->suspend_late, error); + if (error) + return error; + + if (!pci_dev->state_saved && pci_dev->current_state != PCI_D0 + && pci_dev->current_state != PCI_UNKNOWN) { + WARN_ONCE(pci_dev->current_state != prev, + "PCI PM: Device state not saved by %pF\n", + drv->suspend_late); + return 0; + } } - return error; + + if (!pci_dev->state_saved) + pci_save_state(pci_dev); + + pci_pm_set_unknown_state(pci_dev); + + return 0; } static int pci_legacy_resume_early(struct device *dev) @@ -460,7 +467,6 @@ static void pci_pm_default_suspend(struc /* Disable non-bridge devices without PM support */ if (!pci_is_bridge(pci_dev)) pci_disable_enabled_device(pci_dev); - pci_save_state(pci_dev); } static bool pci_has_legacy_pm_support(struct pci_dev *pci_dev) @@ -526,24 +532,14 @@ static int pci_pm_suspend(struct device if (error) return error; - if (pci_dev->state_saved) - goto Fixup; - - if (pci_dev->current_state != PCI_D0 + if (!pci_dev->state_saved && pci_dev->current_state != PCI_D0 && pci_dev->current_state != PCI_UNKNOWN) { WARN_ONCE(pci_dev->current_state != prev, "PCI PM: State of device not saved by %pF\n", pm->suspend); - goto Fixup; } } - if (!pci_dev->state_saved) { - pci_save_state(pci_dev); - if (!pci_is_bridge(pci_dev)) - pci_prepare_to_sleep(pci_dev); - } - Fixup: pci_fixup_device(pci_fixup_suspend, pci_dev); @@ -553,21 +549,41 @@ static int pci_pm_suspend(struct device static int pci_pm_suspend_noirq(struct device *dev) { struct pci_dev *pci_dev = to_pci_dev(dev); - struct device_driver *drv = dev->driver; 
- int error = 0; + struct dev_pm_ops *pm = dev->driver ? dev->driver->pm : NULL; if (pci_has_legacy_pm_support(pci_dev)) return pci_legacy_suspend_late(dev, PMSG_SUSPEND); - if (drv && drv->pm && drv->pm->suspend_noirq) { - error = drv->pm->suspend_noirq(dev); - suspend_report_result(drv->pm->suspend_noirq, error); + if (!pm) + return 0; + + if (pm->suspend_noirq) { + pci_power_t prev = pci_dev->current_state; + int error; + + error = pm->suspend_noirq(dev); + suspend_report_result(pm->suspend_noirq, error); + if (error) + return error; + + if (!pci_dev->state_saved && pci_dev->current_state != PCI_D0 + && pci_dev->current_state != PCI_UNKNOWN) { + WARN_ONCE(pci_dev->current_state != prev, + "PCI PM: State of device not saved by %pF\n", + pm->suspend_noirq); + return 0; + } } - if (!error) - pci_pm_set_unknown_state(pci_dev); + if (!pci_dev->state_saved) { + pci_save_state(pci_dev); + if (!pci_is_bridge(pci_dev)) + pci_prepare_to_sleep(pci_dev); + } - return error; + pci_pm_set_unknown_state(pci_dev); + + return 0; } static int pci_pm_resume_noirq(struct device *dev) @@ -650,9 +666,6 @@ static int pci_pm_freeze(struct device * return error; } - if (!pci_dev->state_saved) - pci_save_state(pci_dev); - return 0; } @@ -660,20 +673,25 @@ static int pci_pm_freeze_noirq(struct de { struct pci_dev *pci_dev = to_pci_dev(dev); struct device_driver *drv = dev->driver; - int error = 0; if (pci_has_legacy_pm_support(pci_dev)) return pci_legacy_suspend_late(dev, PMSG_FREEZE); if (drv && drv->pm && drv->pm->freeze_noirq) { + int error; + error = drv->pm->freeze_noirq(dev); suspend_report_result(drv->pm->freeze_noirq, error); + if (error) + return error; } - if (!error) - pci_pm_set_unknown_state(pci_dev); + if (!pci_dev->state_saved) + pci_save_state(pci_dev); - return error; + pci_pm_set_unknown_state(pci_dev); + + return 0; } static int pci_pm_thaw_noirq(struct device *dev) @@ -716,7 +734,6 @@ static int pci_pm_poweroff(struct device { struct pci_dev *pci_dev = to_pci_dev(dev); 
struct dev_pm_ops *pm = dev->driver ? dev->driver->pm : NULL; - int error = 0; if (pci_has_legacy_pm_support(pci_dev)) return pci_legacy_suspend(dev, PMSG_HIBERNATE); @@ -729,33 +746,41 @@ static int pci_pm_poweroff(struct device pci_dev->state_saved = false; if (pm->poweroff) { + int error; + error = pm->poweroff(dev); suspend_report_result(pm->poweroff, error); + if (error) + return error; } - if (!pci_dev->state_saved && !pci_is_bridge(pci_dev)) - pci_prepare_to_sleep(pci_dev); - Fixup: pci_fixup_device(pci_fixup_suspend, pci_dev); - return error; + return 0; } static int pci_pm_poweroff_noirq(struct device *dev) { + struct pci_dev *pci_dev = to_pci_dev(dev); struct device_driver *drv = dev->driver; - int error = 0; if (pci_has_legacy_pm_support(to_pci_dev(dev))) return pci_legacy_suspend_late(dev, PMSG_HIBERNATE); if (drv && drv->pm && drv->pm->poweroff_noirq) { + int error; + error = drv->pm->poweroff_noirq(dev); suspend_report_result(drv->pm->poweroff_noirq, error); + if (error) + return error; } - return error; + if (!pci_dev->state_saved && !pci_is_bridge(pci_dev)) + pci_prepare_to_sleep(pci_dev); + + return 0; } static int pci_pm_restore_noirq(struct device *dev) ^ permalink raw reply [flat|nested] 373+ messages in thread
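The decision logic that this patch moves into pci_pm_suspend_noirq() can be modelled in user space roughly as below. It is a sketch under assumptions, not the kernel code: struct mock_dev and all mock_* and driver_* names are hypothetical, the WARN_ONCE() diagnostic and the "driver has no PM ops at all" early return are left out, mock_prepare_to_sleep() is assumed to end in D3hot, and mock_set_unknown_state() assumes the real helper only rewrites a state of D0.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, reduced set of power states for the sketch. */
typedef enum { ST_D0, ST_D3HOT, ST_UNKNOWN } mock_state_t;

/* Hypothetical stand-in for the pieces of struct pci_dev used here. */
struct mock_dev {
	mock_state_t current_state;
	bool state_saved;  /* driver saved config space itself */
	bool is_bridge;
	bool core_saved;   /* PCI core saved config space */
	bool core_slept;   /* PCI core put the device into low power */
	int (*suspend_noirq)(struct mock_dev *dev); /* driver callback, may be NULL */
};

/* Stand-in for pci_prepare_to_sleep(); the target state is an assumption. */
static void mock_prepare_to_sleep(struct mock_dev *dev)
{
	dev->core_slept = true;
	dev->current_state = ST_D3HOT;
}

/* Stand-in for pci_pm_set_unknown_state() (assumed: only D0 is rewritten). */
static void mock_set_unknown_state(struct mock_dev *dev)
{
	if (dev->current_state == ST_D0)
		dev->current_state = ST_UNKNOWN;
}

/* Same branch structure as the patched pci_pm_suspend_noirq(). */
static int mock_pm_suspend_noirq(struct mock_dev *dev)
{
	assert(dev != NULL);

	if (dev->suspend_noirq) {
		int error = dev->suspend_noirq(dev);
		if (error)
			return error;

		/* The driver changed the power state without saving config
		 * space: the core leaves the device alone. */
		if (!dev->state_saved && dev->current_state != ST_D0
		    && dev->current_state != ST_UNKNOWN)
			return 0;
	}

	if (!dev->state_saved) {
		dev->core_saved = true;
		if (!dev->is_bridge)
			mock_prepare_to_sleep(dev);
	}

	mock_set_unknown_state(dev);
	return 0;
}

/* Example driver callbacks used to exercise the three paths. */
static int driver_does_nothing(struct mock_dev *dev) { (void)dev; return 0; }
static int driver_powers_down(struct mock_dev *dev) { dev->current_state = ST_D3HOT; return 0; }
static int driver_fails(struct mock_dev *dev) { (void)dev; return -EBUSY; }
```

The model is meant to show that the core only saves state and powers the device down late when the driver has done neither itself, and that a failing callback aborts before the core touches the device.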
* Re: [RFC][PATCH][0/8] PM: Rework suspend-resume ordering to avoid problems with shared interrupts
  2009-03-07 10:19 ` Rafael J. Wysocki
                   ` (14 preceding siblings ...)
  2009-03-07 10:27 ` Rafael J. Wysocki
@ 2009-03-08 19:28 ` Frans Pop
  2009-03-08 20:50   ` Rafael J. Wysocki
  2009-03-08 20:50   ` Rafael J. Wysocki
  2009-03-08 19:28 ` Frans Pop
  16 siblings, 2 replies; 373+ messages in thread
From: Frans Pop @ 2009-03-08 19:28 UTC (permalink / raw)
  To: Rafael J. Wysocki; +Cc: linux-kernel, torvalds, linux-pm

(Most CCs dropped.)

Hi Rafael,

Rafael J. Wysocki wrote:
> The following patches modify the way in which we handle disabling
> interrupts during suspend and enabling them during resume. They also
> change the ordering of the core suspend and hibernation code to take
> advantage of the new approach to the interrupts and modify the PCI PM
> core to avoid a few problems.

I've given this series a try on my HP 2510p. I've seen no regressions with
suspend to RAM.

Below is a diff between suspend/resume dmesg from before (based on rc5)
and after (rc7 + series) the patch, with some comments. Nothing looks
really wrong, but there are some surprising changes. Essentially JFYI
though.

Cheers,
FJP

 PM: Syncing filesystems ... done.
 Freezing user space processes ... (elapsed 0.00 seconds) done.
 Freezing remaining freezable tasks ... (elapsed 0.00 seconds) done.
 Suspending console(s) (use no_console_suspend to debug)
 sd 0:0:0:0: [sda] Synchronizing SCSI cache
 sd 0:0:0:0: [sda] Stopping disk
 ACPI handle has no context!
 ACPI handle has no context!
 sdhci-pci 0000:02:06.2: PME# disabled
 sdhci-pci 0000:02:06.2: PCI INT C disabled
 ACPI handle has no context!
 ACPI handle has no context!
# Bogus: result of using wireless instead of wired networking.
+iwlagn 0000:10:00.0: PCI INT A disabled
 ata2: port disabled. ignoring.
 ata_piix 0000:00:1f.1: PCI INT A disabled
 ehci_hcd 0000:00:1d.7: PCI INT A disabled
 ehci_hcd 0000:00:1d.7: PME# disabled
 uhci_hcd 0000:00:1d.2: PCI INT C disabled
 uhci_hcd 0000:00:1d.1: PCI INT B disabled
 uhci_hcd 0000:00:1d.0: PCI INT A disabled
 HDA Intel 0000:00:1b.0: PCI INT A disabled
 HDA Intel 0000:00:1b.0: power state changed by ACPI to D3
 ehci_hcd 0000:00:1a.7: PCI INT C disabled
 ehci_hcd 0000:00:1a.7: PME# disabled
 uhci_hcd 0000:00:1a.1: PCI INT B disabled
 uhci_hcd 0000:00:1a.0: PCI INT A disabled
 e1000e 0000:00:19.0: PME# enabled
 e1000e 0000:00:19.0: wake-up capability enabled by ACPI
 e1000e 0000:00:19.0: PME# enabled
 e1000e 0000:00:19.0: wake-up capability enabled by ACPI
 e1000e 0000:00:19.0: PCI INT A disabled
 ACPI handle has no context!
# This has moved up a bit. Looks more logical.
+ricoh-mmc: Suspending.
+ricoh-mmc: Controller is now re-enabled.
 ACPI: Preparing to enter system sleep state S3
 Disabling non-boot CPUs ...
 CPU 1 is now offline
 SMP alternatives: switching to UP code
 CPU0 attaching NULL sched-domain.
 CPU1 attaching NULL sched-domain.
 CPU0 attaching NULL sched-domain.
 CPU1 is down
-ricoh-mmc: Suspending.
-ricoh-mmc: Controller is now re-enabled.
 Extended CMOS year: 2000
 Back to C!
+CPU0: Thermal monitoring enabled (TM2)
 Extended CMOS year: 2000
# This whole block has moved up before early config space restores.
# No changes in the block itself.
+Enabling non-boot CPUs ...
+SMP alternatives: switching to SMP code
+Booting processor 1 APIC 0x1 ip 0x6000
+Initializing CPU#1
+Calibrating delay using timer specific routine.. 2660.04 BogoMIPS (lpj=5320097)
+CPU: L1 I cache: 32K, L1 D cache: 32K
+CPU: L2 cache: 2048K
+[ds] using Core 2/Atom configuration
+CPU: Physical Processor ID: 0
+CPU: Processor Core ID: 1
+CPU1: Thermal monitoring enabled (TM2)
+CPU1: Intel(R) Core(TM)2 Duo CPU U7700 @ 1.33GHz stepping 0d
+CPU0 attaching NULL sched-domain.
+Switched to high resolution mode on CPU 1
+CPU0 attaching sched-domain:
+ domain 0: span 0-1 level MC
+  groups: 0 1
+CPU1 attaching sched-domain:
+ domain 0: span 0-1 level MC
+  groups: 1 0
+CPU1 is up
+ACPI: Waking up from system sleep state S3
 pci 0000:00:02.0: restoring config space at offset 0x8 (was 0x1, writing 0x2001)
# These don't need restoring anymore?
-pci 0000:00:02.1: restoring config space at offset 0x4 (was 0x4, writing 0xe0500004)
-pci 0000:00:02.1: restoring config space at offset 0x1 (was 0x900000, writing 0x900007)
-pci 0000:00:03.0: restoring config space at offset 0xf (was 0x100, writing 0x1ff)
-pci 0000:00:03.0: restoring config space at offset 0x4 (was 0xfed12004, writing 0xe0600004)
-pci 0000:00:03.2: restoring config space at offset 0xf (was 0x300, writing 0x30b)
-pci 0000:00:03.2: restoring config space at offset 0x8 (was 0x1, writing 0x2031)
-pci 0000:00:03.2: restoring config space at offset 0x7 (was 0x1, writing 0x2021)
-pci 0000:00:03.2: restoring config space at offset 0x6 (was 0x1, writing 0x2019)
-pci 0000:00:03.2: restoring config space at offset 0x5 (was 0x1, writing 0x2011)
-pci 0000:00:03.2: restoring config space at offset 0x4 (was 0x1, writing 0x2009)
-pci 0000:00:03.2: restoring config space at offset 0x1 (was 0xb00000, writing 0xb00001)
 serial 0000:00:03.3: restoring config space at offset 0xf (was 0x200, writing 0x20a)
 serial 0000:00:03.3: restoring config space at offset 0x5 (was 0x0, writing 0xe0601000)
 serial 0000:00:03.3: restoring config space at offset 0x4 (was 0x1, writing 0x2041)
 serial 0000:00:03.3: restoring config space at offset 0x1 (was 0xb00000, writing 0xb00007)
 e1000e 0000:00:19.0: restoring config space at offset 0xf (was 0x100, writing 0x10b)
 e1000e 0000:00:19.0: restoring config space at offset 0x6 (was 0x1, writing 0x2061)
 e1000e 0000:00:19.0: restoring config space at offset 0x5 (was 0x0, writing 0xe0640000)
 e1000e 0000:00:19.0: restoring config space at offset 0x1 (was 0x100000, writing 0x100007)
# These have moved down to late resume.
-uhci_hcd 0000:00:1a.0: restoring config space at offset 0xf (was 0x100, writing 0x10a)
-uhci_hcd 0000:00:1a.0: restoring config space at offset 0x8 (was 0x1, writing 0x2081)
-uhci_hcd 0000:00:1a.0: restoring config space at offset 0x1 (was 0x2800000, writing 0x2800001)
-uhci_hcd 0000:00:1a.1: restoring config space at offset 0xf (was 0x200, writing 0x20a)
-uhci_hcd 0000:00:1a.1: restoring config space at offset 0x8 (was 0x1, writing 0x20a1)
-uhci_hcd 0000:00:1a.1: restoring config space at offset 0x1 (was 0x2800000, writing 0x2800001)
 ehci_hcd 0000:00:1a.7: restoring config space at offset 0xf (was 0x300, writing 0x30b)
 ehci_hcd 0000:00:1a.7: restoring config space at offset 0x4 (was 0x0, writing 0xe0641000)
 ehci_hcd 0000:00:1a.7: restoring config space at offset 0x1 (was 0x2900000, writing 0x2900002)
 HDA Intel 0000:00:1b.0: restoring config space at offset 0xf (was 0x100, writing 0x10a)
 HDA Intel 0000:00:1b.0: restoring config space at offset 0x3 (was 0x0, writing 0x10)
 HDA Intel 0000:00:1b.0: restoring config space at offset 0x1 (was 0x100000, writing 0x100002)
 pcieport-driver 0000:00:1c.0: restoring config space at offset 0xf (was 0x100, writing 0x4010a)
 pcieport-driver 0000:00:1c.0: restoring config space at offset 0x9 (was 0x10001, writing 0x1fff1)
 pcieport-driver 0000:00:1c.0: restoring config space at offset 0x8 (was 0x0, writing 0xfff0)
 pcieport-driver 0000:00:1c.0: restoring config space at offset 0x7 (was 0x0, writing 0x200000f0)
 pcieport-driver 0000:00:1c.0: restoring config space at offset 0x6 (was 0x0, writing 0x80800)
 pcieport-driver 0000:00:1c.0: restoring config space at offset 0x3 (was 0x810000, writing 0x810010)
 pcieport-driver 0000:00:1c.0: restoring config space at offset 0x1 (was 0x100000, writing 0x100407)
 pcieport-driver 0000:00:1c.1: restoring config space at offset 0xf (was 0x200, writing 0x4020a)
 pcieport-driver 0000:00:1c.1: restoring config space at offset 0x9 (was 0x10001, writing 0x1fff1)
pcieport-driver 0000:00:1c.1: restoring config space at offset 0x8 (was 0x0, writing 0xe000e000) pcieport-driver 0000:00:1c.1: restoring config space at offset 0x7 (was 0x0, writing 0xf0) pcieport-driver 0000:00:1c.1: restoring config space at offset 0x3 (was 0x810000, writing 0x810010) pcieport-driver 0000:00:1c.1: restoring config space at offset 0x1 (was 0x100000, writing 0x100407) # These have moved down to late resume. -uhci_hcd 0000:00:1d.0: restoring config space at offset 0xf (was 0x100, writing 0x10a) -uhci_hcd 0000:00:1d.0: restoring config space at offset 0x8 (was 0x1, writing 0x20c1) -uhci_hcd 0000:00:1d.0: restoring config space at offset 0x1 (was 0x2800000, writing 0x2800001) -uhci_hcd 0000:00:1d.1: restoring config space at offset 0xf (was 0x200, writing 0x20b) -uhci_hcd 0000:00:1d.1: restoring config space at offset 0x8 (was 0x1, writing 0x20e1) -uhci_hcd 0000:00:1d.1: restoring config space at offset 0x1 (was 0x2800000, writing 0x2800001) -uhci_hcd 0000:00:1d.2: restoring config space at offset 0xf (was 0x300, writing 0x30b) -uhci_hcd 0000:00:1d.2: restoring config space at offset 0x8 (was 0x1, writing 0x2101) -uhci_hcd 0000:00:1d.2: restoring config space at offset 0x1 (was 0x2800000, writing 0x2800001) ehci_hcd 0000:00:1d.7: restoring config space at offset 0xf (was 0x100, writing 0x10a) ehci_hcd 0000:00:1d.7: restoring config space at offset 0x4 (was 0x0, writing 0xe0648000) ehci_hcd 0000:00:1d.7: restoring config space at offset 0x1 (was 0x2900000, writing 0x2900002) # These have disappeared. -pci 0000:00:1e.0: restoring config space at offset 0x9 (was 0x10001, writing 0x83f18001) -pci 0000:00:1e.0: restoring config space at offset 0x8 (was 0x0, writing 0xe030e010) -pci 0000:00:1e.0: restoring config space at offset 0x7 (was 0x228000f0, writing 0x22803030) -pci 0000:00:1e.0: restoring config space at offset 0x1 (was 0x100007, writing 0x100107) # First two moved to late resume. # The third already happened during late resume (duplicated). 
-ata_piix 0000:00:1f.1: restoring config space at offset 0xf (was 0x100, writing 0x10a) -ata_piix 0000:00:1f.1: restoring config space at offset 0x8 (was 0xc01, writing 0x2121) -ata_piix 0000:00:1f.1: restoring config space at offset 0x1 (was 0x2800005, writing 0x2880005) iwlagn 0000:10:00.0: restoring config space at offset 0xf (was 0x100, writing 0x10a) iwlagn 0000:10:00.0: restoring config space at offset 0x4 (was 0x4, writing 0xe0000004) iwlagn 0000:10:00.0: restoring config space at offset 0x3 (was 0x0, writing 0x10) iwlagn 0000:10:00.0: restoring config space at offset 0x1 (was 0x100000, writing 0x100006) yenta_cardbus 0000:02:06.0: restoring config space at offset 0xf (was 0x3000100, writing 0x580010b) yenta_cardbus 0000:02:06.0: restoring config space at offset 0xe (was 0x0, writing 0x34fc) yenta_cardbus 0000:02:06.0: restoring config space at offset 0xd (was 0x0, writing 0x3400) yenta_cardbus 0000:02:06.0: restoring config space at offset 0xc (was 0x0, writing 0x30fc) yenta_cardbus 0000:02:06.0: restoring config space at offset 0xb (was 0x0, writing 0x3000) yenta_cardbus 0000:02:06.0: restoring config space at offset 0xa (was 0x0, writing 0x87fff000) yenta_cardbus 0000:02:06.0: restoring config space at offset 0x9 (was 0x0, writing 0x84000000) yenta_cardbus 0000:02:06.0: restoring config space at offset 0x8 (was 0x0, writing 0x83fff000) yenta_cardbus 0000:02:06.0: restoring config space at offset 0x7 (was 0x0, writing 0x80000000) yenta_cardbus 0000:02:06.0: restoring config space at offset 0x6 (was 0x0, writing 0xb0060302) yenta_cardbus 0000:02:06.0: restoring config space at offset 0x4 (was 0x0, writing 0xe0100000) yenta_cardbus 0000:02:06.0: restoring config space at offset 0x3 (was 0x820000, writing 0x82a800) yenta_cardbus 0000:02:06.0: restoring config space at offset 0x1 (was 0x2100000, writing 0x2100007) ohci1394 0000:02:06.1: restoring config space at offset 0xf (was 0x4020200, writing 0x4020205) ohci1394 0000:02:06.1: restoring config space at 
offset 0x4 (was 0x0, writing 0xe0101000) ohci1394 0000:02:06.1: restoring config space at offset 0x3 (was 0x800000, writing 0x804010) ohci1394 0000:02:06.1: restoring config space at offset 0x1 (was 0x2100000, writing 0x2100006) sdhci-pci 0000:02:06.2: restoring config space at offset 0xf (was 0x300, writing 0x30a) sdhci-pci 0000:02:06.2: restoring config space at offset 0x4 (was 0x0, writing 0xe0102000) sdhci-pci 0000:02:06.2: restoring config space at offset 0x3 (was 0x800000, writing 0x804010) sdhci-pci 0000:02:06.2: restoring config space at offset 0x1 (was 0x2100000, writing 0x2100006) # Some changes; a lot just got dropped. -ricoh-mmc 0000:02:06.3: restoring config space at offset 0xf (was 0x300, writing 0xffffffff) +ricoh-mmc 0000:02:06.3: restoring config space at offset 0xf (was 0x300, writing 0x30a) -ricoh-mmc 0000:02:06.3: restoring config space at offset 0xe (was 0x0, writing 0xffffffff) -ricoh-mmc 0000:02:06.3: restoring config space at offset 0xd (was 0x80, writing 0xffffffff) -ricoh-mmc 0000:02:06.3: restoring config space at offset 0xc (was 0x0, writing 0xffffffff) -ricoh-mmc 0000:02:06.3: restoring config space at offset 0xb (was 0x30c9103c, writing 0xffffffff) -ricoh-mmc 0000:02:06.3: restoring config space at offset 0xa (was 0x0, writing 0xffffffff) -ricoh-mmc 0000:02:06.3: restoring config space at offset 0x9 (was 0x0, writing 0xffffffff) -ricoh-mmc 0000:02:06.3: restoring config space at offset 0x8 (was 0x0, writing 0xffffffff) -ricoh-mmc 0000:02:06.3: restoring config space at offset 0x7 (was 0x0, writing 0xffffffff) -ricoh-mmc 0000:02:06.3: restoring config space at offset 0x6 (was 0x0, writing 0xffffffff) -ricoh-mmc 0000:02:06.3: restoring config space at offset 0x5 (was 0x0, writing 0xffffffff) -ricoh-mmc 0000:02:06.3: restoring config space at offset 0x4 (was 0x0, writing 0xffffffff) +ricoh-mmc 0000:02:06.3: restoring config space at offset 0x4 (was 0x0, writing 0xe0103000) -ricoh-mmc 0000:02:06.3: restoring config space at offset 0x3 (was 
0x800000, writing 0xffffffff) +ricoh-mmc 0000:02:06.3: restoring config space at offset 0x3 (was 0x800000, writing 0x804010) -ricoh-mmc 0000:02:06.3: restoring config space at offset 0x2 (was 0x8800011, writing 0xffffffff) -ricoh-mmc 0000:02:06.3: restoring config space at offset 0x1 (was 0x2100000, writing 0xffffffff) +ricoh-mmc 0000:02:06.3: restoring config space at offset 0x1 (was 0x2100000, writing 0x2100006) -ricoh-mmc 0000:02:06.3: restoring config space at offset 0x0 (was 0x8431180, writing 0xffffffff) ricoh-mmc: Resuming. ricoh-mmc: Controller is now disabled. -Enabling non-boot CPUs ... -SMP alternatives: switching to SMP code -Booting processor 1 APIC 0x1 ip 0x6000 -Initializing CPU#1 -Calibrating delay using timer specific routine.. 2660.07 BogoMIPS (lpj=5320158) -CPU: L1 I cache: 32K, L1 D cache: 32K -CPU: L2 cache: 2048K -[ds] using Core 2/Atom configuration -CPU: Physical Processor ID: 0 -CPU: Processor Core ID: 1 -CPU1: Thermal monitoring enabled (TM2) -x86 PAT enabled: cpu 1, old 0x7040600070406, new 0x7010600070106 -CPU1: Intel(R) Core(TM)2 Duo CPU U7700 @ 1.33GHz stepping 0d -CPU0 attaching NULL sched-domain. 
-Switched to high resolution mode on CPU 1 -CPU0 attaching sched-domain: - domain 0: span 0-1 level MC - groups: 0 1 -CPU1 attaching sched-domain: - domain 0: span 0-1 level MC - groups: 1 0 -CPU1 is up -ACPI: Waking up from system sleep state S3 ACPI: EC: non-query interrupt received, switching to interrupt mode pci 0000:00:02.0: restoring config space at offset 0x1 (was 0x900403, writing 0x900003) pci 0000:00:02.0: PME# disabled pci 0000:00:02.1: PME# disabled pci 0000:00:03.0: PME# disabled pci 0000:00:03.2: PME# disabled e1000e 0000:00:19.0: PCI INT A -> GSI 22 (level, low) -> IRQ 22 e1000e 0000:00:19.0: setting latency timer to 64 e1000e 0000:00:19.0: wake-up capability disabled by ACPI e1000e 0000:00:19.0: PME# disabled e1000e 0000:00:19.0: wake-up capability disabled by ACPI e1000e 0000:00:19.0: PME# disabled e1000e 0000:00:19.0: irq 26 for MSI/MSI-X +uhci_hcd 0000:00:1a.0: restoring config space at offset 0xf (was 0x100, writing 0x10a) +uhci_hcd 0000:00:1a.0: restoring config space at offset 0x8 (was 0x1, writing 0x2081) +uhci_hcd 0000:00:1a.0: restoring config space at offset 0x1 (was 0x2800000, writing 0x2800001) uhci_hcd 0000:00:1a.0: PCI INT A -> GSI 16 (level, low) -> IRQ 16 uhci_hcd 0000:00:1a.0: setting latency timer to 64 usb usb1: root hub lost power or was reset +uhci_hcd 0000:00:1a.1: restoring config space at offset 0xf (was 0x200, writing 0x20a) +uhci_hcd 0000:00:1a.1: restoring config space at offset 0x8 (was 0x1, writing 0x20a1) +uhci_hcd 0000:00:1a.1: restoring config space at offset 0x1 (was 0x2800000, writing 0x2800001) uhci_hcd 0000:00:1a.1: PCI INT B -> GSI 17 (level, low) -> IRQ 17 uhci_hcd 0000:00:1a.1: setting latency timer to 64 usb usb3: root hub lost power or was reset ehci_hcd 0000:00:1a.7: PME# disabled ehci_hcd 0000:00:1a.7: PCI INT C -> GSI 18 (level, low) -> IRQ 18 ehci_hcd 0000:00:1a.7: setting latency timer to 64 ehci_hcd 0000:00:1a.7: PME# disabled # Called twice now? 
HDA Intel 0000:00:1b.0: power state changed by ACPI to D0 +HDA Intel 0000:00:1b.0: power state changed by ACPI to D0 HDA Intel 0000:00:1b.0: PCI INT A -> GSI 17 (level, low) -> IRQ 17 HDA Intel 0000:00:1b.0: setting latency timer to 64 pcieport-driver 0000:00:1c.0: setting latency timer to 64 pcieport-driver 0000:00:1c.1: setting latency timer to 64 +uhci_hcd 0000:00:1d.0: restoring config space at offset 0xf (was 0x100, writing 0x10a) +uhci_hcd 0000:00:1d.0: restoring config space at offset 0x8 (was 0x1, writing 0x20c1) +uhci_hcd 0000:00:1d.0: restoring config space at offset 0x1 (was 0x2800000, writing 0x2800001) uhci_hcd 0000:00:1d.0: PCI INT A -> GSI 20 (level, low) -> IRQ 20 uhci_hcd 0000:00:1d.0: setting latency timer to 64 usb usb5: root hub lost power or was reset +uhci_hcd 0000:00:1d.1: restoring config space at offset 0xf (was 0x200, writing 0x20b) +uhci_hcd 0000:00:1d.1: restoring config space at offset 0x8 (was 0x1, writing 0x20e1) +uhci_hcd 0000:00:1d.1: restoring config space at offset 0x1 (was 0x2800000, writing 0x2800001) uhci_hcd 0000:00:1d.1: PCI INT B -> GSI 22 (level, low) -> IRQ 22 uhci_hcd 0000:00:1d.1: setting latency timer to 64 usb usb6: root hub lost power or was reset +uhci_hcd 0000:00:1d.2: restoring config space at offset 0xf (was 0x300, writing 0x30b) +uhci_hcd 0000:00:1d.2: restoring config space at offset 0x8 (was 0x1, writing 0x2101) +uhci_hcd 0000:00:1d.2: restoring config space at offset 0x1 (was 0x2800000, writing 0x2800001) uhci_hcd 0000:00:1d.2: PCI INT C -> GSI 18 (level, low) -> IRQ 18 uhci_hcd 0000:00:1d.2: setting latency timer to 64 usb usb7: root hub lost power or was reset ehci_hcd 0000:00:1d.7: PME# disabled ehci_hcd 0000:00:1d.7: PCI INT A -> GSI 20 (level, low) -> IRQ 20 ehci_hcd 0000:00:1d.7: setting latency timer to 64 ehci_hcd 0000:00:1d.7: PME# disabled pci 0000:00:1e.0: setting latency timer to 64 +ata_piix 0000:00:1f.1: restoring config space at offset 0xf (was 0x100, writing 0x10a) +ata_piix 0000:00:1f.1: 
restoring config space at offset 0x8 (was 0xc01, writing 0x2121) ata_piix 0000:00:1f.1: restoring config space at offset 0x1 (was 0x2800005, writing 0x2880005) ata_piix 0000:00:1f.1: PCI INT A -> GSI 16 (level, low) -> IRQ 16 ata_piix 0000:00:1f.1: setting latency timer to 64 ata2: port disabled. ignoring. ACPI Exception (exoparg2-0445): AE_AML_PACKAGE_LIMIT, Index (000000005) is beyond end of object [20081204] ACPI Error (psparse-0537): Method parse/execution failed [\_SB_.C2C3] (Node ffff88007e01dea0), AE_AML_PACKAGE_LIMIT ACPI Error (psparse-0537): Method parse/execution failed [\_SB_.C003.C0F6.C3F3._STM] (Node ffff88007e043de0), AE_AML_PACKAGE_LIMIT ata1: ACPI set timing mode failed (status=0x300b) # Remaining differences are bogus: result of using wireless instead of wired networking. +iwlagn 0000:10:00.0: PCI INT A -> GSI 17 (level, low) -> IRQ 17 +iwlagn 0000:10:00.0: irq 27 for MSI/MSI-X ohci1394: fw-host0: OHCI-1394 1.1 (PCI): IRQ=[19] MMIO=[e0101000-e01017ff] Max Packet=[2048] IR/IT contexts=[4/4] sdhci-pci 0000:02:06.2: PCI INT C -> GSI 20 (level, low) -> IRQ 20 +Registered led device: iwl-phy0:radio +Registered led device: iwl-phy0:assoc +Registered led device: iwl-phy0:RX +Registered led device: iwl-phy0:TX sd 0:0:0:0: [sda] Starting disk ata1.01: ACPI cmd ef/03:0c:00:00:00:b0 filtered out ata1.01: ACPI cmd ef/03:40:00:00:00:b0 filtered out ata1.00: ACPI cmd ef/03:01:00:00:00:a0 filtered out ata1.00: ACPI cmd ef/03:45:00:00:00:a0 filtered out ata1.00: ACPI cmd f5/00:00:00:00:00:a0 filtered out ata1.00: ACPI cmd b1/c1:00:00:00:00:a0 filtered out ata1.00: ACPI cmd c6/00:10:00:00:00:a0 succeeded -e1000e: eth0 NIC Link is Up 100 Mbps Full Duplex, Flow Control: RX/TX -0000:00:19.0: eth0: 10/100 speed: disabling TSO ata1.00: configured for UDMA/100 ata1.01: configured for MWDMA2 ata1.00: configured for UDMA/100 ata1.01: configured for MWDMA2 ata1: EH complete sd 0:0:0:0: [sda] 234441648 512-byte hardware sectors: (120 GB/111 GiB) sd 0:0:0:0: [sda] Write 
Protect is off sd 0:0:0:0: [sda] Mode Sense: 00 3a 00 00 sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA sd 0:0:0:0: [sda] 234441648 512-byte hardware sectors: (120 GB/111 GiB) sd 0:0:0:0: [sda] Write Protect is off sd 0:0:0:0: [sda] Mode Sense: 00 3a 00 00 sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA usb 1-1: reset full speed USB device using uhci_hcd and address 2 usb 5-2: reset full speed USB device using uhci_hcd and address 2 pci 0000:00:02.0: restoring config space at offset 0x1 (was 0x900403, writing 0x900003) pci 0000:00:02.0: setting latency timer to 64 Restarting tasks ... done. ^ permalink raw reply [flat|nested] 373+ messages in thread
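A note on reading the "restoring config space" lines in the diff above: in kernels of this era, such a line is printed only for a register whose current value differs from the copy saved at suspend. A missing line therefore need not mean that a restore step was skipped. The following is a minimal user-space sketch of that compare-before-write pattern, not the kernel code itself; `struct fake_pci_dev`, `CFG_DWORDS` and `restore_config_space()` are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

#define CFG_DWORDS 16	/* first 64 bytes of config space, as 32-bit words */

/* Hypothetical model: live registers plus the copy saved at suspend time. */
struct fake_pci_dev {
	const char *name;
	uint32_t config[CFG_DWORDS];
	uint32_t saved[CFG_DWORDS];
};

/*
 * Write back only the registers that differ from the saved copy and log
 * each write. Returns the number of registers that were rewritten.
 */
static int restore_config_space(struct fake_pci_dev *dev)
{
	int i, written = 0;

	assert(dev != NULL);

	/* Walk from the highest dword down, as the offsets in the log do. */
	for (i = CFG_DWORDS - 1; i >= 0; i--) {
		uint32_t cur = dev->config[i];

		if (cur == dev->saved[i])
			continue;	/* already correct: no write, no log line */

		printf("%s: restoring config space at offset 0x%x (was 0x%x, writing 0x%x)\n",
		       dev->name, (unsigned int)i, (unsigned int)cur,
		       (unsigned int)dev->saved[i]);
		dev->config[i] = dev->saved[i];
		written++;
	}
	return written;
}
```

Calling `restore_config_space()` twice on the same device logs the differing registers the first time and nothing the second time, which is one way a block of messages can move or vanish between two otherwise equivalent resume runs.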
* Re: [RFC][PATCH][0/8] PM: Rework suspend-resume ordering to avoid problems with shared interrupts 2009-03-08 19:28 ` [RFC][PATCH][0/8] PM: Rework suspend-resume ordering to avoid problems with shared interrupts Frans Pop @ 2009-03-08 20:50 ` Rafael J. Wysocki 2009-03-08 20:50 ` Rafael J. Wysocki 1 sibling, 0 replies; 373+ messages in thread From: Rafael J. Wysocki @ 2009-03-08 20:50 UTC (permalink / raw) To: Frans Pop; +Cc: linux-pm, torvalds, linux-kernel On Sunday 08 March 2009, Frans Pop wrote: > (Most CCs dropped.) > > Hi Rafael, Hi Frans, > Rafael J. Wysocki wrote: > > The following patches modify the way in which we handle disabling > > interrupts during suspend and enabling them during resume. They also > > change the ordering of the core suspend and hibernation code to take > > advantage of the new approach to the interrupts and modify the PCI PM > > core to avoid a few problems. > > I've given this series a try on my HP 2510p. I've seen no regressions > with suspend to RAM. Great, thanks for testing! > Below is a diff between suspend/resume dmesg from before (based on rc5) > and after (rc7 + series) the patch, with some comments. > Nothing looks really wrong, but there are some surprising changes. > > Essentially JFYI though. > > Cheers, > FJP > > PM: Syncing filesystems ... done. > Freezing user space processes ... (elapsed 0.00 seconds) done. > Freezing remaining freezable tasks ... (elapsed 0.00 seconds) done. > Suspending console(s) (use no_console_suspend to debug) > sd 0:0:0:0: [sda] Synchronizing SCSI cache > sd 0:0:0:0: [sda] Stopping disk > ACPI handle has no context! > ACPI handle has no context! > sdhci-pci 0000:02:06.2: PME# disabled > sdhci-pci 0000:02:06.2: PCI INT C disabled > ACPI handle has no context! > ACPI handle has no context! > # Bogus: result of using wireless instead of wired networking. > +iwlagn 0000:10:00.0: PCI INT A disabled > ata2: port disabled. ignoring. 
> ata_piix 0000:00:1f.1: PCI INT A disabled > ehci_hcd 0000:00:1d.7: PCI INT A disabled > ehci_hcd 0000:00:1d.7: PME# disabled > uhci_hcd 0000:00:1d.2: PCI INT C disabled > uhci_hcd 0000:00:1d.1: PCI INT B disabled > uhci_hcd 0000:00:1d.0: PCI INT A disabled > HDA Intel 0000:00:1b.0: PCI INT A disabled > HDA Intel 0000:00:1b.0: power state changed by ACPI to D3 > ehci_hcd 0000:00:1a.7: PCI INT C disabled > ehci_hcd 0000:00:1a.7: PME# disabled > uhci_hcd 0000:00:1a.1: PCI INT B disabled > uhci_hcd 0000:00:1a.0: PCI INT A disabled > e1000e 0000:00:19.0: PME# enabled > e1000e 0000:00:19.0: wake-up capability enabled by ACPI > e1000e 0000:00:19.0: PME# enabled > e1000e 0000:00:19.0: wake-up capability enabled by ACPI > e1000e 0000:00:19.0: PCI INT A disabled > ACPI handle has no context! > # This has moved up a bit. Looks more logical. This is an intentional result of patch 2/8. > +ricoh-mmc: Suspending. > +ricoh-mmc: Controller is now re-enabled. > ACPI: Preparing to enter system sleep state S3 > Disabling non-boot CPUs ... > CPU 1 is now offline > SMP alternatives: switching to UP code > CPU0 attaching NULL sched-domain. > CPU1 attaching NULL sched-domain. > CPU0 attaching NULL sched-domain. > CPU1 is down > -ricoh-mmc: Suspending. > -ricoh-mmc: Controller is now re-enabled. > Extended CMOS year: 2000 > > Back to C! > +CPU0: Thermal monitoring enabled (TM2) > Extended CMOS year: 2000 > # This whole block has moved up before early config space restores. > # No changes in the block itself. Yes, this is also an intentional result of patch 2/8. > +Enabling non-boot CPUs ... > +SMP alternatives: switching to SMP code > +Booting processor 1 APIC 0x1 ip 0x6000 > +Initializing CPU#1 > +Calibrating delay using timer specific routine.. 
2660.04 BogoMIPS (lpj=5320097) > +CPU: L1 I cache: 32K, L1 D cache: 32K > +CPU: L2 cache: 2048K > +[ds] using Core 2/Atom configuration > +CPU: Physical Processor ID: 0 > +CPU: Processor Core ID: 1 > +CPU1: Thermal monitoring enabled (TM2) > +CPU1: Intel(R) Core(TM)2 Duo CPU U7700 @ 1.33GHz stepping 0d > +CPU0 attaching NULL sched-domain. > +Switched to high resolution mode on CPU 1 > +CPU0 attaching sched-domain: > + domain 0: span 0-1 level MC > + groups: 0 1 > +CPU1 attaching sched-domain: > + domain 0: span 0-1 level MC > + groups: 1 0 > +CPU1 is up > +ACPI: Waking up from system sleep state S3 > pci 0000:00:02.0: restoring config space at offset 0x8 (was 0x1, writing 0x2001) > # These don't need restoring anymore? I think they generally do, but the restored values may be (and often are) identical to the current ones. > -pci 0000:00:02.1: restoring config space at offset 0x4 (was 0x4, writing 0xe0500004) > -pci 0000:00:02.1: restoring config space at offset 0x1 (was 0x900000, writing 0x900007) > -pci 0000:00:03.0: restoring config space at offset 0xf (was 0x100, writing 0x1ff) > -pci 0000:00:03.0: restoring config space at offset 0x4 (was 0xfed12004, writing 0xe0600004) > -pci 0000:00:03.2: restoring config space at offset 0xf (was 0x300, writing 0x30b) > -pci 0000:00:03.2: restoring config space at offset 0x8 (was 0x1, writing 0x2031) > -pci 0000:00:03.2: restoring config space at offset 0x7 (was 0x1, writing 0x2021) > -pci 0000:00:03.2: restoring config space at offset 0x6 (was 0x1, writing 0x2019) > -pci 0000:00:03.2: restoring config space at offset 0x5 (was 0x1, writing 0x2011) > -pci 0000:00:03.2: restoring config space at offset 0x4 (was 0x1, writing 0x2009) > -pci 0000:00:03.2: restoring config space at offset 0x1 (was 0xb00000, writing 0xb00001) > serial 0000:00:03.3: restoring config space at offset 0xf (was 0x200, writing 0x20a) > serial 0000:00:03.3: restoring config space at offset 0x5 (was 0x0, writing 0xe0601000) > serial 0000:00:03.3: restoring 
config space at offset 0x4 (was 0x1, writing 0x2041) > serial 0000:00:03.3: restoring config space at offset 0x1 (was 0xb00000, writing 0xb00007) > e1000e 0000:00:19.0: restoring config space at offset 0xf (was 0x100, writing 0x10b) > e1000e 0000:00:19.0: restoring config space at offset 0x6 (was 0x1, writing 0x2061) > e1000e 0000:00:19.0: restoring config space at offset 0x5 (was 0x0, writing 0xe0640000) > e1000e 0000:00:19.0: restoring config space at offset 0x1 (was 0x100000, writing 0x100007) > # These have moved down to late resume. That's a bit strange. It looks like the registers changed after we had restored them during "early" resume. So either we hadn't actually restored them (it would be interesting to find out why), or they really changed (again, it would be interesting to see why). > -uhci_hcd 0000:00:1a.0: restoring config space at offset 0xf (was 0x100, writing 0x10a) > -uhci_hcd 0000:00:1a.0: restoring config space at offset 0x8 (was 0x1, writing 0x2081) > -uhci_hcd 0000:00:1a.0: restoring config space at offset 0x1 (was 0x2800000, writing 0x2800001) > -uhci_hcd 0000:00:1a.1: restoring config space at offset 0xf (was 0x200, writing 0x20a) > -uhci_hcd 0000:00:1a.1: restoring config space at offset 0x8 (was 0x1, writing 0x20a1) > -uhci_hcd 0000:00:1a.1: restoring config space at offset 0x1 (was 0x2800000, writing 0x2800001) > ehci_hcd 0000:00:1a.7: restoring config space at offset 0xf (was 0x300, writing 0x30b) > ehci_hcd 0000:00:1a.7: restoring config space at offset 0x4 (was 0x0, writing 0xe0641000) > ehci_hcd 0000:00:1a.7: restoring config space at offset 0x1 (was 0x2900000, writing 0x2900002) > HDA Intel 0000:00:1b.0: restoring config space at offset 0xf (was 0x100, writing 0x10a) > HDA Intel 0000:00:1b.0: restoring config space at offset 0x3 (was 0x0, writing 0x10) > HDA Intel 0000:00:1b.0: restoring config space at offset 0x1 (was 0x100000, writing 0x100002) > pcieport-driver 0000:00:1c.0: restoring config space at offset 0xf (was 0x100, writing 
0x4010a) > pcieport-driver 0000:00:1c.0: restoring config space at offset 0x9 (was 0x10001, writing 0x1fff1) > pcieport-driver 0000:00:1c.0: restoring config space at offset 0x8 (was 0x0, writing 0xfff0) > pcieport-driver 0000:00:1c.0: restoring config space at offset 0x7 (was 0x0, writing 0x200000f0) > pcieport-driver 0000:00:1c.0: restoring config space at offset 0x6 (was 0x0, writing 0x80800) > pcieport-driver 0000:00:1c.0: restoring config space at offset 0x3 (was 0x810000, writing 0x810010) > pcieport-driver 0000:00:1c.0: restoring config space at offset 0x1 (was 0x100000, writing 0x100407) > pcieport-driver 0000:00:1c.1: restoring config space at offset 0xf (was 0x200, writing 0x4020a) > pcieport-driver 0000:00:1c.1: restoring config space at offset 0x9 (was 0x10001, writing 0x1fff1) > pcieport-driver 0000:00:1c.1: restoring config space at offset 0x8 (was 0x0, writing 0xe000e000) > pcieport-driver 0000:00:1c.1: restoring config space at offset 0x7 (was 0x0, writing 0xf0) > pcieport-driver 0000:00:1c.1: restoring config space at offset 0x3 (was 0x810000, writing 0x810010) > pcieport-driver 0000:00:1c.1: restoring config space at offset 0x1 (was 0x100000, writing 0x100407) > # These have moved down to late resume. The last comment applies here too. 
> -uhci_hcd 0000:00:1d.0: restoring config space at offset 0xf (was 0x100, writing 0x10a) > -uhci_hcd 0000:00:1d.0: restoring config space at offset 0x8 (was 0x1, writing 0x20c1) > -uhci_hcd 0000:00:1d.0: restoring config space at offset 0x1 (was 0x2800000, writing 0x2800001) > -uhci_hcd 0000:00:1d.1: restoring config space at offset 0xf (was 0x200, writing 0x20b) > -uhci_hcd 0000:00:1d.1: restoring config space at offset 0x8 (was 0x1, writing 0x20e1) > -uhci_hcd 0000:00:1d.1: restoring config space at offset 0x1 (was 0x2800000, writing 0x2800001) > -uhci_hcd 0000:00:1d.2: restoring config space at offset 0xf (was 0x300, writing 0x30b) > -uhci_hcd 0000:00:1d.2: restoring config space at offset 0x8 (was 0x1, writing 0x2101) > -uhci_hcd 0000:00:1d.2: restoring config space at offset 0x1 (was 0x2800000, writing 0x2800001) > ehci_hcd 0000:00:1d.7: restoring config space at offset 0xf (was 0x100, writing 0x10a) > ehci_hcd 0000:00:1d.7: restoring config space at offset 0x4 (was 0x0, writing 0xe0648000) > ehci_hcd 0000:00:1d.7: restoring config space at offset 0x1 (was 0x2900000, writing 0x2900002) > # These have disappeared. Good. > -pci 0000:00:1e.0: restoring config space at offset 0x9 (was 0x10001, writing 0x83f18001) > -pci 0000:00:1e.0: restoring config space at offset 0x8 (was 0x0, writing 0xe030e010) > -pci 0000:00:1e.0: restoring config space at offset 0x7 (was 0x228000f0, writing 0x22803030) > -pci 0000:00:1e.0: restoring config space at offset 0x1 (was 0x100007, writing 0x100107) > # First two moved to late resume. Again, a bit strange. > # The third already happened during late resume (duplicated). 
> -ata_piix 0000:00:1f.1: restoring config space at offset 0xf (was 0x100, writing 0x10a) > -ata_piix 0000:00:1f.1: restoring config space at offset 0x8 (was 0xc01, writing 0x2121) > -ata_piix 0000:00:1f.1: restoring config space at offset 0x1 (was 0x2800005, writing 0x2880005) > iwlagn 0000:10:00.0: restoring config space at offset 0xf (was 0x100, writing 0x10a) > iwlagn 0000:10:00.0: restoring config space at offset 0x4 (was 0x4, writing 0xe0000004) > iwlagn 0000:10:00.0: restoring config space at offset 0x3 (was 0x0, writing 0x10) > iwlagn 0000:10:00.0: restoring config space at offset 0x1 (was 0x100000, writing 0x100006) > yenta_cardbus 0000:02:06.0: restoring config space at offset 0xf (was 0x3000100, writing 0x580010b) > yenta_cardbus 0000:02:06.0: restoring config space at offset 0xe (was 0x0, writing 0x34fc) > yenta_cardbus 0000:02:06.0: restoring config space at offset 0xd (was 0x0, writing 0x3400) > yenta_cardbus 0000:02:06.0: restoring config space at offset 0xc (was 0x0, writing 0x30fc) > yenta_cardbus 0000:02:06.0: restoring config space at offset 0xb (was 0x0, writing 0x3000) > yenta_cardbus 0000:02:06.0: restoring config space at offset 0xa (was 0x0, writing 0x87fff000) > yenta_cardbus 0000:02:06.0: restoring config space at offset 0x9 (was 0x0, writing 0x84000000) > yenta_cardbus 0000:02:06.0: restoring config space at offset 0x8 (was 0x0, writing 0x83fff000) > yenta_cardbus 0000:02:06.0: restoring config space at offset 0x7 (was 0x0, writing 0x80000000) > yenta_cardbus 0000:02:06.0: restoring config space at offset 0x6 (was 0x0, writing 0xb0060302) > yenta_cardbus 0000:02:06.0: restoring config space at offset 0x4 (was 0x0, writing 0xe0100000) > yenta_cardbus 0000:02:06.0: restoring config space at offset 0x3 (was 0x820000, writing 0x82a800) > yenta_cardbus 0000:02:06.0: restoring config space at offset 0x1 (was 0x2100000, writing 0x2100007) > ohci1394 0000:02:06.1: restoring config space at offset 0xf (was 0x4020200, writing 0x4020205) > ohci1394 
0000:02:06.1: restoring config space at offset 0x4 (was 0x0, writing 0xe0101000) > ohci1394 0000:02:06.1: restoring config space at offset 0x3 (was 0x800000, writing 0x804010) > ohci1394 0000:02:06.1: restoring config space at offset 0x1 (was 0x2100000, writing 0x2100006) > sdhci-pci 0000:02:06.2: restoring config space at offset 0xf (was 0x300, writing 0x30a) > sdhci-pci 0000:02:06.2: restoring config space at offset 0x4 (was 0x0, writing 0xe0102000) > sdhci-pci 0000:02:06.2: restoring config space at offset 0x3 (was 0x800000, writing 0x804010) > sdhci-pci 0000:02:06.2: restoring config space at offset 0x1 (was 0x2100000, writing 0x2100006) > # Some changes; a lot just got dropped. > -ricoh-mmc 0000:02:06.3: restoring config space at offset 0xf (was 0x300, writing 0xffffffff) > +ricoh-mmc 0000:02:06.3: restoring config space at offset 0xf (was 0x300, writing 0x30a) > -ricoh-mmc 0000:02:06.3: restoring config space at offset 0xe (was 0x0, writing 0xffffffff) > -ricoh-mmc 0000:02:06.3: restoring config space at offset 0xd (was 0x80, writing 0xffffffff) > -ricoh-mmc 0000:02:06.3: restoring config space at offset 0xc (was 0x0, writing 0xffffffff) > -ricoh-mmc 0000:02:06.3: restoring config space at offset 0xb (was 0x30c9103c, writing 0xffffffff) > -ricoh-mmc 0000:02:06.3: restoring config space at offset 0xa (was 0x0, writing 0xffffffff) > -ricoh-mmc 0000:02:06.3: restoring config space at offset 0x9 (was 0x0, writing 0xffffffff) > -ricoh-mmc 0000:02:06.3: restoring config space at offset 0x8 (was 0x0, writing 0xffffffff) > -ricoh-mmc 0000:02:06.3: restoring config space at offset 0x7 (was 0x0, writing 0xffffffff) > -ricoh-mmc 0000:02:06.3: restoring config space at offset 0x6 (was 0x0, writing 0xffffffff) > -ricoh-mmc 0000:02:06.3: restoring config space at offset 0x5 (was 0x0, writing 0xffffffff) > -ricoh-mmc 0000:02:06.3: restoring config space at offset 0x4 (was 0x0, writing 0xffffffff) > +ricoh-mmc 0000:02:06.3: restoring config space at offset 0x4 (was 0x0, 
writing 0xe0103000) > -ricoh-mmc 0000:02:06.3: restoring config space at offset 0x3 (was 0x800000, writing 0xffffffff) > +ricoh-mmc 0000:02:06.3: restoring config space at offset 0x3 (was 0x800000, writing 0x804010) > -ricoh-mmc 0000:02:06.3: restoring config space at offset 0x2 (was 0x8800011, writing 0xffffffff) > -ricoh-mmc 0000:02:06.3: restoring config space at offset 0x1 (was 0x2100000, writing 0xffffffff) > +ricoh-mmc 0000:02:06.3: restoring config space at offset 0x1 (was 0x2100000, writing 0x2100006) > -ricoh-mmc 0000:02:06.3: restoring config space at offset 0x0 (was 0x8431180, writing 0xffffffff) > ricoh-mmc: Resuming. > ricoh-mmc: Controller is now disabled. > -Enabling non-boot CPUs ... > -SMP alternatives: switching to SMP code > -Booting processor 1 APIC 0x1 ip 0x6000 > -Initializing CPU#1 > -Calibrating delay using timer specific routine.. 2660.07 BogoMIPS (lpj=5320158) > -CPU: L1 I cache: 32K, L1 D cache: 32K > -CPU: L2 cache: 2048K > -[ds] using Core 2/Atom configuration > -CPU: Physical Processor ID: 0 > -CPU: Processor Core ID: 1 > -CPU1: Thermal monitoring enabled (TM2) > -x86 PAT enabled: cpu 1, old 0x7040600070406, new 0x7010600070106 > -CPU1: Intel(R) Core(TM)2 Duo CPU U7700 @ 1.33GHz stepping 0d > -CPU0 attaching NULL sched-domain. 
> -Switched to high resolution mode on CPU 1 > -CPU0 attaching sched-domain: > - domain 0: span 0-1 level MC > - groups: 0 1 > -CPU1 attaching sched-domain: > - domain 0: span 0-1 level MC > - groups: 1 0 > -CPU1 is up > -ACPI: Waking up from system sleep state S3 > ACPI: EC: non-query interrupt received, switching to interrupt mode > pci 0000:00:02.0: restoring config space at offset 0x1 (was 0x900403, writing 0x900003) > pci 0000:00:02.0: PME# disabled > pci 0000:00:02.1: PME# disabled > pci 0000:00:03.0: PME# disabled > pci 0000:00:03.2: PME# disabled > e1000e 0000:00:19.0: PCI INT A -> GSI 22 (level, low) -> IRQ 22 > e1000e 0000:00:19.0: setting latency timer to 64 > e1000e 0000:00:19.0: wake-up capability disabled by ACPI > e1000e 0000:00:19.0: PME# disabled > e1000e 0000:00:19.0: wake-up capability disabled by ACPI > e1000e 0000:00:19.0: PME# disabled > e1000e 0000:00:19.0: irq 26 for MSI/MSI-X > +uhci_hcd 0000:00:1a.0: restoring config space at offset 0xf (was 0x100, writing 0x10a) > +uhci_hcd 0000:00:1a.0: restoring config space at offset 0x8 (was 0x1, writing 0x2081) > +uhci_hcd 0000:00:1a.0: restoring config space at offset 0x1 (was 0x2800000, writing 0x2800001) > uhci_hcd 0000:00:1a.0: PCI INT A -> GSI 16 (level, low) -> IRQ 16 > uhci_hcd 0000:00:1a.0: setting latency timer to 64 > usb usb1: root hub lost power or was reset > +uhci_hcd 0000:00:1a.1: restoring config space at offset 0xf (was 0x200, writing 0x20a) > +uhci_hcd 0000:00:1a.1: restoring config space at offset 0x8 (was 0x1, writing 0x20a1) > +uhci_hcd 0000:00:1a.1: restoring config space at offset 0x1 (was 0x2800000, writing 0x2800001) > uhci_hcd 0000:00:1a.1: PCI INT B -> GSI 17 (level, low) -> IRQ 17 > uhci_hcd 0000:00:1a.1: setting latency timer to 64 > usb usb3: root hub lost power or was reset > ehci_hcd 0000:00:1a.7: PME# disabled > ehci_hcd 0000:00:1a.7: PCI INT C -> GSI 18 (level, low) -> IRQ 18 > ehci_hcd 0000:00:1a.7: setting latency timer to 64 > ehci_hcd 0000:00:1a.7: PME# disabled 
> # Called twice now? > HDA Intel 0000:00:1b.0: power state changed by ACPI to D0 > +HDA Intel 0000:00:1b.0: power state changed by ACPI to D0 Yeah, it's not nice. The problem is that pci_set_power_state() doesn't check whether the power state is already correct before calling the platform to change it. The platform should cope with that, but it shouldn't be called a second time at all. In fact, I have a patch to change this behavior, but I consider it a separate matter. > HDA Intel 0000:00:1b.0: PCI INT A -> GSI 17 (level, low) -> IRQ 17 > HDA Intel 0000:00:1b.0: setting latency timer to 64 > pcieport-driver 0000:00:1c.0: setting latency timer to 64 > pcieport-driver 0000:00:1c.1: setting latency timer to 64 > +uhci_hcd 0000:00:1d.0: restoring config space at offset 0xf (was 0x100, writing 0x10a) > +uhci_hcd 0000:00:1d.0: restoring config space at offset 0x8 (was 0x1, writing 0x20c1) > +uhci_hcd 0000:00:1d.0: restoring config space at offset 0x1 (was 0x2800000, writing 0x2800001) > uhci_hcd 0000:00:1d.0: PCI INT A -> GSI 20 (level, low) -> IRQ 20 > uhci_hcd 0000:00:1d.0: setting latency timer to 64 > usb usb5: root hub lost power or was reset > +uhci_hcd 0000:00:1d.1: restoring config space at offset 0xf (was 0x200, writing 0x20b) > +uhci_hcd 0000:00:1d.1: restoring config space at offset 0x8 (was 0x1, writing 0x20e1) > +uhci_hcd 0000:00:1d.1: restoring config space at offset 0x1 (was 0x2800000, writing 0x2800001) > uhci_hcd 0000:00:1d.1: PCI INT B -> GSI 22 (level, low) -> IRQ 22 > uhci_hcd 0000:00:1d.1: setting latency timer to 64 > usb usb6: root hub lost power or was reset > +uhci_hcd 0000:00:1d.2: restoring config space at offset 0xf (was 0x300, writing 0x30b) > +uhci_hcd 0000:00:1d.2: restoring config space at offset 0x8 (was 0x1, writing 0x2101) > +uhci_hcd 0000:00:1d.2: restoring config space at offset 0x1 (was 0x2800000, writing 0x2800001) > uhci_hcd 0000:00:1d.2: PCI INT C -> GSI 18 (level, low) -> IRQ 18 > uhci_hcd 0000:00:1d.2: setting latency timer 
to 64 > usb usb7: root hub lost power or was reset > ehci_hcd 0000:00:1d.7: PME# disabled > ehci_hcd 0000:00:1d.7: PCI INT A -> GSI 20 (level, low) -> IRQ 20 > ehci_hcd 0000:00:1d.7: setting latency timer to 64 > ehci_hcd 0000:00:1d.7: PME# disabled > pci 0000:00:1e.0: setting latency timer to 64 > +ata_piix 0000:00:1f.1: restoring config space at offset 0xf (was 0x100, writing 0x10a) > +ata_piix 0000:00:1f.1: restoring config space at offset 0x8 (was 0xc01, writing 0x2121) > ata_piix 0000:00:1f.1: restoring config space at offset 0x1 (was 0x2800005, writing 0x2880005) > ata_piix 0000:00:1f.1: PCI INT A -> GSI 16 (level, low) -> IRQ 16 > ata_piix 0000:00:1f.1: setting latency timer to 64 > ata2: port disabled. ignoring. > ACPI Exception (exoparg2-0445): AE_AML_PACKAGE_LIMIT, Index (000000005) is beyond end of object [20081204] > ACPI Error (psparse-0537): Method parse/execution failed [\_SB_.C2C3] (Node ffff88007e01dea0), AE_AML_PACKAGE_LIMIT > ACPI Error (psparse-0537): Method parse/execution failed [\_SB_.C003.C0F6.C3F3._STM] (Node ffff88007e043de0), AE_AML_PACKAGE_LIMIT > ata1: ACPI set timing mode failed (status=0x300b) > # Remaining differences are bogus: result of using wireless instead of wired networking. OK Thanks for the debugging work. Best, Rafael ^ permalink raw reply [flat|nested] 373+ messages in thread
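[Editor's sketch] The duplicated "power state changed by ACPI to D0" line discussed above comes from asking the platform for a state the device is already in. The fragment below is a minimal, self-contained illustration of the check Rafael describes as missing, not the actual kernel change. All names in it (struct fake_dev, platform_set_state(), set_power_state_checked(), demo_double_resume()) are hypothetical stand-ins and are not kernel APIs.

```c
#include <stdio.h>

/* Hypothetical stand-ins for illustration; not the kernel's real types. */
enum power_state { D0 = 0, D1, D2, D3HOT, D3COLD, STATE_UNKNOWN };

struct fake_dev {
	enum power_state current_state;
	int platform_calls;	/* how many times the "platform" was asked */
};

/* Stand-in for the platform (ACPI) hook that logs the transition. */
static int platform_set_state(struct fake_dev *dev, enum power_state state)
{
	dev->platform_calls++;
	dev->current_state = state;
	printf("power state changed by platform to D%d\n", (int)state);
	return 0;
}

/* Skip the platform call when the device is already in the target state. */
static int set_power_state_checked(struct fake_dev *dev, enum power_state state)
{
	if (dev->current_state == state)
		return 0;
	return platform_set_state(dev, state);
}

/*
 * Request D0 twice, as happens in the resume log above, and return the
 * number of times the platform was actually called.
 */
int demo_double_resume(void)
{
	struct fake_dev hda = { .current_state = D3HOT, .platform_calls = 0 };

	set_power_state_checked(&hda, D0);	/* reaches the platform */
	set_power_state_checked(&hda, D0);	/* already in D0: no platform call */
	return hda.platform_calls;
}
```

With the early return in place, the second request never reaches the platform hook, so the transition would be logged once rather than twice. Calling demo_double_resume() from a small main() is enough to try it.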
* Re: [RFC][PATCH][0/8] PM: Rework suspend-resume ordering to avoid problems with shared interrupts 2009-03-08 20:50 ` Rafael J. Wysocki @ 2009-03-14 8:44 ` Frans Pop 2009-03-14 11:59 ` Rafael J. Wysocki 2009-03-14 11:59 ` Rafael J. Wysocki 2009-03-14 8:44 ` Frans Pop 1 sibling, 2 replies; 373+ messages in thread From: Frans Pop @ 2009-03-14 8:44 UTC (permalink / raw) To: Rafael J. Wysocki; +Cc: linux-kernel, torvalds, linux-pm On Sunday 08 March 2009, Rafael J. Wysocki wrote: > > # These don't need restoring anymore? > > I think they generally do, but the restored values may (and often are) > identical to the current ones. > > > -pci 0000:00:02.1: restoring config space at offset 0x4 (was 0x4, writing 0xe0500004) > > -pci 0000:00:02.1: restoring config space at offset 0x1 (was 0x900000, writing 0x900007) > > -pci 0000:00:03.0: restoring config space at offset 0xf (was 0x100, writing 0x1ff) > > -pci 0000:00:03.0: restoring config space at offset 0x4 (was 0xfed12004, writing 0xe0600004) > > -pci 0000:00:03.2: restoring config space at offset 0xf (was 0x300, writing 0x30b) > > -pci 0000:00:03.2: restoring config space at offset 0x8 (was 0x1, writing 0x2031) > > -pci 0000:00:03.2: restoring config space at offset 0x7 (was 0x1, writing 0x2021) > > -pci 0000:00:03.2: restoring config space at offset 0x6 (was 0x1, writing 0x2019) > > -pci 0000:00:03.2: restoring config space at offset 0x5 (was 0x1, writing 0x2011) > > -pci 0000:00:03.2: restoring config space at offset 0x4 (was 0x1, writing 0x2009) > > -pci 0000:00:03.2: restoring config space at offset 0x1 (was 0xb00000, writing 0xb00001) [...] > > # These have moved down to late resume. > > That's a bit strange. It looks like the registers changed after we had > restored them during "early" resume. So either we hadn't actually > restored them (it would be interesting to find out why), or they really > changed (again, it would be interesting to see why). 
> > > -uhci_hcd 0000:00:1a.0: restoring config space at offset 0xf (was 0x100, writing 0x10a) > > -uhci_hcd 0000:00:1a.0: restoring config space at offset 0x8 (was 0x1, writing 0x2081) > > -uhci_hcd 0000:00:1a.0: restoring config space at offset 0x1 (was 0x2800000, writing 0x2800001) > > -uhci_hcd 0000:00:1a.1: restoring config space at offset 0xf (was 0x200, writing 0x20a) > > -uhci_hcd 0000:00:1a.1: restoring config space at offset 0x8 (was 0x1, writing 0x20a1) > > -uhci_hcd 0000:00:1a.1: restoring config space at offset 0x1 (was 0x2800000, writing 0x2800001) These changes look to have been reverted somehow with rc8 + your latest patch set. Not sure if it's due to changes in the patches, or just an effect of local circumstances (such as (un)suspending while the system is docked). Or sun spots of course. The "restoring config space" messages now look virtually the same as for rc5, only some messages for the ricoh-mmc module are still "missing", but I'm not worried about that. Cheers, FJP ^ permalink raw reply [flat|nested] 373+ messages in thread
* Re: [RFC][PATCH][0/8] PM: Rework suspend-resume ordering to avoid problems with shared interrupts 2009-03-14 8:44 ` Frans Pop @ 2009-03-14 11:59 ` Rafael J. Wysocki 2009-03-14 14:11 ` Frans Pop 2009-03-14 14:11 ` Frans Pop 2009-03-14 11:59 ` Rafael J. Wysocki 1 sibling, 2 replies; 373+ messages in thread From: Rafael J. Wysocki @ 2009-03-14 11:59 UTC (permalink / raw) To: Frans Pop; +Cc: linux-kernel, torvalds, linux-pm On Saturday 14 March 2009, Frans Pop wrote: > On Sunday 08 March 2009, Rafael J. Wysocki wrote: > > > # These don't need restoring anymore? > > > > I think they generally do, but the restored values may (and often are) > > identical to the current ones. > > > > > -pci 0000:00:02.1: restoring config space at offset 0x4 (was 0x4, writing 0xe0500004) > > > -pci 0000:00:02.1: restoring config space at offset 0x1 (was 0x900000, writing 0x900007) > > > -pci 0000:00:03.0: restoring config space at offset 0xf (was 0x100, writing 0x1ff) > > > -pci 0000:00:03.0: restoring config space at offset 0x4 (was 0xfed12004, writing 0xe0600004) > > > -pci 0000:00:03.2: restoring config space at offset 0xf (was 0x300, writing 0x30b) > > > -pci 0000:00:03.2: restoring config space at offset 0x8 (was 0x1, writing 0x2031) > > > -pci 0000:00:03.2: restoring config space at offset 0x7 (was 0x1, writing 0x2021) > > > -pci 0000:00:03.2: restoring config space at offset 0x6 (was 0x1, writing 0x2019) > > > -pci 0000:00:03.2: restoring config space at offset 0x5 (was 0x1, writing 0x2011) > > > -pci 0000:00:03.2: restoring config space at offset 0x4 (was 0x1, writing 0x2009) > > > -pci 0000:00:03.2: restoring config space at offset 0x1 (was 0xb00000, writing 0xb00001) > [...] > > > # These have moved down to late resume. > > > > That's a bit strange. It looks like the registers changed after we had > > restored them during "early" resume. 
So either we hadn't actually
> > restored them (it would be interesting to find out why), or they really
> > changed (again, it would be interesting to see why).
> >
> > > -uhci_hcd 0000:00:1a.0: restoring config space at offset 0xf (was 0x100, writing 0x10a)
> > > -uhci_hcd 0000:00:1a.0: restoring config space at offset 0x8 (was 0x1, writing 0x2081)
> > > -uhci_hcd 0000:00:1a.0: restoring config space at offset 0x1 (was 0x2800000, writing 0x2800001)
> > > -uhci_hcd 0000:00:1a.1: restoring config space at offset 0xf (was 0x200, writing 0x20a)
> > > -uhci_hcd 0000:00:1a.1: restoring config space at offset 0x8 (was 0x1, writing 0x20a1)
> > > -uhci_hcd 0000:00:1a.1: restoring config space at offset 0x1 (was 0x2800000, writing 0x2800001)
>
> These changes look to have been reverted somehow with rc8 + your latest
> patch set. Not sure if it's due to changes in the patches, or just an
> effect of local circumstances (such as (un)suspending while the system
> is docked). Or sun spots of course.
>
> The "restoring config space" messages now look virtually the same
> as for rc5, only some messages for the ricoh-mmc module are still
> "missing", but I'm not worried about that.

Thanks for testing!

Could you please also test the last iteration of the $subject patch series
(just sent) with the appended patch applied on top and post dmesg output?

Rafael

---
 drivers/pci/pci-driver.c |   23 +++++++++++++++++++++--
 drivers/pci/pci.c        |    5 +++++
 2 files changed, 26 insertions(+), 2 deletions(-)

Index: linux-2.6/drivers/pci/pci.c
===================================================================
--- linux-2.6.orig/drivers/pci/pci.c
+++ linux-2.6/drivers/pci/pci.c
@@ -732,6 +732,9 @@ int
 pci_save_state(struct pci_dev *dev)
 {
 	int i;
+
+	dev_info(&dev->dev, "saving PCI config space\n");
+
 	/* XXX: 100% dword access ok here? */
 	for (i = 0; i < 16; i++)
 		pci_read_config_dword(dev, i * 4,&dev->saved_config_space[i]);
@@ -753,6 +756,8 @@ pci_restore_state(struct pci_dev *dev)
 	int i;
 	u32 val;

+	dev_info(&dev->dev, "restoring PCI config space\n");
+
 	/* PCI Express register must be restored first */
 	pci_restore_pcie_state(dev);

Index: linux-2.6/drivers/pci/pci-driver.c
===================================================================
--- linux-2.6.orig/drivers/pci/pci-driver.c
+++ linux-2.6/drivers/pci/pci-driver.c
@@ -438,10 +438,24 @@ static int pci_restore_standard_config(s
 {
 	pci_update_current_state(pci_dev, PCI_UNKNOWN);

+	switch (pci_dev->current_state) {
+	case PCI_UNKNOWN:
+	case PCI_POWER_ERROR:
+		dev_info(&pci_dev->dev, "%s: unknown power state\n",
+			__FUNCTION__);
+		break;
+	default:
+		dev_info(&pci_dev->dev, "%s: power state D%d\n",
+			__FUNCTION__, pci_dev->current_state);
+	}
+
 	if (pci_dev->current_state != PCI_D0) {
 		int error = pci_set_power_state(pci_dev, PCI_D0);
-		if (error)
+		if (error) {
+			dev_err(&pci_dev->dev,
+				"error %d while changing power state\n", error);
 			return error;
+		}
 	}

 	return pci_dev->state_saved ? pci_restore_state(pci_dev) : 0;
@@ -449,6 +463,8 @@ static int pci_restore_standard_config(s

 static void pci_pm_default_resume_noirq(struct pci_dev *pci_dev)
 {
+	dev_info(&pci_dev->dev, "%s: calling pci_restore_standard_config()\n",
+		__FUNCTION__);
 	pci_restore_standard_config(pci_dev);
 	pci_dev->state_saved = false;
 	pci_fixup_device(pci_fixup_resume_early, pci_dev);
@@ -615,8 +631,11 @@ static int pci_pm_resume(struct device *
 	 * This is necessary for the suspend error path in which resume is
 	 * called without restoring the standard config registers of the device.
 	 */
-	if (pci_dev->state_saved)
+	if (pci_dev->state_saved) {
+		dev_info(dev, "%s: restoring standard PCI config registers\n",
+			__FUNCTION__);
 		pci_restore_standard_config(pci_dev);
+	}

 	if (pci_has_legacy_pm_support(pci_dev))
 		return pci_legacy_resume(dev);

^ permalink raw reply [flat|nested] 373+ messages in thread
* Re: [RFC][PATCH][0/8] PM: Rework suspend-resume ordering to avoid problems with shared interrupts 2009-03-14 11:59 ` Rafael J. Wysocki @ 2009-03-14 14:11 ` Frans Pop 2009-03-14 14:11 ` Frans Pop 1 sibling, 0 replies; 373+ messages in thread From: Frans Pop @ 2009-03-14 14:11 UTC (permalink / raw) To: Rafael J. Wysocki; +Cc: linux-pm, torvalds, linux-kernel [-- Attachment #1: Type: text/plain, Size: 368 bytes --] On Saturday 14 March 2009, you wrote: > Could you please also test the last iteration of the $subject patch > series (just sent) with the appended patch applied on top and post > dmesg output? Here you are: - boot - STR with wireless networking - STD with wireless networking - STR with wired networking and killswitch on wireless No problems seen :-) Cheers, FJP [-- Attachment #2: 2.6.29-rc8-rjw-test.gz --] [-- Type: application/x-gzip, Size: 14185 bytes --] [-- Attachment #3: Type: text/plain, Size: 0 bytes --] ^ permalink raw reply [flat|nested] 373+ messages in thread
* Re: [RFC][PATCH][0/8] PM: Rework suspend-resume ordering to avoid problems with shared interrupts
  2009-03-14 14:11 ` Frans Pop
@ 2009-03-14 22:31 ` Rafael J. Wysocki
  2009-03-14 22:31 ` Rafael J. Wysocki
  1 sibling, 0 replies; 373+ messages in thread
From: Rafael J. Wysocki @ 2009-03-14 22:31 UTC (permalink / raw)
  To: Frans Pop; +Cc: linux-pm, torvalds, linux-kernel

On Saturday 14 March 2009, Frans Pop wrote:
> On Saturday 14 March 2009, you wrote:
> > Could you please also test the last iteration of the $subject patch
> > series (just sent) with the appended patch applied on top and post
> > dmesg output?
>
> Here you are:
> - boot
> - STR with wireless networking
> - STD with wireless networking
> - STR with wired networking and killswitch on wireless
>
> No problems seen :-)

Great, thanks for the log, it looks correct.

Best,
Rafael

^ permalink raw reply [flat|nested] 373+ messages in thread
* Re: [RFC][PATCH][0/8] PM: Rework suspend-resume ordering to avoid problems with shared interrupts
  2009-03-14 8:44 ` Frans Pop
  2009-03-14 11:59 ` Rafael J. Wysocki
@ 2009-03-14 11:59 ` Rafael J. Wysocki
  1 sibling, 0 replies; 373+ messages in thread
From: Rafael J. Wysocki @ 2009-03-14 11:59 UTC (permalink / raw)
  To: Frans Pop; +Cc: linux-pm, torvalds, linux-kernel

On Saturday 14 March 2009, Frans Pop wrote:
> On Sunday 08 March 2009, Rafael J. Wysocki wrote:
> > > # These don't need restoring anymore?
> >
> > I think they generally do, but the restored values may (and often are)
> > identical to the current ones.
> >
> > > -pci 0000:00:02.1: restoring config space at offset 0x4 (was 0x4, writing 0xe0500004)
> > > -pci 0000:00:02.1: restoring config space at offset 0x1 (was 0x900000, writing 0x900007)
> > > -pci 0000:00:03.0: restoring config space at offset 0xf (was 0x100, writing 0x1ff)
> > > -pci 0000:00:03.0: restoring config space at offset 0x4 (was 0xfed12004, writing 0xe0600004)
> > > -pci 0000:00:03.2: restoring config space at offset 0xf (was 0x300, writing 0x30b)
> > > -pci 0000:00:03.2: restoring config space at offset 0x8 (was 0x1, writing 0x2031)
> > > -pci 0000:00:03.2: restoring config space at offset 0x7 (was 0x1, writing 0x2021)
> > > -pci 0000:00:03.2: restoring config space at offset 0x6 (was 0x1, writing 0x2019)
> > > -pci 0000:00:03.2: restoring config space at offset 0x5 (was 0x1, writing 0x2011)
> > > -pci 0000:00:03.2: restoring config space at offset 0x4 (was 0x1, writing 0x2009)
> > > -pci 0000:00:03.2: restoring config space at offset 0x1 (was 0xb00000, writing 0xb00001)
> [...]
> > > # These have moved down to late resume.
> >
> > That's a bit strange. It looks like the registers changed after we had
> > restored them during "early" resume. So either we hadn't actually
> > restored them (it would be interesting to find out why), or they really
> > changed (again, it would be interesting to see why).
> >
> > > -uhci_hcd 0000:00:1a.0: restoring config space at offset 0xf (was 0x100, writing 0x10a)
> > > -uhci_hcd 0000:00:1a.0: restoring config space at offset 0x8 (was 0x1, writing 0x2081)
> > > -uhci_hcd 0000:00:1a.0: restoring config space at offset 0x1 (was 0x2800000, writing 0x2800001)
> > > -uhci_hcd 0000:00:1a.1: restoring config space at offset 0xf (was 0x200, writing 0x20a)
> > > -uhci_hcd 0000:00:1a.1: restoring config space at offset 0x8 (was 0x1, writing 0x20a1)
> > > -uhci_hcd 0000:00:1a.1: restoring config space at offset 0x1 (was 0x2800000, writing 0x2800001)
>
> These changes look to have been reverted somehow with rc8 + your latest
> patch set. Not sure if it's due to changes in the patches, or just an
> effect of local circumstances (such as (un)suspending while the system
> is docked). Or sun spots of course.
>
> The "restoring config space" messages now look virtually the same
> as for rc5, only some messages for the ricoh-mmc module are still
> "missing", but I'm not worried about that.

Thanks for testing!

Could you please also test the last iteration of the $subject patch series
(just sent) with the appended patch applied on top and post dmesg output?

Rafael

---
 drivers/pci/pci-driver.c | 23 +++++++++++++++++++++--
 drivers/pci/pci.c        |  5 +++++
 2 files changed, 26 insertions(+), 2 deletions(-)

Index: linux-2.6/drivers/pci/pci.c
===================================================================
--- linux-2.6.orig/drivers/pci/pci.c
+++ linux-2.6/drivers/pci/pci.c
@@ -732,6 +732,9 @@ int pci_save_state(struct pci_dev *dev)
 {
 	int i;
+
+	dev_info(&dev->dev, "saving PCI config space\n");
+
 	/* XXX: 100% dword access ok here? */
 	for (i = 0; i < 16; i++)
 		pci_read_config_dword(dev, i * 4,&dev->saved_config_space[i]);
@@ -753,6 +756,8 @@ pci_restore_state(struct pci_dev *dev)
 	int i;
 	u32 val;
 
+	dev_info(&dev->dev, "restoring PCI config space\n");
+
 	/* PCI Express register must be restored first */
 	pci_restore_pcie_state(dev);
 
Index: linux-2.6/drivers/pci/pci-driver.c
===================================================================
--- linux-2.6.orig/drivers/pci/pci-driver.c
+++ linux-2.6/drivers/pci/pci-driver.c
@@ -438,10 +438,24 @@ static int pci_restore_standard_config(s
 {
 	pci_update_current_state(pci_dev, PCI_UNKNOWN);
 
+	switch (pci_dev->current_state) {
+	case PCI_UNKNOWN:
+	case PCI_POWER_ERROR:
+		dev_info(&pci_dev->dev, "%s: unknown power state\n",
+			__FUNCTION__);
+		break;
+	default:
+		dev_info(&pci_dev->dev, "%s: power state D%d\n",
+			__FUNCTION__, pci_dev->current_state);
+	}
+
 	if (pci_dev->current_state != PCI_D0) {
 		int error = pci_set_power_state(pci_dev, PCI_D0);
-		if (error)
+		if (error) {
+			dev_err(&pci_dev->dev,
+				"error %d while changing power state\n", error);
 			return error;
+		}
 	}
 
 	return pci_dev->state_saved ? pci_restore_state(pci_dev) : 0;
@@ -449,6 +463,8 @@ static int pci_restore_standard_config(s
 
 static void pci_pm_default_resume_noirq(struct pci_dev *pci_dev)
 {
+	dev_info(&pci_dev->dev, "%s: calling pci_restore_standard_config()\n",
+		__FUNCTION__);
 	pci_restore_standard_config(pci_dev);
 	pci_dev->state_saved = false;
 	pci_fixup_device(pci_fixup_resume_early, pci_dev);
@@ -615,8 +631,11 @@ static int pci_pm_resume(struct device *
 	 * This is necessary for the suspend error path in which resume is
 	 * called without restoring the standard config registers of the device.
 	 */
-	if (pci_dev->state_saved)
+	if (pci_dev->state_saved) {
+		dev_info(dev, "%s: restoring standard PCI config registers\n",
+			__FUNCTION__);
 		pci_restore_standard_config(pci_dev);
+	}
 
 	if (pci_has_legacy_pm_support(pci_dev))
 		return pci_legacy_resume(dev);

^ permalink raw reply [flat|nested] 373+ messages in thread
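The exchange above turns on when a "restoring config space at offset ..." line is printed at all. A minimal userspace sketch of that behaviour follows, assuming (as the log format and the remark about restored values being identical to the current ones suggest) that the restore path compares each saved dword with the live register and rewrites it only on a mismatch. `struct fake_pci_dev` and `fake_restore_state()` are hypothetical stand-ins for illustration, not kernel interfaces; the kernel itself goes through pci_read_config_dword() and pci_write_config_dword().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

#define CFG_DWORDS 16 /* 64-byte standard header = 16 dwords */

/* Hypothetical stand-in for a device: live registers plus the saved copy. */
struct fake_pci_dev {
	uint32_t config[CFG_DWORDS];             /* what the hardware holds now */
	uint32_t saved_config_space[CFG_DWORDS]; /* snapshot taken at suspend */
};

/*
 * Walk the header from the highest dword down (the logs in this thread
 * list offsets in descending order) and rewrite only the dwords that
 * differ from the snapshot. Returns the number of dwords rewritten.
 */
int fake_restore_state(struct fake_pci_dev *dev)
{
	int i, written = 0;

	assert(dev != NULL);

	for (i = CFG_DWORDS - 1; i >= 0; i--) {
		uint32_t val = dev->config[i];

		if (val == dev->saved_config_space[i])
			continue; /* identical value: nothing logged, nothing written */

		printf("restoring config space at offset 0x%x (was 0x%x, writing 0x%x)\n",
		       (unsigned int)i, (unsigned int)val,
		       (unsigned int)dev->saved_config_space[i]);
		dev->config[i] = dev->saved_config_space[i];
		written++;
	}
	return written;
}
```

Under this assumption, a device whose registers already match the snapshot produces no log lines, which is one way the messages could "disappear" without the restore being skipped. A second call right after a successful restore returns 0 for the same reason.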
* Re: [RFC][PATCH][0/8] PM: Rework suspend-resume ordering to avoid problems with shared interrupts
  2009-03-08 20:50 ` Rafael J. Wysocki
  2009-03-14 8:44 ` Frans Pop
@ 2009-03-14 8:44 ` Frans Pop
  1 sibling, 0 replies; 373+ messages in thread
From: Frans Pop @ 2009-03-14 8:44 UTC (permalink / raw)
  To: Rafael J. Wysocki; +Cc: linux-pm, torvalds, linux-kernel

On Sunday 08 March 2009, Rafael J. Wysocki wrote:
> > # These don't need restoring anymore?
>
> I think they generally do, but the restored values may (and often are)
> identical to the current ones.
>
> > -pci 0000:00:02.1: restoring config space at offset 0x4 (was 0x4, writing 0xe0500004)
> > -pci 0000:00:02.1: restoring config space at offset 0x1 (was 0x900000, writing 0x900007)
> > -pci 0000:00:03.0: restoring config space at offset 0xf (was 0x100, writing 0x1ff)
> > -pci 0000:00:03.0: restoring config space at offset 0x4 (was 0xfed12004, writing 0xe0600004)
> > -pci 0000:00:03.2: restoring config space at offset 0xf (was 0x300, writing 0x30b)
> > -pci 0000:00:03.2: restoring config space at offset 0x8 (was 0x1, writing 0x2031)
> > -pci 0000:00:03.2: restoring config space at offset 0x7 (was 0x1, writing 0x2021)
> > -pci 0000:00:03.2: restoring config space at offset 0x6 (was 0x1, writing 0x2019)
> > -pci 0000:00:03.2: restoring config space at offset 0x5 (was 0x1, writing 0x2011)
> > -pci 0000:00:03.2: restoring config space at offset 0x4 (was 0x1, writing 0x2009)
> > -pci 0000:00:03.2: restoring config space at offset 0x1 (was 0xb00000, writing 0xb00001)
[...]
> > # These have moved down to late resume.
>
> That's a bit strange. It looks like the registers changed after we had
> restored them during "early" resume. So either we hadn't actually
> restored them (it would be interesting to find out why), or they really
> changed (again, it would be interesting to see why).
>
> > -uhci_hcd 0000:00:1a.0: restoring config space at offset 0xf (was 0x100, writing 0x10a)
> > -uhci_hcd 0000:00:1a.0: restoring config space at offset 0x8 (was 0x1, writing 0x2081)
> > -uhci_hcd 0000:00:1a.0: restoring config space at offset 0x1 (was 0x2800000, writing 0x2800001)
> > -uhci_hcd 0000:00:1a.1: restoring config space at offset 0xf (was 0x200, writing 0x20a)
> > -uhci_hcd 0000:00:1a.1: restoring config space at offset 0x8 (was 0x1, writing 0x20a1)
> > -uhci_hcd 0000:00:1a.1: restoring config space at offset 0x1 (was 0x2800000, writing 0x2800001)

These changes look to have been reverted somehow with rc8 + your latest
patch set. Not sure if it's due to changes in the patches, or just an
effect of local circumstances (such as (un)suspending while the system
is docked). Or sun spots of course.

The "restoring config space" messages now look virtually the same
as for rc5, only some messages for the ricoh-mmc module are still
"missing", but I'm not worried about that.

Cheers,
FJP

^ permalink raw reply [flat|nested] 373+ messages in thread
* Re: [RFC][PATCH][0/8] PM: Rework suspend-resume ordering to avoid problems with shared interrupts
  2009-03-07 10:19 ` Rafael J. Wysocki
  ` (15 preceding siblings ...)
  2009-03-08 19:28 ` [RFC][PATCH][0/8] PM: Rework suspend-resume ordering to avoid problems with shared interrupts Frans Pop
@ 2009-03-08 19:28 ` Frans Pop
  16 siblings, 0 replies; 373+ messages in thread
From: Frans Pop @ 2009-03-08 19:28 UTC (permalink / raw)
  To: Rafael J. Wysocki; +Cc: linux-pm, torvalds, linux-kernel

(Most CCs dropped.)

Hi Rafael,

Rafael J. Wysocki wrote:
> The following patches modify the way in which we handle disabling
> interrupts during suspend and enabling them during resume. They also
> change the ordering of the core suspend and hibernation code to take
> advantage of the new approach to the interrupts and modify the PCI PM
> core to avoid a few problems.

I've given this series a try on my HP 2510p. I've seen no regressions with
suspend to RAM.

Below is a diff between suspend/resume dmesg from before (based on rc5) and
after (rc7 + series) the patch, with some comments. Nothing looks really
wrong, but there are some surprising changes. Essentially JFYI though.

Cheers,
FJP

PM: Syncing filesystems ... done.
Freezing user space processes ... (elapsed 0.00 seconds) done.
Freezing remaining freezable tasks ... (elapsed 0.00 seconds) done.
Suspending console(s) (use no_console_suspend to debug)
sd 0:0:0:0: [sda] Synchronizing SCSI cache
sd 0:0:0:0: [sda] Stopping disk
ACPI handle has no context!
ACPI handle has no context!
sdhci-pci 0000:02:06.2: PME# disabled
sdhci-pci 0000:02:06.2: PCI INT C disabled
ACPI handle has no context!
ACPI handle has no context!
# Bogus: result of using wireless instead of wired networking.
+iwlagn 0000:10:00.0: PCI INT A disabled
ata2: port disabled. ignoring.
ata_piix 0000:00:1f.1: PCI INT A disabled ehci_hcd 0000:00:1d.7: PCI INT A disabled ehci_hcd 0000:00:1d.7: PME# disabled uhci_hcd 0000:00:1d.2: PCI INT C disabled uhci_hcd 0000:00:1d.1: PCI INT B disabled uhci_hcd 0000:00:1d.0: PCI INT A disabled HDA Intel 0000:00:1b.0: PCI INT A disabled HDA Intel 0000:00:1b.0: power state changed by ACPI to D3 ehci_hcd 0000:00:1a.7: PCI INT C disabled ehci_hcd 0000:00:1a.7: PME# disabled uhci_hcd 0000:00:1a.1: PCI INT B disabled uhci_hcd 0000:00:1a.0: PCI INT A disabled e1000e 0000:00:19.0: PME# enabled e1000e 0000:00:19.0: wake-up capability enabled by ACPI e1000e 0000:00:19.0: PME# enabled e1000e 0000:00:19.0: wake-up capability enabled by ACPI e1000e 0000:00:19.0: PCI INT A disabled ACPI handle has no context! # This has moved up a bit. Looks more logical. +ricoh-mmc: Suspending. +ricoh-mmc: Controller is now re-enabled. ACPI: Preparing to enter system sleep state S3 Disabling non-boot CPUs ... CPU 1 is now offline SMP alternatives: switching to UP code CPU0 attaching NULL sched-domain. CPU1 attaching NULL sched-domain. CPU0 attaching NULL sched-domain. CPU1 is down -ricoh-mmc: Suspending. -ricoh-mmc: Controller is now re-enabled. Extended CMOS year: 2000 Back to C! +CPU0: Thermal monitoring enabled (TM2) Extended CMOS year: 2000 # This whole block has moved up before early config space restores. # No changes in the block itself. +Enabling non-boot CPUs ... +SMP alternatives: switching to SMP code +Booting processor 1 APIC 0x1 ip 0x6000 +Initializing CPU#1 +Calibrating delay using timer specific routine.. 2660.04 BogoMIPS (lpj=5320097) +CPU: L1 I cache: 32K, L1 D cache: 32K +CPU: L2 cache: 2048K +[ds] using Core 2/Atom configuration +CPU: Physical Processor ID: 0 +CPU: Processor Core ID: 1 +CPU1: Thermal monitoring enabled (TM2) +CPU1: Intel(R) Core(TM)2 Duo CPU U7700 @ 1.33GHz stepping 0d +CPU0 attaching NULL sched-domain. 
+Switched to high resolution mode on CPU 1 +CPU0 attaching sched-domain: + domain 0: span 0-1 level MC + groups: 0 1 +CPU1 attaching sched-domain: + domain 0: span 0-1 level MC + groups: 1 0 +CPU1 is up +ACPI: Waking up from system sleep state S3 pci 0000:00:02.0: restoring config space at offset 0x8 (was 0x1, writing 0x2001) # These don't need restoring anymore? -pci 0000:00:02.1: restoring config space at offset 0x4 (was 0x4, writing 0xe0500004) -pci 0000:00:02.1: restoring config space at offset 0x1 (was 0x900000, writing 0x900007) -pci 0000:00:03.0: restoring config space at offset 0xf (was 0x100, writing 0x1ff) -pci 0000:00:03.0: restoring config space at offset 0x4 (was 0xfed12004, writing 0xe0600004) -pci 0000:00:03.2: restoring config space at offset 0xf (was 0x300, writing 0x30b) -pci 0000:00:03.2: restoring config space at offset 0x8 (was 0x1, writing 0x2031) -pci 0000:00:03.2: restoring config space at offset 0x7 (was 0x1, writing 0x2021) -pci 0000:00:03.2: restoring config space at offset 0x6 (was 0x1, writing 0x2019) -pci 0000:00:03.2: restoring config space at offset 0x5 (was 0x1, writing 0x2011) -pci 0000:00:03.2: restoring config space at offset 0x4 (was 0x1, writing 0x2009) -pci 0000:00:03.2: restoring config space at offset 0x1 (was 0xb00000, writing 0xb00001) serial 0000:00:03.3: restoring config space at offset 0xf (was 0x200, writing 0x20a) serial 0000:00:03.3: restoring config space at offset 0x5 (was 0x0, writing 0xe0601000) serial 0000:00:03.3: restoring config space at offset 0x4 (was 0x1, writing 0x2041) serial 0000:00:03.3: restoring config space at offset 0x1 (was 0xb00000, writing 0xb00007) e1000e 0000:00:19.0: restoring config space at offset 0xf (was 0x100, writing 0x10b) e1000e 0000:00:19.0: restoring config space at offset 0x6 (was 0x1, writing 0x2061) e1000e 0000:00:19.0: restoring config space at offset 0x5 (was 0x0, writing 0xe0640000) e1000e 0000:00:19.0: restoring config space at offset 0x1 (was 0x100000, writing 0x100007) # 
These have moved down to late resume. -uhci_hcd 0000:00:1a.0: restoring config space at offset 0xf (was 0x100, writing 0x10a) -uhci_hcd 0000:00:1a.0: restoring config space at offset 0x8 (was 0x1, writing 0x2081) -uhci_hcd 0000:00:1a.0: restoring config space at offset 0x1 (was 0x2800000, writing 0x2800001) -uhci_hcd 0000:00:1a.1: restoring config space at offset 0xf (was 0x200, writing 0x20a) -uhci_hcd 0000:00:1a.1: restoring config space at offset 0x8 (was 0x1, writing 0x20a1) -uhci_hcd 0000:00:1a.1: restoring config space at offset 0x1 (was 0x2800000, writing 0x2800001) ehci_hcd 0000:00:1a.7: restoring config space at offset 0xf (was 0x300, writing 0x30b) ehci_hcd 0000:00:1a.7: restoring config space at offset 0x4 (was 0x0, writing 0xe0641000) ehci_hcd 0000:00:1a.7: restoring config space at offset 0x1 (was 0x2900000, writing 0x2900002) HDA Intel 0000:00:1b.0: restoring config space at offset 0xf (was 0x100, writing 0x10a) HDA Intel 0000:00:1b.0: restoring config space at offset 0x3 (was 0x0, writing 0x10) HDA Intel 0000:00:1b.0: restoring config space at offset 0x1 (was 0x100000, writing 0x100002) pcieport-driver 0000:00:1c.0: restoring config space at offset 0xf (was 0x100, writing 0x4010a) pcieport-driver 0000:00:1c.0: restoring config space at offset 0x9 (was 0x10001, writing 0x1fff1) pcieport-driver 0000:00:1c.0: restoring config space at offset 0x8 (was 0x0, writing 0xfff0) pcieport-driver 0000:00:1c.0: restoring config space at offset 0x7 (was 0x0, writing 0x200000f0) pcieport-driver 0000:00:1c.0: restoring config space at offset 0x6 (was 0x0, writing 0x80800) pcieport-driver 0000:00:1c.0: restoring config space at offset 0x3 (was 0x810000, writing 0x810010) pcieport-driver 0000:00:1c.0: restoring config space at offset 0x1 (was 0x100000, writing 0x100407) pcieport-driver 0000:00:1c.1: restoring config space at offset 0xf (was 0x200, writing 0x4020a) pcieport-driver 0000:00:1c.1: restoring config space at offset 0x9 (was 0x10001, writing 0x1fff1) 
pcieport-driver 0000:00:1c.1: restoring config space at offset 0x8 (was 0x0, writing 0xe000e000) pcieport-driver 0000:00:1c.1: restoring config space at offset 0x7 (was 0x0, writing 0xf0) pcieport-driver 0000:00:1c.1: restoring config space at offset 0x3 (was 0x810000, writing 0x810010) pcieport-driver 0000:00:1c.1: restoring config space at offset 0x1 (was 0x100000, writing 0x100407) # These have moved down to late resume. -uhci_hcd 0000:00:1d.0: restoring config space at offset 0xf (was 0x100, writing 0x10a) -uhci_hcd 0000:00:1d.0: restoring config space at offset 0x8 (was 0x1, writing 0x20c1) -uhci_hcd 0000:00:1d.0: restoring config space at offset 0x1 (was 0x2800000, writing 0x2800001) -uhci_hcd 0000:00:1d.1: restoring config space at offset 0xf (was 0x200, writing 0x20b) -uhci_hcd 0000:00:1d.1: restoring config space at offset 0x8 (was 0x1, writing 0x20e1) -uhci_hcd 0000:00:1d.1: restoring config space at offset 0x1 (was 0x2800000, writing 0x2800001) -uhci_hcd 0000:00:1d.2: restoring config space at offset 0xf (was 0x300, writing 0x30b) -uhci_hcd 0000:00:1d.2: restoring config space at offset 0x8 (was 0x1, writing 0x2101) -uhci_hcd 0000:00:1d.2: restoring config space at offset 0x1 (was 0x2800000, writing 0x2800001) ehci_hcd 0000:00:1d.7: restoring config space at offset 0xf (was 0x100, writing 0x10a) ehci_hcd 0000:00:1d.7: restoring config space at offset 0x4 (was 0x0, writing 0xe0648000) ehci_hcd 0000:00:1d.7: restoring config space at offset 0x1 (was 0x2900000, writing 0x2900002) # These have disappeared. -pci 0000:00:1e.0: restoring config space at offset 0x9 (was 0x10001, writing 0x83f18001) -pci 0000:00:1e.0: restoring config space at offset 0x8 (was 0x0, writing 0xe030e010) -pci 0000:00:1e.0: restoring config space at offset 0x7 (was 0x228000f0, writing 0x22803030) -pci 0000:00:1e.0: restoring config space at offset 0x1 (was 0x100007, writing 0x100107) # First two moved to late resume. # The third already happened during late resume (duplicated). 
-ata_piix 0000:00:1f.1: restoring config space at offset 0xf (was 0x100, writing 0x10a) -ata_piix 0000:00:1f.1: restoring config space at offset 0x8 (was 0xc01, writing 0x2121) -ata_piix 0000:00:1f.1: restoring config space at offset 0x1 (was 0x2800005, writing 0x2880005) iwlagn 0000:10:00.0: restoring config space at offset 0xf (was 0x100, writing 0x10a) iwlagn 0000:10:00.0: restoring config space at offset 0x4 (was 0x4, writing 0xe0000004) iwlagn 0000:10:00.0: restoring config space at offset 0x3 (was 0x0, writing 0x10) iwlagn 0000:10:00.0: restoring config space at offset 0x1 (was 0x100000, writing 0x100006) yenta_cardbus 0000:02:06.0: restoring config space at offset 0xf (was 0x3000100, writing 0x580010b) yenta_cardbus 0000:02:06.0: restoring config space at offset 0xe (was 0x0, writing 0x34fc) yenta_cardbus 0000:02:06.0: restoring config space at offset 0xd (was 0x0, writing 0x3400) yenta_cardbus 0000:02:06.0: restoring config space at offset 0xc (was 0x0, writing 0x30fc) yenta_cardbus 0000:02:06.0: restoring config space at offset 0xb (was 0x0, writing 0x3000) yenta_cardbus 0000:02:06.0: restoring config space at offset 0xa (was 0x0, writing 0x87fff000) yenta_cardbus 0000:02:06.0: restoring config space at offset 0x9 (was 0x0, writing 0x84000000) yenta_cardbus 0000:02:06.0: restoring config space at offset 0x8 (was 0x0, writing 0x83fff000) yenta_cardbus 0000:02:06.0: restoring config space at offset 0x7 (was 0x0, writing 0x80000000) yenta_cardbus 0000:02:06.0: restoring config space at offset 0x6 (was 0x0, writing 0xb0060302) yenta_cardbus 0000:02:06.0: restoring config space at offset 0x4 (was 0x0, writing 0xe0100000) yenta_cardbus 0000:02:06.0: restoring config space at offset 0x3 (was 0x820000, writing 0x82a800) yenta_cardbus 0000:02:06.0: restoring config space at offset 0x1 (was 0x2100000, writing 0x2100007) ohci1394 0000:02:06.1: restoring config space at offset 0xf (was 0x4020200, writing 0x4020205) ohci1394 0000:02:06.1: restoring config space at 
offset 0x4 (was 0x0, writing 0xe0101000) ohci1394 0000:02:06.1: restoring config space at offset 0x3 (was 0x800000, writing 0x804010) ohci1394 0000:02:06.1: restoring config space at offset 0x1 (was 0x2100000, writing 0x2100006) sdhci-pci 0000:02:06.2: restoring config space at offset 0xf (was 0x300, writing 0x30a) sdhci-pci 0000:02:06.2: restoring config space at offset 0x4 (was 0x0, writing 0xe0102000) sdhci-pci 0000:02:06.2: restoring config space at offset 0x3 (was 0x800000, writing 0x804010) sdhci-pci 0000:02:06.2: restoring config space at offset 0x1 (was 0x2100000, writing 0x2100006) # Some changes; a lot just got dropped. -ricoh-mmc 0000:02:06.3: restoring config space at offset 0xf (was 0x300, writing 0xffffffff) +ricoh-mmc 0000:02:06.3: restoring config space at offset 0xf (was 0x300, writing 0x30a) -ricoh-mmc 0000:02:06.3: restoring config space at offset 0xe (was 0x0, writing 0xffffffff) -ricoh-mmc 0000:02:06.3: restoring config space at offset 0xd (was 0x80, writing 0xffffffff) -ricoh-mmc 0000:02:06.3: restoring config space at offset 0xc (was 0x0, writing 0xffffffff) -ricoh-mmc 0000:02:06.3: restoring config space at offset 0xb (was 0x30c9103c, writing 0xffffffff) -ricoh-mmc 0000:02:06.3: restoring config space at offset 0xa (was 0x0, writing 0xffffffff) -ricoh-mmc 0000:02:06.3: restoring config space at offset 0x9 (was 0x0, writing 0xffffffff) -ricoh-mmc 0000:02:06.3: restoring config space at offset 0x8 (was 0x0, writing 0xffffffff) -ricoh-mmc 0000:02:06.3: restoring config space at offset 0x7 (was 0x0, writing 0xffffffff) -ricoh-mmc 0000:02:06.3: restoring config space at offset 0x6 (was 0x0, writing 0xffffffff) -ricoh-mmc 0000:02:06.3: restoring config space at offset 0x5 (was 0x0, writing 0xffffffff) -ricoh-mmc 0000:02:06.3: restoring config space at offset 0x4 (was 0x0, writing 0xffffffff) +ricoh-mmc 0000:02:06.3: restoring config space at offset 0x4 (was 0x0, writing 0xe0103000) -ricoh-mmc 0000:02:06.3: restoring config space at offset 0x3 (was 
0x800000, writing 0xffffffff) +ricoh-mmc 0000:02:06.3: restoring config space at offset 0x3 (was 0x800000, writing 0x804010) -ricoh-mmc 0000:02:06.3: restoring config space at offset 0x2 (was 0x8800011, writing 0xffffffff) -ricoh-mmc 0000:02:06.3: restoring config space at offset 0x1 (was 0x2100000, writing 0xffffffff) +ricoh-mmc 0000:02:06.3: restoring config space at offset 0x1 (was 0x2100000, writing 0x2100006) -ricoh-mmc 0000:02:06.3: restoring config space at offset 0x0 (was 0x8431180, writing 0xffffffff) ricoh-mmc: Resuming. ricoh-mmc: Controller is now disabled. -Enabling non-boot CPUs ... -SMP alternatives: switching to SMP code -Booting processor 1 APIC 0x1 ip 0x6000 -Initializing CPU#1 -Calibrating delay using timer specific routine.. 2660.07 BogoMIPS (lpj=5320158) -CPU: L1 I cache: 32K, L1 D cache: 32K -CPU: L2 cache: 2048K -[ds] using Core 2/Atom configuration -CPU: Physical Processor ID: 0 -CPU: Processor Core ID: 1 -CPU1: Thermal monitoring enabled (TM2) -x86 PAT enabled: cpu 1, old 0x7040600070406, new 0x7010600070106 -CPU1: Intel(R) Core(TM)2 Duo CPU U7700 @ 1.33GHz stepping 0d -CPU0 attaching NULL sched-domain. 
-Switched to high resolution mode on CPU 1 -CPU0 attaching sched-domain: - domain 0: span 0-1 level MC - groups: 0 1 -CPU1 attaching sched-domain: - domain 0: span 0-1 level MC - groups: 1 0 -CPU1 is up -ACPI: Waking up from system sleep state S3 ACPI: EC: non-query interrupt received, switching to interrupt mode pci 0000:00:02.0: restoring config space at offset 0x1 (was 0x900403, writing 0x900003) pci 0000:00:02.0: PME# disabled pci 0000:00:02.1: PME# disabled pci 0000:00:03.0: PME# disabled pci 0000:00:03.2: PME# disabled e1000e 0000:00:19.0: PCI INT A -> GSI 22 (level, low) -> IRQ 22 e1000e 0000:00:19.0: setting latency timer to 64 e1000e 0000:00:19.0: wake-up capability disabled by ACPI e1000e 0000:00:19.0: PME# disabled e1000e 0000:00:19.0: wake-up capability disabled by ACPI e1000e 0000:00:19.0: PME# disabled e1000e 0000:00:19.0: irq 26 for MSI/MSI-X +uhci_hcd 0000:00:1a.0: restoring config space at offset 0xf (was 0x100, writing 0x10a) +uhci_hcd 0000:00:1a.0: restoring config space at offset 0x8 (was 0x1, writing 0x2081) +uhci_hcd 0000:00:1a.0: restoring config space at offset 0x1 (was 0x2800000, writing 0x2800001) uhci_hcd 0000:00:1a.0: PCI INT A -> GSI 16 (level, low) -> IRQ 16 uhci_hcd 0000:00:1a.0: setting latency timer to 64 usb usb1: root hub lost power or was reset +uhci_hcd 0000:00:1a.1: restoring config space at offset 0xf (was 0x200, writing 0x20a) +uhci_hcd 0000:00:1a.1: restoring config space at offset 0x8 (was 0x1, writing 0x20a1) +uhci_hcd 0000:00:1a.1: restoring config space at offset 0x1 (was 0x2800000, writing 0x2800001) uhci_hcd 0000:00:1a.1: PCI INT B -> GSI 17 (level, low) -> IRQ 17 uhci_hcd 0000:00:1a.1: setting latency timer to 64 usb usb3: root hub lost power or was reset ehci_hcd 0000:00:1a.7: PME# disabled ehci_hcd 0000:00:1a.7: PCI INT C -> GSI 18 (level, low) -> IRQ 18 ehci_hcd 0000:00:1a.7: setting latency timer to 64 ehci_hcd 0000:00:1a.7: PME# disabled # Called twice now? 
HDA Intel 0000:00:1b.0: power state changed by ACPI to D0 +HDA Intel 0000:00:1b.0: power state changed by ACPI to D0 HDA Intel 0000:00:1b.0: PCI INT A -> GSI 17 (level, low) -> IRQ 17 HDA Intel 0000:00:1b.0: setting latency timer to 64 pcieport-driver 0000:00:1c.0: setting latency timer to 64 pcieport-driver 0000:00:1c.1: setting latency timer to 64 +uhci_hcd 0000:00:1d.0: restoring config space at offset 0xf (was 0x100, writing 0x10a) +uhci_hcd 0000:00:1d.0: restoring config space at offset 0x8 (was 0x1, writing 0x20c1) +uhci_hcd 0000:00:1d.0: restoring config space at offset 0x1 (was 0x2800000, writing 0x2800001) uhci_hcd 0000:00:1d.0: PCI INT A -> GSI 20 (level, low) -> IRQ 20 uhci_hcd 0000:00:1d.0: setting latency timer to 64 usb usb5: root hub lost power or was reset +uhci_hcd 0000:00:1d.1: restoring config space at offset 0xf (was 0x200, writing 0x20b) +uhci_hcd 0000:00:1d.1: restoring config space at offset 0x8 (was 0x1, writing 0x20e1) +uhci_hcd 0000:00:1d.1: restoring config space at offset 0x1 (was 0x2800000, writing 0x2800001) uhci_hcd 0000:00:1d.1: PCI INT B -> GSI 22 (level, low) -> IRQ 22 uhci_hcd 0000:00:1d.1: setting latency timer to 64 usb usb6: root hub lost power or was reset +uhci_hcd 0000:00:1d.2: restoring config space at offset 0xf (was 0x300, writing 0x30b) +uhci_hcd 0000:00:1d.2: restoring config space at offset 0x8 (was 0x1, writing 0x2101) +uhci_hcd 0000:00:1d.2: restoring config space at offset 0x1 (was 0x2800000, writing 0x2800001) uhci_hcd 0000:00:1d.2: PCI INT C -> GSI 18 (level, low) -> IRQ 18 uhci_hcd 0000:00:1d.2: setting latency timer to 64 usb usb7: root hub lost power or was reset ehci_hcd 0000:00:1d.7: PME# disabled ehci_hcd 0000:00:1d.7: PCI INT A -> GSI 20 (level, low) -> IRQ 20 ehci_hcd 0000:00:1d.7: setting latency timer to 64 ehci_hcd 0000:00:1d.7: PME# disabled pci 0000:00:1e.0: setting latency timer to 64 +ata_piix 0000:00:1f.1: restoring config space at offset 0xf (was 0x100, writing 0x10a) +ata_piix 0000:00:1f.1: 
restoring config space at offset 0x8 (was 0xc01, writing 0x2121) ata_piix 0000:00:1f.1: restoring config space at offset 0x1 (was 0x2800005, writing 0x2880005) ata_piix 0000:00:1f.1: PCI INT A -> GSI 16 (level, low) -> IRQ 16 ata_piix 0000:00:1f.1: setting latency timer to 64 ata2: port disabled. ignoring. ACPI Exception (exoparg2-0445): AE_AML_PACKAGE_LIMIT, Index (000000005) is beyond end of object [20081204] ACPI Error (psparse-0537): Method parse/execution failed [\_SB_.C2C3] (Node ffff88007e01dea0), AE_AML_PACKAGE_LIMIT ACPI Error (psparse-0537): Method parse/execution failed [\_SB_.C003.C0F6.C3F3._STM] (Node ffff88007e043de0), AE_AML_PACKAGE_LIMIT ata1: ACPI set timing mode failed (status=0x300b) # Remaining differences are bogus: result of using wireless instead of wired networking. +iwlagn 0000:10:00.0: PCI INT A -> GSI 17 (level, low) -> IRQ 17 +iwlagn 0000:10:00.0: irq 27 for MSI/MSI-X ohci1394: fw-host0: OHCI-1394 1.1 (PCI): IRQ=[19] MMIO=[e0101000-e01017ff] Max Packet=[2048] IR/IT contexts=[4/4] sdhci-pci 0000:02:06.2: PCI INT C -> GSI 20 (level, low) -> IRQ 20 +Registered led device: iwl-phy0:radio +Registered led device: iwl-phy0:assoc +Registered led device: iwl-phy0:RX +Registered led device: iwl-phy0:TX sd 0:0:0:0: [sda] Starting disk ata1.01: ACPI cmd ef/03:0c:00:00:00:b0 filtered out ata1.01: ACPI cmd ef/03:40:00:00:00:b0 filtered out ata1.00: ACPI cmd ef/03:01:00:00:00:a0 filtered out ata1.00: ACPI cmd ef/03:45:00:00:00:a0 filtered out ata1.00: ACPI cmd f5/00:00:00:00:00:a0 filtered out ata1.00: ACPI cmd b1/c1:00:00:00:00:a0 filtered out ata1.00: ACPI cmd c6/00:10:00:00:00:a0 succeeded -e1000e: eth0 NIC Link is Up 100 Mbps Full Duplex, Flow Control: RX/TX -0000:00:19.0: eth0: 10/100 speed: disabling TSO ata1.00: configured for UDMA/100 ata1.01: configured for MWDMA2 ata1.00: configured for UDMA/100 ata1.01: configured for MWDMA2 ata1: EH complete sd 0:0:0:0: [sda] 234441648 512-byte hardware sectors: (120 GB/111 GiB) sd 0:0:0:0: [sda] Write 
Protect is off sd 0:0:0:0: [sda] Mode Sense: 00 3a 00 00 sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA sd 0:0:0:0: [sda] 234441648 512-byte hardware sectors: (120 GB/111 GiB) sd 0:0:0:0: [sda] Write Protect is off sd 0:0:0:0: [sda] Mode Sense: 00 3a 00 00 sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA usb 1-1: reset full speed USB device using uhci_hcd and address 2 usb 5-2: reset full speed USB device using uhci_hcd and address 2 pci 0000:00:02.0: restoring config space at offset 0x1 (was 0x900403, writing 0x900003) pci 0000:00:02.0: setting latency timer to 64 Restarting tasks ... done. ^ permalink raw reply [flat|nested] 373+ messages in thread
* [PATCH 0/10] PM: Rework suspend-resume ordering to avoid problems with shared interrupts (updated) 2009-02-22 17:37 ` Rafael J. Wysocki ` (11 preceding siblings ...) (?) @ 2009-03-11 9:30 ` Rafael J. Wysocki -1 siblings, 0 replies; 373+ messages in thread
From: Rafael J. Wysocki @ 2009-03-11 9:30 UTC (permalink / raw)
To: pm list
Cc: Arve, Jeremy Fitzhardinge, Frans Pop, LKML, Jesse Barnes, Eric W. Biederman, Ingo Molnar, Linus Torvalds, Thomas Gleixner

Hi,

The last iteration of this series of patches didn't draw comments except for the discussion about the "wake-up" interrupts, so here's an update that I'd like to consider as more-or-less final. I would also like to use a separate 'suspend' git tree for merging these patches, if you don't mind.

The following patches modify the way in which we handle disabling interrupts during suspend and enabling them during resume. They also change the ordering of the core suspend and hibernation code to take advantage of the new approach to the interrupts and modify the PCI PM core to avoid a few problems.

Namely, interrupts are currently disabled on the boot CPU as soon as the nonboot CPUs have been disabled, which doesn't allow device drivers' "late" suspend and "early" resume callbacks to sleep. Among other things this means they cannot execute ACPI AML routines, which leads to problems with suspend-resume of PCI devices, as recently discussed.

1/10 modifies the [suspend|hibernation] and resume code, as well as the other code using the device PM framework, so that device drivers will not receive interrupts during the "late" suspend phase, although interrupts will only be disabled on the CPU right before calling sysdev_suspend() (and analogously during resume).

2/10 - 4/10 modify the suspend, hibernation and kexec jump code, respectively, so that the "late" phase of suspending devices will happen before executing the platform "prepare" callback and disabling nonboot CPUs (and analogously during resume).

5/10 is a patch that's already in the PCI linux-next tree and I included it in the series, because the next patches depend on it.

6/10 makes the PCI PM core use pci_set_power_state() to put devices into D0 during early resume, which allows the platform-specific operations to be carried out at that time, if necessary.

7/10 uses the opportunity to move pci_restore_standard_config() to pci-driver.c, where it belongs IMO.

8/10 makes the PCI PM core code put devices into low power states during the "late" phase of suspend, which allows us to avoid a long-standing race related to shared interrupts and, at the same time, to appropriately handle devices that require some platform-specific operations to be put into low power states. [The second rev of the patch retains the current behavior during the "power-off" phase of hibernation, which is that the devices without drivers or without PM support in the drivers are not power managed by the core.]

9/10 fixes pci_set_power_state() so that it doesn't return an error code when attempting to put a PCI device without PM support (either native or through the platform) into D0 (such devices are always in D0).

10/10 makes the PCI PM core save and restore the configuration spaces of devices that have no drivers or no PM support in the drivers during suspend and resume, respectively.

Thanks,
Rafael

^ permalink raw reply [flat|nested] 373+ messages in thread
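The ordering that 1/10 describes can be sketched in plain userspace C. This is only an illustrative model, not kernel code: every `model_*` name and the trace buffer are hypothetical, the step names are labels chosen for this sketch, and details such as `suspend_test()` and the PM lock are left out. It assumes the flow is roughly "device interrupts off, late suspend callbacks, CPU interrupts off, wake-up check, sysdevs", with the reverse on the way back.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical trace buffer: records the order in which modeled steps run. */
struct model_trace {
	char buf[256];
};

static void model_step(struct model_trace *t, const char *name)
{
	size_t used = strlen(t->buf);

	/* Append the step name, separated by a space from the previous one. */
	snprintf(t->buf + used, sizeof(t->buf) - used, "%s%s",
		 used ? " " : "", name);
}

/* Models device_power_up(): "early" resume callbacks, then device IRQs on. */
static void model_device_power_up(struct model_trace *t)
{
	model_step(t, "early_resume");
	model_step(t, "resume_device_irqs");
}

/* Models device_power_down(): device IRQs off, then "late" suspend callbacks. */
static int model_device_power_down(struct model_trace *t, int late_error)
{
	model_step(t, "suspend_device_irqs");
	model_step(t, "late_suspend");
	if (late_error)
		model_device_power_up(t);	/* roll back on failure */
	return late_error;
}

/* Models sysdev_suspend(): gives up if a wake-up interrupt is pending. */
static int model_sysdev_suspend(struct model_trace *t, int wakeup_pending)
{
	model_step(t, "check_wakeup_irqs");
	if (wakeup_pending)
		return -EBUSY;
	model_step(t, "sysdev_suspend");
	return 0;
}

/* Models the reworked suspend_enter(); returns 0 or a negative errno. */
int model_suspend_enter(struct model_trace *t, int late_error,
			int wakeup_pending)
{
	int error;

	t->buf[0] = '\0';

	error = model_device_power_down(t, late_error);
	if (error)
		return error;		/* CPU interrupts were never disabled */

	model_step(t, "cpu_irqs_off");
	error = model_sysdev_suspend(t, wakeup_pending);
	if (!error) {
		model_step(t, "enter_sleep");
		model_step(t, "sysdev_resume");
	}
	model_step(t, "cpu_irqs_on");

	model_device_power_up(t);
	return error;
}
```

Calling `model_suspend_enter(&t, 0, 1)` models a pending wake-up interrupt: the trace skips the sleep step, but the CPU interrupts are still re-enabled and the "early" resume step still runs, which is the property the reordering is meant to preserve.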
* [PATCH 1/10] PM: Rework handling of interrupts during suspend-resume (rev. 5) 2009-03-11 9:30 ` Rafael J. Wysocki @ 2009-03-11 9:36 ` Rafael J. Wysocki 2009-03-11 9:36 ` Rafael J. Wysocki ` (17 subsequent siblings) 18 siblings, 0 replies; 373+ messages in thread
From: Rafael J. Wysocki @ 2009-03-11 9:36 UTC (permalink / raw)
To: pm list
Cc: Arve, Jeremy Fitzhardinge, Frans Pop, LKML, Jesse Barnes, Eric W. Biederman, Ingo Molnar, Linus Torvalds, Thomas Gleixner

From: Rafael J. Wysocki <rjw@sisk.pl>

Introduce two helper functions allowing us to prevent device drivers from getting any interrupts (without disabling interrupts on the CPU) during suspend (or hibernation) and to make them start to receive interrupts again during the subsequent resume, respectively. These functions make it possible to keep timer interrupts enabled while the "late" suspend and "early" resume callbacks provided by device drivers are being executed.

Use these functions to rework the handling of interrupts during suspend (hibernation) and resume. Namely, interrupts will only be disabled on the CPU right before suspending sysdevs, while device drivers will be prevented from receiving interrupts, with the help of the new helper function, before their "late" suspend callbacks run (and analogously during resume).

In addition, since the device interrupts are now disabled before the CPU has turned all interrupts off and the CPU will ACK the interrupts, setting the IRQ_PENDING bit for them, check in sysdev_suspend() if any wake-up interrupts are pending and abort suspend if that's the case.

Signed-off-by: Rafael J.
Wysocki <rjw@sisk.pl> --- arch/x86/kernel/apm_32.c | 15 +++++-- drivers/base/power/main.c | 20 +++++----- drivers/base/sys.c | 8 ++++ drivers/xen/manage.c | 16 ++++---- include/linux/interrupt.h | 5 ++ include/linux/irq.h | 1 kernel/irq/Makefile | 1 kernel/irq/internals.h | 1 kernel/irq/manage.c | 2 - kernel/irq/pm.c | 89 ++++++++++++++++++++++++++++++++++++++++++++++ kernel/kexec.c | 8 ++-- kernel/power/disk.c | 39 ++++++++++++++------ kernel/power/main.c | 17 +++++--- 13 files changed, 181 insertions(+), 41 deletions(-) Index: linux-2.6/include/linux/interrupt.h =================================================================== --- linux-2.6.orig/include/linux/interrupt.h +++ linux-2.6/include/linux/interrupt.h @@ -106,6 +106,11 @@ extern void disable_irq_nosync(unsigned extern void disable_irq(unsigned int irq); extern void enable_irq(unsigned int irq); +/* The following three functions are for the core kernel use only. */ +extern void suspend_device_irqs(void); +extern void resume_device_irqs(void); +extern int check_wakeup_irqs(void); + #if defined(CONFIG_SMP) && defined(CONFIG_GENERIC_HARDIRQS) extern cpumask_var_t irq_default_affinity; Index: linux-2.6/kernel/power/main.c =================================================================== --- linux-2.6.orig/kernel/power/main.c +++ linux-2.6/kernel/power/main.c @@ -287,17 +287,19 @@ void __attribute__ ((weak)) arch_suspend */ static int suspend_enter(suspend_state_t state) { - int error = 0; + int error; device_pm_lock(); - arch_suspend_disable_irqs(); - BUG_ON(!irqs_disabled()); - if ((error = device_power_down(PMSG_SUSPEND))) { + error = device_power_down(PMSG_SUSPEND); + if (error) { printk(KERN_ERR "PM: Some devices failed to power down\n"); goto Done; } + arch_suspend_disable_irqs(); + BUG_ON(!irqs_disabled()); + error = sysdev_suspend(PMSG_SUSPEND); if (!error) { if (!suspend_test(TEST_CORE)) @@ -305,11 +307,14 @@ static int suspend_enter(suspend_state_t sysdev_resume(); } - 
device_power_up(PMSG_RESUME); - Done: arch_suspend_enable_irqs(); BUG_ON(irqs_disabled()); + + device_power_up(PMSG_RESUME); + + Done: device_pm_unlock(); + return error; } Index: linux-2.6/kernel/power/disk.c =================================================================== --- linux-2.6.orig/kernel/power/disk.c +++ linux-2.6/kernel/power/disk.c @@ -214,7 +214,7 @@ static int create_image(int platform_mod return error; device_pm_lock(); - local_irq_disable(); + /* At this point, device_suspend() has been called, but *not* * device_power_down(). We *must* call device_power_down() now. * Otherwise, drivers for some devices (e.g. interrupt controllers) @@ -225,8 +225,11 @@ static int create_image(int platform_mod if (error) { printk(KERN_ERR "PM: Some devices failed to power down, " "aborting hibernation\n"); - goto Enable_irqs; + goto Unlock; } + + local_irq_disable(); + sysdev_suspend(PMSG_FREEZE); if (error) { printk(KERN_ERR "PM: Some devices failed to power down, " @@ -252,12 +255,16 @@ static int create_image(int platform_mod /* NOTE: device_power_up() is just a resume() for devices * that suspended with irqs off ... no overall powerup. */ + Power_up_devices: + local_irq_enable(); + device_power_up(in_suspend ? (error ? 
PMSG_RECOVER : PMSG_THAW) : PMSG_RESTORE); - Enable_irqs: - local_irq_enable(); + + Unlock: device_pm_unlock(); + return error; } @@ -336,13 +343,16 @@ static int resume_target_kernel(void) int error; device_pm_lock(); - local_irq_disable(); + error = device_power_down(PMSG_QUIESCE); if (error) { printk(KERN_ERR "PM: Some devices failed to power down, " "aborting resume\n"); - goto Enable_irqs; + goto Unlock; } + + local_irq_disable(); + sysdev_suspend(PMSG_QUIESCE); /* We'll ignore saved state, but this gets preempt count (etc) right */ save_processor_state(); @@ -366,11 +376,16 @@ static int resume_target_kernel(void) swsusp_free(); restore_processor_state(); touch_softlockup_watchdog(); + sysdev_resume(); - device_power_up(PMSG_RECOVER); - Enable_irqs: + local_irq_enable(); + + device_power_up(PMSG_RECOVER); + + Unlock: device_pm_unlock(); + return error; } @@ -447,15 +462,16 @@ int hibernation_platform_enter(void) goto Finish; device_pm_lock(); - local_irq_disable(); + error = device_power_down(PMSG_HIBERNATE); if (!error) { + local_irq_disable(); sysdev_suspend(PMSG_HIBERNATE); hibernation_ops->enter(); /* We should never get here */ while (1); } - local_irq_enable(); + device_pm_unlock(); /* @@ -464,12 +480,15 @@ int hibernation_platform_enter(void) */ Finish: hibernation_ops->finish(); + Resume_devices: entering_platform_hibernation = false; device_resume(PMSG_RESTORE); resume_console(); + Close: hibernation_ops->end(); + return error; } Index: linux-2.6/arch/x86/kernel/apm_32.c =================================================================== --- linux-2.6.orig/arch/x86/kernel/apm_32.c +++ linux-2.6/arch/x86/kernel/apm_32.c @@ -1190,8 +1190,10 @@ static int suspend(int vetoable) struct apm_user *as; device_suspend(PMSG_SUSPEND); - local_irq_disable(); + device_power_down(PMSG_SUSPEND); + + local_irq_disable(); sysdev_suspend(PMSG_SUSPEND); local_irq_enable(); @@ -1209,9 +1211,12 @@ static int suspend(int vetoable) if (err != APM_SUCCESS) 
apm_error("suspend", err); err = (err == APM_SUCCESS) ? 0 : -EIO; + sysdev_resume(); - device_power_up(PMSG_RESUME); local_irq_enable(); + + device_power_up(PMSG_RESUME); + device_resume(PMSG_RESUME); queue_event(APM_NORMAL_RESUME, NULL); spin_lock(&user_list_lock); @@ -1228,8 +1233,9 @@ static void standby(void) { int err; - local_irq_disable(); device_power_down(PMSG_SUSPEND); + + local_irq_disable(); sysdev_suspend(PMSG_SUSPEND); local_irq_enable(); @@ -1239,8 +1245,9 @@ static void standby(void) local_irq_disable(); sysdev_resume(); - device_power_up(PMSG_RESUME); local_irq_enable(); + + device_power_up(PMSG_RESUME); } static apm_event_t get_event(void) Index: linux-2.6/drivers/xen/manage.c =================================================================== --- linux-2.6.orig/drivers/xen/manage.c +++ linux-2.6/drivers/xen/manage.c @@ -39,12 +39,6 @@ static int xen_suspend(void *data) BUG_ON(!irqs_disabled()); - err = device_power_down(PMSG_SUSPEND); - if (err) { - printk(KERN_ERR "xen_suspend: device_power_down failed: %d\n", - err); - return err; - } err = sysdev_suspend(PMSG_SUSPEND); if (err) { printk(KERN_ERR "xen_suspend: sysdev_suspend failed: %d\n", @@ -69,7 +63,6 @@ static int xen_suspend(void *data) xen_mm_unpin_all(); sysdev_resume(); - device_power_up(PMSG_RESUME); if (!*cancelled) { xen_irq_resume(); @@ -108,6 +101,12 @@ static void do_suspend(void) /* XXX use normal device tree? 
*/ xenbus_suspend(); + err = device_power_down(PMSG_SUSPEND); + if (err) { + printk(KERN_ERR "device_power_down failed: %d\n", err); + goto resume_devices; + } + err = stop_machine(xen_suspend, &cancelled, &cpumask_of_cpu(0)); if (err) { printk(KERN_ERR "failed to start xen_suspend: %d\n", err); @@ -120,6 +119,9 @@ static void do_suspend(void) } else xenbus_suspend_cancel(); + device_power_up(PMSG_RESUME); + +resume_devices: device_resume(PMSG_RESUME); /* Make sure timer events get retriggered on all CPUs */ Index: linux-2.6/kernel/kexec.c =================================================================== --- linux-2.6.orig/kernel/kexec.c +++ linux-2.6/kernel/kexec.c @@ -1454,7 +1454,6 @@ int kernel_kexec(void) if (error) goto Resume_devices; device_pm_lock(); - local_irq_disable(); /* At this point, device_suspend() has been called, * but *not* device_power_down(). We *must* * device_power_down() now. Otherwise, drivers for @@ -1464,8 +1463,9 @@ int kernel_kexec(void) */ error = device_power_down(PMSG_FREEZE); if (error) - goto Enable_irqs; + goto Unlock_pm; + local_irq_disable(); /* Suspend system devices */ error = sysdev_suspend(PMSG_FREEZE); if (error) @@ -1484,9 +1484,9 @@ int kernel_kexec(void) if (kexec_image->preserve_context) { sysdev_resume(); Power_up_devices: - device_power_up(PMSG_RESTORE); - Enable_irqs: local_irq_enable(); + device_power_up(PMSG_RESTORE); + Unlock_pm: device_pm_unlock(); enable_nonboot_cpus(); Resume_devices: Index: linux-2.6/include/linux/irq.h =================================================================== --- linux-2.6.orig/include/linux/irq.h +++ linux-2.6/include/linux/irq.h @@ -65,6 +65,7 @@ typedef void (*irq_flow_handler_t)(unsig #define IRQ_SPURIOUS_DISABLED 0x00800000 /* IRQ was disabled by the spurious trap */ #define IRQ_MOVE_PCNTXT 0x01000000 /* IRQ migration from process context */ #define IRQ_AFFINITY_SET 0x02000000 /* IRQ affinity was set from userspace*/ +#define IRQ_SUSPENDED 0x04000000 /* IRQ has gone through 
suspend sequence */ #ifdef CONFIG_IRQ_PER_CPU # define CHECK_IRQ_PER_CPU(var) ((var) & IRQ_PER_CPU) Index: linux-2.6/kernel/irq/pm.c =================================================================== --- /dev/null +++ linux-2.6/kernel/irq/pm.c @@ -0,0 +1,89 @@ +/* + * linux/kernel/irq/pm.c + * + * Copyright (C) 2009 Rafael J. Wysocki <rjw@sisk.pl>, Novell Inc. + * + * This file contains power management functions related to interrupts. + */ + +#include <linux/irq.h> +#include <linux/module.h> +#include <linux/interrupt.h> + +#include "internals.h" + +/** + * suspend_device_irqs - disable all currently enabled interrupt lines + * + * During system-wide suspend or hibernation device interrupts need to be + * disabled at the chip level and this function is provided for this purpose. + * It disables all interrupt lines that are enabled at the moment and sets the + * IRQ_SUSPENDED flag for them. + */ +void suspend_device_irqs(void) +{ + struct irq_desc *desc; + int irq; + + for_each_irq_desc(irq, desc) { + unsigned long flags; + bool sync = false; + + spin_lock_irqsave(&desc->lock, flags); + + if (desc->action && !(desc->action->flags & IRQF_TIMER)) { + if (!desc->depth++) { + desc->status |= IRQ_DISABLED; + desc->chip->disable(irq); + sync = true; + } + desc->status |= IRQ_SUSPENDED; + } + + spin_unlock_irqrestore(&desc->lock, flags); + + if (sync) + synchronize_irq(irq); + } +} +EXPORT_SYMBOL_GPL(suspend_device_irqs); + +/** + * resume_device_irqs - enable interrupt lines disabled by suspend_device_irqs() + * + * Enable all interrupt lines previously disabled by suspend_device_irqs() that + * have the IRQ_SUSPENDED flag set. 
+ */ +void resume_device_irqs(void) +{ + struct irq_desc *desc; + int irq; + + for_each_irq_desc(irq, desc) { + unsigned long flags; + + if (!(desc->status & IRQ_SUSPENDED)) + continue; + + spin_lock_irqsave(&desc->lock, flags); + desc->status &= ~IRQ_SUSPENDED; + __enable_irq(desc, irq); + spin_unlock_irqrestore(&desc->lock, flags); + } +} +EXPORT_SYMBOL_GPL(resume_device_irqs); + +/** + * check_wakeup_irqs - check if any wake-up interrupts are pending + */ +int check_wakeup_irqs(void) +{ + struct irq_desc *desc; + int irq; + + for_each_irq_desc(irq, desc) + if ((desc->status & IRQ_WAKEUP) && (desc->status & IRQ_PENDING)) + return -EBUSY; + + return 0; +} Index: linux-2.6/kernel/irq/Makefile =================================================================== --- linux-2.6.orig/kernel/irq/Makefile +++ linux-2.6/kernel/irq/Makefile @@ -4,3 +4,4 @@ obj-$(CONFIG_GENERIC_IRQ_PROBE) += autop obj-$(CONFIG_PROC_FS) += proc.o obj-$(CONFIG_GENERIC_PENDING_IRQ) += migration.o obj-$(CONFIG_NUMA_MIGRATE_IRQ_DESC) += numa_migrate.o +obj-$(CONFIG_PM_SLEEP) += pm.o Index: linux-2.6/kernel/irq/manage.c =================================================================== --- linux-2.6.orig/kernel/irq/manage.c +++ linux-2.6/kernel/irq/manage.c @@ -215,7 +215,7 @@ void disable_irq(unsigned int irq) } EXPORT_SYMBOL(disable_irq); -static void __enable_irq(struct irq_desc *desc, unsigned int irq) +void __enable_irq(struct irq_desc *desc, unsigned int irq) { switch (desc->depth) { case 0: Index: linux-2.6/drivers/base/power/main.c =================================================================== --- linux-2.6.orig/drivers/base/power/main.c +++ linux-2.6/drivers/base/power/main.c @@ -23,6 +23,7 @@ #include <linux/pm.h> #include <linux/resume-trace.h> #include <linux/rwsem.h> +#include <linux/interrupt.h> #include "../base.h" #include "power.h" @@ -305,7 +306,8 @@ static int resume_device_noirq(struct de * Execute the appropriate "noirq resume" callback for all devices marked * as 
DPM_OFF_IRQ. * - * Must be called with interrupts disabled and only one CPU running. + * Must be called under dpm_list_mtx. Device drivers should not receive + * interrupts while it's being executed. */ static void dpm_power_up(pm_message_t state) { @@ -326,14 +328,13 @@ static void dpm_power_up(pm_message_t st * device_power_up - Turn on all devices that need special attention. * @state: PM transition of the system being carried out. * - * Power on system devices, then devices that required we shut them down - * with interrupts disabled. - * - * Must be called with interrupts disabled. + * Call the "early" resume handlers and enable device drivers to receive + * interrupts. */ void device_power_up(pm_message_t state) { dpm_power_up(state); + resume_device_irqs(); } EXPORT_SYMBOL_GPL(device_power_up); @@ -558,16 +559,17 @@ static int suspend_device_noirq(struct d * device_power_down - Shut down special devices. * @state: PM transition of the system being carried out. * - * Power down devices that require interrupts to be disabled. - * Then power down system devices. + * Prevent device drivers from receiving interrupts and call the "late" + * suspend handlers. * - * Must be called with interrupts disabled and only one CPU running. + * Must be called under dpm_list_mtx. 
*/ int device_power_down(pm_message_t state) { struct device *dev; int error = 0; + suspend_device_irqs(); list_for_each_entry_reverse(dev, &dpm_list, power.entry) { error = suspend_device_noirq(dev, state); if (error) { @@ -577,7 +579,7 @@ int device_power_down(pm_message_t state dev->power.status = DPM_OFF_IRQ; } if (error) - dpm_power_up(resume_event(state)); + device_power_up(resume_event(state)); return error; } EXPORT_SYMBOL_GPL(device_power_down); Index: linux-2.6/drivers/base/sys.c =================================================================== --- linux-2.6.orig/drivers/base/sys.c +++ linux-2.6/drivers/base/sys.c @@ -22,6 +22,7 @@ #include <linux/pm.h> #include <linux/device.h> #include <linux/mutex.h> +#include <linux/interrupt.h> #include "base.h" @@ -369,6 +370,13 @@ int sysdev_suspend(pm_message_t state) struct sysdev_driver *drv, *err_drv; int ret; + pr_debug("Checking wake-up interrupts\n"); + + /* Return error code if there are any wake-up interrupts pending */ + ret = check_wakeup_irqs(); + if (ret) + return ret; + pr_debug("Suspending System Devices\n"); list_for_each_entry_reverse(cls, &system_kset->list, kset.kobj.entry) { Index: linux-2.6/kernel/irq/internals.h =================================================================== --- linux-2.6.orig/kernel/irq/internals.h +++ linux-2.6/kernel/irq/internals.h @@ -12,6 +12,7 @@ extern void compat_irq_chip_set_default_ extern int __irq_set_trigger(struct irq_desc *desc, unsigned int irq, unsigned long flags); +extern void __enable_irq(struct irq_desc *desc, unsigned int irq); extern struct lock_class_key irq_desc_lock_class; extern void init_kstat_irqs(struct irq_desc *desc, int cpu, int nr); ^ permalink raw reply [flat|nested] 373+ messages in thread
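The bookkeeping done by suspend_device_irqs(), resume_device_irqs() and check_wakeup_irqs() can be tried out in userspace with a small model. This is a sketch under stated assumptions, not the kernel implementation: `struct model_irq_desc`, the `MODEL_IRQ_*` flag values and all `model_*` functions are hypothetical stand-ins, locking and synchronize_irq() are omitted, and the chip-level disable/enable is reduced to a status bit.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Flag values are made up for this model; they are not the kernel's. */
#define MODEL_IRQ_DISABLED	0x01u
#define MODEL_IRQ_SUSPENDED	0x02u
#define MODEL_IRQ_WAKEUP	0x04u
#define MODEL_IRQ_PENDING	0x08u

/* Hypothetical, heavily reduced stand-in for struct irq_desc. */
struct model_irq_desc {
	int has_action;		/* a handler is registered on this line */
	int is_timer;		/* stands in for IRQF_TIMER on the action */
	unsigned int depth;	/* nested disable count, 0 means enabled */
	unsigned int status;	/* MODEL_IRQ_* bits */
};

/* Disable every line that has a non-timer handler and mark it suspended. */
void model_suspend_device_irqs(struct model_irq_desc *descs, size_t n)
{
	size_t i;

	for (i = 0; i < n; i++) {
		struct model_irq_desc *d = &descs[i];

		if (!d->has_action || d->is_timer)
			continue;
		/* Only the first disable actually masks the line. */
		if (!d->depth++)
			d->status |= MODEL_IRQ_DISABLED;
		d->status |= MODEL_IRQ_SUSPENDED;
	}
}

/* Undo one level of disabling for every line marked as suspended. */
void model_resume_device_irqs(struct model_irq_desc *descs, size_t n)
{
	size_t i;

	for (i = 0; i < n; i++) {
		struct model_irq_desc *d = &descs[i];

		if (!(d->status & MODEL_IRQ_SUSPENDED))
			continue;
		d->status &= ~MODEL_IRQ_SUSPENDED;
		if (d->depth == 0)
			continue;	/* unbalanced enable, nothing to undo */
		/* Only the last enable unmasks the line again. */
		if (d->depth == 1)
			d->status &= ~MODEL_IRQ_DISABLED;
		d->depth--;
	}
}

/* Return -EBUSY if any wake-up line has an interrupt pending, else 0. */
int model_check_wakeup_irqs(const struct model_irq_desc *descs, size_t n)
{
	size_t i;

	for (i = 0; i < n; i++)
		if ((descs[i].status & MODEL_IRQ_WAKEUP) &&
		    (descs[i].status & MODEL_IRQ_PENDING))
			return -EBUSY;
	return 0;
}
```

The depth counter is what lets a line that a driver had already disabled before suspend stay disabled after resume: the suspend pass only adds one level, and the resume pass only removes that one level.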
* [PATCH 1/10] PM: Rework handling of interrupts during suspend-resume (rev. 5) 2009-03-11 9:30 ` Rafael J. Wysocki 2009-03-11 9:36 ` [PATCH 1/10] PM: Rework handling of interrupts during suspend-resume (rev. 5) Rafael J. Wysocki @ 2009-03-11 9:36 ` Rafael J. Wysocki 2009-03-11 10:33 ` Thomas Gleixner 2009-03-11 10:33 ` Thomas Gleixner 2009-03-11 9:37 ` [PATCH 2/10] PM: Change suspend code ordering Rafael J. Wysocki ` (16 subsequent siblings) 18 siblings, 2 replies; 373+ messages in thread
From: Rafael J. Wysocki @ 2009-03-11 9:36 UTC (permalink / raw)
To: pm list
Cc: LKML, Linus Torvalds, Ingo Molnar, Eric W. Biederman, Benjamin Herrenschmidt, Jeremy Fitzhardinge, Len Brown, Jesse Barnes, Thomas Gleixner, Frans Pop, Arve Hjønnevåg

From: Rafael J. Wysocki <rjw@sisk.pl>

Introduce two helper functions allowing us to prevent device drivers from getting any interrupts (without disabling interrupts on the CPU) during suspend (or hibernation) and to make them start to receive interrupts again during the subsequent resume, respectively. These functions make it possible to keep timer interrupts enabled while the "late" suspend and "early" resume callbacks provided by device drivers are being executed.

Use these functions to rework the handling of interrupts during suspend (hibernation) and resume. Namely, interrupts will only be disabled on the CPU right before suspending sysdevs, while device drivers will be prevented from receiving interrupts, with the help of the new helper function, before their "late" suspend callbacks run (and analogously during resume).

In addition, since the device interrupts are now disabled before the CPU has turned all interrupts off and the CPU will ACK the interrupts, setting the IRQ_PENDING bit for them, check in sysdev_suspend() if any wake-up interrupts are pending and abort suspend if that's the case.

Signed-off-by: Rafael J.
Wysocki <rjw@sisk.pl> --- arch/x86/kernel/apm_32.c | 15 +++++-- drivers/base/power/main.c | 20 +++++----- drivers/base/sys.c | 8 ++++ drivers/xen/manage.c | 16 ++++---- include/linux/interrupt.h | 5 ++ include/linux/irq.h | 1 kernel/irq/Makefile | 1 kernel/irq/internals.h | 1 kernel/irq/manage.c | 2 - kernel/irq/pm.c | 89 ++++++++++++++++++++++++++++++++++++++++++++++ kernel/kexec.c | 8 ++-- kernel/power/disk.c | 39 ++++++++++++++------ kernel/power/main.c | 17 +++++--- 13 files changed, 181 insertions(+), 41 deletions(-) Index: linux-2.6/include/linux/interrupt.h =================================================================== --- linux-2.6.orig/include/linux/interrupt.h +++ linux-2.6/include/linux/interrupt.h @@ -106,6 +106,11 @@ extern void disable_irq_nosync(unsigned extern void disable_irq(unsigned int irq); extern void enable_irq(unsigned int irq); +/* The following three functions are for the core kernel use only. */ +extern void suspend_device_irqs(void); +extern void resume_device_irqs(void); +extern int check_wakeup_irqs(void); + #if defined(CONFIG_SMP) && defined(CONFIG_GENERIC_HARDIRQS) extern cpumask_var_t irq_default_affinity; Index: linux-2.6/kernel/power/main.c =================================================================== --- linux-2.6.orig/kernel/power/main.c +++ linux-2.6/kernel/power/main.c @@ -287,17 +287,19 @@ void __attribute__ ((weak)) arch_suspend */ static int suspend_enter(suspend_state_t state) { - int error = 0; + int error; device_pm_lock(); - arch_suspend_disable_irqs(); - BUG_ON(!irqs_disabled()); - if ((error = device_power_down(PMSG_SUSPEND))) { + error = device_power_down(PMSG_SUSPEND); + if (error) { printk(KERN_ERR "PM: Some devices failed to power down\n"); goto Done; } + arch_suspend_disable_irqs(); + BUG_ON(!irqs_disabled()); + error = sysdev_suspend(PMSG_SUSPEND); if (!error) { if (!suspend_test(TEST_CORE)) @@ -305,11 +307,14 @@ static int suspend_enter(suspend_state_t sysdev_resume(); } - 
device_power_up(PMSG_RESUME); - Done: arch_suspend_enable_irqs(); BUG_ON(irqs_disabled()); + + device_power_up(PMSG_RESUME); + + Done: device_pm_unlock(); + return error; } Index: linux-2.6/kernel/power/disk.c =================================================================== --- linux-2.6.orig/kernel/power/disk.c +++ linux-2.6/kernel/power/disk.c @@ -214,7 +214,7 @@ static int create_image(int platform_mod return error; device_pm_lock(); - local_irq_disable(); + /* At this point, device_suspend() has been called, but *not* * device_power_down(). We *must* call device_power_down() now. * Otherwise, drivers for some devices (e.g. interrupt controllers) @@ -225,8 +225,11 @@ static int create_image(int platform_mod if (error) { printk(KERN_ERR "PM: Some devices failed to power down, " "aborting hibernation\n"); - goto Enable_irqs; + goto Unlock; } + + local_irq_disable(); + sysdev_suspend(PMSG_FREEZE); if (error) { printk(KERN_ERR "PM: Some devices failed to power down, " @@ -252,12 +255,16 @@ static int create_image(int platform_mod /* NOTE: device_power_up() is just a resume() for devices * that suspended with irqs off ... no overall powerup. */ + Power_up_devices: + local_irq_enable(); + device_power_up(in_suspend ? (error ? 
PMSG_RECOVER : PMSG_THAW) : PMSG_RESTORE); - Enable_irqs: - local_irq_enable(); + + Unlock: device_pm_unlock(); + return error; } @@ -336,13 +343,16 @@ static int resume_target_kernel(void) int error; device_pm_lock(); - local_irq_disable(); + error = device_power_down(PMSG_QUIESCE); if (error) { printk(KERN_ERR "PM: Some devices failed to power down, " "aborting resume\n"); - goto Enable_irqs; + goto Unlock; } + + local_irq_disable(); + sysdev_suspend(PMSG_QUIESCE); /* We'll ignore saved state, but this gets preempt count (etc) right */ save_processor_state(); @@ -366,11 +376,16 @@ static int resume_target_kernel(void) swsusp_free(); restore_processor_state(); touch_softlockup_watchdog(); + sysdev_resume(); - device_power_up(PMSG_RECOVER); - Enable_irqs: + local_irq_enable(); + + device_power_up(PMSG_RECOVER); + + Unlock: device_pm_unlock(); + return error; } @@ -447,15 +462,16 @@ int hibernation_platform_enter(void) goto Finish; device_pm_lock(); - local_irq_disable(); + error = device_power_down(PMSG_HIBERNATE); if (!error) { + local_irq_disable(); sysdev_suspend(PMSG_HIBERNATE); hibernation_ops->enter(); /* We should never get here */ while (1); } - local_irq_enable(); + device_pm_unlock(); /* @@ -464,12 +480,15 @@ int hibernation_platform_enter(void) */ Finish: hibernation_ops->finish(); + Resume_devices: entering_platform_hibernation = false; device_resume(PMSG_RESTORE); resume_console(); + Close: hibernation_ops->end(); + return error; } Index: linux-2.6/arch/x86/kernel/apm_32.c =================================================================== --- linux-2.6.orig/arch/x86/kernel/apm_32.c +++ linux-2.6/arch/x86/kernel/apm_32.c @@ -1190,8 +1190,10 @@ static int suspend(int vetoable) struct apm_user *as; device_suspend(PMSG_SUSPEND); - local_irq_disable(); + device_power_down(PMSG_SUSPEND); + + local_irq_disable(); sysdev_suspend(PMSG_SUSPEND); local_irq_enable(); @@ -1209,9 +1211,12 @@ static int suspend(int vetoable) if (err != APM_SUCCESS) 
apm_error("suspend", err); err = (err == APM_SUCCESS) ? 0 : -EIO; + sysdev_resume(); - device_power_up(PMSG_RESUME); local_irq_enable(); + + device_power_up(PMSG_RESUME); + device_resume(PMSG_RESUME); queue_event(APM_NORMAL_RESUME, NULL); spin_lock(&user_list_lock); @@ -1228,8 +1233,9 @@ static void standby(void) { int err; - local_irq_disable(); device_power_down(PMSG_SUSPEND); + + local_irq_disable(); sysdev_suspend(PMSG_SUSPEND); local_irq_enable(); @@ -1239,8 +1245,9 @@ static void standby(void) local_irq_disable(); sysdev_resume(); - device_power_up(PMSG_RESUME); local_irq_enable(); + + device_power_up(PMSG_RESUME); } static apm_event_t get_event(void) Index: linux-2.6/drivers/xen/manage.c =================================================================== --- linux-2.6.orig/drivers/xen/manage.c +++ linux-2.6/drivers/xen/manage.c @@ -39,12 +39,6 @@ static int xen_suspend(void *data) BUG_ON(!irqs_disabled()); - err = device_power_down(PMSG_SUSPEND); - if (err) { - printk(KERN_ERR "xen_suspend: device_power_down failed: %d\n", - err); - return err; - } err = sysdev_suspend(PMSG_SUSPEND); if (err) { printk(KERN_ERR "xen_suspend: sysdev_suspend failed: %d\n", @@ -69,7 +63,6 @@ static int xen_suspend(void *data) xen_mm_unpin_all(); sysdev_resume(); - device_power_up(PMSG_RESUME); if (!*cancelled) { xen_irq_resume(); @@ -108,6 +101,12 @@ static void do_suspend(void) /* XXX use normal device tree? 
*/ xenbus_suspend(); + err = device_power_down(PMSG_SUSPEND); + if (err) { + printk(KERN_ERR "device_power_down failed: %d\n", err); + goto resume_devices; + } + err = stop_machine(xen_suspend, &cancelled, &cpumask_of_cpu(0)); if (err) { printk(KERN_ERR "failed to start xen_suspend: %d\n", err); @@ -120,6 +119,9 @@ static void do_suspend(void) } else xenbus_suspend_cancel(); + device_power_up(PMSG_RESUME); + +resume_devices: device_resume(PMSG_RESUME); /* Make sure timer events get retriggered on all CPUs */ Index: linux-2.6/kernel/kexec.c =================================================================== --- linux-2.6.orig/kernel/kexec.c +++ linux-2.6/kernel/kexec.c @@ -1454,7 +1454,6 @@ int kernel_kexec(void) if (error) goto Resume_devices; device_pm_lock(); - local_irq_disable(); /* At this point, device_suspend() has been called, * but *not* device_power_down(). We *must* * device_power_down() now. Otherwise, drivers for @@ -1464,8 +1463,9 @@ int kernel_kexec(void) */ error = device_power_down(PMSG_FREEZE); if (error) - goto Enable_irqs; + goto Unlock_pm; + local_irq_disable(); /* Suspend system devices */ error = sysdev_suspend(PMSG_FREEZE); if (error) @@ -1484,9 +1484,9 @@ int kernel_kexec(void) if (kexec_image->preserve_context) { sysdev_resume(); Power_up_devices: - device_power_up(PMSG_RESTORE); - Enable_irqs: local_irq_enable(); + device_power_up(PMSG_RESTORE); + Unlock_pm: device_pm_unlock(); enable_nonboot_cpus(); Resume_devices: Index: linux-2.6/include/linux/irq.h =================================================================== --- linux-2.6.orig/include/linux/irq.h +++ linux-2.6/include/linux/irq.h @@ -65,6 +65,7 @@ typedef void (*irq_flow_handler_t)(unsig #define IRQ_SPURIOUS_DISABLED 0x00800000 /* IRQ was disabled by the spurious trap */ #define IRQ_MOVE_PCNTXT 0x01000000 /* IRQ migration from process context */ #define IRQ_AFFINITY_SET 0x02000000 /* IRQ affinity was set from userspace*/ +#define IRQ_SUSPENDED 0x04000000 /* IRQ has gone through 
suspend sequence */ #ifdef CONFIG_IRQ_PER_CPU # define CHECK_IRQ_PER_CPU(var) ((var) & IRQ_PER_CPU) Index: linux-2.6/kernel/irq/pm.c =================================================================== --- /dev/null +++ linux-2.6/kernel/irq/pm.c @@ -0,0 +1,89 @@ +/* + * linux/kernel/irq/pm.c + * + * Copyright (C) 2009 Rafael J. Wysocki <rjw@sisk.pl>, Novell Inc. + * + * This file contains power management functions related to interrupts. + */ + +#include <linux/irq.h> +#include <linux/module.h> +#include <linux/interrupt.h> + +#include "internals.h" + +/** + * suspend_device_irqs - disable all currently enabled interrupt lines + * + * During system-wide suspend or hibernation device interrupts need to be + * disabled at the chip level and this function is provided for this purpose. + * It disables all interrupt lines that are enabled at the moment and sets the + * IRQ_SUSPENDED flag for them. + */ +void suspend_device_irqs(void) +{ + struct irq_desc *desc; + int irq; + + for_each_irq_desc(irq, desc) { + unsigned long flags; + bool sync = false; + + spin_lock_irqsave(&desc->lock, flags); + + if (desc->action && !(desc->action->flags & IRQF_TIMER)) { + if (!desc->depth++) { + desc->status |= IRQ_DISABLED; + desc->chip->disable(irq); + sync = true; + } + desc->status |= IRQ_SUSPENDED; + } + + spin_unlock_irqrestore(&desc->lock, flags); + + if (sync) + synchronize_irq(irq); + } +} +EXPORT_SYMBOL_GPL(suspend_device_irqs); + +/** + * resume_device_irqs - enable interrupt lines disabled by suspend_device_irqs() + * + * Enable all interrupt lines previously disabled by suspend_device_irqs() that + * have the IRQ_SUSPENDED flag set. 
+ */ +void resume_device_irqs(void) +{ + struct irq_desc *desc; + int irq; + + for_each_irq_desc(irq, desc) { + unsigned long flags; + + if (!(desc->status & IRQ_SUSPENDED)) + continue; + + spin_lock_irqsave(&desc->lock, flags); + desc->status &= ~IRQ_SUSPENDED; + __enable_irq(desc, irq); + spin_unlock_irqrestore(&desc->lock, flags); + } +} +EXPORT_SYMBOL_GPL(resume_device_irqs); + +/** + * check_wakeup_irqs - check if any wake-up interrupts are pending + */ +int check_wakeup_irqs(void) +{ + struct irq_desc *desc; + int irq; + + for_each_irq_desc(irq, desc) + if ((desc->status & IRQ_WAKEUP) && (desc->status & IRQ_PENDING)) + return -EBUSY; + + return 0; +} Index: linux-2.6/kernel/irq/Makefile =================================================================== --- linux-2.6.orig/kernel/irq/Makefile +++ linux-2.6/kernel/irq/Makefile @@ -4,3 +4,4 @@ obj-$(CONFIG_GENERIC_IRQ_PROBE) += autop obj-$(CONFIG_PROC_FS) += proc.o obj-$(CONFIG_GENERIC_PENDING_IRQ) += migration.o obj-$(CONFIG_NUMA_MIGRATE_IRQ_DESC) += numa_migrate.o +obj-$(CONFIG_PM_SLEEP) += pm.o Index: linux-2.6/kernel/irq/manage.c =================================================================== --- linux-2.6.orig/kernel/irq/manage.c +++ linux-2.6/kernel/irq/manage.c @@ -215,7 +215,7 @@ void disable_irq(unsigned int irq) } EXPORT_SYMBOL(disable_irq); -static void __enable_irq(struct irq_desc *desc, unsigned int irq) +void __enable_irq(struct irq_desc *desc, unsigned int irq) { switch (desc->depth) { case 0: Index: linux-2.6/drivers/base/power/main.c =================================================================== --- linux-2.6.orig/drivers/base/power/main.c +++ linux-2.6/drivers/base/power/main.c @@ -23,6 +23,7 @@ #include <linux/pm.h> #include <linux/resume-trace.h> #include <linux/rwsem.h> +#include <linux/interrupt.h> #include "../base.h" #include "power.h" @@ -305,7 +306,8 @@ static int resume_device_noirq(struct de * Execute the appropriate "noirq resume" callback for all devices marked * as 
DPM_OFF_IRQ. * - * Must be called with interrupts disabled and only one CPU running. + * Must be called under dpm_list_mtx. Device drivers should not receive + * interrupts while it's being executed. */ static void dpm_power_up(pm_message_t state) { @@ -326,14 +328,13 @@ static void dpm_power_up(pm_message_t st * device_power_up - Turn on all devices that need special attention. * @state: PM transition of the system being carried out. * - * Power on system devices, then devices that required we shut them down - * with interrupts disabled. - * - * Must be called with interrupts disabled. + * Call the "early" resume handlers and enable device drivers to receive + * interrupts. */ void device_power_up(pm_message_t state) { dpm_power_up(state); + resume_device_irqs(); } EXPORT_SYMBOL_GPL(device_power_up); @@ -558,16 +559,17 @@ static int suspend_device_noirq(struct d * device_power_down - Shut down special devices. * @state: PM transition of the system being carried out. * - * Power down devices that require interrupts to be disabled. - * Then power down system devices. + * Prevent device drivers from receiving interrupts and call the "late" + * suspend handlers. * - * Must be called with interrupts disabled and only one CPU running. + * Must be called under dpm_list_mtx. 
*/ int device_power_down(pm_message_t state) { struct device *dev; int error = 0; + suspend_device_irqs(); list_for_each_entry_reverse(dev, &dpm_list, power.entry) { error = suspend_device_noirq(dev, state); if (error) { @@ -577,7 +579,7 @@ int device_power_down(pm_message_t state dev->power.status = DPM_OFF_IRQ; } if (error) - dpm_power_up(resume_event(state)); + device_power_up(resume_event(state)); return error; } EXPORT_SYMBOL_GPL(device_power_down); Index: linux-2.6/drivers/base/sys.c =================================================================== --- linux-2.6.orig/drivers/base/sys.c +++ linux-2.6/drivers/base/sys.c @@ -22,6 +22,7 @@ #include <linux/pm.h> #include <linux/device.h> #include <linux/mutex.h> +#include <linux/interrupt.h> #include "base.h" @@ -369,6 +370,13 @@ int sysdev_suspend(pm_message_t state) struct sysdev_driver *drv, *err_drv; int ret; + pr_debug("Checking wake-up interrupts\n"); + + /* Return error code if there are any wake-up interrupts pending */ + ret = check_wakeup_irqs(); + if (ret) + return ret; + pr_debug("Suspending System Devices\n"); list_for_each_entry_reverse(cls, &system_kset->list, kset.kobj.entry) { Index: linux-2.6/kernel/irq/internals.h =================================================================== --- linux-2.6.orig/kernel/irq/internals.h +++ linux-2.6/kernel/irq/internals.h @@ -12,6 +12,7 @@ extern void compat_irq_chip_set_default_ extern int __irq_set_trigger(struct irq_desc *desc, unsigned int irq, unsigned long flags); +extern void __enable_irq(struct irq_desc *desc, unsigned int irq); extern struct lock_class_key irq_desc_lock_class; extern void init_kstat_irqs(struct irq_desc *desc, int cpu, int nr); ^ permalink raw reply [flat|nested] 373+ messages in thread
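The call ordering that the hunks above establish can be summarised with a small userspace model. This is only a sketch under stated assumptions: `step()`, `trace` and every `fake_*` function are hypothetical stand-ins that merely record call order, not kernel code, and the failure path follows the `Power_up_devices` unwinding visible in the kexec hunk.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical trace buffer: each stub appends its name so order can be checked. */
static char trace[256];

static void step(const char *name)
{
	strcat(trace, name);
	strcat(trace, ";");
}

/* Stand-in for device_power_down(): mask device IRQs, then run "late" suspend. */
static void fake_device_power_down(void)
{
	step("suspend_device_irqs");
	step("late_suspend");
}

/* Stand-in for device_power_up(): run "early" resume, then unmask device IRQs. */
static void fake_device_power_up(void)
{
	step("early_resume");
	step("resume_device_irqs");
}

/* Stand-in for sysdev_suspend(): refuses to go on if a wake-up IRQ is pending. */
static int fake_sysdev_suspend(int wakeup_pending)
{
	step("check_wakeup_irqs");
	if (wakeup_pending)
		return -1;	/* the patch returns -EBUSY at this point */
	step("sysdev_suspend");
	return 0;
}

/* Models the reordered sequence; returns 0 if the "sleep" step was reached. */
static int fake_suspend_enter(int wakeup_pending)
{
	int error;

	trace[0] = '\0';
	fake_device_power_down();	/* runs with CPU interrupts still enabled */
	step("local_irq_disable");
	error = fake_sysdev_suspend(wakeup_pending);
	if (!error) {
		step("sleep");
		step("sysdev_resume");
	}
	step("local_irq_enable");
	fake_device_power_up();		/* again with CPU interrupts enabled */
	return error;
}
```

Calling `fake_suspend_enter(0)` and inspecting `trace` shows the "late" suspend step running before `local_irq_disable`, which is the point of the rework; `fake_suspend_enter(1)` shows the unwinding when a wake-up interrupt is pending.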
* Re: [PATCH 1/10] PM: Rework handling of interrupts during suspend-resume (rev. 5)
  2009-03-11  9:36 ` Rafael J. Wysocki
@ 2009-03-11 10:33   ` Thomas Gleixner
  2009-03-11 10:33   ` Thomas Gleixner
  1 sibling, 0 replies; 373+ messages in thread
From: Thomas Gleixner @ 2009-03-11 10:33 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Jeremy Fitzhardinge, Frans Pop, LKML, Jesse Barnes,
	Eric W. Biederman, pm list, Linus Torvalds, Ingo Molnar

Rafael,

On Wed, 11 Mar 2009, Rafael J. Wysocki wrote:
> Index: linux-2.6/kernel/irq/pm.c
> ===================================================================
> --- /dev/null
> +++ linux-2.6/kernel/irq/pm.c
> @@ -0,0 +1,89 @@
> +/*
> + * linux/kernel/irq/pm.c
> + *
> + * Copyright (C) 2009 Rafael J. Wysocki <rjw@sisk.pl>, Novell Inc.
> + *
> + * This file contains power management functions related to interrupts.
> + */
> +
> +#include <linux/irq.h>
> +#include <linux/module.h>
> +#include <linux/interrupt.h>
> +
> +#include "internals.h"
> +
> +/**
> + * suspend_device_irqs - disable all currently enabled interrupt lines
> + *
> + * During system-wide suspend or hibernation device interrupts need to be
> + * disabled at the chip level and this function is provided for this purpose.
> + * It disables all interrupt lines that are enabled at the moment and sets the
> + * IRQ_SUSPENDED flag for them.
> + */
> +void suspend_device_irqs(void)
> +{
> +	struct irq_desc *desc;
> +	int irq;
> +
> +	for_each_irq_desc(irq, desc) {
> +		unsigned long flags;
> +		bool sync = false;
> +
> +		spin_lock_irqsave(&desc->lock, flags);
> +
> +		if (desc->action && !(desc->action->flags & IRQF_TIMER)) {
> +			if (!desc->depth++) {
> +				desc->status |= IRQ_DISABLED;
> +				desc->chip->disable(irq);
> +				sync = true;
> +			}
> +			desc->status |= IRQ_SUSPENDED;

This flag needs to be checked in __enable_irq().

> +		}
> +
> +		spin_unlock_irqrestore(&desc->lock, flags);
> +
> +		if (sync)
> +			synchronize_irq(irq);
> +	}
> +}
> +EXPORT_SYMBOL_GPL(suspend_device_irqs);

I'm not too enthusiastic about this open coded implementation of
disable_irq() with slightly different semantics.

Can we please move the fiddling with desc->* into
kernel/irq/manage.c and share the code there ?

> +/**
> + * resume_device_irqs - enable interrupt lines disabled by suspend_device_irqs()
> + *
> + * Enable all interrupt lines previously disabled by suspend_device_irqs() that
> + * have the IRQ_SUSPENDED flag set.
> + */
> +void resume_device_irqs(void)
> +{
> +	struct irq_desc *desc;
> +	int irq;
> +
> +	for_each_irq_desc(irq, desc) {
> +		unsigned long flags;
> +
> +		if (!(desc->status & IRQ_SUSPENDED))
> +			continue;
> +
> +		spin_lock_irqsave(&desc->lock, flags);
> +		desc->status &= ~IRQ_SUSPENDED;
> +		__enable_irq(desc, irq);
> +		spin_unlock_irqrestore(&desc->lock, flags);
> +	}
> +}
> +EXPORT_SYMBOL_GPL(resume_device_irqs);

Ditto.

Thanks,

	tglx

^ permalink raw reply	[flat|nested] 373+ messages in thread
* Re: [PATCH 1/10] PM: Rework handling of interrupts during suspend-resume (rev. 5)
  2009-03-11 10:33 ` Thomas Gleixner
@ 2009-03-11 20:59   ` Rafael J. Wysocki
  2009-03-11 21:42     ` Thomas Gleixner
  2009-03-11 21:42     ` Thomas Gleixner
  2009-03-11 20:59   ` Rafael J. Wysocki
  ` (2 subsequent siblings)
  3 siblings, 2 replies; 373+ messages in thread
From: Rafael J. Wysocki @ 2009-03-11 20:59 UTC (permalink / raw)
  To: Thomas Gleixner, Ingo Molnar
  Cc: pm list, LKML, Linus Torvalds, Eric W. Biederman,
	Benjamin Herrenschmidt, Jeremy Fitzhardinge, Len Brown,
	Jesse Barnes, Frans Pop, Arve Hjønnevåg

On Wednesday 11 March 2009, Thomas Gleixner wrote:
> Rafael,
>
> On Wed, 11 Mar 2009, Rafael J. Wysocki wrote:
>
> > Index: linux-2.6/kernel/irq/pm.c
> > ===================================================================
> > --- /dev/null
> > +++ linux-2.6/kernel/irq/pm.c
> > @@ -0,0 +1,89 @@
> > +/*
> > + * linux/kernel/irq/pm.c
> > + *
> > + * Copyright (C) 2009 Rafael J. Wysocki <rjw@sisk.pl>, Novell Inc.
> > + *
> > + * This file contains power management functions related to interrupts.
> > + */
> > +
> > +#include <linux/irq.h>
> > +#include <linux/module.h>
> > +#include <linux/interrupt.h>
> > +
> > +#include "internals.h"
> > +
> > +/**
> > + * suspend_device_irqs - disable all currently enabled interrupt lines
> > + *
> > + * During system-wide suspend or hibernation device interrupts need to be
> > + * disabled at the chip level and this function is provided for this purpose.
> > + * It disables all interrupt lines that are enabled at the moment and sets the
> > + * IRQ_SUSPENDED flag for them.
> > + */
> > +void suspend_device_irqs(void)
> > +{
> > +	struct irq_desc *desc;
> > +	int irq;
> > +
> > +	for_each_irq_desc(irq, desc) {
> > +		unsigned long flags;
> > +		bool sync = false;
> > +
> > +		spin_lock_irqsave(&desc->lock, flags);
> > +
> > +		if (desc->action && !(desc->action->flags & IRQF_TIMER)) {
> > +			if (!desc->depth++) {
> > +				desc->status |= IRQ_DISABLED;
> > +				desc->chip->disable(irq);
> > +				sync = true;
> > +			}
> > +			desc->status |= IRQ_SUSPENDED;
>
> This flag needs to be checked in __enable_irq().
>
> > +		}
> > +
> > +		spin_unlock_irqrestore(&desc->lock, flags);
> > +
> > +		if (sync)
> > +			synchronize_irq(irq);
> > +	}
> > +}
> > +EXPORT_SYMBOL_GPL(suspend_device_irqs);
>
> I'm not too enthusiastic about this open coded implementation of
> disable_irq() with slightly different semantics.

The difference in semantics is important IMO, otherwise I wouldn't have
done that.  In particular, IMO, the condition should be under the spinlock
and I'd rather not synchronize all interrupts we don't really disable here.

> Can we please move the fiddling with desc->* into
> kernel/irq/manage.c and share the code there ?

Can you please discuss that with Ingo?  I moved that from manage.c at his
request.

Thanks,
Rafael

^ permalink raw reply	[flat|nested] 373+ messages in thread
* Re: [PATCH 1/10] PM: Rework handling of interrupts during suspend-resume (rev. 5)
  2009-03-11 20:59 ` Rafael J. Wysocki
@ 2009-03-11 21:42   ` Thomas Gleixner
  2009-03-11 21:42   ` Thomas Gleixner
  1 sibling, 0 replies; 373+ messages in thread
From: Thomas Gleixner @ 2009-03-11 21:42 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Jeremy Fitzhardinge, Frans Pop, LKML, Jesse Barnes,
	Eric W. Biederman, Ingo Molnar, Linus Torvalds, pm list

On Wed, 11 Mar 2009, Rafael J. Wysocki wrote:
> On Wednesday 11 March 2009, Thomas Gleixner wrote:
> > > +EXPORT_SYMBOL_GPL(suspend_device_irqs);
> >
> > I'm not too enthusiastic about this open coded implementation of
> > disable_irq() with slightly different semantics.
>
> The difference in semantics is important IMO, otherwise I wouldn't have
> done that.  In particular, IMO, the condition should be under the spinlock
> and I'd rather not synchronize all interrupts we don't really disable here.

I don't say that the difference is not relevant. But the code is
almost the same and disable_irq() could have the sync_irq optimization
as well.

> > Can we please move the fiddling with desc->* into
> > kernel/irq/manage.c and share the code there ?
>
> Can you please discuss that with Ingo?  I moved that from manage.c at his
> request.

Hmrpf. Will do. I just want to avoid that we have scattered functions
which deal with the guts of the irq code all over the place. I'm fine
with your loop in irq/pm.c, but the actual handling of the irq
internals should remain in manage.c.

I'll have a closer look how to solve this.

Thanks,

	tglx

^ permalink raw reply	[flat|nested] 373+ messages in thread
* Re: [PATCH 1/10] PM: Rework handling of interrupts during suspend-resume (rev. 5)
  2009-03-11 21:42 ` Thomas Gleixner
@ 2009-03-11 22:01   ` Rafael J. Wysocki
  2009-03-11 22:45   ` Thomas Gleixner
  2009-03-11 22:45   ` Thomas Gleixner
  2 siblings, 0 replies; 373+ messages in thread
From: Rafael J. Wysocki @ 2009-03-11 22:01 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Ingo Molnar, pm list, LKML, Linus Torvalds, Eric W. Biederman,
	Benjamin Herrenschmidt, Jeremy Fitzhardinge, Len Brown,
	Jesse Barnes, Frans Pop, Arve Hjønnevåg

On Wednesday 11 March 2009, Thomas Gleixner wrote:
> On Wed, 11 Mar 2009, Rafael J. Wysocki wrote:
> > On Wednesday 11 March 2009, Thomas Gleixner wrote:
> > > > +EXPORT_SYMBOL_GPL(suspend_device_irqs);
> > >
> > > I'm not too enthusiastic about this open coded implementation of
> > > disable_irq() with slightly different semantics.
> >
> > The difference in semantics is important IMO, otherwise I wouldn't have
> > done that.  In particular, IMO, the condition should be under the spinlock
> > and I'd rather not synchronize all interrupts we don't really disable here.
>
> I don't say that the difference is not relevant. But the code is
> almost the same and disable_irq() could have the sync_irq optimization
> as well.

Agreed.

> > > Can we please move the fiddling with desc->* into
> > > kernel/irq/manage.c and share the code there ?
> >
> > Can you please discuss that with Ingo?  I moved that from manage.c at his
> > request.
>
> Hmrpf. Will do. I just want to avoid that we have scattered functions
> which deal with the guts of the irq code all over the place.

I understand your concern, I'd prefer to avoid that too.

> I'm fine with your loop in irq/pm.c, but the actual handling of the irq
> internals should remain in manage.c.

Well, perhaps we can add a parameter to disable_irq_nosync() telling it not
to disable the interrupt if it's a timer one?  Something like

void disable_irq_nosync(unsigned int irq, bool skip_timer)

etc.?  Also, it could return a value meaning whether or not the interrupt
has been actually disabled.

> I'll have a closer look how to solve this.

Thanks!

Best,
Rafael

^ permalink raw reply	[flat|nested] 373+ messages in thread
* Re: [PATCH 1/10] PM: Rework handling of interrupts during suspend-resume (rev. 5)
  2009-03-11 21:42 ` Thomas Gleixner
  2009-03-11 22:01   ` Rafael J. Wysocki
@ 2009-03-11 22:45   ` Thomas Gleixner
  2009-03-12 13:36     ` Rafael J. Wysocki
  2009-03-12 13:36     ` Rafael J. Wysocki
  2009-03-11 22:45   ` Thomas Gleixner
  2 siblings, 2 replies; 373+ messages in thread
From: Thomas Gleixner @ 2009-03-11 22:45 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Ingo Molnar, pm list, LKML, Linus Torvalds, Eric W. Biederman,
	Benjamin Herrenschmidt, Jeremy Fitzhardinge, Len Brown,
	Jesse Barnes, Frans Pop, Arve Hjønnevåg

On Wed, 11 Mar 2009, Thomas Gleixner wrote:
> On Wed, 11 Mar 2009, Rafael J. Wysocki wrote:
> > On Wednesday 11 March 2009, Thomas Gleixner wrote:
> > > > +EXPORT_SYMBOL_GPL(suspend_device_irqs);
> > >
> > > I'm not too enthusiastic about this open coded implementation of
> > > disable_irq() with slightly different semantics.
> >
> > The difference in semantics is important IMO, otherwise I wouldn't have
> > done that.  In particular, IMO, the condition should be under the spinlock
> > and I'd rather not synchronize all interrupts we don't really disable here.
>
> I don't say that the difference is not relevant. But the code is
> almost the same and disable_irq() could have the sync_irq optimization
> as well.

Thought more about that. Avoiding the sync_irq() for irqs which have
no action associated is fine, but you need to catch the following case
as well:

 driver code calls disable_irq_nosync() from the handler (which is
 still running)

 suspend code skips the sync due to depth > 0

The sync operation is not that expensive.

Thanks,

	tglx

^ permalink raw reply	[flat|nested] 373+ messages in thread
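The case Thomas describes can be illustrated with a userspace model. This is a minimal sketch, not kernel code: `struct fake_desc`, its `handler_running` field (a stand-in for the in-progress state that `synchronize_irq()` waits on) and all `fake_*`/`suspend_syncs_*` helpers are hypothetical names introduced only for illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the few irq_desc fields that matter here. */
struct fake_desc {
	unsigned int depth;	/* nesting count of disable calls */
	bool handler_running;	/* models a handler still executing */
};

/* Models disable_irq_nosync(): bump the depth, never wait for the handler. */
static void fake_disable_nosync(struct fake_desc *d)
{
	d->depth++;
}

/*
 * Models the rev. 5 loop body: synchronize only if this call took the
 * depth from 0 to 1. Returns whether the suspend code would wait.
 */
static bool suspend_syncs_rev5(struct fake_desc *d)
{
	bool sync = false;

	if (!d->depth++)
		sync = true;
	return sync;
}

/* Models the alternative: always wait, whatever the previous depth was. */
static bool suspend_syncs_always(struct fake_desc *d)
{
	d->depth++;
	return true;
}

/* True if a handler could still be executing after the suspend step. */
static bool handler_may_still_run(const struct fake_desc *d, bool synced)
{
	return d->handler_running && !synced;
}
```

With a descriptor whose handler is running and has already called `fake_disable_nosync()`, `suspend_syncs_rev5()` returns false, so the handler may still be executing when the "late" suspend callbacks start; `suspend_syncs_always()` closes that window at the cost of one extra wait per line.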
* Re: [PATCH 1/10] PM: Rework handling of interrupts during suspend-resume (rev. 5)
  2009-03-11 22:45 ` Thomas Gleixner
@ 2009-03-12 13:36   ` Rafael J. Wysocki
  2009-03-12 21:43     ` [update, rev. 6] " Rafael J. Wysocki
  2009-03-12 21:43     ` Rafael J. Wysocki
  2009-03-12 13:36   ` Rafael J. Wysocki
  1 sibling, 2 replies; 373+ messages in thread
From: Rafael J. Wysocki @ 2009-03-12 13:36 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Ingo Molnar, pm list, LKML, Linus Torvalds, Eric W. Biederman,
	Benjamin Herrenschmidt, Jeremy Fitzhardinge, Len Brown,
	Jesse Barnes, Frans Pop, Arve Hjønnevåg

On Wednesday 11 March 2009, Thomas Gleixner wrote:
> On Wed, 11 Mar 2009, Thomas Gleixner wrote:
> > On Wed, 11 Mar 2009, Rafael J. Wysocki wrote:
> > > On Wednesday 11 March 2009, Thomas Gleixner wrote:
> > > > > +EXPORT_SYMBOL_GPL(suspend_device_irqs);
> > > >
> > > > I'm not too enthusiastic about this open coded implementation of
> > > > disable_irq() with slightly different semantics.
> > >
> > > The difference in semantics is important IMO, otherwise I wouldn't have
> > > done that.  In particular, IMO, the condition should be under the spinlock
> > > and I'd rather not synchronize all interrupts we don't really disable here.
> >
> > I don't say that the difference is not relevant. But the code is
> > almost the same and disable_irq() could have the sync_irq optimization
> > as well.
>
> Thought more about that. Avoiding the sync_irq() for irqs which have
> no action associated is fine, but you need to catch the following case
> as well:
>
>  driver code calls disable_irq_nosync() from the handler (which is
>  still running)
>
>  suspend code skips the sync due to depth > 0
>
> The sync operation is not that expensive.

OK, what about this (untested, irrelevant parts skipped)?

Index: linux-2.6/kernel/irq/pm.c =================================================================== --- /dev/null +++ linux-2.6/kernel/irq/pm.c @@ -0,0 +1,79 @@ +/* + * linux/kernel/irq/pm.c + * + * Copyright (C) 2009 Rafael J. Wysocki <rjw@sisk.pl>, Novell Inc. + * + * This file contains power management functions related to interrupts. + */ + +#include <linux/irq.h> +#include <linux/module.h> +#include <linux/interrupt.h> + +#include "internals.h" + +/** + * suspend_device_irqs - disable all currently enabled interrupt lines + * + * During system-wide suspend or hibernation device interrupts need to be + * disabled at the chip level and this function is provided for this purpose. + * It disables all interrupt lines that are enabled at the moment and sets the + * IRQ_SUSPENDED flag for them. + */ +void suspend_device_irqs(void) +{ + struct irq_desc *desc; + int irq; + + for_each_irq_desc(irq, desc) { + unsigned long flags; + + spin_lock_irqsave(&desc->lock, flags); + __disable_irq(desc, irq, true); + spin_unlock_irqrestore(&desc->lock, flags); + } + + for_each_irq_desc(irq, desc) + if (desc->status & IRQ_SUSPENDED) + synchronize_irq(irq); +} +EXPORT_SYMBOL_GPL(suspend_device_irqs); + +/** + * resume_device_irqs - enable interrupt lines disabled by suspend_device_irqs() + * + * Enable all interrupt lines previously disabled by suspend_device_irqs() that + * have the IRQ_SUSPENDED flag set. 
+ */ +void resume_device_irqs(void) +{ + struct irq_desc *desc; + int irq; + + for_each_irq_desc(irq, desc) { + unsigned long flags; + + if (!(desc->status & IRQ_SUSPENDED)) + continue; + + spin_lock_irqsave(&desc->lock, flags); + __enable_irq(desc, irq, true); + spin_unlock_irqrestore(&desc->lock, flags); + } +} +EXPORT_SYMBOL_GPL(resume_device_irqs); + +/** + * check_wakeup_irqs - check if any wake-up interrupts are pending + */ +int check_wakeup_irqs(void) +{ + struct irq_desc *desc; + int irq; + + for_each_irq_desc(irq, desc) + if ((desc->status & IRQ_WAKEUP) && (desc->status & IRQ_PENDING)) + return -EBUSY; + + return 0; +} Index: linux-2.6/kernel/irq/Makefile =================================================================== --- linux-2.6.orig/kernel/irq/Makefile +++ linux-2.6/kernel/irq/Makefile @@ -4,3 +4,4 @@ obj-$(CONFIG_GENERIC_IRQ_PROBE) += autop obj-$(CONFIG_PROC_FS) += proc.o obj-$(CONFIG_GENERIC_PENDING_IRQ) += migration.o obj-$(CONFIG_NUMA_MIGRATE_IRQ_DESC) += numa_migrate.o +obj-$(CONFIG_PM_SLEEP) += pm.o Index: linux-2.6/kernel/irq/manage.c =================================================================== --- linux-2.6.orig/kernel/irq/manage.c +++ linux-2.6/kernel/irq/manage.c @@ -162,6 +162,20 @@ static inline int do_irq_select_affinity } #endif +void __disable_irq(struct irq_desc *desc, unsigned int irq, bool suspend) +{ + if (suspend) { + if (desc->action && (desc->action->flags & IRQF_TIMER)) + return; + desc->status |= IRQ_SUSPENDED; + } + + if (!desc->depth++) { + desc->status |= IRQ_DISABLED; + desc->chip->disable(irq); + } +} + /** * disable_irq_nosync - disable an irq without waiting * @irq: Interrupt to disable @@ -182,10 +196,7 @@ void disable_irq_nosync(unsigned int irq return; spin_lock_irqsave(&desc->lock, flags); - if (!desc->depth++) { - desc->status |= IRQ_DISABLED; - desc->chip->disable(irq); - } + __disable_irq(desc, irq, false); spin_unlock_irqrestore(&desc->lock, flags); } EXPORT_SYMBOL(disable_irq_nosync); @@ -215,15 
+226,19 @@ void disable_irq(unsigned int irq) } EXPORT_SYMBOL(disable_irq); -static void __enable_irq(struct irq_desc *desc, unsigned int irq) +void __enable_irq(struct irq_desc *desc, unsigned int irq, bool resume) { + if (resume) + desc->status &= ~IRQ_SUSPENDED; + switch (desc->depth) { case 0: - WARN(1, KERN_WARNING "Unbalanced enable for IRQ %d\n", irq); - break; + goto err_out; case 1: { unsigned int status = desc->status & ~IRQ_DISABLED; + if (desc->status & IRQ_SUSPENDED) + goto err_out; /* Prevent probing on this irq: */ desc->status = status | IRQ_NOPROBE; check_irq_resend(desc, irq); @@ -232,6 +247,11 @@ static void __enable_irq(struct irq_desc default: desc->depth--; } + + return; + + err_out: + WARN(true, KERN_WARNING "Unbalanced enable for IRQ %d\n", irq); } /** @@ -253,7 +273,7 @@ void enable_irq(unsigned int irq) return; spin_lock_irqsave(&desc->lock, flags); - __enable_irq(desc, irq); + __enable_irq(desc, irq, false); spin_unlock_irqrestore(&desc->lock, flags); } EXPORT_SYMBOL(enable_irq); @@ -511,7 +531,7 @@ __setup_irq(unsigned int irq, struct irq */ if (shared && (desc->status & IRQ_SPURIOUS_DISABLED)) { desc->status &= ~IRQ_SPURIOUS_DISABLED; - __enable_irq(desc, irq); + __enable_irq(desc, irq, false); } spin_unlock_irqrestore(&desc->lock, flags); Index: linux-2.6/kernel/irq/internals.h =================================================================== --- linux-2.6.orig/kernel/irq/internals.h +++ linux-2.6/kernel/irq/internals.h @@ -12,6 +12,8 @@ extern void compat_irq_chip_set_default_ extern int __irq_set_trigger(struct irq_desc *desc, unsigned int irq, unsigned long flags); +extern void __disable_irq(struct irq_desc *desc, unsigned int irq, bool susp); +extern void __enable_irq(struct irq_desc *desc, unsigned int irq, bool resume); extern struct lock_class_key irq_desc_lock_class; extern void init_kstat_irqs(struct irq_desc *desc, int cpu, int nr); Thanks, Rafael ^ permalink raw reply [flat|nested] 373+ messages in thread
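As a reading aid for the __disable_irq()/__enable_irq() changes in the patch above, here is a minimal user-space sketch of the disable-depth and IRQ_SUSPENDED bookkeeping. It is not kernel code: every toy_* name and TOY_* constant is hypothetical, the bit values are arbitrary, and a plain boolean stands in for desc->chip->disable()/enable(). Locking, IRQ_NOPROBE and check_irq_resend() are left out, and the model assumes that case 1 falls through into the depth decrement.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for IRQ_DISABLED, IRQ_SUSPENDED and IRQF_TIMER. */
#define TOY_IRQ_DISABLED  0x1u
#define TOY_IRQ_SUSPENDED 0x2u
#define TOY_IRQF_TIMER    0x4u

struct toy_irq_desc {
	unsigned int status;       /* TOY_IRQ_* bits */
	unsigned int depth;        /* nested disable count, 0 means enabled */
	unsigned int action_flags; /* flags of the registered handler */
	bool masked;               /* stands in for chip->disable()/enable() */
};

/* Mirrors __disable_irq(): timer IRQs are skipped on the suspend path. */
static void toy_disable_irq(struct toy_irq_desc *desc, bool suspend)
{
	if (suspend) {
		if (desc->action_flags & TOY_IRQF_TIMER)
			return;
		desc->status |= TOY_IRQ_SUSPENDED;
	}
	if (!desc->depth++) {
		desc->status |= TOY_IRQ_DISABLED;
		desc->masked = true;
	}
}

/* Mirrors __enable_irq(); returns false where the kernel would WARN. */
static bool toy_enable_irq(struct toy_irq_desc *desc, bool resume)
{
	if (resume)
		desc->status &= ~TOY_IRQ_SUSPENDED;

	switch (desc->depth) {
	case 0:
		return false; /* unbalanced enable */
	case 1:
		if (desc->status & TOY_IRQ_SUSPENDED)
			return false; /* a driver may not undo the suspend-time disable */
		desc->status &= ~TOY_IRQ_DISABLED;
		desc->masked = false;
		/* fall through */
	default:
		desc->depth--;
	}
	return true;
}

static void toy_suspend_device_irqs(struct toy_irq_desc *descs, size_t n)
{
	for (size_t i = 0; i < n; i++)
		toy_disable_irq(&descs[i], true);
}

static void toy_resume_device_irqs(struct toy_irq_desc *descs, size_t n)
{
	for (size_t i = 0; i < n; i++)
		if (descs[i].status & TOY_IRQ_SUSPENDED)
			toy_enable_irq(&descs[i], true);
}

/* Walks three lines through suspend and resume; returns 0 if all checks hold. */
int toy_irq_demo(void)
{
	struct toy_irq_desc d[3] = {
		{ .action_flags = 0 },              /* ordinary device IRQ */
		{ .action_flags = TOY_IRQF_TIMER }, /* timer IRQ, must stay enabled */
		{ .action_flags = 0 },              /* IRQ the driver disabled earlier */
	};

	toy_disable_irq(&d[2], false);
	toy_suspend_device_irqs(d, 3);
	if (!d[0].masked || d[1].masked || d[2].depth != 2)
		return -1;

	/* An enable_irq() from a driver during suspend is refused. */
	if (toy_enable_irq(&d[0], false) || !d[0].masked)
		return -1;

	toy_resume_device_irqs(d, 3);
	/* d[2] keeps the driver's own disable: depth 1 and still masked. */
	if (d[0].masked || d[0].depth != 0 || d[2].depth != 1 || !d[2].masked)
		return -1;
	return 0;
}
```

The point of the extra flag is visible in the third descriptor: resume only drops the one reference that suspend added, so a line that a driver had disabled on its own stays disabled afterwards.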
* [update, rev. 6] Re: [PATCH 1/10] PM: Rework handling of interrupts during suspend-resume (rev. 5) 2009-03-12 13:36 ` Rafael J. Wysocki @ 2009-03-12 21:43 ` Rafael J. Wysocki 2009-03-12 21:43 ` Rafael J. Wysocki 1 sibling, 0 replies; 373+ messages in thread From: Rafael J. Wysocki @ 2009-03-12 21:43 UTC (permalink / raw) To: Thomas Gleixner, Ingo Molnar Cc: Arve, Jeremy Fitzhardinge, Frans Pop, LKML, Jesse Barnes, Eric W. Biederman, pm list, Linus Torvalds

On Thursday 12 March 2009, Rafael J. Wysocki wrote:
> On Wednesday 11 March 2009, Thomas Gleixner wrote:
> > On Wed, 11 Mar 2009, Thomas Gleixner wrote:
> > > On Wed, 11 Mar 2009, Rafael J. Wysocki wrote:
> > > > On Wednesday 11 March 2009, Thomas Gleixner wrote:
> > > > > > +EXPORT_SYMBOL_GPL(suspend_device_irqs);
> > > > >
> > > > > I'm not too enthusiastic about this open coded implementation of
> > > > > disable_irq() with slightly different semantics.
> > > >
> > > > The difference in semantics is important IMO, otherwise I wouldn't have
> > > > done that. In particular, IMO, the condition should be under the spinlock
> > > > and I'd rather not synchronize all interrupts we don't really disable here.
> > >
> > > I don't say that the difference is not relevant. But the code is
> > > almost the same and disable_irq() could have the sync_irq optimization
> > > as well.
> >
> > Thought more about that. Avoiding the sync_irq() for irqs which have
> > no action associated is fine, but you need to catch the following case
> > as well:
> >
> > driver code calls disable_irq_nosync() from the handler (which is
> > still running)
> >
> > suspend code skips the sync due to depth > 0
> >
> > The sync operation is not that expensive.
>
> OK, what about this (untested, irrelevant parts skipped)?

Well, I guess I need to assume that no reaction means it's fine. ;-)

Below is the complete patch.

Thomas, Ingo, please let me know if it is fine with you.

Thanks,
Rafael

---
From: Rafael J. Wysocki <rjw@sisk.pl>
Subject: PM: Rework handling of interrupts during suspend-resume (rev. 6)

Introduce two helper functions allowing us to prevent device drivers from getting any interrupts (without disabling interrupts on the CPU) during suspend (or hibernation) and to make them start to receive interrupts again during the subsequent resume, respectively. These functions make it possible to keep timer interrupts enabled while the "late" suspend and "early" resume callbacks provided by device drivers are being executed.

Use these functions to rework the handling of interrupts during suspend (hibernation) and resume. Namely, interrupts will only be disabled on the CPU right before suspending sysdevs, while device drivers will be prevented from receiving interrupts, with the help of the new helper function, before their "late" suspend callbacks run (and analogously during resume).

In addition, since the device interrupts are now disabled before the CPU has turned all interrupts off and the CPU will ACK the interrupts setting the IRQ_PENDING bit for them, check in sysdev_suspend() if any wake-up interrupts are pending and abort suspend if that's the case.

Signed-off-by: Rafael J.
Wysocki <rjw@sisk.pl> --- arch/x86/kernel/apm_32.c | 15 ++++++-- drivers/base/power/main.c | 20 ++++++----- drivers/base/sys.c | 8 ++++ drivers/xen/manage.c | 16 +++++---- include/linux/interrupt.h | 5 ++ include/linux/irq.h | 1 kernel/irq/Makefile | 1 kernel/irq/internals.h | 2 + kernel/irq/manage.c | 31 +++++++++++++----- kernel/irq/pm.c | 79 ++++++++++++++++++++++++++++++++++++++++++++++ kernel/kexec.c | 8 ++-- kernel/power/disk.c | 39 ++++++++++++++++------ kernel/power/main.c | 17 ++++++--- 13 files changed, 195 insertions(+), 47 deletions(-) Index: linux-2.6/include/linux/interrupt.h =================================================================== --- linux-2.6.orig/include/linux/interrupt.h +++ linux-2.6/include/linux/interrupt.h @@ -106,6 +106,11 @@ extern void disable_irq_nosync(unsigned extern void disable_irq(unsigned int irq); extern void enable_irq(unsigned int irq); +/* The following three functions are for the core kernel use only. */ +extern void suspend_device_irqs(void); +extern void resume_device_irqs(void); +extern int check_wakeup_irqs(void); + #if defined(CONFIG_SMP) && defined(CONFIG_GENERIC_HARDIRQS) extern cpumask_var_t irq_default_affinity; Index: linux-2.6/kernel/power/main.c =================================================================== --- linux-2.6.orig/kernel/power/main.c +++ linux-2.6/kernel/power/main.c @@ -287,17 +287,19 @@ void __attribute__ ((weak)) arch_suspend */ static int suspend_enter(suspend_state_t state) { - int error = 0; + int error; device_pm_lock(); - arch_suspend_disable_irqs(); - BUG_ON(!irqs_disabled()); - if ((error = device_power_down(PMSG_SUSPEND))) { + error = device_power_down(PMSG_SUSPEND); + if (error) { printk(KERN_ERR "PM: Some devices failed to power down\n"); goto Done; } + arch_suspend_disable_irqs(); + BUG_ON(!irqs_disabled()); + error = sysdev_suspend(PMSG_SUSPEND); if (!error) { if (!suspend_test(TEST_CORE)) @@ -305,11 +307,14 @@ static int suspend_enter(suspend_state_t sysdev_resume(); } - 
device_power_up(PMSG_RESUME); - Done: arch_suspend_enable_irqs(); BUG_ON(irqs_disabled()); + + device_power_up(PMSG_RESUME); + + Done: device_pm_unlock(); + return error; } Index: linux-2.6/kernel/power/disk.c =================================================================== --- linux-2.6.orig/kernel/power/disk.c +++ linux-2.6/kernel/power/disk.c @@ -214,7 +214,7 @@ static int create_image(int platform_mod return error; device_pm_lock(); - local_irq_disable(); + /* At this point, device_suspend() has been called, but *not* * device_power_down(). We *must* call device_power_down() now. * Otherwise, drivers for some devices (e.g. interrupt controllers) @@ -225,8 +225,11 @@ static int create_image(int platform_mod if (error) { printk(KERN_ERR "PM: Some devices failed to power down, " "aborting hibernation\n"); - goto Enable_irqs; + goto Unlock; } + + local_irq_disable(); + sysdev_suspend(PMSG_FREEZE); if (error) { printk(KERN_ERR "PM: Some devices failed to power down, " @@ -252,12 +255,16 @@ static int create_image(int platform_mod /* NOTE: device_power_up() is just a resume() for devices * that suspended with irqs off ... no overall powerup. */ + Power_up_devices: + local_irq_enable(); + device_power_up(in_suspend ? (error ? 
PMSG_RECOVER : PMSG_THAW) : PMSG_RESTORE); - Enable_irqs: - local_irq_enable(); + + Unlock: device_pm_unlock(); + return error; } @@ -336,13 +343,16 @@ static int resume_target_kernel(void) int error; device_pm_lock(); - local_irq_disable(); + error = device_power_down(PMSG_QUIESCE); if (error) { printk(KERN_ERR "PM: Some devices failed to power down, " "aborting resume\n"); - goto Enable_irqs; + goto Unlock; } + + local_irq_disable(); + sysdev_suspend(PMSG_QUIESCE); /* We'll ignore saved state, but this gets preempt count (etc) right */ save_processor_state(); @@ -366,11 +376,16 @@ static int resume_target_kernel(void) swsusp_free(); restore_processor_state(); touch_softlockup_watchdog(); + sysdev_resume(); - device_power_up(PMSG_RECOVER); - Enable_irqs: + local_irq_enable(); + + device_power_up(PMSG_RECOVER); + + Unlock: device_pm_unlock(); + return error; } @@ -447,15 +462,16 @@ int hibernation_platform_enter(void) goto Finish; device_pm_lock(); - local_irq_disable(); + error = device_power_down(PMSG_HIBERNATE); if (!error) { + local_irq_disable(); sysdev_suspend(PMSG_HIBERNATE); hibernation_ops->enter(); /* We should never get here */ while (1); } - local_irq_enable(); + device_pm_unlock(); /* @@ -464,12 +480,15 @@ int hibernation_platform_enter(void) */ Finish: hibernation_ops->finish(); + Resume_devices: entering_platform_hibernation = false; device_resume(PMSG_RESTORE); resume_console(); + Close: hibernation_ops->end(); + return error; } Index: linux-2.6/arch/x86/kernel/apm_32.c =================================================================== --- linux-2.6.orig/arch/x86/kernel/apm_32.c +++ linux-2.6/arch/x86/kernel/apm_32.c @@ -1190,8 +1190,10 @@ static int suspend(int vetoable) struct apm_user *as; device_suspend(PMSG_SUSPEND); - local_irq_disable(); + device_power_down(PMSG_SUSPEND); + + local_irq_disable(); sysdev_suspend(PMSG_SUSPEND); local_irq_enable(); @@ -1209,9 +1211,12 @@ static int suspend(int vetoable) if (err != APM_SUCCESS) 
apm_error("suspend", err); err = (err == APM_SUCCESS) ? 0 : -EIO; + sysdev_resume(); - device_power_up(PMSG_RESUME); local_irq_enable(); + + device_power_up(PMSG_RESUME); + device_resume(PMSG_RESUME); queue_event(APM_NORMAL_RESUME, NULL); spin_lock(&user_list_lock); @@ -1228,8 +1233,9 @@ static void standby(void) { int err; - local_irq_disable(); device_power_down(PMSG_SUSPEND); + + local_irq_disable(); sysdev_suspend(PMSG_SUSPEND); local_irq_enable(); @@ -1239,8 +1245,9 @@ static void standby(void) local_irq_disable(); sysdev_resume(); - device_power_up(PMSG_RESUME); local_irq_enable(); + + device_power_up(PMSG_RESUME); } static apm_event_t get_event(void) Index: linux-2.6/drivers/xen/manage.c =================================================================== --- linux-2.6.orig/drivers/xen/manage.c +++ linux-2.6/drivers/xen/manage.c @@ -39,12 +39,6 @@ static int xen_suspend(void *data) BUG_ON(!irqs_disabled()); - err = device_power_down(PMSG_SUSPEND); - if (err) { - printk(KERN_ERR "xen_suspend: device_power_down failed: %d\n", - err); - return err; - } err = sysdev_suspend(PMSG_SUSPEND); if (err) { printk(KERN_ERR "xen_suspend: sysdev_suspend failed: %d\n", @@ -69,7 +63,6 @@ static int xen_suspend(void *data) xen_mm_unpin_all(); sysdev_resume(); - device_power_up(PMSG_RESUME); if (!*cancelled) { xen_irq_resume(); @@ -108,6 +101,12 @@ static void do_suspend(void) /* XXX use normal device tree? 
*/ xenbus_suspend(); + err = device_power_down(PMSG_SUSPEND); + if (err) { + printk(KERN_ERR "device_power_down failed: %d\n", err); + goto resume_devices; + } + err = stop_machine(xen_suspend, &cancelled, &cpumask_of_cpu(0)); if (err) { printk(KERN_ERR "failed to start xen_suspend: %d\n", err); @@ -120,6 +119,9 @@ static void do_suspend(void) } else xenbus_suspend_cancel(); + device_power_up(PMSG_RESUME); + +resume_devices: device_resume(PMSG_RESUME); /* Make sure timer events get retriggered on all CPUs */ Index: linux-2.6/kernel/kexec.c =================================================================== --- linux-2.6.orig/kernel/kexec.c +++ linux-2.6/kernel/kexec.c @@ -1454,7 +1454,6 @@ int kernel_kexec(void) if (error) goto Resume_devices; device_pm_lock(); - local_irq_disable(); /* At this point, device_suspend() has been called, * but *not* device_power_down(). We *must* * device_power_down() now. Otherwise, drivers for @@ -1464,8 +1463,9 @@ int kernel_kexec(void) */ error = device_power_down(PMSG_FREEZE); if (error) - goto Enable_irqs; + goto Unlock_pm; + local_irq_disable(); /* Suspend system devices */ error = sysdev_suspend(PMSG_FREEZE); if (error) @@ -1484,9 +1484,9 @@ int kernel_kexec(void) if (kexec_image->preserve_context) { sysdev_resume(); Power_up_devices: - device_power_up(PMSG_RESTORE); - Enable_irqs: local_irq_enable(); + device_power_up(PMSG_RESTORE); + Unlock_pm: device_pm_unlock(); enable_nonboot_cpus(); Resume_devices: Index: linux-2.6/include/linux/irq.h =================================================================== --- linux-2.6.orig/include/linux/irq.h +++ linux-2.6/include/linux/irq.h @@ -65,6 +65,7 @@ typedef void (*irq_flow_handler_t)(unsig #define IRQ_SPURIOUS_DISABLED 0x00800000 /* IRQ was disabled by the spurious trap */ #define IRQ_MOVE_PCNTXT 0x01000000 /* IRQ migration from process context */ #define IRQ_AFFINITY_SET 0x02000000 /* IRQ affinity was set from userspace*/ +#define IRQ_SUSPENDED 0x04000000 /* IRQ has gone through 
suspend sequence */ #ifdef CONFIG_IRQ_PER_CPU # define CHECK_IRQ_PER_CPU(var) ((var) & IRQ_PER_CPU) Index: linux-2.6/kernel/irq/pm.c =================================================================== --- /dev/null +++ linux-2.6/kernel/irq/pm.c @@ -0,0 +1,79 @@ +/* + * linux/kernel/irq/pm.c + * + * Copyright (C) 2009 Rafael J. Wysocki <rjw@sisk.pl>, Novell Inc. + * + * This file contains power management functions related to interrupts. + */ + +#include <linux/irq.h> +#include <linux/module.h> +#include <linux/interrupt.h> + +#include "internals.h" + +/** + * suspend_device_irqs - disable all currently enabled interrupt lines + * + * During system-wide suspend or hibernation device interrupts need to be + * disabled at the chip level and this function is provided for this purpose. + * It disables all interrupt lines that are enabled at the moment and sets the + * IRQ_SUSPENDED flag for them. + */ +void suspend_device_irqs(void) +{ + struct irq_desc *desc; + int irq; + + for_each_irq_desc(irq, desc) { + unsigned long flags; + + spin_lock_irqsave(&desc->lock, flags); + __disable_irq(desc, irq, true); + spin_unlock_irqrestore(&desc->lock, flags); + } + + for_each_irq_desc(irq, desc) + if (desc->status & IRQ_SUSPENDED) + synchronize_irq(irq); +} +EXPORT_SYMBOL_GPL(suspend_device_irqs); + +/** + * resume_device_irqs - enable interrupt lines disabled by suspend_device_irqs() + * + * Enable all interrupt lines previously disabled by suspend_device_irqs() that + * have the IRQ_SUSPENDED flag set. 
+ */ +void resume_device_irqs(void) +{ + struct irq_desc *desc; + int irq; + + for_each_irq_desc(irq, desc) { + unsigned long flags; + + if (!(desc->status & IRQ_SUSPENDED)) + continue; + + spin_lock_irqsave(&desc->lock, flags); + __enable_irq(desc, irq, true); + spin_unlock_irqrestore(&desc->lock, flags); + } +} +EXPORT_SYMBOL_GPL(resume_device_irqs); + +/** + * check_wakeup_irqs - check if any wake-up interrupts are pending + */ +int check_wakeup_irqs(void) +{ + struct irq_desc *desc; + int irq; + + for_each_irq_desc(irq, desc) + if ((desc->status & IRQ_WAKEUP) && (desc->status & IRQ_PENDING)) + return -EBUSY; + + return 0; +} Index: linux-2.6/kernel/irq/Makefile =================================================================== --- linux-2.6.orig/kernel/irq/Makefile +++ linux-2.6/kernel/irq/Makefile @@ -4,3 +4,4 @@ obj-$(CONFIG_GENERIC_IRQ_PROBE) += autop obj-$(CONFIG_PROC_FS) += proc.o obj-$(CONFIG_GENERIC_PENDING_IRQ) += migration.o obj-$(CONFIG_NUMA_MIGRATE_IRQ_DESC) += numa_migrate.o +obj-$(CONFIG_PM_SLEEP) += pm.o Index: linux-2.6/kernel/irq/manage.c =================================================================== --- linux-2.6.orig/kernel/irq/manage.c +++ linux-2.6/kernel/irq/manage.c @@ -162,6 +162,20 @@ static inline int do_irq_select_affinity } #endif +void __disable_irq(struct irq_desc *desc, unsigned int irq, bool suspend) +{ + if (suspend) { + if (desc->action && (desc->action->flags & IRQF_TIMER)) + return; + desc->status |= IRQ_SUSPENDED; + } + + if (!desc->depth++) { + desc->status |= IRQ_DISABLED; + desc->chip->disable(irq); + } +} + /** * disable_irq_nosync - disable an irq without waiting * @irq: Interrupt to disable @@ -182,10 +196,7 @@ void disable_irq_nosync(unsigned int irq return; spin_lock_irqsave(&desc->lock, flags); - if (!desc->depth++) { - desc->status |= IRQ_DISABLED; - desc->chip->disable(irq); - } + __disable_irq(desc, irq, false); spin_unlock_irqrestore(&desc->lock, flags); } EXPORT_SYMBOL(disable_irq_nosync); @@ -215,15 
+226,21 @@ void disable_irq(unsigned int irq) } EXPORT_SYMBOL(disable_irq); -static void __enable_irq(struct irq_desc *desc, unsigned int irq) +void __enable_irq(struct irq_desc *desc, unsigned int irq, bool resume) { + if (resume) + desc->status &= ~IRQ_SUSPENDED; + switch (desc->depth) { case 0: + err_out: WARN(1, KERN_WARNING "Unbalanced enable for IRQ %d\n", irq); break; case 1: { unsigned int status = desc->status & ~IRQ_DISABLED; + if (desc->status & IRQ_SUSPENDED) + goto err_out; /* Prevent probing on this irq: */ desc->status = status | IRQ_NOPROBE; check_irq_resend(desc, irq); @@ -253,7 +270,7 @@ void enable_irq(unsigned int irq) return; spin_lock_irqsave(&desc->lock, flags); - __enable_irq(desc, irq); + __enable_irq(desc, irq, false); spin_unlock_irqrestore(&desc->lock, flags); } EXPORT_SYMBOL(enable_irq); @@ -511,7 +528,7 @@ __setup_irq(unsigned int irq, struct irq */ if (shared && (desc->status & IRQ_SPURIOUS_DISABLED)) { desc->status &= ~IRQ_SPURIOUS_DISABLED; - __enable_irq(desc, irq); + __enable_irq(desc, irq, false); } spin_unlock_irqrestore(&desc->lock, flags); Index: linux-2.6/drivers/base/power/main.c =================================================================== --- linux-2.6.orig/drivers/base/power/main.c +++ linux-2.6/drivers/base/power/main.c @@ -23,6 +23,7 @@ #include <linux/pm.h> #include <linux/resume-trace.h> #include <linux/rwsem.h> +#include <linux/interrupt.h> #include "../base.h" #include "power.h" @@ -305,7 +306,8 @@ static int resume_device_noirq(struct de * Execute the appropriate "noirq resume" callback for all devices marked * as DPM_OFF_IRQ. * - * Must be called with interrupts disabled and only one CPU running. + * Must be called under dpm_list_mtx. Device drivers should not receive + * interrupts while it's being executed. */ static void dpm_power_up(pm_message_t state) { @@ -326,14 +328,13 @@ static void dpm_power_up(pm_message_t st * device_power_up - Turn on all devices that need special attention. 
* @state: PM transition of the system being carried out. * - * Power on system devices, then devices that required we shut them down - * with interrupts disabled. - * - * Must be called with interrupts disabled. + * Call the "early" resume handlers and enable device drivers to receive + * interrupts. */ void device_power_up(pm_message_t state) { dpm_power_up(state); + resume_device_irqs(); } EXPORT_SYMBOL_GPL(device_power_up); @@ -558,16 +559,17 @@ static int suspend_device_noirq(struct d * device_power_down - Shut down special devices. * @state: PM transition of the system being carried out. * - * Power down devices that require interrupts to be disabled. - * Then power down system devices. + * Prevent device drivers from receiving interrupts and call the "late" + * suspend handlers. * - * Must be called with interrupts disabled and only one CPU running. + * Must be called under dpm_list_mtx. */ int device_power_down(pm_message_t state) { struct device *dev; int error = 0; + suspend_device_irqs(); list_for_each_entry_reverse(dev, &dpm_list, power.entry) { error = suspend_device_noirq(dev, state); if (error) { @@ -577,7 +579,7 @@ int device_power_down(pm_message_t state dev->power.status = DPM_OFF_IRQ; } if (error) - dpm_power_up(resume_event(state)); + device_power_up(resume_event(state)); return error; } EXPORT_SYMBOL_GPL(device_power_down); Index: linux-2.6/drivers/base/sys.c =================================================================== --- linux-2.6.orig/drivers/base/sys.c +++ linux-2.6/drivers/base/sys.c @@ -22,6 +22,7 @@ #include <linux/pm.h> #include <linux/device.h> #include <linux/mutex.h> +#include <linux/interrupt.h> #include "base.h" @@ -369,6 +370,13 @@ int sysdev_suspend(pm_message_t state) struct sysdev_driver *drv, *err_drv; int ret; + pr_debug("Checking wake-up interrupts\n"); + + /* Return error code if there are any wake-up interrupts pending */ + ret = check_wakeup_irqs(); + if (ret) + return ret; + pr_debug("Suspending System Devices\n"); 
list_for_each_entry_reverse(cls, &system_kset->list, kset.kobj.entry) { Index: linux-2.6/kernel/irq/internals.h =================================================================== --- linux-2.6.orig/kernel/irq/internals.h +++ linux-2.6/kernel/irq/internals.h @@ -12,6 +12,8 @@ extern void compat_irq_chip_set_default_ extern int __irq_set_trigger(struct irq_desc *desc, unsigned int irq, unsigned long flags); +extern void __disable_irq(struct irq_desc *desc, unsigned int irq, bool susp); +extern void __enable_irq(struct irq_desc *desc, unsigned int irq, bool resume); extern struct lock_class_key irq_desc_lock_class; extern void init_kstat_irqs(struct irq_desc *desc, int cpu, int nr); ^ permalink raw reply [flat|nested] 373+ messages in thread
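To make the new ordering in suspend_enter() easier to follow, the sketch below replays it as a user-space trace. It is only an illustration under stated assumptions: the toy_* names are hypothetical, the step strings are labels rather than real calls, and the two parameters stand in for the return value of device_power_down() and for check_wakeup_irqs() finding a pending wake-up interrupt.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <string.h>

#define TOY_MAX_STEPS 16

/* Records the order in which the modelled steps run. */
struct toy_trace {
	const char *step[TOY_MAX_STEPS];
	int n;
};

static void toy_trace_init(struct toy_trace *t)
{
	t->n = 0;
}

static void toy_log(struct toy_trace *t, const char *s)
{
	if (t->n < TOY_MAX_STEPS)
		t->step[t->n++] = s;
}

/*
 * Model of suspend_enter() after the patch.
 * power_down_err: what device_power_down() would return (0 on success).
 * wakeup_pending: whether check_wakeup_irqs() would see a pending wake-up IRQ.
 */
int toy_suspend_enter(struct toy_trace *t, int power_down_err,
		      bool wakeup_pending)
{
	int error;

	/* suspend_device_irqs() plus the "late" callbacks; CPU IRQs still on. */
	toy_log(t, "device_power_down");
	error = power_down_err;
	if (error)
		return error; /* "goto Done": it has already rolled itself back */

	toy_log(t, "cpu_irqs_off");

	/* sysdev_suspend() now starts with the wake-up interrupt check. */
	toy_log(t, "sysdev_suspend");
	error = wakeup_pending ? -EBUSY : 0;
	if (!error) {
		toy_log(t, "enter_sleep_state");
		toy_log(t, "sysdev_resume");
	}

	toy_log(t, "cpu_irqs_on");

	/* "early" callbacks, then resume_device_irqs(); CPU IRQs on again. */
	toy_log(t, "device_power_up");
	return error;
}
```

With no error the trace contains seven steps, and in every path that gets past device_power_down() the CPU interrupts are off only between "cpu_irqs_off" and "cpu_irqs_on", that is, around the sysdev handling. A pending wake-up interrupt drops the two middle steps and returns -EBUSY.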
* [update, rev. 6] Re: [PATCH 1/10] PM: Rework handling of interrupts during suspend-resume (rev. 5) 2009-03-12 13:36 ` Rafael J. Wysocki 2009-03-12 21:43 ` [update, rev. 6] " Rafael J. Wysocki @ 2009-03-12 21:43 ` Rafael J. Wysocki 2009-03-13 0:39 ` Ingo Molnar ` (4 more replies) 1 sibling, 5 replies; 373+ messages in thread From: Rafael J. Wysocki @ 2009-03-12 21:43 UTC (permalink / raw) To: Thomas Gleixner, Ingo Molnar Cc: pm list, LKML, Linus Torvalds, Eric W. Biederman, Benjamin Herrenschmidt, Jeremy Fitzhardinge, Len Brown, Jesse Barnes, Frans Pop, Arve Hjønnevåg

On Thursday 12 March 2009, Rafael J. Wysocki wrote:
> On Wednesday 11 March 2009, Thomas Gleixner wrote:
> > On Wed, 11 Mar 2009, Thomas Gleixner wrote:
> > > On Wed, 11 Mar 2009, Rafael J. Wysocki wrote:
> > > > On Wednesday 11 March 2009, Thomas Gleixner wrote:
> > > > > > +EXPORT_SYMBOL_GPL(suspend_device_irqs);
> > > > >
> > > > > I'm not too enthusiastic about this open coded implementation of
> > > > > disable_irq() with slightly different semantics.
> > > >
> > > > The difference in semantics is important IMO, otherwise I wouldn't have
> > > > done that. In particular, IMO, the condition should be under the spinlock
> > > > and I'd rather not synchronize all interrupts we don't really disable here.
> > >
> > > I don't say that the difference is not relevant. But the code is
> > > almost the same and disable_irq() could have the sync_irq optimization
> > > as well.
> >
> > Thought more about that. Avoiding the sync_irq() for irqs which have
> > no action associated is fine, but you need to catch the following case
> > as well:
> >
> > driver code calls disable_irq_nosync() from the handler (which is
> > still running)
> >
> > suspend code skips the sync due to depth > 0
> >
> > The sync operation is not that expensive.
>
> OK, what about this (untested, irrelevant parts skipped)?

Well, I guess I need to assume that no reaction means it's fine. ;-)

Below is the complete patch.

Thomas, Ingo, please let me know if it is fine with you.

Thanks,
Rafael

---
From: Rafael J. Wysocki <rjw@sisk.pl>
Subject: PM: Rework handling of interrupts during suspend-resume (rev. 6)

Introduce two helper functions allowing us to prevent device drivers from getting any interrupts (without disabling interrupts on the CPU) during suspend (or hibernation) and to make them start to receive interrupts again during the subsequent resume, respectively. These functions make it possible to keep timer interrupts enabled while the "late" suspend and "early" resume callbacks provided by device drivers are being executed.

Use these functions to rework the handling of interrupts during suspend (hibernation) and resume. Namely, interrupts will only be disabled on the CPU right before suspending sysdevs, while device drivers will be prevented from receiving interrupts, with the help of the new helper function, before their "late" suspend callbacks run (and analogously during resume).

In addition, since the device interrupts are now disabled before the CPU has turned all interrupts off and the CPU will ACK the interrupts setting the IRQ_PENDING bit for them, check in sysdev_suspend() if any wake-up interrupts are pending and abort suspend if that's the case.

Signed-off-by: Rafael J.
Wysocki <rjw@sisk.pl> --- arch/x86/kernel/apm_32.c | 15 ++++++-- drivers/base/power/main.c | 20 ++++++----- drivers/base/sys.c | 8 ++++ drivers/xen/manage.c | 16 +++++---- include/linux/interrupt.h | 5 ++ include/linux/irq.h | 1 kernel/irq/Makefile | 1 kernel/irq/internals.h | 2 + kernel/irq/manage.c | 31 +++++++++++++----- kernel/irq/pm.c | 79 ++++++++++++++++++++++++++++++++++++++++++++++ kernel/kexec.c | 8 ++-- kernel/power/disk.c | 39 ++++++++++++++++------ kernel/power/main.c | 17 ++++++--- 13 files changed, 195 insertions(+), 47 deletions(-) Index: linux-2.6/include/linux/interrupt.h =================================================================== --- linux-2.6.orig/include/linux/interrupt.h +++ linux-2.6/include/linux/interrupt.h @@ -106,6 +106,11 @@ extern void disable_irq_nosync(unsigned extern void disable_irq(unsigned int irq); extern void enable_irq(unsigned int irq); +/* The following three functions are for the core kernel use only. */ +extern void suspend_device_irqs(void); +extern void resume_device_irqs(void); +extern int check_wakeup_irqs(void); + #if defined(CONFIG_SMP) && defined(CONFIG_GENERIC_HARDIRQS) extern cpumask_var_t irq_default_affinity; Index: linux-2.6/kernel/power/main.c =================================================================== --- linux-2.6.orig/kernel/power/main.c +++ linux-2.6/kernel/power/main.c @@ -287,17 +287,19 @@ void __attribute__ ((weak)) arch_suspend */ static int suspend_enter(suspend_state_t state) { - int error = 0; + int error; device_pm_lock(); - arch_suspend_disable_irqs(); - BUG_ON(!irqs_disabled()); - if ((error = device_power_down(PMSG_SUSPEND))) { + error = device_power_down(PMSG_SUSPEND); + if (error) { printk(KERN_ERR "PM: Some devices failed to power down\n"); goto Done; } + arch_suspend_disable_irqs(); + BUG_ON(!irqs_disabled()); + error = sysdev_suspend(PMSG_SUSPEND); if (!error) { if (!suspend_test(TEST_CORE)) @@ -305,11 +307,14 @@ static int suspend_enter(suspend_state_t sysdev_resume(); } - 
device_power_up(PMSG_RESUME); - Done: arch_suspend_enable_irqs(); BUG_ON(irqs_disabled()); + + device_power_up(PMSG_RESUME); + + Done: device_pm_unlock(); + return error; } Index: linux-2.6/kernel/power/disk.c =================================================================== --- linux-2.6.orig/kernel/power/disk.c +++ linux-2.6/kernel/power/disk.c @@ -214,7 +214,7 @@ static int create_image(int platform_mod return error; device_pm_lock(); - local_irq_disable(); + /* At this point, device_suspend() has been called, but *not* * device_power_down(). We *must* call device_power_down() now. * Otherwise, drivers for some devices (e.g. interrupt controllers) @@ -225,8 +225,11 @@ static int create_image(int platform_mod if (error) { printk(KERN_ERR "PM: Some devices failed to power down, " "aborting hibernation\n"); - goto Enable_irqs; + goto Unlock; } + + local_irq_disable(); + sysdev_suspend(PMSG_FREEZE); if (error) { printk(KERN_ERR "PM: Some devices failed to power down, " @@ -252,12 +255,16 @@ static int create_image(int platform_mod /* NOTE: device_power_up() is just a resume() for devices * that suspended with irqs off ... no overall powerup. */ + Power_up_devices: + local_irq_enable(); + device_power_up(in_suspend ? (error ? 
PMSG_RECOVER : PMSG_THAW) : PMSG_RESTORE); - Enable_irqs: - local_irq_enable(); + + Unlock: device_pm_unlock(); + return error; } @@ -336,13 +343,16 @@ static int resume_target_kernel(void) int error; device_pm_lock(); - local_irq_disable(); + error = device_power_down(PMSG_QUIESCE); if (error) { printk(KERN_ERR "PM: Some devices failed to power down, " "aborting resume\n"); - goto Enable_irqs; + goto Unlock; } + + local_irq_disable(); + sysdev_suspend(PMSG_QUIESCE); /* We'll ignore saved state, but this gets preempt count (etc) right */ save_processor_state(); @@ -366,11 +376,16 @@ static int resume_target_kernel(void) swsusp_free(); restore_processor_state(); touch_softlockup_watchdog(); + sysdev_resume(); - device_power_up(PMSG_RECOVER); - Enable_irqs: + local_irq_enable(); + + device_power_up(PMSG_RECOVER); + + Unlock: device_pm_unlock(); + return error; } @@ -447,15 +462,16 @@ int hibernation_platform_enter(void) goto Finish; device_pm_lock(); - local_irq_disable(); + error = device_power_down(PMSG_HIBERNATE); if (!error) { + local_irq_disable(); sysdev_suspend(PMSG_HIBERNATE); hibernation_ops->enter(); /* We should never get here */ while (1); } - local_irq_enable(); + device_pm_unlock(); /* @@ -464,12 +480,15 @@ int hibernation_platform_enter(void) */ Finish: hibernation_ops->finish(); + Resume_devices: entering_platform_hibernation = false; device_resume(PMSG_RESTORE); resume_console(); + Close: hibernation_ops->end(); + return error; } Index: linux-2.6/arch/x86/kernel/apm_32.c =================================================================== --- linux-2.6.orig/arch/x86/kernel/apm_32.c +++ linux-2.6/arch/x86/kernel/apm_32.c @@ -1190,8 +1190,10 @@ static int suspend(int vetoable) struct apm_user *as; device_suspend(PMSG_SUSPEND); - local_irq_disable(); + device_power_down(PMSG_SUSPEND); + + local_irq_disable(); sysdev_suspend(PMSG_SUSPEND); local_irq_enable(); @@ -1209,9 +1211,12 @@ static int suspend(int vetoable) if (err != APM_SUCCESS) 
apm_error("suspend", err); err = (err == APM_SUCCESS) ? 0 : -EIO; + sysdev_resume(); - device_power_up(PMSG_RESUME); local_irq_enable(); + + device_power_up(PMSG_RESUME); + device_resume(PMSG_RESUME); queue_event(APM_NORMAL_RESUME, NULL); spin_lock(&user_list_lock); @@ -1228,8 +1233,9 @@ static void standby(void) { int err; - local_irq_disable(); device_power_down(PMSG_SUSPEND); + + local_irq_disable(); sysdev_suspend(PMSG_SUSPEND); local_irq_enable(); @@ -1239,8 +1245,9 @@ static void standby(void) local_irq_disable(); sysdev_resume(); - device_power_up(PMSG_RESUME); local_irq_enable(); + + device_power_up(PMSG_RESUME); } static apm_event_t get_event(void) Index: linux-2.6/drivers/xen/manage.c =================================================================== --- linux-2.6.orig/drivers/xen/manage.c +++ linux-2.6/drivers/xen/manage.c @@ -39,12 +39,6 @@ static int xen_suspend(void *data) BUG_ON(!irqs_disabled()); - err = device_power_down(PMSG_SUSPEND); - if (err) { - printk(KERN_ERR "xen_suspend: device_power_down failed: %d\n", - err); - return err; - } err = sysdev_suspend(PMSG_SUSPEND); if (err) { printk(KERN_ERR "xen_suspend: sysdev_suspend failed: %d\n", @@ -69,7 +63,6 @@ static int xen_suspend(void *data) xen_mm_unpin_all(); sysdev_resume(); - device_power_up(PMSG_RESUME); if (!*cancelled) { xen_irq_resume(); @@ -108,6 +101,12 @@ static void do_suspend(void) /* XXX use normal device tree? 
 	 */
 	xenbus_suspend();
 
+	err = device_power_down(PMSG_SUSPEND);
+	if (err) {
+		printk(KERN_ERR "device_power_down failed: %d\n", err);
+		goto resume_devices;
+	}
+
 	err = stop_machine(xen_suspend, &cancelled, &cpumask_of_cpu(0));
 	if (err) {
 		printk(KERN_ERR "failed to start xen_suspend: %d\n", err);
@@ -120,6 +119,9 @@ static void do_suspend(void)
 	} else
 		xenbus_suspend_cancel();
 
+	device_power_up(PMSG_RESUME);
+
+resume_devices:
 	device_resume(PMSG_RESUME);
 
 	/* Make sure timer events get retriggered on all CPUs */
Index: linux-2.6/kernel/kexec.c
===================================================================
--- linux-2.6.orig/kernel/kexec.c
+++ linux-2.6/kernel/kexec.c
@@ -1454,7 +1454,6 @@ int kernel_kexec(void)
 		if (error)
 			goto Resume_devices;
 		device_pm_lock();
-		local_irq_disable();
 		/* At this point, device_suspend() has been called,
 		 * but *not* device_power_down(). We *must*
 		 * device_power_down() now. Otherwise, drivers for
@@ -1464,8 +1463,9 @@ int kernel_kexec(void)
 		 */
 		error = device_power_down(PMSG_FREEZE);
 		if (error)
-			goto Enable_irqs;
+			goto Unlock_pm;
 
+		local_irq_disable();
 		/* Suspend system devices */
 		error = sysdev_suspend(PMSG_FREEZE);
 		if (error)
@@ -1484,9 +1484,9 @@ int kernel_kexec(void)
 	if (kexec_image->preserve_context) {
 		sysdev_resume();
  Power_up_devices:
-		device_power_up(PMSG_RESTORE);
- Enable_irqs:
 		local_irq_enable();
+		device_power_up(PMSG_RESTORE);
+ Unlock_pm:
 		device_pm_unlock();
 		enable_nonboot_cpus();
  Resume_devices:
Index: linux-2.6/include/linux/irq.h
===================================================================
--- linux-2.6.orig/include/linux/irq.h
+++ linux-2.6/include/linux/irq.h
@@ -65,6 +65,7 @@ typedef void (*irq_flow_handler_t)(unsig
 #define IRQ_SPURIOUS_DISABLED	0x00800000	/* IRQ was disabled by the spurious trap */
 #define IRQ_MOVE_PCNTXT		0x01000000	/* IRQ migration from process context */
 #define IRQ_AFFINITY_SET	0x02000000	/* IRQ affinity was set from userspace*/
+#define IRQ_SUSPENDED		0x04000000	/* IRQ has gone through suspend sequence */
 
 #ifdef CONFIG_IRQ_PER_CPU
 # define CHECK_IRQ_PER_CPU(var) ((var) & IRQ_PER_CPU)
Index: linux-2.6/kernel/irq/pm.c
===================================================================
--- /dev/null
+++ linux-2.6/kernel/irq/pm.c
@@ -0,0 +1,79 @@
+/*
+ * linux/kernel/irq/pm.c
+ *
+ * Copyright (C) 2009 Rafael J. Wysocki <rjw@sisk.pl>, Novell Inc.
+ *
+ * This file contains power management functions related to interrupts.
+ */
+
+#include <linux/irq.h>
+#include <linux/module.h>
+#include <linux/interrupt.h>
+
+#include "internals.h"
+
+/**
+ * suspend_device_irqs - disable all currently enabled interrupt lines
+ *
+ * During system-wide suspend or hibernation device interrupts need to be
+ * disabled at the chip level and this function is provided for this purpose.
+ * It disables all interrupt lines that are enabled at the moment and sets the
+ * IRQ_SUSPENDED flag for them.
+ */
+void suspend_device_irqs(void)
+{
+	struct irq_desc *desc;
+	int irq;
+
+	for_each_irq_desc(irq, desc) {
+		unsigned long flags;
+
+		spin_lock_irqsave(&desc->lock, flags);
+		__disable_irq(desc, irq, true);
+		spin_unlock_irqrestore(&desc->lock, flags);
+	}
+
+	for_each_irq_desc(irq, desc)
+		if (desc->status & IRQ_SUSPENDED)
+			synchronize_irq(irq);
+}
+EXPORT_SYMBOL_GPL(suspend_device_irqs);
+
+/**
+ * resume_device_irqs - enable interrupt lines disabled by suspend_device_irqs()
+ *
+ * Enable all interrupt lines previously disabled by suspend_device_irqs() that
+ * have the IRQ_SUSPENDED flag set.
+ */
+void resume_device_irqs(void)
+{
+	struct irq_desc *desc;
+	int irq;
+
+	for_each_irq_desc(irq, desc) {
+		unsigned long flags;
+
+		if (!(desc->status & IRQ_SUSPENDED))
+			continue;
+
+		spin_lock_irqsave(&desc->lock, flags);
+		__enable_irq(desc, irq, true);
+		spin_unlock_irqrestore(&desc->lock, flags);
+	}
+}
+EXPORT_SYMBOL_GPL(resume_device_irqs);
+
+/**
+ * check_wakeup_irqs - check if any wake-up interrupts are pending
+ */
+int check_wakeup_irqs(void)
+{
+	struct irq_desc *desc;
+	int irq;
+
+	for_each_irq_desc(irq, desc)
+		if ((desc->status & IRQ_WAKEUP) && (desc->status & IRQ_PENDING))
+			return -EBUSY;
+
+	return 0;
+}
Index: linux-2.6/kernel/irq/Makefile
===================================================================
--- linux-2.6.orig/kernel/irq/Makefile
+++ linux-2.6/kernel/irq/Makefile
@@ -4,3 +4,4 @@ obj-$(CONFIG_GENERIC_IRQ_PROBE) += autop
 obj-$(CONFIG_PROC_FS) += proc.o
 obj-$(CONFIG_GENERIC_PENDING_IRQ) += migration.o
 obj-$(CONFIG_NUMA_MIGRATE_IRQ_DESC) += numa_migrate.o
+obj-$(CONFIG_PM_SLEEP) += pm.o
Index: linux-2.6/kernel/irq/manage.c
===================================================================
--- linux-2.6.orig/kernel/irq/manage.c
+++ linux-2.6/kernel/irq/manage.c
@@ -162,6 +162,20 @@ static inline int do_irq_select_affinity
 }
 #endif
 
+void __disable_irq(struct irq_desc *desc, unsigned int irq, bool suspend)
+{
+	if (suspend) {
+		if (desc->action && (desc->action->flags & IRQF_TIMER))
+			return;
+		desc->status |= IRQ_SUSPENDED;
+	}
+
+	if (!desc->depth++) {
+		desc->status |= IRQ_DISABLED;
+		desc->chip->disable(irq);
+	}
+}
+
 /**
  * disable_irq_nosync - disable an irq without waiting
  * @irq: Interrupt to disable
@@ -182,10 +196,7 @@ void disable_irq_nosync(unsigned int irq
 		return;
 
 	spin_lock_irqsave(&desc->lock, flags);
-	if (!desc->depth++) {
-		desc->status |= IRQ_DISABLED;
-		desc->chip->disable(irq);
-	}
+	__disable_irq(desc, irq, false);
 	spin_unlock_irqrestore(&desc->lock, flags);
 }
 EXPORT_SYMBOL(disable_irq_nosync);
@@ -215,15 +226,21 @@ void disable_irq(unsigned int irq)
 }
 EXPORT_SYMBOL(disable_irq);
 
-static void __enable_irq(struct irq_desc *desc, unsigned int irq)
+void __enable_irq(struct irq_desc *desc, unsigned int irq, bool resume)
 {
+	if (resume)
+		desc->status &= ~IRQ_SUSPENDED;
+
 	switch (desc->depth) {
 	case 0:
+ err_out:
 		WARN(1, KERN_WARNING "Unbalanced enable for IRQ %d\n", irq);
 		break;
 	case 1: {
 		unsigned int status = desc->status & ~IRQ_DISABLED;
 
+		if (desc->status & IRQ_SUSPENDED)
+			goto err_out;
 		/* Prevent probing on this irq: */
 		desc->status = status | IRQ_NOPROBE;
 		check_irq_resend(desc, irq);
@@ -253,7 +270,7 @@ void enable_irq(unsigned int irq)
 		return;
 
 	spin_lock_irqsave(&desc->lock, flags);
-	__enable_irq(desc, irq);
+	__enable_irq(desc, irq, false);
 	spin_unlock_irqrestore(&desc->lock, flags);
 }
 EXPORT_SYMBOL(enable_irq);
@@ -511,7 +528,7 @@ __setup_irq(unsigned int irq, struct irq
 	 */
 	if (shared && (desc->status & IRQ_SPURIOUS_DISABLED)) {
 		desc->status &= ~IRQ_SPURIOUS_DISABLED;
-		__enable_irq(desc, irq);
+		__enable_irq(desc, irq, false);
 	}
 
 	spin_unlock_irqrestore(&desc->lock, flags);
Index: linux-2.6/drivers/base/power/main.c
===================================================================
--- linux-2.6.orig/drivers/base/power/main.c
+++ linux-2.6/drivers/base/power/main.c
@@ -23,6 +23,7 @@
 #include <linux/pm.h>
 #include <linux/resume-trace.h>
 #include <linux/rwsem.h>
+#include <linux/interrupt.h>
 
 #include "../base.h"
 #include "power.h"
@@ -305,7 +306,8 @@ static int resume_device_noirq(struct de
  * Execute the appropriate "noirq resume" callback for all devices marked
  * as DPM_OFF_IRQ.
  *
- * Must be called with interrupts disabled and only one CPU running.
+ * Must be called under dpm_list_mtx. Device drivers should not receive
+ * interrupts while it's being executed.
  */
 static void dpm_power_up(pm_message_t state)
 {
@@ -326,14 +328,13 @@ static void dpm_power_up(pm_message_t st
  * device_power_up - Turn on all devices that need special attention.
  * @state: PM transition of the system being carried out.
  *
- * Power on system devices, then devices that required we shut them down
- * with interrupts disabled.
- *
- * Must be called with interrupts disabled.
+ * Call the "early" resume handlers and enable device drivers to receive
+ * interrupts.
  */
 void device_power_up(pm_message_t state)
 {
 	dpm_power_up(state);
+	resume_device_irqs();
 }
 EXPORT_SYMBOL_GPL(device_power_up);
 
@@ -558,16 +559,17 @@ static int suspend_device_noirq(struct d
  * device_power_down - Shut down special devices.
  * @state: PM transition of the system being carried out.
  *
- * Power down devices that require interrupts to be disabled.
- * Then power down system devices.
+ * Prevent device drivers from receiving interrupts and call the "late"
+ * suspend handlers.
  *
- * Must be called with interrupts disabled and only one CPU running.
+ * Must be called under dpm_list_mtx.
  */
 int device_power_down(pm_message_t state)
 {
 	struct device *dev;
 	int error = 0;
 
+	suspend_device_irqs();
 	list_for_each_entry_reverse(dev, &dpm_list, power.entry) {
 		error = suspend_device_noirq(dev, state);
 		if (error) {
@@ -577,7 +579,7 @@ int device_power_down(pm_message_t state
 		dev->power.status = DPM_OFF_IRQ;
 	}
 	if (error)
-		dpm_power_up(resume_event(state));
+		device_power_up(resume_event(state));
 	return error;
 }
 EXPORT_SYMBOL_GPL(device_power_down);
Index: linux-2.6/drivers/base/sys.c
===================================================================
--- linux-2.6.orig/drivers/base/sys.c
+++ linux-2.6/drivers/base/sys.c
@@ -22,6 +22,7 @@
 #include <linux/pm.h>
 #include <linux/device.h>
 #include <linux/mutex.h>
+#include <linux/interrupt.h>
 
 #include "base.h"
 
@@ -369,6 +370,13 @@ int sysdev_suspend(pm_message_t state)
 	struct sysdev_driver *drv, *err_drv;
 	int ret;
 
+	pr_debug("Checking wake-up interrupts\n");
+
+	/* Return error code if there are any wake-up interrupts pending */
+	ret = check_wakeup_irqs();
+	if (ret)
+		return ret;
+
 	pr_debug("Suspending System Devices\n");
 
 	list_for_each_entry_reverse(cls, &system_kset->list, kset.kobj.entry) {
Index: linux-2.6/kernel/irq/internals.h
===================================================================
--- linux-2.6.orig/kernel/irq/internals.h
+++ linux-2.6/kernel/irq/internals.h
@@ -12,6 +12,8 @@ extern void compat_irq_chip_set_default_
 
 extern int __irq_set_trigger(struct irq_desc *desc, unsigned int irq,
 		unsigned long flags);
+extern void __disable_irq(struct irq_desc *desc, unsigned int irq, bool susp);
+extern void __enable_irq(struct irq_desc *desc, unsigned int irq, bool resume);
 
 extern struct lock_class_key irq_desc_lock_class;
 extern void init_kstat_irqs(struct irq_desc *desc, int cpu, int nr);

^ permalink raw reply	[flat|nested] 373+ messages in thread
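The interplay between the nested disable count and the new IRQ_SUSPENDED flag in the patch above can be followed more easily in a small userspace model. The sketch below is only an illustration under stated assumptions: struct model_desc, the model_* functions and the numeric flag values are hypothetical stand-ins, the chip-level callback, locking and synchronize_irq() are left out, and an unbalanced enable is reported through a return value where the kernel code would WARN.

```c
#include <stdbool.h>
#include <stddef.h>

/* Placeholder bit values; only the names follow the patch. */
#define IRQ_DISABLED	0x1u
#define IRQ_SUSPENDED	0x2u
#define IRQF_TIMER	0x4u

/* Simplified, hypothetical stand-in for struct irq_desc. */
struct model_desc {
	unsigned int status;		/* IRQ_DISABLED / IRQ_SUSPENDED bits */
	unsigned int depth;		/* nested disable count */
	unsigned int action_flags;	/* flags of the registered handler */
	bool has_action;		/* true if a handler is registered */
};

/* Follows __disable_irq() from the patch, minus the chip callback. */
static void model_disable_irq(struct model_desc *desc, bool suspend)
{
	if (suspend) {
		/* Timer interrupts are left alone during suspend. */
		if (desc->has_action && (desc->action_flags & IRQF_TIMER))
			return;
		desc->status |= IRQ_SUSPENDED;
	}

	if (!desc->depth++)
		desc->status |= IRQ_DISABLED;
}

/* Follows __enable_irq(); returns false where the kernel would WARN. */
static bool model_enable_irq(struct model_desc *desc, bool resume)
{
	if (resume)
		desc->status &= ~IRQ_SUSPENDED;

	switch (desc->depth) {
	case 0:
		return false;		/* unbalanced enable */
	case 1:
		/* A suspended line may only be re-enabled by the resume path. */
		if (desc->status & IRQ_SUSPENDED)
			return false;
		desc->status &= ~IRQ_DISABLED;
		/* fall through */
	default:
		desc->depth--;
	}
	return true;
}

static void model_suspend_device_irqs(struct model_desc *descs, size_t n)
{
	size_t i;

	for (i = 0; i < n; i++)
		model_disable_irq(&descs[i], true);
}

static void model_resume_device_irqs(struct model_desc *descs, size_t n)
{
	size_t i;

	for (i = 0; i < n; i++)
		if (descs[i].status & IRQ_SUSPENDED)
			model_enable_irq(&descs[i], true);
}
```

In this model a line that a driver had already disabled before suspend goes from depth 1 to depth 2 and back to depth 1, so it stays disabled after resume, while a timer line is never touched.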
* Re: [update, rev. 6] Re: [PATCH 1/10] PM: Rework handling of interrupts during suspend-resume (rev. 5)
  2009-03-12 21:43 ` Rafael J. Wysocki
@ 2009-03-13 0:39 ` Ingo Molnar
  2009-03-13 0:39 ` Ingo Molnar
  ` (3 subsequent siblings)
  4 siblings, 0 replies; 373+ messages in thread
From: Ingo Molnar @ 2009-03-13 0:39 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Jeremy Fitzhardinge, Frans Pop, LKML, Jesse Barnes,
	Eric W. Biederman, Thomas Gleixner, Linus Torvalds, pm list

* Rafael J. Wysocki <rjw@sisk.pl> wrote:

> On Thursday 12 March 2009, Rafael J. Wysocki wrote:
> > On Wednesday 11 March 2009, Thomas Gleixner wrote:
> > > On Wed, 11 Mar 2009, Thomas Gleixner wrote:
> > > > On Wed, 11 Mar 2009, Rafael J. Wysocki wrote:
> > > > > On Wednesday 11 March 2009, Thomas Gleixner wrote:
> > > > > > > +EXPORT_SYMBOL_GPL(suspend_device_irqs);
> > > > > >
> > > > > > I'm not too enthusiastic about this open coded implementation of
> > > > > > disable_irq() with slightly different semantics.
> > > > >
> > > > > The difference in semantics is important IMO, otherwise I wouldn't have
> > > > > done that. In particular, IMO, the condition should be under the spinlock
> > > > > and I'd rather not synchronize all interrupts we don't really disable here.
> > > >
> > > > I don't say that the difference is not relevant. But the code is
> > > > almost the same and disable_irq() could have the sync_irq optimization
> > > > as well.
> > >
> > > Thought more about that. Avoiding the sync_irq() for irqs which have
> > > no action associated is fine, but you need to catch the following case
> > > as well:
> > >
> > >   driver code calls disable_irq_nosync() from the handler (which is
> > >   still running)
> > >
> > >   suspend code skips the sync due to depth > 0
> > >
> > > The sync operation is not that expensive.
> >
> > OK, what about this (untested, irrelevant parts skipped)?
>
> Well, I guess I need to assume that no reaction means it's fine. ;-)
>
> Below is the complete patch. Thomas, Ingo, please let me know
> if it is fine with you.

looks good - but you sure want to split it up some more, right?

> 13 files changed, 195 insertions(+), 47 deletions(-)

We want the non-intrusive 'add new APIs' bits [which give most
of the linecount] separated from the 'all hell breaks loose'
functional changes ;-) Makes it easier to revert, bisect, etc.

	Ingo

^ permalink raw reply	[flat|nested] 373+ messages in thread
* Re: [update, rev. 6] Re: [PATCH 1/10] PM: Rework handling of interrupts during suspend-resume (rev. 5)
  2009-03-13 0:39 ` Ingo Molnar
@ 2009-03-13 17:07 ` Rafael J. Wysocki
  0 siblings, 0 replies; 373+ messages in thread
From: Rafael J. Wysocki @ 2009-03-13 17:07 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Thomas Gleixner, pm list, LKML, Linus Torvalds, Eric W. Biederman,
	Benjamin Herrenschmidt, Jeremy Fitzhardinge, Len Brown,
	Jesse Barnes, Frans Pop, Arve Hjønnevåg

On Friday 13 March 2009, Ingo Molnar wrote:
>
> * Rafael J. Wysocki <rjw@sisk.pl> wrote:
>
> > On Thursday 12 March 2009, Rafael J. Wysocki wrote:
> > > On Wednesday 11 March 2009, Thomas Gleixner wrote:
> > > > On Wed, 11 Mar 2009, Thomas Gleixner wrote:
> > > > > On Wed, 11 Mar 2009, Rafael J. Wysocki wrote:
> > > > > > On Wednesday 11 March 2009, Thomas Gleixner wrote:
> > > > > > > > +EXPORT_SYMBOL_GPL(suspend_device_irqs);
> > > > > > >
> > > > > > > I'm not too enthusiastic about this open coded implementation of
> > > > > > > disable_irq() with slightly different semantics.
> > > > > >
> > > > > > The difference in semantics is important IMO, otherwise I wouldn't have
> > > > > > done that. In particular, IMO, the condition should be under the spinlock
> > > > > > and I'd rather not synchronize all interrupts we don't really disable here.
> > > > >
> > > > > I don't say that the difference is not relevant. But the code is
> > > > > almost the same and disable_irq() could have the sync_irq optimization
> > > > > as well.
> > > >
> > > > Thought more about that. Avoiding the sync_irq() for irqs which have
> > > > no action associated is fine, but you need to catch the following case
> > > > as well:
> > > >
> > > >   driver code calls disable_irq_nosync() from the handler (which is
> > > >   still running)
> > > >
> > > >   suspend code skips the sync due to depth > 0
> > > >
> > > > The sync operation is not that expensive.
> > >
> > > OK, what about this (untested, irrelevant parts skipped)?
> >
> > Well, I guess I need to assume that no reaction means it's fine. ;-)
> >
> > Below is the complete patch. Thomas, Ingo, please let me know
> > if it is fine with you.
>
> looks good - but you sure want to split it up some more, right?

Well, in fact I didn't think about that.

> > 13 files changed, 195 insertions(+), 47 deletions(-)
>
> We want the non-intrusive 'add new APIs' bits [which give most
> of the linecount] separated from the 'all hell breaks loose'
> functional changes ;-) Makes it easier to revert, bisect, etc.

I can split it into a patch adding the new functions under kernel/irq and
another one making the suspend code use them, but that's going to put the new
functions somewhat out of context, IMO. Still, I'll do it if you want me to.

Thanks,
Rafael

^ permalink raw reply	[flat|nested] 373+ messages in thread
* Re: [update, rev. 6] Re: [PATCH 1/10] PM: Rework handling of interrupts during suspend-resume (rev. 5)
  2009-03-12 21:43 ` Rafael J. Wysocki
@ 2009-03-13 7:15 ` Arve Hjønnevåg
  2009-03-13 0:39 ` Ingo Molnar
  ` (3 subsequent siblings)
  4 siblings, 0 replies; 373+ messages in thread
From: Arve Hjønnevåg @ 2009-03-13 7:15 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Thomas Gleixner, Ingo Molnar, pm list, LKML, Linus Torvalds,
	Eric W. Biederman, Benjamin Herrenschmidt, Jeremy Fitzhardinge,
	Len Brown, Jesse Barnes, Frans Pop

On Thu, Mar 12, 2009 at 1:43 PM, Rafael J. Wysocki <rjw@sisk.pl> wrote:
>
> +void __disable_irq(struct irq_desc *desc, unsigned int irq, bool suspend)
> +{
> +	if (suspend) {
> +		if (desc->action && (desc->action->flags & IRQF_TIMER))
> +			return;

Don't you want "(!desc->action || ..." here to avoid enabling unused
interrupts on resume?

-- 
Arve Hjønnevåg

^ permalink raw reply	[flat|nested] 373+ messages in thread
* Re: [update, rev. 6] Re: [PATCH 1/10] PM: Rework handling of interrupts during suspend-resume (rev. 5)
  2009-03-13 7:15 ` Arve Hjønnevåg
  (?)
@ 2009-03-13 16:53 ` Rafael J. Wysocki
  -1 siblings, 0 replies; 373+ messages in thread
From: Rafael J. Wysocki @ 2009-03-13 16:53 UTC (permalink / raw)
  To: Arve Hjønnevåg
  Cc: Thomas Gleixner, Ingo Molnar, pm list, LKML, Linus Torvalds,
	Eric W. Biederman, Benjamin Herrenschmidt, Jeremy Fitzhardinge,
	Len Brown, Jesse Barnes, Frans Pop

On Friday 13 March 2009, Arve Hjønnevåg wrote:
> On Thu, Mar 12, 2009 at 1:43 PM, Rafael J. Wysocki <rjw@sisk.pl> wrote:
> >
> > +void __disable_irq(struct irq_desc *desc, unsigned int irq, bool suspend)
> > +{
> > +	if (suspend) {
> > +		if (desc->action && (desc->action->flags & IRQF_TIMER))
> > +			return;
>
> Don't you want "(!desc->action || ..." here to avoid enabling unused
> interrupts on resume?

Hmm, good idea, thanks.

Best,
Rafael

^ permalink raw reply	[flat|nested] 373+ messages in thread
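The difference between the condition posted in rev. 6 and the one suggested in the reply can be seen by writing both as predicates. This is a minimal sketch, not the code that was eventually merged: struct model_action, struct model_desc, the two skip_on_suspend_* helpers and the IRQF_TIMER value are hypothetical names chosen for illustration.

```c
#include <stdbool.h>
#include <stddef.h>

#define IRQF_TIMER 0x1u	/* placeholder value for this sketch */

/* Hypothetical stand-ins for struct irqaction and struct irq_desc. */
struct model_action {
	unsigned int flags;
};

struct model_desc {
	const struct model_action *action;	/* NULL if no handler is registered */
};

/* Condition as posted in rev. 6: only timer interrupts are skipped. */
static bool skip_on_suspend_rev6(const struct model_desc *desc)
{
	return desc->action && (desc->action->flags & IRQF_TIMER);
}

/* Condition as suggested in the reply: lines without a handler are skipped too. */
static bool skip_on_suspend_suggested(const struct model_desc *desc)
{
	return !desc->action || (desc->action->flags & IRQF_TIMER);
}
```

With the suggested form, a line that has no handler never gets IRQ_SUSPENDED set, so the resume path has nothing to undo for it. The suggested form also stays safe to evaluate when desc->action is NULL, because the || short-circuits before the flags are read.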
* Re: [update, rev. 6] Re: [PATCH 1/10] PM: Rework handling of interrupts during suspend-resume (rev. 5)
  2009-03-12 21:43 ` Rafael J. Wysocki
  ` (2 preceding siblings ...)
  2009-03-13 7:15 ` Arve Hjønnevåg
@ 2009-03-13 19:55 ` Thomas Gleixner
  2009-03-13 19:55 ` Thomas Gleixner
  4 siblings, 0 replies; 373+ messages in thread
From: Thomas Gleixner @ 2009-03-13 19:55 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Jeremy Fitzhardinge, Frans Pop, LKML, Jesse Barnes,
	Eric W. Biederman, Ingo Molnar, Linus Torvalds, pm list

On Thu, 12 Mar 2009, Rafael J. Wysocki wrote:
> +/**
> + * suspend_device_irqs - disable all currently enabled interrupt lines
> + *
> + * During system-wide suspend or hibernation device interrupts need to be
> + * disabled at the chip level and this function is provided for this purpose.
> + * It disables all interrupt lines that are enabled at the moment and sets the
> + * IRQ_SUSPENDED flag for them.
> + */
> +void suspend_device_irqs(void)
> +{
> +	struct irq_desc *desc;
> +	int irq;
> +
> +	for_each_irq_desc(irq, desc) {
> +		unsigned long flags;
> +
> +		spin_lock_irqsave(&desc->lock, flags);
> +		__disable_irq(desc, irq, true);
> +		spin_unlock_irqrestore(&desc->lock, flags);

Can we move the locking into __disable_irq ?

> +	}
> +
> +	for_each_irq_desc(irq, desc)
> +		if (desc->status & IRQ_SUSPENDED)
> +			synchronize_irq(irq);
> +}
> +EXPORT_SYMBOL_GPL(suspend_device_irqs);
> +
> +/**
> + * resume_device_irqs - enable interrupt lines disabled by suspend_device_irqs()
> + *
> + * Enable all interrupt lines previously disabled by suspend_device_irqs() that
> + * have the IRQ_SUSPENDED flag set.
> + */
> +void resume_device_irqs(void)
> +{
> +	struct irq_desc *desc;
> +	int irq;
> +
> +	for_each_irq_desc(irq, desc) {
> +		unsigned long flags;
> +
> +		if (!(desc->status & IRQ_SUSPENDED))
> +			continue;
> +
> +		spin_lock_irqsave(&desc->lock, flags);
> +		__enable_irq(desc, irq, true);
> +		spin_unlock_irqrestore(&desc->lock, flags);

Ditto

Thanks,

	tglx

^ permalink raw reply	[flat|nested] 373+ messages in thread
* Re: [update, rev. 6] Re: [PATCH 1/10] PM: Rework handling of interrupts during suspend-resume (rev. 5)
  2009-03-13 19:55 ` Thomas Gleixner
@ 2009-03-13 21:56 ` Rafael J. Wysocki
  2009-03-13 21:56 ` Rafael J. Wysocki
  ` (2 subsequent siblings)
  3 siblings, 0 replies; 373+ messages in thread
From: Rafael J. Wysocki @ 2009-03-13 21:56 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Arve, Jeremy Fitzhardinge, Frans Pop, LKML, Jesse Barnes,
	Eric W. Biederman, Ingo Molnar, Linus Torvalds, pm list

On Friday 13 March 2009, Thomas Gleixner wrote:
> On Thu, 12 Mar 2009, Rafael J. Wysocki wrote:
> > +/**
> > + * suspend_device_irqs - disable all currently enabled interrupt lines
> > + *
> > + * During system-wide suspend or hibernation device interrupts need to be
> > + * disabled at the chip level and this function is provided for this purpose.
> > + * It disables all interrupt lines that are enabled at the moment and sets the
> > + * IRQ_SUSPENDED flag for them.
> > + */
> > +void suspend_device_irqs(void)
> > +{
> > +	struct irq_desc *desc;
> > +	int irq;
> > +
> > +	for_each_irq_desc(irq, desc) {
> > +		unsigned long flags;
> > +
> > +		spin_lock_irqsave(&desc->lock, flags);
> > +		__disable_irq(desc, irq, true);
> > +		spin_unlock_irqrestore(&desc->lock, flags);
>
> Can we move the locking into __disable_irq ?

Well, yes, but (see below)

> > +	}
> > +
> > +	for_each_irq_desc(irq, desc)
> > +		if (desc->status & IRQ_SUSPENDED)
> > +			synchronize_irq(irq);
> > +}
> > +EXPORT_SYMBOL_GPL(suspend_device_irqs);
> > +
> > +/**
> > + * resume_device_irqs - enable interrupt lines disabled by suspend_device_irqs()
> > + *
> > + * Enable all interrupt lines previously disabled by suspend_device_irqs() that
> > + * have the IRQ_SUSPENDED flag set.
> > + */
> > +void resume_device_irqs(void)
> > +{
> > +	struct irq_desc *desc;
> > +	int irq;
> > +
> > +	for_each_irq_desc(irq, desc) {
> > +		unsigned long flags;
> > +
> > +		if (!(desc->status & IRQ_SUSPENDED))
> > +			continue;
> > +
> > +		spin_lock_irqsave(&desc->lock, flags);
> > +		__enable_irq(desc, irq, true);
> > +		spin_unlock_irqrestore(&desc->lock, flags);
>
> Ditto

No, because of __setup_irq().

Thanks,
Rafael

^ permalink raw reply	[flat|nested] 373+ messages in thread
* Re: [update, rev. 6] Re: [PATCH 1/10] PM: Rework handling of interrupts during suspend-resume (rev. 5)
  2009-03-13 21:56 ` Rafael J. Wysocki
@ 2009-03-14 7:31 ` Thomas Gleixner
  2009-03-14 7:31 ` Thomas Gleixner
  1 sibling, 0 replies; 373+ messages in thread
From: Thomas Gleixner @ 2009-03-14 7:31 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Jeremy Fitzhardinge, Frans Pop, LKML, Jesse Barnes,
	Eric W. Biederman, Ingo Molnar, Linus Torvalds, pm list

On Fri, 13 Mar 2009, Rafael J. Wysocki wrote:
> > > +		spin_unlock_irqrestore(&desc->lock, flags);
> >
> > Ditto
>
> No, because of __setup_irq().

Sorry, forgot about that. Ok. Keep the locking in pm.c then.

Thanks,

	tglx

^ permalink raw reply	[flat|nested] 373+ messages in thread
* Re: [update, rev. 6] Re: [PATCH 1/10] PM: Rework handling of interrupts during suspend-resume (rev. 5) 2009-03-14 7:31 ` Thomas Gleixner @ 2009-03-14 10:01 ` Rafael J. Wysocki 2009-03-14 10:01 ` Rafael J. Wysocki 1 sibling, 0 replies; 373+ messages in thread From: Rafael J. Wysocki @ 2009-03-14 10:01 UTC (permalink / raw) To: Thomas Gleixner Cc: Arve, Jeremy Fitzhardinge, Frans Pop, LKML, Jesse Barnes, Eric W. Biederman, Ingo Molnar, Linus Torvalds, pm list On Saturday 14 March 2009, Thomas Gleixner wrote: > On Fri, 13 Mar 2009, Rafael J. Wysocki wrote: > > > > + spin_unlock_irqrestore(&desc->lock, flags); > > > > > > Ditto > > > > No, because of __setup_irq(). > > Sorry, forgot about that. Ok. Keep the locking in pm.c then. Will do, thanks. OK, it seems we're approaching the final version. :-) I'm going to split the $subject patch as requested by Ingo (API changes and functionality changes) and post the full series once again for completeness. Thanks, Rafael ^ permalink raw reply [flat|nested] 373+ messages in thread
* Re: [update, rev. 6] Re: [PATCH 1/10] PM: Rework handling of interrupts during suspend-resume (rev. 5) 2009-03-13 19:55 ` Thomas Gleixner 2009-03-13 21:56 ` Rafael J. Wysocki 2009-03-13 21:56 ` Rafael J. Wysocki @ 2009-03-14 0:04 ` Rafael J. Wysocki 2009-03-14 0:04 ` Rafael J. Wysocki 3 siblings, 0 replies; 373+ messages in thread From: Rafael J. Wysocki @ 2009-03-14 0:04 UTC (permalink / raw) To: Thomas Gleixner, Ingo Molnar Cc: pm list, LKML, Linus Torvalds, Eric W. Biederman, Benjamin Herrenschmidt, Jeremy Fitzhardinge, Len Brown, Jesse Barnes, Frans Pop, Arve Hjønnevåg On Friday 13 March 2009, Thomas Gleixner wrote: > On Thu, 12 Mar 2009, Rafael J. Wysocki wrote: > > +/** > > + * suspend_device_irqs - disable all currently enabled interrupt lines > > + * > > + * During system-wide suspend or hibernation device interrupts need to be > > + * disabled at the chip level and this function is provided for this purpose. > > + * It disables all interrupt lines that are enabled at the moment and sets the > > + * IRQ_SUSPENDED flag for them. > > + */ > > +void suspend_device_irqs(void) > > +{ > > + struct irq_desc *desc; > > + int irq; > > + > > + for_each_irq_desc(irq, desc) { > > + unsigned long flags; > > + > > + spin_lock_irqsave(&desc->lock, flags); > > + __disable_irq(desc, irq, true); > > + spin_unlock_irqrestore(&desc->lock, flags); > > Can we move the locking into __disable_irq ? > > > + } > > + > > + for_each_irq_desc(irq, desc) > > + if (desc->status & IRQ_SUSPENDED) > > + synchronize_irq(irq); > > +} > > +EXPORT_SYMBOL_GPL(suspend_device_irqs); > > + > > +/** > > + * resume_device_irqs - enable interrupt lines disabled by suspend_device_irqs() > > + * > > + * Enable all interrupt lines previously disabled by suspend_device_irqs() that > > + * have the IRQ_SUSPENDED flag set. 
> > + */ > > +void resume_device_irqs(void) > > +{ > > + struct irq_desc *desc; > > + int irq; > > + > > + for_each_irq_desc(irq, desc) { > > + unsigned long flags; > > + > > + if (!(desc->status & IRQ_SUSPENDED)) > > + continue; > > + > > + spin_lock_irqsave(&desc->lock, flags); > > + __enable_irq(desc, irq, true); > > + spin_unlock_irqrestore(&desc->lock, flags); > > Ditto Well, I guess you'd prefer something like the appended patch, but Ingo probably won't like it since it contains additional #ifdefs in irq/manage.c . Sigh. Thanks, Rafael --- From: Rafael J. Wysocki <rjw@sisk.pl> Subject: PM: Introduce functions for suspending and resuming device interrupts Introduce two helper functions allowing us to prevent device drivers from getting any interrupts (without disabling interrupts on the CPU) during suspend (or hibernation) and to make them start to receive interrupts again during the subsequent resume, respectively. These functions make it possible to keep timer interrupts enabled while the "late" suspend and "early" resume callbacks provided by device drivers are being executed. These functions will be used to rework the handling of interrupts during suspend (hibernation) and resume. Namely, interrupts will only be disabled on the CPU right before suspending sysdevs, while device drivers will be prevented from receiving interrupts, with the help of the new helper function, before their "late" suspend callbacks run (and analogously during resume). Signed-off-by: Rafael J. 
Wysocki <rjw@sisk.pl> --- include/linux/interrupt.h | 5 +++ include/linux/irq.h | 1 kernel/irq/Makefile | 1 kernel/irq/internals.h | 2 + kernel/irq/manage.c | 45 ++++++++++++++++++++++++++++--- kernel/irq/pm.c | 66 ++++++++++++++++++++++++++++++++++++++++++++++ 6 files changed, 116 insertions(+), 4 deletions(-) Index: linux-2.6/include/linux/irq.h =================================================================== --- linux-2.6.orig/include/linux/irq.h +++ linux-2.6/include/linux/irq.h @@ -65,6 +65,7 @@ typedef void (*irq_flow_handler_t)(unsig #define IRQ_SPURIOUS_DISABLED 0x00800000 /* IRQ was disabled by the spurious trap */ #define IRQ_MOVE_PCNTXT 0x01000000 /* IRQ migration from process context */ #define IRQ_AFFINITY_SET 0x02000000 /* IRQ affinity was set from userspace*/ +#define IRQ_SUSPENDED 0x04000000 /* IRQ has gone through suspend sequence */ #ifdef CONFIG_IRQ_PER_CPU # define CHECK_IRQ_PER_CPU(var) ((var) & IRQ_PER_CPU) Index: linux-2.6/kernel/irq/manage.c =================================================================== --- linux-2.6.orig/kernel/irq/manage.c +++ linux-2.6/kernel/irq/manage.c @@ -162,6 +162,14 @@ static inline int do_irq_select_affinity } #endif +static void __disable_irq(struct irq_desc *desc, unsigned int irq) +{ + if (!desc->depth++) { + desc->status |= IRQ_DISABLED; + desc->chip->disable(irq); + } +} + /** * disable_irq_nosync - disable an irq without waiting * @irq: Interrupt to disable @@ -182,10 +190,7 @@ void disable_irq_nosync(unsigned int irq return; spin_lock_irqsave(&desc->lock, flags); - if (!desc->depth++) { - desc->status |= IRQ_DISABLED; - desc->chip->disable(irq); - } + __disable_irq(desc, irq); spin_unlock_irqrestore(&desc->lock, flags); } EXPORT_SYMBOL(disable_irq_nosync); @@ -215,15 +220,32 @@ void disable_irq(unsigned int irq) } EXPORT_SYMBOL(disable_irq); +#ifdef CONFIG_PM_SLEEP +void suspend_irq(struct irq_desc *desc, unsigned int irq) +{ + unsigned long flags; + + spin_lock_irqsave(&desc->lock, flags); + if 
(desc->action && !(desc->action->flags & IRQF_TIMER)) { + __disable_irq(desc, irq); + desc->status |= IRQ_SUSPENDED; + } + spin_unlock_irqrestore(&desc->lock, flags); +} +#endif + static void __enable_irq(struct irq_desc *desc, unsigned int irq) { switch (desc->depth) { case 0: + err_out: WARN(1, KERN_WARNING "Unbalanced enable for IRQ %d\n", irq); break; case 1: { unsigned int status = desc->status & ~IRQ_DISABLED; + if (desc->status & IRQ_SUSPENDED) + goto err_out; /* Prevent probing on this irq: */ desc->status = status | IRQ_NOPROBE; check_irq_resend(desc, irq); @@ -258,6 +280,21 @@ void enable_irq(unsigned int irq) } EXPORT_SYMBOL(enable_irq); +#ifdef CONFIG_PM_SLEEP +void resume_irq(struct irq_desc *desc, unsigned int irq) +{ + unsigned long flags; + + if (!(desc->status & IRQ_SUSPENDED)) + return; + + spin_lock_irqsave(&desc->lock, flags); + desc->status &= ~IRQ_SUSPENDED; + __enable_irq(desc, irq); + spin_unlock_irqrestore(&desc->lock, flags); +} +#endif + static int set_irq_wake_real(unsigned int irq, unsigned int on) { struct irq_desc *desc = irq_to_desc(irq); Index: linux-2.6/kernel/irq/internals.h =================================================================== --- linux-2.6.orig/kernel/irq/internals.h +++ linux-2.6/kernel/irq/internals.h @@ -12,6 +12,8 @@ extern void compat_irq_chip_set_default_ extern int __irq_set_trigger(struct irq_desc *desc, unsigned int irq, unsigned long flags); +extern void suspend_irq(struct irq_desc *desc, unsigned int irq); +extern void resume_irq(struct irq_desc *desc, unsigned int irq); extern struct lock_class_key irq_desc_lock_class; extern void init_kstat_irqs(struct irq_desc *desc, int cpu, int nr); Index: linux-2.6/kernel/irq/Makefile =================================================================== --- linux-2.6.orig/kernel/irq/Makefile +++ linux-2.6/kernel/irq/Makefile @@ -4,3 +4,4 @@ obj-$(CONFIG_GENERIC_IRQ_PROBE) += autop obj-$(CONFIG_PROC_FS) += proc.o obj-$(CONFIG_GENERIC_PENDING_IRQ) += migration.o 
obj-$(CONFIG_NUMA_MIGRATE_IRQ_DESC) += numa_migrate.o +obj-$(CONFIG_PM_SLEEP) += pm.o Index: linux-2.6/kernel/irq/pm.c =================================================================== --- /dev/null +++ linux-2.6/kernel/irq/pm.c @@ -0,0 +1,66 @@ +/* + * linux/kernel/irq/pm.c + * + * Copyright (C) 2009 Rafael J. Wysocki <rjw@sisk.pl>, Novell Inc. + * + * This file contains power management functions related to interrupts. + */ + +#include <linux/irq.h> +#include <linux/module.h> +#include <linux/interrupt.h> + +#include "internals.h" + +/** + * suspend_device_irqs - disable all currently enabled interrupt lines + * + * During system-wide suspend or hibernation device interrupts need to be + * disabled at the chip level and this function is provided for this purpose. + * It disables all interrupt lines that are enabled at the moment and sets the + * IRQ_SUSPENDED flag for them. + */ +void suspend_device_irqs(void) +{ + struct irq_desc *desc; + int irq; + + for_each_irq_desc(irq, desc) + suspend_irq(desc, irq); + + for_each_irq_desc(irq, desc) + if (desc->status & IRQ_SUSPENDED) + synchronize_irq(irq); +} +EXPORT_SYMBOL_GPL(suspend_device_irqs); + +/** + * resume_device_irqs - enable interrupt lines disabled by suspend_device_irqs() + * + * Enable all interrupt lines previously disabled by suspend_device_irqs() that + * have the IRQ_SUSPENDED flag set. 
+ */ +void resume_device_irqs(void) +{ + struct irq_desc *desc; + int irq; + + for_each_irq_desc(irq, desc) + resume_irq(desc, irq); +} +EXPORT_SYMBOL_GPL(resume_device_irqs); + +/** + * check_wakeup_irqs - check if any wake-up interrupts are pending + */ +int check_wakeup_irqs(void) +{ + struct irq_desc *desc; + int irq; + + for_each_irq_desc(irq, desc) + if ((desc->status & IRQ_WAKEUP) && (desc->status & IRQ_PENDING)) + return -EBUSY; + + return 0; +} Index: linux-2.6/include/linux/interrupt.h =================================================================== --- linux-2.6.orig/include/linux/interrupt.h +++ linux-2.6/include/linux/interrupt.h @@ -106,6 +106,11 @@ extern void disable_irq_nosync(unsigned extern void disable_irq(unsigned int irq); extern void enable_irq(unsigned int irq); +/* The following three functions are for the core kernel use only. */ +extern void suspend_device_irqs(void); +extern void resume_device_irqs(void); +extern int check_wakeup_irqs(void); + #if defined(CONFIG_SMP) && defined(CONFIG_GENERIC_HARDIRQS) extern cpumask_var_t irq_default_affinity; ^ permalink raw reply [flat|nested] 373+ messages in thread
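The depth accounting in suspend_irq()/resume_irq() from the patch above can be followed in a small userspace model. This is a sketch only, assuming the semantics visible in the diff: the model_* names and the MODEL_* bit values are hypothetical, chip->disable() is reduced to a boolean, and the WARN() path is represented by a false return value.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the kernel status bits; values are arbitrary. */
#define MODEL_IRQ_DISABLED	0x1u
#define MODEL_IRQ_SUSPENDED	0x2u

/* Userspace model of the irq_desc fields the patch touches. */
struct model_irq_desc {
	unsigned int status;
	unsigned int depth;	/* nested disable count, 0 means enabled */
	bool has_action;	/* models desc->action != NULL */
	bool is_timer;		/* models IRQF_TIMER on the action */
	bool chip_masked;	/* models the effect of chip->disable() */
};

static void model_disable_irq(struct model_irq_desc *desc)
{
	if (!desc->depth++) {
		desc->status |= MODEL_IRQ_DISABLED;
		desc->chip_masked = true;
	}
}

/* Returns false for an unbalanced enable, where the kernel would WARN(). */
static bool model_enable_irq(struct model_irq_desc *desc)
{
	switch (desc->depth) {
	case 0:
		return false;
	case 1:
		/* A suspended line must not be re-enabled by a driver. */
		if (desc->status & MODEL_IRQ_SUSPENDED)
			return false;
		desc->status &= ~MODEL_IRQ_DISABLED;
		desc->chip_masked = false;
		/* fall through */
	default:
		desc->depth--;
	}
	return true;
}

static void model_suspend_irq(struct model_irq_desc *desc)
{
	/* Timer interrupts and lines without a handler are left alone. */
	if (desc->has_action && !desc->is_timer) {
		model_disable_irq(desc);
		desc->status |= MODEL_IRQ_SUSPENDED;
	}
}

static void model_resume_irq(struct model_irq_desc *desc)
{
	if (!(desc->status & MODEL_IRQ_SUSPENDED))
		return;
	desc->status &= ~MODEL_IRQ_SUSPENDED;
	(void)model_enable_irq(desc);
}
```

In this model, a line that a driver had already disabled before suspend comes back from resume still disabled (depth 1), and a driver calling enable on a suspended line at depth 1 is refused without changing the depth.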
* Re: [PATCH 1/10] PM: Rework handling of interrupts during suspend-resume (rev. 5) 2009-03-11 22:45 ` Thomas Gleixner 2009-03-12 13:36 ` Rafael J. Wysocki @ 2009-03-12 13:36 ` Rafael J. Wysocki 1 sibling, 0 replies; 373+ messages in thread From: Rafael J. Wysocki @ 2009-03-12 13:36 UTC (permalink / raw) To: Thomas Gleixner Cc: Arve, Jeremy Fitzhardinge, Frans Pop, LKML, Jesse Barnes, Eric W. Biederman, Ingo Molnar, Linus Torvalds, pm list On Wednesday 11 March 2009, Thomas Gleixner wrote: > On Wed, 11 Mar 2009, Thomas Gleixner wrote: > > On Wed, 11 Mar 2009, Rafael J. Wysocki wrote: > > > On Wednesday 11 March 2009, Thomas Gleixner wrote: > > > > > +EXPORT_SYMBOL_GPL(suspend_device_irqs); > > > > > > > > I'm not too enthusiastic about this open coded implementation of > > > > disable_irq() with slightly different semantics. > > > > > > The difference in semantics is important IMO, otherwise I wouldn't have > > > done that. In particular, the condition should be under the spinlock > > > and I'd rather not synchronize all interrupts we don't really disable here. > > > > I don't say that the difference is not relevant. But the code is > > almost the same and disable_irq() could have the sync_irq optimization > > as well. > > Thought more about that. Avoiding the sync_irq() for irqs which have > no action associated is fine, but you need to catch the following case > as well: > > driver code calls disable_irq_nosync() from the handler (which is > still running) > > suspend code skips the sync due to depth > 0 > > The sync operation is not that expensive. OK, what about this (untested, irrelevant parts skipped)? Index: linux-2.6/kernel/irq/pm.c =================================================================== --- /dev/null +++ linux-2.6/kernel/irq/pm.c @@ -0,0 +1,79 @@ +/* + * linux/kernel/irq/pm.c + * + * Copyright (C) 2009 Rafael J. Wysocki <rjw@sisk.pl>, Novell Inc. + * + * This file contains power management functions related to interrupts.
+ */ + +#include <linux/irq.h> +#include <linux/module.h> +#include <linux/interrupt.h> + +#include "internals.h" + +/** + * suspend_device_irqs - disable all currently enabled interrupt lines + * + * During system-wide suspend or hibernation device interrupts need to be + * disabled at the chip level and this function is provided for this purpose. + * It disables all interrupt lines that are enabled at the moment and sets the + * IRQ_SUSPENDED flag for them. + */ +void suspend_device_irqs(void) +{ + struct irq_desc *desc; + int irq; + + for_each_irq_desc(irq, desc) { + unsigned long flags; + + spin_lock_irqsave(&desc->lock, flags); + __disable_irq(desc, irq, true); + spin_unlock_irqrestore(&desc->lock, flags); + } + + for_each_irq_desc(irq, desc) + if (desc->status & IRQ_SUSPENDED) + synchronize_irq(irq); +} +EXPORT_SYMBOL_GPL(suspend_device_irqs); + +/** + * resume_device_irqs - enable interrupt lines disabled by suspend_device_irqs() + * + * Enable all interrupt lines previously disabled by suspend_device_irqs() that + * have the IRQ_SUSPENDED flag set. 
+ */ +void resume_device_irqs(void) +{ + struct irq_desc *desc; + int irq; + + for_each_irq_desc(irq, desc) { + unsigned long flags; + + if (!(desc->status & IRQ_SUSPENDED)) + continue; + + spin_lock_irqsave(&desc->lock, flags); + __enable_irq(desc, irq, true); + spin_unlock_irqrestore(&desc->lock, flags); + } +} +EXPORT_SYMBOL_GPL(resume_device_irqs); + +/** + * check_wakeup_irqs - check if any wake-up interrupts are pending + */ +int check_wakeup_irqs(void) +{ + struct irq_desc *desc; + int irq; + + for_each_irq_desc(irq, desc) + if ((desc->status & IRQ_WAKEUP) && (desc->status & IRQ_PENDING)) + return -EBUSY; + + return 0; +} Index: linux-2.6/kernel/irq/Makefile =================================================================== --- linux-2.6.orig/kernel/irq/Makefile +++ linux-2.6/kernel/irq/Makefile @@ -4,3 +4,4 @@ obj-$(CONFIG_GENERIC_IRQ_PROBE) += autop obj-$(CONFIG_PROC_FS) += proc.o obj-$(CONFIG_GENERIC_PENDING_IRQ) += migration.o obj-$(CONFIG_NUMA_MIGRATE_IRQ_DESC) += numa_migrate.o +obj-$(CONFIG_PM_SLEEP) += pm.o Index: linux-2.6/kernel/irq/manage.c =================================================================== --- linux-2.6.orig/kernel/irq/manage.c +++ linux-2.6/kernel/irq/manage.c @@ -162,6 +162,20 @@ static inline int do_irq_select_affinity } #endif +void __disable_irq(struct irq_desc *desc, unsigned int irq, bool suspend) +{ + if (suspend) { + if (desc->action && (desc->action->flags & IRQF_TIMER)) + return; + desc->status |= IRQ_SUSPENDED; + } + + if (!desc->depth++) { + desc->status |= IRQ_DISABLED; + desc->chip->disable(irq); + } +} + /** * disable_irq_nosync - disable an irq without waiting * @irq: Interrupt to disable @@ -182,10 +196,7 @@ void disable_irq_nosync(unsigned int irq return; spin_lock_irqsave(&desc->lock, flags); - if (!desc->depth++) { - desc->status |= IRQ_DISABLED; - desc->chip->disable(irq); - } + __disable_irq(desc, irq, false); spin_unlock_irqrestore(&desc->lock, flags); } EXPORT_SYMBOL(disable_irq_nosync); @@ -215,15 
+226,19 @@ void disable_irq(unsigned int irq) } EXPORT_SYMBOL(disable_irq); -static void __enable_irq(struct irq_desc *desc, unsigned int irq) +void __enable_irq(struct irq_desc *desc, unsigned int irq, bool resume) { + if (resume) + desc->status &= ~IRQ_SUSPENDED; + switch (desc->depth) { case 0: - WARN(1, KERN_WARNING "Unbalanced enable for IRQ %d\n", irq); - break; + goto err_out; case 1: { unsigned int status = desc->status & ~IRQ_DISABLED; + if (desc->status & IRQ_SUSPENDED) + goto err_out; /* Prevent probing on this irq: */ desc->status = status | IRQ_NOPROBE; check_irq_resend(desc, irq); @@ -232,6 +247,11 @@ static void __enable_irq(struct irq_desc default: desc->depth--; } + + return; + + err_out: + WARN(true, KERN_WARNING "Unbalanced enable for IRQ %d\n", irq); } /** @@ -253,7 +273,7 @@ void enable_irq(unsigned int irq) return; spin_lock_irqsave(&desc->lock, flags); - __enable_irq(desc, irq); + __enable_irq(desc, irq, false); spin_unlock_irqrestore(&desc->lock, flags); } EXPORT_SYMBOL(enable_irq); @@ -511,7 +531,7 @@ __setup_irq(unsigned int irq, struct irq */ if (shared && (desc->status & IRQ_SPURIOUS_DISABLED)) { desc->status &= ~IRQ_SPURIOUS_DISABLED; - __enable_irq(desc, irq); + __enable_irq(desc, irq, false); } spin_unlock_irqrestore(&desc->lock, flags); Index: linux-2.6/kernel/irq/internals.h =================================================================== --- linux-2.6.orig/kernel/irq/internals.h +++ linux-2.6/kernel/irq/internals.h @@ -12,6 +12,8 @@ extern void compat_irq_chip_set_default_ extern int __irq_set_trigger(struct irq_desc *desc, unsigned int irq, unsigned long flags); +extern void __disable_irq(struct irq_desc *desc, unsigned int irq, bool susp); +extern void __enable_irq(struct irq_desc *desc, unsigned int irq, bool resume); extern struct lock_class_key irq_desc_lock_class; extern void init_kstat_irqs(struct irq_desc *desc, int cpu, int nr); Thanks, Rafael ^ permalink raw reply [flat|nested] 373+ messages in thread
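The check_wakeup_irqs() helper in the patch above reduces to a scan for any line that is both wake-up capable and pending. A minimal userspace sketch of that test follows, assuming a flat array of status words in place of the irq_desc iteration; the MODEL_* bit values and the function name are hypothetical. How the suspend core reacts to -EBUSY is not shown in this message, so that part is left out.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-ins for IRQ_WAKEUP and IRQ_PENDING; values are arbitrary. */
#define MODEL_IRQ_WAKEUP	0x1u
#define MODEL_IRQ_PENDING	0x2u

/*
 * Returns -EBUSY as soon as one line is both wake-up capable and pending,
 * and 0 if no such line exists.
 */
static int model_check_wakeup_irqs(const unsigned int *status, size_t n)
{
	size_t i;

	for (i = 0; i < n; i++)
		if ((status[i] & MODEL_IRQ_WAKEUP) &&
		    (status[i] & MODEL_IRQ_PENDING))
			return -EBUSY;

	return 0;
}
```

Note that a pending interrupt on a line that is not marked for wake-up, or a wake-up line with nothing pending, does not trigger the error; both bits have to be set on the same line.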
* Re: [PATCH 1/10] PM: Rework handling of interrupts during suspend-resume (rev. 5) 2009-03-11 21:42 ` Thomas Gleixner 2009-03-11 22:01 ` Rafael J. Wysocki 2009-03-11 22:45 ` Thomas Gleixner @ 2009-03-11 22:45 ` Thomas Gleixner 2 siblings, 0 replies; 373+ messages in thread From: Thomas Gleixner @ 2009-03-11 22:45 UTC (permalink / raw) To: Rafael J. Wysocki Cc: Jeremy Fitzhardinge, Frans Pop, LKML, Jesse Barnes, Eric W. Biederman, Ingo Molnar, Linus Torvalds, pm list On Wed, 11 Mar 2009, Thomas Gleixner wrote: > On Wed, 11 Mar 2009, Rafael J. Wysocki wrote: > > On Wednesday 11 March 2009, Thomas Gleixner wrote: > > > > +EXPORT_SYMBOL_GPL(suspend_device_irqs); > > > > > > I'm not too enthusiastic about this open coded implementation of > > > disable_irq() with slightly different semantics. > > > > The difference in semantics is important IMO, otherwise I wouldn't have > > done that. In particular, the condition should be under the spinlock > > and I'd rather not synchronize all interrupts we don't really disable here. > > I don't say that the difference is not relevant. But the code is > almost the same and disable_irq() could have the sync_irq optimization > as well. Thought more about that. Avoiding the sync_irq() for irqs which have no action associated is fine, but you need to catch the following case as well: driver code calls disable_irq_nosync() from the handler (which is still running), and the suspend code then skips the sync due to depth > 0. The sync operation is not that expensive. Thanks, tglx ^ permalink raw reply [flat|nested] 373+ messages in thread
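The case described above can be made concrete with a small userspace model. It is a sketch under stated assumptions, not kernel code: model_synchronize() stands in for synchronize_irq() and simply marks the handler as finished, and all model_* names are hypothetical. The first function syncs only when the suspend code itself moved depth from 0 to 1, as in the rev. 5 loop; the second syncs every line flagged as suspended in a separate pass.

```c
#include <stdbool.h>
#include <stddef.h>

struct model_irq {
	unsigned int depth;	/* nested disable count */
	bool has_action;	/* a handler is registered */
	bool suspended;		/* models IRQ_SUSPENDED */
	bool handler_running;	/* a handler instance is still in flight */
};

/* Models synchronize_irq(): returns once no handler is running. */
static void model_synchronize(struct model_irq *irq)
{
	irq->handler_running = false;
}

/* Policy A: sync only if this call performed the 0 -> 1 transition. */
static void model_suspend_sync_on_transition(struct model_irq *irqs, size_t n)
{
	size_t i;

	for (i = 0; i < n; i++) {
		bool sync = false;

		if (irqs[i].has_action) {
			if (!irqs[i].depth++)
				sync = true;
			irqs[i].suspended = true;
		}
		if (sync)
			model_synchronize(&irqs[i]);
	}
}

/* Policy B: disable everything first, then sync every suspended line. */
static void model_suspend_two_pass(struct model_irq *irqs, size_t n)
{
	size_t i;

	for (i = 0; i < n; i++) {
		if (!irqs[i].has_action)
			continue;
		irqs[i].depth++;
		irqs[i].suspended = true;
	}

	for (i = 0; i < n; i++)
		if (irqs[i].suspended)
			model_synchronize(&irqs[i]);
}
```

If a handler has called disable_irq_nosync() on its own line and is still running (depth 1, handler in flight), policy A returns with the handler still executing, while policy B waits for it. Lines with no action are skipped by both.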
* Re: [PATCH 1/10] PM: Rework handling of interrupts during suspend-resume (rev. 5)
  2009-03-11 10:33 ` Thomas Gleixner
  2009-03-11 20:59 ` Rafael J. Wysocki
@ 2009-03-11 20:59 ` Rafael J. Wysocki
  2009-03-11 21:15 ` Rafael J. Wysocki
  2009-03-11 21:15 ` Rafael J. Wysocki
  3 siblings, 0 replies; 373+ messages in thread
From: Rafael J. Wysocki @ 2009-03-11 20:59 UTC
To: Thomas Gleixner, Ingo Molnar
Cc: Arve, Jeremy Fitzhardinge, Frans Pop, LKML, Jesse Barnes, Eric W. Biederman, pm list, Linus Torvalds

On Wednesday 11 March 2009, Thomas Gleixner wrote:
> Rafael,
>
> On Wed, 11 Mar 2009, Rafael J. Wysocki wrote:
>
> > Index: linux-2.6/kernel/irq/pm.c
> > ===================================================================
> > --- /dev/null
> > +++ linux-2.6/kernel/irq/pm.c
> > @@ -0,0 +1,89 @@
> > +/*
> > + * linux/kernel/irq/pm.c
> > + *
> > + * Copyright (C) 2009 Rafael J. Wysocki <rjw@sisk.pl>, Novell Inc.
> > + *
> > + * This file contains power management functions related to interrupts.
> > + */
> > +
> > +#include <linux/irq.h>
> > +#include <linux/module.h>
> > +#include <linux/interrupt.h>
> > +
> > +#include "internals.h"
> > +
> > +/**
> > + * suspend_device_irqs - disable all currently enabled interrupt lines
> > + *
> > + * During system-wide suspend or hibernation device interrupts need to be
> > + * disabled at the chip level and this function is provided for this purpose.
> > + * It disables all interrupt lines that are enabled at the moment and sets the
> > + * IRQ_SUSPENDED flag for them.
> > + */
> > +void suspend_device_irqs(void)
> > +{
> > +	struct irq_desc *desc;
> > +	int irq;
> > +
> > +	for_each_irq_desc(irq, desc) {
> > +		unsigned long flags;
> > +		bool sync = false;
> > +
> > +		spin_lock_irqsave(&desc->lock, flags);
> > +
> > +		if (desc->action && !(desc->action->flags & IRQF_TIMER)) {
> > +			if (!desc->depth++) {
> > +				desc->status |= IRQ_DISABLED;
> > +				desc->chip->disable(irq);
> > +				sync = true;
> > +			}
> > +			desc->status |= IRQ_SUSPENDED;
>
> This flag needs to be checked in __enable_irq().
>
> > +		}
> > +
> > +		spin_unlock_irqrestore(&desc->lock, flags);
> > +
> > +		if (sync)
> > +			synchronize_irq(irq);
> > +	}
> > +}
> > +EXPORT_SYMBOL_GPL(suspend_device_irqs);
>
> I'm not too enthusiastic about this open coded implementation of
> disable_irq() with slightly different semantics.

The difference in semantics is important IMO, otherwise I wouldn't have
done that.  In particular, the condition should be under the spinlock
and I'd rather not synchronize all interrupts we don't really disable here.

> Can we please move the fiddling with desc->* into
> kernel/irq/manage.c and share the code there ?

Can you please discuss that with Ingo?  I moved that from manage.c at his
request.

Thanks,
Rafael
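[Editorial sketch] The depth accounting in the quoted function may be
easier to follow in a small user-space model. Everything below is an
assumption made for illustration: struct irq_model, the M_* flag
values and both functions are hypothetical, locking and the chip-level
disable are omitted, and resume_model() is only a guess at what a
matching resume_device_irqs() would have to undo, since that function
is not shown in the quoted hunk.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define M_DISABLED  0x1u   /* models IRQ_DISABLED */
#define M_SUSPENDED 0x2u   /* models IRQ_SUSPENDED */

/* Hypothetical stand-in for the irq_desc fields the patch touches. */
struct irq_model {
	unsigned int depth;    /* like desc->depth */
	unsigned int status;   /* like desc->status */
	bool has_action;       /* like desc->action != NULL */
	bool is_timer;         /* like action->flags & IRQF_TIMER */
};

/* Mirrors the loop body of the quoted suspend_device_irqs(). */
static void suspend_model(struct irq_model *irqs, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		struct irq_model *d = &irqs[i];

		if (!d->has_action || d->is_timer)
			continue;
		if (!d->depth++)
			d->status |= M_DISABLED;
		d->status |= M_SUSPENDED;
	}
}

/* Assumed counterpart: drop one depth level per suspended line. */
static void resume_model(struct irq_model *irqs, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		struct irq_model *d = &irqs[i];

		if (!(d->status & M_SUSPENDED))
			continue;
		d->status &= ~M_SUSPENDED;
		if (d->depth > 0 && --d->depth == 0)
			d->status &= ~M_DISABLED;
	}
}
```

In this model a line that a driver had already disabled before suspend
goes from depth 1 to 2 and back to 1, so it stays disabled after
resume, while a timer line is never touched.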
* Re: [PATCH 1/10] PM: Rework handling of interrupts during suspend-resume (rev. 5)
  2009-03-11 10:33 ` Thomas Gleixner
  2009-03-11 20:59 ` Rafael J. Wysocki
  2009-03-11 20:59 ` Rafael J. Wysocki
@ 2009-03-11 21:15 ` Rafael J. Wysocki
  2009-03-11 21:15 ` Rafael J. Wysocki
  3 siblings, 0 replies; 373+ messages in thread
From: Rafael J. Wysocki @ 2009-03-11 21:15 UTC
To: Thomas Gleixner
Cc: Arve, Jeremy Fitzhardinge, Frans Pop, LKML, Jesse Barnes, Eric W. Biederman, pm list, Linus Torvalds, Ingo Molnar

On Wednesday 11 March 2009, Thomas Gleixner wrote:
> Rafael,
>
> On Wed, 11 Mar 2009, Rafael J. Wysocki wrote:
>
> > Index: linux-2.6/kernel/irq/pm.c
> > ===================================================================
> > --- /dev/null
> > +++ linux-2.6/kernel/irq/pm.c
> > @@ -0,0 +1,89 @@
> > +/*
> > + * linux/kernel/irq/pm.c
> > + *
> > + * Copyright (C) 2009 Rafael J. Wysocki <rjw@sisk.pl>, Novell Inc.
> > + *
> > + * This file contains power management functions related to interrupts.
> > + */
> > +
> > +#include <linux/irq.h>
> > +#include <linux/module.h>
> > +#include <linux/interrupt.h>
> > +
> > +#include "internals.h"
> > +
> > +/**
> > + * suspend_device_irqs - disable all currently enabled interrupt lines
> > + *
> > + * During system-wide suspend or hibernation device interrupts need to be
> > + * disabled at the chip level and this function is provided for this purpose.
> > + * It disables all interrupt lines that are enabled at the moment and sets the
> > + * IRQ_SUSPENDED flag for them.
> > + */
> > +void suspend_device_irqs(void)
> > +{
> > +	struct irq_desc *desc;
> > +	int irq;
> > +
> > +	for_each_irq_desc(irq, desc) {
> > +		unsigned long flags;
> > +		bool sync = false;
> > +
> > +		spin_lock_irqsave(&desc->lock, flags);
> > +
> > +		if (desc->action && !(desc->action->flags & IRQF_TIMER)) {
> > +			if (!desc->depth++) {
> > +				desc->status |= IRQ_DISABLED;
> > +				desc->chip->disable(irq);
> > +				sync = true;
> > +			}
> > +			desc->status |= IRQ_SUSPENDED;
>
> This flag needs to be checked in __enable_irq().

[I overlooked this comment, sorry.]

Why does it?

Rafael
* Re: [PATCH 1/10] PM: Rework handling of interrupts during suspend-resume (rev. 5)
  2009-03-11 21:15 ` Rafael J. Wysocki
@ 2009-03-11 21:35 ` Thomas Gleixner
  2009-03-11 21:35 ` Thomas Gleixner
  1 sibling, 0 replies; 373+ messages in thread
From: Thomas Gleixner @ 2009-03-11 21:35 UTC
To: Rafael J. Wysocki
Cc: Jeremy Fitzhardinge, Frans Pop, LKML, Jesse Barnes, Eric W. Biederman, pm list, Linus Torvalds, Ingo Molnar

On Wed, 11 Mar 2009, Rafael J. Wysocki wrote:
> On Wednesday 11 March 2009, Thomas Gleixner wrote:
> > > +			desc->status |= IRQ_SUSPENDED;
> >
> > This flag needs to be checked in __enable_irq().
>
> [I overlooked this comment, sorry.]
>
> Why does it?

To catch abuse and callers of enable_irq() when this flag is set.

Thanks,

	tglx
* Re: [PATCH 1/10] PM: Rework handling of interrupts during suspend-resume (rev. 5)
  2009-03-11 21:35 ` Thomas Gleixner
@ 2009-03-11 21:50 ` Rafael J. Wysocki
  2009-03-11 21:50 ` Rafael J. Wysocki
  1 sibling, 0 replies; 373+ messages in thread
From: Rafael J. Wysocki @ 2009-03-11 21:50 UTC
To: Thomas Gleixner
Cc: Arve, Jeremy Fitzhardinge, Frans Pop, LKML, Jesse Barnes, Eric W. Biederman, pm list, Linus Torvalds, Ingo Molnar

On Wednesday 11 March 2009, Thomas Gleixner wrote:
> On Wed, 11 Mar 2009, Rafael J. Wysocki wrote:
> > On Wednesday 11 March 2009, Thomas Gleixner wrote:
> > > > +			desc->status |= IRQ_SUSPENDED;
> > >
> > > This flag needs to be checked in __enable_irq().
> >
> > [I overlooked this comment, sorry.]
> >
> > Why does it?
>
> To catch abuse and callers of enable_irq() when this flag is set.

Hmm.  This means you'd like to make enable_irq() fail if called with
IRQ_SUSPENDED set, correct?

What if someone calls disable_irq() and then enable_irq() between
suspend_device_irqs() and resume_device_irqs()?  That would be pointless, but
surely not a bug?  Should disable_irq() also fail if IRQ_SUSPENDED is set?

Or should __enable_irq() only fail with IRQ_SUSPENDED set for desc->depth == 1?

Rafael
* Re: [PATCH 1/10] PM: Rework handling of interrupts during suspend-resume (rev. 5)
  2009-03-11 21:50 ` Rafael J. Wysocki
@ 2009-03-11 21:53 ` Thomas Gleixner
  2009-03-11 21:53 ` Thomas Gleixner
  1 sibling, 0 replies; 373+ messages in thread
From: Thomas Gleixner @ 2009-03-11 21:53 UTC
To: Rafael J. Wysocki
Cc: Jeremy Fitzhardinge, Frans Pop, LKML, Jesse Barnes, Eric W. Biederman, pm list, Linus Torvalds, Ingo Molnar

On Wed, 11 Mar 2009, Rafael J. Wysocki wrote:

> On Wednesday 11 March 2009, Thomas Gleixner wrote:
> > On Wed, 11 Mar 2009, Rafael J. Wysocki wrote:
> > > On Wednesday 11 March 2009, Thomas Gleixner wrote:
> > > > > +			desc->status |= IRQ_SUSPENDED;
> > > >
> > > > This flag needs to be checked in __enable_irq().
> > >
> > > [I overlooked this comment, sorry.]
> > >
> > > Why does it?
> >
> > To catch abuse and callers of enable_irq() when this flag is set.
>
> Hmm.  This means you'd like to make enable_irq() fail if called with
> IRQ_SUSPENDED set, correct?
>
> What if someone calls disable_irq() and then enable_irq() between
> suspend_device_irqs() and resume_device_irqs()?  That would be pointless, but
> surely not a bug?  Should disable_irq() also fail if IRQ_SUSPENDED is set?

I'm not worried about nested ones.

> Or should __enable_irq() only fail with IRQ_SUSPENDED set for desc->depth == 1?

At least it needs a WARN_ON() in that case. A very prominent one.

Thanks,

	tglx
* Re: [PATCH 1/10] PM: Rework handling of interrupts during suspend-resume (rev. 5)
  2009-03-11 21:53 ` Thomas Gleixner
@ 2009-03-11 22:01 ` Linus Torvalds
  2009-03-11 22:07 ` Rafael J. Wysocki
  2009-03-11 22:07 ` Rafael J. Wysocki
  2 siblings, 0 replies; 373+ messages in thread
From: Linus Torvalds @ 2009-03-11 22:01 UTC
To: Thomas Gleixner
Cc: Rafael J. Wysocki, pm list, LKML, Ingo Molnar, Eric W. Biederman, Benjamin Herrenschmidt, Jeremy Fitzhardinge, Len Brown, Jesse Barnes, Frans Pop, Arve Hjønnevåg

On Wed, 11 Mar 2009, Thomas Gleixner wrote:
>
> I'm not worried about nested ones.

Then you shouldn't be worried about IRQ_SUSPENDED at all, since that one
increments the disabled depth count.

So _all_ disable/enable_irq calls will by definition be nested inside
IRQ_SUSPENDED.

		Linus
* Re: [PATCH 1/10] PM: Rework handling of interrupts during suspend-resume (rev. 5)
  2009-03-11 22:01 ` Linus Torvalds
  (?)
@ 2009-03-11 22:13 ` Rafael J. Wysocki
  -1 siblings, 0 replies; 373+ messages in thread
From: Rafael J. Wysocki @ 2009-03-11 22:13 UTC
To: Linus Torvalds
Cc: Thomas Gleixner, pm list, LKML, Ingo Molnar, Eric W. Biederman, Benjamin Herrenschmidt, Jeremy Fitzhardinge, Len Brown, Jesse Barnes, Frans Pop, Arve Hjønnevåg

On Wednesday 11 March 2009, Linus Torvalds wrote:
>
> On Wed, 11 Mar 2009, Thomas Gleixner wrote:
> >
> > I'm not worried about nested ones.
>
> Then you shouldn't be worried about IRQ_SUSPENDED at all, since that one
> increments the disabled depth count.
>
> So _all_ disable/enable_irq calls will by definition be nested inside
> IRQ_SUSPENDED.

Still, if there's an unbalanced enable_irq() between suspend_device_irqs() and
resume_device_irqs(), we'll not detect it immediately, but only in
resume_device_irqs().  It would be better if the unbalanced call failed in that
case IMHO.

Rafael
* Re: [PATCH 1/10] PM: Rework handling of interrupts during suspend-resume (rev. 5)
  2009-03-11 22:01 ` Linus Torvalds
  ` (2 preceding siblings ...)
  (?)
@ 2009-03-11 22:25 ` Thomas Gleixner
  -1 siblings, 0 replies; 373+ messages in thread
From: Thomas Gleixner @ 2009-03-11 22:25 UTC
To: Linus Torvalds
Cc: Rafael J. Wysocki, pm list, LKML, Ingo Molnar, Eric W. Biederman, Benjamin Herrenschmidt, Jeremy Fitzhardinge, Len Brown, Jesse Barnes, Frans Pop, Arve Hjønnevåg

On Wed, 11 Mar 2009, Linus Torvalds wrote:
> On Wed, 11 Mar 2009, Thomas Gleixner wrote:
> >
> > I'm not worried about nested ones.
>
> Then you shouldn't be worried about IRQ_SUSPENDED at all, since that one
> increments the disabled depth count.
>
> So _all_ disable/enable_irq calls will by definition be nested inside
> IRQ_SUSPENDED.

Right, if they are in disable -> enable order. But the stupid stray
enable will be visible either by wrecking the suspend with hard to
debug failures or by triggering the depth check in the resume code.

I'm burned enough by the timer failures which pop up long after the
real bug happened.

Thanks,

	tglx
* Re: [PATCH 1/10] PM: Rework handling of interrupts during suspend-resume (rev. 5)
  2009-03-11 21:53 ` Thomas Gleixner
  2009-03-11 22:01 ` Linus Torvalds
@ 2009-03-11 22:07 ` Rafael J. Wysocki
  2009-03-11 22:07 ` Rafael J. Wysocki
  2 siblings, 0 replies; 373+ messages in thread
From: Rafael J. Wysocki @ 2009-03-11 22:07 UTC
To: Thomas Gleixner
Cc: pm list, LKML, Linus Torvalds, Ingo Molnar, Eric W. Biederman, Benjamin Herrenschmidt, Jeremy Fitzhardinge, Len Brown, Jesse Barnes, Frans Pop, Arve Hjønnevåg

On Wednesday 11 March 2009, Thomas Gleixner wrote:
> On Wed, 11 Mar 2009, Rafael J. Wysocki wrote:
>
> > On Wednesday 11 March 2009, Thomas Gleixner wrote:
> > > On Wed, 11 Mar 2009, Rafael J. Wysocki wrote:
> > > > On Wednesday 11 March 2009, Thomas Gleixner wrote:
> > > > > > +			desc->status |= IRQ_SUSPENDED;
> > > > >
> > > > > This flag needs to be checked in __enable_irq().
> > > >
> > > > [I overlooked this comment, sorry.]
> > > >
> > > > Why does it?
> > >
> > > To catch abuse and callers of enable_irq() when this flag is set.
> >
> > Hmm.  This means you'd like to make enable_irq() fail if called with
> > IRQ_SUSPENDED set, correct?
> >
> > What if someone calls disable_irq() and then enable_irq() between
> > suspend_device_irqs() and resume_device_irqs()?  That would be pointless, but
> > surely not a bug?  Should disable_irq() also fail if IRQ_SUSPENDED is set?
>
> I'm not worried about nested ones.
>
> > Or should __enable_irq() only fail with IRQ_SUSPENDED set for desc->depth == 1?
>
> At least it needs a WARN_ON() in that case. A very prominent one.

I'm going to make it fail and print a warning for desc->depth == 1 if
IRQ_SUSPENDED is set.  Hope that's fine with everyone.

Thanks,
Rafael
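[Editorial sketch] The rule agreed on above (reject an enable that
would take the depth from 1 to 0 while IRQ_SUSPENDED is set, but let
nested disable/enable pairs through) can be modelled as below. This is
a user-space illustration only. The struct, the M_* flags, the return
convention and in particular the "resume" parameter are assumptions
made here so that the resume path can still drop the last depth level;
the thread does not show how the actual patch distinguishes the two
callers.

```c
#include <assert.h>
#include <stdbool.h>

#define M_DISABLED  0x1u   /* models IRQ_DISABLED */
#define M_SUSPENDED 0x2u   /* models IRQ_SUSPENDED */

/* Hypothetical stand-in for the irq_desc fields involved. */
struct irq_model {
	unsigned int depth;
	unsigned int status;
};

/* Models a disable_irq() call: only the first level sets the flag. */
static void disable_model(struct irq_model *d)
{
	if (!d->depth++)
		d->status |= M_DISABLED;
}

/*
 * Models the proposed check. Returns 0 on success and -1 when the
 * call is rejected (where the kernel would print a warning).
 */
static int enable_model(struct irq_model *d, bool resume)
{
	switch (d->depth) {
	case 0:
		return -1;              /* unbalanced: nothing to undo */
	case 1:
		if (d->status & M_SUSPENDED) {
			if (!resume)
				return -1;  /* stray enable while suspended */
			d->status &= ~M_SUSPENDED;
		}
		d->status &= ~M_DISABLED;
		/* fall through */
	default:
		d->depth--;
	}
	return 0;
}
```

A driver's balanced disable/enable pair on a suspended line moves the
depth from 1 to 2 and back and is accepted; a further enable from
driver context is refused immediately instead of surfacing later in
the resume path.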
* [PATCH 2/10] PM: Change suspend code ordering
  2009-03-11  9:30 ` Rafael J. Wysocki
  2009-03-11  9:36 ` [PATCH 1/10] PM: Rework handling of interrupts during suspend-resume (rev. 5) Rafael J. Wysocki
  2009-03-11  9:36 ` Rafael J. Wysocki
@ 2009-03-11  9:37 ` Rafael J. Wysocki
  2009-03-11  9:37 ` Rafael J. Wysocki
  ` (15 subsequent siblings)
  18 siblings, 0 replies; 373+ messages in thread
From: Rafael J. Wysocki @ 2009-03-11 9:37 UTC
To: pm list
Cc: LKML, Linus Torvalds, Ingo Molnar, Eric W. Biederman, Benjamin Herrenschmidt, Jeremy Fitzhardinge, Len Brown, Jesse Barnes, Thomas Gleixner, Frans Pop, Arve Hjønnevåg

From: Rafael J. Wysocki <rjw@sisk.pl>

Change the ordering of the suspend core code so that the platform
"prepare" callback is executed and the nonboot CPUs are disabled
after calling device drivers' "late suspend" methods.

This change will allow us to rework the PCI PM core so that the power
state of devices is changed in the "late" phase of suspend (and
analogously in the "early" phase of resume), which in turn will allow
us to avoid the race condition where a device using shared interrupts
is put into a low power state with interrupts enabled and then an
interrupt (for another device) comes in and confuses its driver.

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
---
 kernel/power/main.c |   38 ++++++++++++++++++++++----------------
 1 file changed, 22 insertions(+), 16 deletions(-)

Index: linux-2.6/kernel/power/main.c
===================================================================
--- linux-2.6.orig/kernel/power/main.c
+++ linux-2.6/kernel/power/main.c
@@ -297,6 +297,19 @@ static int suspend_enter(suspend_state_t
 		goto Done;
 	}

+	if (suspend_ops->prepare) {
+		error = suspend_ops->prepare();
+		if (error)
+			goto Power_up_devices;
+	}
+
+	if (suspend_test(TEST_PLATFORM))
+		goto Platfrom_finish;
+
+	error = disable_nonboot_cpus();
+	if (error || suspend_test(TEST_CPUS))
+		goto Enable_cpus;
+
 	arch_suspend_disable_irqs();
 	BUG_ON(!irqs_disabled());

@@ -310,6 +323,14 @@ static int suspend_enter(suspend_state_t
 	arch_suspend_enable_irqs();
 	BUG_ON(irqs_disabled());

+ Enable_cpus:
+	enable_nonboot_cpus();
+
+ Platfrom_finish:
+	if (suspend_ops->finish)
+		suspend_ops->finish();
+
+ Power_up_devices:
 	device_power_up(PMSG_RESUME);

  Done:
@@ -346,23 +367,8 @@ int suspend_devices_and_enter(suspend_st
 	if (suspend_test(TEST_DEVICES))
 		goto Recover_platform;

-	if (suspend_ops->prepare) {
-		error = suspend_ops->prepare();
-		if (error)
-			goto Resume_devices;
-	}
-
-	if (suspend_test(TEST_PLATFORM))
-		goto Finish;
+	suspend_enter(state);

-	error = disable_nonboot_cpus();
-	if (!error && !suspend_test(TEST_CPUS))
-		suspend_enter(state);
-
-	enable_nonboot_cpus();
- Finish:
-	if (suspend_ops->finish)
-		suspend_ops->finish();
  Resume_devices:
 	suspend_test_start();
 	device_resume(PMSG_RESUME);
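[Editorial sketch] The new ordering in suspend_enter() and the way its
error labels unwind it can be traced with a small user-space model.
All names below (enum fail_point, trace, step(), suspend_enter_model())
are hypothetical and exist only for this illustration; the steps the
patch does not show between disabling and re-enabling interrupts are
collapsed into a single "enter" step, and the suspend_test() shortcuts
are left out.

```c
#include <assert.h>
#include <string.h>

/* Where the model should inject a failure. */
enum fail_point { FAIL_NONE, FAIL_PREPARE, FAIL_CPUS };

/* Space-separated record of the steps that ran, in order. */
static char trace[256];

static void step(const char *name)
{
	if (trace[0])
		strncat(trace, " ", sizeof(trace) - strlen(trace) - 1);
	strncat(trace, name, sizeof(trace) - strlen(trace) - 1);
}

/* Models the control flow of suspend_enter() after this patch. */
static int suspend_enter_model(enum fail_point fail)
{
	int error = 0;

	trace[0] = '\0';
	step("power_down");        /* drivers' "late suspend" runs first */

	step("prepare");           /* platform "prepare" now comes after it */
	if (fail == FAIL_PREPARE) {
		error = -1;
		goto power_up_devices;
	}

	step("disable_cpus");
	if (fail == FAIL_CPUS) {
		error = -1;
		goto enable_cpus;
	}

	step("irqs_off");
	step("enter");
	step("irqs_on");

enable_cpus:
	step("enable_cpus");
	step("finish");            /* platform "finish" */
power_up_devices:
	step("power_up");
	return error;
}
```

In the model a failing "prepare" skips both the CPU handling and the
platform "finish" step and only powers the devices back up, whereas a
failure to disable the nonboot CPUs still runs "enable_cpus" and
"finish" on the way out, which matches the label order added by the
patch.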
* [PATCH 3/10] PM: Change hibernation code ordering
  2009-03-11  9:30 ` Rafael J. Wysocki
  ` (3 preceding siblings ...)
  2009-03-11  9:37 ` Rafael J. Wysocki
@ 2009-03-11  9:38 ` Rafael J. Wysocki
  2009-03-11  9:38 ` Rafael J. Wysocki
  ` (13 subsequent siblings)
  18 siblings, 0 replies; 373+ messages in thread
From: Rafael J. Wysocki @ 2009-03-11 9:38 UTC
To: pm list
Cc: LKML, Linus Torvalds, Ingo Molnar, Eric W. Biederman, Benjamin Herrenschmidt, Jeremy Fitzhardinge, Len Brown, Jesse Barnes, Thomas Gleixner, Frans Pop, Arve Hjønnevåg

From: Rafael J. Wysocki <rjw@sisk.pl>

Change the ordering of the hibernation core code so that the platform
"prepare" callbacks are executed and the nonboot CPUs are disabled
after calling device drivers' "late suspend" methods.

This change (along with the previous analogous change of the suspend
core code) will allow us to rework the PCI PM core so that the power
state of devices is changed in the "late" phase of suspend (and
analogously in the "early" phase of resume), which in turn will allow
us to avoid the race condition where a device using shared interrupts
is put into a low power state with interrupts enabled and then an
interrupt (for another device) comes in and confuses its driver.

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
---
 kernel/power/disk.c |  109 +++++++++++++++++++++++++++++-----------------------
 1 file changed, 61 insertions(+), 48 deletions(-)

Index: linux-2.6/kernel/power/disk.c
===================================================================
--- linux-2.6.orig/kernel/power/disk.c
+++ linux-2.6/kernel/power/disk.c
@@ -228,13 +228,22 @@ static int create_image(int platform_mod
 		goto Unlock;
 	}

+	error = platform_pre_snapshot(platform_mode);
+	if (error || hibernation_test(TEST_PLATFORM))
+		goto Platform_finish;
+
+	error = disable_nonboot_cpus();
+	if (error || hibernation_test(TEST_CPUS)
+	    || hibernation_testmode(HIBERNATION_TEST))
+		goto Enable_cpus;
+
 	local_irq_disable();

 	sysdev_suspend(PMSG_FREEZE);
 	if (error) {
 		printk(KERN_ERR "PM: Some devices failed to power down, "
 			"aborting hibernation\n");
-		goto Power_up_devices;
+		goto Enable_irqs;
 	}

 	if (hibernation_test(TEST_CORE))
@@ -250,15 +259,22 @@ static int create_image(int platform_mod
 	restore_processor_state();
 	if (!in_suspend)
 		platform_leave(platform_mode);
+
  Power_up:
 	sysdev_resume();

 	/* NOTE:  device_power_up() is just a resume() for devices
 	 * that suspended with irqs off ... no overall powerup.
 	 */
- Power_up_devices:
+ Enable_irqs:
 	local_irq_enable();

+ Enable_cpus:
+	enable_nonboot_cpus();
+
+ Platform_finish:
+	platform_finish(platform_mode);
+
 	device_power_up(in_suspend ?
 		(error ? PMSG_RECOVER : PMSG_THAW) : PMSG_RESTORE);

@@ -298,25 +314,9 @@ int hibernation_snapshot(int platform_mo
 	if (hibernation_test(TEST_DEVICES))
 		goto Recover_platform;

-	error = platform_pre_snapshot(platform_mode);
-	if (error || hibernation_test(TEST_PLATFORM))
-		goto Finish;
-
-	error = disable_nonboot_cpus();
-	if (!error) {
-		if (hibernation_test(TEST_CPUS))
-			goto Enable_cpus;
-
-		if (hibernation_testmode(HIBERNATION_TEST))
-			goto Enable_cpus;
+	error = create_image(platform_mode);
+	/* Control returns here after successful restore */

-		error = create_image(platform_mode);
-		/* Control returns here after successful restore */
-	}
- Enable_cpus:
-	enable_nonboot_cpus();
- Finish:
-	platform_finish(platform_mode);
  Resume_devices:
 	device_resume(in_suspend ?
 		(error ? PMSG_RECOVER : PMSG_THAW) : PMSG_RESTORE);
@@ -338,7 +338,7 @@ int hibernation_snapshot(int platform_mo
  * kernel.
  */

-static int resume_target_kernel(void)
+static int resume_target_kernel(bool platform_mode)
 {
 	int error;

@@ -351,9 +351,20 @@ static int resume_target_kernel(void)
 		goto Unlock;
 	}

+	error = platform_pre_restore(platform_mode);
+	if (error)
+		goto Cleanup;
+
+	error = disable_nonboot_cpus();
+	if (error)
+		goto Enable_cpus;
+
 	local_irq_disable();

-	sysdev_suspend(PMSG_QUIESCE);
+	error = sysdev_suspend(PMSG_QUIESCE);
+	if (error)
+		goto Enable_irqs;
+
 	/* We'll ignore saved state, but this gets preempt count (etc) right */
 	save_processor_state();
 	error = restore_highmem();
@@ -379,8 +390,15 @@ static int resume_target_kernel(void)

 	sysdev_resume();

+ Enable_irqs:
 	local_irq_enable();

+ Enable_cpus:
+	enable_nonboot_cpus();
+
+ Cleanup:
+	platform_restore_cleanup(platform_mode);
+
 	device_power_up(PMSG_RECOVER);

  Unlock:
@@ -405,19 +423,10 @@ int hibernation_restore(int platform_mod
 	pm_prepare_console();
 	suspend_console();
 	error = device_suspend(PMSG_QUIESCE);
-	if (error)
-		goto Finish;
-
-	error = platform_pre_restore(platform_mode);
 	if (!error) {
-		error = disable_nonboot_cpus();
-		if (!error)
-			error = resume_target_kernel();
-		enable_nonboot_cpus();
+		error = resume_target_kernel(platform_mode);
+		device_resume(PMSG_RECOVER);
 	}
-	platform_restore_cleanup(platform_mode);
-	device_resume(PMSG_RECOVER);
- Finish:
 	resume_console();
 	pm_restore_console();
 	return error;
@@ -453,34 +462,38 @@ int hibernation_platform_enter(void)
 		goto Resume_devices;
 	}

+	device_pm_lock();
+
+	error = device_power_down(PMSG_HIBERNATE);
+	if (error)
+		goto Unlock;
+
 	error = hibernation_ops->prepare();
 	if (error)
-		goto Resume_devices;
+		goto Platofrm_finish;

 	error = disable_nonboot_cpus();
 	if (error)
-		goto Finish;
-
-	device_pm_lock();
-
-	error = device_power_down(PMSG_HIBERNATE);
-	if (!error) {
-		local_irq_disable();
-		sysdev_suspend(PMSG_HIBERNATE);
-		hibernation_ops->enter();
-		/* We should never get here */
-		while (1);
-	}
+		goto Platofrm_finish;

-	device_pm_unlock();
+	local_irq_disable();
+	sysdev_suspend(PMSG_HIBERNATE);
+	hibernation_ops->enter();
+	/* We should never get here */
+	while (1);

 	/*
 	 * We don't need to reenable the nonboot CPUs or resume consoles, since
 	 * the system is going to be halted anyway.
 	 */
- Finish:
+ Platofrm_finish:
 	hibernation_ops->finish();

+	device_power_up(PMSG_RESTORE);
+
+ Unlock:
+	device_pm_unlock();
+
  Resume_devices:
 	entering_platform_hibernation = false;
 	device_resume(PMSG_RESTORE);
* [PATCH 4/10] kexec: Change kexec jump code ordering
From: Rafael J. Wysocki @ 2009-03-11 9:39 UTC
To: pm list
Cc: LKML, Linus Torvalds, Ingo Molnar, Eric W. Biederman, Benjamin Herrenschmidt, Jeremy Fitzhardinge, Len Brown, Jesse Barnes, Thomas Gleixner, Frans Pop, Arve Hjønnevåg

From: Rafael J. Wysocki <rjw@sisk.pl>

Change the ordering of the kexec jump code so that the nonboot CPUs are
disabled after calling device drivers' "late suspend" methods. This change
reflects the recent modifications of the power management code that is also
used by kexec jump.

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
---
 kernel/kexec.c |   19 +++++++++----------
 1 file changed, 9 insertions(+), 10 deletions(-)

Index: linux-2.6/kernel/kexec.c
===================================================================
--- linux-2.6.orig/kernel/kexec.c
+++ linux-2.6/kernel/kexec.c
@@ -1450,9 +1450,6 @@ int kernel_kexec(void)
 		error = device_suspend(PMSG_FREEZE);
 		if (error)
 			goto Resume_console;
-		error = disable_nonboot_cpus();
-		if (error)
-			goto Resume_devices;
 		device_pm_lock();
 		/* At this point, device_suspend() has been called,
 		 * but *not* device_power_down(). We *must*
@@ -1463,13 +1460,15 @@ int kernel_kexec(void)
 		 */
 		error = device_power_down(PMSG_FREEZE);
 		if (error)
-			goto Unlock_pm;
-
+			goto Resume_devices;
+		error = disable_nonboot_cpus();
+		if (error)
+			goto Enable_cpus;
 		local_irq_disable();
 		/* Suspend system devices */
 		error = sysdev_suspend(PMSG_FREEZE);
 		if (error)
-			goto Power_up_devices;
+			goto Enable_irqs;
 	} else
 #endif
 	{
@@ -1483,13 +1482,13 @@ int kernel_kexec(void)
 #ifdef CONFIG_KEXEC_JUMP
 	if (kexec_image->preserve_context) {
 		sysdev_resume();
- Power_up_devices:
+ Enable_irqs:
 		local_irq_enable();
-		device_power_up(PMSG_RESTORE);
- Unlock_pm:
-		device_pm_unlock();
+ Enable_cpus:
 		enable_nonboot_cpus();
+		device_power_up(PMSG_RESTORE);
  Resume_devices:
+		device_pm_unlock();
 		device_resume(PMSG_RESTORE);
  Resume_console:
 		resume_console();

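The label ordering in the kexec hunks above follows the usual goto-unwind idiom: a failing step jumps to the label that undoes only what has been done so far, and the labels appear in the reverse of the setup order. The following is a minimal user-space sketch of that idiom, not kernel code. The step names only echo the ones in the patch, and `do_step()`, `note()`, `trace` and `jump_sequence()` are hypothetical helpers introduced for illustration.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-ins for the steps named in the patch; they only record
 * the call order so that the unwind path can be inspected afterwards. */
enum step { POWER_DOWN, DISABLE_CPUS, SYSDEV_SUSPEND };

static char trace[256];

/* Append one step name to the trace, separated by a space. */
static void note(const char *what)
{
	assert(strlen(trace) + strlen(what) + 2 < sizeof(trace));
	if (trace[0])
		strcat(trace, " ");
	strcat(trace, what);
}

/* Record the step and fail it if it is the one selected by fail_at. */
static int do_step(enum step s, int fail_at, const char *name)
{
	note(name);
	return (int)s == fail_at ? -1 : 0;
}

/* Mirrors the post-patch ordering: late suspend, then nonboot CPUs,
 * then interrupts off. Pass fail_at = -1 for the success path. */
static int jump_sequence(int fail_at)
{
	int error;

	trace[0] = '\0';
	note("pm_lock");

	error = do_step(POWER_DOWN, fail_at, "power_down");
	if (error)
		goto Resume_devices;
	error = do_step(DISABLE_CPUS, fail_at, "disable_cpus");
	if (error)
		goto Enable_cpus;
	note("irq_off");
	error = do_step(SYSDEV_SUSPEND, fail_at, "sysdev_suspend");
	if (error)
		goto Enable_irqs;

	note("jump");
	note("sysdev_resume");
 Enable_irqs:
	note("irq_on");
 Enable_cpus:
	note("enable_cpus");
	note("power_up");
 Resume_devices:
	note("pm_unlock");
	return error;
}
```

In this sketch, a failure in the `disable_cpus` step still runs `enable_cpus` and `power_up` before unlocking, which matches the `Enable_cpus` target chosen in the patch, while a failure in `power_down` goes straight to the unlock.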
* [PATCH 5/10] PCI PM: Consistently use variable name "error" for pm call return values
From: Rafael J. Wysocki @ 2009-03-11 9:41 UTC
To: pm list
Cc: Arve, Jeremy Fitzhardinge, Frans Pop, LKML, Jesse Barnes, Eric W. Biederman, Ingo Molnar, Linus Torvalds, Thomas Gleixner

From: Frans Pop <elendil@planet.nl>

I noticed two functions use a variable "i" to store the return value of PM
function calls while the rest of the file uses "error". As "i" normally
indicates a counter of some sort it seems better to keep this consistent.

Signed-off-by: Frans Pop <elendil@planet.nl>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
---
 drivers/pci/pci-driver.c |   20 ++++++++++----------
 1 file changed, 10 insertions(+), 10 deletions(-)

Index: linux-2.6/drivers/pci/pci-driver.c
===================================================================
--- linux-2.6.orig/drivers/pci/pci-driver.c
+++ linux-2.6/drivers/pci/pci-driver.c
@@ -352,17 +352,17 @@ static int pci_legacy_suspend(struct dev
 {
 	struct pci_dev * pci_dev = to_pci_dev(dev);
 	struct pci_driver * drv = pci_dev->driver;
-	int i = 0;
+	int error = 0;
 
 	if (drv && drv->suspend) {
 		pci_power_t prev = pci_dev->current_state;
 
 		pci_dev->state_saved = false;
 
-		i = drv->suspend(pci_dev, state);
-		suspend_report_result(drv->suspend, i);
-		if (i)
-			return i;
+		error = drv->suspend(pci_dev, state);
+		suspend_report_result(drv->suspend, error);
+		if (error)
+			return error;
 
 		if (pci_dev->state_saved)
 			goto Fixup;
@@ -385,20 +385,20 @@ static int pci_legacy_suspend(struct dev
  Fixup:
 	pci_fixup_device(pci_fixup_suspend, pci_dev);
 
-	return i;
+	return error;
 }
 
 static int pci_legacy_suspend_late(struct device *dev, pm_message_t state)
 {
 	struct pci_dev * pci_dev = to_pci_dev(dev);
 	struct pci_driver * drv = pci_dev->driver;
-	int i = 0;
+	int error = 0;
 
 	if (drv && drv->suspend_late) {
-		i = drv->suspend_late(pci_dev, state);
-		suspend_report_result(drv->suspend_late, i);
+		error = drv->suspend_late(pci_dev, state);
+		suspend_report_result(drv->suspend_late, error);
 	}
-	return i;
+	return error;
 }
 
 static int pci_legacy_resume_early(struct device *dev)

* [PATCH 6/10] PCI PM: Use pci_set_power_state during early resume
From: Rafael J. Wysocki @ 2009-03-11 9:42 UTC
To: pm list
Cc: LKML, Linus Torvalds, Ingo Molnar, Eric W. Biederman, Benjamin Herrenschmidt, Jeremy Fitzhardinge, Len Brown, Jesse Barnes, Thomas Gleixner, Frans Pop, Arve Hjønnevåg

From: Rafael J. Wysocki <rjw@sisk.pl>

Once we have allowed timer interrupts to be enabled during the early phase
of resuming devices, we are now able to use the generic
pci_set_power_state() to put PCI devices into D0 at that time. Then, the
platform-specific PM code will have a chance to handle devices that don't
implement the native PCI PM or that require some additional,
platform-specific operations to be carried out to power them up. Also, by
doing this we can simplify the code quite a bit.

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
---
 drivers/pci/pci.c |   48 +++++++++---------------------------------------
 1 file changed, 9 insertions(+), 39 deletions(-)

Index: linux-2.6/drivers/pci/pci.c
===================================================================
--- linux-2.6.orig/drivers/pci/pci.c
+++ linux-2.6/drivers/pci/pci.c
@@ -426,7 +426,6 @@ static inline int platform_pci_sleep_wak
  * given PCI device
  * @dev: PCI device to handle.
  * @state: PCI power state (D0, D1, D2, D3hot) to put the device into.
- * @wait: If 'true', wait for the device to change its power state
  *
  * RETURN VALUE:
  * -EINVAL if the requested state is invalid.
@@ -435,8 +434,7 @@ static inline int platform_pci_sleep_wak
  * 0 if device already is in the requested state.
  * 0 if device's power state has been successfully changed.
  */
-static int
-pci_raw_set_power_state(struct pci_dev *dev, pci_power_t state, bool wait)
+static int pci_raw_set_power_state(struct pci_dev *dev, pci_power_t state)
 {
 	u16 pmcsr;
 	bool need_restore = false;
@@ -481,10 +479,8 @@ pci_raw_set_power_state(struct pci_dev *
 		break;
 	case PCI_UNKNOWN: /* Boot-up */
 		if ((pmcsr & PCI_PM_CTRL_STATE_MASK) == PCI_D3hot
-		 && !(pmcsr & PCI_PM_CTRL_NO_SOFT_RESET)) {
+		 && !(pmcsr & PCI_PM_CTRL_NO_SOFT_RESET))
 			need_restore = true;
-			wait = true;
-		}
 		/* Fall-through: force to D0 */
 	default:
 		pmcsr = 0;
@@ -494,9 +490,6 @@ pci_raw_set_power_state(struct pci_dev *
 	/* enter specified state */
 	pci_write_config_word(dev, dev->pm_cap + PCI_PM_CTRL, pmcsr);
 
-	if (!wait)
-		return 0;
-
 	/* Mandatory power management transition delays */
 	/* see PCI PM 1.1 5.6.1 table 18 */
 	if (state == PCI_D3hot || dev->current_state == PCI_D3hot)
@@ -521,7 +514,7 @@ pci_raw_set_power_state(struct pci_dev *
 	if (need_restore)
 		pci_restore_bars(dev);
 
-	if (wait && dev->bus->self)
+	if (dev->bus->self)
 		pcie_aspm_pm_state_change(dev->bus->self);
 
 	return 0;
@@ -591,7 +584,7 @@ int pci_set_power_state(struct pci_dev *
 	if (state == PCI_D3hot && (dev->dev_flags & PCI_DEV_FLAGS_NO_D3))
 		return 0;
 
-	error = pci_raw_set_power_state(dev, state, true);
+	error = pci_raw_set_power_state(dev, state);
 
 	if (state > PCI_D0 && platform_pci_power_manageable(dev)) {
 		/* Allow the platform to finalize the transition */
@@ -1390,37 +1383,14 @@ void pci_allocate_cap_save_buffers(struc
  */
 int pci_restore_standard_config(struct pci_dev *dev)
 {
-	pci_power_t prev_state;
-	int error;
-
-	pci_update_current_state(dev, PCI_D0);
-
-	prev_state = dev->current_state;
-	if (prev_state == PCI_D0)
-		goto Restore;
-
-	error = pci_raw_set_power_state(dev, PCI_D0, false);
-	if (error)
-		return error;
+	pci_update_current_state(dev, PCI_UNKNOWN);
 
-	/*
-	 * This assumes that we won't get a bus in B2 or B3 from the BIOS, but
-	 * we've made this assumption forever and it appears to be universally
-	 * satisfied.
-	 */
-	switch(prev_state) {
-	case PCI_D3cold:
-	case PCI_D3hot:
-		mdelay(pci_pm_d3_delay);
-		break;
-	case PCI_D2:
-		udelay(PCI_PM_D2_DELAY);
-		break;
+	if (dev->current_state != PCI_D0) {
+		int error = pci_set_power_state(dev, PCI_D0);
+		if (error)
+			return error;
 	}
 
-	pci_update_current_state(dev, PCI_D0);
-
- Restore:
 	return dev->state_saved ? pci_restore_state(dev) : 0;
 }
 

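The simplification in the last hunk above relies on the generic state setter doing its own waiting, so the caller no longer needs a hand-written delay switch. A minimal user-space model of the resulting control flow might look like the sketch below. It is an illustration under stated assumptions, not the kernel implementation: `struct fake_dev`, the helper functions and both delay constants are hypothetical, and only the shape of `restore_standard_config()` is taken from the patch.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified model; these are not the kernel's types. */
enum power_state { D0, D1, D2, D3HOT, UNKNOWN };

struct fake_dev {
	enum power_state hw_state;      /* what the "hardware" reports */
	enum power_state current_state; /* what the core believes */
	bool state_saved;               /* config space was saved on suspend */
	unsigned int waited_us;         /* total transition delay charged */
	int restores;                   /* number of config-space restores */
};

/* Delay values assumed for illustration only. */
#define D3_DELAY_US 10000u
#define D2_DELAY_US 200u

/* Refresh the cached state from the modelled hardware. */
static void update_current_state(struct fake_dev *dev)
{
	dev->current_state = dev->hw_state;
}

/* Generic setter: the transition and its mandatory wait live in one place. */
static int set_power_state(struct fake_dev *dev, enum power_state state)
{
	assert(state != UNKNOWN);
	if (dev->current_state == state)
		return 0;
	if (state == D3HOT || dev->current_state == D3HOT)
		dev->waited_us += D3_DELAY_US;
	else if (state == D2 || dev->current_state == D2)
		dev->waited_us += D2_DELAY_US;
	dev->hw_state = state;
	dev->current_state = state;
	return 0;
}

/* Same shape as the post-patch function: refresh the state, power up only
 * if needed, then restore the saved configuration if there is one. */
static int restore_standard_config(struct fake_dev *dev)
{
	update_current_state(dev);

	if (dev->current_state != D0) {
		int error = set_power_state(dev, D0);
		if (error)
			return error;
	}

	if (dev->state_saved)
		dev->restores++;
	return 0;
}
```

A device that the model reports in D3hot is charged the D3 delay once and then restored, while a device already in D0 skips the transition entirely.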
* [PATCH 7/10] PCI PM: Move pci_restore_standard_config to pci-driver.c
From: Rafael J. Wysocki @ 2009-03-11 9:47 UTC
To: pm list
Cc: Arve, Jeremy Fitzhardinge, Frans Pop, LKML, Jesse Barnes, Eric W. Biederman, Ingo Molnar, Linus Torvalds, Thomas Gleixner

From: Rafael J. Wysocki <rjw@sisk.pl>

Move pci_restore_standard_config() from pci.c to pci-driver.c and make it
static.

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
---
 drivers/pci/pci-driver.c |   17 +++++++++++++++++
 drivers/pci/pci.c        |   21 ---------------------
 drivers/pci/pci.h        |    1 -
 3 files changed, 17 insertions(+), 22 deletions(-)

Index: linux-2.6/drivers/pci/pci-driver.c
===================================================================
--- linux-2.6.orig/drivers/pci/pci-driver.c
+++ linux-2.6/drivers/pci/pci-driver.c
@@ -423,6 +423,23 @@ static int pci_legacy_resume(struct devi
 
 /* Auxiliary functions used by the new power management framework */
 
+/**
+ * pci_restore_standard_config - restore standard config registers of PCI device
+ * @pci_dev: PCI device to handle
+ */
+static int pci_restore_standard_config(struct pci_dev *pci_dev)
+{
+	pci_update_current_state(pci_dev, PCI_UNKNOWN);
+
+	if (pci_dev->current_state != PCI_D0) {
+		int error = pci_set_power_state(pci_dev, PCI_D0);
+		if (error)
+			return error;
+	}
+
+	return pci_dev->state_saved ? pci_restore_state(pci_dev) : 0;
+}
+
 static void pci_pm_default_resume_noirq(struct pci_dev *pci_dev)
 {
 	pci_restore_standard_config(pci_dev);
Index: linux-2.6/drivers/pci/pci.c
===================================================================
--- linux-2.6.orig/drivers/pci/pci.c
+++ linux-2.6/drivers/pci/pci.c
@@ -1374,27 +1374,6 @@ void pci_allocate_cap_save_buffers(struc
 }
 
 /**
- * pci_restore_standard_config - restore standard config registers of PCI device
- * @dev: PCI device to handle
- *
- * This function assumes that the device's configuration space is accessible.
- * If the device needs to be powered up, the function will wait for it to
- * change the state.
- */
-int pci_restore_standard_config(struct pci_dev *dev)
-{
-	pci_update_current_state(dev, PCI_UNKNOWN);
-
-	if (dev->current_state != PCI_D0) {
-		int error = pci_set_power_state(dev, PCI_D0);
-		if (error)
-			return error;
-	}
-
-	return dev->state_saved ? pci_restore_state(dev) : 0;
-}
-
-/**
  * pci_enable_ari - enable ARI forwarding if hardware support it
  * @dev: the PCI device
  */
Index: linux-2.6/drivers/pci/pci.h
===================================================================
--- linux-2.6.orig/drivers/pci/pci.h
+++ linux-2.6/drivers/pci/pci.h
@@ -49,7 +49,6 @@ extern void pci_disable_enabled_device(s
 extern void pci_pm_init(struct pci_dev *dev);
 extern void platform_pci_wakeup_init(struct pci_dev *dev);
 extern void pci_allocate_cap_save_buffers(struct pci_dev *dev);
-extern int pci_restore_standard_config(struct pci_dev *dev);
 
 static inline bool pci_is_bridge(struct pci_dev *pci_dev)
 {

* [PATCH 8/10] PCI PM: Put devices into low power states during late suspend (rev. 2) 2009-03-11 9:30 ` Rafael J. Wysocki ` (12 preceding siblings ...) 2009-03-11 9:47 ` Rafael J. Wysocki @ 2009-03-11 9:48 ` Rafael J. Wysocki 2009-03-11 9:48 ` Rafael J. Wysocki ` (4 subsequent siblings) 18 siblings, 0 replies; 373+ messages in thread From: Rafael J. Wysocki @ 2009-03-11 9:48 UTC (permalink / raw) To: pm list Cc: Arve, Jeremy Fitzhardinge, Frans Pop, LKML, Jesse Barnes, Eric W. Biederman, Ingo Molnar, Linus Torvalds, Thomas Gleixner From: Rafael J. Wysocki <rjw@sisk.pl> Once we have allowed timer interrupts to be enabled during the late phase of suspending devices, we are now able to use the generic pci_set_power_state() to put PCI devices into low power states at that time. We can also use some related platform callbacks, like the ones preparing devices for wake-up, during the late suspend. Doing this will allow us to avoid the race condition where a device using shared interrupts is put into a low power state with interrupts enabled and then an interrupt (for another device) comes in and confuses its driver. At the same time, devices that don't support the native PCI PM or that require some additional, platform-specific operations to be carried out to put them into low power states will be handled as appropriate. Signed-off-by: Rafael J. 
Wysocki <rjw@sisk.pl> --- drivers/pci/pci-driver.c | 134 ++++++++++++++++++++++++++++------------------- 1 file changed, 81 insertions(+), 53 deletions(-) Index: linux-2.6/drivers/pci/pci-driver.c =================================================================== --- linux-2.6.orig/drivers/pci/pci-driver.c +++ linux-2.6/drivers/pci/pci-driver.c @@ -352,53 +352,60 @@ static int pci_legacy_suspend(struct dev { struct pci_dev * pci_dev = to_pci_dev(dev); struct pci_driver * drv = pci_dev->driver; - int error = 0; + + pci_dev->state_saved = false; if (drv && drv->suspend) { pci_power_t prev = pci_dev->current_state; - - pci_dev->state_saved = false; + int error; error = drv->suspend(pci_dev, state); suspend_report_result(drv->suspend, error); if (error) return error; - if (pci_dev->state_saved) - goto Fixup; - - if (pci_dev->current_state != PCI_D0 + if (!pci_dev->state_saved && pci_dev->current_state != PCI_D0 && pci_dev->current_state != PCI_UNKNOWN) { WARN_ONCE(pci_dev->current_state != prev, "PCI PM: Device state not saved by %pF\n", drv->suspend); - goto Fixup; } } - pci_save_state(pci_dev); - /* - * This is for compatibility with existing code with legacy PM support. 
- */ - pci_pm_set_unknown_state(pci_dev); - - Fixup: pci_fixup_device(pci_fixup_suspend, pci_dev); - return error; + return 0; } static int pci_legacy_suspend_late(struct device *dev, pm_message_t state) { struct pci_dev * pci_dev = to_pci_dev(dev); struct pci_driver * drv = pci_dev->driver; - int error = 0; if (drv && drv->suspend_late) { + pci_power_t prev = pci_dev->current_state; + int error; + error = drv->suspend_late(pci_dev, state); suspend_report_result(drv->suspend_late, error); + if (error) + return error; + + if (!pci_dev->state_saved && pci_dev->current_state != PCI_D0 + && pci_dev->current_state != PCI_UNKNOWN) { + WARN_ONCE(pci_dev->current_state != prev, + "PCI PM: Device state not saved by %pF\n", + drv->suspend_late); + return 0; + } } - return error; + + if (!pci_dev->state_saved) + pci_save_state(pci_dev); + + pci_pm_set_unknown_state(pci_dev); + + return 0; } static int pci_legacy_resume_early(struct device *dev) @@ -460,7 +467,6 @@ static void pci_pm_default_suspend(struc /* Disable non-bridge devices without PM support */ if (!pci_is_bridge(pci_dev)) pci_disable_enabled_device(pci_dev); - pci_save_state(pci_dev); } static bool pci_has_legacy_pm_support(struct pci_dev *pci_dev) @@ -526,24 +532,14 @@ static int pci_pm_suspend(struct device if (error) return error; - if (pci_dev->state_saved) - goto Fixup; - - if (pci_dev->current_state != PCI_D0 + if (!pci_dev->state_saved && pci_dev->current_state != PCI_D0 && pci_dev->current_state != PCI_UNKNOWN) { WARN_ONCE(pci_dev->current_state != prev, "PCI PM: State of device not saved by %pF\n", pm->suspend); - goto Fixup; } } - if (!pci_dev->state_saved) { - pci_save_state(pci_dev); - if (!pci_is_bridge(pci_dev)) - pci_prepare_to_sleep(pci_dev); - } - Fixup: pci_fixup_device(pci_fixup_suspend, pci_dev); @@ -553,21 +549,41 @@ static int pci_pm_suspend(struct device static int pci_pm_suspend_noirq(struct device *dev) { struct pci_dev *pci_dev = to_pci_dev(dev); - struct device_driver *drv = dev->driver; 
- int error = 0; + struct dev_pm_ops *pm = dev->driver ? dev->driver->pm : NULL; if (pci_has_legacy_pm_support(pci_dev)) return pci_legacy_suspend_late(dev, PMSG_SUSPEND); - if (drv && drv->pm && drv->pm->suspend_noirq) { - error = drv->pm->suspend_noirq(dev); - suspend_report_result(drv->pm->suspend_noirq, error); + if (!pm) + return 0; + + if (pm->suspend_noirq) { + pci_power_t prev = pci_dev->current_state; + int error; + + error = pm->suspend_noirq(dev); + suspend_report_result(pm->suspend_noirq, error); + if (error) + return error; + + if (!pci_dev->state_saved && pci_dev->current_state != PCI_D0 + && pci_dev->current_state != PCI_UNKNOWN) { + WARN_ONCE(pci_dev->current_state != prev, + "PCI PM: State of device not saved by %pF\n", + pm->suspend_noirq); + return 0; + } } - if (!error) - pci_pm_set_unknown_state(pci_dev); + if (!pci_dev->state_saved) { + pci_save_state(pci_dev); + if (!pci_is_bridge(pci_dev)) + pci_prepare_to_sleep(pci_dev); + } - return error; + pci_pm_set_unknown_state(pci_dev); + + return 0; } static int pci_pm_resume_noirq(struct device *dev) @@ -650,9 +666,6 @@ static int pci_pm_freeze(struct device * return error; } - if (!pci_dev->state_saved) - pci_save_state(pci_dev); - return 0; } @@ -660,20 +673,25 @@ static int pci_pm_freeze_noirq(struct de { struct pci_dev *pci_dev = to_pci_dev(dev); struct device_driver *drv = dev->driver; - int error = 0; if (pci_has_legacy_pm_support(pci_dev)) return pci_legacy_suspend_late(dev, PMSG_FREEZE); if (drv && drv->pm && drv->pm->freeze_noirq) { + int error; + error = drv->pm->freeze_noirq(dev); suspend_report_result(drv->pm->freeze_noirq, error); + if (error) + return error; } - if (!error) - pci_pm_set_unknown_state(pci_dev); + if (!pci_dev->state_saved) + pci_save_state(pci_dev); - return error; + pci_pm_set_unknown_state(pci_dev); + + return 0; } static int pci_pm_thaw_noirq(struct device *dev) @@ -716,7 +734,6 @@ static int pci_pm_poweroff(struct device { struct pci_dev *pci_dev = to_pci_dev(dev); 
struct dev_pm_ops *pm = dev->driver ? dev->driver->pm : NULL; - int error = 0; if (pci_has_legacy_pm_support(pci_dev)) return pci_legacy_suspend(dev, PMSG_HIBERNATE); @@ -729,33 +746,44 @@ static int pci_pm_poweroff(struct device pci_dev->state_saved = false; if (pm->poweroff) { + int error; + error = pm->poweroff(dev); suspend_report_result(pm->poweroff, error); + if (error) + return error; } - if (!pci_dev->state_saved && !pci_is_bridge(pci_dev)) - pci_prepare_to_sleep(pci_dev); - Fixup: pci_fixup_device(pci_fixup_suspend, pci_dev); - return error; + return 0; } static int pci_pm_poweroff_noirq(struct device *dev) { + struct pci_dev *pci_dev = to_pci_dev(dev); struct device_driver *drv = dev->driver; - int error = 0; if (pci_has_legacy_pm_support(to_pci_dev(dev))) return pci_legacy_suspend_late(dev, PMSG_HIBERNATE); - if (drv && drv->pm && drv->pm->poweroff_noirq) { + if (!drv || !drv->pm) + return 0; + + if (drv->pm->poweroff_noirq) { + int error; + error = drv->pm->poweroff_noirq(dev); suspend_report_result(drv->pm->poweroff_noirq, error); + if (error) + return error; } - return error; + if (!pci_dev->state_saved && !pci_is_bridge(pci_dev)) + pci_prepare_to_sleep(pci_dev); + + return 0; } static int pci_pm_restore_noirq(struct device *dev) ^ permalink raw reply [flat|nested] 373+ messages in thread
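The control flow that this patch gives pci_pm_suspend_noirq() can be sketched in user space roughly as follows. This is only a simplified model under stated assumptions: struct model_pci_dev, its fields and all model_* functions are hypothetical stand-ins rather than kernel interfaces, the WARN_ONCE() check and the pci_pm_set_unknown_state() call are left out, and the two boolean assignments merely mark where pci_save_state() and pci_prepare_to_sleep() would run.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical user-space stand-in for struct pci_dev; not a kernel type. */
struct model_pci_dev {
	bool state_saved;	/* mirrors pci_dev->state_saved */
	bool is_bridge;		/* mirrors pci_is_bridge(pci_dev) */
	bool low_power;		/* set where pci_prepare_to_sleep() would run */
	int (*suspend_noirq)(struct model_pci_dev *dev); /* driver callback */
};

/* Example driver callback that saves the device state on its own. */
int model_driver_saves_state(struct model_pci_dev *dev)
{
	dev->state_saved = true;
	return 0;
}

/* Example driver callback that reports a failure. */
int model_driver_fails(struct model_pci_dev *dev)
{
	(void)dev;
	return -EBUSY;
}

/*
 * Model of the "late" suspend step after the patch: run the driver's
 * callback first, and only if the driver did not save the state let the
 * core save it and (for non-bridges) put the device into low power.
 */
int model_suspend_noirq(struct model_pci_dev *dev)
{
	if (dev->suspend_noirq) {
		int error = dev->suspend_noirq(dev);

		if (error)
			return error;	/* nothing is saved on failure */
	}

	if (!dev->state_saved) {
		dev->state_saved = true;	/* pci_save_state() */
		if (!dev->is_bridge)
			dev->low_power = true;	/* pci_prepare_to_sleep() */
	}

	return 0;
}
```

In this model a driver that sets state_saved itself is left alone by the core, which matches the convention in the patch that a driver saving the state takes over responsibility for the device's power state.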
* [PATCH 9/10] PCI PM: Make pci_set_power_state() handle devices with no PM support 2009-03-11 9:30 ` Rafael J. Wysocki ` (14 preceding siblings ...) 2009-03-11 9:48 ` Rafael J. Wysocki @ 2009-03-11 9:55 ` Rafael J. Wysocki 2009-03-11 9:55 ` Rafael J. Wysocki ` (2 subsequent siblings) 18 siblings, 0 replies; 373+ messages in thread From: Rafael J. Wysocki @ 2009-03-11 9:55 UTC (permalink / raw) To: pm list Cc: Arve, Jeremy Fitzhardinge, Frans Pop, LKML, Jesse Barnes, Eric W. Biederman, Ingo Molnar, Linus Torvalds, Thomas Gleixner From: Rafael J. Wysocki <rjw@sisk.pl> There is a problem with PCI devices without any PM support (either native or through the platform) that pci_set_power_state() always returns error code for them, even if they are being put into D0. However, such devices are always in D0, so pci_set_power_state() should return success when attempting to put such a device into D0. It also should update the current_state field for these devices as appropriate. This modification is necessary so that the standard configuration registers of these devices are successfully restored by pci_restore_standard_config() during the "early" phase of resume. In addition, pci_set_power_state() should check the value of current_state before calling the platform to change the power state of the device to avoid doing that unnecessarily. Signed-off-by: Rafael J. 
Wysocki <rjw@sisk.pl> --- drivers/pci/pci.c | 18 ++++++++++++------ 1 file changed, 12 insertions(+), 6 deletions(-) Index: linux-2.6/drivers/pci/pci.c =================================================================== --- linux-2.6.orig/drivers/pci/pci.c +++ linux-2.6/drivers/pci/pci.c @@ -439,6 +439,10 @@ static int pci_raw_set_power_state(struc u16 pmcsr; bool need_restore = false; + /* Check if we're already there */ + if (dev->current_state == state) + return 0; + if (!dev->pm_cap) return -EIO; @@ -449,10 +453,7 @@ static int pci_raw_set_power_state(struc * Can enter D0 from any state, but if we can only go deeper * to sleep if we're already in a low power state */ - if (dev->current_state == state) { - /* we're already there */ - return 0; - } else if (state != PCI_D0 && dev->current_state <= PCI_D3cold + if (state != PCI_D0 && dev->current_state <= PCI_D3cold && dev->current_state > state) { dev_err(&dev->dev, "invalid power transition " "(from state %d to %d)\n", dev->current_state, state); @@ -570,12 +571,17 @@ int pci_set_power_state(struct pci_dev * */ return 0; - if (state == PCI_D0 && platform_pci_power_manageable(dev)) { + /* Check if we're already there */ + if (dev->current_state == state) + return 0; + + if (state == PCI_D0) { /* * Allow the platform to change the state, for example via ACPI * _PR0, _PS0 and some such, but do not trust it. */ - int ret = platform_pci_set_power_state(dev, PCI_D0); + int ret = platform_pci_power_manageable(dev) ? + platform_pci_set_power_state(dev, PCI_D0) : 0; if (!ret) pci_update_current_state(dev, PCI_D0); } ^ permalink raw reply [flat|nested] 373+ messages in thread
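The effect of this change on a device without any PM support can be illustrated with a small user-space model. It is a sketch, not the kernel code: struct model_dev, enum model_power_state and the model_* functions are hypothetical, the platform is assumed to be unable to manage the device (so the platform step is a no-op that reports success), and recording D0 directly for a device without a PM capability is an assumption about what pci_update_current_state() does in that case.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical power states; the ordering loosely follows pci_power_t. */
enum model_power_state {
	MODEL_D0,
	MODEL_D1,
	MODEL_D2,
	MODEL_D3HOT,
	MODEL_D3COLD,
	MODEL_UNKNOWN
};

/* Hypothetical stand-in for the relevant fields of struct pci_dev. */
struct model_dev {
	enum model_power_state current_state;
	bool has_pm_cap;	/* models dev->pm_cap != 0 */
};

/* Model of pci_raw_set_power_state() after the patch. */
int model_raw_set_power_state(struct model_dev *dev,
			      enum model_power_state state)
{
	/* The "already there" check now comes before the pm_cap check. */
	if (dev->current_state == state)
		return 0;

	if (!dev->has_pm_cap)
		return -EIO;

	dev->current_state = state;	/* stands in for the PMCSR write */
	return 0;
}

/* Model of pci_set_power_state() with no platform PM available. */
int model_set_power_state(struct model_dev *dev,
			  enum model_power_state state)
{
	if (dev->current_state == state)
		return 0;

	/*
	 * Going to D0: the platform step trivially succeeds here, so the
	 * current state is updated. Assumption: without a PM capability
	 * the update simply records D0.
	 */
	if (state == MODEL_D0 && !dev->has_pm_cap)
		dev->current_state = MODEL_D0;

	return model_raw_set_power_state(dev, state);
}
```

With this ordering, a request to put a PM-less device into D0 succeeds and leaves current_state at D0, while a request for any low power state still fails with -EIO, which is the behavior the changelog above asks for.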
* [PATCH 10/10] PCI PM: Restore config spaces of all devices during early resume 2009-03-11 9:30 ` Rafael J. Wysocki ` (16 preceding siblings ...) 2009-03-11 9:55 ` Rafael J. Wysocki @ 2009-03-11 9:56 ` Rafael J. Wysocki 2009-03-11 9:56 ` Rafael J. Wysocki 18 siblings, 0 replies; 373+ messages in thread From: Rafael J. Wysocki @ 2009-03-11 9:56 UTC (permalink / raw) To: pm list Cc: Arve, Jeremy Fitzhardinge, Frans Pop, LKML, Jesse Barnes, Eric W. Biederman, Ingo Molnar, Linus Torvalds, Thomas Gleixner From: Rafael J. Wysocki <rjw@sisk.pl> At present the configuration spaces of PCI devices that have no drivers or no PM support in the drivers (either legacy or through a pm object) are not saved during suspend and, consequently, they are not restored during resume. This generally may lead to the state of the system being slightly inconsistent after the resume, so it's better to save and restore the configuration spaces of these devices as well. Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> --- drivers/pci/pci-driver.c | 16 +++++++++------- 1 file changed, 9 insertions(+), 7 deletions(-) Index: linux-2.6/drivers/pci/pci-driver.c =================================================================== --- linux-2.6.orig/drivers/pci/pci-driver.c +++ linux-2.6/drivers/pci/pci-driver.c @@ -516,13 +516,13 @@ static int pci_pm_suspend(struct device if (pci_has_legacy_pm_support(pci_dev)) return pci_legacy_suspend(dev, PMSG_SUSPEND); + pci_dev->state_saved = false; + if (!pm) { pci_pm_default_suspend(pci_dev); goto Fixup; } - pci_dev->state_saved = false; - if (pm->suspend) { pci_power_t prev = pci_dev->current_state; int error; @@ -554,8 +554,10 @@ static int pci_pm_suspend_noirq(struct d if (pci_has_legacy_pm_support(pci_dev)) return pci_legacy_suspend_late(dev, PMSG_SUSPEND); - if (!pm) + if (!pm) { + pci_save_state(pci_dev); return 0; + } if (pm->suspend_noirq) { pci_power_t prev = pci_dev->current_state; @@ -650,13 +652,13 @@ static int pci_pm_freeze(struct device * if 
(pci_has_legacy_pm_support(pci_dev)) return pci_legacy_suspend(dev, PMSG_FREEZE); + pci_dev->state_saved = false; + if (!pm) { pci_pm_default_suspend(pci_dev); return 0; } - pci_dev->state_saved = false; - if (pm->freeze) { int error; @@ -738,13 +740,13 @@ static int pci_pm_poweroff(struct device if (pci_has_legacy_pm_support(pci_dev)) return pci_legacy_suspend(dev, PMSG_HIBERNATE); + pci_dev->state_saved = false; + if (!pm) { pci_pm_default_suspend(pci_dev); goto Fixup; } - pci_dev->state_saved = false; - if (pm->poweroff) { int error; ^ permalink raw reply [flat|nested] 373+ messages in thread
* [PATCH 0/11] PM: Rework suspend-resume ordering to avoid problems with shared interrupts (updated 2x) 2009-02-22 17:37 ` Rafael J. Wysocki ` (13 preceding siblings ...) (?) @ 2009-03-14 11:24 ` Rafael J. Wysocki 2009-03-14 11:26 ` [PATCH 1/11] PM: Introduce functions for suspending and resuming device interrupts Rafael J. Wysocki ` (23 more replies) -1 siblings, 24 replies; 373+ messages in thread From: Rafael J. Wysocki @ 2009-03-14 11:24 UTC (permalink / raw) To: pm list Cc: LKML, Linus Torvalds, Ingo Molnar, Eric W. Biederman, Benjamin Herrenschmidt, Jeremy Fitzhardinge, Len Brown, Jesse Barnes, Thomas Gleixner, Arve Hjønnevåg, Linux PCI Hi, This is an update of the patch series reworking the handling of interrupts during suspend-resume, addressing some comments from Thomas and Ingo. The following patches modify the way in which we handle disabling interrupts during suspend and enabling them during resume. They also change the ordering of the core suspend and hibernation code to take advantage of the new approach to the interrupts and modify the PCI PM core to avoid a few problems. Namely, interrupts are currently disabled on the boot CPU as soon as the nonboot CPUs have been disabled, which doesn't allow device drivers' "late" suspend and "early" resume callbacks to sleep. Among other things this means they cannot execute ACPI AML routines, which leads to problems with suspend-resume of PCI devices, as recently discussed. 1/11 introduces helper functions used by the subsequent patches. 2/11 modifies the [suspend|hibernation] and resume code, as well as the other code using the device PM framework, so that device drivers will not receive interrupts during the "late" suspend phase, although interrupts will only be disabled on the CPU right before calling sysdev_suspend() (and analogously during resume).
3/11 - 5/11 modify the suspend, hibernation and kexec jump code, respectively, so that the "late" phase of suspending devices will happen before executing the platform "prepare" callback and disabling nonboot CPUs (and analogously during resume). 6/11 is a patch that's already in the PCI linux-next tree and I included it in the series, because the next patches depend on it. 7/11 makes the PCI PM core use pci_set_power_state() to put devices into D0 during early resume, which allows the platform-specific operations to be carried out at that time, if necessary. 8/11 uses the opportunity to move pci_restore_standard_config() to pci-driver.c, where it belongs IMO. 9/11 makes the PCI PM core code put devices into low power states during the "late" phase of suspend which allows us to avoid a long-standing race related to shared interrupts and to handle devices that require some platform-specific operations to be put into low power states appropriately at the same time. [The second rev of the patch retains the current behavior during the "power-off" phase of hibernation, which is that the devices without drivers or without PM support in the drivers are not power managed by the core.] 10/11 fixes pci_set_power_state() so that it doesn't return error code when attempting to put a PCI device without PM support (either native or through the platform) into D0 (such devices are always in D0). 11/11 makes the PCI PM core save and restore the configuration spaces of devices that have no drivers or no PM support in the drivers during suspend and resume, respectively. There is a git tree containing these patches, for easier testing, at: git://git.kernel.org/pub/scm/linux/kernel/git/rafael/suspend-2.6.git (linux-next branch). At the moment it has a merge conflict with the PCI linux-next tree due to 6/11. Thanks, Rafael ^ permalink raw reply [flat|nested] 373+ messages in thread
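The suspend-side ordering that the cover letter above describes can be written down as a small table. This is only one reading of the description, not code from the series: the enum, the step names and the model_* helpers are all hypothetical, and resume is assumed to walk the same steps in reverse.

```c
#include <stddef.h>

/* Hypothetical labels for the suspend steps named in the cover letter. */
enum model_step {
	STEP_SUSPEND_DEVICES,		/* regular driver suspend callbacks */
	STEP_SUSPEND_DEVICE_IRQS,	/* suspend_device_irqs() */
	STEP_LATE_SUSPEND,		/* "late" suspend callbacks, may sleep */
	STEP_PLATFORM_PREPARE,		/* platform "prepare" callback */
	STEP_DISABLE_NONBOOT_CPUS,
	STEP_DISABLE_CPU_IRQS,		/* interrupts off on the boot CPU */
	STEP_SYSDEV_SUSPEND,		/* sysdev_suspend() */
	STEP_COUNT
};

/* Human-readable name of a step, for logging or printing. */
const char *model_step_name(enum model_step step)
{
	static const char *const names[STEP_COUNT] = {
		"suspend devices",
		"suspend device irqs",
		"late suspend",
		"platform prepare",
		"disable nonboot cpus",
		"disable cpu irqs",
		"sysdev suspend",
	};

	return step < STEP_COUNT ? names[step] : "unknown";
}

/*
 * Fill 'out' with the steps in the order described above and return the
 * number of steps written, or 0 if 'cap' is too small.
 */
size_t model_suspend_order(enum model_step *out, size_t cap)
{
	size_t i;

	if (cap < STEP_COUNT)
		return 0;

	for (i = 0; i < STEP_COUNT; i++)
		out[i] = (enum model_step)i;

	return STEP_COUNT;
}
```

The point of the table is that timer interrupts stay enabled throughout the "late" suspend step, because interrupts are only switched off on the CPU immediately before the sysdev step.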
* [PATCH 1/11] PM: Introduce functions for suspending and resuming device interrupts
  2009-03-14 11:24 ` [PATCH 0/11] PM: Rework suspend-resume ordering to avoid problems with shared interrupts (updated 2x) Rafael J. Wysocki
@ 2009-03-14 11:26 ` Rafael J. Wysocki
From: Rafael J. Wysocki @ 2009-03-14 11:26 UTC
To: pm list
Cc: LKML, Linus Torvalds, Ingo Molnar, Eric W. Biederman, Benjamin Herrenschmidt, Jeremy Fitzhardinge, Len Brown, Jesse Barnes, Thomas Gleixner, Arve Hjønnevåg, Linux PCI

From: Rafael J. Wysocki <rjw@sisk.pl>

Introduce helper functions allowing us to prevent device drivers from getting any interrupts (without disabling interrupts on the CPU) during suspend (or hibernation) and to make them start to receive interrupts again during the subsequent resume. These functions make it possible to keep timer interrupts enabled while the "late" suspend and "early" resume callbacks provided by device drivers are being executed. In turn, this allows device drivers' "late" suspend and "early" resume callbacks to sleep, execute ACPI callbacks etc.

The functions introduced here will be used to rework the handling of interrupts during suspend (hibernation) and resume. Namely, interrupts will only be disabled on the CPU right before suspending sysdevs, while device drivers will be prevented from receiving interrupts, with the help of the new helper function, before their "late" suspend callbacks run (and analogously during resume).

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
---
 include/linux/interrupt.h |    5 ++
 include/linux/irq.h       |    1
 kernel/irq/Makefile       |    1
 kernel/irq/internals.h    |    2 +
 kernel/irq/manage.c       |   31 +++++++++++++-----
 kernel/irq/pm.c           |   79 ++++++++++++++++++++++++++++++++++++++++++++++
 6 files changed, 112 insertions(+), 7 deletions(-)

Index: linux-2.6/include/linux/interrupt.h
===================================================================
--- linux-2.6.orig/include/linux/interrupt.h
+++ linux-2.6/include/linux/interrupt.h
@@ -106,6 +106,11 @@ extern void disable_irq_nosync(unsigned
 extern void disable_irq(unsigned int irq);
 extern void enable_irq(unsigned int irq);
 
+/* The following three functions are for the core kernel use only. */
+extern void suspend_device_irqs(void);
+extern void resume_device_irqs(void);
+extern int check_wakeup_irqs(void);
+
 #if defined(CONFIG_SMP) && defined(CONFIG_GENERIC_HARDIRQS)
 
 extern cpumask_var_t irq_default_affinity;
Index: linux-2.6/include/linux/irq.h
===================================================================
--- linux-2.6.orig/include/linux/irq.h
+++ linux-2.6/include/linux/irq.h
@@ -65,6 +65,7 @@ typedef void (*irq_flow_handler_t)(unsig
 #define IRQ_SPURIOUS_DISABLED	0x00800000	/* IRQ was disabled by the spurious trap */
 #define IRQ_MOVE_PCNTXT		0x01000000	/* IRQ migration from process context */
 #define IRQ_AFFINITY_SET	0x02000000	/* IRQ affinity was set from userspace*/
+#define IRQ_SUSPENDED		0x04000000	/* IRQ has gone through suspend sequence */
 
 #ifdef CONFIG_IRQ_PER_CPU
 # define CHECK_IRQ_PER_CPU(var) ((var) & IRQ_PER_CPU)
Index: linux-2.6/kernel/irq/pm.c
===================================================================
--- /dev/null
+++ linux-2.6/kernel/irq/pm.c
@@ -0,0 +1,79 @@
+/*
+ * linux/kernel/irq/pm.c
+ *
+ * Copyright (C) 2009 Rafael J. Wysocki <rjw@sisk.pl>, Novell Inc.
+ *
+ * This file contains power management functions related to interrupts.
+ */
+
+#include <linux/irq.h>
+#include <linux/module.h>
+#include <linux/interrupt.h>
+
+#include "internals.h"
+
+/**
+ * suspend_device_irqs - disable all currently enabled interrupt lines
+ *
+ * During system-wide suspend or hibernation device interrupts need to be
+ * disabled at the chip level and this function is provided for this purpose.
+ * It disables all interrupt lines that are enabled at the moment and sets the
+ * IRQ_SUSPENDED flag for them.
+ */
+void suspend_device_irqs(void)
+{
+	struct irq_desc *desc;
+	int irq;
+
+	for_each_irq_desc(irq, desc) {
+		unsigned long flags;
+
+		spin_lock_irqsave(&desc->lock, flags);
+		__disable_irq(desc, irq, true);
+		spin_unlock_irqrestore(&desc->lock, flags);
+	}
+
+	for_each_irq_desc(irq, desc)
+		if (desc->status & IRQ_SUSPENDED)
+			synchronize_irq(irq);
+}
+EXPORT_SYMBOL_GPL(suspend_device_irqs);
+
+/**
+ * resume_device_irqs - enable interrupt lines disabled by suspend_device_irqs()
+ *
+ * Enable all interrupt lines previously disabled by suspend_device_irqs() that
+ * have the IRQ_SUSPENDED flag set.
+ */
+void resume_device_irqs(void)
+{
+	struct irq_desc *desc;
+	int irq;
+
+	for_each_irq_desc(irq, desc) {
+		unsigned long flags;
+
+		if (!(desc->status & IRQ_SUSPENDED))
+			continue;
+
+		spin_lock_irqsave(&desc->lock, flags);
+		__enable_irq(desc, irq, true);
+		spin_unlock_irqrestore(&desc->lock, flags);
+	}
+}
+EXPORT_SYMBOL_GPL(resume_device_irqs);
+
+/**
+ * check_wakeup_irqs - check if any wake-up interrupts are pending
+ */
+int check_wakeup_irqs(void)
+{
+	struct irq_desc *desc;
+	int irq;
+
+	for_each_irq_desc(irq, desc)
+		if ((desc->status & IRQ_WAKEUP) && (desc->status & IRQ_PENDING))
+			return -EBUSY;
+
+	return 0;
+}
Index: linux-2.6/kernel/irq/Makefile
===================================================================
--- linux-2.6.orig/kernel/irq/Makefile
+++ linux-2.6/kernel/irq/Makefile
@@ -4,3 +4,4 @@ obj-$(CONFIG_GENERIC_IRQ_PROBE) += autop
 obj-$(CONFIG_PROC_FS) += proc.o
 obj-$(CONFIG_GENERIC_PENDING_IRQ) += migration.o
 obj-$(CONFIG_NUMA_MIGRATE_IRQ_DESC) += numa_migrate.o
+obj-$(CONFIG_PM_SLEEP) += pm.o
Index: linux-2.6/kernel/irq/manage.c
===================================================================
--- linux-2.6.orig/kernel/irq/manage.c
+++ linux-2.6/kernel/irq/manage.c
@@ -162,6 +162,20 @@ static inline int do_irq_select_affinity
 }
 #endif
 
+void __disable_irq(struct irq_desc *desc, unsigned int irq, bool suspend)
+{
+	if (suspend) {
+		if (!desc->action || (desc->action->flags & IRQF_TIMER))
+			return;
+		desc->status |= IRQ_SUSPENDED;
+	}
+
+	if (!desc->depth++) {
+		desc->status |= IRQ_DISABLED;
+		desc->chip->disable(irq);
+	}
+}
+
 /**
  * disable_irq_nosync - disable an irq without waiting
  * @irq: Interrupt to disable
@@ -182,10 +196,7 @@ void disable_irq_nosync(unsigned int irq
 		return;
 
 	spin_lock_irqsave(&desc->lock, flags);
-	if (!desc->depth++) {
-		desc->status |= IRQ_DISABLED;
-		desc->chip->disable(irq);
-	}
+	__disable_irq(desc, irq, false);
 	spin_unlock_irqrestore(&desc->lock, flags);
 }
 EXPORT_SYMBOL(disable_irq_nosync);
@@ -215,15 +226,21 @@ void disable_irq(unsigned int irq)
 }
 EXPORT_SYMBOL(disable_irq);
 
-static void __enable_irq(struct irq_desc *desc, unsigned int irq)
+void __enable_irq(struct irq_desc *desc, unsigned int irq, bool resume)
 {
+	if (resume)
+		desc->status &= ~IRQ_SUSPENDED;
+
 	switch (desc->depth) {
 	case 0:
+ err_out:
 		WARN(1, KERN_WARNING "Unbalanced enable for IRQ %d\n", irq);
 		break;
 	case 1: {
 		unsigned int status = desc->status & ~IRQ_DISABLED;
 
+		if (desc->status & IRQ_SUSPENDED)
+			goto err_out;
 		/* Prevent probing on this irq: */
 		desc->status = status | IRQ_NOPROBE;
 		check_irq_resend(desc, irq);
@@ -253,7 +270,7 @@ void enable_irq(unsigned int irq)
 		return;
 
 	spin_lock_irqsave(&desc->lock, flags);
-	__enable_irq(desc, irq);
+	__enable_irq(desc, irq, false);
 	spin_unlock_irqrestore(&desc->lock, flags);
 }
 EXPORT_SYMBOL(enable_irq);
@@ -511,7 +528,7 @@ __setup_irq(unsigned int irq, struct irq
 	 */
 	if (shared && (desc->status & IRQ_SPURIOUS_DISABLED)) {
 		desc->status &= ~IRQ_SPURIOUS_DISABLED;
-		__enable_irq(desc, irq);
+		__enable_irq(desc, irq, false);
 	}
 
 	spin_unlock_irqrestore(&desc->lock, flags);
Index: linux-2.6/kernel/irq/internals.h
===================================================================
--- linux-2.6.orig/kernel/irq/internals.h
+++ linux-2.6/kernel/irq/internals.h
@@ -12,6 +12,8 @@ extern void compat_irq_chip_set_default_
 
 extern int __irq_set_trigger(struct irq_desc *desc, unsigned int irq,
 		unsigned long flags);
+extern void __disable_irq(struct irq_desc *desc, unsigned int irq, bool susp);
+extern void __enable_irq(struct irq_desc *desc, unsigned int irq, bool resume);
 
 extern struct lock_class_key irq_desc_lock_class;
 extern void init_kstat_irqs(struct irq_desc *desc, int cpu, int nr);
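The interplay between the disable depth counter and the new IRQ_SUSPENDED flag in 1/11 can be tried out in user space. The following is only a rough model under stated assumptions, not kernel code: struct model_irq and the model_*() functions are hypothetical stand-ins for struct irq_desc, __disable_irq(), __enable_irq(), suspend_device_irqs() and resume_device_irqs(), the flag values are arbitrary, and locking, the chip-level disable and synchronize_irq() are left out.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative flag values; the kernel uses different bits. */
#define IRQ_DISABLED	0x1u
#define IRQ_SUSPENDED	0x2u

/* Hypothetical stand-in for the fields of struct irq_desc used by the patch. */
struct model_irq {
	unsigned int status;
	unsigned int depth;	/* nested disable count */
	bool has_action;	/* a handler is registered on this line */
	bool is_timer;		/* the handler was registered with IRQF_TIMER */
};

/* Mirrors __disable_irq(): timer lines and unused lines are skipped on suspend. */
static void model_disable(struct model_irq *d, bool suspend)
{
	assert(d);
	if (suspend) {
		if (!d->has_action || d->is_timer)
			return;
		d->status |= IRQ_SUSPENDED;
	}
	if (!d->depth++)
		d->status |= IRQ_DISABLED;
}

/*
 * Mirrors __enable_irq(): returns 0 on success and -1 where the kernel
 * would WARN about an unbalanced enable.  A line that is still marked
 * IRQ_SUSPENDED cannot have its last disable undone by an ordinary enable.
 */
static int model_enable(struct model_irq *d, bool resume)
{
	assert(d);
	if (resume)
		d->status &= ~IRQ_SUSPENDED;

	switch (d->depth) {
	case 0:
		return -1;
	case 1:
		if (d->status & IRQ_SUSPENDED)
			return -1;
		d->status &= ~IRQ_DISABLED;
		/* fall through */
	default:
		d->depth--;
	}
	return 0;
}

/* Mirrors the first loop of suspend_device_irqs(). */
static void model_suspend_all(struct model_irq *irqs, int n)
{
	assert(irqs && n >= 0);
	for (int i = 0; i < n; i++)
		model_disable(&irqs[i], true);
}

/* Mirrors resume_device_irqs(): only lines marked IRQ_SUSPENDED are touched. */
static void model_resume_all(struct model_irq *irqs, int n)
{
	assert(irqs && n >= 0);
	for (int i = 0; i < n; i++)
		if (irqs[i].status & IRQ_SUSPENDED)
			model_enable(&irqs[i], true);
}
```

In this model a line that a driver had already disabled before suspend ends up with depth 2, so the resume pass only drops it back to depth 1 and leaves it disabled, which appears to be the point of going through the depth counter rather than forcing the line state.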
* [PATCH 2/11] PM: Rework handling of interrupts during suspend-resume
  2009-03-14 11:24 ` [PATCH 0/11] PM: Rework suspend-resume ordering to avoid problems with shared interrupts (updated 2x) Rafael J. Wysocki
  2009-03-14 11:26 ` [PATCH 1/11] PM: Introduce functions for suspending and resuming device interrupts Rafael J. Wysocki
@ 2009-03-14 11:27 ` Rafael J. Wysocki
From: Rafael J. Wysocki @ 2009-03-14 11:27 UTC
To: pm list
Cc: Arve, Jeremy Fitzhardinge, LKML, Jesse Barnes, Eric W. Biederman, Linux PCI, Ingo Molnar, Linus Torvalds, Thomas Gleixner

From: Rafael J. Wysocki <rjw@sisk.pl>

Use the functions introduced by the previous patch, suspend_device_irqs(), resume_device_irqs() and check_wakeup_irqs(), to rework the handling of interrupts during suspend (hibernation) and resume. Namely, interrupts will only be disabled on the CPU right before suspending sysdevs, while device drivers will be prevented from receiving interrupts, with the help of the new helper function, before their "late" suspend callbacks run (and analogously during resume).

In addition, since the device interrupts are now disabled before the CPU has turned all interrupts off and the CPU will ACK the interrupts setting the IRQ_PENDING bit for them, check in sysdev_suspend() if any wake-up interrupts are pending and abort suspend if that's the case.

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
---
 arch/x86/kernel/apm_32.c  |   15 +++++++++++----
 drivers/base/power/main.c |   20 +++++++++++---------
 drivers/base/sys.c        |    8 ++++++++
 drivers/xen/manage.c      |   16 +++++++++-------
 kernel/kexec.c            |    8 ++++----
 kernel/power/disk.c       |   39 +++++++++++++++++++++++++++++----------
 kernel/power/main.c       |   17 +++++++++++------
 7 files changed, 83 insertions(+), 40 deletions(-)

Index: linux-2.6/kernel/power/main.c
===================================================================
--- linux-2.6.orig/kernel/power/main.c
+++ linux-2.6/kernel/power/main.c
@@ -287,17 +287,19 @@ void __attribute__ ((weak)) arch_suspend
  */
 static int suspend_enter(suspend_state_t state)
 {
-	int error = 0;
+	int error;
 
 	device_pm_lock();
-	arch_suspend_disable_irqs();
-	BUG_ON(!irqs_disabled());
 
-	if ((error = device_power_down(PMSG_SUSPEND))) {
+	error = device_power_down(PMSG_SUSPEND);
+	if (error) {
 		printk(KERN_ERR "PM: Some devices failed to power down\n");
 		goto Done;
 	}
 
+	arch_suspend_disable_irqs();
+	BUG_ON(!irqs_disabled());
+
 	error = sysdev_suspend(PMSG_SUSPEND);
 	if (!error) {
 		if (!suspend_test(TEST_CORE))
@@ -305,11 +307,14 @@ static int suspend_enter(suspend_state_t
 		sysdev_resume();
 	}
 
-	device_power_up(PMSG_RESUME);
- Done:
 	arch_suspend_enable_irqs();
 	BUG_ON(irqs_disabled());
+
+	device_power_up(PMSG_RESUME);
+
+ Done:
 	device_pm_unlock();
+
 	return error;
 }
 
Index: linux-2.6/kernel/power/disk.c
===================================================================
--- linux-2.6.orig/kernel/power/disk.c
+++ linux-2.6/kernel/power/disk.c
@@ -214,7 +214,7 @@ static int create_image(int platform_mod
 		return error;
 
 	device_pm_lock();
-	local_irq_disable();
+
 	/* At this point, device_suspend() has been called, but *not*
 	 * device_power_down(). We *must* call device_power_down() now.
 	 * Otherwise, drivers for some devices (e.g. interrupt controllers)
@@ -225,8 +225,11 @@ static int create_image(int platform_mod
 	if (error) {
 		printk(KERN_ERR "PM: Some devices failed to power down, "
 			"aborting hibernation\n");
-		goto Enable_irqs;
+		goto Unlock;
 	}
+
+	local_irq_disable();
+
 	sysdev_suspend(PMSG_FREEZE);
 	if (error) {
 		printk(KERN_ERR "PM: Some devices failed to power down, "
@@ -252,12 +255,16 @@ static int create_image(int platform_mod
 	/* NOTE: device_power_up() is just a resume() for devices
 	 * that suspended with irqs off ... no overall powerup.
 	 */
+
  Power_up_devices:
+	local_irq_enable();
+
 	device_power_up(in_suspend ?
 		(error ? PMSG_RECOVER : PMSG_THAW) : PMSG_RESTORE);
- Enable_irqs:
-	local_irq_enable();
+
+ Unlock:
 	device_pm_unlock();
+
 	return error;
 }
 
@@ -336,13 +343,16 @@ static int resume_target_kernel(void)
 	int error;
 
 	device_pm_lock();
-	local_irq_disable();
+
 	error = device_power_down(PMSG_QUIESCE);
 	if (error) {
 		printk(KERN_ERR "PM: Some devices failed to power down, "
 			"aborting resume\n");
-		goto Enable_irqs;
+		goto Unlock;
 	}
+
+	local_irq_disable();
+
 	sysdev_suspend(PMSG_QUIESCE);
 	/* We'll ignore saved state, but this gets preempt count (etc) right */
 	save_processor_state();
@@ -366,11 +376,16 @@ static int resume_target_kernel(void)
 	swsusp_free();
 	restore_processor_state();
 	touch_softlockup_watchdog();
+
 	sysdev_resume();
-	device_power_up(PMSG_RECOVER);
- Enable_irqs:
+
 	local_irq_enable();
+
+	device_power_up(PMSG_RECOVER);
+
+ Unlock:
 	device_pm_unlock();
+
 	return error;
 }
 
@@ -447,15 +462,16 @@ int hibernation_platform_enter(void)
 		goto Finish;
 
 	device_pm_lock();
-	local_irq_disable();
+
 	error = device_power_down(PMSG_HIBERNATE);
 	if (!error) {
+		local_irq_disable();
 		sysdev_suspend(PMSG_HIBERNATE);
 		hibernation_ops->enter();
 		/* We should never get here */
 		while (1);
 	}
-	local_irq_enable();
+
 	device_pm_unlock();
 
 	/*
@@ -464,12 +480,15 @@ int hibernation_platform_enter(void)
 	 */
  Finish:
 	hibernation_ops->finish();
+
  Resume_devices:
 	entering_platform_hibernation = false;
 	device_resume(PMSG_RESTORE);
 	resume_console();
+
  Close:
 	hibernation_ops->end();
+
 	return error;
 }
 
Index: linux-2.6/arch/x86/kernel/apm_32.c
===================================================================
--- linux-2.6.orig/arch/x86/kernel/apm_32.c
+++ linux-2.6/arch/x86/kernel/apm_32.c
@@ -1190,8 +1190,10 @@ static int suspend(int vetoable)
 	struct apm_user *as;
 
 	device_suspend(PMSG_SUSPEND);
-	local_irq_disable();
+
 	device_power_down(PMSG_SUSPEND);
+
+	local_irq_disable();
 	sysdev_suspend(PMSG_SUSPEND);
 
 	local_irq_enable();
@@ -1209,9 +1211,12 @@ static int suspend(int vetoable)
 	if (err != APM_SUCCESS)
 		apm_error("suspend", err);
 	err = (err == APM_SUCCESS) ? 0 : -EIO;
+
 	sysdev_resume();
-	device_power_up(PMSG_RESUME);
 	local_irq_enable();
+
+	device_power_up(PMSG_RESUME);
+
 	device_resume(PMSG_RESUME);
 	queue_event(APM_NORMAL_RESUME, NULL);
 	spin_lock(&user_list_lock);
@@ -1228,8 +1233,9 @@ static void standby(void)
 {
 	int err;
 
-	local_irq_disable();
 	device_power_down(PMSG_SUSPEND);
+
+	local_irq_disable();
 	sysdev_suspend(PMSG_SUSPEND);
 	local_irq_enable();
 
@@ -1239,8 +1245,9 @@ static void standby(void)
 
 	local_irq_disable();
 	sysdev_resume();
-	device_power_up(PMSG_RESUME);
 	local_irq_enable();
+
+	device_power_up(PMSG_RESUME);
 }
 
 static apm_event_t get_event(void)
Index: linux-2.6/drivers/xen/manage.c
===================================================================
--- linux-2.6.orig/drivers/xen/manage.c
+++ linux-2.6/drivers/xen/manage.c
@@ -39,12 +39,6 @@ static int xen_suspend(void *data)
 
 	BUG_ON(!irqs_disabled());
 
-	err = device_power_down(PMSG_SUSPEND);
-	if (err) {
-		printk(KERN_ERR "xen_suspend: device_power_down failed: %d\n",
-			err);
-		return err;
-	}
 	err = sysdev_suspend(PMSG_SUSPEND);
 	if (err) {
 		printk(KERN_ERR "xen_suspend: sysdev_suspend failed: %d\n",
@@ -69,7 +63,6 @@ static int xen_suspend(void *data)
 	xen_mm_unpin_all();
 
 	sysdev_resume();
-	device_power_up(PMSG_RESUME);
 
 	if (!*cancelled) {
 		xen_irq_resume();
@@ -108,6 +101,12 @@ static void do_suspend(void)
 	/* XXX use normal device tree? */
 	xenbus_suspend();
 
+	err = device_power_down(PMSG_SUSPEND);
+	if (err) {
+		printk(KERN_ERR "device_power_down failed: %d\n", err);
+		goto resume_devices;
+	}
+
 	err = stop_machine(xen_suspend, &cancelled, &cpumask_of_cpu(0));
 	if (err) {
 		printk(KERN_ERR "failed to start xen_suspend: %d\n", err);
@@ -120,6 +119,9 @@ static void do_suspend(void)
 	} else
 		xenbus_suspend_cancel();
 
+	device_power_up(PMSG_RESUME);
+
+resume_devices:
 	device_resume(PMSG_RESUME);
 
 	/* Make sure timer events get retriggered on all CPUs */
Index: linux-2.6/kernel/kexec.c
===================================================================
--- linux-2.6.orig/kernel/kexec.c
+++ linux-2.6/kernel/kexec.c
@@ -1454,7 +1454,6 @@ int kernel_kexec(void)
 		if (error)
 			goto Resume_devices;
 		device_pm_lock();
-		local_irq_disable();
 		/* At this point, device_suspend() has been called,
 		 * but *not* device_power_down(). We *must*
 		 * device_power_down() now. Otherwise, drivers for
@@ -1464,8 +1463,9 @@ int kernel_kexec(void)
 		 */
 		error = device_power_down(PMSG_FREEZE);
 		if (error)
-			goto Enable_irqs;
+			goto Unlock_pm;
 
+		local_irq_disable();
 		/* Suspend system devices */
 		error = sysdev_suspend(PMSG_FREEZE);
 		if (error)
@@ -1484,9 +1484,9 @@ int kernel_kexec(void)
 	if (kexec_image->preserve_context) {
 		sysdev_resume();
  Power_up_devices:
-		device_power_up(PMSG_RESTORE);
- Enable_irqs:
 		local_irq_enable();
+		device_power_up(PMSG_RESTORE);
+ Unlock_pm:
 		device_pm_unlock();
 		enable_nonboot_cpus();
  Resume_devices:
Index: linux-2.6/drivers/base/power/main.c
===================================================================
--- linux-2.6.orig/drivers/base/power/main.c
+++ linux-2.6/drivers/base/power/main.c
@@ -23,6 +23,7 @@
 #include <linux/pm.h>
 #include <linux/resume-trace.h>
 #include <linux/rwsem.h>
+#include <linux/interrupt.h>
 
 #include "../base.h"
 #include "power.h"
@@ -305,7 +306,8 @@ static int resume_device_noirq(struct de
  * Execute the appropriate "noirq resume" callback for all devices marked
  * as DPM_OFF_IRQ.
  *
- * Must be called with interrupts disabled and only one CPU running.
+ * Must be called under dpm_list_mtx. Device drivers should not receive
+ * interrupts while it's being executed.
  */
 static void dpm_power_up(pm_message_t state)
 {
@@ -326,14 +328,13 @@ static void dpm_power_up(pm_message_t st
  * device_power_up - Turn on all devices that need special attention.
  * @state: PM transition of the system being carried out.
  *
- * Power on system devices, then devices that required we shut them down
- * with interrupts disabled.
- *
- * Must be called with interrupts disabled.
+ * Call the "early" resume handlers and enable device drivers to receive
+ * interrupts.
  */
 void device_power_up(pm_message_t state)
 {
 	dpm_power_up(state);
+	resume_device_irqs();
 }
 EXPORT_SYMBOL_GPL(device_power_up);
 
@@ -558,16 +559,17 @@ static int suspend_device_noirq(struct d
  * device_power_down - Shut down special devices.
  * @state: PM transition of the system being carried out.
  *
- * Power down devices that require interrupts to be disabled.
- * Then power down system devices.
+ * Prevent device drivers from receiving interrupts and call the "late"
+ * suspend handlers.
  *
- * Must be called with interrupts disabled and only one CPU running.
+ * Must be called under dpm_list_mtx.
  */
 int device_power_down(pm_message_t state)
 {
 	struct device *dev;
 	int error = 0;
 
+	suspend_device_irqs();
 	list_for_each_entry_reverse(dev, &dpm_list, power.entry) {
 		error = suspend_device_noirq(dev, state);
 		if (error) {
@@ -577,7 +579,7 @@ int device_power_down(pm_message_t state
 		dev->power.status = DPM_OFF_IRQ;
 	}
 	if (error)
-		dpm_power_up(resume_event(state));
+		device_power_up(resume_event(state));
 	return error;
 }
 EXPORT_SYMBOL_GPL(device_power_down);
Index: linux-2.6/drivers/base/sys.c
===================================================================
--- linux-2.6.orig/drivers/base/sys.c
+++ linux-2.6/drivers/base/sys.c
@@ -22,6 +22,7 @@
 #include <linux/pm.h>
 #include <linux/device.h>
 #include <linux/mutex.h>
+#include <linux/interrupt.h>
 
 #include "base.h"
 
@@ -369,6 +370,13 @@ int sysdev_suspend(pm_message_t state)
 	struct sysdev_driver *drv, *err_drv;
 	int ret;
 
+	pr_debug("Checking wake-up interrupts\n");
+
+	/* Return error code if there are any wake-up interrupts pending */
+	ret = check_wakeup_irqs();
+	if (ret)
+		return ret;
+
 	pr_debug("Suspending System Devices\n");
 
 	list_for_each_entry_reverse(cls, &system_kset->list, kset.kobj.entry) {
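The ordering that 2/11 establishes in suspend_enter() can be traced with a small user-space model. This is a sketch under stated assumptions rather than the kernel code path: every model_*() function, the trace buffer and the wakeup_pending variable are hypothetical, the device "late" suspend step is assumed to always succeed, and the pending wake-up interrupt that check_wakeup_irqs() would detect is reduced to a single boolean.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical trace of the steps taken, in order, separated by spaces. */
static char trace[256];

static void step(const char *name)
{
	size_t used = strlen(trace);

	snprintf(trace + used, sizeof(trace) - used, "%s ", name);
}

/* Models a line with both IRQ_WAKEUP and IRQ_PENDING set. */
static bool wakeup_pending;

/* device_power_down(): mask device interrupts first, then run "late" suspend. */
static int model_device_power_down(void)
{
	step("suspend_device_irqs");
	step("late_suspend");
	return 0;	/* assumption: no driver callback fails in this model */
}

/* device_power_up(): run "early" resume, then unmask device interrupts. */
static void model_device_power_up(void)
{
	step("early_resume");
	step("resume_device_irqs");
}

/* sysdev_suspend(): bail out before touching sysdevs if a wake-up is pending. */
static int model_sysdev_suspend(void)
{
	if (wakeup_pending)
		return -EBUSY;
	step("sysdev_suspend");
	return 0;
}

/* Follows the post-patch shape of suspend_enter(). */
static int model_suspend_enter(void)
{
	int error;

	trace[0] = '\0';

	error = model_device_power_down();
	if (error)
		return error;

	step("cpu_irqs_off");		/* arch_suspend_disable_irqs() */

	error = model_sysdev_suspend();
	if (!error) {
		step("enter");		/* suspend_ops->enter() */
		step("sysdev_resume");
	}

	step("cpu_irqs_on");		/* arch_suspend_enable_irqs() */
	model_device_power_up();

	return error;
}
```

With wakeup_pending set, the model returns -EBUSY without ever reaching the "enter" step, yet it still re-enables CPU interrupts and runs the device power-up half, which is the unwinding order the reworked suspend_enter() follows.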
/* XXX use normal device tree? */ xenbus_suspend(); + err = device_power_down(PMSG_SUSPEND); + if (err) { + printk(KERN_ERR "device_power_down failed: %d\n", err); + goto resume_devices; + } + err = stop_machine(xen_suspend, &cancelled, &cpumask_of_cpu(0)); if (err) { printk(KERN_ERR "failed to start xen_suspend: %d\n", err); @@ -120,6 +119,9 @@ static void do_suspend(void) } else xenbus_suspend_cancel(); + device_power_up(PMSG_RESUME); + +resume_devices: device_resume(PMSG_RESUME); /* Make sure timer events get retriggered on all CPUs */ Index: linux-2.6/kernel/kexec.c =================================================================== --- linux-2.6.orig/kernel/kexec.c +++ linux-2.6/kernel/kexec.c @@ -1454,7 +1454,6 @@ int kernel_kexec(void) if (error) goto Resume_devices; device_pm_lock(); - local_irq_disable(); /* At this point, device_suspend() has been called, * but *not* device_power_down(). We *must* * device_power_down() now. Otherwise, drivers for @@ -1464,8 +1463,9 @@ int kernel_kexec(void) */ error = device_power_down(PMSG_FREEZE); if (error) - goto Enable_irqs; + goto Unlock_pm; + local_irq_disable(); /* Suspend system devices */ error = sysdev_suspend(PMSG_FREEZE); if (error) @@ -1484,9 +1484,9 @@ int kernel_kexec(void) if (kexec_image->preserve_context) { sysdev_resume(); Power_up_devices: - device_power_up(PMSG_RESTORE); - Enable_irqs: local_irq_enable(); + device_power_up(PMSG_RESTORE); + Unlock_pm: device_pm_unlock(); enable_nonboot_cpus(); Resume_devices: Index: linux-2.6/drivers/base/power/main.c =================================================================== --- linux-2.6.orig/drivers/base/power/main.c +++ linux-2.6/drivers/base/power/main.c @@ -23,6 +23,7 @@ #include <linux/pm.h> #include <linux/resume-trace.h> #include <linux/rwsem.h> +#include <linux/interrupt.h> #include "../base.h" #include "power.h" @@ -305,7 +306,8 @@ static int resume_device_noirq(struct de * Execute the appropriate "noirq resume" callback for all devices marked * as 
DPM_OFF_IRQ. * - * Must be called with interrupts disabled and only one CPU running. + * Must be called under dpm_list_mtx. Device drivers should not receive + * interrupts while it's being executed. */ static void dpm_power_up(pm_message_t state) { @@ -326,14 +328,13 @@ static void dpm_power_up(pm_message_t st * device_power_up - Turn on all devices that need special attention. * @state: PM transition of the system being carried out. * - * Power on system devices, then devices that required we shut them down - * with interrupts disabled. - * - * Must be called with interrupts disabled. + * Call the "early" resume handlers and enable device drivers to receive + * interrupts. */ void device_power_up(pm_message_t state) { dpm_power_up(state); + resume_device_irqs(); } EXPORT_SYMBOL_GPL(device_power_up); @@ -558,16 +559,17 @@ static int suspend_device_noirq(struct d * device_power_down - Shut down special devices. * @state: PM transition of the system being carried out. * - * Power down devices that require interrupts to be disabled. - * Then power down system devices. + * Prevent device drivers from receiving interrupts and call the "late" + * suspend handlers. * - * Must be called with interrupts disabled and only one CPU running. + * Must be called under dpm_list_mtx. 
*/ int device_power_down(pm_message_t state) { struct device *dev; int error = 0; + suspend_device_irqs(); list_for_each_entry_reverse(dev, &dpm_list, power.entry) { error = suspend_device_noirq(dev, state); if (error) { @@ -577,7 +579,7 @@ int device_power_down(pm_message_t state dev->power.status = DPM_OFF_IRQ; } if (error) - dpm_power_up(resume_event(state)); + device_power_up(resume_event(state)); return error; } EXPORT_SYMBOL_GPL(device_power_down); Index: linux-2.6/drivers/base/sys.c =================================================================== --- linux-2.6.orig/drivers/base/sys.c +++ linux-2.6/drivers/base/sys.c @@ -22,6 +22,7 @@ #include <linux/pm.h> #include <linux/device.h> #include <linux/mutex.h> +#include <linux/interrupt.h> #include "base.h" @@ -369,6 +370,13 @@ int sysdev_suspend(pm_message_t state) struct sysdev_driver *drv, *err_drv; int ret; + pr_debug("Checking wake-up interrupts\n"); + + /* Return error code if there are any wake-up interrupts pending */ + ret = check_wakeup_irqs(); + if (ret) + return ret; + pr_debug("Suspending System Devices\n"); list_for_each_entry_reverse(cls, &system_kset->list, kset.kobj.entry) { ^ permalink raw reply [flat|nested] 373+ messages in thread
* [PATCH 3/11] PM: Change suspend code ordering
  2009-03-14 11:24 ` [PATCH 0/11] PM: Rework suspend-resume ordering to avoid problems with shared interrupts (updated 2x) Rafael J. Wysocki
                   ` (3 preceding siblings ...)
  2009-03-14 11:27 ` Rafael J. Wysocki
@ 2009-03-14 11:28 ` Rafael J. Wysocki
  2009-03-14 11:28 ` Rafael J. Wysocki
                   ` (18 subsequent siblings)
  23 siblings, 0 replies; 373+ messages in thread
From: Rafael J. Wysocki @ 2009-03-14 11:28 UTC (permalink / raw)
  To: pm list
  Cc: LKML, Linus Torvalds, Ingo Molnar, Eric W. Biederman,
	Benjamin Herrenschmidt, Jeremy Fitzhardinge, Len Brown,
	Jesse Barnes, Thomas Gleixner, Arve Hjønnevåg, Linux PCI

From: Rafael J. Wysocki <rjw@sisk.pl>

Change the ordering of the suspend core code so that the platform
"prepare" callback is executed and the nonboot CPUs are disabled
after calling device drivers' "late suspend" methods.

This change will allow us to rework the PCI PM core so that the power
state of devices is changed in the "late" phase of suspend (and
analogously in the "early" phase of resume), which in turn will allow
us to avoid the race condition where a device using shared interrupts
is put into a low power state with interrupts enabled and then an
interrupt (for another device) comes in and confuses its driver.

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
---
 kernel/power/main.c |   38 ++++++++++++++++++++++----------------
 1 file changed, 22 insertions(+), 16 deletions(-)

Index: linux-2.6/kernel/power/main.c
===================================================================
--- linux-2.6.orig/kernel/power/main.c
+++ linux-2.6/kernel/power/main.c
@@ -297,6 +297,19 @@ static int suspend_enter(suspend_state_t
 		goto Done;
 	}
 
+	if (suspend_ops->prepare) {
+		error = suspend_ops->prepare();
+		if (error)
+			goto Power_up_devices;
+	}
+
+	if (suspend_test(TEST_PLATFORM))
+		goto Platfrom_finish;
+
+	error = disable_nonboot_cpus();
+	if (error || suspend_test(TEST_CPUS))
+		goto Enable_cpus;
+
 	arch_suspend_disable_irqs();
 	BUG_ON(!irqs_disabled());
 
@@ -310,6 +323,14 @@ static int suspend_enter(suspend_state_t
 	arch_suspend_enable_irqs();
 	BUG_ON(irqs_disabled());
 
+ Enable_cpus:
+	enable_nonboot_cpus();
+
+ Platfrom_finish:
+	if (suspend_ops->finish)
+		suspend_ops->finish();
+
+ Power_up_devices:
 	device_power_up(PMSG_RESUME);
 
  Done:
@@ -346,23 +367,8 @@ int suspend_devices_and_enter(suspend_st
 	if (suspend_test(TEST_DEVICES))
 		goto Recover_platform;
 
-	if (suspend_ops->prepare) {
-		error = suspend_ops->prepare();
-		if (error)
-			goto Resume_devices;
-	}
-
-	if (suspend_test(TEST_PLATFORM))
-		goto Finish;
+	suspend_enter(state);
 
-	error = disable_nonboot_cpus();
-	if (!error && !suspend_test(TEST_CPUS))
-		suspend_enter(state);
-
-	enable_nonboot_cpus();
- Finish:
-	if (suspend_ops->finish)
-		suspend_ops->finish();
  Resume_devices:
 	suspend_test_start();
 	device_resume(PMSG_RESUME);

^ permalink raw reply	[flat|nested] 373+ messages in thread
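For readers following the control flow rather than the hunks, the ordering that suspend_enter() appears to end up with after this patch can be modelled in user space. This is only a sketch under stated assumptions: sketch_suspend_enter(), try_step(), note() and the step names are hypothetical stand-ins that just log what would be called, the suspend_test() shortcuts are left out, and nothing here is kernel code. It is meant to show that each failure point unwinds exactly the steps completed before it, in reverse order.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical step identifiers; NO_FAILURE means every step succeeds. */
enum step {
	NO_FAILURE = -1,
	POWER_DOWN,		/* stands in for device_power_down() */
	PLATFORM_PREPARE,	/* stands in for suspend_ops->prepare() */
	DISABLE_CPUS,		/* stands in for disable_nonboot_cpus() */
	SYSDEV_SUSPEND		/* stands in for sysdev_suspend() */
};

/* Space-separated log of the steps taken by the last run. */
static char trace[256];

static void note(const char *what)
{
	assert(strlen(trace) + strlen(what) + 2 < sizeof(trace));
	if (trace[0])
		strcat(trace, " ");
	strcat(trace, what);
}

/* Log a step; report an error if it is the one chosen to fail. */
static int try_step(const char *name, int step, int fail_at)
{
	note(name);
	return step == fail_at ? -1 : 0;
}

static int sketch_suspend_enter(int fail_at)
{
	int error;

	trace[0] = '\0';

	/* "Late suspend" callbacks run first, with interrupts still on. */
	error = try_step("power_down", POWER_DOWN, fail_at);
	if (error)
		goto Done;

	error = try_step("prepare", PLATFORM_PREPARE, fail_at);
	if (error)
		goto Power_up_devices;

	error = try_step("cpus_off", DISABLE_CPUS, fail_at);
	if (error)
		goto Enable_cpus;

	note("irqs_off");
	error = try_step("sysdev_suspend", SYSDEV_SUSPEND, fail_at);
	if (!error) {
		note("enter");
		note("sysdev_resume");
	}
	note("irqs_on");

 Enable_cpus:
	note("cpus_on");
	note("finish");		/* platform "finish" callback */
 Power_up_devices:
	note("power_up");	/* "early resume" callbacks */
 Done:
	return error;
}
```

Calling sketch_suspend_enter(PLATFORM_PREPARE), for instance, leaves "power_down prepare power_up" in trace: the platform "finish" step is skipped because "prepare" never completed, which mirrors the goto Power_up_devices path in the hunk above.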
* [PATCH 4/11] PM: Change hibernation code ordering
  2009-03-14 11:24 ` [PATCH 0/11] PM: Rework suspend-resume ordering to avoid problems with shared interrupts (updated 2x) Rafael J. Wysocki
                   ` (5 preceding siblings ...)
  2009-03-14 11:28 ` Rafael J. Wysocki
@ 2009-03-14 11:28 ` Rafael J. Wysocki
  2009-03-14 11:28 ` Rafael J. Wysocki
                   ` (16 subsequent siblings)
  23 siblings, 0 replies; 373+ messages in thread
From: Rafael J. Wysocki @ 2009-03-14 11:28 UTC (permalink / raw)
  To: pm list
  Cc: LKML, Linus Torvalds, Ingo Molnar, Eric W. Biederman,
	Benjamin Herrenschmidt, Jeremy Fitzhardinge, Len Brown,
	Jesse Barnes, Thomas Gleixner, Arve Hjønnevåg, Linux PCI

From: Rafael J. Wysocki <rjw@sisk.pl>

Change the ordering of the hibernation core code so that the platform
"prepare" callbacks are executed and the nonboot CPUs are disabled
after calling device drivers' "late suspend" methods.

This change (along with the previous analogous change of the suspend
core code) will allow us to rework the PCI PM core so that the power
state of devices is changed in the "late" phase of suspend (and
analogously in the "early" phase of resume), which in turn will allow
us to avoid the race condition where a device using shared interrupts
is put into a low power state with interrupts enabled and then an
interrupt (for another device) comes in and confuses its driver.

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
---
 kernel/power/disk.c |  109 +++++++++++++++++++++++++++++-----------------------
 1 file changed, 61 insertions(+), 48 deletions(-)

Index: linux-2.6/kernel/power/disk.c
===================================================================
--- linux-2.6.orig/kernel/power/disk.c
+++ linux-2.6/kernel/power/disk.c
@@ -228,13 +228,22 @@ static int create_image(int platform_mod
 		goto Unlock;
 	}
 
+	error = platform_pre_snapshot(platform_mode);
+	if (error || hibernation_test(TEST_PLATFORM))
+		goto Platform_finish;
+
+	error = disable_nonboot_cpus();
+	if (error || hibernation_test(TEST_CPUS)
+	    || hibernation_testmode(HIBERNATION_TEST))
+		goto Enable_cpus;
+
 	local_irq_disable();
 
 	sysdev_suspend(PMSG_FREEZE);
 	if (error) {
 		printk(KERN_ERR "PM: Some devices failed to power down, "
 			"aborting hibernation\n");
-		goto Power_up_devices;
+		goto Enable_irqs;
 	}
 
 	if (hibernation_test(TEST_CORE))
@@ -250,15 +259,22 @@ static int create_image(int platform_mod
 	restore_processor_state();
 	if (!in_suspend)
 		platform_leave(platform_mode);
+
  Power_up:
 	sysdev_resume();
 	/* NOTE: device_power_up() is just a resume() for devices
 	 * that suspended with irqs off ... no overall powerup.
 	 */
 
- Power_up_devices:
+ Enable_irqs:
 	local_irq_enable();
 
+ Enable_cpus:
+	enable_nonboot_cpus();
+
+ Platform_finish:
+	platform_finish(platform_mode);
+
 	device_power_up(in_suspend ?
 		(error ? PMSG_RECOVER : PMSG_THAW) : PMSG_RESTORE);
 
@@ -298,25 +314,9 @@ int hibernation_snapshot(int platform_mo
 	if (hibernation_test(TEST_DEVICES))
 		goto Recover_platform;
 
-	error = platform_pre_snapshot(platform_mode);
-	if (error || hibernation_test(TEST_PLATFORM))
-		goto Finish;
-
-	error = disable_nonboot_cpus();
-	if (!error) {
-		if (hibernation_test(TEST_CPUS))
-			goto Enable_cpus;
-
-		if (hibernation_testmode(HIBERNATION_TEST))
-			goto Enable_cpus;
+	error = create_image(platform_mode);
+	/* Control returns here after successful restore */
 
-		error = create_image(platform_mode);
-		/* Control returns here after successful restore */
-	}
- Enable_cpus:
-	enable_nonboot_cpus();
- Finish:
-	platform_finish(platform_mode);
  Resume_devices:
 	device_resume(in_suspend ?
 		(error ? PMSG_RECOVER : PMSG_THAW) : PMSG_RESTORE);
@@ -338,7 +338,7 @@ int hibernation_snapshot(int platform_mo
  * kernel.
  */
 
-static int resume_target_kernel(void)
+static int resume_target_kernel(bool platform_mode)
 {
 	int error;
 
@@ -351,9 +351,20 @@ static int resume_target_kernel(void)
 		goto Unlock;
 	}
 
+	error = platform_pre_restore(platform_mode);
+	if (error)
+		goto Cleanup;
+
+	error = disable_nonboot_cpus();
+	if (error)
+		goto Enable_cpus;
+
 	local_irq_disable();
 
-	sysdev_suspend(PMSG_QUIESCE);
+	error = sysdev_suspend(PMSG_QUIESCE);
+	if (error)
+		goto Enable_irqs;
+
 	/* We'll ignore saved state, but this gets preempt count (etc) right */
 	save_processor_state();
 	error = restore_highmem();
@@ -379,8 +390,15 @@ static int resume_target_kernel(void)
 
 	sysdev_resume();
 
+ Enable_irqs:
 	local_irq_enable();
 
+ Enable_cpus:
+	enable_nonboot_cpus();
+
+ Cleanup:
+	platform_restore_cleanup(platform_mode);
+
 	device_power_up(PMSG_RECOVER);
 
  Unlock:
@@ -405,19 +423,10 @@ int hibernation_restore(int platform_mod
 	pm_prepare_console();
 	suspend_console();
 	error = device_suspend(PMSG_QUIESCE);
-	if (error)
-		goto Finish;
-
-	error = platform_pre_restore(platform_mode);
 	if (!error) {
-		error = disable_nonboot_cpus();
-		if (!error)
-			error = resume_target_kernel();
-		enable_nonboot_cpus();
+		error = resume_target_kernel(platform_mode);
+		device_resume(PMSG_RECOVER);
 	}
-	platform_restore_cleanup(platform_mode);
-	device_resume(PMSG_RECOVER);
- Finish:
 	resume_console();
 	pm_restore_console();
 	return error;
@@ -453,34 +462,38 @@ int hibernation_platform_enter(void)
 		goto Resume_devices;
 	}
 
+	device_pm_lock();
+
+	error = device_power_down(PMSG_HIBERNATE);
+	if (error)
+		goto Unlock;
+
 	error = hibernation_ops->prepare();
 	if (error)
-		goto Resume_devices;
+		goto Platofrm_finish;
 
 	error = disable_nonboot_cpus();
 	if (error)
-		goto Finish;
-
-	device_pm_lock();
-
-	error = device_power_down(PMSG_HIBERNATE);
-	if (!error) {
-		local_irq_disable();
-		sysdev_suspend(PMSG_HIBERNATE);
-		hibernation_ops->enter();
-		/* We should never get here */
-		while (1);
-	}
+		goto Platofrm_finish;
 
-	device_pm_unlock();
+	local_irq_disable();
+	sysdev_suspend(PMSG_HIBERNATE);
+	hibernation_ops->enter();
+	/* We should never get here */
+	while (1);
 
 	/*
 	 * We don't need to reenable the nonboot CPUs or resume consoles, since
 	 * the system is going to be halted anyway.
 	 */
- Finish:
+ Platofrm_finish:
 	hibernation_ops->finish();
 
+	device_power_up(PMSG_RESTORE);
+
+ Unlock:
+	device_pm_unlock();
+
  Resume_devices:
 	entering_platform_hibernation = false;
 	device_resume(PMSG_RESTORE);

^ permalink raw reply	[flat|nested] 373+ messages in thread
* [PATCH 5/11] kexec: Change kexec jump code ordering
  2009-03-14 11:24 ` [PATCH 0/11] PM: Rework suspend-resume ordering to avoid problems with shared interrupts (updated 2x) Rafael J. Wysocki
                   ` (7 preceding siblings ...)
  2009-03-14 11:28 ` Rafael J. Wysocki
@ 2009-03-14 11:29 ` Rafael J. Wysocki
  2009-03-14 11:29 ` Rafael J. Wysocki
                   ` (14 subsequent siblings)
  23 siblings, 0 replies; 373+ messages in thread
From: Rafael J. Wysocki @ 2009-03-14 11:29 UTC (permalink / raw)
  To: pm list
  Cc: LKML, Linus Torvalds, Ingo Molnar, Eric W. Biederman,
	Benjamin Herrenschmidt, Jeremy Fitzhardinge, Len Brown,
	Jesse Barnes, Thomas Gleixner, Arve Hjønnevåg, Linux PCI

From: Rafael J. Wysocki <rjw@sisk.pl>

Change the ordering of the kexec jump code so that the nonboot CPUs
are disabled after calling device drivers' "late suspend" methods.

This change reflects the recent modifications of the power management
code that is also used by kexec jump.

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
---
 kernel/kexec.c |   19 +++++++++----------
 1 file changed, 9 insertions(+), 10 deletions(-)

Index: linux-2.6/kernel/kexec.c
===================================================================
--- linux-2.6.orig/kernel/kexec.c
+++ linux-2.6/kernel/kexec.c
@@ -1450,9 +1450,6 @@ int kernel_kexec(void)
 		error = device_suspend(PMSG_FREEZE);
 		if (error)
 			goto Resume_console;
-		error = disable_nonboot_cpus();
-		if (error)
-			goto Resume_devices;
 		device_pm_lock();
 		/* At this point, device_suspend() has been called,
 		 * but *not* device_power_down(). We *must*
@@ -1463,13 +1460,15 @@ int kernel_kexec(void)
 		 */
 		error = device_power_down(PMSG_FREEZE);
 		if (error)
-			goto Unlock_pm;
-
+			goto Resume_devices;
+		error = disable_nonboot_cpus();
+		if (error)
+			goto Enable_cpus;
 		local_irq_disable();
 		/* Suspend system devices */
 		error = sysdev_suspend(PMSG_FREEZE);
 		if (error)
-			goto Power_up_devices;
+			goto Enable_irqs;
 	} else
 #endif
 	{
@@ -1483,13 +1482,13 @@ int kernel_kexec(void)
 #ifdef CONFIG_KEXEC_JUMP
 	if (kexec_image->preserve_context) {
 		sysdev_resume();
- Power_up_devices:
+ Enable_irqs:
 		local_irq_enable();
-		device_power_up(PMSG_RESTORE);
- Unlock_pm:
-		device_pm_unlock();
+ Enable_cpus:
 		enable_nonboot_cpus();
+		device_power_up(PMSG_RESTORE);
  Resume_devices:
+		device_pm_unlock();
 		device_resume(PMSG_RESTORE);
  Resume_console:
 		resume_console();

^ permalink raw reply	[flat|nested] 373+ messages in thread
* [PATCH 6/11] PCI PM: Consistently use variable name "error" for pm call return values
  2009-03-14 11:24 ` [PATCH 0/11] PM: Rework suspend-resume ordering to avoid problems with shared interrupts (updated 2x) Rafael J. Wysocki
                   ` (9 preceding siblings ...)
  2009-03-14 11:29 ` Rafael J. Wysocki
@ 2009-03-14 11:30 ` Rafael J. Wysocki
  2009-03-14 11:30 ` Rafael J. Wysocki
                   ` (12 subsequent siblings)
  23 siblings, 0 replies; 373+ messages in thread
From: Rafael J. Wysocki @ 2009-03-14 11:30 UTC (permalink / raw)
  To: pm list
  Cc: Arve, Jeremy Fitzhardinge, Frans Pop, LKML, Jesse Barnes,
	Eric W. Biederman, Linux PCI, Ingo Molnar, Linus Torvalds,
	Thomas Gleixner

From: Frans Pop <elendil@planet.nl>

I noticed two functions use a variable "i" to store the return value
of PM function calls while the rest of the file uses "error". As "i"
normally indicates a counter of some sort it seems better to keep
this consistent.

Signed-off-by: Frans Pop <elendil@planet.nl>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
---
 drivers/pci/pci-driver.c |   20 ++++++++++----------
 1 file changed, 10 insertions(+), 10 deletions(-)

Index: linux-2.6/drivers/pci/pci-driver.c
===================================================================
--- linux-2.6.orig/drivers/pci/pci-driver.c
+++ linux-2.6/drivers/pci/pci-driver.c
@@ -352,17 +352,17 @@ static int pci_legacy_suspend(struct dev
 {
 	struct pci_dev * pci_dev = to_pci_dev(dev);
 	struct pci_driver * drv = pci_dev->driver;
-	int i = 0;
+	int error = 0;
 
 	if (drv && drv->suspend) {
 		pci_power_t prev = pci_dev->current_state;
 
 		pci_dev->state_saved = false;
 
-		i = drv->suspend(pci_dev, state);
-		suspend_report_result(drv->suspend, i);
-		if (i)
-			return i;
+		error = drv->suspend(pci_dev, state);
+		suspend_report_result(drv->suspend, error);
+		if (error)
+			return error;
 
 		if (pci_dev->state_saved)
 			goto Fixup;
@@ -385,20 +385,20 @@ static int pci_legacy_suspend(struct dev
  Fixup:
 	pci_fixup_device(pci_fixup_suspend, pci_dev);
 
-	return i;
+	return error;
 }
 
 static int pci_legacy_suspend_late(struct device *dev, pm_message_t state)
 {
 	struct pci_dev * pci_dev = to_pci_dev(dev);
 	struct pci_driver * drv = pci_dev->driver;
-	int i = 0;
+	int error = 0;
 
 	if (drv && drv->suspend_late) {
-		i = drv->suspend_late(pci_dev, state);
-		suspend_report_result(drv->suspend_late, i);
+		error = drv->suspend_late(pci_dev, state);
+		suspend_report_result(drv->suspend_late, error);
 	}
-	return i;
+	return error;
 }
 
 static int pci_legacy_resume_early(struct device *dev)

^ permalink raw reply	[flat|nested] 373+ messages in thread
* [PATCH 7/11] PCI PM: Use pci_set_power_state during early resume
  2009-03-14 11:24 ` [PATCH 0/11] PM: Rework suspend-resume ordering to avoid problems with shared interrupts (updated 2x) Rafael J. Wysocki
                     ` (11 preceding siblings ...)
  2009-03-14 11:30 ` Rafael J. Wysocki
@ 2009-03-14 11:31 ` Rafael J. Wysocki
  2009-03-14 11:31 ` Rafael J. Wysocki
                     ` (10 subsequent siblings)
  23 siblings, 0 replies; 373+ messages in thread
From: Rafael J. Wysocki @ 2009-03-14 11:31 UTC (permalink / raw)
  To: pm list
  Cc: Arve, Jeremy Fitzhardinge, LKML, Jesse Barnes, Eric W. Biederman, Linux PCI, Ingo Molnar, Linus Torvalds, Thomas Gleixner

From: Rafael J. Wysocki <rjw@sisk.pl>

Once we have allowed timer interrupts to be enabled during the early
phase of resuming devices, we are now able to use the generic
pci_set_power_state() to put PCI devices into D0 at that time. Then,
the platform-specific PM code will have a chance to handle devices
that don't implement the native PCI PM or that require some
additional, platform-specific operations to be carried out to power
them up.

Also, by doing this we can simplify the code quite a bit.

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
---
 drivers/pci/pci.c |   48 +++++++++---------------------------------------
 1 file changed, 9 insertions(+), 39 deletions(-)

Index: linux-2.6/drivers/pci/pci.c
===================================================================
--- linux-2.6.orig/drivers/pci/pci.c
+++ linux-2.6/drivers/pci/pci.c
@@ -426,7 +426,6 @@ static inline int platform_pci_sleep_wak
  * given PCI device
  * @dev: PCI device to handle.
  * @state: PCI power state (D0, D1, D2, D3hot) to put the device into.
- * @wait: If 'true', wait for the device to change its power state
  *
  * RETURN VALUE:
  * -EINVAL if the requested state is invalid.
@@ -435,8 +434,7 @@ static inline int platform_pci_sleep_wak
  * 0 if device already is in the requested state.
  * 0 if device's power state has been successfully changed.
  */
-static int
-pci_raw_set_power_state(struct pci_dev *dev, pci_power_t state, bool wait)
+static int pci_raw_set_power_state(struct pci_dev *dev, pci_power_t state)
 {
 	u16 pmcsr;
 	bool need_restore = false;
@@ -481,10 +479,8 @@ pci_raw_set_power_state(struct pci_dev *
 		break;
 	case PCI_UNKNOWN: /* Boot-up */
 		if ((pmcsr & PCI_PM_CTRL_STATE_MASK) == PCI_D3hot
-		 && !(pmcsr & PCI_PM_CTRL_NO_SOFT_RESET)) {
+		 && !(pmcsr & PCI_PM_CTRL_NO_SOFT_RESET))
 			need_restore = true;
-			wait = true;
-		}
 		/* Fall-through: force to D0 */
 	default:
 		pmcsr = 0;
@@ -494,9 +490,6 @@ pci_raw_set_power_state(struct pci_dev *
 	/* enter specified state */
 	pci_write_config_word(dev, dev->pm_cap + PCI_PM_CTRL, pmcsr);
 
-	if (!wait)
-		return 0;
-
 	/* Mandatory power management transition delays */
 	/* see PCI PM 1.1 5.6.1 table 18 */
 	if (state == PCI_D3hot || dev->current_state == PCI_D3hot)
@@ -521,7 +514,7 @@ pci_raw_set_power_state(struct pci_dev *
 	if (need_restore)
 		pci_restore_bars(dev);
 
-	if (wait && dev->bus->self)
+	if (dev->bus->self)
 		pcie_aspm_pm_state_change(dev->bus->self);
 
 	return 0;
@@ -591,7 +584,7 @@ int pci_set_power_state(struct pci_dev *
 	if (state == PCI_D3hot && (dev->dev_flags & PCI_DEV_FLAGS_NO_D3))
 		return 0;
 
-	error = pci_raw_set_power_state(dev, state, true);
+	error = pci_raw_set_power_state(dev, state);
 
 	if (state > PCI_D0 && platform_pci_power_manageable(dev)) {
 		/* Allow the platform to finalize the transition */
@@ -1390,37 +1383,14 @@ void pci_allocate_cap_save_buffers(struc
  */
 int pci_restore_standard_config(struct pci_dev *dev)
 {
-	pci_power_t prev_state;
-	int error;
-
-	pci_update_current_state(dev, PCI_D0);
-
-	prev_state = dev->current_state;
-	if (prev_state == PCI_D0)
-		goto Restore;
-
-	error = pci_raw_set_power_state(dev, PCI_D0, false);
-	if (error)
-		return error;
+	pci_update_current_state(dev, PCI_UNKNOWN);
 
-	/*
-	 * This assumes that we won't get a bus in B2 or B3 from the BIOS, but
-	 * we've made this assumption forever and it appears to be universally
-	 * satisfied.
-	 */
-	switch(prev_state) {
-	case PCI_D3cold:
-	case PCI_D3hot:
-		mdelay(pci_pm_d3_delay);
-		break;
-	case PCI_D2:
-		udelay(PCI_PM_D2_DELAY);
-		break;
+	if (dev->current_state != PCI_D0) {
+		int error = pci_set_power_state(dev, PCI_D0);
+		if (error)
+			return error;
 	}
 
-	pci_update_current_state(dev, PCI_D0);
-
- Restore:
 	return dev->state_saved ? pci_restore_state(dev) : 0;
 }
 

^ permalink raw reply [flat|nested] 373+ messages in thread
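The control flow that [PATCH 7/11] leaves in pci_restore_standard_config() can be sketched in user space as follows. This is only an illustration under stated assumptions: every mock_* name is a hypothetical stand-in invented here, not a kernel symbol, and the mandatory transition delays that the real pci_set_power_state() performs are not modelled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel's power states (not the real enum). */
enum mock_power_state { MOCK_D0, MOCK_D1, MOCK_D2, MOCK_D3HOT, MOCK_UNKNOWN };

/* Hypothetical, heavily reduced stand-in for struct pci_dev. */
struct mock_pci_dev {
	enum mock_power_state hw_state;      /* state the simulated hardware is in */
	enum mock_power_state current_state; /* state the core believes it is in */
	bool state_saved;                    /* config space was saved at suspend */
	bool config_restored;                /* set once the saved config is written back */
	int set_power_calls;                 /* how often a power-up was requested */
	int fail_power_up;                   /* nonzero: error code to simulate */
};

/* Assumed behaviour: refresh current_state from the (simulated) hardware. */
static void mock_update_current_state(struct mock_pci_dev *dev)
{
	dev->current_state = dev->hw_state;
}

/* Stand-in for the generic power-up path; delays are deliberately omitted. */
static int mock_set_power_state_d0(struct mock_pci_dev *dev)
{
	dev->set_power_calls++;
	if (dev->fail_power_up)
		return dev->fail_power_up;
	dev->hw_state = MOCK_D0;
	dev->current_state = MOCK_D0;
	return 0;
}

/* Stand-in for writing the saved configuration registers back. */
static int mock_restore_state(struct mock_pci_dev *dev)
{
	dev->config_restored = true;
	return 0;
}

/* Same shape as the patched function: refresh, power up if needed, restore. */
int mock_restore_standard_config(struct mock_pci_dev *dev)
{
	assert(dev != NULL);

	mock_update_current_state(dev);

	if (dev->current_state != MOCK_D0) {
		int error = mock_set_power_state_d0(dev);
		if (error)
			return error;
	}

	return dev->state_saved ? mock_restore_state(dev) : 0;
}
```

The point of the sketch is that the caller no longer picks a delay based on the previous state; it delegates the whole transition to one generic routine and only restores the configuration if it was saved.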
* [PATCH 8/11] PCI PM: Move pci_restore_standard_config to pci-driver.c
  2009-03-14 11:24 ` [PATCH 0/11] PM: Rework suspend-resume ordering to avoid problems with shared interrupts (updated 2x) Rafael J. Wysocki
                     ` (13 preceding siblings ...)
  2009-03-14 11:31 ` Rafael J. Wysocki
@ 2009-03-14 11:32 ` Rafael J. Wysocki
  2009-03-14 11:32 ` Rafael J. Wysocki
                     ` (8 subsequent siblings)
  23 siblings, 0 replies; 373+ messages in thread
From: Rafael J. Wysocki @ 2009-03-14 11:32 UTC (permalink / raw)
  To: pm list
  Cc: LKML, Linus Torvalds, Ingo Molnar, Eric W. Biederman, Benjamin Herrenschmidt, Jeremy Fitzhardinge, Len Brown, Jesse Barnes, Thomas Gleixner, Arve Hjønnevåg, Linux PCI

From: Rafael J. Wysocki <rjw@sisk.pl>

Move pci_restore_standard_config() from pci.c to pci-driver.c and
make it static.

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
---
 drivers/pci/pci-driver.c |   17 +++++++++++++++++
 drivers/pci/pci.c        |   21 ---------------------
 drivers/pci/pci.h        |    1 -
 3 files changed, 17 insertions(+), 22 deletions(-)

Index: linux-2.6/drivers/pci/pci-driver.c
===================================================================
--- linux-2.6.orig/drivers/pci/pci-driver.c
+++ linux-2.6/drivers/pci/pci-driver.c
@@ -423,6 +423,23 @@ static int pci_legacy_resume(struct devi
 
 /* Auxiliary functions used by the new power management framework */
 
+/**
+ * pci_restore_standard_config - restore standard config registers of PCI device
+ * @pci_dev: PCI device to handle
+ */
+static int pci_restore_standard_config(struct pci_dev *pci_dev)
+{
+	pci_update_current_state(pci_dev, PCI_UNKNOWN);
+
+	if (pci_dev->current_state != PCI_D0) {
+		int error = pci_set_power_state(pci_dev, PCI_D0);
+		if (error)
+			return error;
+	}
+
+	return pci_dev->state_saved ? pci_restore_state(pci_dev) : 0;
+}
+
 static void pci_pm_default_resume_noirq(struct pci_dev *pci_dev)
 {
 	pci_restore_standard_config(pci_dev);
Index: linux-2.6/drivers/pci/pci.c
===================================================================
--- linux-2.6.orig/drivers/pci/pci.c
+++ linux-2.6/drivers/pci/pci.c
@@ -1374,27 +1374,6 @@ void pci_allocate_cap_save_buffers(struc
 }
 
 /**
- * pci_restore_standard_config - restore standard config registers of PCI device
- * @dev: PCI device to handle
- *
- * This function assumes that the device's configuration space is accessible.
- * If the device needs to be powered up, the function will wait for it to
- * change the state.
- */
-int pci_restore_standard_config(struct pci_dev *dev)
-{
-	pci_update_current_state(dev, PCI_UNKNOWN);
-
-	if (dev->current_state != PCI_D0) {
-		int error = pci_set_power_state(dev, PCI_D0);
-		if (error)
-			return error;
-	}
-
-	return dev->state_saved ? pci_restore_state(dev) : 0;
-}
-
-/**
  * pci_enable_ari - enable ARI forwarding if hardware support it
  * @dev: the PCI device
  */
Index: linux-2.6/drivers/pci/pci.h
===================================================================
--- linux-2.6.orig/drivers/pci/pci.h
+++ linux-2.6/drivers/pci/pci.h
@@ -49,7 +49,6 @@ extern void pci_disable_enabled_device(s
 extern void pci_pm_init(struct pci_dev *dev);
 extern void platform_pci_wakeup_init(struct pci_dev *dev);
 extern void pci_allocate_cap_save_buffers(struct pci_dev *dev);
-extern int pci_restore_standard_config(struct pci_dev *dev);
 
 static inline bool pci_is_bridge(struct pci_dev *pci_dev)
 {

^ permalink raw reply [flat|nested] 373+ messages in thread
* [PATCH 9/11] PCI PM: Put devices into low power states during late suspend (rev. 2)
  2009-03-14 11:24 ` [PATCH 0/11] PM: Rework suspend-resume ordering to avoid problems with shared interrupts (updated 2x) Rafael J. Wysocki
                     ` (15 preceding siblings ...)
  2009-03-14 11:32 ` Rafael J. Wysocki
@ 2009-03-14 11:32 ` Rafael J. Wysocki
  2009-03-14 11:32 ` Rafael J. Wysocki
                     ` (6 subsequent siblings)
  23 siblings, 0 replies; 373+ messages in thread
From: Rafael J. Wysocki @ 2009-03-14 11:32 UTC (permalink / raw)
  To: pm list
  Cc: Arve, Jeremy Fitzhardinge, LKML, Jesse Barnes, Eric W. Biederman, Linux PCI, Ingo Molnar, Linus Torvalds, Thomas Gleixner

From: Rafael J. Wysocki <rjw@sisk.pl>

Once we have allowed timer interrupts to be enabled during the late
phase of suspending devices, we are now able to use the generic
pci_set_power_state() to put PCI devices into low power states at
that time. We can also use some related platform callbacks, like the
ones preparing devices for wake-up, during the late suspend.

Doing this will allow us to avoid the race condition where a device
using shared interrupts is put into a low power state with interrupts
enabled and then an interrupt (for another device) comes in and
confuses its driver. At the same time, devices that don't support
the native PCI PM or that require some additional, platform-specific
operations to be carried out to put them into low power states will
be handled as appropriate.

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
---
 drivers/pci/pci-driver.c |  134 ++++++++++++++++++++++++++++-------------------
 1 file changed, 81 insertions(+), 53 deletions(-)

Index: linux-2.6/drivers/pci/pci-driver.c
===================================================================
--- linux-2.6.orig/drivers/pci/pci-driver.c
+++ linux-2.6/drivers/pci/pci-driver.c
@@ -352,53 +352,60 @@ static int pci_legacy_suspend(struct dev
 {
 	struct pci_dev * pci_dev = to_pci_dev(dev);
 	struct pci_driver * drv = pci_dev->driver;
-	int error = 0;
+
+	pci_dev->state_saved = false;
 
 	if (drv && drv->suspend) {
 		pci_power_t prev = pci_dev->current_state;
-
-		pci_dev->state_saved = false;
+		int error;
 
 		error = drv->suspend(pci_dev, state);
 		suspend_report_result(drv->suspend, error);
 		if (error)
 			return error;
 
-		if (pci_dev->state_saved)
-			goto Fixup;
-
-		if (pci_dev->current_state != PCI_D0
+		if (!pci_dev->state_saved && pci_dev->current_state != PCI_D0
 		    && pci_dev->current_state != PCI_UNKNOWN) {
 			WARN_ONCE(pci_dev->current_state != prev,
 				"PCI PM: Device state not saved by %pF\n",
 				drv->suspend);
-			goto Fixup;
 		}
 	}
 
-	pci_save_state(pci_dev);
-	/*
-	 * This is for compatibility with existing code with legacy PM support.
-	 */
-	pci_pm_set_unknown_state(pci_dev);
-
- Fixup:
 	pci_fixup_device(pci_fixup_suspend, pci_dev);
 
-	return error;
+	return 0;
 }
 
 static int pci_legacy_suspend_late(struct device *dev, pm_message_t state)
 {
 	struct pci_dev * pci_dev = to_pci_dev(dev);
 	struct pci_driver * drv = pci_dev->driver;
-	int error = 0;
 
 	if (drv && drv->suspend_late) {
+		pci_power_t prev = pci_dev->current_state;
+		int error;
+
 		error = drv->suspend_late(pci_dev, state);
 		suspend_report_result(drv->suspend_late, error);
+		if (error)
+			return error;
+
+		if (!pci_dev->state_saved && pci_dev->current_state != PCI_D0
+		    && pci_dev->current_state != PCI_UNKNOWN) {
+			WARN_ONCE(pci_dev->current_state != prev,
+				"PCI PM: Device state not saved by %pF\n",
+				drv->suspend_late);
+			return 0;
+		}
 	}
-	return error;
+
+	if (!pci_dev->state_saved)
+		pci_save_state(pci_dev);
+
+	pci_pm_set_unknown_state(pci_dev);
+
+	return 0;
 }
 
 static int pci_legacy_resume_early(struct device *dev)
@@ -460,7 +467,6 @@ static void pci_pm_default_suspend(struc
 	/* Disable non-bridge devices without PM support */
 	if (!pci_is_bridge(pci_dev))
 		pci_disable_enabled_device(pci_dev);
-	pci_save_state(pci_dev);
 }
 
 static bool pci_has_legacy_pm_support(struct pci_dev *pci_dev)
@@ -526,24 +532,14 @@ static int pci_pm_suspend(struct device
 		if (error)
 			return error;
 
-		if (pci_dev->state_saved)
-			goto Fixup;
-
-		if (pci_dev->current_state != PCI_D0
+		if (!pci_dev->state_saved && pci_dev->current_state != PCI_D0
 		    && pci_dev->current_state != PCI_UNKNOWN) {
 			WARN_ONCE(pci_dev->current_state != prev,
 				"PCI PM: State of device not saved by %pF\n",
 				pm->suspend);
-			goto Fixup;
 		}
 	}
 
-	if (!pci_dev->state_saved) {
-		pci_save_state(pci_dev);
-		if (!pci_is_bridge(pci_dev))
-			pci_prepare_to_sleep(pci_dev);
-	}
-
  Fixup:
 	pci_fixup_device(pci_fixup_suspend, pci_dev);
 
@@ -553,21 +549,41 @@ static int pci_pm_suspend(struct device
 static int pci_pm_suspend_noirq(struct device *dev)
 {
 	struct pci_dev *pci_dev = to_pci_dev(dev);
-	struct device_driver *drv = dev->driver;
-	int error = 0;
+	struct dev_pm_ops *pm = dev->driver ? dev->driver->pm : NULL;
 
 	if (pci_has_legacy_pm_support(pci_dev))
 		return pci_legacy_suspend_late(dev, PMSG_SUSPEND);
 
-	if (drv && drv->pm && drv->pm->suspend_noirq) {
-		error = drv->pm->suspend_noirq(dev);
-		suspend_report_result(drv->pm->suspend_noirq, error);
+	if (!pm)
+		return 0;
+
+	if (pm->suspend_noirq) {
+		pci_power_t prev = pci_dev->current_state;
+		int error;
+
+		error = pm->suspend_noirq(dev);
+		suspend_report_result(pm->suspend_noirq, error);
+		if (error)
+			return error;
+
+		if (!pci_dev->state_saved && pci_dev->current_state != PCI_D0
+		    && pci_dev->current_state != PCI_UNKNOWN) {
+			WARN_ONCE(pci_dev->current_state != prev,
+				"PCI PM: State of device not saved by %pF\n",
+				pm->suspend_noirq);
+			return 0;
+		}
 	}
 
-	if (!error)
-		pci_pm_set_unknown_state(pci_dev);
+	if (!pci_dev->state_saved) {
+		pci_save_state(pci_dev);
+		if (!pci_is_bridge(pci_dev))
+			pci_prepare_to_sleep(pci_dev);
+	}
 
-	return error;
+	pci_pm_set_unknown_state(pci_dev);
+
+	return 0;
 }
 
 static int pci_pm_resume_noirq(struct device *dev)
@@ -650,9 +666,6 @@ static int pci_pm_freeze(struct device *
 			return error;
 	}
 
-	if (!pci_dev->state_saved)
-		pci_save_state(pci_dev);
-
 	return 0;
 }
 
@@ -660,20 +673,25 @@ static int pci_pm_freeze_noirq(struct de
 {
 	struct pci_dev *pci_dev = to_pci_dev(dev);
 	struct device_driver *drv = dev->driver;
-	int error = 0;
 
 	if (pci_has_legacy_pm_support(pci_dev))
 		return pci_legacy_suspend_late(dev, PMSG_FREEZE);
 
 	if (drv && drv->pm && drv->pm->freeze_noirq) {
+		int error;
+
 		error = drv->pm->freeze_noirq(dev);
 		suspend_report_result(drv->pm->freeze_noirq, error);
+		if (error)
+			return error;
 	}
 
-	if (!error)
-		pci_pm_set_unknown_state(pci_dev);
+	if (!pci_dev->state_saved)
+		pci_save_state(pci_dev);
 
-	return error;
+	pci_pm_set_unknown_state(pci_dev);
+
+	return 0;
 }
 
 static int pci_pm_thaw_noirq(struct device *dev)
@@ -716,7 +734,6 @@ static int pci_pm_poweroff(struct device
 {
 	struct pci_dev *pci_dev = to_pci_dev(dev);
 	struct dev_pm_ops *pm = dev->driver ? dev->driver->pm : NULL;
-	int error = 0;
 
 	if (pci_has_legacy_pm_support(pci_dev))
 		return pci_legacy_suspend(dev, PMSG_HIBERNATE);
@@ -729,33 +746,44 @@ static int pci_pm_poweroff(struct device
 	pci_dev->state_saved = false;
 
 	if (pm->poweroff) {
+		int error;
+
 		error = pm->poweroff(dev);
 		suspend_report_result(pm->poweroff, error);
+		if (error)
+			return error;
 	}
 
-	if (!pci_dev->state_saved && !pci_is_bridge(pci_dev))
-		pci_prepare_to_sleep(pci_dev);
-
  Fixup:
 	pci_fixup_device(pci_fixup_suspend, pci_dev);
 
-	return error;
+	return 0;
 }
 
 static int pci_pm_poweroff_noirq(struct device *dev)
 {
+	struct pci_dev *pci_dev = to_pci_dev(dev);
 	struct device_driver *drv = dev->driver;
-	int error = 0;
 
 	if (pci_has_legacy_pm_support(to_pci_dev(dev)))
 		return pci_legacy_suspend_late(dev, PMSG_HIBERNATE);
 
-	if (drv && drv->pm && drv->pm->poweroff_noirq) {
+	if (!drv || !drv->pm)
+		return 0;
+
+	if (drv->pm->poweroff_noirq) {
+		int error;
+
 		error = drv->pm->poweroff_noirq(dev);
 		suspend_report_result(drv->pm->poweroff_noirq, error);
+		if (error)
+			return error;
 	}
 
-	return error;
+	if (!pci_dev->state_saved && !pci_is_bridge(pci_dev))
+		pci_prepare_to_sleep(pci_dev);
+
+	return 0;
 }
 
 static int pci_pm_restore_noirq(struct device *dev)

^ permalink raw reply [flat|nested] 373+ messages in thread
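The ordering that [PATCH 9/11] introduces for the "noirq" (late) suspend phase can be sketched in user space as below. This is a rough model under stated assumptions, not the kernel code: all sketch_* names are hypothetical, and the legacy-PM branch, the early return for a missing pm structure and the WARN_ONCE() diagnostic from the patch are left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct sketch_dev;

/* Hypothetical driver callback type for the late suspend phase. */
typedef int (*sketch_noirq_cb)(struct sketch_dev *dev);

/* Hypothetical, reduced stand-in for struct pci_dev plus its driver hook. */
struct sketch_dev {
	bool is_bridge;                /* bridges are not put to sleep by the core */
	bool state_saved;              /* config space already saved */
	bool in_low_power;             /* core moved the device to a low power state */
	bool state_unknown;            /* core marked the power state as unknown */
	sketch_noirq_cb suspend_noirq; /* optional driver callback, may be NULL */
};

/* Example callback that fails (the value -16 is an arbitrary error code). */
int sketch_cb_fail(struct sketch_dev *dev)
{
	(void)dev;
	return -16;
}

/* Example callback for a driver that saves the device state on its own. */
int sketch_cb_saves_state(struct sketch_dev *dev)
{
	dev->state_saved = true;
	return 0;
}

static void sketch_save_state(struct sketch_dev *dev)
{
	dev->state_saved = true;
}

static void sketch_prepare_to_sleep(struct sketch_dev *dev)
{
	dev->in_low_power = true;
}

/*
 * Late suspend: run the driver callback first, stop on error, and only
 * then let the core save state and power the device down, provided the
 * driver has not already taken care of it.
 */
int sketch_suspend_noirq(struct sketch_dev *dev)
{
	assert(dev != NULL);

	if (dev->suspend_noirq) {
		int error = dev->suspend_noirq(dev);
		if (error)
			return error;
	}

	if (!dev->state_saved) {
		sketch_save_state(dev);
		if (!dev->is_bridge)
			sketch_prepare_to_sleep(dev);
	}

	dev->state_unknown = true;

	return 0;
}
```

Because the power-down happens in this late phase rather than in the regular suspend callback, the device is still fully powered while its interrupt handler may run for a shared interrupt, which is the race the changelog describes.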

* [PATCH 10/11] PCI PM: Make pci_set_power_state() handle devices with no PM support
From: Rafael J. Wysocki @ 2009-03-14 11:33 UTC
  To: pm list
  Cc: Arve, Jeremy Fitzhardinge, LKML, Jesse Barnes, Eric W. Biederman,
	Linux PCI, Ingo Molnar, Linus Torvalds, Thomas Gleixner

From: Rafael J. Wysocki <rjw@sisk.pl>

There is a problem with PCI devices without any PM support (either
native or through the platform) that pci_set_power_state() always returns
error code for them, even if they are being put into D0.  However, such
devices are always in D0, so pci_set_power_state() should return success
when attempting to put such a device into D0.  It also should update the
current_state field for these devices as appropriate.  This modification
is necessary so that the standard configuration registers of these
devices are successfully restored by pci_restore_standard_config() during
the "early" phase of resume.

In addition, pci_set_power_state() should check the value of
current_state before calling the platform to change the power state of
the device to avoid doing that unnecessarily.

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
---
 drivers/pci/pci.c |   18 ++++++++++++------
 1 file changed, 12 insertions(+), 6 deletions(-)

Index: linux-2.6/drivers/pci/pci.c
===================================================================
--- linux-2.6.orig/drivers/pci/pci.c
+++ linux-2.6/drivers/pci/pci.c
@@ -439,6 +439,10 @@ static int pci_raw_set_power_state(struc
 	u16 pmcsr;
 	bool need_restore = false;

+	/* Check if we're already there */
+	if (dev->current_state == state)
+		return 0;
+
 	if (!dev->pm_cap)
 		return -EIO;

@@ -449,10 +453,7 @@ static int pci_raw_set_power_state(struc
 	 * Can enter D0 from any state, but if we can only go deeper
 	 * to sleep if we're already in a low power state
 	 */
-	if (dev->current_state == state) {
-		/* we're already there */
-		return 0;
-	} else if (state != PCI_D0 && dev->current_state <= PCI_D3cold
+	if (state != PCI_D0 && dev->current_state <= PCI_D3cold
 	    && dev->current_state > state) {
 		dev_err(&dev->dev, "invalid power transition "
 			"(from state %d to %d)\n", dev->current_state, state);
@@ -570,12 +571,17 @@ int pci_set_power_state(struct pci_dev *
 		 */
 		return 0;

-	if (state == PCI_D0 && platform_pci_power_manageable(dev)) {
+	/* Check if we're already there */
+	if (dev->current_state == state)
+		return 0;
+
+	if (state == PCI_D0) {
 		/*
 		 * Allow the platform to change the state, for example via ACPI
 		 * _PR0, _PS0 and some such, but do not trust it.
 		 */
-		int ret = platform_pci_set_power_state(dev, PCI_D0);
+		int ret = platform_pci_power_manageable(dev) ?
+			platform_pci_set_power_state(dev, PCI_D0) : 0;
 		if (!ret)
 			pci_update_current_state(dev, PCI_D0);
 	}
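
[Editorial sketch, not part of the patch.]  The behaviour described in
the changelog above can be illustrated with a small user-space model.
Everything named toy_* below is hypothetical and only mimics the shape
of pci_set_power_state() and pci_raw_set_power_state(); the real
functions do considerably more (PMCSR accesses, delays, handling of
states other than D0 through the platform), all of which is left out.
The model only tries to show the two points the changelog makes: the
"already there" check comes first, and a device with no PM support at
all can still be "put" into D0 successfully.

```c
#include <stdbool.h>
#include <errno.h>

/* Toy power states, ordered like the D-states they stand for. */
enum toy_power_state {
	TOY_D0 = 0, TOY_D1, TOY_D2, TOY_D3hot, TOY_D3cold, TOY_UNKNOWN
};

/* Hypothetical stand-in for struct pci_dev; only what the sketch needs. */
struct toy_pci_dev {
	enum toy_power_state current_state;
	bool has_pm_cap;		/* native PM capability present */
	bool platform_manageable;	/* platform (e.g. ACPI) can manage power */
	int platform_calls;		/* how often the "platform" was invoked */
};

/* Pretend platform hook: always succeeds, just counts invocations. */
static int toy_platform_set_power_state(struct toy_pci_dev *dev,
					enum toy_power_state state)
{
	(void)state;
	dev->platform_calls++;
	return 0;
}

/* Models the native path: the "already there" check precedes the pm_cap test. */
static int toy_raw_set_power_state(struct toy_pci_dev *dev,
				   enum toy_power_state state)
{
	if (dev->current_state == state)
		return 0;

	if (!dev->has_pm_cap)
		return -EIO;

	/* D0 is reachable from anywhere; otherwise only go deeper. */
	if (state != TOY_D0 && dev->current_state <= TOY_D3cold
	    && dev->current_state > state)
		return -EINVAL;

	dev->current_state = state;
	return 0;
}

/* Models the top-level entry point after the patch. */
static int toy_set_power_state(struct toy_pci_dev *dev,
			       enum toy_power_state state)
{
	/* Avoid calling the platform if nothing has to change. */
	if (dev->current_state == state)
		return 0;

	if (state == TOY_D0) {
		int ret = dev->platform_manageable ?
			toy_platform_set_power_state(dev, TOY_D0) : 0;
		if (!ret)
			dev->current_state = TOY_D0;	/* state bookkeeping */
	}

	return toy_raw_set_power_state(dev, state);
}
```

In this model a device with neither has_pm_cap nor platform_manageable
set ends up in TOY_D0 with a return value of 0, while a request for any
other state still fails with -EIO, which is the distinction the
changelog draws.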

* [PATCH 11/11] PCI PM: Restore config spaces of all devices during early resume
From: Rafael J. Wysocki @ 2009-03-14 11:34 UTC
  To: pm list
  Cc: Arve, Jeremy Fitzhardinge, LKML, Jesse Barnes, Eric W. Biederman,
	Linux PCI, Ingo Molnar, Linus Torvalds, Thomas Gleixner

From: Rafael J. Wysocki <rjw@sisk.pl>

At present the configuration spaces of PCI devices that have no drivers
or no PM support in the drivers (either legacy or through a pm object)
are not saved during suspend and, consequently, they are not restored
during resume.  This generally may lead to the state of the system being
slightly inconsistent after the resume, so it's better to save and
restore the configuration spaces of these devices as well.

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
---
 drivers/pci/pci-driver.c |   16 +++++++++-------
 1 file changed, 9 insertions(+), 7 deletions(-)

Index: linux-2.6/drivers/pci/pci-driver.c
===================================================================
--- linux-2.6.orig/drivers/pci/pci-driver.c
+++ linux-2.6/drivers/pci/pci-driver.c
@@ -516,13 +516,13 @@ static int pci_pm_suspend(struct device
 	if (pci_has_legacy_pm_support(pci_dev))
 		return pci_legacy_suspend(dev, PMSG_SUSPEND);

+	pci_dev->state_saved = false;
+
 	if (!pm) {
 		pci_pm_default_suspend(pci_dev);
 		goto Fixup;
 	}

-	pci_dev->state_saved = false;
-
 	if (pm->suspend) {
 		pci_power_t prev = pci_dev->current_state;
 		int error;
@@ -554,8 +554,10 @@ static int pci_pm_suspend_noirq(struct d
 	if (pci_has_legacy_pm_support(pci_dev))
 		return pci_legacy_suspend_late(dev, PMSG_SUSPEND);

-	if (!pm)
+	if (!pm) {
+		pci_save_state(pci_dev);
 		return 0;
+	}

 	if (pm->suspend_noirq) {
 		pci_power_t prev = pci_dev->current_state;
@@ -650,13 +652,13 @@ static int pci_pm_freeze(struct device *
 	if (pci_has_legacy_pm_support(pci_dev))
 		return pci_legacy_suspend(dev, PMSG_FREEZE);

+	pci_dev->state_saved = false;
+
 	if (!pm) {
 		pci_pm_default_suspend(pci_dev);
 		return 0;
 	}

-	pci_dev->state_saved = false;
-
 	if (pm->freeze) {
 		int error;

@@ -738,13 +740,13 @@ static int pci_pm_poweroff(struct device
 	if (pci_has_legacy_pm_support(pci_dev))
 		return pci_legacy_suspend(dev, PMSG_HIBERNATE);

+	pci_dev->state_saved = false;
+
 	if (!pm) {
 		pci_pm_default_suspend(pci_dev);
 		goto Fixup;
 	}

-	pci_dev->state_saved = false;
-
 	if (pm->poweroff) {
 		int error;

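
[Editorial sketch, not part of the patch.]  The effect of moving the
state_saved reset and adding the pci_save_state() call for driverless
devices can be modelled in user space roughly as follows.  The toy_*
names and the 16-word "configuration space" are assumptions made for
illustration only; the default suspend handling and all driver
callbacks are reduced to comments, and the restore step is a plain
copy rather than what pci_restore_standard_config() really does.

```c
#include <stdbool.h>
#include <string.h>

#define TOY_CFG_WORDS 16

/* Hypothetical stand-in for struct pci_dev; not the kernel structure. */
struct toy_dev {
	unsigned int config[TOY_CFG_WORDS];	/* "live" configuration space */
	unsigned int saved[TOY_CFG_WORDS];	/* copy taken during suspend */
	bool state_saved;
	bool has_pm_ops;			/* driver supplies a pm object */
};

/* Models saving the state: copy the registers and set the flag. */
static void toy_save_state(struct toy_dev *d)
{
	memcpy(d->saved, d->config, sizeof(d->saved));
	d->state_saved = true;
}

/* Regular suspend phase: the flag is cleared for every device. */
static int toy_suspend(struct toy_dev *d)
{
	d->state_saved = false;
	/* Driver ->suspend() or the default handling would run here. */
	return 0;
}

/* "Late" suspend phase: devices without a pm object are saved too. */
static int toy_suspend_noirq(struct toy_dev *d)
{
	if (!d->has_pm_ops) {
		toy_save_state(d);
		return 0;
	}

	/* A driver ->suspend_noirq() callback would run here. */
	if (!d->state_saved)
		toy_save_state(d);

	return 0;
}

/* "Early" resume phase: put back whatever was saved. */
static void toy_resume_noirq(struct toy_dev *d)
{
	if (d->state_saved)
		memcpy(d->config, d->saved, sizeof(d->config));
}
```

Running toy_suspend(), toy_suspend_noirq() and toy_resume_noirq() in
that order on a device with has_pm_ops == false brings back any
register value that was clobbered in between, which is the case the
changelog says used to be left inconsistent.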

* Re: [PATCH 0/11] PM: Rework suspend-resume ordering to avoid problems with shared interrupts (updated 2x)
From: Ingo Molnar @ 2009-03-14 11:43 UTC
  To: Rafael J. Wysocki
  Cc: Jeremy Fitzhardinge, LKML, Jesse Barnes, Eric W. Biederman,
	Linux PCI, pm list, Linus Torvalds, Thomas Gleixner

* Rafael J. Wysocki <rjw@sisk.pl> wrote:

> Hi,
>
> This is an update of the patch series reworking the handling
> of interrupts during suspend-resume, addressing some comments
> from Thomas and Ingo.

Looks very nice - thanks Rafael!

Acked-by: Ingo Molnar <mingo@elte.hu>

	Ingo

* [PATCH 0/11] PM: Rework suspend-resume ordering to avoid problems with shared interrupts (updated 2x)
From: Rafael J. Wysocki @ 2009-03-14 11:24 UTC
  To: pm list
  Cc: Arve, Jeremy Fitzhardinge, LKML, Jesse Barnes, Eric W. Biederman,
	Linux PCI, Ingo Molnar, Linus Torvalds, Thomas Gleixner

Hi,

This is an update of the patch series reworking the handling of
interrupts during suspend-resume, addressing some comments from Thomas
and Ingo.

The following patches modify the way in which we handle disabling
interrupts during suspend and enabling them during resume.  They also
change the ordering of the core suspend and hibernation code to take
advantage of the new approach to the interrupts and modify the PCI PM
core to avoid a few problems.

Namely, interrupts are currently disabled on the boot CPU as soon as the
nonboot CPUs have been disabled, which doesn't allow device drivers'
"late" suspend and "early" resume callbacks to sleep.  Among other
things this means they cannot execute ACPI AML routines, which leads to
problems with suspend-resume of PCI devices, as recently discussed.

1/11 introduces helper functions used by the subsequent patches.

2/11 modifies the [suspend|hibernation] and resume code, as well as the
other code using the device PM framework, so that device drivers will
not receive interrupts during the "late" suspend phase, although
interrupts will only be disabled on the CPU right before calling
sysdev_suspend() (and analogously during resume).

3/11 - 5/11 modify the suspend, hibernation and kexec jump code,
respectively, so that the "late" phase of suspending devices will happen
before executing the platform "prepare" callback and disabling nonboot
CPUs (and analogously during resume).

6/11 is a patch that's already in the PCI linux-next tree and I included
it in the series, because the next patches depend on it.

7/11 makes the PCI PM core use pci_set_power_state() to put devices into
D0 during early resume, which allows the platform-specific operations to
be carried out at that time, if necessary.

8/11 uses the opportunity to move pci_restore_standard_config() to
pci-driver.c, where it belongs IMO.

9/11 makes the PCI PM core code put devices into low power states during
the "late" phase of suspend which allows us to avoid a long-standing
race related to shared interrupts and to handle devices that require
some platform-specific operations to be put into low power states
appropriately at the same time.  [The second rev of the patch retains
the current behavior during the "power-off" phase of hibernation, which
is that the devices without drivers or without PM support in the drivers
are not power managed by the core.]

10/11 fixes pci_set_power_state() so that it doesn't return error code
when attempting to put a PCI device without PM support (either native or
through the platform) into D0 (such devices are always in D0).

11/11 makes the PCI PM core save and restore the configuration spaces of
devices that have no drivers or no PM support in the drivers during
suspend and resume, respectively.

There is a git tree containing these patches, for easier testing, at:
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/suspend-2.6.git
(linux-next branch).  At the moment it has a merge conflict with the PCI
linux-next tree due to 6/11.

Thanks,
Rafael
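
[Editorial sketch, not part of the series.]  The ordering described for
2/11 through 5/11 can be pictured with a toy state machine.  All toy_*
names are hypothetical, the platform "prepare" step is only a comment,
and the relative order of "prepare" and taking the nonboot CPUs offline
is an assumption based on the order in which the cover letter mentions
them.  The point of the model is just the two invariants the cover
letter states: the "late" callbacks run with device interrupts quiesced
but with interrupts still enabled on the CPU (so they may sleep), and
CPU interrupts go off only right before the sysdev step.

```c
#include <stdbool.h>

/* Hypothetical snapshot of the machine during suspend; illustration only. */
struct toy_system {
	bool device_irqs_suspended;	/* device interrupt delivery quiesced */
	bool cpu_irqs_enabled;		/* interrupts enabled on the boot CPU */
	bool nonboot_cpus_online;

	/* What each phase observed when it ran. */
	bool late_saw_device_irqs_off;
	bool late_could_sleep;
	bool late_saw_nonboot_cpus;
	bool sysdev_saw_cpu_irqs_off;
};

/* Stands in for the drivers' "late" suspend callbacks. */
static void toy_late_suspend(struct toy_system *s)
{
	s->late_saw_device_irqs_off = s->device_irqs_suspended;
	s->late_could_sleep = s->cpu_irqs_enabled;
	s->late_saw_nonboot_cpus = s->nonboot_cpus_online;
}

/* Stands in for the sysdev suspend step. */
static void toy_sysdev_suspend(struct toy_system *s)
{
	s->sysdev_saw_cpu_irqs_off = !s->cpu_irqs_enabled;
}

/* The reordered sequence, reduced to flag updates. */
static void toy_suspend_enter(struct toy_system *s)
{
	s->device_irqs_suspended = true;	/* helper from 1/11, modelled */
	toy_late_suspend(s);			/* "late" phase of devices */

	/* The platform "prepare" callback would run here. */
	s->nonboot_cpus_online = false;		/* nonboot CPUs go offline */

	s->cpu_irqs_enabled = false;		/* only now, on the boot CPU */
	toy_sysdev_suspend(s);
}
```

Starting from a system with CPU interrupts enabled and all CPUs online,
toy_suspend_enter() leaves late_could_sleep and late_saw_nonboot_cpus
set, which corresponds to the "late" phase now happening earlier than
before.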