* Re: [Bug 10030] Suspend doesn't work when SD card is inserted [not found] <20080219232338.E9A24108068@picon.linux-foundation.org> @ 2008-02-20 16:42 ` Alan Stern 2008-02-20 17:30 ` Pierre Ossman 0 siblings, 1 reply; 92+ messages in thread From: Alan Stern @ 2008-02-20 16:42 UTC (permalink / raw) To: Pierre Ossman; +Cc: Rafael J. Wysocki, Zdenek Kabelac, Kernel development list On Tue, 19 Feb 2008 bugme-daemon@bugzilla.kernel.org wrote: > http://bugzilla.kernel.org/show_bug.cgi?id=10030 > > > > > > ------- Comment #14 from rjw@sisk.pl 2008-02-19 15:23 ------- > Thanks a lot for the debugging work! > > First, the patch triggers, which means that the problem discovered by Alan is > troubling us. [Alan, do you have an idea how to fix that cleanly?] I suggest we ask the maintainer for the MMC subsystem. Pierre, you can find the details in the bugzilla entry. Briefly, there's a pathway in the MMC core suspend routine (if the driver doesn't implement a resume hook) which could lead to the host being removed during a system suspend. This is an illegal operation and it will deadlock. Do you have a suggestion for a way to fix it? Alan Stern ^ permalink raw reply [flat|nested] 92+ messages in thread
* Re: [Bug 10030] Suspend doesn't work when SD card is inserted 2008-02-20 16:42 ` [Bug 10030] Suspend doesn't work when SD card is inserted Alan Stern @ 2008-02-20 17:30 ` Pierre Ossman 2008-02-20 19:26 ` Alan Stern 0 siblings, 1 reply; 92+ messages in thread From: Pierre Ossman @ 2008-02-20 17:30 UTC (permalink / raw) To: Alan Stern; +Cc: Rafael J. Wysocki, Zdenek Kabelac, Kernel development list On Wed, 20 Feb 2008 11:42:56 -0500 (EST) Alan Stern <stern@rowland.harvard.edu> wrote: > > > > ------- Comment #14 from rjw@sisk.pl 2008-02-19 15:23 ------- > > Thanks a lot for the debugging work! > > > > First, the patch triggers, which means that the problem discovered by Alan is > > troubling us. [Alan, do you have an idea how to fix that cleanly?] > > I suggest we ask the maintainer for the MMC subsystem. > > Pierre, you can find the details in the bugzilla entry. Briefly, > there's a pathway in the MMC core suspend routine (if the driver > doesn't implement a resume hook) which could lead to the host being > removed during a system suspend. This is an illegal operation and it > will deadlock. > > Do you have a suggestion for a way to fix it? > Not really. But you have some things confused. What it checks is if the mmc bus handler (not a proper driver model, just a way of separating the MMC, SD and SDIO stuff) has a resume function. And if it doesn't, it removes the card (since it cannot revive it at resume). So the only thing I can think of is to delay the removal until the resume routine, if that is safer. Rgds -- -- Pierre Ossman Linux kernel, MMC maintainer http://www.kernel.org PulseAudio, core developer http://pulseaudio.org rdesktop, core developer http://www.rdesktop.org ^ permalink raw reply [flat|nested] 92+ messages in thread
* Re: [Bug 10030] Suspend doesn't work when SD card is inserted 2008-02-20 17:30 ` Pierre Ossman @ 2008-02-20 19:26 ` Alan Stern 2008-02-20 20:51 ` Pierre Ossman 0 siblings, 1 reply; 92+ messages in thread From: Alan Stern @ 2008-02-20 19:26 UTC (permalink / raw) To: Pierre Ossman; +Cc: Rafael J. Wysocki, Zdenek Kabelac, Kernel development list On Wed, 20 Feb 2008, Pierre Ossman wrote: > Not really. But you have some things confused. What it checks is if > the mmc bus handler (not a proper driver model, just a way of > separating the MMC, SD and SDIO stuff) has a resume function. And if > it doesn't, it removes the card (since it cannot revive it at > resume). > > So the only thing I can think of is to delay the removal until the > resume routine, if that is safer. Do I understand this correctly? You've got special handling for the case where a bus handler doesn't have a resume routine, but no special handling for the case where it doesn't have a suspend routine? Why bother to remove the device if neither routine exists (there won't be any need to revive it since the bus never got suspended)? And why not simply fail the suspend if the resume routine doesn't exist and the suspend routine does? Maybe with an error message in the system log. Alan Stern ^ permalink raw reply [flat|nested] 92+ messages in thread
* Re: [Bug 10030] Suspend doesn't work when SD card is inserted 2008-02-20 19:26 ` Alan Stern @ 2008-02-20 20:51 ` Pierre Ossman 2008-02-20 20:58 ` Rafael J. Wysocki 2008-02-20 21:06 ` Alan Stern 0 siblings, 2 replies; 92+ messages in thread From: Pierre Ossman @ 2008-02-20 20:51 UTC (permalink / raw) To: Alan Stern; +Cc: Rafael J. Wysocki, Zdenek Kabelac, Kernel development list On Wed, 20 Feb 2008 14:26:16 -0500 (EST) Alan Stern <stern@rowland.harvard.edu> wrote: > > Do I understand this correctly? You've got special handling for the > case where a bus handler doesn't have a resume routine, but no special > handling for the case where it doesn't have a suspend routine? Hmm... There should be checks for both, but the code seems to suggest otherwise. > Why bother to remove the device if neither routine exists (there won't be > any need to revive it since the bus never got suspended)? The bus always gets suspended. The checks are to determine if state is saved or not. If it isn't, then a suspend/resume is treated as a removal/insertion. > And why not simply fail the suspend if the resume routine doesn't exist > and the suspend routine does? Maybe with an error message in the > system log. For the asymmetric case, I guess that would do. But I still want to remove devices when the bus handler has no suspend handling. Rgds -- -- Pierre Ossman Linux kernel, MMC maintainer http://www.kernel.org PulseAudio, core developer http://pulseaudio.org rdesktop, core developer http://www.rdesktop.org ^ permalink raw reply [flat|nested] 92+ messages in thread
* Re: [Bug 10030] Suspend doesn't work when SD card is inserted 2008-02-20 20:51 ` Pierre Ossman @ 2008-02-20 20:58 ` Rafael J. Wysocki 2008-02-20 21:06 ` Alan Stern 1 sibling, 0 replies; 92+ messages in thread From: Rafael J. Wysocki @ 2008-02-20 20:58 UTC (permalink / raw) To: Pierre Ossman; +Cc: Alan Stern, Zdenek Kabelac, Kernel development list On Wednesday, 20 of February 2008, Pierre Ossman wrote: > On Wed, 20 Feb 2008 14:26:16 -0500 (EST) > Alan Stern <stern@rowland.harvard.edu> wrote: > > > > > Do I understand this correctly? You've got special handling for the > > case where a bus handler doesn't have a resume routine, but no special > > handling for the case where it doesn't have a suspend routine? > > Hmm... There should be checks for both, but the code seems to suggest otherwise. > > > Why bother to remove the device if neither routine exists (there won't be > > any need to revive it since the bus never got suspended)? > > The bus always gets suspended. The checks are to determine if state is saved or not. If it isn't, then a suspend/resume is treated as a removal/insertion. > > > And why not simply fail the suspend if the resume routine doesn't exist > > and the suspend routine does? Maybe with an error message in the > > system log. > > For the asymmetric case, I guess that would do. But I still want to remove devices when the bus handler has no suspend handling. I think I know how to handle that, but there's a more urgent issue I need to fix first. Stay tuned. :-) Thanks, Rafael ^ permalink raw reply [flat|nested] 92+ messages in thread
* Re: [Bug 10030] Suspend doesn't work when SD card is inserted 2008-02-20 20:51 ` Pierre Ossman 2008-02-20 20:58 ` Rafael J. Wysocki @ 2008-02-20 21:06 ` Alan Stern 2008-02-20 22:15 ` Rafael J. Wysocki 1 sibling, 1 reply; 92+ messages in thread From: Alan Stern @ 2008-02-20 21:06 UTC (permalink / raw) To: Pierre Ossman; +Cc: Rafael J. Wysocki, Zdenek Kabelac, Kernel development list On Wed, 20 Feb 2008, Pierre Ossman wrote: > > And why not simply fail the suspend if the resume routine doesn't exist > > and the suspend routine does? Maybe with an error message in the > > system log. > > For the asymmetric case, I guess that would do. But I still want to remove devices when the bus handler has no suspend handling. Then it appears the correct approach is to use the new device_pm_schedule_removal() routine. It schedules the removal of a suspended device for a time when the system suspend is over. Alan Stern ^ permalink raw reply [flat|nested] 92+ messages in thread
* Re: [Bug 10030] Suspend doesn't work when SD card is inserted 2008-02-20 21:06 ` Alan Stern @ 2008-02-20 22:15 ` Rafael J. Wysocki 2008-02-20 22:24 ` Alan Stern 0 siblings, 1 reply; 92+ messages in thread From: Rafael J. Wysocki @ 2008-02-20 22:15 UTC (permalink / raw) To: Alan Stern; +Cc: Pierre Ossman, Zdenek Kabelac, Kernel development list On Wednesday, 20 of February 2008, Alan Stern wrote: > On Wed, 20 Feb 2008, Pierre Ossman wrote: > > > > And why not simply fail the suspend if the resume routine doesn't exist > > > and the suspend routine does? Maybe with an error message in the > > > system log. > > > > For the asymmetric case, I guess that would do. But I still want to remove devices when the bus handler has no suspend handling. > > Then it appears the correct approach is to use the new > device_pm_schedule_removal() routine. It schedules the removal of a > suspended device for a time when the system suspend is over. Well, below is an uncompiled and untested but illustrating the idea that might allow people not to bother with device_pm_schedule_removal() explicitly and can fix the issue at hand. [There are some cases that need handling and are not covered here.] Please have a look. Thanks, Rafael --- drivers/base/core.c | 5 ++++- drivers/base/power/main.c | 22 ++++++++++++++++++++++ drivers/base/power/power.h | 5 +++++ 3 files changed, 31 insertions(+), 1 deletion(-) Index: linux-2.6/drivers/base/power/main.c =================================================================== --- linux-2.6.orig/drivers/base/power/main.c +++ linux-2.6/drivers/base/power/main.c @@ -59,6 +59,26 @@ static DECLARE_RWSEM(pm_sleep_rwsem); int (*platform_enable_wakeup)(struct device *dev, int is_on); +static struct task_struct *suspending_task; +static DEFINE_MUTEX(suspending_task_mtx); + +bool in_suspend_context(void) +{ + bool result; + + mutex_lock(&suspending_task_mtx); + result = (suspending_task == current); + mutex_unlock(&suspending_task_mtx); + return result; +} + +static void set_suspending_task(struct task_struct *task) +{ + mutex_lock(&suspending_task_mtx); + suspending_task = task; + mutex_unlock(&suspending_task_mtx); +} + /** * device_pm_add - add a device to the list of active devices * @dev: Device to be added to the list @@ -272,6 +292,7 @@ static void dpm_resume(void) mutex_lock(&dpm_list_mtx); } mutex_unlock(&dpm_list_mtx); + set_suspending_task(NULL); } /** @@ -460,6 +481,7 @@ static int dpm_suspend(pm_message_t stat { int error = 0; + set_suspending_task(current); mutex_lock(&dpm_list_mtx); while (!list_empty(&dpm_locked)) { struct list_head *entry = dpm_locked.prev; Index: linux-2.6/drivers/base/core.c =================================================================== --- linux-2.6.orig/drivers/base/core.c +++ linux-2.6/drivers/base/core.c @@ -1162,7 +1162,10 @@ void device_destroy(struct class *class, dev = class_find_device(class, &devt, __match_devt); if (dev) { put_device(dev); - device_unregister(dev); + if (in_suspend_context()) + device_pm_schedule_removal(dev); + else + device_unregister(dev); } } EXPORT_SYMBOL_GPL(device_destroy); Index: linux-2.6/drivers/base/power/power.h =================================================================== --- linux-2.6.orig/drivers/base/power/power.h +++ linux-2.6/drivers/base/power/power.h @@ -11,6 +11,7 @@ static inline struct device *to_device(s return container_of(entry, struct device, power.entry); } +extern bool in_suspend_context(void); extern void device_pm_add(struct device *); extern void device_pm_remove(struct device *); extern int pm_sleep_lock(void); @@ -18,6 +19,10 @@ extern void pm_sleep_unlock(void); #else /* CONFIG_PM_SLEEP */ +static inline bool in_suspend_context(void) +{ + return false; +} static inline void device_pm_add(struct device *dev) { ^ permalink raw reply [flat|nested] 92+ messages in thread
* Re: [Bug 10030] Suspend doesn't work when SD card is inserted 2008-02-20 22:15 ` Rafael J. Wysocki @ 2008-02-20 22:24 ` Alan Stern 2008-02-20 22:41 ` Rafael J. Wysocki 2008-02-21 0:02 ` Rafael J. Wysocki 0 siblings, 2 replies; 92+ messages in thread From: Alan Stern @ 2008-02-20 22:24 UTC (permalink / raw) To: Rafael J. Wysocki; +Cc: Pierre Ossman, Zdenek Kabelac, Kernel development list On Wed, 20 Feb 2008, Rafael J. Wysocki wrote: > Well, below is an uncompiled and untested but illustrating the idea that > might allow people not to bother with device_pm_schedule_removal() > explicitly and can fix the issue at hand. > > [There are some cases that need handling and are not covered here.] > > Please have a look. > > Thanks, > Rafael > +static struct task_struct *suspending_task; > +static DEFINE_MUTEX(suspending_task_mtx); I suspect you don't really need this mutex. > +bool in_suspend_context(void) > +{ > + bool result; > + > + mutex_lock(&suspending_task_mtx); > + result = (suspending_task == current); > + mutex_unlock(&suspending_task_mtx); > + return result; > +} If suspending_task == current then you are guaranteed to be serialized, because everything a single task does is serial. > @@ -1162,7 +1162,10 @@ void device_destroy(struct class *class, > dev = class_find_device(class, &devt, __match_devt); > if (dev) { > put_device(dev); > - device_unregister(dev); > + if (in_suspend_context()) > + device_pm_schedule_removal(dev); > + else > + device_unregister(dev); > } > } But what about device_del()? Can a similar change be made there? Alan Stern ^ permalink raw reply [flat|nested] 92+ messages in thread
* Re: [Bug 10030] Suspend doesn't work when SD card is inserted 2008-02-20 22:24 ` Alan Stern @ 2008-02-20 22:41 ` Rafael J. Wysocki 2008-02-21 0:02 ` Rafael J. Wysocki 1 sibling, 0 replies; 92+ messages in thread From: Rafael J. Wysocki @ 2008-02-20 22:41 UTC (permalink / raw) To: Alan Stern; +Cc: Pierre Ossman, Zdenek Kabelac, Kernel development list On Wednesday, 20 of February 2008, Alan Stern wrote: > On Wed, 20 Feb 2008, Rafael J. Wysocki wrote: > > > Well, below is an uncompiled and untested but illustrating the idea that > > might allow people not to bother with device_pm_schedule_removal() > > explicitly and can fix the issue at hand. > > > > [There are some cases that need handling and are not covered here.] > > > > Please have a look. > > > > Thanks, > > Rafael > > > +static struct task_struct *suspending_task; > > +static DEFINE_MUTEX(suspending_task_mtx); > > I suspect you don't really need this mutex. > > > +bool in_suspend_context(void) > > +{ > > + bool result; > > + > > + mutex_lock(&suspending_task_mtx); > > + result = (suspending_task == current); > > + mutex_unlock(&suspending_task_mtx); > > + return result; > > +} > > If suspending_task == current then you are guaranteed to be serialized, > because everything a single task does is serial. But in principle there could be a concurrent thread removind the device and that should block on dev->sem held by us. Right now that's not very likely to happen thanks to the freezer, but we're doing all this stuff, because we want to get rid of the freezer eventually. :-) > > @@ -1162,7 +1162,10 @@ void device_destroy(struct class *class, > > dev = class_find_device(class, &devt, __match_devt); > > if (dev) { > > put_device(dev); > > - device_unregister(dev); > > + if (in_suspend_context()) > > + device_pm_schedule_removal(dev); > > + else > > + device_unregister(dev); > > } > > } > > But what about device_del()? Can a similar change be made there? I believe so. I'm working on a more complete patch right now. Thanks, Rafael ^ permalink raw reply [flat|nested] 92+ messages in thread
* Re: [Bug 10030] Suspend doesn't work when SD card is inserted 2008-02-20 22:24 ` Alan Stern 2008-02-20 22:41 ` Rafael J. Wysocki @ 2008-02-21 0:02 ` Rafael J. Wysocki 2008-02-21 16:27 ` Alan Stern 1 sibling, 1 reply; 92+ messages in thread From: Rafael J. Wysocki @ 2008-02-21 0:02 UTC (permalink / raw) To: Alan Stern; +Cc: Pierre Ossman, Zdenek Kabelac, Kernel development list On Wednesday, 20 of February 2008, Alan Stern wrote: > On Wed, 20 Feb 2008, Rafael J. Wysocki wrote: > > > Well, below is an uncompiled and untested but illustrating the idea that > > might allow people not to bother with device_pm_schedule_removal() > > explicitly and can fix the issue at hand. > > > > [There are some cases that need handling and are not covered here.] > > > > Please have a look. > > > > Thanks, > > Rafael > > > +static struct task_struct *suspending_task; > > +static DEFINE_MUTEX(suspending_task_mtx); > > I suspect you don't really need this mutex. > > > +bool in_suspend_context(void) > > +{ > > + bool result; > > + > > + mutex_lock(&suspending_task_mtx); > > + result = (suspending_task == current); > > + mutex_unlock(&suspending_task_mtx); > > + return result; > > +} > > If suspending_task == current then you are guaranteed to be serialized, > because everything a single task does is serial. As I said before (but that doesn't seem to reach the list, so I'm repeating), this is to protect other tasks from reading an inconsistent value of suspending_task in case they attempt to remove a device concurrently with respect to us. While this is not likely to happen right now, because of the freezer, it may very well happen when the freezer is finally removed. > > @@ -1162,7 +1162,10 @@ void device_destroy(struct class *class, > > dev = class_find_device(class, &devt, __match_devt); > > if (dev) { > > put_device(dev); > > - device_unregister(dev); > > + if (in_suspend_context()) > > + device_pm_schedule_removal(dev); > > + else > > + device_unregister(dev); > > } > > } > > But what about device_del()? Can a similar change be made there? Something like the patch below, perhaps? Again, untested, but compiled this time. :-) Thanks, Rafael --- drivers/base/core.c | 5 +++++ drivers/base/power/main.c | 22 ++++++++++++++++++++++ drivers/base/power/power.h | 5 +++++ 3 files changed, 32 insertions(+) Index: linux-2.6/drivers/base/power/main.c =================================================================== --- linux-2.6.orig/drivers/base/power/main.c +++ linux-2.6/drivers/base/power/main.c @@ -59,6 +59,26 @@ static DECLARE_RWSEM(pm_sleep_rwsem); int (*platform_enable_wakeup)(struct device *dev, int is_on); +static struct task_struct *suspending_task; +static DEFINE_MUTEX(suspending_task_mtx); + +bool in_suspend_context(void) +{ + bool result; + + mutex_lock(&suspending_task_mtx); + result = (suspending_task == current); + mutex_unlock(&suspending_task_mtx); + return result; +} + +static void set_suspending_task(struct task_struct *task) +{ + mutex_lock(&suspending_task_mtx); + suspending_task = task; + mutex_unlock(&suspending_task_mtx); +} + /** * device_pm_add - add a device to the list of active devices * @dev: Device to be added to the list @@ -272,6 +292,7 @@ static void dpm_resume(void) mutex_lock(&dpm_list_mtx); } mutex_unlock(&dpm_list_mtx); + set_suspending_task(NULL); } /** @@ -460,6 +481,7 @@ static int dpm_suspend(pm_message_t stat { int error = 0; + set_suspending_task(current); mutex_lock(&dpm_list_mtx); while (!list_empty(&dpm_locked)) { struct list_head *entry = dpm_locked.prev; Index: linux-2.6/drivers/base/core.c =================================================================== --- linux-2.6.orig/drivers/base/core.c +++ linux-2.6/drivers/base/core.c @@ -929,6 +929,11 @@ void device_del(struct device *dev) struct device *parent = dev->parent; struct class_interface *class_intf; + if (in_suspend_context()) { + get_device(dev); + device_pm_schedule_removal(dev); + return; + } device_pm_remove(dev); if (parent) klist_del(&dev->knode_parent); Index: linux-2.6/drivers/base/power/power.h =================================================================== --- linux-2.6.orig/drivers/base/power/power.h +++ linux-2.6/drivers/base/power/power.h @@ -11,6 +11,7 @@ static inline struct device *to_device(s return container_of(entry, struct device, power.entry); } +extern bool in_suspend_context(void); extern void device_pm_add(struct device *); extern void device_pm_remove(struct device *); extern int pm_sleep_lock(void); @@ -18,6 +19,10 @@ extern void pm_sleep_unlock(void); #else /* CONFIG_PM_SLEEP */ +static inline bool in_suspend_context(void) +{ + return false; +} static inline void device_pm_add(struct device *dev) { ^ permalink raw reply [flat|nested] 92+ messages in thread
* Re: [Bug 10030] Suspend doesn't work when SD card is inserted 2008-02-21 0:02 ` Rafael J. Wysocki @ 2008-02-21 16:27 ` Alan Stern 2008-02-21 16:38 ` Rafael J. Wysocki 0 siblings, 1 reply; 92+ messages in thread From: Alan Stern @ 2008-02-21 16:27 UTC (permalink / raw) To: Rafael J. Wysocki; +Cc: Pierre Ossman, Zdenek Kabelac, Kernel development list On Thu, 21 Feb 2008, Rafael J. Wysocki wrote: > > > +bool in_suspend_context(void) > > > +{ > > > + bool result; > > > + > > > + mutex_lock(&suspending_task_mtx); > > > + result = (suspending_task == current); > > > + mutex_unlock(&suspending_task_mtx); > > > + return result; > > > +} > > > > If suspending_task == current then you are guaranteed to be serialized, > > because everything a single task does is serial. > > As I said before (but that doesn't seem to reach the list, so I'm repeating), > this is to protect other tasks from reading an inconsistent value of > suspending_task in case they attempt to remove a device concurrently with > respect to us. > > While this is not likely to happen right now, because of the freezer, it may > very well happen when the freezer is finally removed. Sorry, I don't understand. Are you worried that process A might set suspending_task = A but then process B might still see suspending_task == NULL? Or that A might set suspend_task = NULL but then B might still see suspending_task == A? Neither one will cause any problem, since the only case that matters is when B sees suspending_task == B -- and that can happen if and only if B was the last process to set suspending_task. In fact, you might as well get rid of the set_suspending_task() routine entirely and just put the assignments inline. > --- linux-2.6.orig/drivers/base/core.c > +++ linux-2.6/drivers/base/core.c > @@ -929,6 +929,11 @@ void device_del(struct device *dev) > struct device *parent = dev->parent; > struct class_interface *class_intf; > > + if (in_suspend_context()) { > + get_device(dev); Where is this get_device() undone? Shouldn't there be an extra put_device() added to unregister_dropped_devices()? > + device_pm_schedule_removal(dev); > + return; > + } > device_pm_remove(dev); > if (parent) > klist_del(&dev->knode_parent); And now the change to device_destroy() isn't needed at all. Alan Stern ^ permalink raw reply [flat|nested] 92+ messages in thread
* Re: [Bug 10030] Suspend doesn't work when SD card is inserted 2008-02-21 16:27 ` Alan Stern @ 2008-02-21 16:38 ` Rafael J. Wysocki 2008-02-21 17:48 ` Alan Stern 0 siblings, 1 reply; 92+ messages in thread From: Rafael J. Wysocki @ 2008-02-21 16:38 UTC (permalink / raw) To: Alan Stern; +Cc: Pierre Ossman, Zdenek Kabelac, Kernel development list On Thursday, 21 of February 2008, Alan Stern wrote: > On Thu, 21 Feb 2008, Rafael J. Wysocki wrote: > > > > > +bool in_suspend_context(void) > > > > +{ > > > > + bool result; > > > > + > > > > + mutex_lock(&suspending_task_mtx); > > > > + result = (suspending_task == current); > > > > + mutex_unlock(&suspending_task_mtx); > > > > + return result; > > > > +} > > > > > > If suspending_task == current then you are guaranteed to be serialized, > > > because everything a single task does is serial. > > > > As I said before (but that doesn't seem to reach the list, so I'm repeating), > > this is to protect other tasks from reading an inconsistent value of > > suspending_task in case they attempt to remove a device concurrently with > > respect to us. > > > > While this is not likely to happen right now, because of the freezer, it may > > very well happen when the freezer is finally removed. > > Sorry, I don't understand. Are you worried that process A might set > suspending_task = A but then process B might still see suspending_task > == NULL? Or that A might set suspend_task = NULL but then B might > still see suspending_task == A? > > Neither one will cause any problem, since the only case that matters is > when B sees suspending_task == B -- and that can happen if and only if > B was the last process to set suspending_task. > > In fact, you might as well get rid of the set_suspending_task() routine > entirely and just put the assignments inline. OK, I will. > > --- linux-2.6.orig/drivers/base/core.c > > +++ linux-2.6/drivers/base/core.c > > @@ -929,6 +929,11 @@ void device_del(struct device *dev) > > struct device *parent = dev->parent; > > struct class_interface *class_intf; > > > > + if (in_suspend_context()) { > > + get_device(dev); > > Where is this get_device() undone? Shouldn't there be an extra > put_device() added to unregister_dropped_devices()? No, I don't think so, because unregister_dropped_devices() calls device_unregister() that does the put_device() eventually. If we are called by device_unregister(), the get_device() is needed to balance the put_device() that will be called by device_unregister() after we return. OTOH, if we are called directly, then we need to balance the put_device() that will be done by device_unregister() called from unregister_dropped_devices(). I hope I didn't miss anything. > > + device_pm_schedule_removal(dev); > > + return; > > + } > > device_pm_remove(dev); > > if (parent) > > klist_del(&dev->knode_parent); > > And now the change to device_destroy() isn't needed at all. No, it's not. Didn't I remove it? I thought I did. Thanks, Rafael ^ permalink raw reply [flat|nested] 92+ messages in thread
* Re: [Bug 10030] Suspend doesn't work when SD card is inserted 2008-02-21 16:38 ` Rafael J. Wysocki @ 2008-02-21 17:48 ` Alan Stern 2008-02-21 22:47 ` Rafael J. Wysocki 0 siblings, 1 reply; 92+ messages in thread From: Alan Stern @ 2008-02-21 17:48 UTC (permalink / raw) To: Rafael J. Wysocki; +Cc: Pierre Ossman, Zdenek Kabelac, Kernel development list On Thu, 21 Feb 2008, Rafael J. Wysocki wrote: > > > --- linux-2.6.orig/drivers/base/core.c > > > +++ linux-2.6/drivers/base/core.c > > > @@ -929,6 +929,11 @@ void device_del(struct device *dev) > > > struct device *parent = dev->parent; > > > struct class_interface *class_intf; > > > > > > + if (in_suspend_context()) { > > > + get_device(dev); > > > > Where is this get_device() undone? Shouldn't there be an extra > > put_device() added to unregister_dropped_devices()? > > No, I don't think so, because unregister_dropped_devices() calls > device_unregister() that does the put_device() eventually. Ah, yes. > If we are called by device_unregister(), the get_device() is needed to balance > the put_device() that will be called by device_unregister() after we return. > > OTOH, if we are called directly, then we need to balance the put_device() > that will be done by device_unregister() called from > unregister_dropped_devices(). > > I hope I didn't miss anything. Okay, that sounds right. > > > + device_pm_schedule_removal(dev); > > > + return; > > > + } > > > device_pm_remove(dev); > > > if (parent) > > > klist_del(&dev->knode_parent); > > > > And now the change to device_destroy() isn't needed at all. > > No, it's not. Didn't I remove it? I thought I did. Oh yes, you did. I see a possible problem in device_resume(). My copy isn't fully up-to-date, but it looks like you call unregister_dropped_devices() before doing the up_write(&pm_sleep_rwsem). Won't this cause warnings in device_del() about a suspicious caller? Alan Stern ^ permalink raw reply [flat|nested] 92+ messages in thread
* Re: [Bug 10030] Suspend doesn't work when SD card is inserted 2008-02-21 17:48 ` Alan Stern @ 2008-02-21 22:47 ` Rafael J. Wysocki 2008-02-21 23:05 ` Alan Stern 0 siblings, 1 reply; 92+ messages in thread From: Rafael J. Wysocki @ 2008-02-21 22:47 UTC (permalink / raw) To: Alan Stern; +Cc: Pierre Ossman, Zdenek Kabelac, Kernel development list On Thursday, 21 of February 2008, Alan Stern wrote: > On Thu, 21 Feb 2008, Rafael J. Wysocki wrote: > > > > > --- linux-2.6.orig/drivers/base/core.c > > > > +++ linux-2.6/drivers/base/core.c > > > > @@ -929,6 +929,11 @@ void device_del(struct device *dev) > > > > struct device *parent = dev->parent; > > > > struct class_interface *class_intf; > > > > > > > > + if (in_suspend_context()) { > > > > + get_device(dev); > > > > > > Where is this get_device() undone? Shouldn't there be an extra > > > put_device() added to unregister_dropped_devices()? > > > > No, I don't think so, because unregister_dropped_devices() calls > > device_unregister() that does the put_device() eventually. > > Ah, yes. > > > If we are called by device_unregister(), the get_device() is needed to balance > > the put_device() that will be called by device_unregister() after we return. > > > > OTOH, if we are called directly, then we need to balance the put_device() > > that will be done by device_unregister() called from > > unregister_dropped_devices(). > > > > I hope I didn't miss anything. > > Okay, that sounds right. > > > > > + device_pm_schedule_removal(dev); > > > > + return; > > > > + } > > > > device_pm_remove(dev); > > > > if (parent) > > > > klist_del(&dev->knode_parent); > > > > > > And now the change to device_destroy() isn't needed at all. > > > > No, it's not. Didn't I remove it? I thought I did. > > Oh yes, you did. > > I see a possible problem in device_resume(). My copy isn't fully > up-to-date, but it looks like you call unregister_dropped_devices() > before doing the up_write(&pm_sleep_rwsem). Won't this cause > warnings in device_del() about a suspicious caller? No, it won't, because the devices' semaphores are unlocked by unregister_dropped_devices() before calling device_unregister(). BTW, below is a simplified version of the patch, without the mutex protecting suspending_task. I'd like to push it upstream if it looks good. Please also have a look at http://bugzilla.kernel.org/show_bug.cgi?id=10030. There seems to be another issue related to us holding devices' semaphores. Namely, it looks like, when the user removes the card, a concurrent thread (from a workqueue) calls device_del() and blocks on the dev->sem held by us and then something else deadlocks with this thread. I'll be looking into this tomorrow. Thanks, Rafael --- drivers/base/core.c | 5 +++++ drivers/base/power/main.c | 9 +++++++++ drivers/base/power/power.h | 5 +++++ 3 files changed, 19 insertions(+) Index: linux-2.6/drivers/base/power/main.c =================================================================== --- linux-2.6.orig/drivers/base/power/main.c +++ linux-2.6/drivers/base/power/main.c @@ -59,6 +59,13 @@ static DECLARE_RWSEM(pm_sleep_rwsem); int (*platform_enable_wakeup)(struct device *dev, int is_on); +static struct task_struct *suspending_task; + +bool in_suspend_context(void) +{ + return (suspending_task == current); +} + /** * device_pm_add - add a device to the list of active devices * @dev: Device to be added to the list @@ -272,6 +279,7 @@ static void dpm_resume(void) mutex_lock(&dpm_list_mtx); } mutex_unlock(&dpm_list_mtx); + suspending_task = NULL; } /** @@ -460,6 +468,7 @@ static int dpm_suspend(pm_message_t stat { int error = 0; + suspending_task = current; mutex_lock(&dpm_list_mtx); while (!list_empty(&dpm_locked)) { struct list_head *entry = dpm_locked.prev; Index: linux-2.6/drivers/base/core.c =================================================================== --- linux-2.6.orig/drivers/base/core.c +++ linux-2.6/drivers/base/core.c @@ -929,6 +929,11 @@ void device_del(struct device *dev) struct device *parent = dev->parent; struct class_interface *class_intf; + if (in_suspend_context()) { + get_device(dev); + device_pm_schedule_removal(dev); + return; + } device_pm_remove(dev); if (parent) klist_del(&dev->knode_parent); Index: linux-2.6/drivers/base/power/power.h =================================================================== --- linux-2.6.orig/drivers/base/power/power.h +++ linux-2.6/drivers/base/power/power.h @@ -11,6 +11,7 @@ static inline struct device *to_device(s return container_of(entry, struct device, power.entry); } +extern bool in_suspend_context(void); extern void device_pm_add(struct device *); extern void device_pm_remove(struct device *); extern int pm_sleep_lock(void); @@ -18,6 +19,10 @@ extern void pm_sleep_unlock(void); #else /* CONFIG_PM_SLEEP */ +static inline bool in_suspend_context(void) +{ + return false; +} static inline void device_pm_add(struct device *dev) { ^ permalink raw reply [flat|nested] 92+ messages in thread
* Re: [Bug 10030] Suspend doesn't work when SD card is inserted 2008-02-21 22:47 ` Rafael J. Wysocki @ 2008-02-21 23:05 ` Alan Stern 2008-02-23 1:30 ` Rafael J. Wysocki 0 siblings, 1 reply; 92+ messages in thread From: Alan Stern @ 2008-02-21 23:05 UTC (permalink / raw) To: Rafael J. Wysocki; +Cc: Pierre Ossman, Zdenek Kabelac, Kernel development list On Thu, 21 Feb 2008, Rafael J. Wysocki wrote: > BTW, below is a simplified version of the patch, without the mutex protecting > suspending_task. I'd like to push it upstream if it looks good. It does look good. Go ahead and push. Acked-by: Alan Stern <stern@rowland.harvard.edu> > Please also have a look at http://bugzilla.kernel.org/show_bug.cgi?id=10030. > There seems to be another issue related to us holding devices' semaphores. > Namely, it looks like, when the user removes the card, a concurrent thread > (from a workqueue) calls device_del() and blocks on the dev->sem held by > us and then something else deadlocks with this thread. I'll be looking into > this tomorrow. I've been too busy with other things to look at the activity on that bug report. Tonight or tomorrow... Alan Stern ^ permalink raw reply [flat|nested] 92+ messages in thread
* Re: [Bug 10030] Suspend doesn't work when SD card is inserted 2008-02-21 23:05 ` Alan Stern @ 2008-02-23 1:30 ` Rafael J. Wysocki 2008-02-23 4:39 ` Alan Stern 0 siblings, 1 reply; 92+ messages in thread From: Rafael J. Wysocki @ 2008-02-23 1:30 UTC (permalink / raw) To: Alan Stern; +Cc: Pierre Ossman, Zdenek Kabelac, Kernel development list On Friday, 22 of February 2008, Alan Stern wrote: > On Thu, 21 Feb 2008, Rafael J. Wysocki wrote: > > > BTW, below is a simplified version of the patch, without the mutex protecting > > suspending_task. I'd like to push it upstream if it looks good. > > It does look good. Go ahead and push. > > Acked-by: Alan Stern <stern@rowland.harvard.edu> > > > Please also have a look at http://bugzilla.kernel.org/show_bug.cgi?id=10030. > > There seems to be another issue related to us holding devices' semaphores. > > Namely, it looks like, when the user removes the card, a concurrent thread > > (from a workqueue) calls device_del() and blocks on the dev->sem held by > > us and then something else deadlocks with this thread. I'll be looking into > > this tomorrow. > > I've been too busy with other things to look at the activity on that > bug report. Tonight or tomorrow... Unfortunately, I missed your Bugzilla comment at http://bugzilla.kernel.org/show_bug.cgi?id=10030#c28 Well, in the face of it, I'm considering to remove the code that acquires device semaphores from the suspend core for now. Evidently, this change turns out to be painfully premature. Also, we have apparent problems with pm_sleep_lock() being take in device_add() (see http://bugzilla.kernel.org/show_bug.cgi?id=9874). Thanks, Rafael ^ permalink raw reply [flat|nested] 92+ messages in thread
* Re: [Bug 10030] Suspend doesn't work when SD card is inserted 2008-02-23 1:30 ` Rafael J. Wysocki @ 2008-02-23 4:39 ` Alan Stern 2008-02-23 20:16 ` Rafael J. Wysocki 0 siblings, 1 reply; 92+ messages in thread From: Alan Stern @ 2008-02-23 4:39 UTC (permalink / raw) To: Rafael J. Wysocki; +Cc: Pierre Ossman, Zdenek Kabelac, Kernel development list On Sat, 23 Feb 2008, Rafael J. Wysocki wrote: > Unfortunately, I missed your Bugzilla comment at > http://bugzilla.kernel.org/show_bug.cgi?id=10030#c28 Strange... But obviously you did see it eventually. > Well, in the face of it, I'm considering to remove the code that > acquires device semaphores from the suspend core for now. Evidently, this > change turns out to be painfully premature. I wonder if that's really the best thing. How would we then learn about drivers trying to register or unregister a device during a sleep transition? Do you think it might be possible instead to somehow allow these unregistrations to go through, while still failing or blocking registrations? It shouldn't be too hard to modify the driver core so that it calls the driver's remove() method without trying to acquire dev->sem if your in_suspend_context() test succeeds. I have to admit, I still don't understand what's going on with the MMC driver. Why is there a workqueue involved? If the workqueue fails to unregister the device, why should it bother the suspend routine? After all, if the suspend routine can afford to wait for the workqueue to finish then it could just as well afford to do the unregistration itself. > Also, we have apparent problems with pm_sleep_lock() > being take in device_add() (see > http://bugzilla.kernel.org/show_bug.cgi?id=9874). We'll have to get more information from the bug reporter to figure out what really happened there. Ultimately it may turn out some drivers just aren't very careful about when they try to register new devices. Doing the registration by way of a workqueue can be problematic if the workqueue happens to run during a system sleep transition. That will still be true if you revert the acquire-all-semaphores patch. Alan Stern ^ permalink raw reply [flat|nested] 92+ messages in thread
* Re: [Bug 10030] Suspend doesn't work when SD card is inserted 2008-02-23 4:39 ` Alan Stern @ 2008-02-23 20:16 ` Rafael J. Wysocki 2008-02-23 23:29 ` Alan Stern 0 siblings, 1 reply; 92+ messages in thread From: Rafael J. Wysocki @ 2008-02-23 20:16 UTC (permalink / raw) To: Alan Stern; +Cc: Pierre Ossman, Zdenek Kabelac, Kernel development list On Saturday, 23 of February 2008, Alan Stern wrote: > On Sat, 23 Feb 2008, Rafael J. Wysocki wrote: > > > Unfortunately, I missed your Bugzilla comment at > > http://bugzilla.kernel.org/show_bug.cgi?id=10030#c28 > > Strange... But obviously you did see it eventually. > > > Well, in the face of it, I'm considering to remove the code that > > acquires device semaphores from the suspend core for now. Evidently, this > > change turns out to be painfully premature. > > I wonder if that's really the best thing. Ultimately no, it's not. However, we are now late in the -rc2 time frame and the release of -rc3 seems to be imminent. At this point, IMO, that's the safest thing to do. BTW, appended is the patch I'd like to get applied. > How would we then learn about drivers trying to register or unregister a > device during a sleep transition? I think we should fix the ones we know about and try to reintroduce this change in the 2.6.26 time frame. It also seems reasonable to reconsider what we want to achieve, as there may be a better way to do that. > Do you think it might be possible instead to somehow allow these > unregistrations to go through, while still failing or blocking > registrations? Yes, that should be possible, but as I said above, I think it's not the right time for doing that right now. > It shouldn't be too hard to modify the driver core so that it calls the > driver's remove() method without trying to acquire dev->sem if your > in_suspend_context() test succeeds. I agree that the in_suspend_context() test may be useful and this is one of the reasons why I think the entire approach requires reconsideration. > I have to admit, I still don't understand what's going on with the MMC > driver. Me neither, and that alone is a good enough reason for resigning from acquiring device semaphores in the suspend core, at least in the mainline kernel, until we understand the problem. > Why is there a workqueue involved? If the workqueue fails to > unregister the device, why should it bother the suspend routine? After > all, if the suspend routine can afford to wait for the workqueue to > finish then it could just as well afford to do the unregistration > itself. Yes, that's strange. > > Also, we have apparent problems with pm_sleep_lock() > > being take in device_add() (see > > http://bugzilla.kernel.org/show_bug.cgi?id=9874). > > We'll have to get more information from the bug reporter to figure out > what really happened there. Yes. > Ultimately it may turn out some drivers just aren't very careful about > when they try to register new devices. That's almost certain to me. > Doing the registration by way of a workqueue can be problematic if the > workqueue happens to run during a system sleep transition. That will still > be true if you revert the acquire-all-semaphores patch. Yes, but our taking of device semaphores exposes these problems in a rather drastic fashion. :-) Thanks, Rafael --- drivers/base/power/main.c | 84 ++-------------------------------------------- 1 file changed, 4 insertions(+), 80 deletions(-) Index: linux-2.6/drivers/base/power/main.c =================================================================== --- linux-2.6.orig/drivers/base/power/main.c +++ linux-2.6/drivers/base/power/main.c @@ -48,7 +48,6 @@ */ LIST_HEAD(dpm_active); -static LIST_HEAD(dpm_locked); static LIST_HEAD(dpm_off); static LIST_HEAD(dpm_off_irq); static LIST_HEAD(dpm_destroy); @@ -81,28 +80,6 @@ void device_pm_add(struct device *dev) */ void device_pm_remove(struct device *dev) { - /* - * If this function is called during a suspend, it will be blocked, - * because we're holding the device's semaphore at that time, which may - * lead to a deadlock. In that case we want to print a warning. - * However, it may also be called by unregister_dropped_devices() with - * the device's semaphore released, in which case the warning should - * not be printed. - */ - if (down_trylock(&dev->sem)) { - if (down_read_trylock(&pm_sleep_rwsem)) { - /* No suspend in progress, wait on dev->sem */ - down(&dev->sem); - up_read(&pm_sleep_rwsem); - } else { - /* Suspend in progress, we may deadlock */ - dev_warn(dev, "Suspicious %s during suspend\n", - __FUNCTION__); - dump_stack(); - /* The user has been warned ... */ - down(&dev->sem); - } - } pr_debug("PM: Removing info for %s:%s\n", dev->bus ? dev->bus->name : "No Bus", kobject_name(&dev->kobj)); @@ -110,7 +87,6 @@ void device_pm_remove(struct device *dev dpm_sysfs_remove(dev); list_del_init(&dev->power.entry); mutex_unlock(&dpm_list_mtx); - up(&dev->sem); } /** @@ -266,7 +242,7 @@ static void dpm_resume(void) struct list_head *entry = dpm_off.next; struct device *dev = to_device(entry); - list_move_tail(entry, &dpm_locked); + list_move_tail(entry, &dpm_active); mutex_unlock(&dpm_list_mtx); resume_device(dev); mutex_lock(&dpm_list_mtx); @@ -275,25 +251,6 @@ static void dpm_resume(void) } /** - * unlock_all_devices - Release each device's semaphore - * - * Go through the dpm_off list. Put each device on the dpm_active - * list and unlock it. - */ -static void unlock_all_devices(void) -{ - mutex_lock(&dpm_list_mtx); - while (!list_empty(&dpm_locked)) { - struct list_head *entry = dpm_locked.prev; - struct device *dev = to_device(entry); - - list_move(entry, &dpm_active); - up(&dev->sem); - } - mutex_unlock(&dpm_list_mtx); -} - -/** * unregister_dropped_devices - Unregister devices scheduled for removal * * Unregister all devices on the dpm_destroy list. @@ -305,7 +262,6 @@ static void unregister_dropped_devices(v struct list_head *entry = dpm_destroy.next; struct device *dev = to_device(entry); - up(&dev->sem); mutex_unlock(&dpm_list_mtx); /* This also removes the device from the list */ device_unregister(dev); @@ -324,7 +280,6 @@ void device_resume(void) { might_sleep(); dpm_resume(); - unlock_all_devices(); unregister_dropped_devices(); up_write(&pm_sleep_rwsem); } @@ -461,8 +416,8 @@ static int dpm_suspend(pm_message_t stat int error = 0; mutex_lock(&dpm_list_mtx); - while (!list_empty(&dpm_locked)) { - struct list_head *entry = dpm_locked.prev; + while (!list_empty(&dpm_active)) { + struct list_head *entry = dpm_active.prev; struct device *dev = to_device(entry); list_del_init(&dev->power.entry); @@ -478,7 +433,7 @@ static int dpm_suspend(pm_message_t stat "")); mutex_lock(&dpm_list_mtx); if (list_empty(&dev->power.entry)) - list_add(&dev->power.entry, &dpm_locked); + list_add_tail(&dev->power.entry, &dpm_active); break; } mutex_lock(&dpm_list_mtx); @@ -491,36 +446,6 @@ static int dpm_suspend(pm_message_t stat } /** - * lock_all_devices - Acquire every device's semaphore - * - * Go through the dpm_active list. Carefully lock each device's - * semaphore and put it in on the dpm_locked list. - */ -static void lock_all_devices(void) -{ - mutex_lock(&dpm_list_mtx); - while (!list_empty(&dpm_active)) { - struct list_head *entry = dpm_active.next; - struct device *dev = to_device(entry); - - /* Required locking order is dev->sem first, - * then dpm_list_mutex. Hence this awkward code. - */ - get_device(dev); - mutex_unlock(&dpm_list_mtx); - down(&dev->sem); - mutex_lock(&dpm_list_mtx); - - if (list_empty(entry)) - up(&dev->sem); /* Device was removed */ - else - list_move_tail(entry, &dpm_locked); - put_device(dev); - } - mutex_unlock(&dpm_list_mtx); -} - -/** * device_suspend - Save state and stop all devices in system. * @state: new power management state * @@ -533,7 +458,6 @@ int device_suspend(pm_message_t state) might_sleep(); down_write(&pm_sleep_rwsem); - lock_all_devices(); error = dpm_suspend(state); if (error) device_resume(); ^ permalink raw reply [flat|nested] 92+ messages in thread
* Re: [Bug 10030] Suspend doesn't work when SD card is inserted 2008-02-23 20:16 ` Rafael J. Wysocki @ 2008-02-23 23:29 ` Alan Stern 2008-02-24 0:19 ` Rafael J. Wysocki 0 siblings, 1 reply; 92+ messages in thread From: Alan Stern @ 2008-02-23 23:29 UTC (permalink / raw) To: Rafael J. Wysocki; +Cc: Pierre Ossman, Zdenek Kabelac, Kernel development list On Sat, 23 Feb 2008, Rafael J. Wysocki wrote: > Ultimately no, it's not. However, we are now late in the -rc2 time frame and > the release of -rc3 seems to be imminent. At this point, IMO, that's the > safest thing to do. BTW, appended is the patch I'd like to get applied. In the interest of having a safe release, I guess you're right. It's unfortunate, though -- there's no good way to get thorough testing without putting the code in Linus's tree. > > How would we then learn about drivers trying to register or unregister a > > device during a sleep transition? > > I think we should fix the ones we know about and try to reintroduce this > change in the 2.6.26 time frame. It also seems reasonable to reconsider what > we want to achieve, as there may be a better way to do that. Below is my suggested approach, based on yours. It might even solve the MMC problem, who knows? And it adds some potentially useful debugging, although it would be better to dump just the stack of suspending_task. Do you know how to do that from within a timer routine? I still have no clear idea about what's going on with the ACPI bug in #9874. However perhaps the bug wouldn't occur if we blocked device registration until after the resume was finished, instead of failing it outright. Since the registration in this case was done in a workqueue thread, blocking wouldn't cause an immediate deadlock. Alan Stern Index: usb-2.6/drivers/base/power/main.c =================================================================== --- usb-2.6.orig/drivers/base/power/main.c +++ usb-2.6/drivers/base/power/main.c @@ -25,6 +25,7 @@ #include <linux/pm.h> #include <linux/resume-trace.h> #include <linux/rwsem.h> +#include <linux/sched.h> #include "../base.h" #include "power.h" @@ -59,6 +60,13 @@ static DECLARE_RWSEM(pm_sleep_rwsem); int (*platform_enable_wakeup)(struct device *dev, int is_on); +static struct task_struct *suspending_task; + +bool in_suspend_context(void) +{ + return (suspending_task == current); +} + /** * device_pm_add - add a device to the list of active devices * @dev: Device to be added to the list @@ -94,14 +102,15 @@ void device_pm_remove(struct device *dev /* No suspend in progress, wait on dev->sem */ down(&dev->sem); up_read(&pm_sleep_rwsem); - } else { - /* Suspend in progress, we may deadlock */ + } else if (!in_suspend_context()) { + /* Suspending in another task, we may deadlock */ dev_warn(dev, "Suspicious %s during suspend\n", __FUNCTION__); dump_stack(); /* The user has been warned ... */ down(&dev->sem); } + /* Otherwise we're okay */ } pr_debug("PM: Removing info for %s:%s\n", dev->bus ? dev->bus->name : "No Bus", @@ -272,6 +281,7 @@ static void dpm_resume(void) mutex_lock(&dpm_list_mtx); } mutex_unlock(&dpm_list_mtx); + suspending_task = NULL; } /** @@ -410,6 +420,17 @@ int device_power_down(pm_message_t state } EXPORT_SYMBOL_GPL(device_power_down); +/* Provide debugging info if we hang or deadlock during suspend */ +static struct timer_list suspend_timer; + +static void suspend_timeout(unsigned long _dev) +{ + struct device *dev = (struct device *) _dev; + + dev_err(dev, "deadlock during suspend!\n"); + show_state(); +} + /** * suspend_device - Save state of one device. * @dev: Device. @@ -419,6 +440,10 @@ int suspend_device(struct device *dev, p { int error = 0; + /* Provide debugging output in case of a deadlock */ + setup_timer(&suspend_timer, suspend_timeout, (unsigned long) dev); + mod_timer(&suspend_timer, jiffies + 5*HZ); + if (dev->power.power_state.event) { dev_dbg(dev, "PM: suspend %d-->%d\n", dev->power.power_state.event, state.event); @@ -441,6 +466,8 @@ int suspend_device(struct device *dev, p error = dev->bus->suspend(dev, state); suspend_report_result(dev->bus->suspend, error); } + + del_timer_sync(&suspend_timer); return error; } @@ -460,6 +487,7 @@ static int dpm_suspend(pm_message_t stat { int error = 0; + suspending_task = current; mutex_lock(&dpm_list_mtx); while (!list_empty(&dpm_locked)) { struct list_head *entry = dpm_locked.prev; Index: usb-2.6/drivers/base/power/power.h =================================================================== --- usb-2.6.orig/drivers/base/power/power.h +++ usb-2.6/drivers/base/power/power.h @@ -11,6 +11,7 @@ static inline struct device *to_device(s return container_of(entry, struct device, power.entry); } +extern bool in_suspend_context(void); extern void device_pm_add(struct device *); extern void device_pm_remove(struct device *); extern int pm_sleep_lock(void); @@ -18,6 +19,10 @@ extern void pm_sleep_unlock(void); #else /* CONFIG_PM_SLEEP */ +static inline bool in_suspend_context(void) +{ + return false; +} static inline void device_pm_add(struct device *dev) { Index: usb-2.6/drivers/base/dd.c =================================================================== --- usb-2.6.orig/drivers/base/dd.c +++ usb-2.6/drivers/base/dd.c @@ -325,10 +325,18 @@ void device_release_driver(struct device * If anyone calls device_release_driver() recursively from * within their ->remove callback for the same device, they * will deadlock right here. + * + * We avoid locking dev->sem if we are in the context of a + * task doing a system suspend, in order that drivers can + * unregister devices from within their suspend() methods. + * This is okay because the suspending task will already own + * all the device semaphores. */ - down(&dev->sem); + if (!in_suspend_context()) + down(&dev->sem); __device_release_driver(dev); - up(&dev->sem); + if (!in_suspend_context()) + up(&dev->sem); } EXPORT_SYMBOL_GPL(device_release_driver); ^ permalink raw reply [flat|nested] 92+ messages in thread
* Re: [Bug 10030] Suspend doesn't work when SD card is inserted 2008-02-23 23:29 ` Alan Stern @ 2008-02-24 0:19 ` Rafael J. Wysocki 2008-02-24 3:25 ` Alan Stern 0 siblings, 1 reply; 92+ messages in thread From: Rafael J. Wysocki @ 2008-02-24 0:19 UTC (permalink / raw) To: Alan Stern Cc: Pierre Ossman, Zdenek Kabelac, Kernel development list, pm list On Sunday, 24 of February 2008, Alan Stern wrote: > On Sat, 23 Feb 2008, Rafael J. Wysocki wrote: > > > Ultimately no, it's not. However, we are now late in the -rc2 time frame and > > the release of -rc3 seems to be imminent. At this point, IMO, that's the > > safest thing to do. BTW, appended is the patch I'd like to get applied. > > In the interest of having a safe release, I guess you're right. That's what I'm concerned about at the moment. I'm afraid there won't be enough time to nail down all the issues (there may be some we're not even aware of). > It's unfortunate, though -- there's no good way to get thorough testing > without putting the code in Linus's tree. Absolutely. Still, the code in question introduces unexpected behavior that we don't really understand and it's not safe to leave it in the mainline. > > > How would we then learn about drivers trying to register or unregister a > > > device during a sleep transition? > > > > I think we should fix the ones we know about and try to reintroduce this > > change in the 2.6.26 time frame. It also seems reasonable to reconsider what > > we want to achieve, as there may be a better way to do that. > > Below is my suggested approach, based on yours. It might even solve > the MMC problem, who knows? And it adds some potentially useful > debugging, although it would be better to dump just the stack of > suspending_task. Do you know how to do that from within a timer > routine? No, I don't. > I still have no clear idea about what's going on with the ACPI bug in > #9874. It seems that the ACPI code is not prepared to cope with a failing device registration, in which case it'd need fixing. Frankly, I'm afraid there may be more issues like this throughout the tree. > However perhaps the bug wouldn't occur if we blocked device > registration until after the resume was finished, instead of failing it > outright. Since the registration in this case was done in a workqueue > thread, blocking wouldn't cause an immediate deadlock. No, it wouldn't, but the fact that it happens in the ACPI code makes me worry. If we block that code and the things it's supposed to do turn out to be necessary for suspending later on, we'll be in trouble. > Index: usb-2.6/drivers/base/power/main.c > =================================================================== > --- usb-2.6.orig/drivers/base/power/main.c > +++ usb-2.6/drivers/base/power/main.c > @@ -25,6 +25,7 @@ > #include <linux/pm.h> > #include <linux/resume-trace.h> > #include <linux/rwsem.h> > +#include <linux/sched.h> > > #include "../base.h" > #include "power.h" > @@ -59,6 +60,13 @@ static DECLARE_RWSEM(pm_sleep_rwsem); > > int (*platform_enable_wakeup)(struct device *dev, int is_on); > > +static struct task_struct *suspending_task; > + > +bool in_suspend_context(void) > +{ > + return (suspending_task == current); > +} > + > /** > * device_pm_add - add a device to the list of active devices > * @dev: Device to be added to the list > @@ -94,14 +102,15 @@ void device_pm_remove(struct device *dev > /* No suspend in progress, wait on dev->sem */ > down(&dev->sem); > up_read(&pm_sleep_rwsem); > - } else { > - /* Suspend in progress, we may deadlock */ > + } else if (!in_suspend_context()) { > + /* Suspending in another task, we may deadlock */ Well, that shouldn't really deadlock. If it does, there is a potential design issue somewhere. I think it might be better to set up a timer in here too. Although IMO it would be even better to just set up a timer and remove the warning altogehter. > dev_warn(dev, "Suspicious %s during suspend\n", > __FUNCTION__); > dump_stack(); > /* The user has been warned ... */ > down(&dev->sem); > } > + /* Otherwise we're okay */ > } > pr_debug("PM: Removing info for %s:%s\n", > dev->bus ? dev->bus->name : "No Bus", There's a problem here, because we shouldn't release the semaphore if we're in the suspend context. > @@ -272,6 +281,7 @@ static void dpm_resume(void) > mutex_lock(&dpm_list_mtx); > } > mutex_unlock(&dpm_list_mtx); > + suspending_task = NULL; > } > > /** > @@ -410,6 +420,17 @@ int device_power_down(pm_message_t state > } > EXPORT_SYMBOL_GPL(device_power_down); > > +/* Provide debugging info if we hang or deadlock during suspend */ > +static struct timer_list suspend_timer; > + > +static void suspend_timeout(unsigned long _dev) > +{ > + struct device *dev = (struct device *) _dev; > + > + dev_err(dev, "deadlock during suspend!\n"); > + show_state(); The show_state() seems to be overkill and won't really help if the user can't collect the output on the fly using a serial console or something like this. [The debug stuff printed here should fit a typical laptop screen, ideally.] > +} > + > /** > * suspend_device - Save state of one device. > * @dev: Device. > @@ -419,6 +440,10 @@ int suspend_device(struct device *dev, p > { > int error = 0; > > + /* Provide debugging output in case of a deadlock */ > + setup_timer(&suspend_timer, suspend_timeout, (unsigned long) dev); > + mod_timer(&suspend_timer, jiffies + 5*HZ); > + > if (dev->power.power_state.event) { > dev_dbg(dev, "PM: suspend %d-->%d\n", > dev->power.power_state.event, state.event); > @@ -441,6 +466,8 @@ int suspend_device(struct device *dev, p > error = dev->bus->suspend(dev, state); > suspend_report_result(dev->bus->suspend, error); > } > + > + del_timer_sync(&suspend_timer); > return error; > } > > @@ -460,6 +487,7 @@ static int dpm_suspend(pm_message_t stat > { > int error = 0; > > + suspending_task = current; > mutex_lock(&dpm_list_mtx); > while (!list_empty(&dpm_locked)) { > struct list_head *entry = dpm_locked.prev; > Index: usb-2.6/drivers/base/power/power.h > =================================================================== > --- usb-2.6.orig/drivers/base/power/power.h > +++ usb-2.6/drivers/base/power/power.h > @@ -11,6 +11,7 @@ static inline struct device *to_device(s > return container_of(entry, struct device, power.entry); > } > > +extern bool in_suspend_context(void); > extern void device_pm_add(struct device *); > extern void device_pm_remove(struct device *); > extern int pm_sleep_lock(void); > @@ -18,6 +19,10 @@ extern void pm_sleep_unlock(void); > > #else /* CONFIG_PM_SLEEP */ > > +static inline bool in_suspend_context(void) > +{ > + return false; > +} > > static inline void device_pm_add(struct device *dev) > { > Index: usb-2.6/drivers/base/dd.c > =================================================================== > --- usb-2.6.orig/drivers/base/dd.c > +++ usb-2.6/drivers/base/dd.c > @@ -325,10 +325,18 @@ void device_release_driver(struct device > * If anyone calls device_release_driver() recursively from > * within their ->remove callback for the same device, they > * will deadlock right here. > + * > + * We avoid locking dev->sem if we are in the context of a > + * task doing a system suspend, in order that drivers can > + * unregister devices from within their suspend() methods. > + * This is okay because the suspending task will already own > + * all the device semaphores. > */ > - down(&dev->sem); > + if (!in_suspend_context()) > + down(&dev->sem); > __device_release_driver(dev); > - up(&dev->sem); > + if (!in_suspend_context()) > + up(&dev->sem); > } I'd prefer if (in_suspend_context()) { __device_release_driver(dev); } else { down(&dev->sem); __device_release_driver(dev); up(&dev->sem); } > EXPORT_SYMBOL_GPL(device_release_driver); Well, I'd really feel much more comfortable if we removed the troublesome code for 2.6.25 and reintroduced it along with the above safeguards for 2.6.26. Of course, this doesn't mean we can't send debug patches for testing to the users who have alraedy reported problems with it. :-) Thanks, Rafael ^ permalink raw reply [flat|nested] 92+ messages in thread
* Re: [Bug 10030] Suspend doesn't work when SD card is inserted 2008-02-24 0:19 ` Rafael J. Wysocki @ 2008-02-24 3:25 ` Alan Stern 2008-02-24 4:26 ` [linux-pm] " Alan Stern ` (3 more replies) 0 siblings, 4 replies; 92+ messages in thread From: Alan Stern @ 2008-02-24 3:25 UTC (permalink / raw) To: Rafael J. Wysocki Cc: Pierre Ossman, Zdenek Kabelac, Kernel development list, pm list On Sun, 24 Feb 2008, Rafael J. Wysocki wrote: > On Sunday, 24 of February 2008, Alan Stern wrote: > > On Sat, 23 Feb 2008, Rafael J. Wysocki wrote: > > > > > Ultimately no, it's not. However, we are now late in the -rc2 time frame and > > > the release of -rc3 seems to be imminent. At this point, IMO, that's the > > > safest thing to do. BTW, appended is the patch I'd like to get applied. > > > > In the interest of having a safe release, I guess you're right. > > That's what I'm concerned about at the moment. I'm afraid there won't be > enough time to nail down all the issues (there may be some we're not even > aware of). All right. You'll need to enlarge your big reversal patch by reverting commit 4145ed6dc597a9bea5f6ae8c574653b2de10620f. > > Below is my suggested approach, based on yours. It might even solve > > the MMC problem, who knows? And it adds some potentially useful > > debugging, although it would be better to dump just the stack of > > suspending_task. Do you know how to do that from within a timer > > routine? > > No, I don't. On further checking the answer turns out to be sched_show_task(). That's the routine which gets called for each process when you type Alt-SysRq-t. > > I still have no clear idea about what's going on with the ACPI bug in > > #9874. > > It seems that the ACPI code is not prepared to cope with a failing device > registration, in which case it'd need fixing. Frankly, I'm afraid there may > be more issues like this throughout the tree. The fact remains that it is unsafe to register a device during a sleep transition, although if the device's driver doesn't have a suspend or resume method you can probably get away with it. > > However perhaps the bug wouldn't occur if we blocked device > > registration until after the resume was finished, instead of failing it > > outright. Since the registration in this case was done in a workqueue > > thread, blocking wouldn't cause an immediate deadlock. > > No, it wouldn't, but the fact that it happens in the ACPI code makes me worry. It happened in a workqueue. There could be lots of similar cases: Some interrupt-driven event causes a hotplug action. Since the action can't be carried out in interrupt context, the driver has no choice but to defer it to a workqueue or kernel thread. Blocking that workqueue or kernel thread won't cause a problem _unless_ the driver's suspend/resume methods try to synchronize with it. That's the difficult case. > If we block that code and the things it's supposed to do turn out to be > necessary for suspending later on, we'll be in trouble. Exactly. > > @@ -94,14 +102,15 @@ void device_pm_remove(struct device *dev > > /* No suspend in progress, wait on dev->sem */ > > down(&dev->sem); > > up_read(&pm_sleep_rwsem); > > - } else { > > - /* Suspend in progress, we may deadlock */ > > + } else if (!in_suspend_context()) { > > + /* Suspending in another task, we may deadlock */ > > Well, that shouldn't really deadlock. If it does, there is a potential design > issue somewhere. I think it might be better to set up a timer in here too. At this point the driver core owns the device semaphore, so the unregistration task will block until the sleep is over. If the driver's resume method has to synchronize with the unregistration task then it will deadlock. I agree, it would be a design issue. In fact, it's the same design issue described earlier in this email. > Although IMO it would be even better to just set up a timer and remove the > warning altogehter. That warning was just for debugging purposes. A timer in the resume path could do the same thing with fewer false alarms, just like the timer in the suspend path. I have added it to the new patch. > > dev_warn(dev, "Suspicious %s during suspend\n", > > __FUNCTION__); > > dump_stack(); > > /* The user has been warned ... */ > > down(&dev->sem); > > } > > + /* Otherwise we're okay */ > > } > > pr_debug("PM: Removing info for %s:%s\n", > > dev->bus ? dev->bus->name : "No Bus", > > There's a problem here, because we shouldn't release the semaphore if we're > in the suspend context. Yes, you're right. It's fixed in the new version. > > +static void suspend_timeout(unsigned long _dev) > > +{ > > + struct device *dev = (struct device *) _dev; > > + > > + dev_err(dev, "deadlock during suspend!\n"); > > + show_state(); > > The show_state() seems to be overkill and won't really help if the user can't > collect the output on the fly using a serial console or something like this. > [The debug stuff printed here should fit a typical laptop screen, ideally.] Changed. > I'd prefer > > if (in_suspend_context()) { > __device_release_driver(dev); > } else { > down(&dev->sem); > __device_release_driver(dev); > up(&dev->sem); > } Okay. You know, with this new patch we probably don't need device_pm_schedule_removal() any more. However I have left it in for now. > Well, I'd really feel much more comfortable if we removed the troublesome code > for 2.6.25 and reintroduced it along with the above safeguards for 2.6.26. > > Of course, this doesn't mean we can't send debug patches for testing to the > users who have alraedy reported problems with it. :-) Certainly! Here's the new version. Zdenek, could you try it out with none of Rafael's patches installed? Alan Stern Index: usb-2.6/drivers/base/power/main.c =================================================================== --- usb-2.6.orig/drivers/base/power/main.c +++ usb-2.6/drivers/base/power/main.c @@ -25,6 +25,7 @@ #include <linux/pm.h> #include <linux/resume-trace.h> #include <linux/rwsem.h> +#include <linux/sched.h> #include "../base.h" #include "power.h" @@ -59,6 +60,13 @@ static DECLARE_RWSEM(pm_sleep_rwsem); int (*platform_enable_wakeup)(struct device *dev, int is_on); +static struct task_struct *suspending_task; + +bool in_suspend_context(void) +{ + return (suspending_task == current); +} + /** * device_pm_add - add a device to the list of active devices * @dev: Device to be added to the list @@ -82,27 +90,14 @@ void device_pm_add(struct device *dev) void device_pm_remove(struct device *dev) { /* - * If this function is called during a suspend, it will be blocked, - * because we're holding the device's semaphore at that time, which may - * lead to a deadlock. In that case we want to print a warning. - * However, it may also be called by unregister_dropped_devices() with - * the device's semaphore released, in which case the warning should - * not be printed. + * If this function is called during a sleep then it will block + * because we're holding the device's semaphore at that time; + * this may lead to a deadlock. But if the calling thread is + * the same as the thread doing the sleep then it already owns + * all the device semaphores, so we make an exception. */ - if (down_trylock(&dev->sem)) { - if (down_read_trylock(&pm_sleep_rwsem)) { - /* No suspend in progress, wait on dev->sem */ - down(&dev->sem); - up_read(&pm_sleep_rwsem); - } else { - /* Suspend in progress, we may deadlock */ - dev_warn(dev, "Suspicious %s during suspend\n", - __FUNCTION__); - dump_stack(); - /* The user has been warned ... */ - down(&dev->sem); - } - } + if (!in_suspend_context()) + down(&dev->sem); pr_debug("PM: Removing info for %s:%s\n", dev->bus ? dev->bus->name : "No Bus", kobject_name(&dev->kobj)); @@ -110,7 +105,8 @@ void device_pm_remove(struct device *dev dpm_sysfs_remove(dev); list_del_init(&dev->power.entry); mutex_unlock(&dpm_list_mtx); - up(&dev->sem); + if (!in_suspend_context()) + up(&dev->sem); } /** @@ -157,6 +153,18 @@ void pm_sleep_unlock(void) } +/* Provide debugging info if we hang or deadlock during suspend/resume */ +static struct timer_list method_timer; + +static void method_timeout(unsigned long _dev) +{ + struct device *dev = (struct device *) _dev; + + dev_err(dev, "deadlock during suspend or resume!\n"); + sched_show_task(suspending_task); +} + + /*------------------------- Resume routines -------------------------*/ /** @@ -230,6 +238,10 @@ static int resume_device(struct device * TRACE_DEVICE(dev); TRACE_RESUME(0); + /* Provide debugging output in case of a deadlock */ + setup_timer(&method_timer, method_timeout, (unsigned long) dev); + mod_timer(&method_timer, jiffies + 5*HZ); + if (dev->bus && dev->bus->resume) { dev_dbg(dev,"resuming\n"); error = dev->bus->resume(dev); @@ -245,6 +257,8 @@ static int resume_device(struct device * error = dev->class->resume(dev); } + del_timer_sync(&method_timer); + TRACE_RESUME(error); return error; } @@ -272,6 +286,7 @@ static void dpm_resume(void) mutex_lock(&dpm_list_mtx); } mutex_unlock(&dpm_list_mtx); + suspending_task = NULL; } /** @@ -419,6 +434,10 @@ int suspend_device(struct device *dev, p { int error = 0; + /* Provide debugging output in case of a deadlock */ + setup_timer(&method_timer, method_timeout, (unsigned long) dev); + mod_timer(&method_timer, jiffies + 5*HZ); + if (dev->power.power_state.event) { dev_dbg(dev, "PM: suspend %d-->%d\n", dev->power.power_state.event, state.event); @@ -441,6 +460,8 @@ int suspend_device(struct device *dev, p error = dev->bus->suspend(dev, state); suspend_report_result(dev->bus->suspend, error); } + + del_timer_sync(&method_timer); return error; } @@ -460,6 +481,7 @@ static int dpm_suspend(pm_message_t stat { int error = 0; + suspending_task = current; mutex_lock(&dpm_list_mtx); while (!list_empty(&dpm_locked)) { struct list_head *entry = dpm_locked.prev; Index: usb-2.6/drivers/base/power/power.h =================================================================== --- usb-2.6.orig/drivers/base/power/power.h +++ usb-2.6/drivers/base/power/power.h @@ -11,6 +11,7 @@ static inline struct device *to_device(s return container_of(entry, struct device, power.entry); } +extern bool in_suspend_context(void); extern void device_pm_add(struct device *); extern void device_pm_remove(struct device *); extern int pm_sleep_lock(void); @@ -18,6 +19,10 @@ extern void pm_sleep_unlock(void); #else /* CONFIG_PM_SLEEP */ +static inline bool in_suspend_context(void) +{ + return false; +} static inline void device_pm_add(struct device *dev) { Index: usb-2.6/drivers/base/dd.c =================================================================== --- usb-2.6.orig/drivers/base/dd.c +++ usb-2.6/drivers/base/dd.c @@ -325,10 +325,20 @@ void device_release_driver(struct device * If anyone calls device_release_driver() recursively from * within their ->remove callback for the same device, they * will deadlock right here. + * + * We avoid locking dev->sem if we are in the context of a + * task doing a system sleep, in order that drivers can + * unregister devices from within their suspend() methods. + * This is okay because the suspending task will already own + * all the device semaphores. */ - down(&dev->sem); - __device_release_driver(dev); - up(&dev->sem); + if (in_suspend_context()) { + __device_release_driver(dev); + } else { + down(&dev->sem); + __device_release_driver(dev); + up(&dev->sem); + } } EXPORT_SYMBOL_GPL(device_release_driver); ^ permalink raw reply [flat|nested] 92+ messages in thread
* Re: [linux-pm] [Bug 10030] Suspend doesn't work when SD card is inserted 2008-02-24 3:25 ` Alan Stern @ 2008-02-24 4:26 ` Alan Stern 2008-02-24 14:00 ` Rafael J. Wysocki 2008-02-24 13:33 ` [Bug 10030] Suspend doesn't work when SD card is inserted Rafael J. Wysocki ` (2 subsequent siblings) 3 siblings, 1 reply; 92+ messages in thread From: Alan Stern @ 2008-02-24 4:26 UTC (permalink / raw) To: Rafael J. Wysocki Cc: pm list, Zdenek Kabelac, Kernel development list, Pierre Ossman On Sat, 23 Feb 2008, Alan Stern wrote: > It happened in a workqueue. There could be lots of similar cases: Some > interrupt-driven event causes a hotplug action. Since the action can't > be carried out in interrupt context, the driver has no choice but to > defer it to a workqueue or kernel thread. Blocking that workqueue or > kernel thread won't cause a problem _unless_ the driver's > suspend/resume methods try to synchronize with it. That's the > difficult case. In fact this turns out to be exactly the problem with the MMC subsystem. It uses a dedicated workqueue (kmmcd) to handle state changes, and mmc_suspend_host() does a flush_workqueue(). If the workqueue contains an entry to register or unregister a device, this is guaranteed to deadlock -- and that's exactly what happens when Zdenek inserts or removes a card during the disk-drive spindown time of a system sleep. It looks like the best solution is for mmc_suspend_host() not to flush the workqueue; instead the workqueue should prevent itself from running during a suspend. Right now the easiest way to accomplish this is for the workqueue routines (mmc_detect() and siblings) to acquire the mmc_host's device semaphore. If in the future the MMC subsystem adds dynamic PM support then a more complicated arrangement will be needed. So the patch below, in combination with the previous patch, might just fix the whole problem. Alan Stern Index: usb-2.6/drivers/mmc/core/core.c =================================================================== --- usb-2.6.orig/drivers/mmc/core/core.c +++ usb-2.6/drivers/mmc/core/core.c @@ -731,8 +731,6 @@ void mmc_stop_host(struct mmc_host *host */ int mmc_suspend_host(struct mmc_host *host, pm_message_t state) { - mmc_flush_scheduled_work(); - mmc_bus_get(host); if (host->bus_ops && !host->bus_dead) { if (host->bus_ops->suspend) Index: usb-2.6/drivers/mmc/core/mmc.c =================================================================== --- usb-2.6.orig/drivers/mmc/core/mmc.c +++ usb-2.6/drivers/mmc/core/mmc.c @@ -441,6 +441,7 @@ static void mmc_detect(struct mmc_host * BUG_ON(!host); BUG_ON(!host->card); + down(&mmc_classdev(host)->sem); mmc_claim_host(host); /* @@ -449,6 +450,7 @@ static void mmc_detect(struct mmc_host * err = mmc_send_status(host->card, NULL); mmc_release_host(host); + up(&mmc_classdev(host)->sem); if (err) { mmc_remove(host); Index: usb-2.6/drivers/mmc/core/sd.c =================================================================== --- usb-2.6.orig/drivers/mmc/core/sd.c +++ usb-2.6/drivers/mmc/core/sd.c @@ -500,6 +500,7 @@ static void mmc_sd_detect(struct mmc_hos BUG_ON(!host); BUG_ON(!host->card); + down(&mmc_classdev(host)->sem); mmc_claim_host(host); /* @@ -508,6 +509,7 @@ static void mmc_sd_detect(struct mmc_hos err = mmc_send_status(host->card, NULL); mmc_release_host(host); + up(&mmc_classdev(host)->sem); if (err) { mmc_sd_remove(host); Index: usb-2.6/drivers/mmc/core/sdio.c =================================================================== --- usb-2.6.orig/drivers/mmc/core/sdio.c +++ usb-2.6/drivers/mmc/core/sdio.c @@ -195,6 +195,7 @@ static void mmc_sdio_detect(struct mmc_h BUG_ON(!host); BUG_ON(!host->card); + down(&mmc_classdev(host)->sem); mmc_claim_host(host); /* @@ -203,6 +204,7 @@ static void mmc_sdio_detect(struct mmc_h err = mmc_select_card(host->card); mmc_release_host(host); + up(&mmc_classdev(host)->sem); if (err) { mmc_sdio_remove(host); ^ permalink raw reply [flat|nested] 92+ messages in thread
* Re: [linux-pm] [Bug 10030] Suspend doesn't work when SD card is inserted 2008-02-24 4:26 ` [linux-pm] " Alan Stern @ 2008-02-24 14:00 ` Rafael J. Wysocki 2008-02-24 15:33 ` Bugs in MMC [was: [Bug 10030] Suspend doesn't work when SD card is inserted] Alan Stern 0 siblings, 1 reply; 92+ messages in thread From: Rafael J. Wysocki @ 2008-02-24 14:00 UTC (permalink / raw) To: Alan Stern Cc: pm list, Zdenek Kabelac, Kernel development list, Pierre Ossman On Sunday, 24 of February 2008, Alan Stern wrote: > On Sat, 23 Feb 2008, Alan Stern wrote: > > > It happened in a workqueue. There could be lots of similar cases: Some > > interrupt-driven event causes a hotplug action. Since the action can't > > be carried out in interrupt context, the driver has no choice but to > > defer it to a workqueue or kernel thread. Blocking that workqueue or > > kernel thread won't cause a problem _unless_ the driver's > > suspend/resume methods try to synchronize with it. That's the > > difficult case. > > In fact this turns out to be exactly the problem with the MMC > subsystem. It uses a dedicated workqueue (kmmcd) to handle state > changes, and mmc_suspend_host() does a flush_workqueue(). If the > workqueue contains an entry to register or unregister a device, this is > guaranteed to deadlock -- and that's exactly what happens when Zdenek > inserts or removes a card during the disk-drive spindown time of a > system sleep. > > It looks like the best solution is for mmc_suspend_host() not to flush > the workqueue; instead the workqueue should prevent itself from > running during a suspend. Right now the easiest way to accomplish this > is for the workqueue routines (mmc_detect() and siblings) to acquire > the mmc_host's device semaphore. If in the future the MMC subsystem > adds dynamic PM support then a more complicated arrangement will be > needed. > > So the patch below, in combination with the previous patch, might just > fix the whole problem. The patch looks sane and I think you're right. Still, it depends on us holding the device semaphores (if I understand it correctly). For this reason, I'd prefer to include it into the patch that adds device locking to the PM core. I think we should first remove the device locking from drivers/base/power/main.c (that's the patch I'd like to push upstream now) and then create a patch that will add it back along with all of the necessary additional changes we know of (BTW, I've just received a message from another user affected by it). but we need to be careful with all the details. IMO, it would be Thanks, Rafael > Index: usb-2.6/drivers/mmc/core/core.c > =================================================================== > --- usb-2.6.orig/drivers/mmc/core/core.c > +++ usb-2.6/drivers/mmc/core/core.c > @@ -731,8 +731,6 @@ void mmc_stop_host(struct mmc_host *host > */ > int mmc_suspend_host(struct mmc_host *host, pm_message_t state) > { > - mmc_flush_scheduled_work(); > - > mmc_bus_get(host); > if (host->bus_ops && !host->bus_dead) { > if (host->bus_ops->suspend) > Index: usb-2.6/drivers/mmc/core/mmc.c > =================================================================== > --- usb-2.6.orig/drivers/mmc/core/mmc.c > +++ usb-2.6/drivers/mmc/core/mmc.c > @@ -441,6 +441,7 @@ static void mmc_detect(struct mmc_host * > BUG_ON(!host); > BUG_ON(!host->card); > > + down(&mmc_classdev(host)->sem); > mmc_claim_host(host); > > /* > @@ -449,6 +450,7 @@ static void mmc_detect(struct mmc_host * > err = mmc_send_status(host->card, NULL); > > mmc_release_host(host); > + up(&mmc_classdev(host)->sem); > > if (err) { > mmc_remove(host); > Index: usb-2.6/drivers/mmc/core/sd.c > =================================================================== > --- usb-2.6.orig/drivers/mmc/core/sd.c > +++ usb-2.6/drivers/mmc/core/sd.c > @@ -500,6 +500,7 @@ static void mmc_sd_detect(struct mmc_hos > BUG_ON(!host); > BUG_ON(!host->card); > > + down(&mmc_classdev(host)->sem); > mmc_claim_host(host); > > /* > @@ -508,6 +509,7 @@ static void mmc_sd_detect(struct mmc_hos > err = mmc_send_status(host->card, NULL); > > mmc_release_host(host); > + up(&mmc_classdev(host)->sem); > > if (err) { > mmc_sd_remove(host); > Index: usb-2.6/drivers/mmc/core/sdio.c > =================================================================== > --- usb-2.6.orig/drivers/mmc/core/sdio.c > +++ usb-2.6/drivers/mmc/core/sdio.c > @@ -195,6 +195,7 @@ static void mmc_sdio_detect(struct mmc_h > BUG_ON(!host); > BUG_ON(!host->card); > > + down(&mmc_classdev(host)->sem); > mmc_claim_host(host); > > /* > @@ -203,6 +204,7 @@ static void mmc_sdio_detect(struct mmc_h > err = mmc_select_card(host->card); > > mmc_release_host(host); > + up(&mmc_classdev(host)->sem); > > if (err) { > mmc_sdio_remove(host); > > > -- "Premature optimization is the root of all evil." - Donald Knuth ^ permalink raw reply [flat|nested] 92+ messages in thread
* Bugs in MMC [was: [Bug 10030] Suspend doesn't work when SD card is inserted] 2008-02-24 14:00 ` Rafael J. Wysocki @ 2008-02-24 15:33 ` Alan Stern 2008-02-25 17:41 ` Pierre Ossman 0 siblings, 1 reply; 92+ messages in thread From: Alan Stern @ 2008-02-24 15:33 UTC (permalink / raw) To: Pierre Ossman, Rafael J. Wysocki Cc: pm list, Zdenek Kabelac, David Brownell, Kernel development list On Sun, 24 Feb 2008, Rafael J. Wysocki wrote: > > In fact this turns out to be exactly the problem with the MMC > > subsystem. It uses a dedicated workqueue (kmmcd) to handle state > > changes, and mmc_suspend_host() does a flush_workqueue(). If the > > workqueue contains an entry to register or unregister a device, this is > > guaranteed to deadlock -- and that's exactly what happens when Zdenek > > inserts or removes a card during the disk-drive spindown time of a > > system sleep. > > > > It looks like the best solution is for mmc_suspend_host() not to flush > > the workqueue; instead the workqueue should prevent itself from > > running during a suspend. Right now the easiest way to accomplish this > > is for the workqueue routines (mmc_detect() and siblings) to acquire > > the mmc_host's device semaphore. If in the future the MMC subsystem > > adds dynamic PM support then a more complicated arrangement will be > > needed. > > > > So the patch below, in combination with the previous patch, might just > > fix the whole problem. > > The patch looks sane and I think you're right. Well, the patch itself isn't really adequate; it was just a first pass intended to illustrate the nature of the problem. The problems in the MMC subsystem go deeper. The first is the way it uses flush_workqueue(). This is almost always a bad function to call, because it introduces unnecessary dependencies. cancel_delayed_work_sync() is much better. But even changing that won't solve the second issue, which is a genuine bug. There is a race between detect events and suspend events. The mmc_suspend_host() routine starts out by flushing the kmmcd workqueue before calling the host's suspend routine. So what happens if another detect event occurs in between? This whole area of synchronization between card insertion, card removal, suspend, and resume needs to be thought out better. Especially keeping in mind that device registration or unregistration during a system sleep is likely to block until the sleep is over. Lastly, there are some other questions about confusing aspects of the subsystem. What's the story with __mmc_claim_host()? Is it really nothing more than an abortable mutex_lock()? Why does it ever need to be aborted? Why is its second argument an atomic_t instead of a normal int? Why are mmc_detect() and related routines described in comments as "Card detection callback from host"? They don't handle card _detection_ -- they handle card _removal_. What's the purpose of the card-counting loop in host/omap.c:mmc_omap_switch_handler()? It looks like dead code. Alan Stern ^ permalink raw reply [flat|nested] 92+ messages in thread
* Re: Bugs in MMC [was: [Bug 10030] Suspend doesn't work when SD card is inserted] 2008-02-24 15:33 ` Bugs in MMC [was: [Bug 10030] Suspend doesn't work when SD card is inserted] Alan Stern @ 2008-02-25 17:41 ` Pierre Ossman 2008-02-25 17:58 ` Alan Stern ` (2 more replies) 0 siblings, 3 replies; 92+ messages in thread From: Pierre Ossman @ 2008-02-25 17:41 UTC (permalink / raw) To: Alan Stern Cc: Rafael J. Wysocki, pm list, Zdenek Kabelac, David Brownell, Kernel development list On Sun, 24 Feb 2008 10:33:34 -0500 (EST) Alan Stern <stern@rowland.harvard.edu> wrote: > > Well, the patch itself isn't really adequate; it was just a first pass > intended to illustrate the nature of the problem. > > The problems in the MMC subsystem go deeper. > > The first is the way it uses flush_workqueue(). This is almost always > a bad function to call, because it introduces unnecessary dependencies. > cancel_delayed_work_sync() is much better. > > But even changing that won't solve the second issue, which is a genuine > bug. There is a race between detect events and suspend events. The > mmc_suspend_host() routine starts out by flushing the kmmcd workqueue > before calling the host's suspend routine. So what happens if another > detect event occurs in between? > The idea is that host drivers shouldn't do that. Once they've called mmc_suspend_host(), then they shouldn't be poking the MMC core in any other way. None of this is of course properly documented. :/ > This whole area of synchronization between card insertion, card > removal, suspend, and resume needs to be thought out better. > Especially keeping in mind that device registration or unregistration > during a system sleep is likely to block until the sleep is over. Trying to keep up with the PM changes is a full time job. For now I've mostly ignored it as I don't even have time for the other stuff. Patches welcome. > > Lastly, there are some other questions about confusing aspects of the > subsystem. What's the story with __mmc_claim_host()? Is it really > nothing more than an abortable mutex_lock()? Why does it ever need to > be aborted? Why is its second argument an atomic_t instead of a normal > int? > It got that way when SDIO got in. It was needed to make it abortable to solve a rather nasty deadlock situation. I don't remember the details right now, but it should be in the LKML archives. > Why are mmc_detect() and related routines described in comments as > "Card detection callback from host"? They don't handle card > _detection_ -- they handle card _removal_. They handle both. > > What's the purpose of the card-counting loop in > host/omap.c:mmc_omap_switch_handler()? It looks like dead code. > I'm not too familiar with that driver, but they've been playing around with multiplexing several cards into one controller. Might be bits and pieces of that. Rgds -- -- Pierre Ossman Linux kernel, MMC maintainer http://www.kernel.org PulseAudio, core developer http://pulseaudio.org rdesktop, core developer http://www.rdesktop.org ^ permalink raw reply [flat|nested] 92+ messages in thread
* Re: Bugs in MMC [was: [Bug 10030] Suspend doesn't work when SD card is inserted] 2008-02-25 17:41 ` Pierre Ossman @ 2008-02-25 17:58 ` Alan Stern 2008-02-25 18:31 ` Pierre Ossman 2008-02-25 22:51 ` Felipe Balbi 2008-03-03 21:59 ` David Brownell 2 siblings, 1 reply; 92+ messages in thread From: Alan Stern @ 2008-02-25 17:58 UTC (permalink / raw) To: Pierre Ossman Cc: Rafael J. Wysocki, pm list, Zdenek Kabelac, David Brownell, Kernel development list Thanks for the explanations. On Mon, 25 Feb 2008, Pierre Ossman wrote: > Trying to keep up with the PM changes is a full time job. For now I've > mostly ignored it as I don't even have time for the other stuff. > Patches welcome. What do you think of the patch attached to comment #40 in the Bugzilla entry? Alan Stern ^ permalink raw reply [flat|nested] 92+ messages in thread
* Re: Bugs in MMC [was: [Bug 10030] Suspend doesn't work when SD card is inserted] 2008-02-25 17:58 ` Alan Stern @ 2008-02-25 18:31 ` Pierre Ossman 2008-02-25 20:00 ` Alan Stern 0 siblings, 1 reply; 92+ messages in thread From: Pierre Ossman @ 2008-02-25 18:31 UTC (permalink / raw) To: Alan Stern Cc: Rafael J. Wysocki, pm list, Zdenek Kabelac, David Brownell, Kernel development list On Mon, 25 Feb 2008 12:58:30 -0500 (EST) Alan Stern <stern@rowland.harvard.edu> wrote: > Thanks for the explanations. > > On Mon, 25 Feb 2008, Pierre Ossman wrote: > > > Trying to keep up with the PM changes is a full time job. For now I've > > mostly ignored it as I don't even have time for the other stuff. > > Patches welcome. > > What do you think of the patch attached to comment #40 in the Bugzilla > entry? > Looks ok. As long as those two synchronization points are guaranteed, then I'm happy. Rgds -- -- Pierre Ossman Linux kernel, MMC maintainer http://www.kernel.org PulseAudio, core developer http://pulseaudio.org rdesktop, core developer http://www.rdesktop.org ^ permalink raw reply [flat|nested] 92+ messages in thread
* Re: Bugs in MMC [was: [Bug 10030] Suspend doesn't work when SD card is inserted] 2008-02-25 18:31 ` Pierre Ossman @ 2008-02-25 20:00 ` Alan Stern 2008-03-01 14:11 ` Pierre Ossman 0 siblings, 1 reply; 92+ messages in thread From: Alan Stern @ 2008-02-25 20:00 UTC (permalink / raw) To: Pierre Ossman Cc: Rafael J. Wysocki, pm list, Zdenek Kabelac, David Brownell, Kernel development list On Mon, 25 Feb 2008, Pierre Ossman wrote: > > What do you think of the patch attached to comment #40 in the Bugzilla > > entry? > > > > Looks ok. As long as those two synchronization points are guaranteed, > then I'm happy. Maybe a better approach would be to leave the workqueue unfreezable, and keep cancel_delayed_work_sync() in mmc_suspend_host(). It would then be necessary to add a test to verify, if there is a card attached, that the card is indeed suspended. After all, it's possible that the cancel_delayed_work_sync() ended up waiting for a job already running on the workqueue to register a new card! (The same would be true even with flush_scheduled_work.) Also, as a bit of defensive programming, it might be a good idea to add a "suspended" flag to the mmc_host structure. If mmc_rescan() sees that the flag is set then it should return immediately. This would protect against host drivers that aren't careful to disable detect IRQs before calling mmc_suspend_host(). Alan Stern ^ permalink raw reply [flat|nested] 92+ messages in thread
* Re: Bugs in MMC [was: [Bug 10030] Suspend doesn't work when SD card is inserted] 2008-02-25 20:00 ` Alan Stern @ 2008-03-01 14:11 ` Pierre Ossman 2008-03-01 14:36 ` Alan Stern 0 siblings, 1 reply; 92+ messages in thread From: Pierre Ossman @ 2008-03-01 14:11 UTC (permalink / raw) To: Alan Stern Cc: Rafael J. Wysocki, pm list, Zdenek Kabelac, David Brownell, Kernel development list On Mon, 25 Feb 2008 15:00:51 -0500 (EST) Alan Stern <stern@rowland.harvard.edu> wrote: > > Maybe a better approach would be to leave the workqueue unfreezable, > and keep cancel_delayed_work_sync() in mmc_suspend_host(). It would > then be necessary to add a test to verify, if there is a card attached, > that the card is indeed suspended. After all, it's possible that the > cancel_delayed_work_sync() ended up waiting for a job already running > on the workqueue to register a new card! (The same would be true even > with flush_scheduled_work.) How would that be handled? I'd prefer a patch as I'm evidently not up to date with how to fondle the pm stuff properly. :) > > Also, as a bit of defensive programming, it might be a good idea to add > a "suspended" flag to the mmc_host structure. If mmc_rescan() sees > that the flag is set then it should return immediately. This would > protect against host drivers that aren't careful to disable detect > IRQs before calling mmc_suspend_host(). Indeed. I'll add that to my todo. Rgds -- -- Pierre Ossman Linux kernel, MMC maintainer http://www.kernel.org PulseAudio, core developer http://pulseaudio.org rdesktop, core developer http://www.rdesktop.org ^ permalink raw reply [flat|nested] 92+ messages in thread
* Re: Bugs in MMC [was: [Bug 10030] Suspend doesn't work when SD card is inserted] 2008-03-01 14:11 ` Pierre Ossman @ 2008-03-01 14:36 ` Alan Stern 2008-03-01 14:47 ` Pierre Ossman 0 siblings, 1 reply; 92+ messages in thread From: Alan Stern @ 2008-03-01 14:36 UTC (permalink / raw) To: Pierre Ossman Cc: Rafael J. Wysocki, pm list, Zdenek Kabelac, David Brownell, Kernel development list On Sat, 1 Mar 2008, Pierre Ossman wrote: > On Mon, 25 Feb 2008 15:00:51 -0500 (EST) > Alan Stern <stern@rowland.harvard.edu> wrote: > > > > > Maybe a better approach would be to leave the workqueue unfreezable, > > and keep cancel_delayed_work_sync() in mmc_suspend_host(). It would > > then be necessary to add a test to verify, if there is a card attached, > > that the card is indeed suspended. After all, it's possible that the > > cancel_delayed_work_sync() ended up waiting for a job already running > > on the workqueue to register a new card! (The same would be true even > > with flush_scheduled_work.) > > How would that be handled? I'd prefer a patch as I'm evidently not up to date with how to fondle the pm stuff properly. :) The easiest way to do this will be to wait until some planned updates are added to the PM core; then this will be a quick and simple change. That probably means waiting until after 2.6.25 is released, however. > > Also, as a bit of defensive programming, it might be a good idea to add > > a "suspended" flag to the mmc_host structure. If mmc_rescan() sees > > that the flag is set then it should return immediately. This would > > protect against host drivers that aren't careful to disable detect > > IRQs before calling mmc_suspend_host(). > > Indeed. I'll add that to my todo. That could go in along with the previous change. The PM update mentioned above involves adding a "suspended" flag to every struct device. With that available, this amounts to nothing more than an extra test added at the beginning of mmc_rescan(). Alan Stern ^ permalink raw reply [flat|nested] 92+ messages in thread
* Re: Bugs in MMC [was: [Bug 10030] Suspend doesn't work when SD card is inserted] 2008-03-01 14:36 ` Alan Stern @ 2008-03-01 14:47 ` Pierre Ossman 0 siblings, 0 replies; 92+ messages in thread From: Pierre Ossman @ 2008-03-01 14:47 UTC (permalink / raw) To: Alan Stern Cc: Rafael J. Wysocki, pm list, Zdenek Kabelac, David Brownell, Kernel development list On Sat, 1 Mar 2008 09:36:36 -0500 (EST) Alan Stern <stern@rowland.harvard.edu> wrote: > > The easiest way to do this will be to wait until some planned updates > are added to the PM core; then this will be a quick and simple change. > That probably means waiting until after 2.6.25 is released, however. > Ok. Just give me a poke when it's time. -- -- Pierre Ossman Linux kernel, MMC maintainer http://www.kernel.org PulseAudio, core developer http://pulseaudio.org rdesktop, core developer http://www.rdesktop.org ^ permalink raw reply [flat|nested] 92+ messages in thread
* Re: Bugs in MMC [was: [Bug 10030] Suspend doesn't work when SD card is inserted] 2008-02-25 17:41 ` Pierre Ossman 2008-02-25 17:58 ` Alan Stern @ 2008-02-25 22:51 ` Felipe Balbi 2008-03-03 21:59 ` David Brownell 2 siblings, 0 replies; 92+ messages in thread From: Felipe Balbi @ 2008-02-25 22:51 UTC (permalink / raw) To: Pierre Ossman, carlos.aguiar Cc: Alan Stern, Rafael J. Wysocki, pm list, Zdenek Kabelac, David Brownell, Kernel development list <snip> > > What's the purpose of the card-counting loop in > > host/omap.c:mmc_omap_switch_handler()? It looks like dead code. > > > > I'm not too familiar with that driver, but they've been playing around > with multiplexing several cards into one controller. Might be bits and > pieces of that. > AFAIK, that came after mmc multislot support. Omap2 has one mmc controller for several (2 in practice) mm slots (cards). It's not really a dead code since it's used in n800 and n810, it's in linux-omap archives and git tree. Carlos Aguiar, added in the loop, might have some comments about it. -- Best Regards, Felipe Balbi felipebalbi@users.sourceforge.net ^ permalink raw reply [flat|nested] 92+ messages in thread
* Re: Bugs in MMC [was: [Bug 10030] Suspend doesn't work when SD card is inserted] 2008-02-25 17:41 ` Pierre Ossman 2008-02-25 17:58 ` Alan Stern 2008-02-25 22:51 ` Felipe Balbi @ 2008-03-03 21:59 ` David Brownell 2008-03-04 6:03 ` Pierre Ossman 2008-03-04 17:50 ` Alan Stern 2 siblings, 2 replies; 92+ messages in thread From: David Brownell @ 2008-03-03 21:59 UTC (permalink / raw) To: Pierre Ossman Cc: Alan Stern, Rafael J. Wysocki, pm list, Zdenek Kabelac, Kernel development list On Monday 25 February 2008, Pierre Ossman wrote: > On Sun, 24 Feb 2008 10:33:34 -0500 (EST) > Alan Stern <stern@rowland.harvard.edu> wrote: > > > But even changing that won't solve the second issue, which is a genuine > > bug. There is a race between detect events and suspend events. The > > mmc_suspend_host() routine starts out by flushing the kmmcd workqueue > > before calling the host's suspend routine. So what happens if another > > detect event occurs in between? > > > > The idea is that host drivers shouldn't do that. Once they've called > mmc_suspend_host(), then they shouldn't be poking the MMC core in any > other way. None of this is of course properly documented. :/ Card insert/remove events can be system wake events though. Which makes that restriction impractical. I think hosts need to be able to call mmc_detect_change() as soon as they see a stable signal. The MMC core can hold off handling that for a while, if it needs to wait until the code walking the device tree gets around to resuming that host. It's a lot more natural to hold off such stuff one time there than in N host drivers; especially since the MMC core already has such hold-off code. - Dave ^ permalink raw reply [flat|nested] 92+ messages in thread
* Re: Bugs in MMC [was: [Bug 10030] Suspend doesn't work when SD card is inserted] 2008-03-03 21:59 ` David Brownell @ 2008-03-04 6:03 ` Pierre Ossman 2008-03-04 9:44 ` David Brownell 2008-03-04 17:50 ` Alan Stern 1 sibling, 1 reply; 92+ messages in thread From: Pierre Ossman @ 2008-03-04 6:03 UTC (permalink / raw) To: David Brownell Cc: Alan Stern, Rafael J. Wysocki, pm list, Zdenek Kabelac, Kernel development list On Mon, 3 Mar 2008 13:59:37 -0800 David Brownell <david-b@pacbell.net> wrote: > > Card insert/remove events can be system wake events though. Which > makes that restriction impractical. > > I think hosts need to be able to call mmc_detect_change() as soon as > they see a stable signal. The MMC core can hold off handling that > for a while, if it needs to wait until the code walking the device > tree gets around to resuming that host. It's a lot more natural to > hold off such stuff one time there than in N host drivers; especially > since the MMC core already has such hold-off code. > That actually sorts itself out as the MMC core reprobes on wakeup, but I see your point. Right now things will work peachy if the controllers just make sure to disable their card detection logic before telling the core to suspend. Rgds -- -- Pierre Ossman Linux kernel, MMC maintainer http://www.kernel.org PulseAudio, core developer http://pulseaudio.org rdesktop, core developer http://www.rdesktop.org ^ permalink raw reply [flat|nested] 92+ messages in thread
* Re: Bugs in MMC [was: [Bug 10030] Suspend doesn't work when SD card is inserted] 2008-03-04 6:03 ` Pierre Ossman @ 2008-03-04 9:44 ` David Brownell 2008-03-04 9:58 ` Pierre Ossman 2008-03-04 17:53 ` Alan Stern 0 siblings, 2 replies; 92+ messages in thread From: David Brownell @ 2008-03-04 9:44 UTC (permalink / raw) To: Pierre Ossman Cc: Alan Stern, Rafael J. Wysocki, pm list, Zdenek Kabelac, Kernel development list On Monday 03 March 2008, Pierre Ossman wrote: > On Mon, 3 Mar 2008 13:59:37 -0800 > David Brownell <david-b@pacbell.net> wrote: > > > > > Card insert/remove events can be system wake events though. Which > > makes that restriction impractical. This part seems to be ignored by your comment ... wake events. > > I think hosts need to be able to call mmc_detect_change() as soon as > > they see a stable signal. The MMC core can hold off handling that > > for a while, if it needs to wait until the code walking the device > > tree gets around to resuming that host. It's a lot more natural to > > hold off such stuff one time there than in N host drivers; especially > > since the MMC core already has such hold-off code. > > > > That actually sorts itself out as the MMC core reprobes on wakeup, but > I see your point. Right now things will work peachy if the controllers > just make sure to disable their card detection logic before telling the > core to suspend. That is, the MMC core doesn't understand wakeup events. Or, as pointed out elsewhere, well-behaved MMC hosts ... which don't need either such reprobing or the associated remove-on-suspend. - Dave ^ permalink raw reply [flat|nested] 92+ messages in thread
* Re: Bugs in MMC [was: [Bug 10030] Suspend doesn't work when SD card is inserted] 2008-03-04 9:44 ` David Brownell @ 2008-03-04 9:58 ` Pierre Ossman 2008-03-06 21:23 ` David Brownell 2008-03-04 17:53 ` Alan Stern 1 sibling, 1 reply; 92+ messages in thread From: Pierre Ossman @ 2008-03-04 9:58 UTC (permalink / raw) To: David Brownell Cc: Alan Stern, Rafael J. Wysocki, pm list, Zdenek Kabelac, Kernel development list On Tue, 4 Mar 2008 01:44:09 -0800 David Brownell <david-b@pacbell.net> wrote: > On Monday 03 March 2008, Pierre Ossman wrote: > > On Mon, 3 Mar 2008 13:59:37 -0800 > > David Brownell <david-b@pacbell.net> wrote: > > > > > > > > Card insert/remove events can be system wake events though. Which > > > makes that restriction impractical. > > This part seems to be ignored by your comment ... wake events. > How so? How is a wake caused by the MMC controller different than any other source? > > That is, the MMC core doesn't understand wakeup events. > > Or, as pointed out elsewhere, well-behaved MMC hosts ... which don't > need either such reprobing or the associated remove-on-suspend. > You need well behaved _systems_, not just hosts to achieve that guarantee. SDHCI controllers have the ability to wake up the system on card removal, but there is zero guarantee that the platform actually wired the controller up in a way that actually allows it to do this. Throw suspend to disk, where the system might completely lose power, into the mix and you're completely screwed. So for the default behaviour to change, we need one of two things: - Certainty that removals cannot go unnoticed. (even then, you also need wakeup latency guarantees) - Ability to detect a removal after the fact. For the general case, the first one is impossible given todays hardware. The second might be solvable, but noone has done the work. (If your complete system can satisfy the first option, feel free to add the "unsafe" option to you defconfig, but that is hardly adequate reason to have it on by default.) Rgds -- -- Pierre Ossman Linux kernel, MMC maintainer http://www.kernel.org PulseAudio, core developer http://pulseaudio.org rdesktop, core developer http://www.rdesktop.org ^ permalink raw reply [flat|nested] 92+ messages in thread
* Re: Bugs in MMC [was: [Bug 10030] Suspend doesn't work when SD card is inserted] 2008-03-04 9:58 ` Pierre Ossman @ 2008-03-06 21:23 ` David Brownell 0 siblings, 0 replies; 92+ messages in thread From: David Brownell @ 2008-03-06 21:23 UTC (permalink / raw) To: Pierre Ossman Cc: Alan Stern, Rafael J. Wysocki, pm list, Zdenek Kabelac, Kernel development list > > > > Card insert/remove events can be system wake events though. Which > > > > makes that restriction impractical. > > > > This part seems to be ignored by your comment ... wake events. > > How so? How is a wake caused by the MMC controller different > than any other source? They aren't. Which is part of why the way MMC currently assumes that insert/remove events don't work is a problem. > > That is, the MMC core doesn't understand wakeup events. > > > > Or, as pointed out elsewhere, well-behaved MMC hosts ... which don't > > need either such reprobing or the associated remove-on-suspend. > > You need well behaved _systems_, not just hosts to achieve that > guarantee. Sure, but a host can't be well behaved all by itself! And in any case, I had already made clear I was talking about _systems_ that behave properly. > SDHCI controllers have the ability to wake up the > system on card removal, but there is zero guarantee that the > platform actually wired the controller up in a way that actually > allows it to do this. Throw suspend to disk, where the system might > completely lose power, into the mix and you're completely screwed. I'm talking about generic MMC/SD controllers of the type that have been around for years ... on systems which won't use hibernation ("suspend to disk"), but do use real system sleep states where card detection (by IRQs) works. > So for the default behaviour to change, we need one of two things: > > - Certainty that removals cannot go unnoticed. (even then, you > also need wakeup latency guarantees) > - Ability to detect a removal after the fact. > > For the general case, the first one is impossible given todays > hardware. Odd that it's very possible on the systems I mentioned. > The second might be solvable, but noone has done the work. I don't know what you mean by "detect a removal after the fact", or why it'd be needed if you detected it in the first place. > (If your complete system can satisfy the first option, feel > free to add the "unsafe" option to you defconfig, but that is > hardly adequate reason to have it on by default.) Thing is, I had also pointed out that it wasn't "unsafe" in the least on many systems. I'll refresh the patch which improves that mechanism and updates its description. - Dave ^ permalink raw reply [flat|nested] 92+ messages in thread
* Re: Bugs in MMC [was: [Bug 10030] Suspend doesn't work when SD card is inserted] 2008-03-04 9:44 ` David Brownell 2008-03-04 9:58 ` Pierre Ossman @ 2008-03-04 17:53 ` Alan Stern 2008-03-04 18:53 ` David Brownell 1 sibling, 1 reply; 92+ messages in thread From: Alan Stern @ 2008-03-04 17:53 UTC (permalink / raw) To: David Brownell Cc: Pierre Ossman, Rafael J. Wysocki, pm list, Zdenek Kabelac, Kernel development list On Tue, 4 Mar 2008, David Brownell wrote: > Or, as pointed out elsewhere, well-behaved MMC hosts ... which don't > need either such reprobing or the associated remove-on-suspend. I don't understand this comment. Suppose a card was inserted while the system was hibernating. If the core didn't reprobe, when would that card be discovered? Alan Stern ^ permalink raw reply [flat|nested] 92+ messages in thread
* Re: Bugs in MMC [was: [Bug 10030] Suspend doesn't work when SD card is inserted] 2008-03-04 17:53 ` Alan Stern @ 2008-03-04 18:53 ` David Brownell 2008-03-04 19:51 ` Alan Stern 0 siblings, 1 reply; 92+ messages in thread From: David Brownell @ 2008-03-04 18:53 UTC (permalink / raw) To: Alan Stern Cc: Pierre Ossman, Rafael J. Wysocki, pm list, Zdenek Kabelac, Kernel development list On Tuesday 04 March 2008, Alan Stern wrote: > On Tue, 4 Mar 2008, David Brownell wrote: > > > Or, as pointed out elsewhere, well-behaved MMC hosts ... which don't > > need either such reprobing or the associated remove-on-suspend. > > I don't understand this comment. Suppose a card was inserted while the > system was hibernating. If the core didn't reprobe, when would that > card be discovered? The host controller would tell the core to check for a card, exactly like it does at all other times. That's the natural alternative to having the MMC core assume that card detection was broken in low power states, so that the core needed to forcibly remove the cards before suspend, and reprobe during resume processing. Having the MMC core make such needless assumptions can cause problems for the upper layers, including filesystems. - Dave ^ permalink raw reply [flat|nested] 92+ messages in thread
* Re: Bugs in MMC [was: [Bug 10030] Suspend doesn't work when SD card is inserted] 2008-03-04 18:53 ` David Brownell @ 2008-03-04 19:51 ` Alan Stern 2008-03-04 20:30 ` David Brownell 0 siblings, 1 reply; 92+ messages in thread From: Alan Stern @ 2008-03-04 19:51 UTC (permalink / raw) To: David Brownell Cc: Pierre Ossman, Rafael J. Wysocki, pm list, Zdenek Kabelac, Kernel development list On Tue, 4 Mar 2008, David Brownell wrote: > > I don't understand this comment. Suppose a card was inserted while the > > system was hibernating. If the core didn't reprobe, when would that > > card be discovered? > > The host controller would tell the core to check for a card, exactly > like it does at all other times. But how would the host controller know to do that? Isn't card insertion detection often driven by interrupts? If a card is inserted while the computer is off, no interrupt will be generated. > That's the natural alternative to having the MMC core assume that card > detection was broken in low power states, so that the core needed to > forcibly remove the cards before suspend, and reprobe during resume > processing. Is that the assumption the MMC core was really making? Are you sure it wasn't assuming something else (perhaps equally as bad)? > Having the MMC core make such needless assumptions can cause problems > for the upper layers, including filesystems. What's wrong with a superfluous probe at resume time, besides the waste of a few milliseconds? Alan Stern ^ permalink raw reply [flat|nested] 92+ messages in thread
* Re: Bugs in MMC [was: [Bug 10030] Suspend doesn't work when SD card is inserted] 2008-03-04 19:51 ` Alan Stern @ 2008-03-04 20:30 ` David Brownell 2008-03-04 21:00 ` Alan Stern 0 siblings, 1 reply; 92+ messages in thread From: David Brownell @ 2008-03-04 20:30 UTC (permalink / raw) To: Alan Stern Cc: Pierre Ossman, Rafael J. Wysocki, pm list, Zdenek Kabelac, Kernel development list On Tuesday 04 March 2008, Alan Stern wrote: > On Tue, 4 Mar 2008, David Brownell wrote: > > > > I don't understand this comment. Suppose a card was inserted while the > > > system was hibernating. If the core didn't reprobe, when would that > > > card be discovered? > > > > The host controller would tell the core to check for a card, exactly > > like it does at all other times. > > But how would the host controller know to do that? Isn't card > insertion detection often driven by interrupts? If a card is > inserted while the computer is off, no interrupt will be generated. I'm talking about Real Suspend States (tm), not hibernation. The systems in question tend to not support hibernation. In particular: system sleep states where IRQs work ... that's a common way for non-x86 platforms to work. > > That's the natural alternative to having the MMC core assume that card > > detection was broken in low power states, so that the core needed to > > forcibly remove the cards before suspend, and reprobe during resume > > processing. > > Is that the assumption the MMC core was really making? Are you sure it > wasn't assuming something else (perhaps equally as bad)? On the systems I was looking at, that assumption seems equivalent to what the latest MMC core is assuming. (I think this came as part of the rewrite/refactoring a while back.) But since the assumptions aren't documented as such, you could say all guesses are equal. ;) > > Having the MMC core make such needless assumptions can cause problems > > for the upper layers, including filesystems. > > What's wrong with a superfluous probe at resume time, besides the waste > of a few milliseconds? I'm more concerned with the undesirable removal of devices at suspend time ... ones with mounted filesystems etc. - Dave ^ permalink raw reply [flat|nested] 92+ messages in thread
* Re: Bugs in MMC [was: [Bug 10030] Suspend doesn't work when SD card is inserted] 2008-03-04 20:30 ` David Brownell @ 2008-03-04 21:00 ` Alan Stern 2008-03-06 15:55 ` Pavel Machek 0 siblings, 1 reply; 92+ messages in thread From: Alan Stern @ 2008-03-04 21:00 UTC (permalink / raw) To: David Brownell Cc: Pierre Ossman, Rafael J. Wysocki, pm list, Zdenek Kabelac, Kernel development list On Tue, 4 Mar 2008, David Brownell wrote: > > What's wrong with a superfluous probe at resume time, besides the waste > > of a few milliseconds? > > I'm more concerned with the undesirable removal of devices at suspend > time ... ones with mounted filesystems etc. On that we can agree. The removal is done if the host doesn't define a resume method. There doesn't seem to be any point to that, given that the probing during resume will determine whether a card has in fact been removed. Alan Stern ^ permalink raw reply [flat|nested] 92+ messages in thread
* Re: Bugs in MMC [was: [Bug 10030] Suspend doesn't work when SD card is inserted] 2008-03-04 21:00 ` Alan Stern @ 2008-03-06 15:55 ` Pavel Machek 2008-03-06 20:33 ` Alan Stern 0 siblings, 1 reply; 92+ messages in thread From: Pavel Machek @ 2008-03-06 15:55 UTC (permalink / raw) To: Alan Stern Cc: David Brownell, Pierre Ossman, Rafael J. Wysocki, pm list, Zdenek Kabelac, Kernel development list On Tue 2008-03-04 16:00:51, Alan Stern wrote: > On Tue, 4 Mar 2008, David Brownell wrote: > > > > What's wrong with a superfluous probe at resume time, besides the waste > > > of a few milliseconds? > > > > I'm more concerned with the undesirable removal of devices at suspend > > time ... ones with mounted filesystems etc. > > On that we can agree. The removal is done if the host doesn't define a > resume method. There doesn't seem to be any point to that, given that > the probing during resume will determine whether a card has in fact > been removed. Hmm, if the driver is sleeping too deeply, user might have removed the card and put in different one, without driver noticing. That would be _bad_. -- (english) http://www.livejournal.com/~pavelmachek (cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html ^ permalink raw reply [flat|nested] 92+ messages in thread
* Re: Bugs in MMC [was: [Bug 10030] Suspend doesn't work when SD card is inserted] 2008-03-06 15:55 ` Pavel Machek @ 2008-03-06 20:33 ` Alan Stern 2008-03-06 20:53 ` Zdenek Kabelac 0 siblings, 1 reply; 92+ messages in thread From: Alan Stern @ 2008-03-06 20:33 UTC (permalink / raw) To: Pavel Machek Cc: David Brownell, Pierre Ossman, Rafael J. Wysocki, pm list, Zdenek Kabelac, Kernel development list On Thu, 6 Mar 2008, Pavel Machek wrote: > On Tue 2008-03-04 16:00:51, Alan Stern wrote: > > On Tue, 4 Mar 2008, David Brownell wrote: > > > > > > What's wrong with a superfluous probe at resume time, besides the waste > > > > of a few milliseconds? > > > > > > I'm more concerned with the undesirable removal of devices at suspend > > > time ... ones with mounted filesystems etc. > > > > On that we can agree. The removal is done if the host doesn't define a > > resume method. There doesn't seem to be any point to that, given that > > the probing during resume will determine whether a card has in fact > > been removed. > > Hmm, if the driver is sleeping too deeply, user might have removed the > card and put in different one, without driver noticing. That would be > _bad_. Ironically, the very same problem now exists with the USB mass-storage driver. I don't see any way for the driver itself to solve it, especially during a hibernation (which can be a _very_ deep sleep). One thing that could be done is for filesystems to verify, after a system sleep, that their superblocks haven't changed. There could still be issues with non-mounted partitions, if they have live entries in the block cache, but it would be an improvement. Do you know the right people to mention this to? Anybody in filesystem development interested in suspend/hibernation issues? Alan Stern ^ permalink raw reply [flat|nested] 92+ messages in thread
* Re: Bugs in MMC [was: [Bug 10030] Suspend doesn't work when SD card is inserted] 2008-03-06 20:33 ` Alan Stern @ 2008-03-06 20:53 ` Zdenek Kabelac 2008-03-06 21:31 ` Rafael J. Wysocki 0 siblings, 1 reply; 92+ messages in thread From: Zdenek Kabelac @ 2008-03-06 20:53 UTC (permalink / raw) To: Alan Stern Cc: Pavel Machek, David Brownell, Pierre Ossman, Rafael J. Wysocki, pm list, Kernel development list 2008/3/6, Alan Stern <stern@rowland.harvard.edu>: > On Thu, 6 Mar 2008, Pavel Machek wrote: > > > On Tue 2008-03-04 16:00:51, Alan Stern wrote: > > > On Tue, 4 Mar 2008, David Brownell wrote: > > > > > > > > What's wrong with a superfluous probe at resume time, besides the waste > > > > > of a few milliseconds? > > > > > > > > I'm more concerned with the undesirable removal of devices at suspend > > > > time ... ones with mounted filesystems etc. > > > > > > On that we can agree. The removal is done if the host doesn't define a > > > resume method. There doesn't seem to be any point to that, given that > > > the probing during resume will determine whether a card has in fact > > > been removed. > > > > Hmm, if the driver is sleeping too deeply, user might have removed the > > card and put in different one, without driver noticing. That would be > > _bad_. > > > Ironically, the very same problem now exists with the USB mass-storage > driver. I don't see any way for the driver itself to solve it, > especially during a hibernation (which can be a _very_ deep sleep). > > One thing that could be done is for filesystems to verify, after a > system sleep, that their superblocks haven't changed. There could > still be issues with non-mounted partitions, if they have live entries > in the block cache, but it would be an improvement. > > Do you know the right people to mention this to? Anybody in filesystem > development interested in suspend/hibernation issues? IMHO the way would be to try to unmount fs if it's possible - if not - user should be notified on suspend/hibernation that he must preserve media in its place after resume and it should be checked and user should be notified if different devices/fs were find... Zdenek ^ permalink raw reply [flat|nested] 92+ messages in thread
* Re: Bugs in MMC [was: [Bug 10030] Suspend doesn't work when SD card is inserted] 2008-03-06 20:53 ` Zdenek Kabelac @ 2008-03-06 21:31 ` Rafael J. Wysocki 0 siblings, 0 replies; 92+ messages in thread From: Rafael J. Wysocki @ 2008-03-06 21:31 UTC (permalink / raw) To: Zdenek Kabelac Cc: Alan Stern, Pavel Machek, David Brownell, Pierre Ossman, pm list, Kernel development list On Thursday, 6 of March 2008, Zdenek Kabelac wrote: > 2008/3/6, Alan Stern <stern@rowland.harvard.edu>: > > On Thu, 6 Mar 2008, Pavel Machek wrote: > > > > > On Tue 2008-03-04 16:00:51, Alan Stern wrote: > > > > On Tue, 4 Mar 2008, David Brownell wrote: > > > > > > > > > > What's wrong with a superfluous probe at resume time, besides the waste > > > > > > of a few milliseconds? > > > > > > > > > > I'm more concerned with the undesirable removal of devices at suspend > > > > > time ... ones with mounted filesystems etc. > > > > > > > > On that we can agree. The removal is done if the host doesn't define a > > > > resume method. There doesn't seem to be any point to that, given that > > > > the probing during resume will determine whether a card has in fact > > > > been removed. > > > > > > Hmm, if the driver is sleeping too deeply, user might have removed the > > > card and put in different one, without driver noticing. That would be > > > _bad_. > > > > > > Ironically, the very same problem now exists with the USB mass-storage > > driver. I don't see any way for the driver itself to solve it, > > especially during a hibernation (which can be a _very_ deep sleep). > > > > One thing that could be done is for filesystems to verify, after a > > system sleep, that their superblocks haven't changed. There could > > still be issues with non-mounted partitions, if they have live entries > > in the block cache, but it would be an improvement. > > > > Do you know the right people to mention this to? Anybody in filesystem > > development interested in suspend/hibernation issues? I don't really think so (but I may be wrong ...) > IMHO the way would be to try to unmount fs if it's possible - if not - > user should be notified on suspend/hibernation that he must preserve > media in its place after resume and it should be checked and user > should be notified if different devices/fs were find... This is a very long standing issue which IMO can only be solved by making filesystems suspend-aware. Thanks, Rafael ^ permalink raw reply [flat|nested] 92+ messages in thread
* Re: Bugs in MMC [was: [Bug 10030] Suspend doesn't work when SD card is inserted] 2008-03-03 21:59 ` David Brownell 2008-03-04 6:03 ` Pierre Ossman @ 2008-03-04 17:50 ` Alan Stern 1 sibling, 0 replies; 92+ messages in thread From: Alan Stern @ 2008-03-04 17:50 UTC (permalink / raw) To: David Brownell Cc: Pierre Ossman, Rafael J. Wysocki, pm list, Zdenek Kabelac, Kernel development list On Mon, 3 Mar 2008, David Brownell wrote: > Card insert/remove events can be system wake events though. Which > makes that restriction impractical. > > I think hosts need to be able to call mmc_detect_change() as soon as > they see a stable signal. The MMC core can hold off handling that > for a while, if it needs to wait until the code walking the device > tree gets around to resuming that host. It's a lot more natural to > hold off such stuff one time there than in N host drivers; especially > since the MMC core already has such hold-off code. That's what ended up happening. The workqueue used by mmc_detect_change() was made freezable, so hosts could call the routine at any time but it wouldn't do anything until the system sleep was over. A more flexible approach would avoid freezing the workqueue, and allow it to process card removals at any time. But card insertions would be ignored if the mmc_host device was suspended; at resume time the core probes for changes that occurred during the sleep. Alan Stern ^ permalink raw reply [flat|nested] 92+ messages in thread
* Re: [Bug 10030] Suspend doesn't work when SD card is inserted 2008-02-24 3:25 ` Alan Stern 2008-02-24 4:26 ` [linux-pm] " Alan Stern @ 2008-02-24 13:33 ` Rafael J. Wysocki 2008-02-24 20:25 ` Rafael J. Wysocki 2008-02-24 13:51 ` Rafael J. Wysocki 2008-02-24 18:21 ` Pavel Machek 3 siblings, 1 reply; 92+ messages in thread From: Rafael J. Wysocki @ 2008-02-24 13:33 UTC (permalink / raw) To: Alan Stern Cc: Pierre Ossman, Zdenek Kabelac, Kernel development list, pm list On Sunday, 24 of February 2008, Alan Stern wrote: > On Sun, 24 Feb 2008, Rafael J. Wysocki wrote: > > > On Sunday, 24 of February 2008, Alan Stern wrote: > > > On Sat, 23 Feb 2008, Rafael J. Wysocki wrote: > > > > > > > Ultimately no, it's not. However, we are now late in the -rc2 time frame and > > > > the release of -rc3 seems to be imminent. At this point, IMO, that's the > > > > safest thing to do. BTW, appended is the patch I'd like to get applied. > > > > > > In the interest of having a safe release, I guess you're right. > > > > That's what I'm concerned about at the moment. I'm afraid there won't be > > enough time to nail down all the issues (there may be some we're not even > > aware of). > > All right. You'll need to enlarge your big reversal patch by reverting > commit 4145ed6dc597a9bea5f6ae8c574653b2de10620f. Thanks for the hint. In fact, I had to add a couple of fixes in device_power_down() and dpm_suspend(). The entire patch is appended. I'll comment your new patch in a separate message. Thanks, Rafael --- drivers/base/power/main.c | 97 +++------------------------------------------- drivers/usb/core/usb.c | 2 2 files changed, 8 insertions(+), 91 deletions(-) Index: linux-2.6/drivers/base/power/main.c =================================================================== --- linux-2.6.orig/drivers/base/power/main.c +++ linux-2.6/drivers/base/power/main.c @@ -48,7 +48,6 @@ */ LIST_HEAD(dpm_active); -static LIST_HEAD(dpm_locked); static LIST_HEAD(dpm_off); static LIST_HEAD(dpm_off_irq); static LIST_HEAD(dpm_destroy); @@ -81,28 +80,6 @@ void device_pm_add(struct device *dev) */ void device_pm_remove(struct device *dev) { - /* - * If this function is called during a suspend, it will be blocked, - * because we're holding the device's semaphore at that time, which may - * lead to a deadlock. In that case we want to print a warning. - * However, it may also be called by unregister_dropped_devices() with - * the device's semaphore released, in which case the warning should - * not be printed. - */ - if (down_trylock(&dev->sem)) { - if (down_read_trylock(&pm_sleep_rwsem)) { - /* No suspend in progress, wait on dev->sem */ - down(&dev->sem); - up_read(&pm_sleep_rwsem); - } else { - /* Suspend in progress, we may deadlock */ - dev_warn(dev, "Suspicious %s during suspend\n", - __FUNCTION__); - dump_stack(); - /* The user has been warned ... */ - down(&dev->sem); - } - } pr_debug("PM: Removing info for %s:%s\n", dev->bus ? dev->bus->name : "No Bus", kobject_name(&dev->kobj)); @@ -110,7 +87,6 @@ void device_pm_remove(struct device *dev dpm_sysfs_remove(dev); list_del_init(&dev->power.entry); mutex_unlock(&dpm_list_mtx); - up(&dev->sem); } /** @@ -266,7 +242,7 @@ static void dpm_resume(void) struct list_head *entry = dpm_off.next; struct device *dev = to_device(entry); - list_move_tail(entry, &dpm_locked); + list_move_tail(entry, &dpm_active); mutex_unlock(&dpm_list_mtx); resume_device(dev); mutex_lock(&dpm_list_mtx); @@ -275,25 +251,6 @@ static void dpm_resume(void) } /** - * unlock_all_devices - Release each device's semaphore - * - * Go through the dpm_off list. Put each device on the dpm_active - * list and unlock it. - */ -static void unlock_all_devices(void) -{ - mutex_lock(&dpm_list_mtx); - while (!list_empty(&dpm_locked)) { - struct list_head *entry = dpm_locked.prev; - struct device *dev = to_device(entry); - - list_move(entry, &dpm_active); - up(&dev->sem); - } - mutex_unlock(&dpm_list_mtx); -} - -/** * unregister_dropped_devices - Unregister devices scheduled for removal * * Unregister all devices on the dpm_destroy list. @@ -305,7 +262,6 @@ static void unregister_dropped_devices(v struct list_head *entry = dpm_destroy.next; struct device *dev = to_device(entry); - up(&dev->sem); mutex_unlock(&dpm_list_mtx); /* This also removes the device from the list */ device_unregister(dev); @@ -324,7 +280,6 @@ void device_resume(void) { might_sleep(); dpm_resume(); - unlock_all_devices(); unregister_dropped_devices(); up_write(&pm_sleep_rwsem); } @@ -388,18 +343,15 @@ int device_power_down(pm_message_t state struct list_head *entry = dpm_off.prev; struct device *dev = to_device(entry); - list_del_init(&dev->power.entry); error = suspend_device_late(dev, state); if (error) { printk(KERN_ERR "Could not power down device %s: " "error %d\n", kobject_name(&dev->kobj), error); - if (list_empty(&dev->power.entry)) - list_add(&dev->power.entry, &dpm_off); break; } - if (list_empty(&dev->power.entry)) - list_add(&dev->power.entry, &dpm_off_irq); + if (!list_empty(&dev->power.entry)) + list_move(&dev->power.entry, &dpm_off_irq); } if (!error) @@ -461,11 +413,10 @@ static int dpm_suspend(pm_message_t stat int error = 0; mutex_lock(&dpm_list_mtx); - while (!list_empty(&dpm_locked)) { - struct list_head *entry = dpm_locked.prev; + while (!list_empty(&dpm_active)) { + struct list_head *entry = dpm_active.prev; struct device *dev = to_device(entry); - list_del_init(&dev->power.entry); mutex_unlock(&dpm_list_mtx); error = suspend_device(dev, state); if (error) { @@ -476,14 +427,11 @@ static int dpm_suspend(pm_message_t stat (error == -EAGAIN ? " (please convert to suspend_late)" : "")); - mutex_lock(&dpm_list_mtx); - if (list_empty(&dev->power.entry)) - list_add(&dev->power.entry, &dpm_locked); break; } mutex_lock(&dpm_list_mtx); - if (list_empty(&dev->power.entry)) - list_add(&dev->power.entry, &dpm_off); + if (!list_empty(&dev->power.entry)) + list_move(&dev->power.entry, &dpm_off); } mutex_unlock(&dpm_list_mtx); @@ -491,36 +439,6 @@ static int dpm_suspend(pm_message_t stat } /** - * lock_all_devices - Acquire every device's semaphore - * - * Go through the dpm_active list. Carefully lock each device's - * semaphore and put it in on the dpm_locked list. - */ -static void lock_all_devices(void) -{ - mutex_lock(&dpm_list_mtx); - while (!list_empty(&dpm_active)) { - struct list_head *entry = dpm_active.next; - struct device *dev = to_device(entry); - - /* Required locking order is dev->sem first, - * then dpm_list_mutex. Hence this awkward code. - */ - get_device(dev); - mutex_unlock(&dpm_list_mtx); - down(&dev->sem); - mutex_lock(&dpm_list_mtx); - - if (list_empty(entry)) - up(&dev->sem); /* Device was removed */ - else - list_move_tail(entry, &dpm_locked); - put_device(dev); - } - mutex_unlock(&dpm_list_mtx); -} - -/** * device_suspend - Save state and stop all devices in system. * @state: new power management state * @@ -533,7 +451,6 @@ int device_suspend(pm_message_t state) might_sleep(); down_write(&pm_sleep_rwsem); - lock_all_devices(); error = dpm_suspend(state); if (error) device_resume(); Index: linux-2.6/drivers/usb/core/usb.c =================================================================== --- linux-2.6.orig/drivers/usb/core/usb.c +++ linux-2.6/drivers/usb/core/usb.c @@ -234,7 +234,7 @@ static int ksuspend_usb_init(void) * singlethreaded. Its job doesn't justify running on more * than one CPU. */ - ksuspend_usb_wq = create_singlethread_workqueue("ksuspend_usbd"); + ksuspend_usb_wq = create_freezeable_workqueue("ksuspend_usbd"); if (!ksuspend_usb_wq) return -ENOMEM; return 0; ^ permalink raw reply [flat|nested] 92+ messages in thread
* Re: [Bug 10030] Suspend doesn't work when SD card is inserted 2008-02-24 13:33 ` [Bug 10030] Suspend doesn't work when SD card is inserted Rafael J. Wysocki @ 2008-02-24 20:25 ` Rafael J. Wysocki 2008-02-24 20:45 ` Alan Stern 0 siblings, 1 reply; 92+ messages in thread From: Rafael J. Wysocki @ 2008-02-24 20:25 UTC (permalink / raw) To: Alan Stern Cc: Pierre Ossman, Zdenek Kabelac, Kernel development list, pm list, Pavel Machek On Sunday, 24 of February 2008, Rafael J. Wysocki wrote: > On Sunday, 24 of February 2008, Alan Stern wrote: > > On Sun, 24 Feb 2008, Rafael J. Wysocki wrote: > > > > > On Sunday, 24 of February 2008, Alan Stern wrote: > > > > On Sat, 23 Feb 2008, Rafael J. Wysocki wrote: > > > > > > > > > Ultimately no, it's not. However, we are now late in the -rc2 time frame and > > > > > the release of -rc3 seems to be imminent. At this point, IMO, that's the > > > > > safest thing to do. BTW, appended is the patch I'd like to get applied. > > > > > > > > In the interest of having a safe release, I guess you're right. > > > > > > That's what I'm concerned about at the moment. I'm afraid there won't be > > > enough time to nail down all the issues (there may be some we're not even > > > aware of). > > > > All right. You'll need to enlarge your big reversal patch by reverting > > commit 4145ed6dc597a9bea5f6ae8c574653b2de10620f. > > Thanks for the hint. > > In fact, I had to add a couple of fixes in device_power_down() and > dpm_suspend(). The entire patch is appended. I hope it's fine by everyone to push the patch below to Greg for 2.6.25. Thanks, Rafael --- From: Rafael J. Wysocki <rjw@sisk.pl> Remove the code that acquires all device semaphores from the suspend code path as it causes multiple problems to appear (most notably, http://bugzilla.kernel.org/show_bug.cgi?id=10030) and revert the change introduced by commit 4145ed6dc597a9bea5f6ae8c574653b2de10620f depending on the code being removed. Remove pm_sleep_lock()/pm_sleep_unlock() from device add to avoid the issue reported at http://bugzilla.kernel.org/show_bug.cgi?id=9874. Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> --- drivers/base/core.c | 8 --- drivers/base/power/main.c | 97 +++------------------------------------------- drivers/usb/core/usb.c | 2 3 files changed, 8 insertions(+), 99 deletions(-) Index: linux-2.6/drivers/base/power/main.c =================================================================== --- linux-2.6.orig/drivers/base/power/main.c +++ linux-2.6/drivers/base/power/main.c @@ -48,7 +48,6 @@ */ LIST_HEAD(dpm_active); -static LIST_HEAD(dpm_locked); static LIST_HEAD(dpm_off); static LIST_HEAD(dpm_off_irq); static LIST_HEAD(dpm_destroy); @@ -81,28 +80,6 @@ void device_pm_add(struct device *dev) */ void device_pm_remove(struct device *dev) { - /* - * If this function is called during a suspend, it will be blocked, - * because we're holding the device's semaphore at that time, which may - * lead to a deadlock. In that case we want to print a warning. - * However, it may also be called by unregister_dropped_devices() with - * the device's semaphore released, in which case the warning should - * not be printed. - */ - if (down_trylock(&dev->sem)) { - if (down_read_trylock(&pm_sleep_rwsem)) { - /* No suspend in progress, wait on dev->sem */ - down(&dev->sem); - up_read(&pm_sleep_rwsem); - } else { - /* Suspend in progress, we may deadlock */ - dev_warn(dev, "Suspicious %s during suspend\n", - __FUNCTION__); - dump_stack(); - /* The user has been warned ... */ - down(&dev->sem); - } - } pr_debug("PM: Removing info for %s:%s\n", dev->bus ? dev->bus->name : "No Bus", kobject_name(&dev->kobj)); @@ -110,7 +87,6 @@ void device_pm_remove(struct device *dev dpm_sysfs_remove(dev); list_del_init(&dev->power.entry); mutex_unlock(&dpm_list_mtx); - up(&dev->sem); } /** @@ -266,7 +242,7 @@ static void dpm_resume(void) struct list_head *entry = dpm_off.next; struct device *dev = to_device(entry); - list_move_tail(entry, &dpm_locked); + list_move_tail(entry, &dpm_active); mutex_unlock(&dpm_list_mtx); resume_device(dev); mutex_lock(&dpm_list_mtx); @@ -275,25 +251,6 @@ static void dpm_resume(void) } /** - * unlock_all_devices - Release each device's semaphore - * - * Go through the dpm_off list. Put each device on the dpm_active - * list and unlock it. - */ -static void unlock_all_devices(void) -{ - mutex_lock(&dpm_list_mtx); - while (!list_empty(&dpm_locked)) { - struct list_head *entry = dpm_locked.prev; - struct device *dev = to_device(entry); - - list_move(entry, &dpm_active); - up(&dev->sem); - } - mutex_unlock(&dpm_list_mtx); -} - -/** * unregister_dropped_devices - Unregister devices scheduled for removal * * Unregister all devices on the dpm_destroy list. @@ -305,7 +262,6 @@ static void unregister_dropped_devices(v struct list_head *entry = dpm_destroy.next; struct device *dev = to_device(entry); - up(&dev->sem); mutex_unlock(&dpm_list_mtx); /* This also removes the device from the list */ device_unregister(dev); @@ -324,7 +280,6 @@ void device_resume(void) { might_sleep(); dpm_resume(); - unlock_all_devices(); unregister_dropped_devices(); up_write(&pm_sleep_rwsem); } @@ -388,18 +343,15 @@ int device_power_down(pm_message_t state struct list_head *entry = dpm_off.prev; struct device *dev = to_device(entry); - list_del_init(&dev->power.entry); error = suspend_device_late(dev, state); if (error) { printk(KERN_ERR "Could not power down device %s: " "error %d\n", kobject_name(&dev->kobj), error); - if (list_empty(&dev->power.entry)) - list_add(&dev->power.entry, &dpm_off); break; } - if (list_empty(&dev->power.entry)) - list_add(&dev->power.entry, &dpm_off_irq); + if (!list_empty(&dev->power.entry)) + list_move(&dev->power.entry, &dpm_off_irq); } if (!error) @@ -461,11 +413,10 @@ static int dpm_suspend(pm_message_t stat int error = 0; mutex_lock(&dpm_list_mtx); - while (!list_empty(&dpm_locked)) { - struct list_head *entry = dpm_locked.prev; + while (!list_empty(&dpm_active)) { + struct list_head *entry = dpm_active.prev; struct device *dev = to_device(entry); - list_del_init(&dev->power.entry); mutex_unlock(&dpm_list_mtx); error = suspend_device(dev, state); if (error) { @@ -476,14 +427,11 @@ static int dpm_suspend(pm_message_t stat (error == -EAGAIN ? " (please convert to suspend_late)" : "")); - mutex_lock(&dpm_list_mtx); - if (list_empty(&dev->power.entry)) - list_add(&dev->power.entry, &dpm_locked); break; } mutex_lock(&dpm_list_mtx); - if (list_empty(&dev->power.entry)) - list_add(&dev->power.entry, &dpm_off); + if (!list_empty(&dev->power.entry)) + list_move(&dev->power.entry, &dpm_off); } mutex_unlock(&dpm_list_mtx); @@ -491,36 +439,6 @@ static int dpm_suspend(pm_message_t stat } /** - * lock_all_devices - Acquire every device's semaphore - * - * Go through the dpm_active list. Carefully lock each device's - * semaphore and put it in on the dpm_locked list. - */ -static void lock_all_devices(void) -{ - mutex_lock(&dpm_list_mtx); - while (!list_empty(&dpm_active)) { - struct list_head *entry = dpm_active.next; - struct device *dev = to_device(entry); - - /* Required locking order is dev->sem first, - * then dpm_list_mutex. Hence this awkward code. - */ - get_device(dev); - mutex_unlock(&dpm_list_mtx); - down(&dev->sem); - mutex_lock(&dpm_list_mtx); - - if (list_empty(entry)) - up(&dev->sem); /* Device was removed */ - else - list_move_tail(entry, &dpm_locked); - put_device(dev); - } - mutex_unlock(&dpm_list_mtx); -} - -/** * device_suspend - Save state and stop all devices in system. * @state: new power management state * @@ -533,7 +451,6 @@ int device_suspend(pm_message_t state) might_sleep(); down_write(&pm_sleep_rwsem); - lock_all_devices(); error = dpm_suspend(state); if (error) device_resume(); Index: linux-2.6/drivers/usb/core/usb.c =================================================================== --- linux-2.6.orig/drivers/usb/core/usb.c +++ linux-2.6/drivers/usb/core/usb.c @@ -234,7 +234,7 @@ static int ksuspend_usb_init(void) * singlethreaded. Its job doesn't justify running on more * than one CPU. */ - ksuspend_usb_wq = create_singlethread_workqueue("ksuspend_usbd"); + ksuspend_usb_wq = create_freezeable_workqueue("ksuspend_usbd"); if (!ksuspend_usb_wq) return -ENOMEM; return 0; Index: linux-2.6/drivers/base/core.c =================================================================== --- linux-2.6.orig/drivers/base/core.c +++ linux-2.6/drivers/base/core.c @@ -770,13 +770,6 @@ int device_add(struct device *dev) struct class_interface *class_intf; int error; - error = pm_sleep_lock(); - if (error) { - dev_warn(dev, "Suspicious %s during suspend\n", __FUNCTION__); - dump_stack(); - return error; - } - dev = get_device(dev); if (!dev || !strlen(dev->bus_id)) { error = -EINVAL; @@ -843,7 +836,6 @@ int device_add(struct device *dev) } Done: put_device(dev); - pm_sleep_unlock(); return error; BusError: device_pm_remove(dev); ^ permalink raw reply [flat|nested] 92+ messages in thread
* Re: [Bug 10030] Suspend doesn't work when SD card is inserted 2008-02-24 20:25 ` Rafael J. Wysocki @ 2008-02-24 20:45 ` Alan Stern 2008-02-24 20:56 ` Rafael J. Wysocki 0 siblings, 1 reply; 92+ messages in thread From: Alan Stern @ 2008-02-24 20:45 UTC (permalink / raw) To: Rafael J. Wysocki Cc: Pierre Ossman, Zdenek Kabelac, Kernel development list, pm list, Pavel Machek On Sun, 24 Feb 2008, Rafael J. Wysocki wrote: > From: Rafael J. Wysocki <rjw@sisk.pl> > > Remove the code that acquires all device semaphores from the suspend > code path as it causes multiple problems to appear (most notably, > http://bugzilla.kernel.org/show_bug.cgi?id=10030) and revert the > change introduced by commit 4145ed6dc597a9bea5f6ae8c574653b2de10620f > depending on the code being removed. > > Remove pm_sleep_lock()/pm_sleep_unlock() from device add to avoid > the issue reported at http://bugzilla.kernel.org/show_bug.cgi?id=9874. Do you actually know whether removing that lock fixes the reported bug? I agree, the lock should be removed for now. But I'd sure like to get some feedback about what's going wrong. It's starting to look as though we'd be a lot better off blocking device registration during sleep instead of failing it. Shouldn't resume_device() and suspend_device() now acquire the device semaphore before calling the various methods? That's the way they used to work. Alan Stern ^ permalink raw reply [flat|nested] 92+ messages in thread
* Re: [Bug 10030] Suspend doesn't work when SD card is inserted 2008-02-24 20:45 ` Alan Stern @ 2008-02-24 20:56 ` Rafael J. Wysocki 2008-02-24 21:11 ` Alan Stern 0 siblings, 1 reply; 92+ messages in thread From: Rafael J. Wysocki @ 2008-02-24 20:56 UTC (permalink / raw) To: Alan Stern Cc: Pierre Ossman, Zdenek Kabelac, Kernel development list, pm list, Pavel Machek On Sunday, 24 of February 2008, Alan Stern wrote: > On Sun, 24 Feb 2008, Rafael J. Wysocki wrote: > > > From: Rafael J. Wysocki <rjw@sisk.pl> > > > > Remove the code that acquires all device semaphores from the suspend > > code path as it causes multiple problems to appear (most notably, > > http://bugzilla.kernel.org/show_bug.cgi?id=10030) and revert the > > change introduced by commit 4145ed6dc597a9bea5f6ae8c574653b2de10620f > > depending on the code being removed. > > > > Remove pm_sleep_lock()/pm_sleep_unlock() from device add to avoid > > the issue reported at http://bugzilla.kernel.org/show_bug.cgi?id=9874. > > Do you actually know whether removing that lock fixes the reported bug? In fact I'm waiting for the confirmation, but given the symptoms described by Lukas I'm quite confident that the problem shouldn't be reproducible with the lock removed. > I agree, the lock should be removed for now. But I'd sure like to get > some feedback about what's going wrong. No question about that. > It's starting to look as though we'd be a lot better off blocking device > registration during sleep instead of failing it. Agreed. > Shouldn't resume_device() and suspend_device() now acquire the device > semaphore before calling the various methods? That's the way they > used to work. Yes, thanks for reminding me about that. Updated patch follows. Thanks, Rafael --- From: Rafael J. Wysocki <rjw@sisk.pl> Remove the code that acquires all device semaphores from the suspend code path as it causes multiple problems to appear (most notably, http://bugzilla.kernel.org/show_bug.cgi?id=10030) and revert the change introduced by commit 4145ed6dc597a9bea5f6ae8c574653b2de10620f depending on the code being removed. Remove pm_sleep_lock()/pm_sleep_unlock() from device_add() to avoid the issue reported at http://bugzilla.kernel.org/show_bug.cgi?id=9874. Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> --- drivers/base/core.c | 8 --- drivers/base/power/main.c | 106 ++++++---------------------------------------- drivers/usb/core/usb.c | 2 3 files changed, 17 insertions(+), 99 deletions(-) Index: linux-2.6/drivers/base/power/main.c =================================================================== --- linux-2.6.orig/drivers/base/power/main.c +++ linux-2.6/drivers/base/power/main.c @@ -48,7 +48,6 @@ */ LIST_HEAD(dpm_active); -static LIST_HEAD(dpm_locked); static LIST_HEAD(dpm_off); static LIST_HEAD(dpm_off_irq); static LIST_HEAD(dpm_destroy); @@ -81,28 +80,6 @@ void device_pm_add(struct device *dev) */ void device_pm_remove(struct device *dev) { - /* - * If this function is called during a suspend, it will be blocked, - * because we're holding the device's semaphore at that time, which may - * lead to a deadlock. In that case we want to print a warning. - * However, it may also be called by unregister_dropped_devices() with - * the device's semaphore released, in which case the warning should - * not be printed. - */ - if (down_trylock(&dev->sem)) { - if (down_read_trylock(&pm_sleep_rwsem)) { - /* No suspend in progress, wait on dev->sem */ - down(&dev->sem); - up_read(&pm_sleep_rwsem); - } else { - /* Suspend in progress, we may deadlock */ - dev_warn(dev, "Suspicious %s during suspend\n", - __FUNCTION__); - dump_stack(); - /* The user has been warned ... */ - down(&dev->sem); - } - } pr_debug("PM: Removing info for %s:%s\n", dev->bus ? dev->bus->name : "No Bus", kobject_name(&dev->kobj)); @@ -110,7 +87,6 @@ void device_pm_remove(struct device *dev dpm_sysfs_remove(dev); list_del_init(&dev->power.entry); mutex_unlock(&dpm_list_mtx); - up(&dev->sem); } /** @@ -230,6 +206,8 @@ static int resume_device(struct device * TRACE_DEVICE(dev); TRACE_RESUME(0); + down(&dev->sem); + if (dev->bus && dev->bus->resume) { dev_dbg(dev,"resuming\n"); error = dev->bus->resume(dev); @@ -245,6 +223,8 @@ static int resume_device(struct device * error = dev->class->resume(dev); } + up(&dev->sem); + TRACE_RESUME(error); return error; } @@ -266,7 +246,7 @@ static void dpm_resume(void) struct list_head *entry = dpm_off.next; struct device *dev = to_device(entry); - list_move_tail(entry, &dpm_locked); + list_move_tail(entry, &dpm_active); mutex_unlock(&dpm_list_mtx); resume_device(dev); mutex_lock(&dpm_list_mtx); @@ -275,25 +255,6 @@ static void dpm_resume(void) } /** - * unlock_all_devices - Release each device's semaphore - * - * Go through the dpm_off list. Put each device on the dpm_active - * list and unlock it. - */ -static void unlock_all_devices(void) -{ - mutex_lock(&dpm_list_mtx); - while (!list_empty(&dpm_locked)) { - struct list_head *entry = dpm_locked.prev; - struct device *dev = to_device(entry); - - list_move(entry, &dpm_active); - up(&dev->sem); - } - mutex_unlock(&dpm_list_mtx); -} - -/** * unregister_dropped_devices - Unregister devices scheduled for removal * * Unregister all devices on the dpm_destroy list. @@ -305,7 +266,6 @@ static void unregister_dropped_devices(v struct list_head *entry = dpm_destroy.next; struct device *dev = to_device(entry); - up(&dev->sem); mutex_unlock(&dpm_list_mtx); /* This also removes the device from the list */ device_unregister(dev); @@ -324,7 +284,6 @@ void device_resume(void) { might_sleep(); dpm_resume(); - unlock_all_devices(); unregister_dropped_devices(); up_write(&pm_sleep_rwsem); } @@ -388,18 +347,15 @@ int device_power_down(pm_message_t state struct list_head *entry = dpm_off.prev; struct device *dev = to_device(entry); - list_del_init(&dev->power.entry); error = suspend_device_late(dev, state); if (error) { printk(KERN_ERR "Could not power down device %s: " "error %d\n", kobject_name(&dev->kobj), error); - if (list_empty(&dev->power.entry)) - list_add(&dev->power.entry, &dpm_off); break; } - if (list_empty(&dev->power.entry)) - list_add(&dev->power.entry, &dpm_off_irq); + if (!list_empty(&dev->power.entry)) + list_move(&dev->power.entry, &dpm_off_irq); } if (!error) @@ -419,6 +375,8 @@ static int suspend_device(struct device { int error = 0; + down(&dev->sem); + if (dev->power.power_state.event) { dev_dbg(dev, "PM: suspend %d-->%d\n", dev->power.power_state.event, state.event); @@ -441,6 +399,9 @@ static int suspend_device(struct device error = dev->bus->suspend(dev, state); suspend_report_result(dev->bus->suspend, error); } + + up(&dev->sem); + return error; } @@ -461,11 +422,10 @@ static int dpm_suspend(pm_message_t stat int error = 0; mutex_lock(&dpm_list_mtx); - while (!list_empty(&dpm_locked)) { - struct list_head *entry = dpm_locked.prev; + while (!list_empty(&dpm_active)) { + struct list_head *entry = dpm_active.prev; struct device *dev = to_device(entry); - list_del_init(&dev->power.entry); mutex_unlock(&dpm_list_mtx); error = suspend_device(dev, state); if (error) { @@ -476,14 +436,11 @@ static int dpm_suspend(pm_message_t stat (error == -EAGAIN ? " (please convert to suspend_late)" : "")); - mutex_lock(&dpm_list_mtx); - if (list_empty(&dev->power.entry)) - list_add(&dev->power.entry, &dpm_locked); break; } mutex_lock(&dpm_list_mtx); - if (list_empty(&dev->power.entry)) - list_add(&dev->power.entry, &dpm_off); + if (!list_empty(&dev->power.entry)) + list_move(&dev->power.entry, &dpm_off); } mutex_unlock(&dpm_list_mtx); @@ -491,36 +448,6 @@ static int dpm_suspend(pm_message_t stat } /** - * lock_all_devices - Acquire every device's semaphore - * - * Go through the dpm_active list. Carefully lock each device's - * semaphore and put it in on the dpm_locked list. - */ -static void lock_all_devices(void) -{ - mutex_lock(&dpm_list_mtx); - while (!list_empty(&dpm_active)) { - struct list_head *entry = dpm_active.next; - struct device *dev = to_device(entry); - - /* Required locking order is dev->sem first, - * then dpm_list_mutex. Hence this awkward code. - */ - get_device(dev); - mutex_unlock(&dpm_list_mtx); - down(&dev->sem); - mutex_lock(&dpm_list_mtx); - - if (list_empty(entry)) - up(&dev->sem); /* Device was removed */ - else - list_move_tail(entry, &dpm_locked); - put_device(dev); - } - mutex_unlock(&dpm_list_mtx); -} - -/** * device_suspend - Save state and stop all devices in system. * @state: new power management state * @@ -533,7 +460,6 @@ int device_suspend(pm_message_t state) might_sleep(); down_write(&pm_sleep_rwsem); - lock_all_devices(); error = dpm_suspend(state); if (error) device_resume(); Index: linux-2.6/drivers/usb/core/usb.c =================================================================== --- linux-2.6.orig/drivers/usb/core/usb.c +++ linux-2.6/drivers/usb/core/usb.c @@ -234,7 +234,7 @@ static int ksuspend_usb_init(void) * singlethreaded. Its job doesn't justify running on more * than one CPU. */ - ksuspend_usb_wq = create_singlethread_workqueue("ksuspend_usbd"); + ksuspend_usb_wq = create_freezeable_workqueue("ksuspend_usbd"); if (!ksuspend_usb_wq) return -ENOMEM; return 0; Index: linux-2.6/drivers/base/core.c =================================================================== --- linux-2.6.orig/drivers/base/core.c +++ linux-2.6/drivers/base/core.c @@ -770,13 +770,6 @@ int device_add(struct device *dev) struct class_interface *class_intf; int error; - error = pm_sleep_lock(); - if (error) { - dev_warn(dev, "Suspicious %s during suspend\n", __FUNCTION__); - dump_stack(); - return error; - } - dev = get_device(dev); if (!dev || !strlen(dev->bus_id)) { error = -EINVAL; @@ -843,7 +836,6 @@ int device_add(struct device *dev) } Done: put_device(dev); - pm_sleep_unlock(); return error; BusError: device_pm_remove(dev); ^ permalink raw reply [flat|nested] 92+ messages in thread
* Re: [Bug 10030] Suspend doesn't work when SD card is inserted 2008-02-24 20:56 ` Rafael J. Wysocki @ 2008-02-24 21:11 ` Alan Stern 2008-02-24 22:18 ` Rafael J. Wysocki 0 siblings, 1 reply; 92+ messages in thread From: Alan Stern @ 2008-02-24 21:11 UTC (permalink / raw) To: Rafael J. Wysocki Cc: Pierre Ossman, Zdenek Kabelac, Kernel development list, pm list, Pavel Machek On Sun, 24 Feb 2008, Rafael J. Wysocki wrote: > From: Rafael J. Wysocki <rjw@sisk.pl> > > Remove the code that acquires all device semaphores from the suspend > code path as it causes multiple problems to appear (most notably, > http://bugzilla.kernel.org/show_bug.cgi?id=10030) and revert the > change introduced by commit 4145ed6dc597a9bea5f6ae8c574653b2de10620f > depending on the code being removed. > > Remove pm_sleep_lock()/pm_sleep_unlock() from device_add() to avoid > the issue reported at http://bugzilla.kernel.org/show_bug.cgi?id=9874. This looks okay. Are you going to submit it just for 2.6.25, leaving the existing code as-is for 2.6.26? Alan Stern ^ permalink raw reply [flat|nested] 92+ messages in thread
* Re: [Bug 10030] Suspend doesn't work when SD card is inserted 2008-02-24 21:11 ` Alan Stern @ 2008-02-24 22:18 ` Rafael J. Wysocki 0 siblings, 0 replies; 92+ messages in thread From: Rafael J. Wysocki @ 2008-02-24 22:18 UTC (permalink / raw) To: Alan Stern Cc: Pierre Ossman, Zdenek Kabelac, Kernel development list, pm list, Pavel Machek On Sunday, 24 of February 2008, Alan Stern wrote: > On Sun, 24 Feb 2008, Rafael J. Wysocki wrote: > > > From: Rafael J. Wysocki <rjw@sisk.pl> > > > > Remove the code that acquires all device semaphores from the suspend > > code path as it causes multiple problems to appear (most notably, > > http://bugzilla.kernel.org/show_bug.cgi?id=10030) and revert the > > change introduced by commit 4145ed6dc597a9bea5f6ae8c574653b2de10620f > > depending on the code being removed. > > > > Remove pm_sleep_lock()/pm_sleep_unlock() from device_add() to avoid > > the issue reported at http://bugzilla.kernel.org/show_bug.cgi?id=9874. > > This looks okay. > > Are you going to submit it just for 2.6.25, leaving the existing code > as-is for 2.6.26? No, I'd like to start over on top of 2.6.25. That will be much cleaner from the upstream point of view. We seem to have all the pieces anyway, so that's only a matter of putting them back together. Thanks, Rafael ^ permalink raw reply [flat|nested] 92+ messages in thread
* Re: [Bug 10030] Suspend doesn't work when SD card is inserted 2008-02-24 3:25 ` Alan Stern 2008-02-24 4:26 ` [linux-pm] " Alan Stern 2008-02-24 13:33 ` [Bug 10030] Suspend doesn't work when SD card is inserted Rafael J. Wysocki @ 2008-02-24 13:51 ` Rafael J. Wysocki 2008-02-24 19:27 ` Alan Stern 2008-02-24 18:21 ` Pavel Machek 3 siblings, 1 reply; 92+ messages in thread From: Rafael J. Wysocki @ 2008-02-24 13:51 UTC (permalink / raw) To: Alan Stern Cc: Pierre Ossman, Zdenek Kabelac, Kernel development list, pm list On Sunday, 24 of February 2008, Alan Stern wrote: > On Sun, 24 Feb 2008, Rafael J. Wysocki wrote: > > On Sunday, 24 of February 2008, Alan Stern wrote: > > > On Sat, 23 Feb 2008, Rafael J. Wysocki wrote: > > > > I still have no clear idea about what's going on with the ACPI bug in > > > #9874. > > > > It seems that the ACPI code is not prepared to cope with a failing device > > registration, in which case it'd need fixing. Frankly, I'm afraid there may > > be more issues like this throughout the tree. > > The fact remains that it is unsafe to register a device during a sleep > transition, although if the device's driver doesn't have a suspend or > resume method you can probably get away with it. > > > > However perhaps the bug wouldn't occur if we blocked device > > > registration until after the resume was finished, instead of failing it > > > outright. Since the registration in this case was done in a workqueue > > > thread, blocking wouldn't cause an immediate deadlock. > > > > No, it wouldn't, but the fact that it happens in the ACPI code makes me worry. > > It happened in a workqueue. There could be lots of similar cases: Some > interrupt-driven event causes a hotplug action. Since the action can't > be carried out in interrupt context, the driver has no choice but to > defer it to a workqueue or kernel thread. Blocking that workqueue or > kernel thread won't cause a problem _unless_ the driver's > suspend/resume methods try to synchronize with it. That's the > difficult case. Yes. > > If we block that code and the things it's supposed to do turn out to be > > necessary for suspending later on, we'll be in trouble. > > Exactly. > > > > @@ -94,14 +102,15 @@ void device_pm_remove(struct device *dev > > > /* No suspend in progress, wait on dev->sem */ > > > down(&dev->sem); > > > up_read(&pm_sleep_rwsem); > > > - } else { > > > - /* Suspend in progress, we may deadlock */ > > > + } else if (!in_suspend_context()) { > > > + /* Suspending in another task, we may deadlock */ > > > > Well, that shouldn't really deadlock. If it does, there is a potential design > > issue somewhere. I think it might be better to set up a timer in here too. > > At this point the driver core owns the device semaphore, so the > unregistration task will block until the sleep is over. If the > driver's resume method has to synchronize with the unregistration task > then it will deadlock. I agree, it would be a design issue. In fact, > it's the same design issue described earlier in this email. > > > Although IMO it would be even better to just set up a timer and remove the > > warning altogehter. > > That warning was just for debugging purposes. A timer in the resume > path could do the same thing with fewer false alarms, just like the > timer in the suspend path. I have added it to the new patch. OK > > > dev_warn(dev, "Suspicious %s during suspend\n", > > > __FUNCTION__); > > > dump_stack(); > > > /* The user has been warned ... */ > > > down(&dev->sem); > > > } > > > + /* Otherwise we're okay */ > > > } > > > pr_debug("PM: Removing info for %s:%s\n", > > > dev->bus ? dev->bus->name : "No Bus", > > > > There's a problem here, because we shouldn't release the semaphore if we're > > in the suspend context. > > Yes, you're right. It's fixed in the new version. > > > > +static void suspend_timeout(unsigned long _dev) > > > +{ > > > + struct device *dev = (struct device *) _dev; > > > + > > > + dev_err(dev, "deadlock during suspend!\n"); > > > + show_state(); > > > > The show_state() seems to be overkill and won't really help if the user can't > > collect the output on the fly using a serial console or something like this. > > [The debug stuff printed here should fit a typical laptop screen, ideally.] > > Changed. > > > I'd prefer > > > > if (in_suspend_context()) { > > __device_release_driver(dev); > > } else { > > down(&dev->sem); > > __device_release_driver(dev); > > up(&dev->sem); > > } > > Okay. > > You know, with this new patch we probably don't need > device_pm_schedule_removal() any more. No, we don't. However, because of that dpm_suspend() and device_power_down() need to be changed not to assume that the removed devices will end up on dpm_destroy. > However I have left it in for now. Well, it's used by the b43, leds and hwrng drivers as well as by some CPU hotplug notifiers. They all will have to be changed along with removing it. > > Well, I'd really feel much more comfortable if we removed the troublesome code > > for 2.6.25 and reintroduced it along with the above safeguards for 2.6.26. > > > > Of course, this doesn't mean we can't send debug patches for testing to the > > users who have alraedy reported problems with it. :-) > > Certainly! > > Here's the new version. IMO you also should add this change in device_power_down(): @@ -388,18 +343,15 @@ int device_power_down(pm_message_t state struct list_head *entry = dpm_off.prev; struct device *dev = to_device(entry); - list_del_init(&dev->power.entry); error = suspend_device_late(dev, state); if (error) { printk(KERN_ERR "Could not power down device %s: " "error %d\n", kobject_name(&dev->kobj), error); - if (list_empty(&dev->power.entry)) - list_add(&dev->power.entry, &dpm_off); break; } - if (list_empty(&dev->power.entry)) - list_add(&dev->power.entry, &dpm_off_irq); + if (!list_empty(&dev->power.entry)) + list_move(&dev->power.entry, &dpm_off_irq); } if (!error) and the analogous one in dpm_suspend(). Otherwise there would be a problem if suspend_device_late() itself removed the device (same for device_suspend()). Of course, that would lead to a problem if device_pm_schedule_removal() were used for the removal, so the last "!list_empty(&dev->power.entry)" is really not sufficient - until device_pm_schedule_removal() is removed. Thanks, Rafael ^ permalink raw reply [flat|nested] 92+ messages in thread
* Re: [Bug 10030] Suspend doesn't work when SD card is inserted 2008-02-24 13:51 ` Rafael J. Wysocki @ 2008-02-24 19:27 ` Alan Stern 2008-02-24 19:42 ` Zdenek Kabelac 0 siblings, 1 reply; 92+ messages in thread From: Alan Stern @ 2008-02-24 19:27 UTC (permalink / raw) To: Rafael J. Wysocki Cc: Pierre Ossman, Zdenek Kabelac, Kernel development list, pm list On Sun, 24 Feb 2008, Rafael J. Wysocki wrote: > > You know, with this new patch we probably don't need > > device_pm_schedule_removal() any more. > > No, we don't. However, because of that dpm_suspend() and device_power_down() > need to be changed not to assume that the removed devices will end up on > dpm_destroy. Yes. The safest approach will be to check whether the device is still at the end of the dpm_locked or dpm_off list; if it hasn't then we can assume that it has been removed (or scheduled for removal). > IMO you also should add this change in device_power_down(): > > @@ -388,18 +343,15 @@ int device_power_down(pm_message_t state > struct list_head *entry = dpm_off.prev; > struct device *dev = to_device(entry); > > - list_del_init(&dev->power.entry); > error = suspend_device_late(dev, state); > if (error) { > printk(KERN_ERR "Could not power down device %s: " > "error %d\n", > kobject_name(&dev->kobj), error); > - if (list_empty(&dev->power.entry)) > - list_add(&dev->power.entry, &dpm_off); > break; > } > - if (list_empty(&dev->power.entry)) > - list_add(&dev->power.entry, &dpm_off_irq); > + if (!list_empty(&dev->power.entry)) > + list_move(&dev->power.entry, &dpm_off_irq); > } Actually I changed the last two lines to: + + /* Don't move the entry if the device has been unregistered */ + if (entry == dpm_off.prev) + list_move(entry, &dpm_off_irq); with analogous changes to dpm_suspend(). Any more suggestions for updates? Alan Stern ^ permalink raw reply [flat|nested] 92+ messages in thread
* Re: [Bug 10030] Suspend doesn't work when SD card is inserted 2008-02-24 19:27 ` Alan Stern @ 2008-02-24 19:42 ` Zdenek Kabelac 2008-02-24 20:09 ` Rafael J. Wysocki 0 siblings, 1 reply; 92+ messages in thread From: Zdenek Kabelac @ 2008-02-24 19:42 UTC (permalink / raw) To: Alan Stern Cc: Rafael J. Wysocki, Pierre Ossman, Kernel development list, pm list 2008/2/24, Alan Stern <stern@rowland.harvard.edu>: > On Sun, 24 Feb 2008, Rafael J. Wysocki wrote: > > > > > You know, with this new patch we probably don't need > > > device_pm_schedule_removal() any more. > > > > No, we don't. However, because of that dpm_suspend() and device_power_down() > > need to be changed not to assume that the removed devices will end up on > > dpm_destroy. > > > with analogous changes to dpm_suspend(). > > Any more suggestions for updates? > Is there any current patch I should actually test - it looks there is ongoing discussion from the time Rafael has proposed his last patch to check by me: http://bugzilla.kernel.org/attachment.cgi?id=14961&action=view But I think it's visible the only suspend update will not resolve my problems and the problems is deeper in the mmc. So should I still test yesterdays' patch ? Zdenek Zdenek ^ permalink raw reply [flat|nested] 92+ messages in thread
* Re: [Bug 10030] Suspend doesn't work when SD card is inserted 2008-02-24 19:42 ` Zdenek Kabelac @ 2008-02-24 20:09 ` Rafael J. Wysocki 0 siblings, 0 replies; 92+ messages in thread From: Rafael J. Wysocki @ 2008-02-24 20:09 UTC (permalink / raw) To: Zdenek Kabelac Cc: Alan Stern, Pierre Ossman, Kernel development list, pm list On Sunday, 24 of February 2008, Zdenek Kabelac wrote: > 2008/2/24, Alan Stern <stern@rowland.harvard.edu>: > > On Sun, 24 Feb 2008, Rafael J. Wysocki wrote: > > > > > > > > You know, with this new patch we probably don't need > > > > device_pm_schedule_removal() any more. > > > > > > No, we don't. However, because of that dpm_suspend() and device_power_down() > > > need to be changed not to assume that the removed devices will end up on > > > dpm_destroy. > > > > > > with analogous changes to dpm_suspend(). > > > > Any more suggestions for updates? > > > > Is there any current patch I should actually test - it looks there is > ongoing discussion from the time Rafael has proposed his last patch to > check by me: http://bugzilla.kernel.org/attachment.cgi?id=14961&action=view > > But I think it's visible the only suspend update will not resolve my > problems and the problems is deeper in the mmc. Yes, there are problems in there, but your feedback is needed nevertheless. > So should I still test yesterdays' patch ? Yes please. It will be valuable information to us whether it fixes the suspend issue at least. Thanks, Rafael ^ permalink raw reply [flat|nested] 92+ messages in thread
* Re: [Bug 10030] Suspend doesn't work when SD card is inserted 2008-02-24 3:25 ` Alan Stern ` (2 preceding siblings ...) 2008-02-24 13:51 ` Rafael J. Wysocki @ 2008-02-24 18:21 ` Pavel Machek 2008-02-24 19:03 ` Alan Stern 3 siblings, 1 reply; 92+ messages in thread From: Pavel Machek @ 2008-02-24 18:21 UTC (permalink / raw) To: Alan Stern Cc: Rafael J. Wysocki, Pierre Ossman, Zdenek Kabelac, Kernel development list, pm list Hi! > Index: usb-2.6/drivers/base/power/main.c > =================================================================== > --- usb-2.6.orig/drivers/base/power/main.c > +++ usb-2.6/drivers/base/power/main.c > @@ -25,6 +25,7 @@ > #include <linux/pm.h> > #include <linux/resume-trace.h> > #include <linux/rwsem.h> > +#include <linux/sched.h> > > #include "../base.h" > #include "power.h" > @@ -59,6 +60,13 @@ static DECLARE_RWSEM(pm_sleep_rwsem); > > int (*platform_enable_wakeup)(struct device *dev, int is_on); > > +static struct task_struct *suspending_task; What locking protects this variable? What happens when suspending_task exits? (Hmm, that would probably be bug, anyway?) Or are we running UP when this is accessed? This at least needs a big fat comment. > +bool in_suspend_context(void) > +{ > + return (suspending_task == current); > +} Pavel -- (english) http://www.livejournal.com/~pavelmachek (cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html ^ permalink raw reply [flat|nested] 92+ messages in thread
* Re: [Bug 10030] Suspend doesn't work when SD card is inserted 2008-02-24 18:21 ` Pavel Machek @ 2008-02-24 19:03 ` Alan Stern 2008-02-24 20:11 ` Pavel Machek 0 siblings, 1 reply; 92+ messages in thread From: Alan Stern @ 2008-02-24 19:03 UTC (permalink / raw) To: Pavel Machek Cc: Rafael J. Wysocki, Pierre Ossman, Zdenek Kabelac, Kernel development list, pm list On Sun, 24 Feb 2008, Pavel Machek wrote: > Hi! > > > Index: usb-2.6/drivers/base/power/main.c > > =================================================================== > > --- usb-2.6.orig/drivers/base/power/main.c > > +++ usb-2.6/drivers/base/power/main.c > > @@ -25,6 +25,7 @@ > > #include <linux/pm.h> > > #include <linux/resume-trace.h> > > #include <linux/rwsem.h> > > +#include <linux/sched.h> > > > > #include "../base.h" > > #include "power.h" > > @@ -59,6 +60,13 @@ static DECLARE_RWSEM(pm_sleep_rwsem); > > > > int (*platform_enable_wakeup)(struct device *dev, int is_on); > > > > +static struct task_struct *suspending_task; > > What locking protects this variable? What happens when suspending_task > exits? (Hmm, that would probably be bug, anyway?) It's protected by whatever existing locking scheme allows only one task to start a system sleep at a time. For example, the suspending task has to get a write lock on pm_sleep_rwsem. Yes, if the suspending task exits before the system has woken up, you're in trouble regardless. > Or are we running UP when this is accessed? This at least needs a big > fat comment. > > > +bool in_suspend_context(void) > > +{ > > + return (suspending_task == current); > > +} We aren't necessarily UP. But since all that matters is whether or not suspend_task is equal to the current task, no extra locking is needed. I'll add a comment explaining all this. Alan Stern ^ permalink raw reply [flat|nested] 92+ messages in thread
* Re: [Bug 10030] Suspend doesn't work when SD card is inserted 2008-02-24 19:03 ` Alan Stern @ 2008-02-24 20:11 ` Pavel Machek 2008-02-24 20:33 ` Alan Stern 0 siblings, 1 reply; 92+ messages in thread From: Pavel Machek @ 2008-02-24 20:11 UTC (permalink / raw) To: Alan Stern Cc: Rafael J. Wysocki, Pierre Ossman, Zdenek Kabelac, Kernel development list, pm list Hi! > > > @@ -25,6 +25,7 @@ > > > #include <linux/pm.h> > > > #include <linux/resume-trace.h> > > > #include <linux/rwsem.h> > > > +#include <linux/sched.h> > > > > > > #include "../base.h" > > > #include "power.h" > > > @@ -59,6 +60,13 @@ static DECLARE_RWSEM(pm_sleep_rwsem); > > > > > > int (*platform_enable_wakeup)(struct device *dev, int is_on); > > > > > > +static struct task_struct *suspending_task; > > > > What locking protects this variable? What happens when suspending_task > > exits? (Hmm, that would probably be bug, anyway?) > > It's protected by whatever existing locking scheme allows only one > task to start a system sleep at a time. For example, the suspending > task has to get a write lock on pm_sleep_rwsem. And readers of suspending_task are protected by? At the very least, you'd need rmb() before reading it and wmb() after writing to it, but I'm not sure if that's enough on every obscure architecture out there. Pavel -- (english) http://www.livejournal.com/~pavelmachek (cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html ^ permalink raw reply [flat|nested] 92+ messages in thread
* Re: [Bug 10030] Suspend doesn't work when SD card is inserted 2008-02-24 20:11 ` Pavel Machek @ 2008-02-24 20:33 ` Alan Stern 2008-02-24 21:42 ` Pavel Machek 0 siblings, 1 reply; 92+ messages in thread From: Alan Stern @ 2008-02-24 20:33 UTC (permalink / raw) To: Pavel Machek Cc: Rafael J. Wysocki, Pierre Ossman, Zdenek Kabelac, Kernel development list, pm list On Sun, 24 Feb 2008, Pavel Machek wrote: > > > What locking protects this variable? What happens when suspending_task > > > exits? (Hmm, that would probably be bug, anyway?) > > > > It's protected by whatever existing locking scheme allows only one > > task to start a system sleep at a time. For example, the suspending > > task has to get a write lock on pm_sleep_rwsem. > > And readers of suspending_task are protected by? I added a comment about that too. > At the very least, you'd need rmb() before reading it and wmb() after > writing to it, but I'm not sure if that's enough on every obscure > architecture out there. No, neither one is needed because of the way suspending_task is used. It's not necessary for a reader R to see the variable's actual value; all R needs to know is whether or not suspending_task is equal to R. Since the only process which can set suspending_task to R is R itself, and since R will set suspending_task back to NULL before releasing the write lock on pm_sleep_rwsem, there's never any ambiguity. Alan Stern ^ permalink raw reply [flat|nested] 92+ messages in thread
* Re: [Bug 10030] Suspend doesn't work when SD card is inserted 2008-02-24 20:33 ` Alan Stern @ 2008-02-24 21:42 ` Pavel Machek 2008-02-24 22:21 ` Rafael J. Wysocki 2008-02-25 2:19 ` Alan Stern 0 siblings, 2 replies; 92+ messages in thread From: Pavel Machek @ 2008-02-24 21:42 UTC (permalink / raw) To: Alan Stern Cc: Rafael J. Wysocki, Pierre Ossman, Zdenek Kabelac, Kernel development list, pm list On Sun 2008-02-24 15:33:01, Alan Stern wrote: > On Sun, 24 Feb 2008, Pavel Machek wrote: > > > > > What locking protects this variable? What happens when suspending_task > > > > exits? (Hmm, that would probably be bug, anyway?) > > > > > > It's protected by whatever existing locking scheme allows only one > > > task to start a system sleep at a time. For example, the suspending > > > task has to get a write lock on pm_sleep_rwsem. > > > > And readers of suspending_task are protected by? > > I added a comment about that too. > > > At the very least, you'd need rmb() before reading it and wmb() after > > writing to it, but I'm not sure if that's enough on every obscure > > architecture out there. > > No, neither one is needed because of the way suspending_task is used. > > It's not necessary for a reader R to see the variable's actual value; > all R needs to know is whether or not suspending_task is equal to R. > Since the only process which can set suspending_task to R is R itself, > and since R will set suspending_task back to NULL before releasing the > write lock on pm_sleep_rwsem, there's never any ambiguity. Subtle. Very subtly wrong ;-). imagine suspending_task == 0xabcdef01. Now task "R" with current == 0xabcd0000 reads suspending_task while the other cpu is writing to it, and sees 0xabcd0000 (0xef01 was not yet written) -- and mistakenly believes that "R" == suspending_task. I agree it is very unlikely, and it will not happen on i386. But what about just using atomic_t suspending_task, and store current->pid into it? Pavel -- (english) http://www.livejournal.com/~pavelmachek (cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html ^ permalink raw reply [flat|nested] 92+ messages in thread
* Re: [Bug 10030] Suspend doesn't work when SD card is inserted 2008-02-24 21:42 ` Pavel Machek @ 2008-02-24 22:21 ` Rafael J. Wysocki 2008-02-25 2:21 ` Alan Stern 2008-02-25 2:19 ` Alan Stern 1 sibling, 1 reply; 92+ messages in thread From: Rafael J. Wysocki @ 2008-02-24 22:21 UTC (permalink / raw) To: Pavel Machek Cc: Alan Stern, Pierre Ossman, Zdenek Kabelac, Kernel development list, pm list On Sunday, 24 of February 2008, Pavel Machek wrote: > On Sun 2008-02-24 15:33:01, Alan Stern wrote: > > On Sun, 24 Feb 2008, Pavel Machek wrote: > > > > > > > What locking protects this variable? What happens when suspending_task > > > > > exits? (Hmm, that would probably be bug, anyway?) > > > > > > > > It's protected by whatever existing locking scheme allows only one > > > > task to start a system sleep at a time. For example, the suspending > > > > task has to get a write lock on pm_sleep_rwsem. > > > > > > And readers of suspending_task are protected by? > > > > I added a comment about that too. > > > > > At the very least, you'd need rmb() before reading it and wmb() after > > > writing to it, but I'm not sure if that's enough on every obscure > > > architecture out there. > > > > No, neither one is needed because of the way suspending_task is used. > > > > It's not necessary for a reader R to see the variable's actual value; > > all R needs to know is whether or not suspending_task is equal to R. > > Since the only process which can set suspending_task to R is R itself, > > and since R will set suspending_task back to NULL before releasing the > > write lock on pm_sleep_rwsem, there's never any ambiguity. > > Subtle. > > Very subtly wrong ;-). > > imagine suspending_task == 0xabcdef01. Now task "R" with current == > 0xabcd0000 reads suspending_task while the other cpu is writing to it, > and sees 0xabcd0000 (0xef01 was not yet written) -- and mistakenly > believes that "R" == suspending_task. > > I agree it is very unlikely, and it will not happen on i386. But what > about just using atomic_t suspending_task, and store current->pid into > it? I'd rather use a lock, frankly. For example, we can require the readers to take pm_sleep_rwsem for reading in order to access that. Thanks, Rafael ^ permalink raw reply [flat|nested] 92+ messages in thread
* Re: [Bug 10030] Suspend doesn't work when SD card is inserted 2008-02-24 22:21 ` Rafael J. Wysocki @ 2008-02-25 2:21 ` Alan Stern 2008-02-25 11:41 ` Rafael J. Wysocki 0 siblings, 1 reply; 92+ messages in thread From: Alan Stern @ 2008-02-25 2:21 UTC (permalink / raw) To: Rafael J. Wysocki Cc: Pavel Machek, Pierre Ossman, Zdenek Kabelac, Kernel development list, pm list On Sun, 24 Feb 2008, Rafael J. Wysocki wrote: > > Very subtly wrong ;-). > > > > imagine suspending_task == 0xabcdef01. Now task "R" with current == > > 0xabcd0000 reads suspending_task while the other cpu is writing to it, > > and sees 0xabcd0000 (0xef01 was not yet written) -- and mistakenly > > believes that "R" == suspending_task. > > > > I agree it is very unlikely, and it will not happen on i386. But what > > about just using atomic_t suspending_task, and store current->pid into > > it? > > I'd rather use a lock, frankly. For example, we can require the readers to > take pm_sleep_rwsem for reading in order to access that. That certainly won't work. Imagine what would happen when the reader _was_ the suspending task. But if you really twist my arm, I'll go along with Pavel's suggestion. Alan Stern ^ permalink raw reply [flat|nested] 92+ messages in thread
* Re: [Bug 10030] Suspend doesn't work when SD card is inserted 2008-02-25 2:21 ` Alan Stern @ 2008-02-25 11:41 ` Rafael J. Wysocki [not found] ` <c4e36d110802250402q7312a488idf752e07db7504e8@mail.gmail.com> 0 siblings, 1 reply; 92+ messages in thread From: Rafael J. Wysocki @ 2008-02-25 11:41 UTC (permalink / raw) To: Alan Stern Cc: Pavel Machek, Pierre Ossman, Zdenek Kabelac, Kernel development list, pm list On Monday, 25 of February 2008, Alan Stern wrote: > On Sun, 24 Feb 2008, Rafael J. Wysocki wrote: > > > > Very subtly wrong ;-). > > > > > > imagine suspending_task == 0xabcdef01. Now task "R" with current == > > > 0xabcd0000 reads suspending_task while the other cpu is writing to it, > > > and sees 0xabcd0000 (0xef01 was not yet written) -- and mistakenly > > > believes that "R" == suspending_task. > > > > > > I agree it is very unlikely, and it will not happen on i386. But what > > > about just using atomic_t suspending_task, and store current->pid into > > > it? > > > > I'd rather use a lock, frankly. For example, we can require the readers to > > take pm_sleep_rwsem for reading in order to access that. > > That certainly won't work. Imagine what would happen when the reader > _was_ the suspending task. Yeah, I've already realized it was a stupid idea. :-) > But if you really twist my arm, I'll go along with Pavel's suggestion. So can you do that, pretty please? Thanks, Rafael ^ permalink raw reply [flat|nested] 92+ messages in thread
[parent not found: <c4e36d110802250402q7312a488idf752e07db7504e8@mail.gmail.com>]
* Re: [Bug 10030] Suspend doesn't work when SD card is inserted [not found] ` <c4e36d110802250402q7312a488idf752e07db7504e8@mail.gmail.com> @ 2008-02-25 12:31 ` Rafael J. Wysocki 0 siblings, 0 replies; 92+ messages in thread From: Rafael J. Wysocki @ 2008-02-25 12:31 UTC (permalink / raw) To: Zdenek Kabelac Cc: Alan Stern, Pierre Ossman, Kernel development list, pm list, Pavel Machek, David Brownell On Monday, 25 of February 2008, Zdenek Kabelac wrote: > Hi Hi, [CCs restored, added CC to Dave] > I'm finally going to test some kernel - because I'd been trying it > against the HEAD - but unfortunately it looks like there is something > seriously broken with -rc3 and sata merge - anyway - I'm going to make > a test with this head commit 85b80ebfa4384b8ea30cc1af9617db30319a9ccd > which should be the last one before merge of SATA. > > So I'm going to check this tree with you patch: > pm-remove-locking-from-core.patch If that's the one from http://marc.info/?l=linux-acpi&m=120389632114090&w=2 , please do it. > --- > drivers/base/core.c | 8 --- > drivers/base/power/main.c | 97 +++------------------------------------------- > drivers/usb/core/usb.c | 2 > 3 files changed, 8 insertions(+), 99 deletions(-) > > Do you wan to test also the other patch ? Yes. Please also test the patch that Alan asked you to test here: http://lkml.org/lkml/2008/2/23/402 It's experimantal, but it is important to us to know if you see the symptoms (and which ones if you do) with this patch applied. Thanks, Rafael ^ permalink raw reply [flat|nested] 92+ messages in thread
* Re: [Bug 10030] Suspend doesn't work when SD card is inserted 2008-02-24 21:42 ` Pavel Machek 2008-02-24 22:21 ` Rafael J. Wysocki @ 2008-02-25 2:19 ` Alan Stern 2008-02-25 9:03 ` using long instead of atomic_t when only set/read is required (was Re: [Bug 10030] Suspend doesn't work when SD card is inserted) Pavel Machek 2008-02-25 11:40 ` [Bug 10030] Suspend doesn't work when SD card is inserted Rafael J. Wysocki 1 sibling, 2 replies; 92+ messages in thread From: Alan Stern @ 2008-02-25 2:19 UTC (permalink / raw) To: Pavel Machek Cc: Rafael J. Wysocki, Pierre Ossman, Zdenek Kabelac, Kernel development list, pm list On Sun, 24 Feb 2008, Pavel Machek wrote: > > > At the very least, you'd need rmb() before reading it and wmb() after > > > writing to it, but I'm not sure if that's enough on every obscure > > > architecture out there. > > > > No, neither one is needed because of the way suspending_task is used. > > > > It's not necessary for a reader R to see the variable's actual value; > > all R needs to know is whether or not suspending_task is equal to R. > > Since the only process which can set suspending_task to R is R itself, > > and since R will set suspending_task back to NULL before releasing the > > write lock on pm_sleep_rwsem, there's never any ambiguity. > > Subtle. > > Very subtly wrong ;-). > > imagine suspending_task == 0xabcdef01. Now task "R" with current == > 0xabcd0000 reads suspending_task while the other cpu is writing to it, > and sees 0xabcd0000 (0xef01 was not yet written) -- and mistakenly > believes that "R" == suspending_task. I always thought that reads and writes of pointers are atomic, just like reads and writes of longs. Is that wrong? Now if pointers were the same size as long long then I would agree with you. > I agree it is very unlikely, and it will not happen on i386. But what > about just using atomic_t suspending_task, and store current->pid into > it? That would work just as well. Except that it wouldn't need to be atomic_t, because current->pid is always an integer (not guaranteed, I suppose, but that's what it is on all current architectures) and reads/writes of ints _are_ atomic. Alan Stern ^ permalink raw reply [flat|nested] 92+ messages in thread
* using long instead of atomic_t when only set/read is required (was Re: [Bug 10030] Suspend doesn't work when SD card is inserted) 2008-02-25 2:19 ` Alan Stern @ 2008-02-25 9:03 ` Pavel Machek 2008-02-25 14:46 ` Alan Stern 2008-02-25 11:40 ` [Bug 10030] Suspend doesn't work when SD card is inserted Rafael J. Wysocki 1 sibling, 1 reply; 92+ messages in thread From: Pavel Machek @ 2008-02-25 9:03 UTC (permalink / raw) To: Alan Stern, zdenek.kabelac, davem Cc: Rafael J. Wysocki, Pierre Ossman, Zdenek Kabelac, Kernel development list, pm list Hi! Alan thinks that `subj` is correct... > > > > At the very least, you'd need rmb() before reading it and wmb() after > > > > writing to it, but I'm not sure if that's enough on every obscure > > > > architecture out there. > > > > > > No, neither one is needed because of the way suspending_task is used. > > > > > > It's not necessary for a reader R to see the variable's actual value; > > > all R needs to know is whether or not suspending_task is equal to R. > > > Since the only process which can set suspending_task to R is R itself, > > > and since R will set suspending_task back to NULL before releasing the > > > write lock on pm_sleep_rwsem, there's never any ambiguity. > > > > Subtle. > > > > Very subtly wrong ;-). > > > > imagine suspending_task == 0xabcdef01. Now task "R" with current == > > 0xabcd0000 reads suspending_task while the other cpu is writing to it, > > and sees 0xabcd0000 (0xef01 was not yet written) -- and mistakenly > > believes that "R" == suspending_task. > > I always thought that reads and writes of pointers are atomic, just > like reads and writes of longs. Is that wrong? ...but I'm not that sure. Can someone clarify? I guess it only works as long as longs are aligned? Should it be written down to atomic_ops.txt? Pavel -- (english) http://www.livejournal.com/~pavelmachek (cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html ^ permalink raw reply [flat|nested] 92+ messages in thread
* Re: using long instead of atomic_t when only set/read is required (was Re: [Bug 10030] Suspend doesn't work when SD card is inserted) 2008-02-25 9:03 ` using long instead of atomic_t when only set/read is required (was Re: [Bug 10030] Suspend doesn't work when SD card is inserted) Pavel Machek @ 2008-02-25 14:46 ` Alan Stern 2008-03-03 12:08 ` [patch] Re: using long instead of atomic_t when only set/read is required Pavel Machek 0 siblings, 1 reply; 92+ messages in thread From: Alan Stern @ 2008-02-25 14:46 UTC (permalink / raw) To: Pavel Machek Cc: Zdenek Kabelac, davem, Rafael J. Wysocki, Pierre Ossman, Kernel development list, pm list On Mon, 25 Feb 2008, Pavel Machek wrote: > Hi! > > Alan thinks that `subj` is correct... More precisely, reads and writes of pointers are always atomic. That is, if a write and a read occur concurrently, it is guaranteed that the read will obtain either the old or the new value of the pointer, never a mish-mash of the two. If this were not so then RCU wouldn't work. > I guess it only works as long as longs are aligned? Should it be > written down to atomic_ops.txt? Forget it, I'm going to withdraw the entire thing. The whole approach is wrong. More details in a new email thread, to follow soon. Alan Stern ^ permalink raw reply [flat|nested] 92+ messages in thread
* [patch] Re: using long instead of atomic_t when only set/read is required 2008-02-25 14:46 ` Alan Stern @ 2008-03-03 12:08 ` Pavel Machek 2008-03-03 15:42 ` Alan Stern 2008-03-03 15:48 ` Alan Cox 0 siblings, 2 replies; 92+ messages in thread From: Pavel Machek @ 2008-03-03 12:08 UTC (permalink / raw) To: Alan Stern, Linus Torvalds, Andrew Morton Cc: Zdenek Kabelac, davem, Rafael J. Wysocki, Pierre Ossman, Kernel development list, pm list Hi! > > Alan thinks that `subj` is correct... > > More precisely, reads and writes of pointers are always atomic. That > is, if a write and a read occur concurrently, it is guaranteed that the > read will obtain either the old or the new value of the pointer, never > a mish-mash of the two. If this were not so then RCU wouldn't work. Ok, so linux actually atomicity of long? If so, this should probably be applied... Signed-off-by: Pavel Machek <pavel@suse.cz> diff --git a/Documentation/atomic_ops.txt b/Documentation/atomic_ops.txt index 4ef2450..0a7d180 100644 --- a/Documentation/atomic_ops.txt +++ b/Documentation/atomic_ops.txt @@ -21,6 +21,9 @@ local_t is very similar to atomic_t. If updated by one CPU, local_t is probably more appropriate. Please see Documentation/local_ops.txt for the semantics of local_t. +long (and int and void *) can be used instead of atomic_t, if all you +need is atomic setting and atomic reading. + The first operations to implement for atomic_t's are the initializers and plain reads. -- (english) http://www.livejournal.com/~pavelmachek (cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html ^ permalink raw reply related [flat|nested] 92+ messages in thread
* Re: [patch] Re: using long instead of atomic_t when only set/read is required 2008-03-03 12:08 ` [patch] Re: using long instead of atomic_t when only set/read is required Pavel Machek @ 2008-03-03 15:42 ` Alan Stern 2008-03-03 15:53 ` Alan Cox 2008-03-03 17:22 ` Paul E. McKenney 2008-03-03 15:48 ` Alan Cox 1 sibling, 2 replies; 92+ messages in thread From: Alan Stern @ 2008-03-03 15:42 UTC (permalink / raw) To: Pavel Machek Cc: Paul E. McKenney, Linus Torvalds, Andrew Morton, Zdenek Kabelac, davem, Rafael J. Wysocki, Pierre Ossman, Kernel development list, pm list On Mon, 3 Mar 2008, Pavel Machek wrote: > Hi! > > > > Alan thinks that `subj` is correct... > > > > More precisely, reads and writes of pointers are always atomic. That > > is, if a write and a read occur concurrently, it is guaranteed that the > > read will obtain either the old or the new value of the pointer, never > > a mish-mash of the two. If this were not so then RCU wouldn't work. Right, Paul? > Ok, so linux actually atomicity of long? > > If so, this should probably be applied... > > Signed-off-by: Pavel Machek <pavel@suse.cz> > > diff --git a/Documentation/atomic_ops.txt b/Documentation/atomic_ops.txt > index 4ef2450..0a7d180 100644 > --- a/Documentation/atomic_ops.txt > +++ b/Documentation/atomic_ops.txt > @@ -21,6 +21,9 @@ local_t is very similar to atomic_t. If > updated by one CPU, local_t is probably more appropriate. Please see > Documentation/local_ops.txt for the semantics of local_t. > > +long (and int and void *) can be used instead of atomic_t, if all you > +need is atomic setting and atomic reading. > + > The first operations to implement for atomic_t's are the initializers and > plain reads. Yes indeed. This fact doesn't seem to be documented anywhere, but it is clearly a requirement of the kernel. I would make the text a little more explicit, see below. Alan Stern ------------------------------------------------------- Atomicity of reads of write for pointers and integral types (other than long long) should be documented. Signed-off-by: Alan Stern <stern@rowland.harvard.edu> --- Index: usb-2.6/Documentation/atomic_ops.txt =================================================================== --- usb-2.6.orig/Documentation/atomic_ops.txt +++ usb-2.6/Documentation/atomic_ops.txt @@ -21,6 +21,21 @@ local_t is very similar to atomic_t. If updated by one CPU, local_t is probably more appropriate. Please see Documentation/local_ops.txt for the semantics of local_t. +For all properly-aligned pointer and integral types other than long +long, the kernel requires simple reads and writes to be atomic with +respect to each other. That is, if one CPU reads a pointer value at +the same time as another CPU overwrites the pointer, it is guaranteed +that the reader will obtain either the old or the new value of the +pointer, never some mish-mash combination of the two. Likewise, if +one CPU writes a long value at the same time as another CPU does, it +is guaranteed that one or the other of the values will end up stored +in memory, not some mish-mash combination of bits. + +Thus, if all you need is atomicity of reading and writing then you can +use plain old ints, longs, or pointers; you don't need to use +atomic_t. (But note: This guarantee emphatically does not apply to +long long values or unaligned values!) + The first operations to implement for atomic_t's are the initializers and plain reads. ^ permalink raw reply [flat|nested] 92+ messages in thread
* Re: [patch] Re: using long instead of atomic_t when only set/read is required 2008-03-03 15:42 ` Alan Stern @ 2008-03-03 15:53 ` Alan Cox 2008-03-03 17:11 ` Alan Stern 2008-03-03 17:16 ` Nick Piggin 2008-03-03 17:22 ` Paul E. McKenney 1 sibling, 2 replies; 92+ messages in thread From: Alan Cox @ 2008-03-03 15:53 UTC (permalink / raw) To: Alan Stern Cc: Pavel Machek, Paul E. McKenney, Linus Torvalds, Andrew Morton, Zdenek Kabelac, davem, Rafael J. Wysocki, Pierre Ossman, Kernel development list, pm list > Atomicity of reads of write for pointers and integral types (other than > long long) should be documented. NAK. Atomicity of reads or writes for pointers and integral types is NOT guaranteed. Gcc doesn't believe in your guarantee. Alan ^ permalink raw reply [flat|nested] 92+ messages in thread
* Re: [patch] Re: using long instead of atomic_t when only set/read is required 2008-03-03 15:53 ` Alan Cox @ 2008-03-03 17:11 ` Alan Stern 2008-03-03 17:26 ` Linus Torvalds 2008-03-03 17:16 ` Nick Piggin 1 sibling, 1 reply; 92+ messages in thread From: Alan Stern @ 2008-03-03 17:11 UTC (permalink / raw) To: Alan Cox Cc: Pavel Machek, Paul E. McKenney, Linus Torvalds, Andrew Morton, Zdenek Kabelac, davem, Rafael J. Wysocki, Pierre Ossman, Kernel development list, pm list On Mon, 3 Mar 2008, Alan Cox wrote: > > Atomicity of reads of write for pointers and integral types (other than > > long long) should be documented. > > NAK. > > Atomicity of reads or writes for pointers and integral types is NOT > guaranteed. Gcc doesn't believe in your guarantee. Miscommunication and lack of clarity. CPU reads and writes _are_ guaranteed to be atomic. What is not guaranteed is that the compiler will generate a single read or write instruction corresponding to a particular expression in C. Consider a routine like the following: static task_struct *the_task; void store_task(void) { the_task = current; } Is it possible to say whether readers examining "the_task" are guaranteed to see a coherent value? Alan Stern ^ permalink raw reply [flat|nested] 92+ messages in thread
* Re: [patch] Re: using long instead of atomic_t when only set/read is required 2008-03-03 17:11 ` Alan Stern @ 2008-03-03 17:26 ` Linus Torvalds 2008-03-03 17:44 ` Pavel Machek 2008-03-06 15:58 ` Mark Lord 0 siblings, 2 replies; 92+ messages in thread From: Linus Torvalds @ 2008-03-03 17:26 UTC (permalink / raw) To: Alan Stern Cc: Alan Cox, Pavel Machek, Paul E. McKenney, Andrew Morton, Zdenek Kabelac, davem, Rafael J. Wysocki, Pierre Ossman, Kernel development list, pm list On Mon, 3 Mar 2008, Alan Stern wrote: > > Consider a routine like the following: > > static task_struct *the_task; > > void store_task(void) > { > the_task = current; > } > > Is it possible to say whether readers examining "the_task" are > guaranteed to see a coherent value? Yes, we do depend on this. All the RCU stuff (and in general *anything* that depends on memory ordering as opposed to full locking, and we have quite a lot of it) is very fundamentally dependent on the fact that things like pointers get read and written atomically. HOWEVER, it is worth pointing out that it's generally true in a "different" sense than the actual atomic accesses. For example, if you test a single bit of a word, it's still quite possible that gcc will have turned that "atomic" read into a single byte read, so it's not necessarily the case that we'll actually even read the whole word. (Writes are different: if you do things like bitwise updates they simply *will*not* be atomic, but that's simply not what we depend on anyway). So in that sense, the atomicity guarantees are a lot weaker than the ones we do for IO accesses, but that's all fine. Memory isn't IO, and doesn't have side effects. Linus ^ permalink raw reply [flat|nested] 92+ messages in thread
* Re: [patch] Re: using long instead of atomic_t when only set/read is required 2008-03-03 17:26 ` Linus Torvalds @ 2008-03-03 17:44 ` Pavel Machek 2008-03-03 19:27 ` Alan Stern 2008-03-06 15:58 ` Mark Lord 1 sibling, 1 reply; 92+ messages in thread From: Pavel Machek @ 2008-03-03 17:44 UTC (permalink / raw) To: Linus Torvalds Cc: Alan Stern, Alan Cox, Paul E. McKenney, Andrew Morton, Zdenek Kabelac, davem, Rafael J. Wysocki, Pierre Ossman, Kernel development list, pm list Hi! > > Consider a routine like the following: > > > > static task_struct *the_task; > > > > void store_task(void) > > { > > the_task = current; > > } > > > > Is it possible to say whether readers examining "the_task" are > > guaranteed to see a coherent value? > > Yes, we do depend on this. All the RCU stuff (and in general *anything* > that depends on memory ordering as opposed to full locking, and we have > quite a lot of it) is very fundamentally dependent on the fact that things > like pointers get read and written atomically. > > HOWEVER, it is worth pointing out that it's generally true in a > "different" sense than the actual atomic accesses. For example, if you > test a single bit of a word, it's still quite possible that gcc will have > turned that "atomic" read into a single byte read, so it's not necessarily > the case that we'll actually even read the whole word. > > (Writes are different: if you do things like bitwise updates they simply > *will*not* be atomic, but that's simply not what we depend on anyway). Ok... can we get Alan Stern's patch into Documentation/atomic_ops.txt , then? I was not aware of this, and there seems to be lot of confusion around... Plus... I really don't think we can "just access" this as normal pointers... due to the compiler issues Alan Cox mentioned, and due to the ACCESS_ONCE() issue. Pavel -- (english) http://www.livejournal.com/~pavelmachek (cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html ^ permalink raw reply [flat|nested] 92+ messages in thread
* Re: [patch] Re: using long instead of atomic_t when only set/read is required 2008-03-03 17:44 ` Pavel Machek @ 2008-03-03 19:27 ` Alan Stern 0 siblings, 0 replies; 92+ messages in thread From: Alan Stern @ 2008-03-03 19:27 UTC (permalink / raw) To: Pavel Machek Cc: Linus Torvalds, Alan Cox, Paul E. McKenney, Andrew Morton, Zdenek Kabelac, davem, Rafael J. Wysocki, Pierre Ossman, Kernel development list, pm list On Mon, 3 Mar 2008, Pavel Machek wrote: > Ok... can we get Alan Stern's patch into Documentation/atomic_ops.txt > , then? I was not aware of this, and there seems to be lot of > confusion around... > > Plus... I really don't think we can "just access" this as normal > pointers... due to the compiler issues Alan Cox mentioned, and due to > the ACCESS_ONCE() issue. Here's an updated version of the patch, including the issue Alan Cox brought up. Alan Stern ----------------------------------------------------------------- Atomicity of reads of write for pointers and integral types (other than long long) should be documented, along with the limitations imposed by the compiler. Signed-off-by: Alan Stern <stern@rowland.harvard.edu> --- Index: usb-2.6/Documentation/atomic_ops.txt =================================================================== --- usb-2.6.orig/Documentation/atomic_ops.txt +++ usb-2.6/Documentation/atomic_ops.txt @@ -21,6 +21,24 @@ local_t is very similar to atomic_t. If updated by one CPU, local_t is probably more appropriate. Please see Documentation/local_ops.txt for the semantics of local_t. +For all properly-aligned pointer and integral types other than long +long, the kernel requires simple reads and writes to be atomic with +respect to each other. That is, if one CPU reads a pointer value at +the same time as another CPU overwrites the pointer, it is guaranteed +that the reader will obtain either the old or the new value of the +pointer, never some mish-mash combination of the two. Likewise, if +one CPU writes a long value at the same time as another CPU does, it +is guaranteed that one or the other of the values will end up stored +in memory, not some mish-mash combination of bits. + +Thus, if all you need is atomicity of reading and writing then you can +use plain old ints, longs, or pointers; you don't need to use +atomic_t. But note: This guarantee emphatically does not apply to +long long values or unaligned values! Note also that gcc does not +guarantee to compile all C assignment expressions into simple writes. +For example, a statement like "x = a + b" might cause gcc to emit code +equivalent to "x = a; x += b", which is decidedly non-atomic. + The first operations to implement for atomic_t's are the initializers and plain reads. ^ permalink raw reply [flat|nested] 92+ messages in thread
* Re: [patch] Re: using long instead of atomic_t when only set/read is required 2008-03-03 17:26 ` Linus Torvalds 2008-03-03 17:44 ` Pavel Machek @ 2008-03-06 15:58 ` Mark Lord 2008-03-06 16:11 ` Linus Torvalds 1 sibling, 1 reply; 92+ messages in thread From: Mark Lord @ 2008-03-06 15:58 UTC (permalink / raw) To: Linus Torvalds Cc: Alan Stern, Alan Cox, Pavel Machek, Paul E. McKenney, Andrew Morton, Zdenek Kabelac, davem, Rafael J. Wysocki, Pierre Ossman, Kernel development list, pm list Linus Torvalds wrote: > > On Mon, 3 Mar 2008, Alan Stern wrote: >> Consider a routine like the following: >> >> static task_struct *the_task; >> >> void store_task(void) >> { >> the_task = current; >> } >> >> Is it possible to say whether readers examining "the_task" are >> guaranteed to see a coherent value? > > Yes, we do depend on this. All the RCU stuff (and in general *anything* > that depends on memory ordering as opposed to full locking, and we have > quite a lot of it) is very fundamentally dependent on the fact that things > like pointers get read and written atomically. .. But also consider something like this: void store_task(void) { *the_task = current; } In this case, there is no guarantee that the assignment can be done atomically on all CPU types. Some RISC archs (eg. MIPS R2xxx) require an (interruptible) instruction pair to store values to a potentially unaligned address. This was a BIG issue on a different system that I once worked on. Cheers ^ permalink raw reply [flat|nested] 92+ messages in thread
* Re: [patch] Re: using long instead of atomic_t when only set/read is required 2008-03-06 15:58 ` Mark Lord @ 2008-03-06 16:11 ` Linus Torvalds 2008-03-06 16:27 ` Mark Lord 0 siblings, 1 reply; 92+ messages in thread From: Linus Torvalds @ 2008-03-06 16:11 UTC (permalink / raw) To: Mark Lord Cc: Alan Stern, Alan Cox, Pavel Machek, Paul E. McKenney, Andrew Morton, Zdenek Kabelac, davem, Rafael J. Wysocki, Pierre Ossman, Kernel development list, pm list On Thu, 6 Mar 2008, Mark Lord wrote: > > But also consider something like this: > > void store_task(void) > { > *the_task = current; > } > > In this case, there is no guarantee that the assignment > can be done atomically on all CPU types. Some RISC archs > (eg. MIPS R2xxx) require an (interruptible) instruction pair > to store values to a potentially unaligned address. You'd better not be using unaligned accesses for memory-ordering-sensitive things (I think x86 happens get even that right for most cases, but I don't think the architecture specification guarantees it, and I'm pretty sure that you might find problems on cache crossing writes, for example) But quite frankly, if you have an architecture that can't do the above as a single write when it's a pointer, then you have a totally broken architecture. It's not worth supporting. (There are data structures that are harder than native words: bytes and shorts can require load-modify-write cycles, and "u64" and friends can obviously be multiple words, so you shouldn't depend on things for those "complex" cases. But we *definitely* depend on atomicity for regular word accesses). Linus ^ permalink raw reply [flat|nested] 92+ messages in thread
* Re: [patch] Re: using long instead of atomic_t when only set/read is required 2008-03-06 16:11 ` Linus Torvalds @ 2008-03-06 16:27 ` Mark Lord 0 siblings, 0 replies; 92+ messages in thread From: Mark Lord @ 2008-03-06 16:27 UTC (permalink / raw) To: Linus Torvalds Cc: Alan Stern, Alan Cox, Pavel Machek, Paul E. McKenney, Andrew Morton, Zdenek Kabelac, davem, Rafael J. Wysocki, Pierre Ossman, Kernel development list, pm list Linus Torvalds wrote: > > On Thu, 6 Mar 2008, Mark Lord wrote: >> But also consider something like this: >> >> void store_task(void) >> { >> *the_task = current; >> } >> >> In this case, there is no guarantee that the assignment >> can be done atomically on all CPU types. Some RISC archs >> (eg. MIPS R2xxx) require an (interruptible) instruction pair >> to store values to a potentially unaligned address. > > You'd better not be using unaligned accesses for memory-ordering-sensitive > things (I think x86 happens get even that right for most cases, but I > don't think the architecture specification guarantees it, and I'm pretty > sure that you might find problems on cache crossing writes, for example) .. Yeah. For the MIPS R2xxx CPU, the question was whether the compiler could guarantee the alignment of the things the pointer could point at. In cases where it could not, it would emit the interruptable instruction pair instead of a single load instruction. I don't know what gcc does on MIPS for this. Perhaps it simply always assumes an aligned access? In that case, there's no issue unless some putz actually creates/uses a pointer to an unaligned data object. What about other architectures like ARM ? Probably also assumes aligned, in which case all is well. > But quite frankly, if you have an architecture that can't do the above as > a single write when it's a pointer, then you have a totally broken > architecture. It's not worth supporting. > > (There are data structures that are harder than native words: bytes and > shorts can require load-modify-write cycles, and "u64" and friends can > obviously be multiple words, so you shouldn't depend on things for those > "complex" cases. But we *definitely* depend on atomicity for regular word > accesses). ^ permalink raw reply [flat|nested] 92+ messages in thread
* Re: [patch] Re: using long instead of atomic_t when only set/read is required 2008-03-03 15:53 ` Alan Cox 2008-03-03 17:11 ` Alan Stern @ 2008-03-03 17:16 ` Nick Piggin 2008-03-03 17:31 ` Paul E. McKenney 2008-03-03 17:33 ` Alan Cox 1 sibling, 2 replies; 92+ messages in thread From: Nick Piggin @ 2008-03-03 17:16 UTC (permalink / raw) To: Alan Cox Cc: Alan Stern, Pavel Machek, Paul E. McKenney, Linus Torvalds, Andrew Morton, Zdenek Kabelac, davem, Rafael J. Wysocki, Pierre Ossman, Kernel development list, pm list On Tuesday 04 March 2008 02:53, Alan Cox wrote: > > Atomicity of reads of write for pointers and integral types (other than > > long long) should be documented. > > NAK. > > Atomicity of reads or writes for pointers and integral types is NOT > guaranteed. Gcc doesn't believe in your guarantee. Are you sure gcc doesn't? Or is it just "C"? Linux wouldn't work today if gcc did something non-atomic there (presuming you're talking about naturally aligned pointers/ints). It is widely used and accepted. RCU users are far from the only places to rely on this, although I guess they are the main ones when it comes to assigning pointers atomically. ^ permalink raw reply [flat|nested] 92+ messages in thread
* Re: [patch] Re: using long instead of atomic_t when only set/read is required 2008-03-03 17:16 ` Nick Piggin @ 2008-03-03 17:31 ` Paul E. McKenney 2008-03-03 17:33 ` Alan Cox 1 sibling, 0 replies; 92+ messages in thread From: Paul E. McKenney @ 2008-03-03 17:31 UTC (permalink / raw) To: Nick Piggin Cc: Alan Cox, Alan Stern, Pavel Machek, Linus Torvalds, Andrew Morton, Zdenek Kabelac, davem, Rafael J. Wysocki, Pierre Ossman, Kernel development list, pm list On Tue, Mar 04, 2008 at 04:16:33AM +1100, Nick Piggin wrote: > On Tuesday 04 March 2008 02:53, Alan Cox wrote: > > > Atomicity of reads of write for pointers and integral types (other than > > > long long) should be documented. > > > > NAK. > > > > Atomicity of reads or writes for pointers and integral types is NOT > > guaranteed. Gcc doesn't believe in your guarantee. > > Are you sure gcc doesn't? Or is it just "C"? > > Linux wouldn't work today if gcc did something non-atomic there > (presuming you're talking about naturally aligned pointers/ints). > It is widely used and accepted. > > RCU users are far from the only places to rely on this, although > I guess they are the main ones when it comes to assigning pointers > atomically. It is true that gcc can refetch pointers/ints if it runs out of registers, which is why rcu_dereference() recently had an ACCESS_ONCE() added to it. But such refetching cannot result in a mish-mash of two different pointer values, confusing though it might be to the affected code. Thanx, Paul ^ permalink raw reply [flat|nested] 92+ messages in thread
* Re: [patch] Re: using long instead of atomic_t when only set/read is required 2008-03-03 17:16 ` Nick Piggin 2008-03-03 17:31 ` Paul E. McKenney @ 2008-03-03 17:33 ` Alan Cox 1 sibling, 0 replies; 92+ messages in thread From: Alan Cox @ 2008-03-03 17:33 UTC (permalink / raw) To: Nick Piggin Cc: Alan Stern, Pavel Machek, Paul E. McKenney, Linus Torvalds, Andrew Morton, Zdenek Kabelac, davem, Rafael J. Wysocki, Pierre Ossman, Kernel development list, pm list > Are you sure gcc doesn't? Or is it just "C"? gcc doesn't > Linux wouldn't work today if gcc did something non-atomic there > (presuming you're talking about naturally aligned pointers/ints). > It is widely used and accepted. Yes and we've had tty layer traces in the past clearly showing it isn't always safe, especially if any math is involved anywhere near the assignment. That may be why pointer flipping happens to work. Alan ^ permalink raw reply [flat|nested] 92+ messages in thread
* Re: [patch] Re: using long instead of atomic_t when only set/read is required 2008-03-03 15:42 ` Alan Stern 2008-03-03 15:53 ` Alan Cox @ 2008-03-03 17:22 ` Paul E. McKenney 1 sibling, 0 replies; 92+ messages in thread From: Paul E. McKenney @ 2008-03-03 17:22 UTC (permalink / raw) To: Alan Stern Cc: Pavel Machek, Linus Torvalds, Andrew Morton, Zdenek Kabelac, davem, Rafael J. Wysocki, Pierre Ossman, Kernel development list, pm list On Mon, Mar 03, 2008 at 10:42:35AM -0500, Alan Stern wrote: > On Mon, 3 Mar 2008, Pavel Machek wrote: > > > Hi! > > > > > > Alan thinks that `subj` is correct... > > > > > > More precisely, reads and writes of pointers are always atomic. That > > > is, if a write and a read occur concurrently, it is guaranteed that the > > > read will obtain either the old or the new value of the pointer, never > > > a mish-mash of the two. If this were not so then RCU wouldn't work. > > Right, Paul? Yep, as long as they aligned naturally, e.g., 32-bit pointers to a four-byte boundary, 64-bit pointers to an eight-byte boundary. Note that this is a backdoor agreement between the Linux kernel and gcc. The C/C++ standard does -not- require atomic reads or writes for anything other than a volatile sig_atomic_t. In fact, there are a lot of things that C/C++ doesn't guarantee, which is why Linux makes use of so many of gcc's non-standard extensions. ;-) > > Ok, so linux actually atomicity of long? Yep, same alignment rules as pointers. Ah, and the alignment is covered below. Very good! > > If so, this should probably be applied... > > > > Signed-off-by: Pavel Machek <pavel@suse.cz> > > > > diff --git a/Documentation/atomic_ops.txt b/Documentation/atomic_ops.txt > > index 4ef2450..0a7d180 100644 > > --- a/Documentation/atomic_ops.txt > > +++ b/Documentation/atomic_ops.txt > > @@ -21,6 +21,9 @@ local_t is very similar to atomic_t. If > > updated by one CPU, local_t is probably more appropriate. Please see > > Documentation/local_ops.txt for the semantics of local_t. > > > > +long (and int and void *) can be used instead of atomic_t, if all you > > +need is atomic setting and atomic reading. > > + > > The first operations to implement for atomic_t's are the initializers and > > plain reads. > > Yes indeed. This fact doesn't seem to be documented anywhere, but it > is clearly a requirement of the kernel. I would make the text a little > more explicit, see below. > > Alan Stern > > ------------------------------------------------------- > > > Atomicity of reads of write for pointers and integral types (other than > long long) should be documented. > > Signed-off-by: Alan Stern <stern@rowland.harvard.edu> > > --- > > Index: usb-2.6/Documentation/atomic_ops.txt > =================================================================== > --- usb-2.6.orig/Documentation/atomic_ops.txt > +++ usb-2.6/Documentation/atomic_ops.txt > @@ -21,6 +21,21 @@ local_t is very similar to atomic_t. If > updated by one CPU, local_t is probably more appropriate. Please see > Documentation/local_ops.txt for the semantics of local_t. > > +For all properly-aligned pointer and integral types other than long > +long, the kernel requires simple reads and writes to be atomic with > +respect to each other. That is, if one CPU reads a pointer value at > +the same time as another CPU overwrites the pointer, it is guaranteed > +that the reader will obtain either the old or the new value of the > +pointer, never some mish-mash combination of the two. Likewise, if > +one CPU writes a long value at the same time as another CPU does, it > +is guaranteed that one or the other of the values will end up stored > +in memory, not some mish-mash combination of bits. > + > +Thus, if all you need is atomicity of reading and writing then you can > +use plain old ints, longs, or pointers; you don't need to use > +atomic_t. (But note: This guarantee emphatically does not apply to > +long long values or unaligned values!) > + > The first operations to implement for atomic_t's are the initializers and > plain reads. > > ^ permalink raw reply [flat|nested] 92+ messages in thread
* Re: [patch] Re: using long instead of atomic_t when only set/read is required 2008-03-03 12:08 ` [patch] Re: using long instead of atomic_t when only set/read is required Pavel Machek 2008-03-03 15:42 ` Alan Stern @ 2008-03-03 15:48 ` Alan Cox 2008-03-03 17:24 ` Pavel Machek 1 sibling, 1 reply; 92+ messages in thread From: Alan Cox @ 2008-03-03 15:48 UTC (permalink / raw) To: Pavel Machek Cc: Alan Stern, Linus Torvalds, Andrew Morton, Zdenek Kabelac, davem, Rafael J. Wysocki, Pierre Ossman, Kernel development list, pm list > Ok, so linux actually atomicity of long? No it doesn't. And even if it did you couldn't use long for this because atomic_t also ensures the points operations complete are defined. You might just about get away with volatile long * objects on x86 for simple assignments but for anything else gcc can and will generate code to update values whichever way it feels best - which includes turning long *x = a + b; into *x = a; *x += b; Alan ^ permalink raw reply [flat|nested] 92+ messages in thread
* Re: [patch] Re: using long instead of atomic_t when only set/read is required 2008-03-03 15:48 ` Alan Cox @ 2008-03-03 17:24 ` Pavel Machek 2008-03-03 20:27 ` Rafael J. Wysocki 2008-03-04 23:32 ` Peter Hartley 0 siblings, 2 replies; 92+ messages in thread From: Pavel Machek @ 2008-03-03 17:24 UTC (permalink / raw) To: Alan Cox Cc: Alan Stern, Linus Torvalds, Andrew Morton, Zdenek Kabelac, davem, Rafael J. Wysocki, Pierre Ossman, Kernel development list, pm list Hi! > > Ok, so linux actually atomicity of long? ^~-- assumes should be here. > No it doesn't. And even if it did you couldn't use long for this because > atomic_t also ensures the points operations complete are defined. You > might just about get away with volatile long * objects on x86 for simple > assignments but for anything else gcc can and will generate code to > update values whichever way it feels best - which includes turning > > long *x = a + b; > > into > > *x = a; > *x += b; Ok, I can understand the gcc side. But do we actually run on an architecture where long *x; *x = 0; racing with *x = 0x12345678; can produce *x == 0x12340000; or something like that? I'm told RCU relies on architectures not doing this, and I'd like to get this clarified. Pavel -- (english) http://www.livejournal.com/~pavelmachek (cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html ^ permalink raw reply [flat|nested] 92+ messages in thread
* Re: [patch] Re: using long instead of atomic_t when only set/read is required 2008-03-03 17:24 ` Pavel Machek @ 2008-03-03 20:27 ` Rafael J. Wysocki 2008-03-03 21:12 ` Paul E. McKenney 2008-03-03 22:23 ` Linus Torvalds 2008-03-04 23:32 ` Peter Hartley 1 sibling, 2 replies; 92+ messages in thread From: Rafael J. Wysocki @ 2008-03-03 20:27 UTC (permalink / raw) To: Pavel Machek Cc: Alan Cox, Alan Stern, Linus Torvalds, Andrew Morton, Zdenek Kabelac, davem, Pierre Ossman, Kernel development list, pm list, Paul E. McKenney On Monday, 3 of March 2008, Pavel Machek wrote: > Hi! > > > > Ok, so linux actually atomicity of long? > ^~-- assumes should be here. > > > No it doesn't. And even if it did you couldn't use long for this because > > atomic_t also ensures the points operations complete are defined. You > > might just about get away with volatile long * objects on x86 for simple > > assignments but for anything else gcc can and will generate code to > > update values whichever way it feels best - which includes turning > > > > long *x = a + b; > > > > into > > > > *x = a; > > *x += b; > > Ok, I can understand the gcc side. But do we actually run on an > architecture where > > long *x; > > *x = 0; > > racing with > > *x = 0x12345678; > > can produce > > *x == 0x12340000; > > or something like that? Well something like this could happen, in theory, on a "32-bit" architecture with a 16-bit bus. In that case, one can imagine, the first word of the first write may be sent through the bus immediately followed by the first word of the second write, followed by the second word of the second write and by the second word of the first write, in this order. > I'm told RCU relies on architectures not doing this, and I'd like to get this > clarified. Yes, it would be good to know that for sure. Rafael ^ permalink raw reply [flat|nested] 92+ messages in thread
* Re: [patch] Re: using long instead of atomic_t when only set/read is required 2008-03-03 20:27 ` Rafael J. Wysocki @ 2008-03-03 21:12 ` Paul E. McKenney 2008-03-03 22:23 ` Linus Torvalds 1 sibling, 0 replies; 92+ messages in thread From: Paul E. McKenney @ 2008-03-03 21:12 UTC (permalink / raw) To: Rafael J. Wysocki Cc: Pavel Machek, Alan Cox, Alan Stern, Linus Torvalds, Andrew Morton, Zdenek Kabelac, davem, Pierre Ossman, Kernel development list, pm list On Mon, Mar 03, 2008 at 09:27:29PM +0100, Rafael J. Wysocki wrote: > On Monday, 3 of March 2008, Pavel Machek wrote: > > Hi! > > > > > > Ok, so linux actually atomicity of long? > > ^~-- assumes should be here. > > > > > No it doesn't. And even if it did you couldn't use long for this because > > > atomic_t also ensures the points operations complete are defined. You > > > might just about get away with volatile long * objects on x86 for simple > > > assignments but for anything else gcc can and will generate code to > > > update values whichever way it feels best - which includes turning > > > > > > long *x = a + b; > > > > > > into > > > > > > *x = a; > > > *x += b; > > > > Ok, I can understand the gcc side. But do we actually run on an > > architecture where > > > > long *x; > > > > *x = 0; > > > > racing with > > > > *x = 0x12345678; > > > > can produce > > > > *x == 0x12340000; > > > > or something like that? > > Well something like this could happen, in theory, on a "32-bit" architecture > with a 16-bit bus. In that case, one can imagine, the first word of the > first write may be sent through the bus immediately followed by the first > word of the second write, followed by the second word of the second > write and by the second word of the first write, in this order. > > > I'm told RCU relies on architectures not doing this, and I'd like to get this > > clarified. > > Yes, it would be good to know that for sure. Most rcu_assign_pointer() calls are protected by locks, but there might be a few that are not. However, the case that concerns me most would be the following: o Task 0 writes the lower 16 bits of the pointer. o Task 1 reads the lower 16 bits of the pointer. o Task 1 reads the upper 16 bits of the pointer. o Task 0 writes the upper 16 bits of the pointer. This would result in task 1 getting a mish-mash of the old and new versions of the pointer. Very bad!!! RCU heavily relies on the reader seeing either the initial value of the pointer or on the value written by some single write. But doesn't this require a -multi-CPU- system with a 16-bit data path from the ALU to the L0 cache? This seems a bit unlikely. Or am I being naive about embedded CPUs? On the other hand, if you have a 32-bit single-CPU system with a 16-bit path to memory, all we need is that interrupts be restricted to happening at instruction boundaries rather than in the middle of instructions. Thanx, Paul ^ permalink raw reply [flat|nested] 92+ messages in thread
* Re: [patch] Re: using long instead of atomic_t when only set/read is required 2008-03-03 20:27 ` Rafael J. Wysocki 2008-03-03 21:12 ` Paul E. McKenney @ 2008-03-03 22:23 ` Linus Torvalds 1 sibling, 0 replies; 92+ messages in thread From: Linus Torvalds @ 2008-03-03 22:23 UTC (permalink / raw) To: Rafael J. Wysocki Cc: Pavel Machek, Alan Cox, Alan Stern, Andrew Morton, Zdenek Kabelac, davem, Pierre Ossman, Kernel development list, pm list, Paul E. McKenney On Mon, 3 Mar 2008, Rafael J. Wysocki wrote: > > Well something like this could happen, in theory, on a "32-bit" architecture > with a 16-bit bus. No it couldn't. That would only be true if there is no cache, and no cache coherency. Basically, Linux requires a cache-coherent architecture to work in SMP. Anything else is insane (except as a cluster). So there is no way we can see partial updates, except with terminally broken hardware that we would never support anyway for tons of other reasons. Linus ^ permalink raw reply [flat|nested] 92+ messages in thread
* Re: [patch] Re: using long instead of atomic_t when only set/read is required 2008-03-03 17:24 ` Pavel Machek 2008-03-03 20:27 ` Rafael J. Wysocki @ 2008-03-04 23:32 ` Peter Hartley 2008-03-04 23:54 ` Rafael J. Wysocki 1 sibling, 1 reply; 92+ messages in thread From: Peter Hartley @ 2008-03-04 23:32 UTC (permalink / raw) To: linux-kernel, Pavel Machek Cc: Alan Cox, Alan Stern, Linus Torvalds, Andrew Morton, Zdenek Kabelac, davem, Rafael J. Wysocki, Pierre Ossman On Mon, 2008-03-03 at 18:24 +0100, Pavel Machek wrote: > Ok, I can understand the gcc side. But do we actually run on an > architecture where > > long *x; > > *x = 0; > > racing with > > *x = 0x12345678; > > can produce > > *x == 0x12340000; > > or something like that? I'm told RCU relies on architectures not doing > this, and I'd like to get this clarified. ARM6, ARM7500 and similar do exactly this for short (and unsigned short), although not for int, long, or pointers: > struct foo { short b; short c; }; > void baa(struct foo *f, short cc) { f->c = cc; } becomes (arm-linux-gcc -mcpu=arm6): > baa: > mov r3, r1, lsr #8 > strb r3, [r0, #3] > strb r1, [r0, #2] > mov pc, lr note the two single-byte stores, as ARM6 didn't have the "store halfword" instruction. So I think Alan Stern's "For all properly-aligned pointer and integral types other than long long..." should be amended to "For all properly-aligned pointer and integral types other than short or long long..." Peter ^ permalink raw reply [flat|nested] 92+ messages in thread
* Re: [patch] Re: using long instead of atomic_t when only set/read is required 2008-03-04 23:32 ` Peter Hartley @ 2008-03-04 23:54 ` Rafael J. Wysocki 2008-03-05 0:26 ` Paul E. McKenney 0 siblings, 1 reply; 92+ messages in thread From: Rafael J. Wysocki @ 2008-03-04 23:54 UTC (permalink / raw) To: Peter Hartley Cc: linux-kernel, Pavel Machek, Alan Cox, Alan Stern, Linus Torvalds, Andrew Morton, Zdenek Kabelac, davem, Pierre Ossman, Paul E. McKenney On Wednesday, 5 of March 2008, Peter Hartley wrote: > On Mon, 2008-03-03 at 18:24 +0100, Pavel Machek wrote: > > Ok, I can understand the gcc side. But do we actually run on an > > architecture where > > > > long *x; > > > > *x = 0; > > > > racing with > > > > *x = 0x12345678; > > > > can produce > > > > *x == 0x12340000; > > > > or something like that? I'm told RCU relies on architectures not doing > > this, and I'd like to get this clarified. > > ARM6, ARM7500 and similar do exactly this for short (and unsigned > short), although not for int, long, or pointers: > > > struct foo { short b; short c; }; > > void baa(struct foo *f, short cc) { f->c = cc; } > > becomes (arm-linux-gcc -mcpu=arm6): > > > baa: > > mov r3, r1, lsr #8 > > strb r3, [r0, #3] > > strb r1, [r0, #2] > > mov pc, lr > > note the two single-byte stores, as ARM6 didn't have the "store > halfword" instruction. > > So I think Alan Stern's > "For all properly-aligned pointer and integral types other than long > long..." > should be amended to > "For all properly-aligned pointer and integral types other than short or > long long..." Well, perhaps it's sufficient to document just pointers? In fact this is what RCU relies on. Thanks, Rafael ^ permalink raw reply [flat|nested] 92+ messages in thread
* Re: [patch] Re: using long instead of atomic_t when only set/read is required 2008-03-04 23:54 ` Rafael J. Wysocki @ 2008-03-05 0:26 ` Paul E. McKenney 0 siblings, 0 replies; 92+ messages in thread From: Paul E. McKenney @ 2008-03-05 0:26 UTC (permalink / raw) To: Rafael J. Wysocki Cc: Peter Hartley, linux-kernel, Pavel Machek, Alan Cox, Alan Stern, Linus Torvalds, Andrew Morton, Zdenek Kabelac, davem, Pierre Ossman On Wed, Mar 05, 2008 at 12:54:03AM +0100, Rafael J. Wysocki wrote: > On Wednesday, 5 of March 2008, Peter Hartley wrote: > > On Mon, 2008-03-03 at 18:24 +0100, Pavel Machek wrote: > > > Ok, I can understand the gcc side. But do we actually run on an > > > architecture where > > > > > > long *x; > > > > > > *x = 0; > > > > > > racing with > > > > > > *x = 0x12345678; > > > > > > can produce > > > > > > *x == 0x12340000; > > > > > > or something like that? I'm told RCU relies on architectures not doing > > > this, and I'd like to get this clarified. > > > > ARM6, ARM7500 and similar do exactly this for short (and unsigned > > short), although not for int, long, or pointers: > > > > > struct foo { short b; short c; }; > > > void baa(struct foo *f, short cc) { f->c = cc; } > > > > becomes (arm-linux-gcc -mcpu=arm6): > > > > > baa: > > > mov r3, r1, lsr #8 > > > strb r3, [r0, #3] > > > strb r1, [r0, #2] > > > mov pc, lr > > > > note the two single-byte stores, as ARM6 didn't have the "store > > halfword" instruction. > > > > So I think Alan Stern's > > "For all properly-aligned pointer and integral types other than long > > long..." > > should be amended to > > "For all properly-aligned pointer and integral types other than short or > > long long..." > > Well, perhaps it's sufficient to document just pointers? In fact this is what > RCU relies on. One can do RCU on array indexes (ints/longs) as well as pointers, so we need to keep "integral" in there. Thanx, Paul ^ permalink raw reply [flat|nested] 92+ messages in thread
* Re: [Bug 10030] Suspend doesn't work when SD card is inserted 2008-02-25 2:19 ` Alan Stern 2008-02-25 9:03 ` using long instead of atomic_t when only set/read is required (was Re: [Bug 10030] Suspend doesn't work when SD card is inserted) Pavel Machek @ 2008-02-25 11:40 ` Rafael J. Wysocki 1 sibling, 0 replies; 92+ messages in thread From: Rafael J. Wysocki @ 2008-02-25 11:40 UTC (permalink / raw) To: Alan Stern Cc: Pavel Machek, Pierre Ossman, Zdenek Kabelac, Kernel development list, pm list On Monday, 25 of February 2008, Alan Stern wrote: > On Sun, 24 Feb 2008, Pavel Machek wrote: > > > > > At the very least, you'd need rmb() before reading it and wmb() after > > > > writing to it, but I'm not sure if that's enough on every obscure > > > > architecture out there. > > > > > > No, neither one is needed because of the way suspending_task is used. > > > > > > It's not necessary for a reader R to see the variable's actual value; > > > all R needs to know is whether or not suspending_task is equal to R. > > > Since the only process which can set suspending_task to R is R itself, > > > and since R will set suspending_task back to NULL before releasing the > > > write lock on pm_sleep_rwsem, there's never any ambiguity. > > > > Subtle. > > > > Very subtly wrong ;-). > > > > imagine suspending_task == 0xabcdef01. Now task "R" with current == > > 0xabcd0000 reads suspending_task while the other cpu is writing to it, > > and sees 0xabcd0000 (0xef01 was not yet written) -- and mistakenly > > believes that "R" == suspending_task. > > I always thought that reads and writes of pointers are atomic, just > like reads and writes of longs. Is that wrong? That depends on the architecture. It may be wrong on alpha, IIRC. > Now if pointers were the same size as long long then I would agree with > you. That certainly is true on x86-64. On alpha probably too. > > I agree it is very unlikely, and it will not happen on i386. But what > > about just using atomic_t suspending_task, and store current->pid into > > it? > > That would work just as well. Except that it wouldn't need to be > atomic_t, because current->pid is always an integer (not guaranteed, I > suppose, but that's what it is on all current architectures) and > reads/writes of ints _are_ atomic. That also depends on the arch, I'm afraid, and in general if we assume something to be atomic, it's better to make the code reflect that assumption. Thanks, Rafael ^ permalink raw reply [flat|nested] 92+ messages in thread
end of thread, other threads:[~2008-03-06 21:32 UTC | newest] Thread overview: 92+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- [not found] <20080219232338.E9A24108068@picon.linux-foundation.org> 2008-02-20 16:42 ` [Bug 10030] Suspend doesn't work when SD card is inserted Alan Stern 2008-02-20 17:30 ` Pierre Ossman 2008-02-20 19:26 ` Alan Stern 2008-02-20 20:51 ` Pierre Ossman 2008-02-20 20:58 ` Rafael J. Wysocki 2008-02-20 21:06 ` Alan Stern 2008-02-20 22:15 ` Rafael J. Wysocki 2008-02-20 22:24 ` Alan Stern 2008-02-20 22:41 ` Rafael J. Wysocki 2008-02-21 0:02 ` Rafael J. Wysocki 2008-02-21 16:27 ` Alan Stern 2008-02-21 16:38 ` Rafael J. Wysocki 2008-02-21 17:48 ` Alan Stern 2008-02-21 22:47 ` Rafael J. Wysocki 2008-02-21 23:05 ` Alan Stern 2008-02-23 1:30 ` Rafael J. Wysocki 2008-02-23 4:39 ` Alan Stern 2008-02-23 20:16 ` Rafael J. Wysocki 2008-02-23 23:29 ` Alan Stern 2008-02-24 0:19 ` Rafael J. Wysocki 2008-02-24 3:25 ` Alan Stern 2008-02-24 4:26 ` [linux-pm] " Alan Stern 2008-02-24 14:00 ` Rafael J. Wysocki 2008-02-24 15:33 ` Bugs in MMC [was: [Bug 10030] Suspend doesn't work when SD card is inserted] Alan Stern 2008-02-25 17:41 ` Pierre Ossman 2008-02-25 17:58 ` Alan Stern 2008-02-25 18:31 ` Pierre Ossman 2008-02-25 20:00 ` Alan Stern 2008-03-01 14:11 ` Pierre Ossman 2008-03-01 14:36 ` Alan Stern 2008-03-01 14:47 ` Pierre Ossman 2008-02-25 22:51 ` Felipe Balbi 2008-03-03 21:59 ` David Brownell 2008-03-04 6:03 ` Pierre Ossman 2008-03-04 9:44 ` David Brownell 2008-03-04 9:58 ` Pierre Ossman 2008-03-06 21:23 ` David Brownell 2008-03-04 17:53 ` Alan Stern 2008-03-04 18:53 ` David Brownell 2008-03-04 19:51 ` Alan Stern 2008-03-04 20:30 ` David Brownell 2008-03-04 21:00 ` Alan Stern 2008-03-06 15:55 ` Pavel Machek 2008-03-06 20:33 ` Alan Stern 2008-03-06 20:53 ` Zdenek Kabelac 2008-03-06 21:31 ` Rafael J. Wysocki 2008-03-04 17:50 ` Alan Stern 2008-02-24 13:33 ` [Bug 10030] Suspend doesn't work when SD card is inserted Rafael J. Wysocki 2008-02-24 20:25 ` Rafael J. Wysocki 2008-02-24 20:45 ` Alan Stern 2008-02-24 20:56 ` Rafael J. Wysocki 2008-02-24 21:11 ` Alan Stern 2008-02-24 22:18 ` Rafael J. Wysocki 2008-02-24 13:51 ` Rafael J. Wysocki 2008-02-24 19:27 ` Alan Stern 2008-02-24 19:42 ` Zdenek Kabelac 2008-02-24 20:09 ` Rafael J. Wysocki 2008-02-24 18:21 ` Pavel Machek 2008-02-24 19:03 ` Alan Stern 2008-02-24 20:11 ` Pavel Machek 2008-02-24 20:33 ` Alan Stern 2008-02-24 21:42 ` Pavel Machek 2008-02-24 22:21 ` Rafael J. Wysocki 2008-02-25 2:21 ` Alan Stern 2008-02-25 11:41 ` Rafael J. Wysocki [not found] ` <c4e36d110802250402q7312a488idf752e07db7504e8@mail.gmail.com> 2008-02-25 12:31 ` Rafael J. Wysocki 2008-02-25 2:19 ` Alan Stern 2008-02-25 9:03 ` using long instead of atomic_t when only set/read is required (was Re: [Bug 10030] Suspend doesn't work when SD card is inserted) Pavel Machek 2008-02-25 14:46 ` Alan Stern 2008-03-03 12:08 ` [patch] Re: using long instead of atomic_t when only set/read is required Pavel Machek 2008-03-03 15:42 ` Alan Stern 2008-03-03 15:53 ` Alan Cox 2008-03-03 17:11 ` Alan Stern 2008-03-03 17:26 ` Linus Torvalds 2008-03-03 17:44 ` Pavel Machek 2008-03-03 19:27 ` Alan Stern 2008-03-06 15:58 ` Mark Lord 2008-03-06 16:11 ` Linus Torvalds 2008-03-06 16:27 ` Mark Lord 2008-03-03 17:16 ` Nick Piggin 2008-03-03 17:31 ` Paul E. McKenney 2008-03-03 17:33 ` Alan Cox 2008-03-03 17:22 ` Paul E. McKenney 2008-03-03 15:48 ` Alan Cox 2008-03-03 17:24 ` Pavel Machek 2008-03-03 20:27 ` Rafael J. Wysocki 2008-03-03 21:12 ` Paul E. McKenney 2008-03-03 22:23 ` Linus Torvalds 2008-03-04 23:32 ` Peter Hartley 2008-03-04 23:54 ` Rafael J. Wysocki 2008-03-05 0:26 ` Paul E. McKenney 2008-02-25 11:40 ` [Bug 10030] Suspend doesn't work when SD card is inserted Rafael J. Wysocki
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).