* Need lockdep help @ 2007-12-02 19:45 Alan Stern 2007-12-02 20:04 ` Arjan van de Ven 2007-12-03 10:36 ` Jarek Poplawski 0 siblings, 2 replies; 11+ messages in thread From: Alan Stern @ 2007-12-02 19:45 UTC (permalink / raw) To: Ingo Molnar; +Cc: Linux-pm mailing list, Kernel development list Ingo: I ran into a lockdep reporting issue just now with some new code under development. I think it's a false positive; the question is how best to deal with it. Here's the situation. The new code runs during a system sleep (i.e., suspend or hibernation). Certain activities have to be deferred during a system sleep, so I defined an rwsem: system_sleep_in_progress_rwsem. Subroutines carrying out these activities acquire a read lock on the rwsem, so normally they proceed with no hindrance. During a sleep transition, I acquire a write lock -- this is done via a PM-notifier callout routine. That is, during a PM_HIBERNATION_PREPARE or PM_SUSPEND_PREPARE notification the routine does down_write(), and during a PM_POST_HIBERNATION or PM_POST_SUSPEND notification the routine does up_write(). The problem is that the notifier chain itself is under the control of an rwsem (to prevent the chain from being modified while it is in use). The resulting actions look like this: System sleep start: down_read(notifier-chain rwsem); call the notifier routine down_write(&system_sleep_in_progress_rwsem); up_read(notifier-chain rwsem); System sleep end: down_read(notifier-chain rwsem); call the notifier routine up_write(&system_sleep_in_progress_rwsem); up_read(notifier-chain rwsem); This creates a lockdep violation; each rwsem in turn is locked while the other is being held. However the only way this could lead to deadlock would be if there was already a bug in the system Power Management code (overlapping notifications). Perhaps the system_sleep_in_progress_rwsem could be replaced by some other sort of synchronizing mechanism, but I don't want to change it -- an rwsem really does seem to be the correct thing to use here. What do you suggest? Alan Stern ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: Need lockdep help 2007-12-02 19:45 Need lockdep help Alan Stern @ 2007-12-02 20:04 ` Arjan van de Ven 2007-12-02 20:08 ` Alan Stern 2007-12-03 10:36 ` Jarek Poplawski 1 sibling, 1 reply; 11+ messages in thread From: Arjan van de Ven @ 2007-12-02 20:04 UTC (permalink / raw) To: Alan Stern; +Cc: Ingo Molnar, Linux-pm mailing list, Kernel development list On Sun, 2 Dec 2007 14:45:32 -0500 (EST) Alan Stern <stern@rowland.harvard.edu> wrote: > Ingo: > > I ran into a lockdep reporting issue just now with some new code > under development. I think it's a false positive; the question is > how best to deal with it. > > Here's the situation. The new code runs during a system sleep (i.e., > suspend or hibernation). Certain activities have to be deferred > during a system sleep, so I defined an rwsem: > system_sleep_in_progress_rwsem. > > Subroutines carrying out these activities acquire a read lock on the > rwsem, so normally they proceed with no hindrance. During a sleep > transition, I acquire a write lock -- this is done via a PM-notifier > callout routine. That is, during a PM_HIBERNATION_PREPARE or > PM_SUSPEND_PREPARE notification the routine does down_write(), and > during a PM_POST_HIBERNATION or PM_POST_SUSPEND notification the > routine does up_write(). > > The problem is that the notifier chain itself is under the control of > an rwsem (to prevent the chain from being modified while it is in > use). The resulting actions look like this: > > System sleep start: > down_read(notifier-chain rwsem); > call the notifier routine > down_write(&system_sleep_in_progress_rwsem); > up_read(notifier-chain rwsem); > > System sleep end: > down_read(notifier-chain rwsem); > call the notifier routine > up_write(&system_sleep_in_progress_rwsem); > up_read(notifier-chain rwsem); > > This creates a lockdep violation; each rwsem in turn is locked while > the other is being held. However the only way this could lead to > deadlock would be if there was already a bug in the system Power > Management code (overlapping notifications). or.. modifications to the notifier chain while all this is happening (remember: rwsems are fair, once a writer shows up, all readers wait) -- If you want to reach me at my work email, use arjan@linux.intel.com For development, discussion and tips for power savings, visit http://www.lesswatts.org ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: Need lockdep help 2007-12-02 20:04 ` Arjan van de Ven @ 2007-12-02 20:08 ` Alan Stern 0 siblings, 0 replies; 11+ messages in thread From: Alan Stern @ 2007-12-02 20:08 UTC (permalink / raw) To: Arjan van de Ven Cc: Ingo Molnar, Linux-pm mailing list, Kernel development list On Sun, 2 Dec 2007, Arjan van de Ven wrote: > > This creates a lockdep violation; each rwsem in turn is locked while > > the other is being held. However the only way this could lead to > > deadlock would be if there was already a bug in the system Power > > Management code (overlapping notifications). > > or.. modifications to the notifier chain while all this is happening > (remember: rwsems are fair, once a writer shows up, all readers wait) But modifications to the notifier chain don't invoke the callout routines. Hence they won't try to lock the new rwsem and won't lead to deadlock. Alan Stern ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: Need lockdep help 2007-12-02 19:45 Need lockdep help Alan Stern 2007-12-02 20:04 ` Arjan van de Ven @ 2007-12-03 10:36 ` Jarek Poplawski 2007-12-03 15:08 ` Alan Stern 1 sibling, 1 reply; 11+ messages in thread From: Jarek Poplawski @ 2007-12-03 10:36 UTC (permalink / raw) To: Alan Stern Cc: Ingo Molnar, Linux-pm mailing list, Kernel development list, Peter Zijlstra On 02-12-2007 20:45, Alan Stern wrote: > Ingo: > > I ran into a lockdep reporting issue just now with some new code under > development. I think it's a false positive; the question is how best > to deal with it. > > Here's the situation. The new code runs during a system sleep (i.e., > suspend or hibernation). Certain activities have to be deferred during > a system sleep, so I defined an rwsem: system_sleep_in_progress_rwsem. > > Subroutines carrying out these activities acquire a read lock on the > rwsem, so normally they proceed with no hindrance. During a sleep > transition, I acquire a write lock -- this is done via a PM-notifier > callout routine. That is, during a PM_HIBERNATION_PREPARE or > PM_SUSPEND_PREPARE notification the routine does down_write(), and > during a PM_POST_HIBERNATION or PM_POST_SUSPEND notification the > routine does up_write(). > > The problem is that the notifier chain itself is under the control of > an rwsem (to prevent the chain from being modified while it is in use). > The resulting actions look like this: > > System sleep start: > down_read(notifier-chain rwsem); > call the notifier routine > down_write(&system_sleep_in_progress_rwsem); > up_read(notifier-chain rwsem); > > System sleep end: > down_read(notifier-chain rwsem); > call the notifier routine > up_write(&system_sleep_in_progress_rwsem); > up_read(notifier-chain rwsem); > > This creates a lockdep violation; each rwsem in turn is locked while > the other is being held. However the only way this could lead to > deadlock would be if there was already a bug in the system Power > Management code (overlapping notifications). Actually, IMHO, there is no reason for any lockdep violation: thread #1: has down_read(A); waits for #2 to down_write(B) thread #2: has down_write(B); never waits for #1 to down_read(A) So, deadlock isn't possible here. If lockdep reports something else it should be fixed (and you'd be right to omit lockdep until this is done). Regards, Jarek P. PS: Peter Zijlstra added to CC ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: Need lockdep help 2007-12-03 10:36 ` Jarek Poplawski @ 2007-12-03 15:08 ` Alan Stern 2007-12-03 23:25 ` Jarek Poplawski 0 siblings, 1 reply; 11+ messages in thread From: Alan Stern @ 2007-12-03 15:08 UTC (permalink / raw) To: Jarek Poplawski Cc: Ingo Molnar, Linux-pm mailing list, Kernel development list, Peter Zijlstra On Mon, 3 Dec 2007, Jarek Poplawski wrote: > > System sleep start: > > down_read(notifier-chain rwsem); > > call the notifier routine > > down_write(&system_sleep_in_progress_rwsem); > > up_read(notifier-chain rwsem); > > > > System sleep end: > > down_read(notifier-chain rwsem); > > call the notifier routine > > up_write(&system_sleep_in_progress_rwsem); > > up_read(notifier-chain rwsem); > > > > This creates a lockdep violation; each rwsem in turn is locked while > > the other is being held. However the only way this could lead to > > deadlock would be if there was already a bug in the system Power > > Management code (overlapping notifications). > > Actually, IMHO, there is no reason for any lockdep violation: > > thread #1: has down_read(A); waits for #2 to down_write(B) > thread #2: has down_write(B); never waits for #1 to down_read(A) > > So, deadlock isn't possible here. If lockdep reports something else it > should be fixed (and you'd be right to omit lockdep until this is > done). I think the reasoning goes the way Arjan described. Suppose in between #1 and #2 there is thread #3 trying to do down_write(A) and waiting for #1. Then thread #2 doesn't have to wait for #1 directly, but it would have to wait for #3. In my case the simplest answer appears to be the replace the rwsem with something slightly more complicated (a mutex plus a boolean flag -- the loss of concurrency won't matter much since it isn't on a hot path). Thanks for the comment. Alan Stern ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: Need lockdep help 2007-12-03 15:08 ` Alan Stern @ 2007-12-03 23:25 ` Jarek Poplawski 2007-12-04 15:17 ` Alan Stern 0 siblings, 1 reply; 11+ messages in thread From: Jarek Poplawski @ 2007-12-03 23:25 UTC (permalink / raw) To: Alan Stern Cc: Jarek Poplawski, Ingo Molnar, Linux-pm mailing list, Kernel development list, Peter Zijlstra Alan Stern wrote, On 12/03/2007 04:08 PM: > On Mon, 3 Dec 2007, Jarek Poplawski wrote: > >>> System sleep start: >>> down_read(notifier-chain rwsem); >>> call the notifier routine >>> down_write(&system_sleep_in_progress_rwsem); >>> up_read(notifier-chain rwsem); >>> >>> System sleep end: >>> down_read(notifier-chain rwsem); >>> call the notifier routine >>> up_write(&system_sleep_in_progress_rwsem); >>> up_read(notifier-chain rwsem); >>> >>> This creates a lockdep violation; each rwsem in turn is locked while >>> the other is being held. However the only way this could lead to >>> deadlock would be if there was already a bug in the system Power >>> Management code (overlapping notifications). >> Actually, IMHO, there is no reason for any lockdep violation: >> >> thread #1: has down_read(A); waits for #2 to down_write(B) >> thread #2: has down_write(B); never waits for #1 to down_read(A) >> >> So, deadlock isn't possible here. If lockdep reports something else it >> should be fixed (and you'd be right to omit lockdep until this is >> done). > > I think the reasoning goes the way Arjan described. Suppose in between > #1 and #2 there is thread #3 trying to do down_write(A) and waiting for > #1. Then thread #2 doesn't have to wait for #1 directly, but it would > have to wait for #3. As a matter of fact I completely missed Arjan's point because I thought you described these locks according to lockdep's report, and there is an information about read or write... Since, you still seem to guess above about possible scenario, maybe it would be easier to show this report (and maybe a piece of this code if possible)? Btw., if it were like you're suggesting, it still shouldn't make any difference: if thread #3 is only waiting for the lock taken for reading, then I can't see why thread #2 has to wait for anything. Probably more dangerous, at least for lockdep, could look taking down_write(A) by thread #1 in between, but if it all were possible only within one thread, then still there should be no reason to change good program only to please lockdep. > In my case the simplest answer appears to be the replace the rwsem > with something slightly more complicated (a mutex plus a boolean flag > -- the loss of concurrency won't matter much since it isn't on a hot > path). I'm not sure I can understand your plan, but I doubt there should be such problems with taking rwsem for sleeping, so maybe it would be better to figure out what really scares lockdep, to fix the right place? Jarek P. ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: Need lockdep help 2007-12-03 23:25 ` Jarek Poplawski @ 2007-12-04 15:17 ` Alan Stern 2007-12-04 19:02 ` Jarek Poplawski 2007-12-04 21:00 ` Ingo Molnar 0 siblings, 2 replies; 11+ messages in thread From: Alan Stern @ 2007-12-04 15:17 UTC (permalink / raw) To: Jarek Poplawski Cc: Jarek Poplawski, Ingo Molnar, Linux-pm mailing list, Kernel development list, Peter Zijlstra On Tue, 4 Dec 2007, Jarek Poplawski wrote: > Alan Stern wrote, On 12/03/2007 04:08 PM: > > > On Mon, 3 Dec 2007, Jarek Poplawski wrote: > > > >> Actually, IMHO, there is no reason for any lockdep violation: > >> > >> thread #1: has down_read(A); waits for #2 to down_write(B) > >> thread #2: has down_write(B); never waits for #1 to down_read(A) > >> > >> So, deadlock isn't possible here. If lockdep reports something else it > >> should be fixed (and you'd be right to omit lockdep until this is > >> done). > > > > I think the reasoning goes the way Arjan described. Suppose in between > > #1 and #2 there is thread #3 trying to do down_write(A) and waiting for > > #1. Then thread #2 doesn't have to wait for #1 directly, but it would > > have to wait for #3. > > > As a matter of fact I completely missed Arjan's point because I thought > you described these locks according to lockdep's report, and there is > an information about read or write... Since, you still seem to guess > above about possible scenario, maybe it would be easier to show this > report (and maybe a piece of this code if possible)? I have changed my new code, so now the problem is avoided. > Btw., if it were like you're suggesting, it still shouldn't make any > difference: if thread #3 is only waiting for the lock taken for reading, > then I can't see why thread #2 has to wait for anything. The deadlock occurs because: Thread #1 has locked A for reading and is waiting for #2 to release B so that #1 can lock B for writing. Thread #3 is waiting for #1 to release A so that #3 can lock A for writing. Thread #2 has locked B for writing and is waiting for #3 to finish using A so that #2 can lock A for reading. So #1 is waiting for #2, #2 is waiting for #3, and #3 is waiting for #1. Lockdep warns because this might happen. Perhaps the point you're missing is that if #3 does down_write(A) before #2 does down_read(A) then #2 has to wait until #3 releases the write lock. #2 isn't allowed to sneak in front of #3; that's what Arjan meant when he said that rwsems are fair. The same sort of problem occurs when a thread does nested read locks on a single rwsem. If a different thread were to try to acquire a write lock in between, another deadlock would occur. That's why lockdep warns about nested (or recursive) read locks. > Probably more > dangerous, at least for lockdep, could look taking down_write(A) by > thread #1 in between, but if it all were possible only within one > thread, then still there should be no reason to change good program > only to please lockdep. I believe the idea behind lockdep is to warn about situations that might reasonably be expected to occur. A thread deadlocking itself by doing down_read(A) followed immediately by down_write(A) isn't reasonable, so lockdep doesn't worry about that possibility. > > In my case the simplest answer appears to be the replace the rwsem > > with something slightly more complicated (a mutex plus a boolean flag > > -- the loss of concurrency won't matter much since it isn't on a hot > > path). > > I'm not sure I can understand your plan, but I doubt there should be > such problems with taking rwsem for sleeping, so maybe it would be > better to figure out what really scares lockdep, to fix the right place? The real problem is that lockdep has to make some generalizations. It can't be aware of the details of every possible situation, and it doesn't have a global view of the entire kernel, so it doesn't know when special circumstances make deadlock impossible. Furthermore, in this case deadlock isn't really impossible -- it could occur if there were a bug somewhere else in the kernel. So lockdep was correct to warn that deadlock might occur. Alan Stern ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: Need lockdep help 2007-12-04 15:17 ` Alan Stern @ 2007-12-04 19:02 ` Jarek Poplawski 2007-12-04 19:28 ` Alan Stern 2007-12-04 21:00 ` Ingo Molnar 1 sibling, 1 reply; 11+ messages in thread From: Jarek Poplawski @ 2007-12-04 19:02 UTC (permalink / raw) To: Alan Stern Cc: Jarek Poplawski, Ingo Molnar, Linux-pm mailing list, Kernel development list, Peter Zijlstra Alan Stern wrote, On 12/04/2007 04:17 PM: ... > Furthermore, in this case deadlock isn't really impossible -- it could > occur if there were a bug somewhere else in the kernel. So lockdep was > correct to warn that deadlock might occur. Alan, if the scenario was like you described at the beginning, there was no deadlock possible, unless some errors in the notifier. These #1-#3 threads were only helpful to guess what lockdep could 'think', but I guess notifier doesn't use 2 rivaling threads for a wake, so, lockdep probably needed additional information. And you really can't consider any hypothetical kernel bugs here because then each lock is vulnerable. Regards, Jarek P. ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: Need lockdep help 2007-12-04 19:02 ` Jarek Poplawski @ 2007-12-04 19:28 ` Alan Stern 2007-12-04 20:14 ` Jarek Poplawski 0 siblings, 1 reply; 11+ messages in thread From: Alan Stern @ 2007-12-04 19:28 UTC (permalink / raw) To: Jarek Poplawski Cc: Jarek Poplawski, Ingo Molnar, Linux-pm mailing list, Kernel development list, Peter Zijlstra On Tue, 4 Dec 2007, Jarek Poplawski wrote: > Alan Stern wrote, On 12/04/2007 04:17 PM: > ... > > > Furthermore, in this case deadlock isn't really impossible -- it could > > occur if there were a bug somewhere else in the kernel. So lockdep was > > correct to warn that deadlock might occur. > > > Alan, if the scenario was like you described at the beginning, there was > no deadlock possible, unless some errors in the notifier. Or errors in the notifier's caller. > These #1-#3 > threads were only helpful to guess what lockdep could 'think', but I > guess notifier doesn't use 2 rivaling threads for a wake, so, lockdep > probably needed additional information. And you really can't consider > any hypothetical kernel bugs here because then each lock is vulnerable. But you have to consider hypothetical kernel bugs. That's exactly what lockdep is for -- to warn you about possible deadlocks that could be caused by bugs. As a simple example, if thread #1 does "lock(A); lock(B)" and thread #2 does "lock(B); lock(A)" then there's a possible bug. Lockdep should warn about you, and it does -- even if those two threads can never run at the same time. If lockdep warned about deadlocks only when they actually happened, it wouldn't be nearly so useful. Alan Stern ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: Need lockdep help 2007-12-04 19:28 ` Alan Stern @ 2007-12-04 20:14 ` Jarek Poplawski 0 siblings, 0 replies; 11+ messages in thread From: Jarek Poplawski @ 2007-12-04 20:14 UTC (permalink / raw) To: Alan Stern Cc: Jarek Poplawski, Ingo Molnar, Linux-pm mailing list, Kernel development list, Peter Zijlstra Alan Stern wrote, On 12/04/2007 08:28 PM: > On Tue, 4 Dec 2007, Jarek Poplawski wrote: ... > But you have to consider hypothetical kernel bugs. That's exactly what > lockdep is for -- to warn you about possible deadlocks that could be > caused by bugs. > > As a simple example, if thread #1 does "lock(A); lock(B)" and thread > #2 does "lock(B); lock(A)" then there's a possible bug. Lockdep should > warn about you, and it does -- even if those two threads can never run > at the same time. > > If lockdep warned about deadlocks only when they actually happened, it > wouldn't be nearly so useful. Sure! I probably missed your point... Lockdep always names reported locks, so I meant 'hypothetical' only trying to explain lockdep with some other, unknown or unnamed bugs. So, depending on the code, above example with A & B could be a real bug (even if very improbable but logically justified) or a false alarm (eg. when we know both threads could never work at the same). Jarek P. ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: Need lockdep help 2007-12-04 15:17 ` Alan Stern 2007-12-04 19:02 ` Jarek Poplawski @ 2007-12-04 21:00 ` Ingo Molnar 1 sibling, 0 replies; 11+ messages in thread From: Ingo Molnar @ 2007-12-04 21:00 UTC (permalink / raw) To: Alan Stern Cc: Jarek Poplawski, Jarek Poplawski, Linux-pm mailing list, Kernel development list, Peter Zijlstra * Alan Stern <stern@rowland.harvard.edu> wrote: > > I'm not sure I can understand your plan, but I doubt there should be > > such problems with taking rwsem for sleeping, so maybe it would be > > better to figure out what really scares lockdep, to fix the right > > place? > > The real problem is that lockdep has to make some generalizations. It > can't be aware of the details of every possible situation, and it > doesn't have a global view of the entire kernel, so it doesn't know > when special circumstances make deadlock impossible. > > Furthermore, in this case deadlock isn't really impossible -- it could > occur if there were a bug somewhere else in the kernel. So lockdep > was correct to warn that deadlock might occur. ok. Thanks for resolving this by working it around - i _think_ in the long run we'd like to make all locking "simpler" - not just for the sake of lockdep, but also for the sake of human reviewability. So in that sense, even though in this particular case you are fully right that lockdep was wrong, we benefit long-term. Ingo ^ permalink raw reply [flat|nested] 11+ messages in thread
end of thread, other threads:[~2007-12-04 21:00 UTC | newest] Thread overview: 11+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2007-12-02 19:45 Need lockdep help Alan Stern 2007-12-02 20:04 ` Arjan van de Ven 2007-12-02 20:08 ` Alan Stern 2007-12-03 10:36 ` Jarek Poplawski 2007-12-03 15:08 ` Alan Stern 2007-12-03 23:25 ` Jarek Poplawski 2007-12-04 15:17 ` Alan Stern 2007-12-04 19:02 ` Jarek Poplawski 2007-12-04 19:28 ` Alan Stern 2007-12-04 20:14 ` Jarek Poplawski 2007-12-04 21:00 ` Ingo Molnar
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).