* Confusing lockdep splat @ 2021-09-24 21:02 UTC

From: Paul E. McKenney
To: peterz, mingo, will, longman, boqun.feng
Cc: linux-kernel, richard

Hello!

I got the lockdep splat below from an SRCU-T rcutorture run, which uses
a !SMP !PREEMPT kernel.  This is a random event, and about half the time
it happens within an hour or two.  My reproducer (on current -rcu "dev"
branch for a 16-CPU system) is:

	tools/testing/selftests/rcutorture/bin/kvm.sh --cpus 16 --configs "16*SRCU-T" --duration 7200

My points of confusion are as follows:

1.	The locks involved in this deadlock cycle are irq-disabled
	raw spinlocks.  The claimed deadlock cycle uses two CPUs.
	There is only one CPU.  There is no possibility of preemption
	or interrupts.  So how can this deadlock actually happen?

2.	If there was more than one CPU, then yes, there would be
	a deadlock.  The PI lock is acquired by the wakeup code after
	acquiring the workqueue lock, and rcutorture tests the new
	ability of the scheduler to hold the PI lock across
	rcu_read_unlock(), and while it is at it, across the rest of
	the unlock primitives.

	But if there was more than one CPU, Tree SRCU would be used
	instead of Tiny SRCU, and there would be no wakeup invoked
	from srcu_read_unlock().

	Given only one CPU, there is no way to complete the deadlock
	cycle.

For now, I am working around this by preventing rcutorture from holding
the PI lock across Tiny srcu_read_unlock().

Am I missing something subtle here?
							Thanx, Paul

======================================================
WARNING: possible circular locking dependency detected
5.15.0-rc1+ #3766 Not tainted
------------------------------------------------------
rcu_torture_rea/53 is trying to acquire lock:
ffffffffa074e6a8 (srcu_ctl.srcu_wq.lock){....}-{2:2}, at: swake_up_one+0xa/0x30

but task is already holding lock:
ffffa03502479d80 (&p->pi_lock){-.-.}-{2:2}, at: rcutorture_one_extend+0x370/0x400

which lock already depends on the new lock.

the existing dependency chain (in reverse order) is:

-> #1 (&p->pi_lock){-.-.}-{2:2}:
       _raw_spin_lock_irqsave+0x2f/0x50
       try_to_wake_up+0x50/0x580
       swake_up_locked.part.7+0xe/0x30
       swake_up_one+0x22/0x30
       rcutorture_one_extend+0x1b6/0x400
       rcu_torture_one_read+0x290/0x5d0
       rcu_torture_reader+0xac/0x200
       kthread+0x12d/0x150
       ret_from_fork+0x22/0x30

-> #0 (srcu_ctl.srcu_wq.lock){....}-{2:2}:
       __lock_acquire+0x130c/0x2440
       lock_acquire+0xc2/0x270
       _raw_spin_lock_irqsave+0x2f/0x50
       swake_up_one+0xa/0x30
       rcutorture_one_extend+0x387/0x400
       rcu_torture_one_read+0x290/0x5d0
       rcu_torture_reader+0xac/0x200
       kthread+0x12d/0x150
       ret_from_fork+0x22/0x30

other info that might help us debug this:

 Possible unsafe locking scenario:

       CPU0                    CPU1
       ----                    ----
  lock(&p->pi_lock);
                               lock(srcu_ctl.srcu_wq.lock);
                               lock(&p->pi_lock);
  lock(srcu_ctl.srcu_wq.lock);

 *** DEADLOCK ***

1 lock held by rcu_torture_rea/53:
 #0: ffffa03502479d80 (&p->pi_lock){-.-.}-{2:2}, at: rcutorture_one_extend+0x370/0x400

stack backtrace:
CPU: 0 PID: 53 Comm: rcu_torture_rea Not tainted 5.15.0-rc1+ #3766
Hardware name: Red Hat KVM/RHEL-AV, BIOS 1.13.0-2.module_el8.5.0+746+bbd5d70c 04/01/2014
Call Trace:
 check_noncircular+0xfe/0x110
 ? find_held_lock+0x2d/0x90
 __lock_acquire+0x130c/0x2440
 lock_acquire+0xc2/0x270
 ? swake_up_one+0xa/0x30
 ? find_held_lock+0x72/0x90
 _raw_spin_lock_irqsave+0x2f/0x50
 ? swake_up_one+0xa/0x30
 swake_up_one+0xa/0x30
 rcutorture_one_extend+0x387/0x400
 rcu_torture_one_read+0x290/0x5d0
 rcu_torture_reader+0xac/0x200
 ? rcutorture_oom_notify+0xf0/0xf0
 ? __kthread_parkme+0x61/0x90
 ? rcu_torture_one_read+0x5d0/0x5d0
 kthread+0x12d/0x150
 ? set_kthread_struct+0x40/0x40
 ret_from_fork+0x22/0x30
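The two dependency chains in the splat reduce to a two-node cycle in
lockdep's class-level graph. The sketch below is a userspace toy model,
not kernel code: the lock names and the two recorded edges are taken from
the splat above, and the cycle check is a deliberate simplification of
what check_noncircular() does for arbitrary-length cycles.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy userspace model of lockdep's class-level dependency graph.
 * Not kernel code: class names and the two recorded edges are taken
 * from the splat above. */
enum { PI_LOCK, SRCU_WQ_LOCK, NCLASS };

static bool edge[NCLASS][NCLASS];  /* edge[a][b]: class b taken while a held */

static void record_acquire(int held, int taken)
{
	edge[held][taken] = true;
}

/* A two-node cycle is the simplest circular dependency; the kernel's
 * check_noncircular() generalizes this to longer cycles. */
bool splat_cycle(void)
{
	/* Chain #1: swake_up_one() holds srcu_ctl.srcu_wq.lock while
	 * try_to_wake_up() acquires p->pi_lock. */
	record_acquire(SRCU_WQ_LOCK, PI_LOCK);

	/* Chain #0: rcutorture holds p->pi_lock across Tiny SRCU's
	 * srcu_read_unlock(), whose wakeup takes srcu_ctl.srcu_wq.lock. */
	record_acquire(PI_LOCK, SRCU_WQ_LOCK);

	/* Lockdep flags the cycle even though, on one CPU with irqs
	 * disabled, both orders can never be in flight at once. */
	return edge[PI_LOCK][SRCU_WQ_LOCK] && edge[SRCU_WQ_LOCK][PI_LOCK];
}
```

This is exactly the tension Paul raises: the graph contains a cycle, but
on a single CPU with interrupts disabled the cycle is unreachable.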
* Re: Confusing lockdep splat 2021-09-24 21:02 Confusing lockdep splat Paul E. McKenney @ 2021-09-24 21:41 ` Waiman Long 2021-09-24 22:43 ` Paul E. McKenney 0 siblings, 1 reply; 7+ messages in thread From: Waiman Long @ 2021-09-24 21:41 UTC (permalink / raw) To: paulmck, peterz, mingo, will, boqun.feng; +Cc: linux-kernel, richard On 9/24/21 5:02 PM, Paul E. McKenney wrote: > Hello! > > I got the lockdep splat below from an SRCU-T rcutorture run, which uses > a !SMP !PREEMPT kernel. This is a random event, and about half the time > it happens within an hour or two. My reproducer (on current -rcu "dev" > branch for a 16-CPU system) is: > > tools/testing/selftests/rcutorture/bin/kvm.sh --cpus 16 --configs "16*SRCU-T" --duration 7200 > > My points of confusion are as follows: > > 1. The locks involved in this deadlock cycle are irq-disabled > raw spinlocks. The claimed deadlock cycle uses two CPUs. > There is only one CPU. There is no possibility of preemption > or interrupts. So how can this deadlock actually happen? > > 2. If there was more than one CPU, then yes, there would be > a deadlock. The PI lock is acquired by the wakeup code after > acquiring the workqueue lock, and rcutorture tests the new ability > of the scheduler to hold the PI lock across rcu_read_unlock(), > and while it is at it, across the rest of the unlock primitives. > > But if there was more than one CPU, Tree SRCU would be used > instead of Tiny SRCU, and there would be no wakeup invoked from > srcu_read_unlock(). > > Given only one CPU, there is no way to complete the deadlock > cycle. > > For now, I am working around this by preventing rcutorture from holding > the PI lock across Tiny srcu_read_unlock(). > > Am I missing something subtle here? I would say that the lockdep code just doesn't have enough intelligence to identify that deadlock is not possible in this special case. There are certainly false positives, and it can be hard to get rid of them. 
Cheers,
Longman
* Re: Confusing lockdep splat 2021-09-24 21:41 ` Waiman Long @ 2021-09-24 22:43 ` Paul E. McKenney 2021-09-25 0:30 ` Waiman Long 2021-09-25 2:38 ` Boqun Feng 0 siblings, 2 replies; 7+ messages in thread From: Paul E. McKenney @ 2021-09-24 22:43 UTC (permalink / raw) To: Waiman Long; +Cc: peterz, mingo, will, boqun.feng, linux-kernel, richard On Fri, Sep 24, 2021 at 05:41:17PM -0400, Waiman Long wrote: > On 9/24/21 5:02 PM, Paul E. McKenney wrote: > > Hello! > > > > I got the lockdep splat below from an SRCU-T rcutorture run, which uses > > a !SMP !PREEMPT kernel. This is a random event, and about half the time > > it happens within an hour or two. My reproducer (on current -rcu "dev" > > branch for a 16-CPU system) is: > > > > tools/testing/selftests/rcutorture/bin/kvm.sh --cpus 16 --configs "16*SRCU-T" --duration 7200 > > > > My points of confusion are as follows: > > > > 1. The locks involved in this deadlock cycle are irq-disabled > > raw spinlocks. The claimed deadlock cycle uses two CPUs. > > There is only one CPU. There is no possibility of preemption > > or interrupts. So how can this deadlock actually happen? > > > > 2. If there was more than one CPU, then yes, there would be > > a deadlock. The PI lock is acquired by the wakeup code after > > acquiring the workqueue lock, and rcutorture tests the new ability > > of the scheduler to hold the PI lock across rcu_read_unlock(), > > and while it is at it, across the rest of the unlock primitives. > > > > But if there was more than one CPU, Tree SRCU would be used > > instead of Tiny SRCU, and there would be no wakeup invoked from > > srcu_read_unlock(). > > > > Given only one CPU, there is no way to complete the deadlock > > cycle. > > > > For now, I am working around this by preventing rcutorture from holding > > the PI lock across Tiny srcu_read_unlock(). > > > > Am I missing something subtle here? 
>
> I would say that the lockdep code just doesn't have enough intelligence to
> identify that deadlock is not possible in this special case. There are
> certainly false positives, and it can be hard to get rid of them.

Would it make sense for lockdep to filter out reports involving more
than one CPU unless there is at least one sleeplock in the cycle?

Of course, it gets more complicated when interrupts are involved...

							Thanx, Paul
* Re: Confusing lockdep splat 2021-09-24 22:43 ` Paul E. McKenney @ 2021-09-25 0:30 ` Waiman Long 2021-09-25 12:55 ` Paul E. McKenney 2021-09-25 2:38 ` Boqun Feng 1 sibling, 1 reply; 7+ messages in thread From: Waiman Long @ 2021-09-25 0:30 UTC (permalink / raw) To: paulmck, Waiman Long Cc: peterz, mingo, will, boqun.feng, linux-kernel, richard On 9/24/21 6:43 PM, Paul E. McKenney wrote: > On Fri, Sep 24, 2021 at 05:41:17PM -0400, Waiman Long wrote: >> On 9/24/21 5:02 PM, Paul E. McKenney wrote: >>> Hello! >>> >>> I got the lockdep splat below from an SRCU-T rcutorture run, which uses >>> a !SMP !PREEMPT kernel. This is a random event, and about half the time >>> it happens within an hour or two. My reproducer (on current -rcu "dev" >>> branch for a 16-CPU system) is: >>> >>> tools/testing/selftests/rcutorture/bin/kvm.sh --cpus 16 --configs "16*SRCU-T" --duration 7200 >>> >>> My points of confusion are as follows: >>> >>> 1. The locks involved in this deadlock cycle are irq-disabled >>> raw spinlocks. The claimed deadlock cycle uses two CPUs. >>> There is only one CPU. There is no possibility of preemption >>> or interrupts. So how can this deadlock actually happen? >>> >>> 2. If there was more than one CPU, then yes, there would be >>> a deadlock. The PI lock is acquired by the wakeup code after >>> acquiring the workqueue lock, and rcutorture tests the new ability >>> of the scheduler to hold the PI lock across rcu_read_unlock(), >>> and while it is at it, across the rest of the unlock primitives. >>> >>> But if there was more than one CPU, Tree SRCU would be used >>> instead of Tiny SRCU, and there would be no wakeup invoked from >>> srcu_read_unlock(). >>> >>> Given only one CPU, there is no way to complete the deadlock >>> cycle. >>> >>> For now, I am working around this by preventing rcutorture from holding >>> the PI lock across Tiny srcu_read_unlock(). >>> >>> Am I missing something subtle here? 
>> I would say that the lockdep code just doesn't have enough intelligence to
>> identify that deadlock is not possible in this special case. There are
>> certainly false positives, and it can be hard to get rid of them.
> Would it make sense for lockdep to filter out reports involving more
> than one CPU unless there is at least one sleeplock in the cycle?
>
> Of course, it gets more complicated when interrupts are involved...

Actually, lockdep keeps track of all the possible lock orderings and puts
out a splat whenever those orderings suggest that a circular deadlock is
possible. It doesn't keep track of whether or not a lock is sleepable.
Also, lockdep deals with lock classes, each of which can have many
instances, so the pi_lock's in different task_struct's are all treated as
the same lock from lockdep's perspective. We can't treat the different
instances separately, or we would run out of lockdep table space very
quickly.

Cheers,
Longman
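Waiman's class-versus-instance point can be sketched with a toy model
(userspace C, not actual lockdep internals; the `toy_lock` structure and
`class_id` field are illustrative stand-ins for lockdep's class keys):

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model, not lockdep internals: ordering is tracked per lock
 * *class* (shared key), so distinct instances collapse into one node
 * in the dependency graph. */
struct toy_lock {
	int class_id;  /* stands in for the lockdep class key */
};

static bool self_edge[4];  /* self_edge[c]: class c nested inside itself */

static void toy_nest(const struct toy_lock *held, const struct toy_lock *taken)
{
	if (held->class_id == taken->class_id)
		self_edge[held->class_id] = true;  /* class-level report */
}

bool demo_two_instances_one_class(void)
{
	/* Two different tasks' pi_locks: distinct instances, one class. */
	struct toy_lock a = { .class_id = 0 };
	struct toy_lock b = { .class_id = 0 };

	toy_nest(&a, &b);      /* no instance-level deadlock here... */
	return self_edge[0];   /* ...but a class-level report anyway */
}
```

Tracking per class rather than per instance is what keeps lockdep's
tables bounded, at the cost of exactly this kind of false positive.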
* Re: Confusing lockdep splat 2021-09-25 0:30 ` Waiman Long @ 2021-09-25 12:55 ` Paul E. McKenney 0 siblings, 0 replies; 7+ messages in thread From: Paul E. McKenney @ 2021-09-25 12:55 UTC (permalink / raw) To: Waiman Long; +Cc: peterz, mingo, will, boqun.feng, linux-kernel, richard On Fri, Sep 24, 2021 at 08:30:20PM -0400, Waiman Long wrote: > On 9/24/21 6:43 PM, Paul E. McKenney wrote: > > On Fri, Sep 24, 2021 at 05:41:17PM -0400, Waiman Long wrote: > > > On 9/24/21 5:02 PM, Paul E. McKenney wrote: > > > > Hello! > > > > > > > > I got the lockdep splat below from an SRCU-T rcutorture run, which uses > > > > a !SMP !PREEMPT kernel. This is a random event, and about half the time > > > > it happens within an hour or two. My reproducer (on current -rcu "dev" > > > > branch for a 16-CPU system) is: > > > > > > > > tools/testing/selftests/rcutorture/bin/kvm.sh --cpus 16 --configs "16*SRCU-T" --duration 7200 > > > > > > > > My points of confusion are as follows: > > > > > > > > 1. The locks involved in this deadlock cycle are irq-disabled > > > > raw spinlocks. The claimed deadlock cycle uses two CPUs. > > > > There is only one CPU. There is no possibility of preemption > > > > or interrupts. So how can this deadlock actually happen? > > > > > > > > 2. If there was more than one CPU, then yes, there would be > > > > a deadlock. The PI lock is acquired by the wakeup code after > > > > acquiring the workqueue lock, and rcutorture tests the new ability > > > > of the scheduler to hold the PI lock across rcu_read_unlock(), > > > > and while it is at it, across the rest of the unlock primitives. > > > > > > > > But if there was more than one CPU, Tree SRCU would be used > > > > instead of Tiny SRCU, and there would be no wakeup invoked from > > > > srcu_read_unlock(). > > > > > > > > Given only one CPU, there is no way to complete the deadlock > > > > cycle. 
> > > >
> > > > For now, I am working around this by preventing rcutorture from holding
> > > > the PI lock across Tiny srcu_read_unlock().
> > > >
> > > > Am I missing something subtle here?
> > > I would say that the lockdep code just doesn't have enough intelligence to
> > > identify that deadlock is not possible in this special case. There are
> > > certainly false positives, and it can be hard to get rid of them.
> > Would it make sense for lockdep to filter out reports involving more
> > than one CPU unless there is at least one sleeplock in the cycle?
> >
> > Of course, it gets more complicated when interrupts are involved...
>
> Actually, lockdep keeps track of all the possible lock orderings and put out
> a splat whenever these lock orderings suggest that a circular deadlock is
> possible. It doesn't keep track if a lock is sleepable or not. Also lockdep
> deals with lock classes each of which can have many instances. So all the
> pi_lock's in different task_struct's are all treated as the same lock from
> lockdep's perspective. We can't treat all different instances separately or
> we will run out of lockdep table space very quickly.

We shouldn't need additional classes, only a marking on a given lock
class to tell whether or not it is a sleeplock.

Either way, I now have a workaround within Tiny SRCU that appears to
handle this case, so it is not as urgent as it might be.

							Thanx, Paul
* Re: Confusing lockdep splat 2021-09-24 22:43 ` Paul E. McKenney 2021-09-25 0:30 ` Waiman Long @ 2021-09-25 2:38 ` Boqun Feng 2021-09-25 12:50 ` Paul E. McKenney 1 sibling, 1 reply; 7+ messages in thread From: Boqun Feng @ 2021-09-25 2:38 UTC (permalink / raw) To: Paul E. McKenney; +Cc: Waiman Long, peterz, mingo, will, linux-kernel, richard On Fri, Sep 24, 2021 at 03:43:37PM -0700, Paul E. McKenney wrote: > On Fri, Sep 24, 2021 at 05:41:17PM -0400, Waiman Long wrote: > > On 9/24/21 5:02 PM, Paul E. McKenney wrote: > > > Hello! > > > > > > I got the lockdep splat below from an SRCU-T rcutorture run, which uses > > > a !SMP !PREEMPT kernel. This is a random event, and about half the time > > > it happens within an hour or two. My reproducer (on current -rcu "dev" > > > branch for a 16-CPU system) is: > > > > > > tools/testing/selftests/rcutorture/bin/kvm.sh --cpus 16 --configs "16*SRCU-T" --duration 7200 > > > > > > My points of confusion are as follows: > > > > > > 1. The locks involved in this deadlock cycle are irq-disabled > > > raw spinlocks. The claimed deadlock cycle uses two CPUs. > > > There is only one CPU. There is no possibility of preemption > > > or interrupts. So how can this deadlock actually happen? > > > > > > 2. If there was more than one CPU, then yes, there would be > > > a deadlock. The PI lock is acquired by the wakeup code after > > > acquiring the workqueue lock, and rcutorture tests the new ability > > > of the scheduler to hold the PI lock across rcu_read_unlock(), > > > and while it is at it, across the rest of the unlock primitives. > > > > > > But if there was more than one CPU, Tree SRCU would be used > > > instead of Tiny SRCU, and there would be no wakeup invoked from > > > srcu_read_unlock(). > > > > > > Given only one CPU, there is no way to complete the deadlock > > > cycle. > > > > > > For now, I am working around this by preventing rcutorture from holding > > > the PI lock across Tiny srcu_read_unlock(). 
> > >
> > > Am I missing something subtle here?
> >
> > I would say that the lockdep code just doesn't have enough intelligence to
> > identify that deadlock is not possible in this special case. There are
> > certainly false positives, and it can be hard to get rid of them.
>
> Would it make sense for lockdep to filter out reports involving more
> than one CPU unless there is at least one sleeplock in the cycle?
>

I think SRCU is special here, because it has different implementations
in SMP and UP. For other code, if the implementation in SMP and UP is
the same, we want lockdep to detect the deadlock even if it's not in UP.

We can provide an annotation similar to data_race() for SRCU to mark
UP-only code

	#define LOCKDEP_UP_ONLY(expr) ({			\
		BUILD_BUG_ON(IS_ENABLED(CONFIG_SMP));		\
		<disable lockdep>				\
		<...> v = expr;					\
		<enable lockdep>				\
		v						\
	})

and in __srcu_read_unlock():

	LOCKDEP_UP_ONLY(swake_up_one(...));

Thoughts?

Regards,
Boqun

> Of course, it gets more complicated when interrupts are involved...
>
> 							Thanx, Paul
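A minimal userspace sketch of how Boqun's proposed annotation might
behave follows. The lockdep_off()/lockdep_on() stubs stand in for the
kernel's mechanism for hiding a region from lockdep, and everything
other than the LOCKDEP_UP_ONLY name is illustrative, not the real
implementation:

```c
#include <assert.h>

/* Userspace sketch; lockdep_off()/lockdep_on() are stubs standing in
 * for the kernel's way of hiding a region from lockdep. */
int lockdep_enabled = 1;
int seen_enabled = -1;

static void lockdep_off(void) { lockdep_enabled = 0; }
static void lockdep_on(void)  { lockdep_enabled = 1; }

/* A kernel version would also keep Boqun's
 * BUILD_BUG_ON(IS_ENABLED(CONFIG_SMP)) guard; it is omitted here so
 * the sketch compiles in userspace.  Uses a GNU statement expression,
 * as the original pseudocode does. */
#define LOCKDEP_UP_ONLY(expr) ({		\
	__typeof__(expr) __v;			\
	lockdep_off();				\
	__v = (expr);				\
	lockdep_on();				\
	__v;					\
})

static int fake_wakeup(void)
{
	seen_enabled = lockdep_enabled;  /* record state during the "wakeup" */
	return 42;
}

int demo(void)
{
	/* Stand-in for: LOCKDEP_UP_ONLY(swake_up_one(...)); */
	return LOCKDEP_UP_ONLY(fake_wakeup());
}
```

The point of the shape is that the expression's value passes through
unchanged while its lock acquisitions are invisible to lockdep.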
* Re: Confusing lockdep splat 2021-09-25 2:38 ` Boqun Feng @ 2021-09-25 12:50 ` Paul E. McKenney 0 siblings, 0 replies; 7+ messages in thread From: Paul E. McKenney @ 2021-09-25 12:50 UTC (permalink / raw) To: Boqun Feng; +Cc: Waiman Long, peterz, mingo, will, linux-kernel, richard On Sat, Sep 25, 2021 at 10:38:28AM +0800, Boqun Feng wrote: > On Fri, Sep 24, 2021 at 03:43:37PM -0700, Paul E. McKenney wrote: > > On Fri, Sep 24, 2021 at 05:41:17PM -0400, Waiman Long wrote: > > > On 9/24/21 5:02 PM, Paul E. McKenney wrote: > > > > Hello! > > > > > > > > I got the lockdep splat below from an SRCU-T rcutorture run, which uses > > > > a !SMP !PREEMPT kernel. This is a random event, and about half the time > > > > it happens within an hour or two. My reproducer (on current -rcu "dev" > > > > branch for a 16-CPU system) is: > > > > > > > > tools/testing/selftests/rcutorture/bin/kvm.sh --cpus 16 --configs "16*SRCU-T" --duration 7200 > > > > > > > > My points of confusion are as follows: > > > > > > > > 1. The locks involved in this deadlock cycle are irq-disabled > > > > raw spinlocks. The claimed deadlock cycle uses two CPUs. > > > > There is only one CPU. There is no possibility of preemption > > > > or interrupts. So how can this deadlock actually happen? > > > > > > > > 2. If there was more than one CPU, then yes, there would be > > > > a deadlock. The PI lock is acquired by the wakeup code after > > > > acquiring the workqueue lock, and rcutorture tests the new ability > > > > of the scheduler to hold the PI lock across rcu_read_unlock(), > > > > and while it is at it, across the rest of the unlock primitives. > > > > > > > > But if there was more than one CPU, Tree SRCU would be used > > > > instead of Tiny SRCU, and there would be no wakeup invoked from > > > > srcu_read_unlock(). > > > > > > > > Given only one CPU, there is no way to complete the deadlock > > > > cycle. 
> > > >
> > > > For now, I am working around this by preventing rcutorture from holding
> > > > the PI lock across Tiny srcu_read_unlock().
> > > >
> > > > Am I missing something subtle here?
> > >
> > > I would say that the lockdep code just doesn't have enough intelligence to
> > > identify that deadlock is not possible in this special case. There are
> > > certainly false positives, and it can be hard to get rid of them.
> >
> > Would it make sense for lockdep to filter out reports involving more
> > than one CPU unless there is at least one sleeplock in the cycle?
>
> I think SRCU is special here, because it has different implementations
> in SMP and UP. For other code, if the implementation in SMP and UP is
> the same, we want lockdep to detect the deadlock even if it's not in UP.

Ah, fair point!

There are a few others, for example, kernel/up.c, but it seems to just
disable interrupts as its "big UP kernel lock".

> We can provide an annotation similar to data_race() for SRCU to mark
> UP-only code
>
>	#define LOCKDEP_UP_ONLY(expr) ({			\
>		BUILD_BUG_ON(IS_ENABLED(CONFIG_SMP));		\
>		<disable lockdep>				\
>		<...> v = expr;					\
>		<enable lockdep>				\
>		v						\
>	})
>
> and in __srcu_read_unlock():
>
>	LOCKDEP_UP_ONLY(swake_up_one(...));
>
> Thoughts?

With the workaround I have now, all is well unless someone needs to hold
a PI lock across an rcu_read_unlock(), which seems unlikely. If such a
case does arise, lockdep will let us know, in which case what you are
suggesting might be a good way to go. Alternatively, I could use the
trick that RCU Tasks Trace uses, with the swake_up_one() deferred to an
irq_work_queue() handler.

It does appear that !SMP kernels are nowhere near as important to the
community as they were 20 years ago. ;-)

							Thanx, Paul
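The irq_work deferral trick Paul mentions can be sketched as follows. This
is a toy userspace model, not the RCU Tasks Trace code: the function names
are hypothetical, and the queue/handler pair stands in for the kernel's
irq_work_queue() and its handler. The unlock path only queues the wakeup,
and the handler delivers it after the lock is dropped, so the
wq.lock-under-pi_lock edge is never created:

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model of deferring the wakeup: queue it from the unlock path
 * (irq_work_queue() in the kernel) and deliver it from a handler that
 * runs after the lock is dropped.  Not the actual RCU Tasks Trace
 * code; all names here are illustrative. */
static bool pi_lock_held;
static bool wakeup_pending;
static bool woke_under_lock;

static void srcu_unlock_fastpath(void)
{
	wakeup_pending = true;  /* only queue; no swake_up_one() here */
}

static void irq_work_handler(void)
{
	if (wakeup_pending) {
		woke_under_lock = pi_lock_held;  /* would be the bad edge */
		wakeup_pending = false;          /* "wakeup" delivered */
	}
}

bool demo_deferred_wakeup(void)
{
	pi_lock_held = true;      /* reader holds p->pi_lock... */
	srcu_unlock_fastpath();   /* ...unlock defers instead of waking */
	pi_lock_held = false;     /* pi_lock released */
	irq_work_handler();       /* deferred wakeup runs lock-free */

	return !woke_under_lock && !wakeup_pending;
}
```

Because the wakeup no longer nests inside the reader's lock, lockdep
never records the pi_lock to srcu_wq.lock edge in the first place.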