* Lockdep splat in -next
@ 2021-06-28 15:37 Paul E. McKenney
  2021-06-28 16:08 ` Jason Gunthorpe
  0 siblings, 1 reply; 3+ messages in thread
From: Paul E. McKenney @ 2021-06-28 15:37 UTC (permalink / raw)
  To: liuyixian; +Cc: jgg, liweihang, sfr, linux-kernel, linux-next

Hello, Yixian Liu,

The following -next commit results in a lockdep splat:

591f762b2750 ("RDMA/hns: Remove the condition of light load for posting DWQE")

The repeat-by is as follows:

tools/testing/selftests/rcutorture/bin/kvm.sh --allcpus --trust-make --duration 1 --configs TASKS01

The resulting splat is as shown below.  This appears to have been either
fixed or obscured by a later commit, but it does affect bisectability,
which I found out about the hard way.  ;-)

							Thanx, Paul

======================================================
WARNING: possible circular locking dependency detected
5.13.0-rc1+ #2218 Not tainted
------------------------------------------------------
rcu_torture_sta/66 is trying to acquire lock:
ffffffffa0063c90 (cpu_hotplug_lock){++++}-{0:0}, at: static_key_enable+0x9/0x20

but task is already holding lock:
ffffffffa01754e8 (slab_mutex){+.+.}-{3:3}, at: kmem_cache_create_usercopy+0x2d/0x250

which lock already depends on the new lock.


the existing dependency chain (in reverse order) is:

-> #1 (slab_mutex){+.+.}-{3:3}:
       __mutex_lock+0x99/0x950
       slub_cpu_dead+0x15/0xf0
       cpuhp_invoke_callback+0x181/0x850
       cpuhp_invoke_callback_range+0x3b/0x80
       _cpu_down+0xdf/0x2a0
       cpu_down+0x2c/0x50
       device_offline+0x82/0xb0
       remove_cpu+0x1a/0x30
       torture_offline+0x80/0x140
       torture_onoff+0x147/0x260
       kthread+0x123/0x160
       ret_from_fork+0x22/0x30

-> #0 (cpu_hotplug_lock){++++}-{0:0}:
       __lock_acquire+0x12e6/0x27c0
       lock_acquire+0xc8/0x3a0
       cpus_read_lock+0x26/0xb0
       static_key_enable+0x9/0x20
       __kmem_cache_create+0x39e/0x440
       kmem_cache_create_usercopy+0x146/0x250
       kmem_cache_create+0xd/0x10
       rcu_torture_stats+0x79/0x280
       kthread+0x123/0x160
       ret_from_fork+0x22/0x30

other info that might help us debug this:

 Possible unsafe locking scenario:

       CPU0                    CPU1
       ----                    ----
  lock(slab_mutex);
                               lock(cpu_hotplug_lock);
                               lock(slab_mutex);
  lock(cpu_hotplug_lock);

 *** DEADLOCK ***

1 lock held by rcu_torture_sta/66:
 #0: ffffffffa01754e8 (slab_mutex){+.+.}-{3:3}, at: kmem_cache_create_usercopy+0x2d/0x250

stack backtrace:
CPU: 1 PID: 66 Comm: rcu_torture_sta Not tainted 5.13.0-rc1+ #2218
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.13.0-1ubuntu1.1 04/01/2014
Call Trace:
 dump_stack+0x6d/0x89
 check_noncircular+0xfe/0x110
 ? rcu_read_lock_sched_held+0x4d/0x80
 ? __alloc_pages+0x329/0x360
 __lock_acquire+0x12e6/0x27c0
 lock_acquire+0xc8/0x3a0
 ? static_key_enable+0x9/0x20
 cpus_read_lock+0x26/0xb0
 ? static_key_enable+0x9/0x20
 static_key_enable+0x9/0x20
 __kmem_cache_create+0x39e/0x440
 kmem_cache_create_usercopy+0x146/0x250
 ? rcu_torture_stats_print+0xd0/0xd0
 kmem_cache_create+0xd/0x10
 rcu_torture_stats+0x79/0x280
 kthread+0x123/0x160
 ? kthread_park+0x80/0x80
 ret_from_fork+0x22/0x30


* Re: Lockdep splat in -next
  2021-06-28 15:37 Lockdep splat in -next Paul E. McKenney
@ 2021-06-28 16:08 ` Jason Gunthorpe
  2021-06-28 22:15   ` Paul E. McKenney
  0 siblings, 1 reply; 3+ messages in thread
From: Jason Gunthorpe @ 2021-06-28 16:08 UTC (permalink / raw)
  To: Paul E. McKenney; +Cc: liuyixian, liweihang, sfr, linux-kernel, linux-next

On Mon, Jun 28, 2021 at 08:37:46AM -0700, Paul E. McKenney wrote:
> Hello, Yixian Liu,
> 
> The following -next commit results in a lockdep splat:
> 
> 591f762b2750 ("RDMA/hns: Remove the condition of light load for posting DWQE")
> 
> The repeat-by is as follows:
> 
> tools/testing/selftests/rcutorture/bin/kvm.sh --allcpus --trust-make --duration 1 --configs TASKS01
> 
> The resulting splat is as shown below.  This appears to have been either
> fixed or obscured by a later commit, but it does affect bisectability,
> which I found out about the hard way.  ;-)

I'm confused, the hns driver is causing this, and you are able to test
with the hns hardware???

Jason


* Re: Lockdep splat in -next
  2021-06-28 16:08 ` Jason Gunthorpe
@ 2021-06-28 22:15   ` Paul E. McKenney
  0 siblings, 0 replies; 3+ messages in thread
From: Paul E. McKenney @ 2021-06-28 22:15 UTC (permalink / raw)
  To: Jason Gunthorpe; +Cc: liuyixian, liweihang, sfr, linux-kernel, linux-next

On Mon, Jun 28, 2021 at 01:08:41PM -0300, Jason Gunthorpe wrote:
> On Mon, Jun 28, 2021 at 08:37:46AM -0700, Paul E. McKenney wrote:
> > Hello, Yixian Liu,
> > 
> > The following -next commit results in a lockdep splat:
> > 
> > 591f762b2750 ("RDMA/hns: Remove the condition of light load for posting DWQE")
> > 
> > The repeat-by is as follows:
> > 
> > tools/testing/selftests/rcutorture/bin/kvm.sh --allcpus --trust-make --duration 1 --configs TASKS01
> > 
> > The resulting splat is as shown below.  This appears to have been either
> > fixed or obscured by a later commit, but it does affect bisectability,
> > which I found out about the hard way.  ;-)
> 
> I'm confused, the hns driver is causing this, and you are able to test
> with the hns hardware???

Apparently I am confused as well; apologies for the noise.  I incorrectly
assumed that v5.13-rc1 was clean, but this deadlock really is present in
v5.13-rc1.

And here I thought I was handling the heat relatively well...

							Thanx, Paul

