* [bug report] WARNING: possible circular locking at: rdma_destroy_id+0x17/0x20 [rdma_cm] triggered by blktests nvmeof-mp/002
@ 2021-12-04  3:04 Yi Zhang
  2022-02-27 23:21 ` Bart Van Assche
  2022-05-25 11:01 ` Sagi Grimberg
  0 siblings, 2 replies; 19+ messages in thread
From: Yi Zhang @ 2021-12-04  3:04 UTC (permalink / raw)
  To: RDMA mailing list, open list:NVM EXPRESS DRIVER

Hello,
The WARNING below is triggered by blktests nvmeof-mp/002. It can be
reproduced with the latest 5.16.0-rc3 and also exists on 5.12; please
take a look.

# use_siw=1 ./check nvmeof-mp/002
nvmeof-mp/002 (File I/O on top of multipath concurrently with logout
and login (mq)) [passed]
    runtime  41.767s  ...  36.877s
# ./check nvmeof-mp/002
nvmeof-mp/002 (File I/O on top of multipath concurrently with logout
and login (mq)) [passed]
    runtime    ...  41.277s

[  609.166100] run blktests nvmeof-mp/002 at 2021-12-03 21:27:52
[  609.977883] null_blk: module loaded
[  610.538697] rdma_rxe: loaded
[  610.602763] infiniband eno1_rxe: set active
[  610.602782] infiniband eno1_rxe: added eno1
[  610.639520] eno2 speed is unknown, defaulting to 1000
[  610.644946] eno2 speed is unknown, defaulting to 1000
[  610.650318] eno2 speed is unknown, defaulting to 1000
[  610.690546] infiniband eno2_rxe: set down
[  610.690553] infiniband eno2_rxe: added eno2
[  610.691112] eno2 speed is unknown, defaulting to 1000
[  610.714293] eno2 speed is unknown, defaulting to 1000
[  610.729535] eno3 speed is unknown, defaulting to 1000
[  610.734666] eno3 speed is unknown, defaulting to 1000
[  610.740026] eno3 speed is unknown, defaulting to 1000
[  610.781372] infiniband eno3_rxe: set down
[  610.781379] infiniband eno3_rxe: added eno3
[  610.781906] eno3 speed is unknown, defaulting to 1000
[  610.806211] eno2 speed is unknown, defaulting to 1000
[  610.811508] eno3 speed is unknown, defaulting to 1000
[  610.826655] eno4 speed is unknown, defaulting to 1000
[  610.831810] eno4 speed is unknown, defaulting to 1000
[  610.837176] eno4 speed is unknown, defaulting to 1000
[  610.885844] infiniband eno4_rxe: set down
[  610.885850] infiniband eno4_rxe: added eno4
[  610.886426] eno4 speed is unknown, defaulting to 1000
[  611.266756] Loading iSCSI transport class v2.0-870.
[  611.335358] iscsi: registered transport (iser)
[  611.387969] nvmet: adding nsid 1 to subsystem nvme-test
[  611.403878] eno2 speed is unknown, defaulting to 1000
[  611.409276] eno3 speed is unknown, defaulting to 1000
[  611.414522] eno4 speed is unknown, defaulting to 1000
[  611.468712] Rounding down aligned max_sectors from 4294967295 to 4294967288
[  611.469203] db_root: cannot open: /etc/target
[  611.515816] nvmet_rdma: enabling port 1 (10.16.221.116:7777)
[  611.622158] eno2 speed is unknown, defaulting to 1000
[  611.628243] eno3 speed is unknown, defaulting to 1000
[  611.633422] eno4 speed is unknown, defaulting to 1000
[  611.703648] eno2 speed is unknown, defaulting to 1000
[  611.708949] eno3 speed is unknown, defaulting to 1000
[  611.714267] eno4 speed is unknown, defaulting to 1000
[  611.754520] nvme_rdma:nvme_rdma_cm_handler: nvme nvme0: address
resolved (0): status 0 id 00000000d0d64a6d
[  611.765454] nvme_rdma:nvme_rdma_cm_handler: nvme nvme0: route
resolved  (2): status 0 id 00000000d0d64a6d
[  611.767778] nvmet_rdma:nvmet_rdma_cm_handler: nvmet_rdma: connect
request (4): status 0 id 000000007ccc375a
[  611.767935] nvmet_rdma:nvmet_rdma_find_get_device: nvmet_rdma:
added eno1_rxe.
[  611.777083] nvmet_rdma:nvmet_rdma_create_queue_ib: nvmet_rdma:
nvmet_rdma_create_queue_ib: max_cqe= 8191 max_sge= 32 sq_size = 289
cm_id= 000000007ccc375a
--snip--
[  612.132317] nvme_rdma:nvme_rdma_cm_handler: nvme nvme0: established
(9): status 0 id 00000000cede4b76
[  612.132452] nvme nvme0: mapped 32/0/0 default/read/poll queues.
[  612.132537] nvmet_rdma:nvmet_rdma_cm_handler: nvmet_rdma:
established (9): status 0 id 0000000070c2f22c
[  612.157849] nvmet:nvmet_execute_io_connect: nvmet: adding queue 1 to ctrl 1.
[  612.158194] nvmet:nvmet_execute_io_connect: nvmet: adding queue 2 to ctrl 1.
[  612.158548] nvmet:nvmet_execute_io_connect: nvmet: adding queue 3 to ctrl 1.
[  612.158854] nvmet:nvmet_execute_io_connect: nvmet: adding queue 4 to ctrl 1.
[  612.159173] nvmet:nvmet_execute_io_connect: nvmet: adding queue 5 to ctrl 1.
[  612.159510] nvmet:nvmet_execute_io_connect: nvmet: adding queue 6 to ctrl 1.
[  612.159853] nvmet:nvmet_execute_io_connect: nvmet: adding queue 7 to ctrl 1.
[  612.160225] nvmet:nvmet_execute_io_connect: nvmet: adding queue 8 to ctrl 1.
[  612.160572] nvmet:nvmet_execute_io_connect: nvmet: adding queue 9 to ctrl 1.
[  612.160885] nvmet:nvmet_execute_io_connect: nvmet: adding queue 10 to ctrl 1.
[  612.161216] nvmet:nvmet_execute_io_connect: nvmet: adding queue 11 to ctrl 1.
[  612.161759] nvmet:nvmet_execute_io_connect: nvmet: adding queue 12 to ctrl 1.
[  612.162123] nvmet:nvmet_execute_io_connect: nvmet: adding queue 13 to ctrl 1.
[  612.162473] nvmet:nvmet_execute_io_connect: nvmet: adding queue 14 to ctrl 1.
[  612.162793] nvmet:nvmet_execute_io_connect: nvmet: adding queue 15 to ctrl 1.
[  612.163135] nvmet:nvmet_execute_io_connect: nvmet: adding queue 16 to ctrl 1.
[  612.163458] nvmet:nvmet_execute_io_connect: nvmet: adding queue 17 to ctrl 1.
[  612.163776] nvmet:nvmet_execute_io_connect: nvmet: adding queue 18 to ctrl 1.
[  612.164103] nvmet:nvmet_execute_io_connect: nvmet: adding queue 19 to ctrl 1.
[  612.164417] nvmet:nvmet_execute_io_connect: nvmet: adding queue 20 to ctrl 1.
[  612.164700] nvmet:nvmet_execute_io_connect: nvmet: adding queue 21 to ctrl 1.
[  612.164971] nvmet:nvmet_execute_io_connect: nvmet: adding queue 22 to ctrl 1.
[  612.165275] nvmet:nvmet_execute_io_connect: nvmet: adding queue 23 to ctrl 1.
[  612.165689] nvmet:nvmet_execute_io_connect: nvmet: adding queue 24 to ctrl 1.
[  612.166082] nvmet:nvmet_execute_io_connect: nvmet: adding queue 25 to ctrl 1.
[  612.166383] nvmet:nvmet_execute_io_connect: nvmet: adding queue 26 to ctrl 1.
[  612.166663] nvmet:nvmet_execute_io_connect: nvmet: adding queue 27 to ctrl 1.
[  612.166963] nvmet:nvmet_execute_io_connect: nvmet: adding queue 28 to ctrl 1.
[  612.167273] nvmet:nvmet_execute_io_connect: nvmet: adding queue 29 to ctrl 1.
[  612.167529] nvmet:nvmet_execute_io_connect: nvmet: adding queue 30 to ctrl 1.
[  612.167804] nvmet:nvmet_execute_io_connect: nvmet: adding queue 31 to ctrl 1.
[  612.168114] nvmet:nvmet_execute_io_connect: nvmet: adding queue 32 to ctrl 1.
[  612.168523] nvme nvme0: new ctrl: NQN "nvme-test", addr 10.16.221.116:7777
[  612.293734] device-mapper: multipath service-time: version 0.3.0 loaded
[  612.296099] device-mapper: table: 253:3: multipath: error getting
device (-EBUSY)
[  612.303676] device-mapper: ioctl: error adding target to table
[  612.501014] device-mapper: table: 253:3: multipath: error getting
device (-EBUSY)
[  612.508751] device-mapper: ioctl: error adding target to table
[  612.514717] device-mapper: table: 253:3: multipath: error getting
device (-EBUSY)
[  612.522632] device-mapper: ioctl: error adding target to table
[  612.950887] device-mapper: table: 253:3: multipath: error getting
device (-EBUSY)
[  612.958616] device-mapper: ioctl: error adding target to table
[  613.869348] EXT4-fs (dm-3): mounted filesystem without journal.
Opts: (null). Quota mode: none.
[  613.869619] ext4 filesystem being mounted at
/root/kernel-tests/storage/blktests/nvme/nvmeof-mp/blktests/results/tmpdir.nvmeof-mp.002.ZYO/mnt0
supports timestamps until 2038 (0x7fffffff)
[  617.144570] nvmet:nvmet_keep_alive_timer: nvmet: ctrl 1 reschedule
traffic based keep-alive timer
[  617.272414] nvmet:nvmet_execute_keep_alive: nvmet: ctrl 1 update
keep-alive timer for 5 secs
[  622.775724] nvmet:nvmet_keep_alive_timer: nvmet: ctrl 1 reschedule
traffic based keep-alive timer
[  627.895810] nvmet:nvmet_keep_alive_timer: nvmet: ctrl 1 reschedule
traffic based keep-alive timer
[  633.015369] nvmet:nvmet_keep_alive_timer: nvmet: ctrl 1 reschedule
traffic based keep-alive timer
[  638.138209] nvmet:nvmet_keep_alive_timer: nvmet: ctrl 1 reschedule
traffic based keep-alive timer
[  643.254966] nvmet:nvmet_keep_alive_timer: nvmet: ctrl 1 reschedule
traffic based keep-alive timer
[  644.822178] eno2 speed is unknown, defaulting to 1000
[  644.827494] eno3 speed is unknown, defaulting to 1000
[  644.832797] eno4 speed is unknown, defaulting to 1000
[  648.377790] nvmet:nvmet_keep_alive_timer: nvmet: ctrl 1 reschedule
traffic based keep-alive timer
[  651.364570] nvme nvme0: Removing ctrl: NQN "nvme-test"
[  651.430891] nvmet_rdma:nvmet_rdma_cm_handler: nvmet_rdma:
disconnected (10): status 0 id 000000009ae8bc67
[  651.430914] nvmet_rdma:__nvmet_rdma_queue_disconnect: nvmet_rdma:
cm_id= 000000009ae8bc67 queue->state= 1
[  651.430983] nvmet_rdma:nvmet_rdma_cm_handler: nvmet_rdma:
disconnected (10): status 0 id 0000000078827089
[  651.430994] nvmet_rdma:__nvmet_rdma_queue_disconnect: nvmet_rdma:
cm_id= 0000000078827089 queue->state= 1
--snip--
[  651.438145] nvme_rdma:nvme_rdma_cm_handler: nvme nvme0:
disconnected (10): status 0 id 00000000cede4b76
[  651.438150] nvme_rdma:nvme_rdma_cm_handler: nvme nvme0: disconnect
received - connection closed

[  651.448883] ======================================================
[  651.455061] WARNING: possible circular locking dependency detected
[  651.461242] 5.16.0-rc3+ #2 Tainted: G S        I
[  651.466380] ------------------------------------------------------
[  651.472559] kworker/14:4/4839 is trying to acquire lock:
[  651.477873] ffff8884c0e353e0 (&id_priv->handler_mutex){+.+.}-{3:3},
at: rdma_destroy_id+0x17/0x20 [rdma_cm]
[  651.487623]
               but task is already holding lock:
[  651.493453] ffff8883cc21fe00
((work_completion)(&queue->release_work)){+.+.}-{0:0}, at:
process_one_work+0x888/0x1740
[  651.504062]
               which lock already depends on the new lock.

[  651.512233]
               the existing dependency chain (in reverse order) is:
[  651.519715]
               -> #3 ((work_completion)(&queue->release_work)){+.+.}-{0:0}:
[  651.527894]        lock_acquire+0x1d2/0x5a0
[  651.532079]        process_one_work+0x8da/0x1740
[  651.536700]        worker_thread+0x87/0xbf0
[  651.540886]        kthread+0x3aa/0x490
[  651.544637]        ret_from_fork+0x1f/0x30
[  651.548737]
               -> #2 ((wq_completion)events){+.+.}-{0:0}:
[  651.555349]        lock_acquire+0x1d2/0x5a0
[  651.559535]        flush_workqueue+0x109/0x15f0
[  651.564067]        nvmet_rdma_queue_connect+0x1c79/0x2610 [nvmet_rdma]
[  651.570594]        cma_cm_event_handler+0xef/0x530 [rdma_cm]
[  651.576261]        cma_ib_req_handler+0x1a76/0x4a50 [rdma_cm]
[  651.582016]        cm_process_work+0x44/0x190 [ib_cm]
[  651.587077]        cm_req_handler+0x2218/0x7040 [ib_cm]
[  651.592312]        cm_work_handler+0x1acf/0x4a30 [ib_cm]
[  651.597633]        process_one_work+0x978/0x1740
[  651.602253]        worker_thread+0x87/0xbf0
[  651.606437]        kthread+0x3aa/0x490
[  651.610190]        ret_from_fork+0x1f/0x30
[  651.614290]
               -> #1 (&id_priv->handler_mutex/1){+.+.}-{3:3}:
[  651.621249]        lock_acquire+0x1d2/0x5a0
[  651.625435]        __mutex_lock+0x155/0x15f0
[  651.629708]        cma_ib_req_handler+0x1709/0x4a50 [rdma_cm]
[  651.635462]        cm_process_work+0x44/0x190 [ib_cm]
[  651.640521]        cm_req_handler+0x2218/0x7040 [ib_cm]
[  651.645749]        cm_work_handler+0x1acf/0x4a30 [ib_cm]
[  651.651071]        process_one_work+0x978/0x1740
[  651.655689]        worker_thread+0x87/0xbf0
[  651.659874]        kthread+0x3aa/0x490
[  651.663628]        ret_from_fork+0x1f/0x30
[  651.667726]
               -> #0 (&id_priv->handler_mutex){+.+.}-{3:3}:
[  651.674511]        check_prevs_add+0x3fd/0x2480
[  651.679045]        __lock_acquire+0x23f7/0x2f80
[  651.683577]        lock_acquire+0x1d2/0x5a0
[  651.687762]        __mutex_lock+0x155/0x15f0
[  651.692036]        rdma_destroy_id+0x17/0x20 [rdma_cm]
[  651.697184]        nvmet_rdma_free_queue+0x7a/0x380 [nvmet_rdma]
[  651.703189]        nvmet_rdma_release_queue_work+0x3b/0x90 [nvmet_rdma]
[  651.709802]        process_one_work+0x978/0x1740
[  651.714422]        worker_thread+0x87/0xbf0
[  651.718608]        kthread+0x3aa/0x490
[  651.722358]        ret_from_fork+0x1f/0x30
[  651.726459]
               other info that might help us debug this:

[  651.734457] Chain exists of:
                 &id_priv->handler_mutex --> (wq_completion)events -->
(work_completion)(&queue->release_work)

[  651.748531]  Possible unsafe locking scenario:

[  651.754451]        CPU0                    CPU1
[  651.758984]        ----                    ----
[  651.763515]   lock((work_completion)(&queue->release_work));
[  651.769174]                                lock((wq_completion)events);
[  651.775786]
lock((work_completion)(&queue->release_work));
[  651.783959]   lock(&id_priv->handler_mutex);
[  651.788233]
                *** DEADLOCK ***

[  651.794153] 2 locks held by kworker/14:4/4839:
[  651.798596]  #0: ffff88810005b948
((wq_completion)events){+.+.}-{0:0}, at: process_one_work+0x854/0x1740
[  651.808078]  #1: ffff8883cc21fe00
((work_completion)(&queue->release_work)){+.+.}-{0:0}, at:
process_one_work+0x888/0x1740
[  651.819121]
               stack backtrace:
[  651.823478] CPU: 14 PID: 4839 Comm: kworker/14:4 Tainted: G S
 I       5.16.0-rc3+ #2
[  651.831824] Hardware name: Dell Inc. PowerEdge R640/06NR82, BIOS
2.11.2 004/21/2021
[  651.839476] Workqueue: events nvmet_rdma_release_queue_work [nvmet_rdma]
[  651.846175] Call Trace:
[  651.848628]  <TASK>
[  651.850735]  dump_stack_lvl+0x44/0x57
[  651.854401]  check_noncircular+0x280/0x320
[  651.858501]  ? print_circular_bug.isra.47+0x430/0x430
[  651.863553]  ? lock_chain_count+0x20/0x20
[  651.867566]  ? wait_for_completion+0x1e0/0x270
[  651.872010]  ? mark_lock.part.52+0xf7/0x1050
[  651.876283]  ? sched_clock_cpu+0x15/0x200
[  651.880295]  ? find_held_lock+0x3a/0x1c0
[  651.884223]  check_prevs_add+0x3fd/0x2480
[  651.888238]  ? mark_held_locks+0xb7/0x120
[  651.892248]  ? check_irq_usage+0xb50/0xb50
[  651.896346]  ? lockdep_lock+0xcb/0x1c0
[  651.900100]  ? static_obj+0xc0/0xc0
[  651.903591]  ? sched_clock_cpu+0x15/0x200
[  651.907606]  __lock_acquire+0x23f7/0x2f80
[  651.911620]  ? rcu_read_lock_bh_held+0xc0/0xc0
[  651.916063]  lock_acquire+0x1d2/0x5a0
[  651.919729]  ? rdma_destroy_id+0x17/0x20 [rdma_cm]
[  651.924529]  ? rcu_read_unlock+0x50/0x50
[  651.928456]  ? lock_is_held_type+0xd9/0x130
[  651.932644]  __mutex_lock+0x155/0x15f0
[  651.936393]  ? rdma_destroy_id+0x17/0x20 [rdma_cm]
[  651.941194]  ? kasan_quarantine_put+0xa4/0x180
[  651.945641]  ? rdma_destroy_id+0x17/0x20 [rdma_cm]
[  651.950442]  ? mutex_lock_io_nested+0x1460/0x1460
[  651.955146]  ? lock_is_held_type+0xd9/0x130
[  651.959334]  ? rcu_read_lock_sched_held+0xaf/0xe0
[  651.964039]  ? rcu_read_lock_bh_held+0xc0/0xc0
[  651.968487]  ? rdma_destroy_id+0x17/0x20 [rdma_cm]
[  651.973287]  rdma_destroy_id+0x17/0x20 [rdma_cm]
[  651.977915]  nvmet_rdma_free_queue+0x7a/0x380 [nvmet_rdma]
[  651.983401]  nvmet_rdma_release_queue_work+0x3b/0x90 [nvmet_rdma]
[  651.989493]  process_one_work+0x978/0x1740
[  651.993595]  ? pwq_dec_nr_in_flight+0x270/0x270
[  651.998126]  worker_thread+0x87/0xbf0
[  652.001790]  ? __kthread_parkme+0xc3/0x1d0
[  652.005890]  ? process_one_work+0x1740/0x1740
[  652.010248]  kthread+0x3aa/0x490
[  652.013480]  ? set_kthread_struct+0x100/0x100
[  652.017841]  ret_from_fork+0x1f/0x30
[  652.021424]  </TASK>
[  652.050739] nvmet_rdma:nvmet_rdma_cm_handler: nvmet_rdma:
disconnected (10): status 0 id 000000007ccc375a
[  652.060318] nvmet_rdma:__nvmet_rdma_queue_disconnect: nvmet_rdma:
cm_id= 000000007ccc375a queue->state= 1
[  652.069937] nvmet_rdma:nvmet_rdma_free_queue: nvmet_rdma: freeing queue 0
[  652.082650] nvmet:nvmet_stop_keep_alive_timer: nvmet: ctrl 1 stop keep-alive
[  652.488476] eno2 speed is unknown, defaulting to 1000
[  652.493657] eno3 speed is unknown, defaulting to 1000
[  652.498847] eno4 speed is unknown, defaulting to 1000
[  652.558751] rdma_rxe: rxe-ah pool destroyed with unfree'd elem
[  652.678415] rdma_rxe: unloaded

-- 
Best Regards,
  Yi Zhang



* Re: [bug report] WARNING: possible circular locking at: rdma_destroy_id+0x17/0x20 [rdma_cm] triggered by blktests nvmeof-mp/002
  2021-12-04  3:04 [bug report] WARNING: possible circular locking at: rdma_destroy_id+0x17/0x20 [rdma_cm] triggered by blktests nvmeof-mp/002 Yi Zhang
@ 2022-02-27 23:21 ` Bart Van Assche
  2022-05-25  3:40   ` yangx.jy
  2022-05-25 11:01 ` Sagi Grimberg
  1 sibling, 1 reply; 19+ messages in thread
From: Bart Van Assche @ 2022-02-27 23:21 UTC (permalink / raw)
  To: Yi Zhang, RDMA mailing list, open list:NVM EXPRESS DRIVER

On 12/3/21 19:04, Yi Zhang wrote:
> The WARNING below is triggered by blktests nvmeof-mp/002. It can be
> reproduced with the latest 5.16.0-rc3 and also exists on 5.12; please
> take a look.
> 
> # use_siw=1 ./check nvmeof-mp/002

I can reproduce this issue with kernel v5.17.0-rc2.

Bart.


* Re: [bug report] WARNING: possible circular locking at: rdma_destroy_id+0x17/0x20 [rdma_cm] triggered by blktests nvmeof-mp/002
  2022-02-27 23:21 ` Bart Van Assche
@ 2022-05-25  3:40   ` yangx.jy
  0 siblings, 0 replies; 19+ messages in thread
From: yangx.jy @ 2022-05-25  3:40 UTC (permalink / raw)
  To: Bart Van Assche, Yi Zhang, RDMA mailing list,
	open list:NVM EXPRESS DRIVER

On 2022/2/28 7:21, Bart Van Assche wrote:
> On 12/3/21 19:04, Yi Zhang wrote:
>> The WARNING below is triggered by blktests nvmeof-mp/002. It can be
>> reproduced with the latest 5.16.0-rc3 and also exists on 5.12; please
>> take a look.
>>
>> # use_siw=1 ./check nvmeof-mp/002
> 
> I can reproduce this issue with kernel v5.17.0-rc2.
> 
> Bart.

Hi Bart and Yi

Could you reproduce it on v5.18.0? I cannot reproduce it there.
Has a fix patch been merged into v5.18.0?

Best Regards,
Xiao Yang


* Re: [bug report] WARNING: possible circular locking at: rdma_destroy_id+0x17/0x20 [rdma_cm] triggered by blktests nvmeof-mp/002
  2021-12-04  3:04 [bug report] WARNING: possible circular locking at: rdma_destroy_id+0x17/0x20 [rdma_cm] triggered by blktests nvmeof-mp/002 Yi Zhang
  2022-02-27 23:21 ` Bart Van Assche
@ 2022-05-25 11:01 ` Sagi Grimberg
  2022-05-25 18:50   ` Bart Van Assche
  2022-08-25  5:59   ` yangx.jy
  1 sibling, 2 replies; 19+ messages in thread
From: Sagi Grimberg @ 2022-05-25 11:01 UTC (permalink / raw)
  To: Yi Zhang, RDMA mailing list, open list:NVM EXPRESS DRIVER


> Hello,
> The WARNING below is triggered by blktests nvmeof-mp/002. It can be
> reproduced with the latest 5.16.0-rc3 and also exists on 5.12; please
> take a look.

IIRC this was reported before; based on my analysis, lockdep is giving
a false alarm here. The reason is that id_priv->handler_mutex cannot be
the same for both the cm_id that is handling the connect request and
the cm_id on which rdma_destroy_id() is called, because
rdma_destroy_id() is always called on an already disconnected cm_id, so
the deadlock lockdep is complaining about cannot happen.
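
To illustrate why lockdep conflates the two ids: mutex_init() declares
one static lockdep key per initialization call site, so the
handler_mutex of every cm_id shares a single lock class. A minimal
sketch with a made-up struct, not the actual cma.c code:

#include <linux/mutex.h>
#include <linux/slab.h>

struct conn {
	struct mutex handler_mutex;
};

static struct conn *conn_create(void)
{
	struct conn *c = kzalloc(sizeof(*c), GFP_KERNEL);

	if (!c)
		return NULL;
	/*
	 * mutex_init() is a macro that defines one static
	 * lock_class_key per call site, so the handler_mutex of every
	 * instance created here lands in the same lockdep class: a
	 * dependency chain recorded through instance A plus an
	 * acquisition on instance B looks like a cycle on one lock,
	 * even though A != B.
	 */
	mutex_init(&c->handler_mutex);
	return c;
}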

I'm not sure how to settle this.

> 
> --snip--


* Re: [bug report] WARNING: possible circular locking at: rdma_destroy_id+0x17/0x20 [rdma_cm] triggered by blktests nvmeof-mp/002
  2022-05-25 11:01 ` Sagi Grimberg
@ 2022-05-25 18:50   ` Bart Van Assche
  2022-05-27 12:52     ` Jason Gunthorpe
  2022-08-25  5:59   ` yangx.jy
  1 sibling, 1 reply; 19+ messages in thread
From: Bart Van Assche @ 2022-05-25 18:50 UTC (permalink / raw)
  To: Sagi Grimberg, Yi Zhang, RDMA mailing list, open list:NVM EXPRESS DRIVER

On 5/25/22 13:01, Sagi Grimberg wrote:
> IIRC this was reported before; based on my analysis, lockdep is giving
> a false alarm here. The reason is that id_priv->handler_mutex cannot be
> the same for both the cm_id that is handling the connect request and
> the cm_id on which rdma_destroy_id() is called, because
> rdma_destroy_id() is always called on an already disconnected cm_id, so
> the deadlock lockdep is complaining about cannot happen.
> 
> I'm not sure how to settle this.

If the above is correct, using lockdep_register_key() for 
id_priv->handler_mutex instead of a static key should make the lockdep 
false positive disappear.

Thanks,

Bart.


* Re: [bug report] WARNING: possible circular locking at: rdma_destroy_id+0x17/0x20 [rdma_cm] triggered by blktests nvmeof-mp/002
  2022-05-25 18:50   ` Bart Van Assche
@ 2022-05-27 12:52     ` Jason Gunthorpe
  2022-05-28 19:00       ` Bart Van Assche
  0 siblings, 1 reply; 19+ messages in thread
From: Jason Gunthorpe @ 2022-05-27 12:52 UTC (permalink / raw)
  To: Bart Van Assche
  Cc: Sagi Grimberg, Yi Zhang, RDMA mailing list, open list:NVM EXPRESS DRIVER

On Wed, May 25, 2022 at 08:50:52PM +0200, Bart Van Assche wrote:
> On 5/25/22 13:01, Sagi Grimberg wrote:
> > IIRC this was reported before; based on my analysis, lockdep is giving
> > a false alarm here. The reason is that id_priv->handler_mutex cannot be
> > the same for both the cm_id that is handling the connect request and
> > the cm_id on which rdma_destroy_id() is called, because
> > rdma_destroy_id() is always called on an already disconnected cm_id, so
> > the deadlock lockdep is complaining about cannot happen.
> > 
> > I'm not sure how to settle this.
> 
> If the above is correct, using lockdep_register_key() for
> id_priv->handler_mutex instead of a static key should make the lockdep false
> positive disappear.

That only works if you can detect actual different lock classes during
lock creation. It doesn't seem applicable in this case.

Jason


* Re: [bug report] WARNING: possible circular locking at: rdma_destroy_id+0x17/0x20 [rdma_cm] triggered by blktests nvmeof-mp/002
  2022-05-27 12:52     ` Jason Gunthorpe
@ 2022-05-28 19:00       ` Bart Van Assche
  2022-05-31 12:35         ` Jason Gunthorpe
  0 siblings, 1 reply; 19+ messages in thread
From: Bart Van Assche @ 2022-05-28 19:00 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Sagi Grimberg, Yi Zhang, RDMA mailing list, open list:NVM EXPRESS DRIVER

On 5/27/22 14:52, Jason Gunthorpe wrote:
> On Wed, May 25, 2022 at 08:50:52PM +0200, Bart Van Assche wrote:
>> On 5/25/22 13:01, Sagi Grimberg wrote:
>>> IIRC this was reported before; based on my analysis, lockdep is giving
>>> a false alarm here. The reason is that id_priv->handler_mutex cannot be
>>> the same for both the cm_id that is handling the connect request and
>>> the cm_id on which rdma_destroy_id() is called, because
>>> rdma_destroy_id() is always called on an already disconnected cm_id, so
>>> the deadlock lockdep is complaining about cannot happen.
>>>
>>> I'm not sure how to settle this.
>>
>> If the above is correct, using lockdep_register_key() for
>> id_priv->handler_mutex instead of a static key should make the lockdep false
>> positive disappear.
> 
> That only works if you can detect actual different lock classes during
> lock creation. It doesn't seem applicable in this case.

Why doesn't it seem applicable in this case? The default behavior of 
mutex_init() and related initialization functions is to create one lock 
class per synchronization object initialization caller. 
lockdep_register_key() can be used to create one lock class per 
synchronization object instance. I introduced lockdep_register_key() 
myself a few years ago.

After taking a closer look at the RDMA/CM code, I decided not to
implement what I proposed above yet. I noticed that handler_mutex is
held around callback invocations. An example:

static int cma_cm_event_handler(struct rdma_id_private *id_priv,
				struct rdma_cm_event *event)
{
	int ret;

	lockdep_assert_held(&id_priv->handler_mutex);

	trace_cm_event_handler(id_priv, event);
	ret = id_priv->id.event_handler(&id_priv->id, event);
	trace_cm_event_done(id_priv, event, ret);
	return ret;
}

My opinion is that holding *any* lock around the invocation of a 
callback function is an antipattern, in other words, something that 
never should be done. John Ousterhout already described this in 1996 in 
his presentation [1]. Patches like 071ba4cc559d ("RDMA: Add 
rdma_connect_locked()") work around this problem but do not solve it.

Has it been considered to rework the RDMA/CM such that no locks are held 
around the invocation of callback functions like the event_handler 
callback? There are other mechanisms to report events from one software 
layer (RDMA/CM) to a higher software layer (ULP), e.g. a linked list 
with event information. The RDMA/CM could queue events onto that list 
and the ULP can dequeue events from that list.
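
A rough sketch of that idea follows; every name below is invented for
illustration and nothing like this exists in the RDMA/CM today:

#include <linux/list.h>
#include <linux/spinlock.h>
#include <linux/workqueue.h>
#include <rdma/rdma_cm.h>

struct cm_event_entry {
	struct list_head	entry;
	struct rdma_cm_event	event;
};

struct ulp_event_queue {
	spinlock_t		lock;
	struct list_head	events;	/* of cm_event_entry::entry */
	struct work_struct	work;	/* runs the ULP dequeue handler */
};

/*
 * Producer side: the RDMA/CM would post the event and return
 * immediately, without holding id_priv->handler_mutex across any ULP
 * code. schedule_work() uses the system workqueue here for brevity; a
 * dedicated workqueue would keep the dequeue work out of the chain
 * lockdep complained about.
 */
static void cm_post_event(struct ulp_event_queue *q,
			  struct cm_event_entry *e)
{
	spin_lock(&q->lock);
	list_add_tail(&e->entry, &q->events);
	spin_unlock(&q->lock);
	schedule_work(&q->work);
}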

Thanks,

Bart.

[1] Ousterhout, John. "Why threads are a bad idea (for most purposes)." 
In Presentation given at the 1996 Usenix Annual Technical Conference, 
vol. 5. 1996.


* Re: [bug report] WARNING: possible circular locking at: rdma_destroy_id+0x17/0x20 [rdma_cm] triggered by blktests nvmeof-mp/002
  2022-05-28 19:00       ` Bart Van Assche
@ 2022-05-31 12:35         ` Jason Gunthorpe
  2022-05-31 17:55           ` Bart Van Assche
  0 siblings, 1 reply; 19+ messages in thread
From: Jason Gunthorpe @ 2022-05-31 12:35 UTC (permalink / raw)
  To: Bart Van Assche
  Cc: Sagi Grimberg, Yi Zhang, RDMA mailing list, open list:NVM EXPRESS DRIVER

On Sat, May 28, 2022 at 09:00:16PM +0200, Bart Van Assche wrote:
> On 5/27/22 14:52, Jason Gunthorpe wrote:
> > On Wed, May 25, 2022 at 08:50:52PM +0200, Bart Van Assche wrote:
> > > On 5/25/22 13:01, Sagi Grimberg wrote:
> > > > IIRC this was reported before; based on my analysis, lockdep is giving
> > > > a false alarm here. The reason is that id_priv->handler_mutex cannot be
> > > > the same for both the cm_id that is handling the connect request and
> > > > the cm_id on which rdma_destroy_id() is called, because
> > > > rdma_destroy_id() is always called on an already disconnected cm_id, so
> > > > the deadlock lockdep is complaining about cannot happen.
> > > > 
> > > > I'm not sure how to settle this.
> > > 
> > > If the above is correct, using lockdep_register_key() for
> > > id_priv->handler_mutex instead of a static key should make the lockdep false
> > > positive disappear.
> > 
> > That only works if you can detect actual different lock classes during
> > lock creation. It doesn't seem applicable in this case.
> 
> Why doesn't it seem applicable in this case? The default behavior of
> mutex_init() and related initialization functions is to create one lock
> class per synchronization object initialization caller.
> lockdep_register_key() can be used to create one lock class per
> synchronization object instance. I introduced lockdep_register_key() myself
> a few years ago.

I don't think this should be used to create one key per instance of
the object which would be required here. The overhead would be very
high.

> My opinion is that holding *any* lock around the invocation of a callback
> function is an antipattern, in other words, something that never should be
> done. 

Then you invariably have an API that will be full of races because we
do need to run the callbacks synchronously with the FSM.  Many
syzkaller bugs were fixed by adding this serialization.

> Has it been considered to rework the RDMA/CM such that no locks are held
> around the invocation of callback functions like the event_handler
> callback?

IMHO it is too difficult, maybe impossible.

> There are other mechanisms to report events from one software layer
> (RDMA/CM) to a higher software layer (ULP), e.g. a linked list with event
> information. The RDMA/CM could queue events onto that list and the ULP can
> dequeue events from that list.

Then it is not synchronous; the point of these callbacks is to be
synchronous. If a ULP would like and can tolerate decoupled operation,
it can implement an event queue, but we can't generally state that all
ULPs are safe to be asynchronous for all events.

This also doesn't actually solve anything, because we still have races
with destroying the ID while the event queue is referring to the cm_id,
or while the event queue consumer is processing it. Solving that still
requires locks, even if they may be weaker rwlocks or refcounting
locks.

> [1] Ousterhout, John. "Why threads are a bad idea (for most purposes)." In
> Presentation given at the 1996 Usenix Annual Technical Conference, vol. 5.
> 1996.

Indeed, but we have threads here and we can't wish them away.

Jason


* Re: [bug report] WARNING: possible circular locking at: rdma_destroy_id+0x17/0x20 [rdma_cm] triggered by blktests nvmeof-mp/002
  2022-05-31 12:35         ` Jason Gunthorpe
@ 2022-05-31 17:55           ` Bart Van Assche
  2022-06-01 12:45             ` Jason Gunthorpe
  0 siblings, 1 reply; 19+ messages in thread
From: Bart Van Assche @ 2022-05-31 17:55 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Sagi Grimberg, Yi Zhang, RDMA mailing list, open list:NVM EXPRESS DRIVER

On 5/31/22 05:35, Jason Gunthorpe wrote:
> On Sat, May 28, 2022 at 09:00:16PM +0200, Bart Van Assche wrote:
>> On 5/27/22 14:52, Jason Gunthorpe wrote:
>>> That only works if you can detect actual different lock classes during
>>> lock creation. It doesn't seem applicable in this case.
>>
>> Why doesn't it seem applicable in this case? The default behavior of
>> mutex_init() and related initialization functions is to create one lock
>> class per synchronization object initialization caller.
>> lockdep_register_key() can be used to create one lock class per
>> synchronization object instance. I introduced lockdep_register_key() myself
>> a few years ago.
> 
> I don't think this should be used to create one key per instance of
> the object which would be required here. The overhead would be very
> high.

Are we perhaps referring to different code changes? I'm referring to the
code change below. The runtime and memory overhead of the patch below
should be minimal.

Thanks,

Bart.


diff --git a/drivers/infiniband/core/cma.c b/drivers/infiniband/core/cma.c
index fabca5e51e3d..d476c64cd84a 100644
--- a/drivers/infiniband/core/cma.c
+++ b/drivers/infiniband/core/cma.c
@@ -860,6 +860,8 @@ __rdma_create_id(struct net *net, rdma_cm_event_handler event_handler,
  	init_completion(&id_priv->comp);
  	refcount_set(&id_priv->refcount, 1);
  	mutex_init(&id_priv->handler_mutex);
+	lockdep_register_key(&id_priv->handler_key);
+	lockdep_set_class(&id_priv->handler_mutex, &id_priv->handler_key);
  	INIT_LIST_HEAD(&id_priv->device_item);
  	INIT_LIST_HEAD(&id_priv->listen_list);
  	INIT_LIST_HEAD(&id_priv->mc_list);
@@ -1899,12 +1901,16 @@ static void _destroy_id(struct rdma_id_private *id_priv,
  	cma_id_put(id_priv);
  	wait_for_completion(&id_priv->comp);

+	mutex_destroy(&id_priv->handler_mutex);
+	lockdep_unregister_key(&id_priv->handler_key);
+
  	if (id_priv->internal_id)
  		cma_id_put(id_priv->id.context);

  	kfree(id_priv->id.route.path_rec);

  	put_net(id_priv->id.route.addr.dev_addr.net);
+
  	kfree(id_priv);
  }

diff --git a/drivers/infiniband/core/cma_priv.h b/drivers/infiniband/core/cma_priv.h
index 757a0ef79872..4affecd044eb 100644
--- a/drivers/infiniband/core/cma_priv.h
+++ b/drivers/infiniband/core/cma_priv.h
@@ -75,6 +75,7 @@ struct rdma_id_private {
  	struct completion	comp;
  	refcount_t refcount;
  	struct mutex		handler_mutex;
+	struct lock_class_key	handler_key;

  	int			backlog;
  	int			timeout_ms;


* Re: [bug report] WARNING: possible circular locking at: rdma_destroy_id+0x17/0x20 [rdma_cm] triggered by blktests nvmeof-mp/002
  2022-05-31 17:55           ` Bart Van Assche
@ 2022-06-01 12:45             ` Jason Gunthorpe
  2022-06-01 16:26               ` Bart Van Assche
  0 siblings, 1 reply; 19+ messages in thread
From: Jason Gunthorpe @ 2022-06-01 12:45 UTC (permalink / raw)
  To: Bart Van Assche
  Cc: Sagi Grimberg, Yi Zhang, RDMA mailing list, open list:NVM EXPRESS DRIVER

On Tue, May 31, 2022 at 10:55:46AM -0700, Bart Van Assche wrote:
> On 5/31/22 05:35, Jason Gunthorpe wrote:
> > On Sat, May 28, 2022 at 09:00:16PM +0200, Bart Van Assche wrote:
> > > On 5/27/22 14:52, Jason Gunthorpe wrote:
> > > > That only works if you can detect actual different lock classes during
> > > > lock creation. It doesn't seem applicable in this case.
> > > 
> > > Why doesn't it seem applicable in this case? The default behavior of
> > > mutex_init() and related initialization functions is to create one lock
> > > class per synchronization object initialization caller.
> > > lockdep_register_key() can be used to create one lock class per
> > > synchronization object instance. I introduced lockdep_register_key() myself
> > > a few years ago.
> > 
> > I don't think this should be used to create one key per instance of
> > the object which would be required here. The overhead would be very
> > high.
> 
> Are we perhaps referring to different code changes? I'm referring to the
> code change below. The runtime and memory overhead of the patch below
> should be minimal.

This is not minimal; the lockdep graph will now grow by a node for
every CM ID ever created, along with all the additional locking arcs.
This is an expensive operation.

AFAIK keys should not be created per-object like this, but based on
object classes known when the object is created - e.g. a CM listening
ID vs. a connecting ID.

Per-instance keys might be a suitable hack if the # of objects were
small?
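
For the class-based alternative, something like the sketch below; the
two static keys and the helper are invented for illustration, not
existing cma.c code:

/* One lockdep class per kind of CM ID instead of one per instance. */
static struct lock_class_key cma_listen_handler_key;
static struct lock_class_key cma_conn_handler_key;

static void cma_set_handler_class(struct rdma_id_private *id_priv,
				  bool listener)
{
	lockdep_set_class(&id_priv->handler_mutex,
			  listener ? &cma_listen_handler_key
				   : &cma_conn_handler_key);
}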

Jason


* Re: [bug report] WARNING: possible circular locking at: rdma_destroy_id+0x17/0x20 [rdma_cm] triggered by blktests nvmeof-mp/002
  2022-06-01 12:45             ` Jason Gunthorpe
@ 2022-06-01 16:26               ` Bart Van Assche
  2022-06-01 17:30                 ` Jason Gunthorpe
  0 siblings, 1 reply; 19+ messages in thread
From: Bart Van Assche @ 2022-06-01 16:26 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Sagi Grimberg, Yi Zhang, RDMA mailing list, open list:NVM EXPRESS DRIVER

On 6/1/22 05:45, Jason Gunthorpe wrote:
> On Tue, May 31, 2022 at 10:55:46AM -0700, Bart Van Assche wrote:
>> On 5/31/22 05:35, Jason Gunthorpe wrote:
>>> On Sat, May 28, 2022 at 09:00:16PM +0200, Bart Van Assche wrote:
>>>> On 5/27/22 14:52, Jason Gunthorpe wrote:
>>>>> That only works if you can detect actual different lock classes during
>>>>> lock creation. It doesn't seem applicable in this case.
>>>>
>>>> Why doesn't it seem applicable in this case? The default behavior of
>>>> mutex_init() and related initialization functions is to create one lock
>>>> class per synchronization object initialization caller.
>>>> lockdep_register_key() can be used to create one lock class per
>>>> synchronization object instance. I introduced lockdep_register_key() myself
>>>> a few years ago.
>>>
>>> I don't think this should be used to create one key per instance of
>>> the object which would be required here. The overhead would be very
>>> high.
>>
>> Are we perhaps referring to different code changes? I'm referring to the
>> code change below. The runtime and memory overhead of the patch below
>> should be minimal.
> 
> This is not minimal; the lockdep graph will now grow by a node for
> every CM ID ever created, along with all the additional locking arcs.
> This is an expensive operation.
> 
> AFAIK keys should not be created per-object like this, but based on
> object classes known when the object is created - e.g. a CM listening
> ID vs. a connecting ID.
> 
> Per-instance keys might be a suitable hack if the # of objects were
> small?

Lockdep uses hashing when looking up a lock object so the lookup time 
shouldn't increase significantly if the number of hash collisions stays 
low. I think this is likely since the number of hash entries is 
identical to the maximum number of synchronization objects divided by 
two. See also the definition of the lock_keys_hash[] array in 
kernel/locking/lockdep.c.
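
For reference, the relevant definitions (quoting
kernel/locking/lockdep.c from memory, so double-check the exact names):

#define KEYHASH_BITS		(MAX_LOCKDEP_KEYS_BITS - 1)
#define KEYHASH_SIZE		(1UL << KEYHASH_BITS)
/* Hash table for dynamically registered lock keys. */
static struct hlist_head lock_keys_hash[KEYHASH_SIZE];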

Thanks,

Bart.


* Re: [bug report] WARNING: possible circular locking at: rdma_destroy_id+0x17/0x20 [rdma_cm] triggered by blktests nvmeof-mp/002
  2022-06-01 16:26               ` Bart Van Assche
@ 2022-06-01 17:30                 ` Jason Gunthorpe
  2022-06-03  5:13                   ` Bart Van Assche
  0 siblings, 1 reply; 19+ messages in thread
From: Jason Gunthorpe @ 2022-06-01 17:30 UTC (permalink / raw)
  To: Bart Van Assche
  Cc: Sagi Grimberg, Yi Zhang, RDMA mailing list, open list:NVM EXPRESS DRIVER

On Wed, Jun 01, 2022 at 09:26:52AM -0700, Bart Van Assche wrote:
> On 6/1/22 05:45, Jason Gunthorpe wrote:
> > On Tue, May 31, 2022 at 10:55:46AM -0700, Bart Van Assche wrote:
> > > On 5/31/22 05:35, Jason Gunthorpe wrote:
> > > > On Sat, May 28, 2022 at 09:00:16PM +0200, Bart Van Assche wrote:
> > > > > On 5/27/22 14:52, Jason Gunthorpe wrote:
> > > > > > That only works if you can detect actual different lock classes during
> > > > > > lock creation. It doesn't seem applicable in this case.
> > > > > 
> > > > > Why doesn't it seem applicable in this case? The default behavior of
> > > > > mutex_init() and related initialization functions is to create one lock
> > > > > class per synchronization object initialization caller.
> > > > > lockdep_register_key() can be used to create one lock class per
> > > > > synchronization object instance. I introduced lockdep_register_key() myself
> > > > > a few years ago.
> > > > 
> > > > I don't think this should be used to create one key per instance of
> > > > the object which would be required here. The overhead would be very
> > > > high.
> > > 
> > > Are we perhaps referring to different code changes? I'm referring to the
> > > code change below. The runtime and memory overhead of the patch below
> > > should be minimal.
> > 
> > This is not minimal; the lockdep graph will now grow by a node for
> > every CM ID ever created, along with all the additional locking arcs.
> > This is an expensive operation.
> > 
> > AFAIK keys should not be created per-object like this, but based on
> > object classes known when the object is created - e.g. a CM listening
> > ID vs. a connecting ID.
> > 
> > Per-instance keys might be a suitable hack if the # of objects were
> > small?
> 
> Lockdep uses hashing when looking up a lock object so the lookup time
> shouldn't increase significantly if the number of hash collisions stays low.
> I think this is likely since the number of hash entries is identical to the
> maximum number of synchronization objects divided by two. See also the
> definition of the lock_keys_hash[] array in kernel/locking/lockdep.c.

That is just the keys, not the graph arcs. lockdep records an arc
between every key that establishes a locking relationship and
minimizing the number of keys also de-duplicates those arcs.

Jason


* Re: [bug report] WARNING: possible circular locking at: rdma_destroy_id+0x17/0x20 [rdma_cm] triggered by blktests nvmeof-mp/002
  2022-06-01 17:30                 ` Jason Gunthorpe
@ 2022-06-03  5:13                   ` Bart Van Assche
  2022-06-06 16:21                     ` Jason Gunthorpe
  0 siblings, 1 reply; 19+ messages in thread
From: Bart Van Assche @ 2022-06-03  5:13 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Sagi Grimberg, Yi Zhang, RDMA mailing list, open list:NVM EXPRESS DRIVER

On 6/1/22 10:30, Jason Gunthorpe wrote:
> That is just the keys, not the graph arcs. lockdep records an arc
> between every key that establishes a locking relationship and
> minimizing the number of keys also de-duplicates those arcs.

Do you agree that this overhead is acceptable since lockdep is only
used to debug kernel code?

Thanks,

Bart.



* Re: [bug report] WARNING: possible circular locking at: rdma_destroy_id+0x17/0x20 [rdma_cm] triggered by blktests nvmeof-mp/002
  2022-06-03  5:13                   ` Bart Van Assche
@ 2022-06-06 16:21                     ` Jason Gunthorpe
  2022-08-23  7:29                       ` yangx.jy
  0 siblings, 1 reply; 19+ messages in thread
From: Jason Gunthorpe @ 2022-06-06 16:21 UTC (permalink / raw)
  To: Bart Van Assche
  Cc: Sagi Grimberg, Yi Zhang, RDMA mailing list, open list:NVM EXPRESS DRIVER

On Thu, Jun 02, 2022 at 10:13:25PM -0700, Bart Van Assche wrote:
> On 6/1/22 10:30, Jason Gunthorpe wrote:
> > That is just the keys, not the graph arcs. Lockdep records an arc
> > between every pair of keys that establish a locking relationship, and
> > minimizing the number of keys also de-duplicates those arcs.
> 
> Do you agree that this overhead is acceptable since lockdep is only
> used to debug kernel code?

I don't know lockdep well enough to say - if you bloat the graph too
much it may become excessively slow when searching for cycles/edges.
Debug workloads with CM may trigger the creation of thousands of keys,
and we still want the testing to be workable.

IMHO it was not designed to have one key per instance.

Jason
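
For contrast, a rough sketch of the class-based alternative Jason
describes - a small, fixed set of static keys chosen when the ID is
created; the names are again illustrative:

#include <linux/lockdep.h>
#include <linux/mutex.h>

struct demo_id_priv {			/* simplified for illustration */
	struct mutex handler_mutex;
};

/* Two static keys cover all IDs, so the lockdep graph stays small no
 * matter how many CM IDs a debug workload creates. */
static struct lock_class_key demo_listen_key;
static struct lock_class_key demo_connect_key;

static void demo_init_handler_mutex(struct demo_id_priv *id_priv,
				    bool is_listener)
{
	mutex_init(&id_priv->handler_mutex);
	lockdep_set_class(&id_priv->handler_mutex,
			  is_listener ? &demo_listen_key : &demo_connect_key);
}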

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [bug report] WARNING: possible circular locking at: rdma_destroy_id+0x17/0x20 [rdma_cm] triggered by blktests nvmeof-mp/002
  2022-06-06 16:21                     ` Jason Gunthorpe
@ 2022-08-23  7:29                       ` yangx.jy
  0 siblings, 0 replies; 19+ messages in thread
From: yangx.jy @ 2022-08-23  7:29 UTC (permalink / raw)
  To: Jason Gunthorpe, Bart Van Assche
  Cc: Sagi Grimberg, Yi Zhang, RDMA mailing list, open list:NVM EXPRESS DRIVER

On 2022/6/7 0:21, Jason Gunthorpe wrote:
> On Thu, Jun 02, 2022 at 10:13:25PM -0700, Bart Van Assche wrote:
>> On 6/1/22 10:30, Jason Gunthorpe wrote:
>>> That is just the keys, not the graph arcs. Lockdep records an arc
>>> between every pair of keys that establish a locking relationship, and
>>> minimizing the number of keys also de-duplicates those arcs.
>>
>> Do you agree that this overhead is acceptable since lockdep is only
>> used to debug kernel code?
> 
> I don't know lockdep well enough to say - if you bloat the graph too
> much it may become excessively slow when searching for cycles/edges.
> Debug workloads with CM may trigger the creation of thousands of keys,
> and we still want the testing to be workable.
> 
> IMHO it was not designed to have one key per instance.

Hi Jason, Bart

Is there a patch to fix this issue yet?

Best Regards,
Xiao Yang

> 
> Jason

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [bug report] WARNING: possible circular locking at: rdma_destroy_id+0x17/0x20 [rdma_cm] triggered by blktests nvmeof-mp/002
  2022-05-25 11:01 ` Sagi Grimberg
  2022-05-25 18:50   ` Bart Van Assche
@ 2022-08-25  5:59   ` yangx.jy
  2022-08-25  6:26     ` Guoqing Jiang
  1 sibling, 1 reply; 19+ messages in thread
From: yangx.jy @ 2022-08-25  5:59 UTC (permalink / raw)
  To: Sagi Grimberg, Yi Zhang, RDMA mailing list,
	open list:NVM EXPRESS DRIVER, bvanassche, Jason Gunthorpe

On 2022/5/25 19:01, Sagi Grimberg wrote:
> IIRC this was reported before; based on my analysis, lockdep is giving
> a false alarm here. The reason is that the id_priv->handler_mutex cannot
> be the same for both the cm_id that is handling the connect and the cm_id
> that is handling the rdma_destroy_id, because rdma_destroy_id() is always
> called on an already disconnected cm_id, so the deadlock that lockdep is
> complaining about cannot happen.

Hi Jason, Bart and Sagi,

I also think it is actually a false positive.  The cm_id handling the 
connection and the cm_id calling rdma_destroy_id() cannot be the same 
one, right?

> 
> I'm not sure how to settle this.

Do you have any suggestions for removing the false positive by refactoring 
the related RDMA/CM code? Sorry, I don't know how to do it for now.

Best Regards,
Xiao Yang

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [bug report] WARNING: possible circular locking at: rdma_destroy_id+0x17/0x20 [rdma_cm] triggered by blktests nvmeof-mp/002
  2022-08-25  5:59   ` yangx.jy
@ 2022-08-25  6:26     ` Guoqing Jiang
  2022-08-26 10:03       ` yangx.jy
  0 siblings, 1 reply; 19+ messages in thread
From: Guoqing Jiang @ 2022-08-25  6:26 UTC (permalink / raw)
  To: yangx.jy, Sagi Grimberg, Yi Zhang, RDMA mailing list,
	open list:NVM EXPRESS DRIVER, bvanassche, Jason Gunthorpe



On 8/25/22 1:59 PM, yangx.jy@fujitsu.com wrote:
> On 2022/5/25 19:01, Sagi Grimberg wrote:
>> IIRC this was reported before; based on my analysis, lockdep is giving
>> a false alarm here. The reason is that the id_priv->handler_mutex cannot
>> be the same for both the cm_id that is handling the connect and the cm_id
>> that is handling the rdma_destroy_id, because rdma_destroy_id() is always
>> called on an already disconnected cm_id, so the deadlock that lockdep is
>> complaining about cannot happen.
> Hi Jason, Bart and Sagi,
>
> I also think it is actually a false positive.  The cm_id handling the
> connection and the cm_id calling rdma_destroy_id() cannot be the same
> one, right?

I am wondering if it is the same issue as in this thread:

https://lore.kernel.org/linux-rdma/CAMGffEm22sP-oKK0D9=vOw77nbS05iwG7MC3DTVB0CyzVFhtXg@mail.gmail.com/

>> I'm not sure how to settle this.
> Do you have any suggestions for removing the false positive by refactoring
> the related RDMA/CM code? Sorry, I don't know how to do it for now.

The simplest way, given that it is a false alarm, is to call lockdep_off()
to avoid the debugging effort, but not everyone likes the idea.

https://elixir.bootlin.com/linux/v6.0-rc2/C/ident/lockdep_off

Thanks,
Guoqing
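
For reference, a sketch of that workaround; lockdep_off() and
lockdep_on() disable and re-enable lock tracking for the current task,
and demo_destroy_id() is a hypothetical teardown path, not the real
rdma_destroy_id():

#include <linux/lockdep.h>
#include <linux/mutex.h>

struct demo_id_priv {			/* simplified for illustration */
	struct mutex handler_mutex;
};

static void demo_destroy_id(struct demo_id_priv *id_priv)
{
	lockdep_off();		/* stop lock tracking for this task */
	mutex_lock(&id_priv->handler_mutex);
	/* ... tear down the already-disconnected cm_id ... */
	mutex_unlock(&id_priv->handler_mutex);
	lockdep_on();		/* resume lock tracking */
}

The obvious downside is that a real deadlock introduced later in this
path would also go unreported.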

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [bug report] WARNING: possible circular locking at: rdma_destroy_id+0x17/0x20 [rdma_cm] triggered by blktests nvmeof-mp/002
  2022-08-25  6:26     ` Guoqing Jiang
@ 2022-08-26 10:03       ` yangx.jy
  2022-08-26 11:32         ` Guoqing Jiang
  0 siblings, 1 reply; 19+ messages in thread
From: yangx.jy @ 2022-08-26 10:03 UTC (permalink / raw)
  To: Guoqing Jiang, Sagi Grimberg, Yi Zhang, RDMA mailing list,
	open list:NVM EXPRESS DRIVER, bvanassche, Jason Gunthorpe

On 2022/8/25 14:26, Guoqing Jiang wrote:
> 
> 
> On 8/25/22 1:59 PM, yangx.jy@fujitsu.com wrote:
>> On 2022/5/25 19:01, Sagi Grimberg wrote:
>>> IIRC this was reported before; based on my analysis, lockdep is giving
>>> a false alarm here. The reason is that the id_priv->handler_mutex cannot
>>> be the same for both the cm_id that is handling the connect and the cm_id
>>> that is handling the rdma_destroy_id, because rdma_destroy_id() is always
>>> called on an already disconnected cm_id, so the deadlock that lockdep is
>>> complaining about cannot happen.
>> Hi Jason, Bart and Sagi,
>>
>> I also think it is actually a false positive.  The cm_id handling the
>> connection and the cm_id calling rdma_destroy_id() cannot be the same
>> one, right?
> 
> I am wondering if it is the same issue as in this thread:
> 
> https://lore.kernel.org/linux-rdma/CAMGffEm22sP-oKK0D9=vOw77nbS05iwG7MC3DTVB0CyzVFhtXg@mail.gmail.com/ 

Hi Guoqing,

Thanks for your feedback.

I think they are the same deadlock issue (i.e. AB vs BCA). The only
difference is that two different combinations of locks trigger the same
report.

It seems that one id_priv->handler_mutex is locked on the newly created
cm_id and the other id_priv->handler_mutex is locked on the disconnected
cm_id.

> 
> 
>>> I'm not sure how to settle this.
>> Do you have any suggestions for removing the false positive by refactoring
>> the related RDMA/CM code? Sorry, I don't know how to do it for now.
> 
> The simplest way, given that it is a false alarm, is to call lockdep_off()
> to avoid the debugging effort, but not everyone likes the idea.
> 
> https://elixir.bootlin.com/linux/v6.0-rc2/C/ident/lockdep_off

To be honest, I don't like that fix either. I wonder if we can avoid 
the false positive by changing the related RDMA/CM code.

Best Regards,
Xiao Yang

> 
> Thanks,
> Guoqing
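
One way the CM code could express the invariant to lockdep without
disabling it entirely is a nesting annotation; a rough sketch with
hypothetical names, assuming the two handler_mutex instances really can
never be the same:

#include <linux/lockdep.h>
#include <linux/mutex.h>

struct demo_id_priv {			/* simplified for illustration */
	struct mutex handler_mutex;
};

/* Taking the new cm_id's handler_mutex while the disconnected cm_id's
 * handler_mutex is held: mutex_lock_nested() marks the inner lock as a
 * distinct instance of the same class, so lockdep does not record a
 * self-referential arc for the class. */
static void demo_lock_pair(struct demo_id_priv *old_id,
			   struct demo_id_priv *new_id)
{
	mutex_lock(&old_id->handler_mutex);
	mutex_lock_nested(&new_id->handler_mutex, SINGLE_DEPTH_NESTING);
}

The annotation would also hide a genuine ordering bug between two
handler_mutex instances, so it is only safe if that invariant holds.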

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [bug report] WARNING: possible circular locking at: rdma_destroy_id+0x17/0x20 [rdma_cm] triggered by blktests nvmeof-mp/002
  2022-08-26 10:03       ` yangx.jy
@ 2022-08-26 11:32         ` Guoqing Jiang
  0 siblings, 0 replies; 19+ messages in thread
From: Guoqing Jiang @ 2022-08-26 11:32 UTC (permalink / raw)
  To: yangx.jy, Sagi Grimberg, Yi Zhang, RDMA mailing list,
	open list:NVM EXPRESS DRIVER, bvanassche, Jason Gunthorpe



On 8/26/22 6:03 PM, yangx.jy@fujitsu.com wrote:
> On 2022/8/25 14:26, Guoqing Jiang wrote:
>>
>> On 8/25/22 1:59 PM, yangx.jy@fujitsu.com wrote:
>>> On 2022/5/25 19:01, Sagi Grimberg wrote:
>>>> IIRC this was reported before; based on my analysis, lockdep is giving
>>>> a false alarm here. The reason is that the id_priv->handler_mutex cannot
>>>> be the same for both the cm_id that is handling the connect and the cm_id
>>>> that is handling the rdma_destroy_id, because rdma_destroy_id() is always
>>>> called on an already disconnected cm_id, so the deadlock that lockdep is
>>>> complaining about cannot happen.
>>> Hi Jason, Bart and Sagi,
>>>
>>> I also think it is actually a false positive.  The cm_id handling the
>>> connection and the cm_id calling rdma_destroy_id() cannot be the same
>>> one, right?
>> I am wondering if it is the same issue as in this thread:
>>
>> https://lore.kernel.org/linux-rdma/CAMGffEm22sP-oKK0D9=vOw77nbS05iwG7MC3DTVB0CyzVFhtXg@mail.gmail.com/
> Hi Guoqing,
>
> Thanks for your feedback.
>
> I think they are the same deadlock issue (i.e. AB vs BCA). The only
> difference is that two different combinations of locks trigger the same
> report.
>
> It seems that one id_priv->handler_mutex is locked on the newly created
> cm_id and the other id_priv->handler_mutex is locked on the disconnected
> cm_id.
>
>>>> I'm not sure how to settle this.
>>> Do you have any suggestions for removing the false positive by refactoring
>>> the related RDMA/CM code? Sorry, I don't know how to do it for now.
>> The simplest way, given that it is a false alarm, is to call lockdep_off()
>> to avoid the debugging effort, but not everyone likes the idea.
>>
>> https://elixir.bootlin.com/linux/v6.0-rc2/C/ident/lockdep_off
> To be honest, I don't like that fix either. I wonder if we can avoid
> the false positive by changing the related RDMA/CM code.

I would consider it a workaround until the CM code is changed (which I
guess needs more effort, though hopefully I am wrong); otherwise different
people will keep posting the same issue to the list.

Thanks,
Guoqing

^ permalink raw reply	[flat|nested] 19+ messages in thread

end of thread, other threads:[~2022-08-26 11:32 UTC | newest]

Thread overview: 19+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-12-04  3:04 [bug report] WARNING: possible circular locking at: rdma_destroy_id+0x17/0x20 [rdma_cm] triggered by blktests nvmeof-mp/002 Yi Zhang
2022-02-27 23:21 ` Bart Van Assche
2022-05-25  3:40   ` yangx.jy
2022-05-25 11:01 ` Sagi Grimberg
2022-05-25 18:50   ` Bart Van Assche
2022-05-27 12:52     ` Jason Gunthorpe
2022-05-28 19:00       ` Bart Van Assche
2022-05-31 12:35         ` Jason Gunthorpe
2022-05-31 17:55           ` Bart Van Assche
2022-06-01 12:45             ` Jason Gunthorpe
2022-06-01 16:26               ` Bart Van Assche
2022-06-01 17:30                 ` Jason Gunthorpe
2022-06-03  5:13                   ` Bart Van Assche
2022-06-06 16:21                     ` Jason Gunthorpe
2022-08-23  7:29                       ` yangx.jy
2022-08-25  5:59   ` yangx.jy
2022-08-25  6:26     ` Guoqing Jiang
2022-08-26 10:03       ` yangx.jy
2022-08-26 11:32         ` Guoqing Jiang
