* [PATCH 0/3] patches and bug report for rxe
@ 2022-02-10  7:36 Guoqing Jiang
  2022-02-10  7:36 ` [PATCH 1/3] RDMA/rxe: Replace write_lock_bh with write_lock_irqsave in __rxe_add_index Guoqing Jiang
                   ` (3 more replies)
  0 siblings, 4 replies; 17+ messages in thread
From: Guoqing Jiang @ 2022-02-10  7:36 UTC (permalink / raw)
  To: zyjzyj2000, jgg, rpearsonhpe; +Cc: linux-rdma

Hi,

Recently I have been trying to run rnbd/rtrs on top of rxe, and I get several
call traces with the 5.17-rc3 kernel (CONFIG_PROVE_LOCKING is enabled).

However, it seems rnbd/rtrs over rxe still can't work with the 5.17-rc3 kernel;
dmesg reports the following.

1. server side

[  440.723182] rdma_rxe: qp#17 moved to error state
[  440.725300] rtrs_server L1205: <bla>: remote access error (wr_cqe: 000000003b14397c, type: 0, vendor_err: 0x0, len: 0)
[  440.845926] rnbd_server L256: RTRS Session bla disconnected

2. client side

[  997.817536] rnbd_client L596: Mapping device /dev/loop1 on session bla, (access_mode: rw, nr_poll_queues: 0)
[  998.968810] rnbd_client L1213: [session=bla] mapped 8/8 default/read queues.
[  999.017988] rtrs_client L610: <bla>: RDMA failed: remote access error
[ 1029.836943] rtrs_client L353: <bla>: Failed IB_WR_LOCAL_INV: WR flushe    

Then I tried the 5.16 and 5.15 kernels; 5.15 seems to work, as follows.

1. server side

[  333.076482] rnbd_server L800: </dev/loop1@bla>: Opened device 'loop1'

2. client side

[ 1584.325825] rnbd_client L596: Mapping device /dev/loop1 on session bla, (access_mode: rw, nr_poll_queues: 0)
[ 1585.268291] rnbd_client L1213: [session=bla] mapped 8/8 default/read queues.
[ 1585.349300] rnbd_client L1607: </dev/loop1@bla> map_device: Device mapped as rnbd0 (nsectors: 0, logical_block_size: 512, physical_block_size: 512, max_write_same_sectors: 0, max_discard_sectors: 0, discard_granularity: 0, discard_alignment: 0, secure_discard: 0, max_segments: 128, max_hw_sectors: 248, rotational: 1, wc: 0, fua: 0)

I would appreciate it if someone could shed light on why it doesn't work after 5.15,
and I am happy to test any potential patch for it.

Guoqing Jiang (3):
  RDMA/rxe: Replace write_{lock,unlock}_bh with
    write_lock_irq{save,restore} in __rxe_add_index
  RDMA/rxe: Replace write_{lock,unlock}_bh with
    write_lock_irq{save,restore} in __rxe_drop_index
  RDMA/rxe: Replace write_{lock,unlock}_bh with
    write_lock_irq{save,restore} in post_one_send

 drivers/infiniband/sw/rxe/rxe_pool.c  | 10 ++++++----
 drivers/infiniband/sw/rxe/rxe_verbs.c |  5 +++--
 2 files changed, 9 insertions(+), 6 deletions(-)

-- 
2.26.2


^ permalink raw reply	[flat|nested] 17+ messages in thread

* [PATCH 1/3] RDMA/rxe: Replace write_lock_bh with write_lock_irqsave in __rxe_add_index
  2022-02-10  7:36 [PATCH 0/3] patches and bug report for rxe Guoqing Jiang
@ 2022-02-10  7:36 ` Guoqing Jiang
  2022-02-10 13:29   ` Zhu Yanjun
  2022-02-10  7:36 ` [PATCH 2/3] RDMA/rxe: Replace write_lock_bh with write_lock_irqsave in __rxe_drop_index Guoqing Jiang
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 17+ messages in thread
From: Guoqing Jiang @ 2022-02-10  7:36 UTC (permalink / raw)
  To: zyjzyj2000, jgg, rpearsonhpe; +Cc: linux-rdma

We need to make the lock fully IRQ safe, otherwise the call trace below appears.

[  495.697917] ------------[ cut here ]------------
[  495.698316] WARNING: CPU: 5 PID: 67 at kernel/softirq.c:363 __local_bh_enable_ip+0xb1/0x110
[ ... ]
[  495.702594] CPU: 5 PID: 67 Comm: kworker/5:1 Kdump: loaded Tainted: G            EL    5.17.0-rc3-57-default #17
[  495.702856] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.14.0-0-g155821a-rebuilt.opensuse.org 04/01/2014
[  495.703144] Workqueue: ib_cm cm_work_handler [ib_cm]
[  495.708238] RIP: 0010:__local_bh_enable_ip+0xb1/0x110
[  495.713197] Code: e8 54 ae 04 00 e8 7f 4e 20 00 fb 66 0f 1f 44 00 00 65 8b 05 b1 03 f3 51 85 c0 74 51 5b 5d c3 65 8b 05 3f 0f f3 51 85 c0 75 8e <0f> 0b eb 8a e8 76 4c 20 00 eb 99 48 89 ef e8 9c 8d 0b 00 eb a2 48
[  495.723257] RSP: 0018:ffff888100f9f1d8 EFLAGS: 00010046
[  495.728296] RAX: 0000000000000000 RBX: 0000000000000201 RCX: dffffc0000000000
[  495.733441] RDX: 0000000000000007 RSI: 0000000000000201 RDI: ffffffffb095dbac
[  495.738546] RBP: ffffffffc1761aa5 R08: ffffffffae1059da R09: 0000000000000000
[  495.743689] R10: 0000000000000001 R11: 0000000000000001 R12: ffff88800f6cd380
[  495.748913] R13: 0000000000000000 R14: ffff8880031e1ae0 R15: ffff8880031e1a28
[  495.754091] FS:  0000000000000000(0000) GS:ffff888109880000(0000) knlGS:0000000000000000
[  495.759217] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  495.764434] CR2: 00007f69a232e830 CR3: 00000000b6a16005 CR4: 0000000000770ee0
[  495.769531] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[  495.774505] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[  495.779449] PKRU: 55555554
[  495.784331] Call Trace:
[  495.789157]  <TASK>
[  495.793988]  __rxe_add_index+0x35/0x40 [rdma_rxe]
[  495.798938]  rxe_create_ah+0xa9/0x1e0 [rdma_rxe]
[  495.804007]  _rdma_create_ah+0x28a/0x2c0 [ib_core]
[  495.809328]  ? ib_create_srq_user+0x2c0/0x2c0 [ib_core]
[  495.814439]  ? lock_acquire+0x182/0x410
[  495.819558]  ? lock_release+0x450/0x450
[  495.824880]  rdma_create_ah+0xe1/0x1a0 [ib_core]
[  495.830101]  ? _rdma_create_ah+0x2c0/0x2c0 [ib_core]
[  495.835261]  ? rwlock_bug.part.0+0x60/0x60
[  495.840418]  cm_alloc_msg+0xb4/0x260 [ib_cm]
[  495.845528]  cm_alloc_priv_msg+0x29/0x70 [ib_cm]
[  495.850656]  ib_send_cm_rep+0x7c/0x860 [ib_cm]
[  495.855677]  ? lock_is_held_type+0xe4/0x140
[  495.860761]  rdma_accept+0x44c/0x5e0 [rdma_cm]
[  495.865817]  ? cma_rep_recv+0x330/0x330 [rdma_cm]
[  495.870658]  ? rcu_read_lock_sched_held+0x3f/0x60
[  495.875388]  ? trace_kmalloc+0x29/0xd0
[  495.879807]  ? __kmalloc+0x1c5/0x3a0
[  495.884114]  ? rtrs_iu_alloc+0x12b/0x260 [rtrs_core]
[  495.888343]  rtrs_srv_rdma_cm_handler+0x7ba/0xcf0 [rtrs_server]
[  495.892503]  ? rtrs_srv_inv_rkey_done+0x100/0x100 [rtrs_server]
[  495.896532]  ? find_held_lock+0x85/0xa0
[  495.900417]  ? lock_release+0x24e/0x450
[  495.904174]  ? rdma_restrack_add+0x9c/0x220 [ib_core]
[  495.907939]  ? rcu_read_lock_sched_held+0x3f/0x60
[  495.911638]  cma_cm_event_handler+0x77/0x2c0 [rdma_cm]
[  495.915225]  cma_ib_req_handler+0xbd5/0x23f0 [rdma_cm]
[  495.918702]  ? cma_cancel_operation+0x1f0/0x1f0 [rdma_cm]
[  495.922039]  ? lockdep_lock+0xb4/0x170
[  495.925195]  ? _find_first_zero_bit+0x28/0x50
[  495.928525]  ? mark_held_locks+0x65/0x90
[  495.931787]  cm_process_work+0x2f/0x210 [ib_cm]
[  495.934952]  ? _raw_spin_unlock_irq+0x35/0x50
[  495.937930]  ? cm_queue_work_unlock+0x40/0x110 [ib_cm]
[  495.940899]  cm_req_handler+0xf7f/0x2030 [ib_cm]
[  495.943738]  ? cm_lap_handler+0xba0/0xba0 [ib_cm]
[  495.946708]  ? lockdep_hardirqs_on_prepare+0x220/0x220
[  495.949600]  cm_work_handler+0x6ce/0x37c0 [ib_cm]
[  495.952395]  ? lock_acquire+0x182/0x410
[  495.955245]  ? lock_release+0x450/0x450
[  495.958005]  ? lock_downgrade+0x3c0/0x3c0
[  495.960695]  ? ib_cm_init_qp_attr+0xa90/0xa90 [ib_cm]
[  495.963323]  ? mark_held_locks+0x24/0x90
[  495.965902]  ? lock_is_held_type+0xe4/0x140
[  495.968597]  process_one_work+0x5a8/0xa80
[  495.971155]  ? lock_release+0x450/0x450
[  495.973812]  ? pwq_dec_nr_in_flight+0x100/0x100
[  495.976426]  ? rwlock_bug.part.0+0x60/0x60
[  495.979006]  ? _raw_spin_lock_irq+0x54/0x60
[  495.981600]  worker_thread+0x2b5/0x760
[  495.984272]  ? process_one_work+0xa80/0xa80
[  495.986832]  kthread+0x169/0x1a0
[  495.989348]  ? kthread_complete_and_exit+0x20/0x20
[  495.992032]  ret_from_fork+0x1f/0x30
[  495.994622]  </TASK>
[  495.997126] irq event stamp: 52525
[  495.999637] hardirqs last  enabled at (52523): [<ffffffffaf179c6d>] _raw_spin_unlock_irqrestore+0x2d/0x60
[  496.002367] hardirqs last disabled at (52524): [<ffffffffaf179a10>] _raw_spin_lock_irqsave+0x60/0x70
[  496.005109] softirqs last  enabled at (52514): [<ffffffffc1764b58>] rxe_post_recv+0xb8/0x120 [rdma_rxe]
[  496.007888] softirqs last disabled at (52525): [<ffffffffc1761a92>] __rxe_add_index+0x22/0x40 [rdma_rxe]
[  496.010698] ---[ end trace 0000000000000000 ]---

Signed-off-by: Guoqing Jiang <guoqing.jiang@linux.dev>
---
 drivers/infiniband/sw/rxe/rxe_pool.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/drivers/infiniband/sw/rxe/rxe_pool.c b/drivers/infiniband/sw/rxe/rxe_pool.c
index 63c594173565..b4444785da52 100644
--- a/drivers/infiniband/sw/rxe/rxe_pool.c
+++ b/drivers/infiniband/sw/rxe/rxe_pool.c
@@ -300,10 +300,11 @@ int __rxe_add_index(struct rxe_pool_elem *elem)
 {
 	struct rxe_pool *pool = elem->pool;
 	int err;
+	unsigned long flags;
 
-	write_lock_bh(&pool->pool_lock);
+	write_lock_irqsave(&pool->pool_lock, flags);
 	err = __rxe_add_index_locked(elem);
-	write_unlock_bh(&pool->pool_lock);
+	write_unlock_irqrestore(&pool->pool_lock, flags);
 
 	return err;
 }
-- 
2.26.2


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH 2/3] RDMA/rxe: Replace write_lock_bh with write_lock_irqsave in __rxe_drop_index
  2022-02-10  7:36 [PATCH 0/3] patches and bug report for rxe Guoqing Jiang
  2022-02-10  7:36 ` [PATCH 1/3] RDMA/rxe: Replace write_lock_bh with write_lock_irqsave in __rxe_add_index Guoqing Jiang
@ 2022-02-10  7:36 ` Guoqing Jiang
  2022-02-10 14:16   ` Zhu Yanjun
  2022-02-10  7:36 ` [PATCH 3/3] RDMA/rxe: Replace spin_lock_bh with spin_lock_irqsave in post_one_send Guoqing Jiang
  2022-02-22  9:50 ` bug report for rxe Guoqing Jiang
  3 siblings, 1 reply; 17+ messages in thread
From: Guoqing Jiang @ 2022-02-10  7:36 UTC (permalink / raw)
  To: zyjzyj2000, jgg, rpearsonhpe; +Cc: linux-rdma

Same as in __rxe_add_index, the lock needs to be fully IRQ safe, otherwise the
call trace below appears.

[  250.757218] ------------[ cut here ]------------
[  250.758997] WARNING: CPU: 6 PID: 90 at kernel/softirq.c:363 __local_bh_enable_ip+0xb1/0x110
[ ... ]
[  250.769900] CPU: 6 PID: 90 Comm: kworker/u16:3 Kdump: loaded Tainted: G           OEL    5.17.0-rc3-57-default #17
[  250.770413] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.14.0-0-g155821a-rebuilt.opensuse.org 04/01/2014
[  250.770955] Workqueue: ib_mad1 timeout_sends [ib_core]
[  250.771400] RIP: 0010:__local_bh_enable_ip+0xb1/0x110
[  250.771678] Code: e8 54 ae 04 00 e8 7f 4e 20 00 fb 66 0f 1f 44 00 00 65 8b 05 b1 03 b3 60 85 c0 74 51 5b 5d c3 65 8b 05 3f 0f b3 60 85 c0 75 8e <0f> 0b eb 8a e8 76 4c 20 00 eb 99 48 89 ef e8 9c 8d 0b 00 eb a2 48
[  250.772562] RSP: 0018:ffff88801b2e7ae8 EFLAGS: 00010046
[  250.772845] RAX: 0000000000000000 RBX: 0000000000000201 RCX: dffffc0000000000
[  250.773197] RDX: 0000000000000007 RSI: 0000000000000201 RDI: ffffffffa1d5dbac
[  250.773548] RBP: ffffffffc15c4da7 R08: ffffffff9f5059da R09: 0000000000000000
[  250.773911] R10: 0000000000000001 R11: 0000000000000001 R12: 0000000000000000
[  250.774261] R13: ffff888017c0e850 R14: ffff888016f18000 R15: 0000000000000005
[  250.774614] FS:  0000000000000000(0000) GS:ffff888104f00000(0000) knlGS:0000000000000000
[  250.775009] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  250.775296] CR2: 00007f7ea9e84fe8 CR3: 000000000216e002 CR4: 0000000000770ee0
[  250.775651] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[  250.784298] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[  250.791093] PKRU: 55555554
[  250.796587] Call Trace:
[  250.801957]  <TASK>
[  250.807269]  rxe_destroy_ah+0x17/0x60 [rdma_rxe]
[  250.812682]  rdma_destroy_ah_user+0x5a/0xb0 [ib_core]
[  250.818242]  cm_free_priv_msg+0x6e/0x130 [ib_cm]
[  250.823738]  cm_send_handler+0x1f6/0x480 [ib_cm]
[  250.829176]  ? ib_cm_insert_listen+0x100/0x100 [ib_cm]
[  250.834654]  ? lockdep_hardirqs_on_prepare+0x129/0x220
[  250.840111]  ? _raw_spin_unlock_irqrestore+0x2d/0x60
[  250.845514]  timeout_sends+0x310/0x420 [ib_core]
[  250.851007]  ? ib_send_mad+0x850/0x850 [ib_core]
[  250.856471]  ? mark_held_locks+0x24/0x90
[  250.861679]  ? lock_is_held_type+0xe4/0x140
[  250.866835]  process_one_work+0x5a8/0xa80
[  250.871949]  ? lock_release+0x450/0x450
[  250.877061]  ? pwq_dec_nr_in_flight+0x100/0x100
[  250.882144]  ? rwlock_bug.part.0+0x60/0x60
[  250.887093]  ? _raw_spin_lock_irq+0x54/0x60
[  250.891960]  worker_thread+0x2b5/0x760
[  250.896691]  ? process_one_work+0xa80/0xa80
[  250.901265]  kthread+0x169/0x1a0
[  250.905722]  ? kthread_complete_and_exit+0x20/0x20
[  250.910094]  ret_from_fork+0x1f/0x30
[  250.914381]  </TASK>
[  250.918441] irq event stamp: 21397
[  250.922427] hardirqs last  enabled at (21395): [<ffffffffa0579c6d>] _raw_spin_unlock_irqrestore+0x2d/0x60
[  250.926476] hardirqs last disabled at (21396): [<ffffffffa05799a4>] _raw_spin_lock_irq+0x54/0x60
[  250.930443] softirqs last  enabled at (21370): [<ffffffff9f5319f8>] process_one_work+0x5a8/0xa80
[  250.934329] softirqs last disabled at (21397): [<ffffffffc15c1b60>] __rxe_drop_index+0x20/0x40 [rdma_rxe]
[  250.938120] ---[ end trace 0000000000000000 ]---

Signed-off-by: Guoqing Jiang <guoqing.jiang@linux.dev>
---
 drivers/infiniband/sw/rxe/rxe_pool.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/drivers/infiniband/sw/rxe/rxe_pool.c b/drivers/infiniband/sw/rxe/rxe_pool.c
index b4444785da52..026b60363fd6 100644
--- a/drivers/infiniband/sw/rxe/rxe_pool.c
+++ b/drivers/infiniband/sw/rxe/rxe_pool.c
@@ -320,10 +320,11 @@ void __rxe_drop_index_locked(struct rxe_pool_elem *elem)
 void __rxe_drop_index(struct rxe_pool_elem *elem)
 {
 	struct rxe_pool *pool = elem->pool;
+	unsigned long flags;
 
-	write_lock_bh(&pool->pool_lock);
+	write_lock_irqsave(&pool->pool_lock, flags);
 	__rxe_drop_index_locked(elem);
-	write_unlock_bh(&pool->pool_lock);
+	write_unlock_irqrestore(&pool->pool_lock, flags);
 }
 
 void *rxe_alloc_locked(struct rxe_pool *pool)
-- 
2.26.2


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH 3/3] RDMA/rxe: Replace spin_lock_bh with spin_lock_irqsave in post_one_send
  2022-02-10  7:36 [PATCH 0/3] patches and bug report for rxe Guoqing Jiang
  2022-02-10  7:36 ` [PATCH 1/3] RDMA/rxe: Replace write_lock_bh with write_lock_irqsave in __rxe_add_index Guoqing Jiang
  2022-02-10  7:36 ` [PATCH 2/3] RDMA/rxe: Replace write_lock_bh with write_lock_irqsave in __rxe_drop_index Guoqing Jiang
@ 2022-02-10  7:36 ` Guoqing Jiang
  2022-02-10 14:18   ` Zhu Yanjun
  2022-02-22  9:50 ` bug report for rxe Guoqing Jiang
  3 siblings, 1 reply; 17+ messages in thread
From: Guoqing Jiang @ 2022-02-10  7:36 UTC (permalink / raw)
  To: zyjzyj2000, jgg, rpearsonhpe; +Cc: linux-rdma

Same as in __rxe_add_index, the lock needs to be fully IRQ safe, otherwise the
call trace below appears.

[  763.942623] ------------[ cut here ]------------
[  763.943171] WARNING: CPU: 5 PID: 97 at kernel/softirq.c:363 __local_bh_enable_ip+0xb1/0x110
[ ... ]
[  763.947276] CPU: 5 PID: 97 Comm: kworker/5:2 Kdump: loaded Tainted: G           OEL    5.17.0-rc3-57-default #17
[  763.947575] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.14.0-0-g155821a-rebuilt.opensuse.org 04/01/2014
[  763.947893] Workqueue: ib_cm cm_work_handler [ib_cm]
[  763.948075] RIP: 0010:__local_bh_enable_ip+0xb1/0x110
[  763.948232] Code: e8 54 ae 04 00 e8 7f 4e 20 00 fb 66 0f 1f 44 00 00 65 8b 05 b1 03 f3 56 85 c0 74 51 5b 5d c3 65 8b 05 3f 0f f3 56 85 c0 75 8e <0f> 0b eb 8a e8 76 4c 20 00 eb 99 48 89 ef e8 9c 8d 0b 00 eb a2 48
[  763.948736] RSP: 0018:ffff888004a970e8 EFLAGS: 00010046
[  763.948897] RAX: 0000000000000000 RBX: 0000000000000201 RCX: dffffc0000000000
[  763.949095] RDX: 0000000000000007 RSI: 0000000000000201 RDI: ffffffffab95dbac
[  763.949292] RBP: ffffffffc127c269 R08: ffffffffa91059da R09: ffff88800afde323
[  763.949556] R10: ffffed10015fbc64 R11: 0000000000000001 R12: ffffc900005a2000
[  763.949781] R13: 0000000000000000 R14: 0000000000000000 R15: ffff88800afde000
[  763.949982] FS:  0000000000000000(0000) GS:ffff888104c80000(0000) knlGS:0000000000000000
[  763.950205] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  763.950367] CR2: 00007f85ec3f5b18 CR3: 0000000116216005 CR4: 0000000000770ee0
[  763.956480] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[  763.962608] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[  763.968785] PKRU: 55555554
[  763.974707] Call Trace:
[  763.980557]  <TASK>
[  763.986377]  rxe_post_send+0x569/0x8e0 [rdma_rxe]
[  763.992340]  ib_send_mad+0x4c1/0x850 [ib_core]
[  763.998442]  ? ib_register_mad_agent+0x1710/0x1710 [ib_core]
[  764.004486]  ? __kmalloc+0x21d/0x3a0
[  764.010465]  ib_post_send_mad+0x28c/0x10b0 [ib_core]
[  764.016480]  ? lock_is_held_type+0xe4/0x140
[  764.022359]  ? find_held_lock+0x85/0xa0
[  764.028230]  ? lock_release+0x24e/0x450
[  764.034061]  ? timeout_sends+0x420/0x420 [ib_core]
[  764.039879]  ? ib_create_send_mad+0x541/0x670 [ib_core]
[  764.045604]  ? do_raw_spin_unlock+0x86/0xf0
[  764.051178]  ? preempt_count_sub+0x14/0xc0
[  764.056851]  ? lock_is_held_type+0xe4/0x140
[  764.062412]  ib_send_cm_rep+0x47a/0x860 [ib_cm]
[  764.067965]  rdma_accept+0x44c/0x5e0 [rdma_cm]
[  764.073381]  ? cma_rep_recv+0x330/0x330 [rdma_cm]
[  764.078762]  ? rcu_read_lock_sched_held+0x3f/0x60
[  764.084072]  ? trace_kmalloc+0x29/0xd0
[  764.089185]  ? __kmalloc+0x1c5/0x3a0
[  764.094185]  ? rtrs_iu_alloc+0x12b/0x260 [rtrs_core]
[  764.099075]  rtrs_srv_rdma_cm_handler+0x7ba/0xcf0 [rtrs_server]
[  764.103917]  ? rtrs_srv_inv_rkey_done+0x100/0x100 [rtrs_server]
[  764.108563]  ? find_held_lock+0x85/0xa0
[  764.113033]  ? lock_release+0x24e/0x450
[  764.117452]  ? rdma_restrack_add+0x9c/0x220 [ib_core]
[  764.121797]  ? rcu_read_lock_sched_held+0x3f/0x60
[  764.125961]  cma_cm_event_handler+0x77/0x2c0 [rdma_cm]
[  764.130061]  cma_ib_req_handler+0xbd5/0x23f0 [rdma_cm]
[  764.134027]  ? cma_cancel_operation+0x1f0/0x1f0 [rdma_cm]
[  764.137950]  ? lockdep_lock+0xb4/0x170
[  764.141667]  ? _find_first_zero_bit+0x28/0x50
[  764.145486]  ? mark_held_locks+0x65/0x90
[  764.149002]  cm_process_work+0x2f/0x210 [ib_cm]
[  764.152413]  ? _raw_spin_unlock_irq+0x35/0x50
[  764.155763]  ? cm_queue_work_unlock+0x40/0x110 [ib_cm]
[  764.159080]  cm_req_handler+0xf7f/0x2030 [ib_cm]
[  764.162522]  ? cm_lap_handler+0xba0/0xba0 [ib_cm]
[  764.165847]  ? lockdep_hardirqs_on_prepare+0x220/0x220
[  764.169155]  cm_work_handler+0x6ce/0x37c0 [ib_cm]
[  764.172497]  ? lock_acquire+0x182/0x410
[  764.175771]  ? lock_release+0x450/0x450
[  764.178925]  ? lock_downgrade+0x3c0/0x3c0
[  764.182148]  ? ib_cm_init_qp_attr+0xa90/0xa90 [ib_cm]
[  764.185511]  ? mark_held_locks+0x24/0x90
[  764.188692]  ? lock_is_held_type+0xe4/0x140
[  764.191876]  process_one_work+0x5a8/0xa80
[  764.195034]  ? lock_release+0x450/0x450
[  764.198208]  ? pwq_dec_nr_in_flight+0x100/0x100
[  764.201433]  ? rwlock_bug.part.0+0x60/0x60
[  764.204660]  ? _raw_spin_lock_irq+0x54/0x60
[  764.207835]  worker_thread+0x2b5/0x760
[  764.210920]  ? process_one_work+0xa80/0xa80
[  764.214014]  kthread+0x169/0x1a0
[  764.217033]  ? kthread_complete_and_exit+0x20/0x20
[  764.220205]  ret_from_fork+0x1f/0x30
[  764.223467]  </TASK>
[  764.226482] irq event stamp: 55805
[  764.229527] hardirqs last  enabled at (55803): [<ffffffffaa179c6d>] _raw_spin_unlock_irqrestore+0x2d/0x60
[  764.232779] hardirqs last disabled at (55804): [<ffffffffaa179a10>] _raw_spin_lock_irqsave+0x60/0x70
[  764.236052] softirqs last  enabled at (55794): [<ffffffffc127cb68>] rxe_post_recv+0xb8/0x120 [rdma_rxe]
[  764.239428] softirqs last disabled at (55805): [<ffffffffc127beeb>] rxe_post_send+0x1eb/0x8e0 [rdma_rxe]
[  764.242740] ---[ end trace 0000000000000000 ]---

Signed-off-by: Guoqing Jiang <guoqing.jiang@linux.dev>
---
 drivers/infiniband/sw/rxe/rxe_verbs.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/drivers/infiniband/sw/rxe/rxe_verbs.c b/drivers/infiniband/sw/rxe/rxe_verbs.c
index 9f0aef4b649d..0056418425a1 100644
--- a/drivers/infiniband/sw/rxe/rxe_verbs.c
+++ b/drivers/infiniband/sw/rxe/rxe_verbs.c
@@ -644,12 +644,13 @@ static int post_one_send(struct rxe_qp *qp, const struct ib_send_wr *ibwr,
 	struct rxe_sq *sq = &qp->sq;
 	struct rxe_send_wqe *send_wqe;
 	int full;
+	unsigned long flags;
 
 	err = validate_send_wr(qp, ibwr, mask, length);
 	if (err)
 		return err;
 
-	spin_lock_bh(&qp->sq.sq_lock);
+	spin_lock_irqsave(&qp->sq.sq_lock, flags);
 
 	full = queue_full(sq->queue, QUEUE_TYPE_TO_DRIVER);
 
@@ -663,7 +664,7 @@ static int post_one_send(struct rxe_qp *qp, const struct ib_send_wr *ibwr,
 
 	queue_advance_producer(sq->queue, QUEUE_TYPE_TO_DRIVER);
 
-	spin_unlock_bh(&qp->sq.sq_lock);
+	spin_unlock_irqrestore(&qp->sq.sq_lock, flags);
 
 	return 0;
 }
-- 
2.26.2


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* Re: [PATCH 1/3] RDMA/rxe: Replace write_lock_bh with write_lock_irqsave in __rxe_add_index
  2022-02-10  7:36 ` [PATCH 1/3] RDMA/rxe: Replace write_lock_bh with write_lock_irqsave in __rxe_add_index Guoqing Jiang
@ 2022-02-10 13:29   ` Zhu Yanjun
  0 siblings, 0 replies; 17+ messages in thread
From: Zhu Yanjun @ 2022-02-10 13:29 UTC (permalink / raw)
  To: Guoqing Jiang; +Cc: Jason Gunthorpe, Bob Pearson, RDMA mailing list

On Thu, Feb 10, 2022 at 3:37 PM Guoqing Jiang <guoqing.jiang@linux.dev> wrote:
>
> We need to make the lock fully IRQ safe, otherwise the call trace below appears.
>
> [  495.697917] ------------[ cut here ]------------
> [  495.698316] WARNING: CPU: 5 PID: 67 at kernel/softirq.c:363 __local_bh_enable_ip+0xb1/0x110
> [ ... ]
> [  495.702594] CPU: 5 PID: 67 Comm: kworker/5:1 Kdump: loaded Tainted: G            EL    5.17.0-rc3-57-default #17
> [  495.702856] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.14.0-0-g155821a-rebuilt.opensuse.org 04/01/2014
> [  495.703144] Workqueue: ib_cm cm_work_handler [ib_cm]
> [  495.708238] RIP: 0010:__local_bh_enable_ip+0xb1/0x110
> [  495.713197] Code: e8 54 ae 04 00 e8 7f 4e 20 00 fb 66 0f 1f 44 00 00 65 8b 05 b1 03 f3 51 85 c0 74 51 5b 5d c3 65 8b 05 3f 0f f3 51 85 c0 75 8e <0f> 0b eb 8a e8 76 4c 20 00 eb 99 48 89 ef e8 9c 8d 0b 00 eb a2 48
> [  495.723257] RSP: 0018:ffff888100f9f1d8 EFLAGS: 00010046
> [  495.728296] RAX: 0000000000000000 RBX: 0000000000000201 RCX: dffffc0000000000
> [  495.733441] RDX: 0000000000000007 RSI: 0000000000000201 RDI: ffffffffb095dbac
> [  495.738546] RBP: ffffffffc1761aa5 R08: ffffffffae1059da R09: 0000000000000000
> [  495.743689] R10: 0000000000000001 R11: 0000000000000001 R12: ffff88800f6cd380
> [  495.748913] R13: 0000000000000000 R14: ffff8880031e1ae0 R15: ffff8880031e1a28
> [  495.754091] FS:  0000000000000000(0000) GS:ffff888109880000(0000) knlGS:0000000000000000
> [  495.759217] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [  495.764434] CR2: 00007f69a232e830 CR3: 00000000b6a16005 CR4: 0000000000770ee0
> [  495.769531] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> [  495.774505] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> [  495.779449] PKRU: 55555554
> [  495.784331] Call Trace:
> [  495.789157]  <TASK>
> [  495.793988]  __rxe_add_index+0x35/0x40 [rdma_rxe]
> [  495.798938]  rxe_create_ah+0xa9/0x1e0 [rdma_rxe]
> [  495.804007]  _rdma_create_ah+0x28a/0x2c0 [ib_core]
> [  495.809328]  ? ib_create_srq_user+0x2c0/0x2c0 [ib_core]
> [  495.814439]  ? lock_acquire+0x182/0x410
> [  495.819558]  ? lock_release+0x450/0x450
> [  495.824880]  rdma_create_ah+0xe1/0x1a0 [ib_core]
> [  495.830101]  ? _rdma_create_ah+0x2c0/0x2c0 [ib_core]
> [  495.835261]  ? rwlock_bug.part.0+0x60/0x60
> [  495.840418]  cm_alloc_msg+0xb4/0x260 [ib_cm]
> [  495.845528]  cm_alloc_priv_msg+0x29/0x70 [ib_cm]
> [  495.850656]  ib_send_cm_rep+0x7c/0x860 [ib_cm]
> [  495.855677]  ? lock_is_held_type+0xe4/0x140
> [  495.860761]  rdma_accept+0x44c/0x5e0 [rdma_cm]
> [  495.865817]  ? cma_rep_recv+0x330/0x330 [rdma_cm]
> [  495.870658]  ? rcu_read_lock_sched_held+0x3f/0x60
> [  495.875388]  ? trace_kmalloc+0x29/0xd0
> [  495.879807]  ? __kmalloc+0x1c5/0x3a0
> [  495.884114]  ? rtrs_iu_alloc+0x12b/0x260 [rtrs_core]
> [  495.888343]  rtrs_srv_rdma_cm_handler+0x7ba/0xcf0 [rtrs_server]
> [  495.892503]  ? rtrs_srv_inv_rkey_done+0x100/0x100 [rtrs_server]
> [  495.896532]  ? find_held_lock+0x85/0xa0
> [  495.900417]  ? lock_release+0x24e/0x450
> [  495.904174]  ? rdma_restrack_add+0x9c/0x220 [ib_core]
> [  495.907939]  ? rcu_read_lock_sched_held+0x3f/0x60
> [  495.911638]  cma_cm_event_handler+0x77/0x2c0 [rdma_cm]
> [  495.915225]  cma_ib_req_handler+0xbd5/0x23f0 [rdma_cm]
> [  495.918702]  ? cma_cancel_operation+0x1f0/0x1f0 [rdma_cm]
> [  495.922039]  ? lockdep_lock+0xb4/0x170
> [  495.925195]  ? _find_first_zero_bit+0x28/0x50
> [  495.928525]  ? mark_held_locks+0x65/0x90
> [  495.931787]  cm_process_work+0x2f/0x210 [ib_cm]
> [  495.934952]  ? _raw_spin_unlock_irq+0x35/0x50
> [  495.937930]  ? cm_queue_work_unlock+0x40/0x110 [ib_cm]
> [  495.940899]  cm_req_handler+0xf7f/0x2030 [ib_cm]
> [  495.943738]  ? cm_lap_handler+0xba0/0xba0 [ib_cm]
> [  495.946708]  ? lockdep_hardirqs_on_prepare+0x220/0x220
> [  495.949600]  cm_work_handler+0x6ce/0x37c0 [ib_cm]
> [  495.952395]  ? lock_acquire+0x182/0x410
> [  495.955245]  ? lock_release+0x450/0x450
> [  495.958005]  ? lock_downgrade+0x3c0/0x3c0
> [  495.960695]  ? ib_cm_init_qp_attr+0xa90/0xa90 [ib_cm]
> [  495.963323]  ? mark_held_locks+0x24/0x90
> [  495.965902]  ? lock_is_held_type+0xe4/0x140
> [  495.968597]  process_one_work+0x5a8/0xa80
> [  495.971155]  ? lock_release+0x450/0x450
> [  495.973812]  ? pwq_dec_nr_in_flight+0x100/0x100
> [  495.976426]  ? rwlock_bug.part.0+0x60/0x60
> [  495.979006]  ? _raw_spin_lock_irq+0x54/0x60
> [  495.981600]  worker_thread+0x2b5/0x760
> [  495.984272]  ? process_one_work+0xa80/0xa80
> [  495.986832]  kthread+0x169/0x1a0
> [  495.989348]  ? kthread_complete_and_exit+0x20/0x20
> [  495.992032]  ret_from_fork+0x1f/0x30
> [  495.994622]  </TASK>
> [  495.997126] irq event stamp: 52525
> [  495.999637] hardirqs last  enabled at (52523): [<ffffffffaf179c6d>] _raw_spin_unlock_irqrestore+0x2d/0x60
> [  496.002367] hardirqs last disabled at (52524): [<ffffffffaf179a10>] _raw_spin_lock_irqsave+0x60/0x70
> [  496.005109] softirqs last  enabled at (52514): [<ffffffffc1764b58>] rxe_post_recv+0xb8/0x120 [rdma_rxe]
> [  496.007888] softirqs last disabled at (52525): [<ffffffffc1761a92>] __rxe_add_index+0x22/0x40 [rdma_rxe]
> [  496.010698] ---[ end trace 0000000000000000 ]---
>
> Signed-off-by: Guoqing Jiang <guoqing.jiang@linux.dev>
Acked-by: Zhu Yanjun <zyjzyj2000@gmail.com>

Zhu Yanjun
> ---
>  drivers/infiniband/sw/rxe/rxe_pool.c | 5 +++--
>  1 file changed, 3 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/infiniband/sw/rxe/rxe_pool.c b/drivers/infiniband/sw/rxe/rxe_pool.c
> index 63c594173565..b4444785da52 100644
> --- a/drivers/infiniband/sw/rxe/rxe_pool.c
> +++ b/drivers/infiniband/sw/rxe/rxe_pool.c
> @@ -300,10 +300,11 @@ int __rxe_add_index(struct rxe_pool_elem *elem)
>  {
>         struct rxe_pool *pool = elem->pool;
>         int err;
> +       unsigned long flags;
>
> -       write_lock_bh(&pool->pool_lock);
> +       write_lock_irqsave(&pool->pool_lock, flags);
>         err = __rxe_add_index_locked(elem);
> -       write_unlock_bh(&pool->pool_lock);
> +       write_unlock_irqrestore(&pool->pool_lock, flags);
>
>         return err;
>  }
> --
> 2.26.2
>

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 2/3] RDMA/rxe: Replace write_lock_bh with write_lock_irqsave in __rxe_drop_index
  2022-02-10  7:36 ` [PATCH 2/3] RDMA/rxe: Replace write_lock_bh with write_lock_irqsave in __rxe_drop_index Guoqing Jiang
@ 2022-02-10 14:16   ` Zhu Yanjun
  2022-02-10 15:49     ` Bob Pearson
  0 siblings, 1 reply; 17+ messages in thread
From: Zhu Yanjun @ 2022-02-10 14:16 UTC (permalink / raw)
  To: Guoqing Jiang; +Cc: Jason Gunthorpe, Bob Pearson, RDMA mailing list

On Thu, Feb 10, 2022 at 3:37 PM Guoqing Jiang <guoqing.jiang@linux.dev> wrote:
>
> Same as in __rxe_add_index, the lock needs to be fully IRQ safe, otherwise the
> call trace below appears.
>
> [  250.757218] ------------[ cut here ]------------
> [  250.758997] WARNING: CPU: 6 PID: 90 at kernel/softirq.c:363 __local_bh_enable_ip+0xb1/0x110
> [ ... ]
> [  250.769900] CPU: 6 PID: 90 Comm: kworker/u16:3 Kdump: loaded Tainted: G           OEL    5.17.0-rc3-57-default #17
> [  250.770413] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.14.0-0-g155821a-rebuilt.opensuse.org 04/01/2014
> [  250.770955] Workqueue: ib_mad1 timeout_sends [ib_core]
> [  250.771400] RIP: 0010:__local_bh_enable_ip+0xb1/0x110
> [  250.771678] Code: e8 54 ae 04 00 e8 7f 4e 20 00 fb 66 0f 1f 44 00 00 65 8b 05 b1 03 b3 60 85 c0 74 51 5b 5d c3 65 8b 05 3f 0f b3 60 85 c0 75 8e <0f> 0b eb 8a e8 76 4c 20 00 eb 99 48 89 ef e8 9c 8d 0b 00 eb a2 48
> [  250.772562] RSP: 0018:ffff88801b2e7ae8 EFLAGS: 00010046
> [  250.772845] RAX: 0000000000000000 RBX: 0000000000000201 RCX: dffffc0000000000
> [  250.773197] RDX: 0000000000000007 RSI: 0000000000000201 RDI: ffffffffa1d5dbac
> [  250.773548] RBP: ffffffffc15c4da7 R08: ffffffff9f5059da R09: 0000000000000000
> [  250.773911] R10: 0000000000000001 R11: 0000000000000001 R12: 0000000000000000
> [  250.774261] R13: ffff888017c0e850 R14: ffff888016f18000 R15: 0000000000000005
> [  250.774614] FS:  0000000000000000(0000) GS:ffff888104f00000(0000) knlGS:0000000000000000
> [  250.775009] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [  250.775296] CR2: 00007f7ea9e84fe8 CR3: 000000000216e002 CR4: 0000000000770ee0
> [  250.775651] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> [  250.784298] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> [  250.791093] PKRU: 55555554
> [  250.796587] Call Trace:
> [  250.801957]  <TASK>
> [  250.807269]  rxe_destroy_ah+0x17/0x60 [rdma_rxe]
> [  250.812682]  rdma_destroy_ah_user+0x5a/0xb0 [ib_core]
> [  250.818242]  cm_free_priv_msg+0x6e/0x130 [ib_cm]
> [  250.823738]  cm_send_handler+0x1f6/0x480 [ib_cm]
> [  250.829176]  ? ib_cm_insert_listen+0x100/0x100 [ib_cm]
> [  250.834654]  ? lockdep_hardirqs_on_prepare+0x129/0x220
> [  250.840111]  ? _raw_spin_unlock_irqrestore+0x2d/0x60
> [  250.845514]  timeout_sends+0x310/0x420 [ib_core]
> [  250.851007]  ? ib_send_mad+0x850/0x850 [ib_core]
> [  250.856471]  ? mark_held_locks+0x24/0x90
> [  250.861679]  ? lock_is_held_type+0xe4/0x140
> [  250.866835]  process_one_work+0x5a8/0xa80
> [  250.871949]  ? lock_release+0x450/0x450
> [  250.877061]  ? pwq_dec_nr_in_flight+0x100/0x100
> [  250.882144]  ? rwlock_bug.part.0+0x60/0x60
> [  250.887093]  ? _raw_spin_lock_irq+0x54/0x60
> [  250.891960]  worker_thread+0x2b5/0x760
> [  250.896691]  ? process_one_work+0xa80/0xa80
> [  250.901265]  kthread+0x169/0x1a0
> [  250.905722]  ? kthread_complete_and_exit+0x20/0x20
> [  250.910094]  ret_from_fork+0x1f/0x30
> [  250.914381]  </TASK>
> [  250.918441] irq event stamp: 21397
> [  250.922427] hardirqs last  enabled at (21395): [<ffffffffa0579c6d>] _raw_spin_unlock_irqrestore+0x2d/0x60
> [  250.926476] hardirqs last disabled at (21396): [<ffffffffa05799a4>] _raw_spin_lock_irq+0x54/0x60
> [  250.930443] softirqs last  enabled at (21370): [<ffffffff9f5319f8>] process_one_work+0x5a8/0xa80
> [  250.934329] softirqs last disabled at (21397): [<ffffffffc15c1b60>] __rxe_drop_index+0x20/0x40 [rdma_rxe]
> [  250.938120] ---[ end trace 0000000000000000 ]---
>
> Signed-off-by: Guoqing Jiang <guoqing.jiang@linux.dev>

Acked-by: Zhu Yanjun <zyjzyj2000@gmail.com>

Zhu Yanjun

> ---
>  drivers/infiniband/sw/rxe/rxe_pool.c | 5 +++--
>  1 file changed, 3 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/infiniband/sw/rxe/rxe_pool.c b/drivers/infiniband/sw/rxe/rxe_pool.c
> index b4444785da52..026b60363fd6 100644
> --- a/drivers/infiniband/sw/rxe/rxe_pool.c
> +++ b/drivers/infiniband/sw/rxe/rxe_pool.c
> @@ -320,10 +320,11 @@ void __rxe_drop_index_locked(struct rxe_pool_elem *elem)
>  void __rxe_drop_index(struct rxe_pool_elem *elem)
>  {
>         struct rxe_pool *pool = elem->pool;
> +       unsigned long flags;
>
> -       write_lock_bh(&pool->pool_lock);
> +       write_lock_irqsave(&pool->pool_lock, flags);
>         __rxe_drop_index_locked(elem);
> -       write_unlock_bh(&pool->pool_lock);
> +       write_unlock_irqrestore(&pool->pool_lock, flags);
>  }
>
>  void *rxe_alloc_locked(struct rxe_pool *pool)
> --
> 2.26.2
>

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 3/3] RDMA/rxe: Replace spin_lock_bh with spin_lock_irqsave in post_one_send
  2022-02-10  7:36 ` [PATCH 3/3] RDMA/rxe: Replace spin_lock_bh with spin_lock_irqsave in post_one_send Guoqing Jiang
@ 2022-02-10 14:18   ` Zhu Yanjun
  0 siblings, 0 replies; 17+ messages in thread
From: Zhu Yanjun @ 2022-02-10 14:18 UTC (permalink / raw)
  To: Guoqing Jiang; +Cc: Jason Gunthorpe, Bob Pearson, RDMA mailing list

On Thu, Feb 10, 2022 at 3:37 PM Guoqing Jiang <guoqing.jiang@linux.dev> wrote:
>
> Same as in __rxe_add_index, the lock needs to be fully IRQ safe, otherwise the
> call trace below appears.
>
> [  763.942623] ------------[ cut here ]------------
> [  763.943171] WARNING: CPU: 5 PID: 97 at kernel/softirq.c:363 __local_bh_enable_ip+0xb1/0x110
> [ ... ]
> [  763.947276] CPU: 5 PID: 97 Comm: kworker/5:2 Kdump: loaded Tainted: G           OEL    5.17.0-rc3-57-default #17
> [  763.947575] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.14.0-0-g155821a-rebuilt.opensuse.org 04/01/2014
> [  763.947893] Workqueue: ib_cm cm_work_handler [ib_cm]
> [  763.948075] RIP: 0010:__local_bh_enable_ip+0xb1/0x110
> [  763.948232] Code: e8 54 ae 04 00 e8 7f 4e 20 00 fb 66 0f 1f 44 00 00 65 8b 05 b1 03 f3 56 85 c0 74 51 5b 5d c3 65 8b 05 3f 0f f3 56 85 c0 75 8e <0f> 0b eb 8a e8 76 4c 20 00 eb 99 48 89 ef e8 9c 8d 0b 00 eb a2 48
> [  763.948736] RSP: 0018:ffff888004a970e8 EFLAGS: 00010046
> [  763.948897] RAX: 0000000000000000 RBX: 0000000000000201 RCX: dffffc0000000000
> [  763.949095] RDX: 0000000000000007 RSI: 0000000000000201 RDI: ffffffffab95dbac
> [  763.949292] RBP: ffffffffc127c269 R08: ffffffffa91059da R09: ffff88800afde323
> [  763.949556] R10: ffffed10015fbc64 R11: 0000000000000001 R12: ffffc900005a2000
> [  763.949781] R13: 0000000000000000 R14: 0000000000000000 R15: ffff88800afde000
> [  763.949982] FS:  0000000000000000(0000) GS:ffff888104c80000(0000) knlGS:0000000000000000
> [  763.950205] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [  763.950367] CR2: 00007f85ec3f5b18 CR3: 0000000116216005 CR4: 0000000000770ee0
> [  763.956480] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> [  763.962608] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> [  763.968785] PKRU: 55555554
> [  763.974707] Call Trace:
> [  763.980557]  <TASK>
> [  763.986377]  rxe_post_send+0x569/0x8e0 [rdma_rxe]
> [  763.992340]  ib_send_mad+0x4c1/0x850 [ib_core]
> [  763.998442]  ? ib_register_mad_agent+0x1710/0x1710 [ib_core]
> [  764.004486]  ? __kmalloc+0x21d/0x3a0
> [  764.010465]  ib_post_send_mad+0x28c/0x10b0 [ib_core]
> [  764.016480]  ? lock_is_held_type+0xe4/0x140
> [  764.022359]  ? find_held_lock+0x85/0xa0
> [  764.028230]  ? lock_release+0x24e/0x450
> [  764.034061]  ? timeout_sends+0x420/0x420 [ib_core]
> [  764.039879]  ? ib_create_send_mad+0x541/0x670 [ib_core]
> [  764.045604]  ? do_raw_spin_unlock+0x86/0xf0
> [  764.051178]  ? preempt_count_sub+0x14/0xc0
> [  764.056851]  ? lock_is_held_type+0xe4/0x140
> [  764.062412]  ib_send_cm_rep+0x47a/0x860 [ib_cm]
> [  764.067965]  rdma_accept+0x44c/0x5e0 [rdma_cm]
> [  764.073381]  ? cma_rep_recv+0x330/0x330 [rdma_cm]
> [  764.078762]  ? rcu_read_lock_sched_held+0x3f/0x60
> [  764.084072]  ? trace_kmalloc+0x29/0xd0
> [  764.089185]  ? __kmalloc+0x1c5/0x3a0
> [  764.094185]  ? rtrs_iu_alloc+0x12b/0x260 [rtrs_core]
> [  764.099075]  rtrs_srv_rdma_cm_handler+0x7ba/0xcf0 [rtrs_server]
> [  764.103917]  ? rtrs_srv_inv_rkey_done+0x100/0x100 [rtrs_server]
> [  764.108563]  ? find_held_lock+0x85/0xa0
> [  764.113033]  ? lock_release+0x24e/0x450
> [  764.117452]  ? rdma_restrack_add+0x9c/0x220 [ib_core]
> [  764.121797]  ? rcu_read_lock_sched_held+0x3f/0x60
> [  764.125961]  cma_cm_event_handler+0x77/0x2c0 [rdma_cm]
> [  764.130061]  cma_ib_req_handler+0xbd5/0x23f0 [rdma_cm]
> [  764.134027]  ? cma_cancel_operation+0x1f0/0x1f0 [rdma_cm]
> [  764.137950]  ? lockdep_lock+0xb4/0x170
> [  764.141667]  ? _find_first_zero_bit+0x28/0x50
> [  764.145486]  ? mark_held_locks+0x65/0x90
> [  764.149002]  cm_process_work+0x2f/0x210 [ib_cm]
> [  764.152413]  ? _raw_spin_unlock_irq+0x35/0x50
> [  764.155763]  ? cm_queue_work_unlock+0x40/0x110 [ib_cm]
> [  764.159080]  cm_req_handler+0xf7f/0x2030 [ib_cm]
> [  764.162522]  ? cm_lap_handler+0xba0/0xba0 [ib_cm]
> [  764.165847]  ? lockdep_hardirqs_on_prepare+0x220/0x220
> [  764.169155]  cm_work_handler+0x6ce/0x37c0 [ib_cm]
> [  764.172497]  ? lock_acquire+0x182/0x410
> [  764.175771]  ? lock_release+0x450/0x450
> [  764.178925]  ? lock_downgrade+0x3c0/0x3c0
> [  764.182148]  ? ib_cm_init_qp_attr+0xa90/0xa90 [ib_cm]
> [  764.185511]  ? mark_held_locks+0x24/0x90
> [  764.188692]  ? lock_is_held_type+0xe4/0x140
> [  764.191876]  process_one_work+0x5a8/0xa80
> [  764.195034]  ? lock_release+0x450/0x450
> [  764.198208]  ? pwq_dec_nr_in_flight+0x100/0x100
> [  764.201433]  ? rwlock_bug.part.0+0x60/0x60
> [  764.204660]  ? _raw_spin_lock_irq+0x54/0x60
> [  764.207835]  worker_thread+0x2b5/0x760
> [  764.210920]  ? process_one_work+0xa80/0xa80
> [  764.214014]  kthread+0x169/0x1a0
> [  764.217033]  ? kthread_complete_and_exit+0x20/0x20
> [  764.220205]  ret_from_fork+0x1f/0x30
> [  764.223467]  </TASK>
> [  764.226482] irq event stamp: 55805
> [  764.229527] hardirqs last  enabled at (55803): [<ffffffffaa179c6d>] _raw_spin_unlock_irqrestore+0x2d/0x60
> [  764.232779] hardirqs last disabled at (55804): [<ffffffffaa179a10>] _raw_spin_lock_irqsave+0x60/0x70
> [  764.236052] softirqs last  enabled at (55794): [<ffffffffc127cb68>] rxe_post_recv+0xb8/0x120 [rdma_rxe]
> [  764.239428] softirqs last disabled at (55805): [<ffffffffc127beeb>] rxe_post_send+0x1eb/0x8e0 [rdma_rxe]
> [  764.242740] ---[ end trace 0000000000000000 ]---
>
> Signed-off-by: Guoqing Jiang <guoqing.jiang@linux.dev>

Acked-by: Zhu Yanjun <zyjzyj2000@gmail.com>

Zhu Yanjun

> ---
>  drivers/infiniband/sw/rxe/rxe_verbs.c | 5 +++--
>  1 file changed, 3 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/infiniband/sw/rxe/rxe_verbs.c b/drivers/infiniband/sw/rxe/rxe_verbs.c
> index 9f0aef4b649d..0056418425a1 100644
> --- a/drivers/infiniband/sw/rxe/rxe_verbs.c
> +++ b/drivers/infiniband/sw/rxe/rxe_verbs.c
> @@ -644,12 +644,13 @@ static int post_one_send(struct rxe_qp *qp, const struct ib_send_wr *ibwr,
>         struct rxe_sq *sq = &qp->sq;
>         struct rxe_send_wqe *send_wqe;
>         int full;
> +       unsigned long flags;
>
>         err = validate_send_wr(qp, ibwr, mask, length);
>         if (err)
>                 return err;
>
> -       spin_lock_bh(&qp->sq.sq_lock);
> +       spin_lock_irqsave(&qp->sq.sq_lock, flags);
>
>         full = queue_full(sq->queue, QUEUE_TYPE_TO_DRIVER);
>
> @@ -663,7 +664,7 @@ static int post_one_send(struct rxe_qp *qp, const struct ib_send_wr *ibwr,
>
>         queue_advance_producer(sq->queue, QUEUE_TYPE_TO_DRIVER);
>
> -       spin_unlock_bh(&qp->sq.sq_lock);
> +       spin_unlock_irqrestore(&qp->sq.sq_lock, flags);
>
>         return 0;
>  }
> --
> 2.26.2
>

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 2/3] RDMA/rxe: Replace write_lock_bh with write_lock_irqsave in __rxe_drop_index
  2022-02-10 14:16   ` Zhu Yanjun
@ 2022-02-10 15:49     ` Bob Pearson
  2022-02-11 10:09       ` Guoqing Jiang
  0 siblings, 1 reply; 17+ messages in thread
From: Bob Pearson @ 2022-02-10 15:49 UTC (permalink / raw)
  To: Zhu Yanjun, Guoqing Jiang; +Cc: Jason Gunthorpe, RDMA mailing list

On 2/10/22 08:16, Zhu Yanjun wrote:
> On Thu, Feb 10, 2022 at 3:37 PM Guoqing Jiang <guoqing.jiang@linux.dev> wrote:
>>
>> Same as in __rxe_add_index, the lock needs to be fully IRQ safe, otherwise the
>> call trace below appears.
>>
I had the impression that NAPI ran on a soft IRQ and the rxe tasklets are also on soft IRQs. So at least in theory spin_lock_bh() should be sufficient. Can someone explain where the hard interrupt is coming from that we need to protect against? There are other race conditions in current rxe that may also be the cause of this. I am trying to get a patch series accepted to deal with those.
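
To spell out the rule I have in mind, here is a toy sketch (the lock and the
two function names are made up, this is not rxe code):

#include <linux/spinlock.h>

static DEFINE_SPINLOCK(example_lock);

/* _bh is enough when every path that takes the lock runs in process or
 * softirq (NAPI/tasklet) context
 */
static void softirq_only_path(void)
{
	spin_lock_bh(&example_lock);
	/* touch the shared state */
	spin_unlock_bh(&example_lock);
}

/* the irqsave variant is only needed once the lock can be taken, or
 * released, with hard interrupts disabled or from a hard IRQ handler
 */
static void hard_irq_path(void)
{
	unsigned long flags;

	spin_lock_irqsave(&example_lock, flags);
	/* touch the shared state */
	spin_unlock_irqrestore(&example_lock, flags);
}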

Bob

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 2/3] RDMA/rxe: Replace write_lock_bh with write_lock_irqsave in __rxe_drop_index
  2022-02-10 15:49     ` Bob Pearson
@ 2022-02-11 10:09       ` Guoqing Jiang
  2022-02-11 17:37         ` Bob Pearson
  0 siblings, 1 reply; 17+ messages in thread
From: Guoqing Jiang @ 2022-02-11 10:09 UTC (permalink / raw)
  To: Bob Pearson, Zhu Yanjun; +Cc: Jason Gunthorpe, RDMA mailing list



On 2/10/22 11:49 PM, Bob Pearson wrote:
> On 2/10/22 08:16, Zhu Yanjun wrote:
>> On Thu, Feb 10, 2022 at 3:37 PM Guoqing Jiang<guoqing.jiang@linux.dev>  wrote:
>>> Same as in __rxe_add_index, the lock needs to be fully IRQ safe, otherwise the
>>> call trace below appears.
>>>
> I had the impression that NAPI ran on a soft IRQ and the rxe tasklets are also on soft IRQs. So at least in theory spin_lock_bh() should be sufficient. Can someone explain where the hard interrupt is coming from that we need to protect against?

Since rxe actually runs on top of a NIC, could it come from the NIC if the NIC
driver doesn't switch to NAPI, or from other hardware? But my knowledge of
this domain is limited.

>   There are other race conditions in current rxe that may also be the cause of this. I am trying to get a patch series accepted to deal with those.

If possible, could you investigate why rxe doesn't work after the 5.15 kernel,
as reported in the cover letter? Thank you!

Guoqing

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 2/3] RDMA/rxe: Replace write_lock_bh with write_lock_irqsave in __rxe_drop_index
  2022-02-11 10:09       ` Guoqing Jiang
@ 2022-02-11 17:37         ` Bob Pearson
  2022-02-12  0:59           ` Guoqing Jiang
  0 siblings, 1 reply; 17+ messages in thread
From: Bob Pearson @ 2022-02-11 17:37 UTC (permalink / raw)
  To: Guoqing Jiang, Zhu Yanjun; +Cc: Jason Gunthorpe, RDMA mailing list

On 2/11/22 04:09, Guoqing Jiang wrote:
> 
> 
> On 2/10/22 11:49 PM, Bob Pearson wrote:
>> On 2/10/22 08:16, Zhu Yanjun wrote:
>>> On Thu, Feb 10, 2022 at 3:37 PM Guoqing Jiang<guoqing.jiang@linux.dev>  wrote:
>>>> Same as in __rxe_add_index, the lock needs to be fully IRQ safe, otherwise the
>>>> call trace below appears.
>>>>
>> I had the impression that NAPI ran on a soft IRQ and the rxe tasklets are also on soft IRQs. So at least in theory spin_lock_bh() should be sufficient. Can someone explain where the hard interrupt is coming from that we need to protect against?
> 
> Since rxe actually runs on top of a NIC, could it come from the NIC if the NIC driver doesn't switch to NAPI,
> or from other hardware? But my knowledge of this domain is limited.
> 
>>   There are other race conditions in current rxe that may also be the cause of this. I am trying to get a patch series accepted to deal with those.
> 
> If possible, could you investigate why rxe doesn't work after the 5.15 kernel, as reported in the cover letter? Thank you!
> 
> Guoqing

Guoqing,

It would help to know more about the test setup you are using. I.e. which NIC/driver.
I mostly test on head of tree and things seem to be working.
You could add something like

	if (in_irq())
		<print something once or twice>

to rxe_udp_encap_recv() to check if you are in a hard interrupt in the receive path.
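
Concretely, something like this would do (just a sketch; the message text is
arbitrary, and newer trees also have in_hardirq() for the same check):

	/* sketch: near the top of rxe_udp_encap_recv() */
	if (in_irq())
		pr_warn_once("rxe: receive path entered in hard IRQ context\n");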

Bob

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 2/3] RDMA/rxe: Replace write_lock_bh with write_lock_irqsave in __rxe_drop_index
  2022-02-11 17:37         ` Bob Pearson
@ 2022-02-12  0:59           ` Guoqing Jiang
  0 siblings, 0 replies; 17+ messages in thread
From: Guoqing Jiang @ 2022-02-12  0:59 UTC (permalink / raw)
  To: Bob Pearson, Zhu Yanjun; +Cc: Jason Gunthorpe, RDMA mailing list



On 2/12/22 1:37 AM, Bob Pearson wrote:
> Guoqing,
>
> It would help to know more about the test setup you are using. I.e. which NIC/driver.
> I mostly test on head of tree and things seem to be working.

I run a 5.17-rc3 kernel inside a VM configured with an e1000e NIC, and the
three call traces can be reproduced 100%. As said, CONFIG_PROVE_LOCKING is
enabled.

Thanks,
Guoqing

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: bug report for rxe
  2022-02-10  7:36 [PATCH 0/3] patches and bug report for rxe Guoqing Jiang
                   ` (2 preceding siblings ...)
  2022-02-10  7:36 ` [PATCH 3/3] RDMA/rxe: Replace spin_lock_bh with spin_lock_irqsave in post_one_send Guoqing Jiang
@ 2022-02-22  9:50 ` Guoqing Jiang
  2022-02-22 10:04   ` Zhu Yanjun
  3 siblings, 1 reply; 17+ messages in thread
From: Guoqing Jiang @ 2022-02-22  9:50 UTC (permalink / raw)
  To: zyjzyj2000, jgg, rpearsonhpe; +Cc: linux-rdma



On 2/10/22 3:36 PM, Guoqing Jiang wrote:
> However, it seems rnbd/rtrs over rxe still can't work with the 5.17-rc3 kernel;
> dmesg reports the following.
>
> 1. server side
>
> [  440.723182] rdma_rxe: qp#17 moved to error state
> [  440.725300] rtrs_server L1205: <bla>: remote access error (wr_cqe: 000000003b14397c, type: 0, vendor_err: 0x0, len: 0)
> [  440.845926] rnbd_server L256: RTRS Session bla disconnected
>
> 2. client side
>
> [  997.817536] rnbd_client L596: Mapping device /dev/loop1 on session bla, (access_mode: rw, nr_poll_queues: 0)
> [  998.968810] rnbd_client L1213: [session=bla] mapped 8/8 default/read queues.
> [  999.017988] rtrs_client L610: <bla>: RDMA failed: remote access error
> [ 1029.836943] rtrs_client L353: <bla>: Failed IB_WR_LOCAL_INV: WR flushe
>
> Then I tried the 5.16 and 5.15 kernels; 5.15 seems to work, as follows.
>
> 1. server side
>
> [  333.076482] rnbd_server L800: </dev/loop1@bla>: Opened device 'loop1'
>
> 2. client side
>
> [ 1584.325825] rnbd_client L596: Mapping device /dev/loop1 on session bla, (access_mode: rw, nr_poll_queues: 0)
> [ 1585.268291] rnbd_client L1213: [session=bla] mapped 8/8 default/read queues.
> [ 1585.349300] rnbd_client L1607: </dev/loop1@bla> map_device: Device mapped as rnbd0 (nsectors: 0, logical_block_size: 512, physical_block_size: 512, max_write_same_sectors: 0, max_discard_sectors: 0, discard_granularity: 0, discard_alignment: 0, secure_discard: 0, max_segments: 128, max_hw_sectors: 248, rotational: 1, wc: 0, fua: 0)
>
> I would appreciate it if someone could shed light on why it doesn't work after 5.15,
> and I am happy to test any potential patch for it.

After investigation, it seems the culprit is commit 647bf13ce944 ("RDMA/rxe:
Create duplicate mapping tables for FMRs"). The problem is that mr_check_range
returns -EFAULT after finding that the iova and length are not valid, so the
connection between the two VMs can't be established.

After reverting the commit manually or applying the temporary change below,
rxe works again with rnbd/rtrs, though I don't think it is the right thing to
do. Could the experts provide a proper solution? Thanks.

diff --git a/drivers/infiniband/sw/rxe/rxe_mr.c b/drivers/infiniband/sw/rxe/rxe_mr.c
index 453ef3c9d535..4a2fc4d5809d 100644
--- a/drivers/infiniband/sw/rxe/rxe_mr.c
+++ b/drivers/infiniband/sw/rxe/rxe_mr.c
@@ -652,7 +652,7 @@ int rxe_reg_fast_mr(struct rxe_qp *qp, struct rxe_send_wqe *wqe)
         mr->state = RXE_MR_STATE_VALID;

         set = mr->cur_map_set;
-       mr->cur_map_set = mr->next_map_set;
+       //mr->cur_map_set = mr->next_map_set;
         mr->cur_map_set->iova = wqe->wr.wr.reg.mr->iova;
         mr->next_map_set = set;

@@ -662,7 +662,7 @@ int rxe_reg_fast_mr(struct rxe_qp *qp, struct rxe_send_wqe *wqe)
  int rxe_mr_set_page(struct ib_mr *ibmr, u64 addr)
  {
         struct rxe_mr *mr = to_rmr(ibmr);
-       struct rxe_map_set *set = mr->next_map_set;
+       struct rxe_map_set *set = mr->cur_map_set;
         struct rxe_map *map;
         struct rxe_phys_buf *buf;

diff --git a/drivers/infiniband/sw/rxe/rxe_verbs.c b/drivers/infiniband/sw/rxe/rxe_verbs.c
index 80df9a8f71a1..e41d2c8612d8 100644
--- a/drivers/infiniband/sw/rxe/rxe_verbs.c
+++ b/drivers/infiniband/sw/rxe/rxe_verbs.c
@@ -992,7 +992,7 @@ static int rxe_map_mr_sg(struct ib_mr *ibmr, struct scatterlist *sg,
                          int sg_nents, unsigned int *sg_offset)
  {
         struct rxe_mr *mr = to_rmr(ibmr);
-       struct rxe_map_set *set = mr->next_map_set;
+       struct rxe_map_set *set = mr->cur_map_set;

And the test is pretty simple.

1.  VM (server)

modprobe rdma_rxe
rdma link add rxe0 type rxe netdev ens3
modprobe rnbd-server

2.  VM (client)

modprobe rdma_rxe
rdma link add rxe0 type rxe netdev ens3
modprobe rnbd-client
echo "sessname=bla path=ip:$serverip 
device_path=$block_device_in_server" > 
/sys/devices/virtual/rnbd-client/ctl/map_device

BTW, I tried wip/jgg-for-next branch with commit 3810c1a1cbe8f.

Guoqing

^ permalink raw reply related	[flat|nested] 17+ messages in thread

* Re: bug report for rxe
  2022-02-22  9:50 ` bug report for rxe Guoqing Jiang
@ 2022-02-22 10:04   ` Zhu Yanjun
  2022-02-22 16:58     ` Pearson, Robert B
  0 siblings, 1 reply; 17+ messages in thread
From: Zhu Yanjun @ 2022-02-22 10:04 UTC (permalink / raw)
  To: Guoqing Jiang; +Cc: Jason Gunthorpe, Bob Pearson, RDMA mailing list

On Tue, Feb 22, 2022 at 5:50 PM Guoqing Jiang <guoqing.jiang@linux.dev> wrote:
>
>
>
> On 2/10/22 3:36 PM, Guoqing Jiang wrote:
> > However, it seems rnbd/rtrs over rxe still can't work with the 5.17-rc3 kernel;
> > dmesg reports the following.
> >
> > 1. server side
> >
> > [  440.723182] rdma_rxe: qp#17 moved to error state
> > [  440.725300] rtrs_server L1205: <bla>: remote access error (wr_cqe: 000000003b14397c, type: 0, vendor_err: 0x0, len: 0)
> > [  440.845926] rnbd_server L256: RTRS Session bla disconnected
> >
> > 2. client side
> >
> > [  997.817536] rnbd_client L596: Mapping device /dev/loop1 on session bla, (access_mode: rw, nr_poll_queues: 0)
> > [  998.968810] rnbd_client L1213: [session=bla] mapped 8/8 default/read queues.
> > [  999.017988] rtrs_client L610: <bla>: RDMA failed: remote access error
> > [ 1029.836943] rtrs_client L353: <bla>: Failed IB_WR_LOCAL_INV: WR flushe
> >
> > Then I tried the 5.16 and 5.15 kernels; 5.15 seems to work, as follows.
> >
> > 1. server side
> >
> > [  333.076482] rnbd_server L800: </dev/loop1@bla>: Opened device 'loop1'
> >
> > 2. client side
> >
> > [ 1584.325825] rnbd_client L596: Mapping device /dev/loop1 on session bla, (access_mode: rw, nr_poll_queues: 0)
> > [ 1585.268291] rnbd_client L1213: [session=bla] mapped 8/8 default/read queues.
> > [ 1585.349300] rnbd_client L1607: </dev/loop1@bla> map_device: Device mapped as rnbd0 (nsectors: 0, logical_block_size: 512, physical_block_size: 512, max_write_same_sectors: 0, max_discard_sectors: 0, discard_granularity: 0, discard_alignment: 0, secure_discard: 0, max_segments: 128, max_hw_sectors: 248, rotational: 1, wc: 0, fua: 0)
> >
> > I would appreciate it if someone could shed light on why it doesn't work after 5.15,
> > and I am happy to test any potential patch for it.
>
> After investigation, it seems the culprit is commit 647bf13ce944 ("RDMA/rxe:
> Create duplicate mapping tables for FMRs"). The problem is that mr_check_range
> returns -EFAULT after finding that the iova and length are not valid, so the
> connection between the two VMs can't be established.
>
> After reverting the commit manually or applying the temporary change below,
> rxe works again with rnbd/rtrs, though I don't think it is the right thing to
> do. Could the experts provide a proper solution? Thanks.
>
> diff --git a/drivers/infiniband/sw/rxe/rxe_mr.c b/drivers/infiniband/sw/rxe/rxe_mr.c
> index 453ef3c9d535..4a2fc4d5809d 100644
> --- a/drivers/infiniband/sw/rxe/rxe_mr.c
> +++ b/drivers/infiniband/sw/rxe/rxe_mr.c
> @@ -652,7 +652,7 @@ int rxe_reg_fast_mr(struct rxe_qp *qp, struct rxe_send_wqe *wqe)
>          mr->state = RXE_MR_STATE_VALID;
>
>          set = mr->cur_map_set;
> -       mr->cur_map_set = mr->next_map_set;
> +       //mr->cur_map_set = mr->next_map_set;
>          mr->cur_map_set->iova = wqe->wr.wr.reg.mr->iova;
>          mr->next_map_set = set;
>
> @@ -662,7 +662,7 @@ int rxe_reg_fast_mr(struct rxe_qp *qp, struct rxe_send_wqe *wqe)
>   int rxe_mr_set_page(struct ib_mr *ibmr, u64 addr)
>   {
>          struct rxe_mr *mr = to_rmr(ibmr);
> -       struct rxe_map_set *set = mr->next_map_set;
> +       struct rxe_map_set *set = mr->cur_map_set;
>          struct rxe_map *map;
>          struct rxe_phys_buf *buf;
>
> diff --git a/drivers/infiniband/sw/rxe/rxe_verbs.c b/drivers/infiniband/sw/rxe/rxe_verbs.c
> index 80df9a8f71a1..e41d2c8612d8 100644
> --- a/drivers/infiniband/sw/rxe/rxe_verbs.c
> +++ b/drivers/infiniband/sw/rxe/rxe_verbs.c
> @@ -992,7 +992,7 @@ static int rxe_map_mr_sg(struct ib_mr *ibmr, struct scatterlist *sg,
>                           int sg_nents, unsigned int *sg_offset)
>   {
>          struct rxe_mr *mr = to_rmr(ibmr);
> -       struct rxe_map_set *set = mr->next_map_set;
> +       struct rxe_map_set *set = mr->cur_map_set;

Thanks a lot. Please file a patch for the above changes.

Zhu Yanjun

>
> And the test is pretty simple.
>
> 1.  VM (server)
>
> modprobe rdma_rxe
> rdma link add rxe0 type rxe netdev ens3
> modprobe rnbd-server
>
> 2.  VM (client)
>
> modprobe rdma_rxe
> rdma link add rxe0 type rxe netdev ens3
> modprobe rnbd-client
> echo "sessname=bla path=ip:$serverip
> device_path=$block_device_in_server" >
> /sys/devices/virtual/rnbd-client/ctl/map_device
>
> BTW, I tried wip/jgg-for-next branch with commit 3810c1a1cbe8f.
>
> Guoqing

^ permalink raw reply	[flat|nested] 17+ messages in thread

* RE: bug report for rxe
  2022-02-22 10:04   ` Zhu Yanjun
@ 2022-02-22 16:58     ` Pearson, Robert B
  2022-02-23  4:43       ` Guoqing Jiang
  0 siblings, 1 reply; 17+ messages in thread
From: Pearson, Robert B @ 2022-02-22 16:58 UTC (permalink / raw)
  To: Zhu Yanjun, Guoqing Jiang; +Cc: Jason Gunthorpe, Bob Pearson, RDMA mailing list

>
> After investigation, it seems the culprit is commit 647bf13ce944 ("RDMA/rxe:
> Create duplicate mapping tables for FMRs"). The problem is that mr_check_range
> returns -EFAULT after finding that the iova and length are not valid, so the
> connection between the two VMs can't be established.
>
> After reverting the commit manually or applying the temporary change below,
> rxe works again with rnbd/rtrs, though I don't think it is the right thing to
> do. Could the experts provide a proper solution? Thanks.
>
This patch fixed failures in blktests and srp which were discussed at length. See e.g.

https://lore.kernel.org/linux-rdma/20210907163939.GW1200268@ziepe.ca/

and related messages. The conclusion was that two mappings were required. One owned by the
driver and one by the 'hardware', i.e. bottom half in the rxe case, allowing updating a new mapping
while the old one is still active and then switching them.
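
Roughly the shape of the idea, as an illustration only (these structure and
function names are invented here, they are not the actual rxe code):

#include <linux/types.h>

struct map_set {
	u64	iova;
	u64	length;
	/* the page/buffer tables live here in the real driver */
};

struct mr_maps {
	struct map_set	*cur;	/* read by the packet path, the 'hardware' */
	struct map_set	*next;	/* built up by the driver side */
};

/* when a registration completes: publish the freshly built mapping and
 * keep the old set around to be rebuilt for the next registration
 */
static void swap_map_sets(struct mr_maps *m, u64 new_iova)
{
	struct map_set *old = m->cur;

	m->cur = m->next;
	m->cur->iova = new_iova;
	m->next = old;
}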

If this case has iova and length not valid as indicated is there a problem with the test case?

Bob

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: bug report for rxe
  2022-02-22 16:58     ` Pearson, Robert B
@ 2022-02-23  4:43       ` Guoqing Jiang
  2022-02-23  5:01         ` Bob Pearson
  0 siblings, 1 reply; 17+ messages in thread
From: Guoqing Jiang @ 2022-02-23  4:43 UTC (permalink / raw)
  To: Pearson, Robert B, Zhu Yanjun, Jinpu Wang
  Cc: Jason Gunthorpe, Bob Pearson, RDMA mailing list



On 2/23/22 12:58 AM, Pearson, Robert B wrote:
>> After investigation, it seems the culprit is commit 647bf13ce944 ("RDMA/rxe:
>> Create duplicate mapping tables for FMRs"). The problem is that mr_check_range
>> returns -EFAULT after finding that the iova and length are not valid, so the
>> connection between the two VMs can't be established.
>>
>> After reverting the commit manually or applying the temporary change below,
>> rxe works again with rnbd/rtrs, though I don't think it is the right thing to
>> do. Could the experts provide a proper solution? Thanks.
>>
> This patch fixed failures in blktests and srp which were discussed at length. See e.g.
>
> https://lore.kernel.org/linux-rdma/20210907163939.GW1200268@ziepe.ca/

Thanks for the link, which reminds me of the always_invalidate parameter in
rtrs_server.

> and related messages. The conclusion was that two mappings were required. One owned by the
> driver and one by the 'hardware', i.e. bottom half in the rxe case, allowing updating a new mapping
> while the old one is still active and then switching them.
>
> If this case has iova and length not valid as indicated is there a problem with the test case?

And after disabling always_invalidate (which is enabled by default), rnbd/rtrs
over RoCE works as well. So I suppose there might be a potential issue with
always_invalidate=Y on the rtrs server side, since invalidation works for srp
IIUC, @Jinpu.
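
(For anyone who wants to reproduce this: always_invalidate is a module
parameter of rtrs_server, so something along these lines should disable it,
assuming it is set at module load time.)

modprobe rtrs_server always_invalidate=0
modprobe rnbd-server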

Thanks,
Guoqing

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: bug report for rxe
  2022-02-23  4:43       ` Guoqing Jiang
@ 2022-02-23  5:01         ` Bob Pearson
  2022-02-23  5:50           ` Guoqing Jiang
  0 siblings, 1 reply; 17+ messages in thread
From: Bob Pearson @ 2022-02-23  5:01 UTC (permalink / raw)
  To: Guoqing Jiang, Pearson, Robert B, Zhu Yanjun, Jinpu Wang
  Cc: Jason Gunthorpe, RDMA mailing list

On 2/22/22 22:43, Guoqing Jiang wrote:
> 
> 
> On 2/23/22 12:58 AM, Pearson, Robert B wrote:
>>> After investigation, it seems the culprit is commit 647bf13ce944 ("RDMA/rxe:
>>> Create duplicate mapping tables for FMRs"). The problem is that mr_check_range
>>> returns -EFAULT after finding that the iova and length are not valid, so the
>>> connection between the two VMs can't be established.
>>>
>>> After reverting the commit manually or applying the temporary change below,
>>> rxe works again with rnbd/rtrs, though I don't think it is the right thing to
>>> do. Could the experts provide a proper solution? Thanks.
>>>
>> This patch fixed failures in blktests and srp which were discussed at length. See e.g.
>>
>> https://lore.kernel.org/linux-rdma/20210907163939.GW1200268@ziepe.ca/
> 
> Thanks for the link, which reminds me of the always_invalidate parameter in rtrs_server.
> 
>> and related messages. The conclusion was that two mappings were required. One owned by the
>> driver and one by the 'hardware', i.e. bottom half in the rxe case, allowing updating a new mapping
>> while the old one is still active and then switching them.
>>
>> If this case has iova and length not valid as indicated is there a problem with the test case?
> 
> And after disabling always_invalidate (which is enabled by default), rnbd/rtrs over RoCE
> works as well. So I suppose there might be a potential issue with always_invalidate=Y on
> the rtrs server side, since invalidation works for srp IIUC, @Jinpu.
> 
> Thanks,
> Guoqing

It would help to understand what you are running. My main concern is that mr_check_range
shouldn't fail unless there is something very wrong.

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: bug report for rxe
  2022-02-23  5:01         ` Bob Pearson
@ 2022-02-23  5:50           ` Guoqing Jiang
  0 siblings, 0 replies; 17+ messages in thread
From: Guoqing Jiang @ 2022-02-23  5:50 UTC (permalink / raw)
  To: Bob Pearson, Pearson, Robert B, Zhu Yanjun, Jinpu Wang
  Cc: Jason Gunthorpe, RDMA mailing list



On 2/23/22 1:01 PM, Bob Pearson wrote:
> On 2/22/22 22:43, Guoqing Jiang wrote:
>>
>> On 2/23/22 12:58 AM, Pearson, Robert B wrote:
>>>> After investigation, it seems the culprit is commit 647bf13ce944 ("RDMA/rxe:
>>>> Create duplicate mapping tables for FMRs"). The problem is that mr_check_range
>>>> returns -EFAULT after finding that the iova and length are not valid, so the
>>>> connection between the two VMs can't be established.
>>>>
>>>> After reverting the commit manually or applying the temporary change below,
>>>> rxe works again with rnbd/rtrs, though I don't think it is the right thing to
>>>> do. Could the experts provide a proper solution? Thanks.
>>>>
>>> This patch fixed failures in blktests and srp which were discussed at length. See e.g.
>>>
>>> https://lore.kernel.org/linux-rdma/20210907163939.GW1200268@ziepe.ca/
>> Thanks for the link, which reminds me of the always_invalidate parameter in rtrs_server.
>>
>>> and related messages. The conclusion was that two mappings were required. One owned by the
>>> driver and one by the 'hardware', i.e. bottom half in the rxe case, allowing updating a new mapping
>>> while the old one is still active and then switching them.
>>>
>>> If this case has iova and length not valid as indicated is there a problem with the test case?
>> And after disabling always_invalidate (which is enabled by default), rnbd/rtrs over RoCE
>> works as well. So I suppose there might be a potential issue with always_invalidate=Y on
>> the rtrs server side, since invalidation works for srp IIUC, @Jinpu.
>>
>> Thanks,
>> Guoqing
> It would help to understand what you are running. My main concern is that mr_check_range
> shouldn't fail unless there is something very wrong.

Let me paste it again. I am just trying to map a block device on the server to
the client side, specifically the bold line.

1.  VM (server)

modprobe rdma_rxe
rdma link add rxe0 type rxe netdev ens3
modprobe rnbd-server

2.  VM (client)

modprobe rdma_rxe
rdma link add rxe0 type rxe netdev ens3
modprobe rnbd-client
*echo "sessname=bla path=ip:$serverip 
device_path=$block_device_in_server" > 
/sys/devices/virtual/rnbd-client/ctl/map_device*



Thanks,
Guoqing


^ permalink raw reply	[flat|nested] 17+ messages in thread
