linux-rdma.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH rdma-next] RDMA/mlx5: Add missing srcu_read_lock in ODP implicit flow
@ 2020-07-19  6:57 Leon Romanovsky
  2020-07-20 14:31 ` Jason Gunthorpe
  2020-07-24 19:51 ` Jason Gunthorpe
  0 siblings, 2 replies; 4+ messages in thread
From: Leon Romanovsky @ 2020-07-19  6:57 UTC (permalink / raw)
  To: Doug Ledford, Jason Gunthorpe; +Cc: Maor Gottlieb, Artemy Kovalyov, linux-rdma

From: Maor Gottlieb <maorg@mellanox.com>

According to the locking scheme, mlx5_ib_update_xlt should be called
with srcu_read_lock().
This fixes the below LOCKDEP trace.

[  861.917222] WARNING: CPU: 1 PID: 1130 at
drivers/infiniband/hw/mlx5/odp.c:132 mlx5_odp_populate_xlt+0x175/0x180
[mlx5_ib]
[  861.921080] Modules linked in: xt_conntrack xt_MASQUERADE
nf_conntrack_netlink nfnetlink xt_addrtype iptable_nat nf_nat
nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 br_netfilter overlay ib_srp
scsi_transport_srp rpcrdma rdma_ucm ib_iser libiscsi
scsi_transport_iscsi rdma_cm iw_cm ib_umad ib_ipoib ib_cm mlx5_ib
ib_uverbs ib_core mlx5_core mlxfw ptp pps_core
[  861.930133] CPU: 1 PID: 1130 Comm: kworker/u16:11 Tainted: G        W
5.8.0-rc5_for_upstream_debug_2020_07_13_11_04 #1
[  861.932034] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS
rel-1.12.1-0-ga5cab58e9a3f-prebuilt.qemu.org 04/01/2014
[  861.933801] Workqueue: events_unbound mlx5_ib_prefetch_mr_work
[mlx5_ib]
[  861.934933] RIP: 0010:mlx5_odp_populate_xlt+0x175/0x180 [mlx5_ib]
[  861.935954] Code: 08 e2 85 c0 0f 84 65 ff ff ff 49 8b 87 60 01 00 00
be ff ff ff ff 48 8d b8 b0 39 00 00 e8 93 e0 50 e1 85 c0 0f 85 45 ff ff
ff <0f> 0b e9 3e ff ff ff 0f 0b eb c7 0f 1f 44 00 00 48 8b 87 98 0f 00
[  861.938855] RSP: 0018:ffff88840f44fc68 EFLAGS: 00010246
[  861.939707] RAX: 0000000000000000 RBX: ffff88840cc9d000 RCX:
ffff88840efcd940
[  861.940788] RDX: 0000000000000000 RSI: ffff88844871b9b0 RDI:
ffff88840efce100
[  861.941880] RBP: ffff88840cc9d040 R08: 0000000000000040 R09:
0000000000000001
[  861.943020] R10: ffff88846ced3068 R11: 0000000000000000 R12:
00000000000156ec
[  861.944133] R13: 0000000000000004 R14: 0000000000000004 R15:
ffff888439941000
[  861.945228] FS:  0000000000000000(0000) GS:ffff88846fa80000(0000)
knlGS:0000000000000000
[  861.946557] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  861.947512] CR2: 00007f8536d12430 CR3: 0000000437a5e006 CR4:
0000000000360ea0
[  861.948609] DR0: 0000000000000000 DR1: 0000000000000000 DR2:
0000000000000000
[  861.949684] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7:
0000000000000400
[  861.950804] Call Trace:
[  861.951300]  mlx5_ib_update_xlt+0x37c/0x7c0 [mlx5_ib]
[  861.952145]  pagefault_mr+0x315/0x440 [mlx5_ib]
[  861.952895]  mlx5_ib_prefetch_mr_work+0x56/0xa0 [mlx5_ib]
[  861.953763]  process_one_work+0x215/0x5c0
[  861.954477]  worker_thread+0x3c/0x380
[  861.955127]  ? process_one_work+0x5c0/0x5c0
[  861.955851]  kthread+0x133/0x150
[  861.956426]  ? kthread_park+0x90/0x90
[  861.957061]  ret_from_fork+0x1f/0x30
[  861.957686] irq event stamp: 199569
[  861.958318] hardirqs last  enabled at (199577): [<ffffffff8119bf39>]
console_unlock+0x439/0x590
[  861.959720] hardirqs last disabled at (199584): [<ffffffff8119bb81>]
console_unlock+0x81/0x590
[  861.961080] softirqs last  enabled at (199600): [<ffffffff81e00293>]
__do_softirq+0x293/0x47e
[  861.971725] softirqs last disabled at (199613): [<ffffffff81c00f0f>]
asm_call_on_stack+0xf/0x20
[  861.973142] ---[ end trace e6f026aa6012c21e ]---

Fixes: 5256edcb98a1 ("RDMA/mlx5: Rework implicit ODP destroy")
Signed-off-by: Maor Gottlieb <maorg@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
---
It is not really need to go to -rc, not urgent.
---
 drivers/infiniband/core/uverbs_std_types_mr.c | 2 +-
 drivers/infiniband/core/verbs.c               | 3 +++
 drivers/infiniband/hw/mlx5/odp.c              | 9 +++++++++
 3 files changed, 13 insertions(+), 1 deletion(-)

diff --git a/drivers/infiniband/core/uverbs_std_types_mr.c b/drivers/infiniband/core/uverbs_std_types_mr.c
index 62f58ad56afd..9b22bb553e8b 100644
--- a/drivers/infiniband/core/uverbs_std_types_mr.c
+++ b/drivers/infiniband/core/uverbs_std_types_mr.c
@@ -69,7 +69,7 @@ static int UVERBS_HANDLER(UVERBS_METHOD_ADVISE_MR)(

 	num_sge = uverbs_attr_ptr_get_array_size(
 		attrs, UVERBS_ATTR_ADVISE_MR_SGE_LIST, sizeof(struct ib_sge));
-	if (num_sge < 0)
+	if (num_sge <= 0)
 		return num_sge;

 	sg_list = uverbs_attr_get_alloced_ptr(attrs,
diff --git a/drivers/infiniband/core/verbs.c b/drivers/infiniband/core/verbs.c
index 73f3ea8af714..674949233d21 100644
--- a/drivers/infiniband/core/verbs.c
+++ b/drivers/infiniband/core/verbs.c
@@ -2073,6 +2073,9 @@ int ib_advise_mr(struct ib_pd *pd, enum ib_uverbs_advise_mr_advice advice,
 	if (!pd->device->ops.advise_mr)
 		return -EOPNOTSUPP;

+	if (!num_sge)
+		return 0;
+
 	return pd->device->ops.advise_mr(pd, advice, flags, sg_list, num_sge,
 					 NULL);
 }
diff --git a/drivers/infiniband/hw/mlx5/odp.c b/drivers/infiniband/hw/mlx5/odp.c
index aa6f08ad0d03..4f1e46733830 100644
--- a/drivers/infiniband/hw/mlx5/odp.c
+++ b/drivers/infiniband/hw/mlx5/odp.c
@@ -816,6 +816,7 @@ static int pagefault_mr(struct mlx5_ib_mr *mr, u64 io_virt, size_t bcnt,
 {
 	struct ib_umem_odp *odp = to_ib_umem_odp(mr->umem);

+	lockdep_assert_held(&mr->dev->odp_srcu);
 	if (unlikely(io_virt < mr->mmkey.iova))
 		return -EFAULT;

@@ -1765,10 +1766,17 @@ static void mlx5_ib_prefetch_mr_work(struct work_struct *w)
 {
 	struct prefetch_mr_work *work =
 		container_of(w, struct prefetch_mr_work, work);
+	struct mlx5_ib_dev *dev;
 	u32 bytes_mapped = 0;
+	int srcu_key;
 	int ret;
 	u32 i;

+	/* We rely on IB/core that work is executed if we have num_sge != 0 only. */
+	WARN_ON(!work->num_sge);
+	dev = work->frags[0].mr->dev;
+	/* SRCU should be held when calling to mlx5_odp_populate_xlt() */
+	srcu_key = srcu_read_lock(&dev->odp_srcu);
 	for (i = 0; i < work->num_sge; ++i) {
 		ret = pagefault_mr(work->frags[i].mr, work->frags[i].io_virt,
 				   work->frags[i].length, &bytes_mapped,
@@ -1777,6 +1785,7 @@ static void mlx5_ib_prefetch_mr_work(struct work_struct *w)
 			continue;
 		mlx5_update_odp_stats(work->frags[i].mr, prefetch, ret);
 	}
+	srcu_read_unlock(&dev->odp_srcu, srcu_key);

 	destroy_prefetch_work(work);
 }
--
2.26.2


^ permalink raw reply related	[flat|nested] 4+ messages in thread

* Re: [PATCH rdma-next] RDMA/mlx5: Add missing srcu_read_lock in ODP implicit flow
  2020-07-19  6:57 [PATCH rdma-next] RDMA/mlx5: Add missing srcu_read_lock in ODP implicit flow Leon Romanovsky
@ 2020-07-20 14:31 ` Jason Gunthorpe
  2020-07-20 16:53   ` Leon Romanovsky
  2020-07-24 19:51 ` Jason Gunthorpe
  1 sibling, 1 reply; 4+ messages in thread
From: Jason Gunthorpe @ 2020-07-20 14:31 UTC (permalink / raw)
  To: Leon Romanovsky; +Cc: Doug Ledford, Maor Gottlieb, Artemy Kovalyov, linux-rdma

On Sun, Jul 19, 2020 at 09:57:47AM +0300, Leon Romanovsky wrote:
> From: Maor Gottlieb <maorg@mellanox.com>
> 
> According to the locking scheme, mlx5_ib_update_xlt should be called
> with srcu_read_lock().
> This fixes the below LOCKDEP trace.
> 
> [  861.917222] WARNING: CPU: 1 PID: 1130 at
> drivers/infiniband/hw/mlx5/odp.c:132 mlx5_odp_populate_xlt+0x175/0x180
> [mlx5_ib]

Why do I keep getting patches with oops reports word wrapped??

Do not wrap oops.

Do not include timestamp in oops.

Trim useless informatin


^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [PATCH rdma-next] RDMA/mlx5: Add missing srcu_read_lock in ODP implicit flow
  2020-07-20 14:31 ` Jason Gunthorpe
@ 2020-07-20 16:53   ` Leon Romanovsky
  0 siblings, 0 replies; 4+ messages in thread
From: Leon Romanovsky @ 2020-07-20 16:53 UTC (permalink / raw)
  To: Jason Gunthorpe; +Cc: Doug Ledford, Maor Gottlieb, Artemy Kovalyov, linux-rdma

On Mon, Jul 20, 2020 at 11:31:05AM -0300, Jason Gunthorpe wrote:
> On Sun, Jul 19, 2020 at 09:57:47AM +0300, Leon Romanovsky wrote:
> > From: Maor Gottlieb <maorg@mellanox.com>
> >
> > According to the locking scheme, mlx5_ib_update_xlt should be called
> > with srcu_read_lock().
> > This fixes the below LOCKDEP trace.
> >
> > [  861.917222] WARNING: CPU: 1 PID: 1130 at
> > drivers/infiniband/hw/mlx5/odp.c:132 mlx5_odp_populate_xlt+0x175/0x180
> > [mlx5_ib]
>
> Why do I keep getting patches with oops reports word wrapped??
>
> Do not wrap oops.
>
> Do not include timestamp in oops.
>
> Trim useless informatin

I'll try my best.

This note for INTERNAL review, please DON'T trim anything from OOPS, I
don't want to lose information just because someone was too extatic.

Thanks

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [PATCH rdma-next] RDMA/mlx5: Add missing srcu_read_lock in ODP implicit flow
  2020-07-19  6:57 [PATCH rdma-next] RDMA/mlx5: Add missing srcu_read_lock in ODP implicit flow Leon Romanovsky
  2020-07-20 14:31 ` Jason Gunthorpe
@ 2020-07-24 19:51 ` Jason Gunthorpe
  1 sibling, 0 replies; 4+ messages in thread
From: Jason Gunthorpe @ 2020-07-24 19:51 UTC (permalink / raw)
  To: Leon Romanovsky; +Cc: Doug Ledford, Maor Gottlieb, Artemy Kovalyov, linux-rdma

On Sun, Jul 19, 2020 at 09:57:47AM +0300, Leon Romanovsky wrote:
> From: Maor Gottlieb <maorg@mellanox.com>
> 
> According to the locking scheme, mlx5_ib_update_xlt should be called
> with srcu_read_lock().
> This fixes the below LOCKDEP trace.
> 
> [  861.917222] WARNING: CPU: 1 PID: 1130 at
> drivers/infiniband/hw/mlx5/odp.c:132 mlx5_odp_populate_xlt+0x175/0x180
> [mlx5_ib]
> [  861.921080] Modules linked in: xt_conntrack xt_MASQUERADE
> nf_conntrack_netlink nfnetlink xt_addrtype iptable_nat nf_nat
> nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 br_netfilter overlay ib_srp
> scsi_transport_srp rpcrdma rdma_ucm ib_iser libiscsi
> scsi_transport_iscsi rdma_cm iw_cm ib_umad ib_ipoib ib_cm mlx5_ib
> ib_uverbs ib_core mlx5_core mlxfw ptp pps_core
> [  861.930133] CPU: 1 PID: 1130 Comm: kworker/u16:11 Tainted: G        W
> 5.8.0-rc5_for_upstream_debug_2020_07_13_11_04 #1
> [  861.932034] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS
> rel-1.12.1-0-ga5cab58e9a3f-prebuilt.qemu.org 04/01/2014
> [  861.933801] Workqueue: events_unbound mlx5_ib_prefetch_mr_work
> [mlx5_ib]
> [  861.934933] RIP: 0010:mlx5_odp_populate_xlt+0x175/0x180 [mlx5_ib]
> [  861.935954] Code: 08 e2 85 c0 0f 84 65 ff ff ff 49 8b 87 60 01 00 00
> be ff ff ff ff 48 8d b8 b0 39 00 00 e8 93 e0 50 e1 85 c0 0f 85 45 ff ff
> ff <0f> 0b e9 3e ff ff ff 0f 0b eb c7 0f 1f 44 00 00 48 8b 87 98 0f 00
> [  861.938855] RSP: 0018:ffff88840f44fc68 EFLAGS: 00010246
> [  861.939707] RAX: 0000000000000000 RBX: ffff88840cc9d000 RCX:
> ffff88840efcd940
> [  861.940788] RDX: 0000000000000000 RSI: ffff88844871b9b0 RDI:
> ffff88840efce100
> [  861.941880] RBP: ffff88840cc9d040 R08: 0000000000000040 R09:
> 0000000000000001
> [  861.943020] R10: ffff88846ced3068 R11: 0000000000000000 R12:
> 00000000000156ec
> [  861.944133] R13: 0000000000000004 R14: 0000000000000004 R15:
> ffff888439941000
> [  861.945228] FS:  0000000000000000(0000) GS:ffff88846fa80000(0000)
> knlGS:0000000000000000
> [  861.946557] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [  861.947512] CR2: 00007f8536d12430 CR3: 0000000437a5e006 CR4:
> 0000000000360ea0
> [  861.948609] DR0: 0000000000000000 DR1: 0000000000000000 DR2:
> 0000000000000000
> [  861.949684] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7:
> 0000000000000400
> [  861.950804] Call Trace:
> [  861.951300]  mlx5_ib_update_xlt+0x37c/0x7c0 [mlx5_ib]
> [  861.952145]  pagefault_mr+0x315/0x440 [mlx5_ib]
> [  861.952895]  mlx5_ib_prefetch_mr_work+0x56/0xa0 [mlx5_ib]
> [  861.953763]  process_one_work+0x215/0x5c0
> [  861.954477]  worker_thread+0x3c/0x380
> [  861.955127]  ? process_one_work+0x5c0/0x5c0
> [  861.955851]  kthread+0x133/0x150
> [  861.956426]  ? kthread_park+0x90/0x90
> [  861.957061]  ret_from_fork+0x1f/0x30
> [  861.957686] irq event stamp: 199569
> [  861.958318] hardirqs last  enabled at (199577): [<ffffffff8119bf39>]
> console_unlock+0x439/0x590
> [  861.959720] hardirqs last disabled at (199584): [<ffffffff8119bb81>]
> console_unlock+0x81/0x590
> [  861.961080] softirqs last  enabled at (199600): [<ffffffff81e00293>]
> __do_softirq+0x293/0x47e
> [  861.971725] softirqs last disabled at (199613): [<ffffffff81c00f0f>]
> asm_call_on_stack+0xf/0x20
> [  861.973142] ---[ end trace e6f026aa6012c21e ]---
> 
> Fixes: 5256edcb98a1 ("RDMA/mlx5: Rework implicit ODP destroy")
> Signed-off-by: Maor Gottlieb <maorg@mellanox.com>
> Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
> ---
> It is not really need to go to -rc, not urgent.
> ---
>  drivers/infiniband/core/uverbs_std_types_mr.c | 2 +-
>  drivers/infiniband/core/verbs.c               | 3 +++
>  drivers/infiniband/hw/mlx5/odp.c              | 9 +++++++++
>  3 files changed, 13 insertions(+), 1 deletion(-)
> 
> --
> 2.26.2

Applied to for-next, thanks

Jason

^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2020-07-24 19:51 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-07-19  6:57 [PATCH rdma-next] RDMA/mlx5: Add missing srcu_read_lock in ODP implicit flow Leon Romanovsky
2020-07-20 14:31 ` Jason Gunthorpe
2020-07-20 16:53   ` Leon Romanovsky
2020-07-24 19:51 ` Jason Gunthorpe

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).