From: Leon Romanovsky <leon@kernel.org>
To: Doug Ledford <dledford@redhat.com>, Jason Gunthorpe <jgg@nvidia.com>
Cc: Aharon Landau <aharonl@nvidia.com>,
Alaa Hleihel <alaa@nvidia.com>,
linux-kernel@vger.kernel.org, linux-rdma@vger.kernel.org,
Maor Gottlieb <maorg@nvidia.com>
Subject: [PATCH rdma-rc 2/3] RDMA/mlx5: Delete right entry from MR signature database
Date: Thu, 10 Jun 2021 10:34:26 +0300
Message-ID: <f3f585ea0db59c2a78f94f65eedeafc5a2374993.1623309971.git.leonro@nvidia.com>
In-Reply-To: <cover.1623309971.git.leonro@nvidia.com>
From: Aharon Landau <aharonl@nvidia.com>
The value stored in the sig_mrs entry upon MR allocation is mr->sig.
However, ibmr is passed here as the "old" argument, so xa_cmpxchg() finds
no match and does not replace the entry with NULL. This leads to the
following trace:
WARNING: CPU: 28 PID: 2078 at drivers/infiniband/hw/mlx5/main.c:3643 mlx5_ib_stage_init_cleanup+0x4d/0x60 [mlx5_ib]
Modules linked in: nvme_rdma nvme_fabrics nvme_core 8021q garp mrp bonding bridge stp llc rfkill rpcrdma sunrpc rdma_ucm ib_srpt ib_isert iscsi_tad
CPU: 28 PID: 2078 Comm: reboot Tainted: G X --------- --- 5.13.0-0.rc2.19.el9.x86_64 #1
Hardware name: Dell Inc. PowerEdge R430/03XKDV, BIOS 2.9.1 12/07/2018
RIP: 0010:mlx5_ib_stage_init_cleanup+0x4d/0x60 [mlx5_ib]
Code: 8d bb 70 1f 00 00 be 00 01 00 00 e8 9d 94 ce da 48 3d 00 01 00 00 75 02 5b c3 0f 0b 5b c3 0f 0b 48 83 bb b0 20 00 00 00 74 d5 <0f> 0b eb d1 4
RSP: 0018:ffffa8db06d33c90 EFLAGS: 00010282
RAX: 0000000000000000 RBX: ffff97f890a44000 RCX: ffff97f900ec0160
RDX: 0000000000000000 RSI: 0000000080080001 RDI: ffff97f890a44000
RBP: ffffffffc0c189b8 R08: 0000000000000001 R09: 0000000000000000
R10: 0000000000000001 R11: 0000000000000300 R12: ffff97f890a44000
R13: ffffffffc0c36030 R14: 00000000fee1dead R15: 0000000000000000
FS: 00007f0d5a8a3b40(0000) GS:ffff98077fb80000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000555acbf4f450 CR3: 00000002a6f56002 CR4: 00000000001706e0
Call Trace:
mlx5r_remove+0x39/0x60 [mlx5_ib]
auxiliary_bus_remove+0x1b/0x30
__device_release_driver+0x17a/0x230
device_release_driver+0x24/0x30
bus_remove_device+0xdb/0x140
device_del+0x18b/0x3e0
mlx5_detach_device+0x59/0x90 [mlx5_core]
mlx5_unload_one+0x22/0x60 [mlx5_core]
shutdown+0x31/0x3a [mlx5_core]
pci_device_shutdown+0x34/0x60
device_shutdown+0x15b/0x1c0
__do_sys_reboot.cold+0x2f/0x5b
? vfs_writev+0xc7/0x140
? handle_mm_fault+0xc5/0x290
? do_writev+0x6b/0x110
do_syscall_64+0x40/0x80
entry_SYSCALL_64_after_hwframe+0x44/0xae
RIP: 0033:0x7f0d5b5132e7
Code: 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa 89 fa be 69 19 12 28 bf ad de e1 fe b8 a9 00 00 00 0f 05 <48> 3d 00 f0 8
RSP: 002b:00007ffd7c7b8388 EFLAGS: 00000246 ORIG_RAX: 00000000000000a9
RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f0d5b5132e7
RDX: 0000000001234567 RSI: 0000000028121969 RDI: 00000000fee1dead
RBP: 00007ffd7c7b83d0 R08: 000000000000000a R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000006
R13: 00007ffd7c7b8578 R14: 0000564da83690bd R15: 000000000000000
Fixes: e6fb246ccafb ("RDMA/mlx5: Consolidate MR destruction to mlx5_ib_dereg_mr()")
Signed-off-by: Aharon Landau <aharonl@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
---
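For reviewers less familiar with compare-and-exchange semantics, here is a
minimal userspace sketch of why the wrong "old" value leaves the entry in
place. It is a toy single-slot model, not the kernel xarray API: toy_cmpxchg(),
toy_erase(), struct toy_sig and struct toy_mr are hypothetical names invented
for illustration, and the only behaviour assumed is that the slot is
overwritten only when its current content equals "old".

```c
#include <assert.h>
#include <stddef.h>

/* Toy single-slot model of compare-and-exchange; NOT the kernel xarray API. */
static void *toy_cmpxchg(void **slot, void *old, void *new_entry)
{
	void *cur = *slot;

	if (cur == old)		/* store only when the expected value matches */
		*slot = new_entry;
	return cur;		/* always the value found before the call */
}

/* Hypothetical stand-ins for the signature context and the MR. */
struct toy_sig { int unused; };
struct toy_mr { struct toy_sig *sig; int ibmr; };

/* Try to clear the slot using 'old'; returns 1 if the slot ended up empty. */
static int toy_erase(void **slot, void *old)
{
	toy_cmpxchg(slot, old, NULL);
	return *slot == NULL;
}

static void toy_demo(void)
{
	struct toy_sig sig = { 0 };
	struct toy_mr mr = { .sig = &sig, .ibmr = 0 };
	void *slot = mr.sig;	/* allocation stored mr->sig in the entry */

	assert(!toy_erase(&slot, &mr.ibmr));	/* wrong "old": entry stays */
	assert(toy_erase(&slot, mr.sig));	/* matching "old": entry cleared */
}
```

In this model the first call mirrors the code before the patch (a pointer that
was never stored is used as "old") and the second mirrors the code after it.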
drivers/infiniband/hw/mlx5/mr.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/infiniband/hw/mlx5/mr.c b/drivers/infiniband/hw/mlx5/mr.c
index 9662cd39c7ff..425423dfac72 100644
--- a/drivers/infiniband/hw/mlx5/mr.c
+++ b/drivers/infiniband/hw/mlx5/mr.c
@@ -1940,8 +1940,8 @@ int mlx5_ib_dereg_mr(struct ib_mr *ibmr, struct ib_udata *udata)
mlx5r_deref_wait_odp_mkey(&mr->mmkey);
if (ibmr->type == IB_MR_TYPE_INTEGRITY) {
- xa_cmpxchg(&dev->sig_mrs, mlx5_base_mkey(mr->mmkey.key), ibmr,
- NULL, GFP_KERNEL);
+ xa_cmpxchg(&dev->sig_mrs, mlx5_base_mkey(mr->mmkey.key),
+ mr->sig, NULL, GFP_KERNEL);
if (mr->mtt_mr) {
rc = mlx5_ib_dereg_mr(&mr->mtt_mr->ibmr, NULL);
--
2.31.1