All of lore.kernel.org
 help / color / mirror / Atom feed
From: Leon Romanovsky <leon@kernel.org>
To: Doug Ledford <dledford@redhat.com>, Jason Gunthorpe <jgg@nvidia.com>
Cc: Aharon Landau <aharonl@nvidia.com>,
	Alaa Hleihel <alaa@nvidia.com>,
	linux-kernel@vger.kernel.org, linux-rdma@vger.kernel.org,
	Maor Gottlieb <maorg@nvidia.com>
Subject: [PATCH rdma-rc 2/3] RDMA/mlx5: Delete right entry from MR signature database
Date: Thu, 10 Jun 2021 10:34:26 +0300	[thread overview]
Message-ID: <f3f585ea0db59c2a78f94f65eedeafc5a2374993.1623309971.git.leonro@nvidia.com> (raw)
In-Reply-To: <cover.1623309971.git.leonro@nvidia.com>

From: Aharon Landau <aharonl@nvidia.com>

The value mr->sig is stored in the entry upon mr allocation, however, ibmr
is entered here as "old", therefore, xa_cmpxchg() does not replace the
entry with NULL, which leads to the following trace:

 WARNING: CPU: 28 PID: 2078 at drivers/infiniband/hw/mlx5/main.c:3643 mlx5_ib_stage_init_cleanup+0x4d/0x60 [mlx5_ib]
 Modules linked in: nvme_rdma nvme_fabrics nvme_core 8021q garp mrp bonding bridge stp llc rfkill rpcrdma sunrpc rdma_ucm ib_srpt ib_isert iscsi_tad
 CPU: 28 PID: 2078 Comm: reboot Tainted: G               X --------- ---  5.13.0-0.rc2.19.el9.x86_64 #1
 Hardware name: Dell Inc. PowerEdge R430/03XKDV, BIOS 2.9.1 12/07/2018
 RIP: 0010:mlx5_ib_stage_init_cleanup+0x4d/0x60 [mlx5_ib]
 Code: 8d bb 70 1f 00 00 be 00 01 00 00 e8 9d 94 ce da 48 3d 00 01 00 00 75 02 5b c3 0f 0b 5b c3 0f 0b 48 83 bb b0 20 00 00 00 74 d5 <0f> 0b eb d1 4
 RSP: 0018:ffffa8db06d33c90 EFLAGS: 00010282
 RAX: 0000000000000000 RBX: ffff97f890a44000 RCX: ffff97f900ec0160
 RDX: 0000000000000000 RSI: 0000000080080001 RDI: ffff97f890a44000
 RBP: ffffffffc0c189b8 R08: 0000000000000001 R09: 0000000000000000
 R10: 0000000000000001 R11: 0000000000000300 R12: ffff97f890a44000
 R13: ffffffffc0c36030 R14: 00000000fee1dead R15: 0000000000000000
 FS:  00007f0d5a8a3b40(0000) GS:ffff98077fb80000(0000) knlGS:0000000000000000
 CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
 CR2: 0000555acbf4f450 CR3: 00000002a6f56002 CR4: 00000000001706e0
 Call Trace:
  mlx5r_remove+0x39/0x60 [mlx5_ib]
  auxiliary_bus_remove+0x1b/0x30
  __device_release_driver+0x17a/0x230
  device_release_driver+0x24/0x30
  bus_remove_device+0xdb/0x140
  device_del+0x18b/0x3e0
  mlx5_detach_device+0x59/0x90 [mlx5_core]
  mlx5_unload_one+0x22/0x60 [mlx5_core]
  shutdown+0x31/0x3a [mlx5_core]
  pci_device_shutdown+0x34/0x60
  device_shutdown+0x15b/0x1c0
  __do_sys_reboot.cold+0x2f/0x5b
  ? vfs_writev+0xc7/0x140
  ? handle_mm_fault+0xc5/0x290
  ? do_writev+0x6b/0x110
  do_syscall_64+0x40/0x80
  entry_SYSCALL_64_after_hwframe+0x44/0xae
 RIP: 0033:0x7f0d5b5132e7
 Code: 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa 89 fa be 69 19 12 28 bf ad de e1 fe b8 a9 00 00 00 0f 05 <48> 3d 00 f0 8
 RSP: 002b:00007ffd7c7b8388 EFLAGS: 00000246 ORIG_RAX: 00000000000000a9
 RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f0d5b5132e7
 RDX: 0000000001234567 RSI: 0000000028121969 RDI: 00000000fee1dead
 RBP: 00007ffd7c7b83d0 R08: 000000000000000a R09: 0000000000000000
 R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000006
 R13: 00007ffd7c7b8578 R14: 0000564da83690bd R15: 000000000000000

Fixes: e6fb246ccafb ("RDMA/mlx5: Consolidate MR destruction to mlx5_ib_dereg_mr()")
Signed-off-by: Aharon Landau <aharonl@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
---
 drivers/infiniband/hw/mlx5/mr.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/infiniband/hw/mlx5/mr.c b/drivers/infiniband/hw/mlx5/mr.c
index 9662cd39c7ff..425423dfac72 100644
--- a/drivers/infiniband/hw/mlx5/mr.c
+++ b/drivers/infiniband/hw/mlx5/mr.c
@@ -1940,8 +1940,8 @@ int mlx5_ib_dereg_mr(struct ib_mr *ibmr, struct ib_udata *udata)
 		mlx5r_deref_wait_odp_mkey(&mr->mmkey);
 
 	if (ibmr->type == IB_MR_TYPE_INTEGRITY) {
-		xa_cmpxchg(&dev->sig_mrs, mlx5_base_mkey(mr->mmkey.key), ibmr,
-			   NULL, GFP_KERNEL);
+		xa_cmpxchg(&dev->sig_mrs, mlx5_base_mkey(mr->mmkey.key),
+			   mr->sig, NULL, GFP_KERNEL);
 
 		if (mr->mtt_mr) {
 			rc = mlx5_ib_dereg_mr(&mr->mtt_mr->ibmr, NULL);
-- 
2.31.1


  parent reply	other threads:[~2021-06-10  7:34 UTC|newest]

Thread overview: 5+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-06-10  7:34 [PATCH rdma-rc 0/3] Collection of small fixes to mlx5_ib Leon Romanovsky
2021-06-10  7:34 ` [PATCH rdma-rc 1/3] RDMA: Verify port when creating flow rule Leon Romanovsky
2021-06-10  7:34 ` Leon Romanovsky [this message]
2021-06-10  7:34 ` [PATCH rdma-rc 3/3] IB/mlx5: Fix initializing CQ fragments buffer Leon Romanovsky
2021-06-10 12:38 ` [PATCH rdma-rc 0/3] Collection of small fixes to mlx5_ib Jason Gunthorpe

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=f3f585ea0db59c2a78f94f65eedeafc5a2374993.1623309971.git.leonro@nvidia.com \
    --to=leon@kernel.org \
    --cc=aharonl@nvidia.com \
    --cc=alaa@nvidia.com \
    --cc=dledford@redhat.com \
    --cc=jgg@nvidia.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-rdma@vger.kernel.org \
    --cc=maorg@nvidia.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.