linux-kernel.vger.kernel.org archive mirror
* [PATCH rdma-rc] RDMA/mlx5: Fix crash when unbind multiport slave
@ 2021-08-10  9:25 Leon Romanovsky
  2021-08-19 13:24 ` Jason Gunthorpe
  0 siblings, 1 reply; 2+ messages in thread
From: Leon Romanovsky @ 2021-08-10  9:25 UTC
  To: Doug Ledford, Jason Gunthorpe; +Cc: Maor Gottlieb, linux-kernel, linux-rdma

From: Maor Gottlieb <maorg@nvidia.com>

Fix the crash below, which occurs when a slave is deleted from the
unaffiliated list twice: first when the slave is bound to the master,
and again when the slave is unloaded.

Fix it by checking whether the slave is unaffiliated (has no ib
device) before removing it from the list.

[ 5140.584361] RIP: 0010:mlx5r_mp_remove+0x4e/0xa0 [mlx5_ib]
[ 5140.595866] Call Trace:
[ 5140.596213]  auxiliary_bus_remove+0x18/0x30
[ 5140.596738]  __device_release_driver+0x177/0x220
[ 5140.597304]  device_release_driver+0x24/0x30
[ 5140.597832]  bus_remove_device+0xd8/0x140
[ 5140.598339]  device_del+0x18a/0x3e0
[ 5140.598795]  mlx5_rescan_drivers_locked+0xa9/0x210 [mlx5_core]
[ 5140.599521]  mlx5_unregister_device+0x34/0x60 [mlx5_core]
[ 5140.600184]  mlx5_uninit_one+0x32/0x100 [mlx5_core]
[ 5140.600792]  remove_one+0x6e/0xe0 [mlx5_core]
[ 5140.601350]  pci_device_remove+0x36/0xa0
[ 5140.601846]  __device_release_driver+0x177/0x220
[ 5140.602408]  device_driver_detach+0x3c/0xa0
[ 5140.602931]  unbind_store+0x113/0x130
[ 5140.603400]  kernfs_fop_write_iter+0x110/0x1a0
[ 5140.603942]  new_sync_write+0x116/0x1a0
[ 5140.604428]  vfs_write+0x1ba/0x260
[ 5140.604873]  ksys_write+0x5f/0xe0
[ 5140.605310]  do_syscall_64+0x3d/0x90
[ 5140.605778]  entry_SYSCALL_64_after_hwframe+0x44/0xae

Fixes: 93f8244431ad ("RDMA/mlx5: Convert mlx5_ib to use auxiliary bus")
Signed-off-by: Maor Gottlieb <maorg@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
---
 drivers/infiniband/hw/mlx5/main.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/infiniband/hw/mlx5/main.c b/drivers/infiniband/hw/mlx5/main.c
index 094c976b1eed..2507051f7b89 100644
--- a/drivers/infiniband/hw/mlx5/main.c
+++ b/drivers/infiniband/hw/mlx5/main.c
@@ -4454,7 +4454,8 @@ static void mlx5r_mp_remove(struct auxiliary_device *adev)
 	mutex_lock(&mlx5_ib_multiport_mutex);
 	if (mpi->ibdev)
 		mlx5_ib_unbind_slave_port(mpi->ibdev, mpi);
-	list_del(&mpi->list);
+	else
+		list_del(&mpi->list);
 	mutex_unlock(&mlx5_ib_multiport_mutex);
 	kfree(mpi);
 }
-- 
2.31.1
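
For readers following the fix, here is a minimal user-space sketch of
why the unconditional list_del() was a double delete. It is not the
driver code: struct port, port_probe(), port_bind(), port_unbind() and
both port_remove_*() helpers are hypothetical stand-ins, and the list
is a simplified imitation of the kernel's list_head. The kernel's
list_del() poisons the removed entry's pointers, so a second delete
faults; the sketch clears the pointers and reports failure instead, so
the behaviour can be observed without crashing.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified circular doubly linked list (assumption: loosely modelled
 * on the kernel's struct list_head, not the real implementation). */
struct node {
	struct node *next, *prev;
};

static void node_init(struct node *head)
{
	head->next = head;
	head->prev = head;
}

static void node_add_tail(struct node *n, struct node *head)
{
	n->prev = head->prev;
	n->next = head;
	head->prev->next = n;
	head->prev = n;
}

/* Unlink n. Returns false if n is already unlinked, which is the case
 * where the kernel would dereference poisoned pointers and crash. */
static bool node_del(struct node *n)
{
	if (!n->next || !n->prev)
		return false;
	n->prev->next = n->next;
	n->next->prev = n->prev;
	n->next = NULL;
	n->prev = NULL;
	return true;
}

static size_t node_count(const struct node *head)
{
	size_t n = 0;

	for (const struct node *it = head->next; it != head; it = it->next)
		n++;
	return n;
}

/* Hypothetical stand-in for the per-port multiport info. */
struct port {
	struct node list;
	void *ibdev;	/* non-NULL once bound to a master */
};

static struct node unaffiliated;

/* A new slave starts out unaffiliated. */
static void port_probe(struct port *p)
{
	p->ibdev = NULL;
	node_add_tail(&p->list, &unaffiliated);
}

/* Binding takes the slave off the unaffiliated list (first delete). */
static void port_bind(struct port *p, void *ibdev)
{
	node_del(&p->list);
	p->ibdev = ibdev;
}

/* Simplification: unbinding only clears the association. */
static void port_unbind(struct port *p)
{
	p->ibdev = NULL;
}

/* Pre-patch shape: always deletes, even for a bound slave. */
static bool port_remove_buggy(struct port *p)
{
	if (p->ibdev)
		port_unbind(p);
	return node_del(&p->list);
}

/* Patched shape: delete only if the slave is still unaffiliated. */
static bool port_remove_fixed(struct port *p)
{
	if (p->ibdev) {
		port_unbind(p);
		return true;
	}
	return node_del(&p->list);
}
```

With a slave that has been probed and then bound, port_remove_buggy()
reports a failed (second) delete, while port_remove_fixed() skips the
list and succeeds. A slave that was never bound is still removed from
the unaffiliated list by the patched path.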



* Re: [PATCH rdma-rc] RDMA/mlx5: Fix crash when unbind multiport slave
  2021-08-10  9:25 [PATCH rdma-rc] RDMA/mlx5: Fix crash when unbind multiport slave Leon Romanovsky
@ 2021-08-19 13:24 ` Jason Gunthorpe
  0 siblings, 0 replies; 2+ messages in thread
From: Jason Gunthorpe @ 2021-08-19 13:24 UTC
  To: Leon Romanovsky; +Cc: Doug Ledford, Maor Gottlieb, linux-kernel, linux-rdma

On Tue, Aug 10, 2021 at 12:25:11PM +0300, Leon Romanovsky wrote:
> From: Maor Gottlieb <maorg@nvidia.com>
> 
> Fix the crash below, which occurs when a slave is deleted from the
> unaffiliated list twice: first when the slave is bound to the master,
> and again when the slave is unloaded.
> 
> Fix it by checking whether the slave is unaffiliated (has no ib
> device) before removing it from the list.
> 
> [ 5140.584361] RIP: 0010:mlx5r_mp_remove+0x4e/0xa0 [mlx5_ib]
> [ 5140.595866] Call Trace:
> [ 5140.596213]  auxiliary_bus_remove+0x18/0x30
> [ 5140.596738]  __device_release_driver+0x177/0x220
> [ 5140.597304]  device_release_driver+0x24/0x30
> [ 5140.597832]  bus_remove_device+0xd8/0x140
> [ 5140.598339]  device_del+0x18a/0x3e0
> [ 5140.598795]  mlx5_rescan_drivers_locked+0xa9/0x210 [mlx5_core]
> [ 5140.599521]  mlx5_unregister_device+0x34/0x60 [mlx5_core]
> [ 5140.600184]  mlx5_uninit_one+0x32/0x100 [mlx5_core]
> [ 5140.600792]  remove_one+0x6e/0xe0 [mlx5_core]
> [ 5140.601350]  pci_device_remove+0x36/0xa0
> [ 5140.601846]  __device_release_driver+0x177/0x220
> [ 5140.602408]  device_driver_detach+0x3c/0xa0
> [ 5140.602931]  unbind_store+0x113/0x130
> [ 5140.603400]  kernfs_fop_write_iter+0x110/0x1a0
> [ 5140.603942]  new_sync_write+0x116/0x1a0
> [ 5140.604428]  vfs_write+0x1ba/0x260
> [ 5140.604873]  ksys_write+0x5f/0xe0
> [ 5140.605310]  do_syscall_64+0x3d/0x90
> [ 5140.605778]  entry_SYSCALL_64_after_hwframe+0x44/0xae
> 
> Fixes: 93f8244431ad ("RDMA/mlx5: Convert mlx5_ib to use auxiliary bus")
> Signed-off-by: Maor Gottlieb <maorg@nvidia.com>
> Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
> ---
>  drivers/infiniband/hw/mlx5/main.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)

Applied to for-rc, thanks

Jason
