All of lore.kernel.org
 help / color / mirror / Atom feed
From: Doug Ledford <dledford@redhat.com>
To: Leon Romanovsky <leon@kernel.org>, Jason Gunthorpe <jgg@mellanox.com>
Cc: Leon Romanovsky <leonro@mellanox.com>,
	RDMA mailing list <linux-rdma@vger.kernel.org>,
	Saeed Mahameed <saeedm@mellanox.com>
Subject: Re: [PATCH rdma-rc] RDMA/mlx5: Release locks during notifier unregister
Date: Wed, 31 Jul 2019 12:22:44 -0400	[thread overview]
Message-ID: <44863abbef5c1e233cbedfdf959fe900f7722d74.camel@redhat.com> (raw)
In-Reply-To: <20190731083852.584-1-leon@kernel.org>

[-- Attachment #1: Type: text/plain, Size: 6111 bytes --]

On Wed, 2019-07-31 at 11:38 +0300, Leon Romanovsky wrote:
> From: Leon Romanovsky <leonro@mellanox.com>
> 
> The below kernel panic was observed when created bond mode LACP
> with GRE tunnel on top. The reason to it was not released spinlock
> during mlx5 notify unregsiter sequence.
> 
> [  234.562007] BUG: scheduling while atomic: sh/10900/0x00000002
> [  234.563005] Preemption disabled at:
> [  234.566864] ------------[ cut here ]------------
> [  234.567120] DEBUG_LOCKS_WARN_ON(val > preempt_count())
> [  234.567139] WARNING: CPU: 16 PID: 10900 at kernel/sched/core.c:3203
> preempt_count_sub+0xca/0x170
> [  234.569550] CPU: 16 PID: 10900 Comm: sh Tainted: G        W 5.2.0-
> rc1-for-linust-dbg-2019-05-25_04-57-33-60 #1
> [  234.569886] Hardware name: Dell Inc. PowerEdge R720/0X3D66, BIOS
> 2.6.1 02/12/2018
> [  234.570183] RIP: 0010:preempt_count_sub+0xca/0x170
> [  234.570404] Code: 03 38
> d0 7c 08 84 d2 0f 85 b0 00 00 00 8b 15 dd 02 03 04 85 d2 75 ba 48 c7
> c6
> 00 e1 88 83 48 c7 c7 40 e1 88 83 e8 76 11 f7 ff <0f> 0b 5b c3 65 8b 05
> d3 1f d8 7e 84 c0 75 82 e8 62 c3 c3 00 85 c0
> [  234.570911] RSP: 0018:ffff888b94477b08 EFLAGS: 00010286
> [  234.571133] RAX: 0000000000000000 RBX: 0000000000000001 RCX:
> 0000000000000000
> [  234.571391] RDX: 0000000000000000 RSI: 0000000000000004 RDI:
> 0000000000000246
> [  234.571648] RBP: ffff888ba5560000 R08: fffffbfff08962d5 R09:
> fffffbfff08962d5
> [  234.571902] R10: 0000000000000001 R11: fffffbfff08962d4 R12:
> ffff888bac6e9548
> [  234.572157] R13: ffff888babfaf728 R14: ffff888bac6e9568 R15:
> ffff888babfaf750
> [  234.572412] FS: 00007fcafa59b740(0000) GS:ffff888bed200000(0000)
> knlGS:0000000000000000
> [  234.572686] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [  234.572914] CR2: 00007f984f16b140 CR3: 0000000b2bf0a001 CR4:
> 00000000001606e0
> [  234.573172] Call Trace:
> [  234.573336] _raw_spin_unlock+0x2e/0x50
> [  234.573542] mlx5_ib_unbind_slave_port+0x1bc/0x690 [mlx5_ib]
> [  234.573793] mlx5_ib_cleanup_multiport_master+0x1d3/0x660 [mlx5_ib]
> [  234.574039] mlx5_ib_stage_init_cleanup+0x4c/0x360 [mlx5_ib]
> [  234.574271]  ? kfree+0xf5/0x2f0
> [  234.574465] __mlx5_ib_remove+0x61/0xd0 [mlx5_ib]
> [  234.574688]  ? __mlx5_ib_remove+0xd0/0xd0 [mlx5_ib]
> [  234.574951] mlx5_remove_device+0x234/0x300 [mlx5_core]
> [  234.575224] mlx5_unregister_device+0x4d/0x1e0 [mlx5_core]
> [  234.575493] remove_one+0x4f/0x160 [mlx5_core]
> [  234.575704] pci_device_remove+0xef/0x2a0
> [  234.581407]  ? pcibios_free_irq+0x10/0x10
> [  234.587143]  ? up_read+0xc1/0x260
> [  234.592785] device_release_driver_internal+0x1ab/0x430
> [  234.598442] unbind_store+0x152/0x200
> [  234.604064]  ? sysfs_kf_write+0x3b/0x180
> [  234.609441]  ? sysfs_file_ops+0x160/0x160
> [  234.615021] kernfs_fop_write+0x277/0x440
> [  234.620288]  ? __sb_start_write+0x1ef/0x2c0
> [  234.625512] vfs_write+0x15e/0x460
> [  234.630786] ksys_write+0x156/0x1e0
> [  234.635988]  ? __ia32_sys_read+0xb0/0xb0
> [  234.641120]  ? trace_hardirqs_off_thunk+0x1a/0x1c
> [  234.646163] do_syscall_64+0x95/0x470
> [  234.651106] entry_SYSCALL_64_after_hwframe+0x49/0xbe
> [  234.656004] RIP: 0033:0x7fcaf9c9cfd0
> [  234.660686] Code: 73 01
> c3 48 8b 0d c0 6e 2d 00 f7 d8 64 89 01 48 83 c8 ff c3 66 0f 1f 44 00
> 00
> 83 3d cd cf 2d 00 00 75 10 b8 01 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73
> 31 c3 48 83 ec 08 e8 ee cb 01 00 48 89 04 24
> [  234.670128] RSP: 002b:00007ffd3b01ddd8 EFLAGS: 00000246 ORIG_RAX:
> 0000000000000001
> [  234.674811] RAX: ffffffffffffffda RBX: 000000000000000d RCX:
> 00007fcaf9c9cfd0
> [  234.679387] RDX: 000000000000000d RSI: 00007fcafa5c1000 RDI:
> 0000000000000001
> [  234.683848] RBP: 00007fcafa5c1000 R08: 000000000000000a R09:
> 00007fcafa59b740
> [  234.688167] R10: 00007ffd3b01d8e0 R11: 0000000000000246 R12:
> 00007fcaf9f75400
> [  234.692386] R13: 000000000000000d R14: 0000000000000001 R15:
> 0000000000000000
> [  234.696495] irq event stamp: 153067
> [  234.700525] hardirqs last enabled at (153067): [<ffffffff83258c39>]
> _raw_spin_unlock_irqrestore+0x59/0x70
> [  234.704665] hardirqs last disabled at (153066):
> [<ffffffff83259382>] _raw_spin_lock_irqsave+0x22/0x90
> [  234.708722] softirqs last enabled at (153058): [<ffffffff836006c5>]
> __do_softirq+0x6c5/0xb4e
> [  234.712673] softirqs last disabled at (153051):
> [<ffffffff81227c1d>] irq_exit+0x17d/0x1d0
> [  234.716601] ---[ end trace 5dbf096843ee9ce6 ]---
> 
> Fixes: df097a278c75 ("IB/mlx5: Use the new mlx5 core notifier API")
> Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
> ---
>  drivers/infiniband/hw/mlx5/main.c | 7 +++----
>  1 file changed, 3 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/infiniband/hw/mlx5/main.c
> b/drivers/infiniband/hw/mlx5/main.c
> index c2a5780cb394..e12a4404096b 100644
> --- a/drivers/infiniband/hw/mlx5/main.c
> +++ b/drivers/infiniband/hw/mlx5/main.c
> @@ -5802,13 +5802,12 @@ static void mlx5_ib_unbind_slave_port(struct
> mlx5_ib_dev *ibdev,
>  		return;
>  	}
>  
> -	if (mpi->mdev_events.notifier_call)
> -		mlx5_notifier_unregister(mpi->mdev, &mpi->mdev_events);
> -	mpi->mdev_events.notifier_call = NULL;
> -
>  	mpi->ibdev = NULL;
>  
>  	spin_unlock(&port->mp.mpi_lock);
> +	if (mpi->mdev_events.notifier_call)
> +		mlx5_notifier_unregister(mpi->mdev, &mpi->mdev_events);
> +	mpi->mdev_events.notifier_call = NULL;

I can see where this fixes the problem at hand, but this gives the
appearance of creating a new race.  Doing a check/unregister/set-null
series outside of any locks is a red flag to someone investigating the
code.  You should at least make note of the fact that calling unregister
more than once is safe.  If you're fine with it, I can add a comment and
take the patch, or you can resubmit.

>  	mlx5_remove_netdev_notifier(ibdev, port_num);
>  	spin_lock(&port->mp.mpi_lock);
>  

-- 
Doug Ledford <dledford@redhat.com>
    GPG KeyID: B826A3330E572FDD
    Fingerprint = AE6B 1BDA 122B 23B4 265B  1274 B826 A333 0E57 2FDD

[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

  reply	other threads:[~2019-07-31 16:22 UTC|newest]

Thread overview: 24+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-07-31  8:38 [PATCH rdma-rc] RDMA/mlx5: Release locks during notifier unregister Leon Romanovsky
2019-07-31 16:22 ` Doug Ledford [this message]
2019-07-31 17:00   ` Jason Gunthorpe
2019-07-31 17:09     ` Leon Romanovsky
2019-07-31 17:22       ` Jason Gunthorpe
2019-07-31 18:01         ` Leon Romanovsky
2019-07-31 18:51           ` Doug Ledford
2019-08-01  8:22             ` Leon Romanovsky
2019-07-31 19:55           ` Jason Gunthorpe
2019-08-01  8:27             ` Leon Romanovsky
2019-08-01 12:00               ` Jason Gunthorpe
2019-08-01 12:08                 ` Leon Romanovsky
2019-08-01 14:16                   ` Doug Ledford
2019-08-01 15:59                     ` Leon Romanovsky
2019-08-01 16:11                       ` Doug Ledford
2019-08-01 16:20                         ` Jason Gunthorpe
2019-08-01 16:40                           ` Doug Ledford
2019-08-01 16:43                             ` Jason Gunthorpe
2019-08-01 16:50                               ` Doug Ledford
2019-08-01 17:31                                 ` Leon Romanovsky
2019-08-01 16:23                         ` Leon Romanovsky
2019-08-01 16:42                           ` Doug Ledford
2019-08-01 17:33                             ` Leon Romanovsky
2019-08-01 20:09                               ` Doug Ledford

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=44863abbef5c1e233cbedfdf959fe900f7722d74.camel@redhat.com \
    --to=dledford@redhat.com \
    --cc=jgg@mellanox.com \
    --cc=leon@kernel.org \
    --cc=leonro@mellanox.com \
    --cc=linux-rdma@vger.kernel.org \
    --cc=saeedm@mellanox.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.