From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-2.6 required=3.0 tests=DKIMWL_WL_HIGH,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS, USER_AGENT_SANE_1 autolearn=no autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 442BCC433FF for ; Thu, 1 Aug 2019 16:24:07 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id 18600216C8 for ; Thu, 1 Aug 2019 16:24:07 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=default; t=1564676647; bh=QHVMfSnFXuNGz+UYmrL283A2yJ0XIdJENMviMokrkXQ=; h=Date:From:To:Cc:Subject:References:In-Reply-To:List-ID:From; b=C9SzqEBbxyfftIieW1BPDXEXuCeuY3BkRMZfeQQGTySyJNuo7o8wMkOVVXgl8khtQ JPBhf34cvzGHd0C8S+uDrbXC2n1V9TTZT9gkz5D8GsruCRfSLaS/sWE5CmPjN2LKYh 3OeD0rCmsb/xjFCjwWt72Usv+EAefqL8RUquoiic= Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1731205AbfHAQYG (ORCPT ); Thu, 1 Aug 2019 12:24:06 -0400 Received: from mail.kernel.org ([198.145.29.99]:34398 "EHLO mail.kernel.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1727024AbfHAQYG (ORCPT ); Thu, 1 Aug 2019 12:24:06 -0400 Received: from localhost (unknown [77.137.115.125]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPSA id 882BA20B7C; Thu, 1 Aug 2019 16:24:03 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=default; t=1564676645; bh=QHVMfSnFXuNGz+UYmrL283A2yJ0XIdJENMviMokrkXQ=; h=Date:From:To:Cc:Subject:References:In-Reply-To:From; b=yI32HFA4rwl+y3wVQ+GUUl5HDOURTtjZnqw0JbrQRAQtvK2sdKJVQ+RZpqAAE4t1M o5gePCJ81XZMcxYOYiYV44HzNTDciDVY4LA+mbU9ZFgb4QQCmfyLwDyZYyJm20fF+y Ga+6GDeoJOlKAXM5rUOjaRmWMG1yjwhx2x0w+aio= Date: Thu, 1 Aug 2019 19:23:56 +0300 From: Leon Romanovsky To: Doug Ledford Cc: Jason Gunthorpe , RDMA mailing list , Saeed Mahameed Subject: Re: [PATCH rdma-rc] RDMA/mlx5: Release locks during notifier unregister Message-ID: <20190801162356.GV4832@mtr-leonro.mtl.com> References: <20190731170944.GC4832@mtr-leonro.mtl.com> <20190731172215.GJ22677@mellanox.com> <20190731180124.GE4832@mtr-leonro.mtl.com> <20190731195523.GK22677@mellanox.com> <20190801082749.GH4832@mtr-leonro.mtl.com> <20190801120007.GB23885@mellanox.com> <20190801120821.GK4832@mtr-leonro.mtl.com> <060b3e8fbe48312e9af33b88ba7ba62a6b64b493.camel@redhat.com> <20190801155912.GS4832@mtr-leonro.mtl.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.12.0 (2019-05-25) Sender: linux-rdma-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-rdma@vger.kernel.org On Thu, Aug 01, 2019 at 12:11:20PM -0400, Doug Ledford wrote: > On Thu, 2019-08-01 at 18:59 +0300, Leon Romanovsky wrote: > > > There's no need for a lockdep. The removal of the notifier callback > > > entry is re-entrant safe. The core removal routines have their own > > > spinlock they use to protect the actual notifier list. If you call > > > it > > > more than once, the second and subsequent calls merely scan the > > > list, > > > find no matching entry, and return ENOENT. The only reason this > > > might > > > need a lock and a lockdep entry is if you are protecting against a > > > race > > > with the *add* notifier code in the mlx5 driver specifically (the > > > core > > > add code won't have an issue, but since you only have a single place > > > to > > > store the notifier callback pointer, if it would be possible for you > > > to > > > add two callbacks and write over the first callback pointer with the > > > second without removing the first, then you would leak a callback > > > notifier in the core notifier list). > > > > atomic_notifier_chain_unregister() unconditionally calls to > > syncronize_rcu() and I'm not so sure that it is best thing to do > > for every port unbind. > > > > Actually, I'm completely lost here, we are all agree that the patch > > fixes issue correctly, and it returns the code to be exactly as > > it was before commit df097a278c75 ("IB/mlx5: Use the new mlx5 core > > notifier > > API"). Can we simply merge it and fix the kernel panic? > > As long as you are OK with me adding a comment to the patch so people > coming back later won't scratch their head about how can it possible be > right to do that sequence without a lock held, I'm fine merging the fix. > > Something like: > > /* > * The check/unregister/set-NULL sequence below does not need to be > * locked for correctness as it's only an optimization, and can't > * be under a lock or will throw a scheduling while atomic error. > */ I think that the best place will be in commit message for this explanation, but I'm fine with the comment inside code as well. Thanks a lot, I appreciate it. > > -- > Doug Ledford > GPG KeyID: B826A3330E572FDD > Fingerprint = AE6B 1BDA 122B 23B4 265B 1274 B826 A333 0E57 2FDD