From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Fri, 20 Aug 2021 16:06:41 +0300
From: Leon Romanovsky
To: "Keller, Jacob E"
Cc: Jakub Kicinski, "David S . Miller", Guangbin Huang, Jiri Pirko,
 "linux-kernel@vger.kernel.org", "netdev@vger.kernel.org", Salil Mehta,
 Shannon Nelson, Yisen Zhuang, Yufeng Mo
Subject: Re: [PATCH net-next 3/6] devlink: Count struct devlink consumers
Message-ID:
References: <20210816084741.1dd1c415@kicinski-fedora-pc1c0hjn.dhcp.thefacebook.com>
 <20210816090700.313a54ba@kicinski-fedora-pc1c0hjn.dhcp.thefacebook.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To:
Precedence: bulk
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On Wed, Aug 18, 2021 at 05:50:11PM +0000, Keller, Jacob E wrote:
>
>
> > -----Original Message-----
> > From: Leon Romanovsky
> > Sent: Wednesday, August 18, 2021 1:12 AM
> > To: Keller, Jacob E
> > Cc: Jakub Kicinski; David S . Miller; Guangbin Huang; Jiri Pirko;
> > linux-kernel@vger.kernel.org; netdev@vger.kernel.org; Salil Mehta;
> > Shannon Nelson; Yisen Zhuang; Yufeng Mo
> > Subject: Re: [PATCH net-next 3/6] devlink: Count struct devlink consumers
> >
> > On Mon, Aug 16, 2021 at 09:32:17PM +0000, Keller, Jacob E wrote:
> > >
> > >
> > > > -----Original Message-----
> > > > From: Jakub Kicinski
> > > > Sent: Monday, August 16, 2021 9:07 AM
> > > > To: Leon Romanovsky
> > > > Cc: David S . Miller; Guangbin Huang; Keller, Jacob E; Jiri Pirko;
> > > > linux-kernel@vger.kernel.org; netdev@vger.kernel.org; Salil Mehta;
> > > > Shannon Nelson; Yisen Zhuang; Yufeng Mo
> > > > Subject: Re: [PATCH net-next 3/6] devlink: Count struct devlink consumers
> > > >
> > > > On Mon, 16 Aug 2021 18:53:45 +0300 Leon Romanovsky wrote:
> > > > > On Mon, Aug 16, 2021 at 08:47:41AM -0700, Jakub Kicinski wrote:
> > > > > > On Sat, 14 Aug 2021 12:57:28 +0300 Leon Romanovsky wrote:
> > > > > > > From: Leon Romanovsky
> > > > > > >
> > > > > > > The struct devlink itself is protected by internal lock and doesn't
> > > > > > > need global lock during operation.
> > > > > > > That global lock is used to protect
> > > > > > > addition/removal new devlink instances from the global list in use by
> > > > > > > all devlink consumers in the system.
> > > > > > >
> > > > > > > The future conversion of linked list to be xarray will allow us to
> > > > > > > actually delete that lock, but first we need to count all struct devlink
> > > > > > > users.
> > > > > >
> > > > > > Not a problem with this set but to state the obvious the global devlink
> > > > > > lock also protects from concurrent execution of all the ops which don't
> > > > > > take the instance lock (DEVLINK_NL_FLAG_NO_LOCK). You most likely know
> > > > > > this but I thought I'd comment on an off chance it helps.
> > > > >
> > > > > The end goal will be something like that:
> > > > > 1. Delete devlink lock
> > > > > 2. Rely on xa_lock() while grabbing devlink instance (past devlink_try_get)
> > > > > 3. Convert devlink->lock to be read/write lock to make sure that we can run
> > > > > get query in parallel.
> > > > > 4. Open devlink netlink to parallel ops, ".parallel_ops = true".
> > > >
> > > > IIUC that'd mean setting eswitch mode would hold write lock on
> > > > the dl instance. What locks does e.g. registering a dl port take
> > > > then?
> > >
> > > Also that I think we have some cases where we want to allow the driver to
> > > allocate new devlink objects in response to adding a port, but still want to block
> > > other global operations from running?
> >
> > I don't see the flow where operations on devlink_A should block devlink_B.
> > Only in such flows we will need global lock like we have now - devlink->lock.
> > In all other flows, write lock of devlink instance will protect from
> > parallel execution.
> >
> > Thanks
>
>
> But how do we handle what is essentially recursion?

Let's wait for the implementation; I promise it will be covered :).

>
> If we add a port on the devlink A:
>
> userspace sends PORT_ADD for devlink A
> driver responds by creating a port
> adding a port causes driver to add a region, or other devlink object
>
> In the current design, if I understand correctly, we hold the global lock but *not* the instance lock. We can't hold the instance lock while adding port without breaking a bunch of drivers that add many devlink objects in response to port creation.. because they'll deadlock when going to add the sub objects.
>
> But if we don't hold the global lock, then in theory another userspace program could attempt to do something inbetween PORT_ADD starting and finishing which might not be desirable. (Remember, we had to drop the instance lock otherwise drivers get stuck when trying to add many subobjects)

You just surfaced my main issue with the current devlink implementation - the
purpose of devlink_lock. Over the years, the devlink code has lost a clear
separation between user space flows and kernel flows.

Thanks

>
> Thanks,
> Jake
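
For illustration, the recursion problem described in the quoted PORT_ADD
scenario can be modelled in userspace. This is a minimal sketch and not devlink
code: the Devlink class and the functions region_create(), driver_port_new()
and port_add() are hypothetical stand-ins, and a non-blocking acquire on a
non-reentrant lock stands in for "would deadlock". It assumes, as the quoted
text suggests, that registering a sub-object takes the instance lock itself.

```python
import threading


class Devlink:
    """Toy stand-in for a devlink instance (hypothetical, not kernel code)."""

    def __init__(self):
        # Non-reentrant lock: the holder cannot take it a second time.
        self.lock = threading.Lock()
        self.ports = []
        self.regions = []

    def region_create(self, name):
        # Assumption: sub-object registration takes the instance lock itself.
        # A failed non-blocking attempt models a self-deadlock.
        if not self.lock.acquire(blocking=False):
            raise RuntimeError("would deadlock: instance lock already held")
        try:
            self.regions.append(name)
        finally:
            self.lock.release()


def driver_port_new(dl, index):
    """Hypothetical driver callback: creating a port also adds a region."""
    dl.ports.append(index)
    dl.region_create(f"port{index}-region")


def port_add(dl, index, hold_instance_lock):
    """Model of a PORT_ADD handler, with or without the instance lock held."""
    if hold_instance_lock:
        with dl.lock:
            driver_port_new(dl, index)
    else:
        driver_port_new(dl, index)
```

Calling port_add() with hold_instance_lock=False succeeds, but nothing then
stops another caller from running between the port and region creation.
Calling it with hold_instance_lock=True fails inside region_create(), which is
the trade-off raised in the quoted text.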