Date: Tue, 27 Apr 2021 07:45:43 +0300
From: Leon Romanovsky
To: Jason Gunthorpe
Cc: Doug Ledford, Shay Drory, linux-rdma@vger.kernel.org
Subject: Re: [PATCH rdma-next] RDMA/restrack: Delay QP deletion till all users are gone
References: <20210422142939.GA2407382@nvidia.com>
 <20210425130857.GN1370958@nvidia.com>
 <20210425172254.GO1370958@nvidia.com>
 <20210426120349.GP1370958@nvidia.com>
 <20210426131107.GR1370958@nvidia.com>
In-Reply-To: <20210426131107.GR1370958@nvidia.com>

On Mon, Apr 26, 2021 at 10:11:07AM -0300, Jason Gunthorpe wrote:
> On Mon, Apr 26, 2021 at 04:08:42PM +0300, Leon Romanovsky wrote:
> > On Mon, Apr 26, 2021 at 09:03:49AM -0300, Jason Gunthorpe wrote:
> > > On Sun, Apr 25, 2021 at 08:38:57PM +0300, Leon Romanovsky wrote:
> > > > On Sun, Apr 25, 2021 at 02:22:54PM -0300, Jason Gunthorpe wrote:
> > > > > On Sun, Apr 25, 2021 at 04:44:55PM +0300, Leon Romanovsky wrote:
> > > > > > > > The proposed prepare/abort/finish flow is much harder to implement correctly.
> > > > > > > > Let's take ib_destroy_qp_user() as an example: we called rdma_rw_cleanup_mrs(),
> > > > > > > > but didn't restore them after .destroy_qp() failure.
> > > > > > >
> > > > > > > I think it is a bug that we call rdma_rw code in a user path.
> > > > > >
> > > > > > It was an example of a flow that wasn't restored properly.
> > > > > > The same goes for ib_dealloc_pd_user(), release of __internal_mr.
> > > > > >
> > > > > > Of course, these flows shouldn't fail because of being kernel flows, but it is not clear
> > > > > > from the code.
> > > > >
> > > > > Well, exactly, user flows are not allowed to do extra stuff before
> > > > > calling the driver destroy
> > > > >
> > > > > So the arrangement I gave is reasonable and makes sense, it is
> > > > > certainly better than the hodge podge of ordering that we have today
> > > >
> > > > I thought about a simpler solution - move rdma_restrack_del() before .destroy()
> > > > callbacks together with an attempt to re-add the res object if destroy fails.
> > >
> > > It isn't simpler, now add can fail and can't be recovered
> >
> > It is not different from failure during first call to rdma_restrack_add().
> > You didn't like the idea to be strict with addition of restrack, but
> > want to be strict in reinsert.
>
> It is ugly we couldn't fix the add side, let's not repeat that ugliness
> in other places

Why can't we fix _add?

Thanks

>
> Jason
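
The delete-before-destroy idea discussed in this thread, and the objection that the re-add can itself fail, can be sketched with a toy model. This is only a minimal sketch under stated assumptions: every `toy_*` name below is hypothetical and stands in loosely for the restrack and driver calls. It is not the kernel API and not the code from the patch.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical toy model; these are NOT the kernel's restrack functions. */
struct toy_res {
	bool tracked;	/* true while the object is visible in the tracking DB */
};

enum toy_result {
	TOY_DESTROYED,		/* driver destroy succeeded */
	TOY_RESTORED,		/* destroy failed, tracking entry was re-added */
	TOY_LOST_TRACKING,	/* destroy failed AND the re-add failed too */
};

/* Knobs that let a caller simulate failures (assumption for illustration). */
static bool toy_destroy_should_fail;
static bool toy_add_should_fail;

/* Stand-in for an "add" that may fail, e.g. on allocation failure. */
static int toy_track_add(struct toy_res *res)
{
	if (toy_add_should_fail)
		return -1;
	res->tracked = true;
	return 0;
}

/* Stand-in for a "del" that cannot fail. */
static void toy_track_del(struct toy_res *res)
{
	res->tracked = false;
}

/* Stand-in for a driver .destroy() callback that may fail. */
static int toy_driver_destroy(struct toy_res *res)
{
	(void)res;
	return toy_destroy_should_fail ? -1 : 0;
}

/* Delete from tracking first, then destroy; try to re-add on failure. */
static enum toy_result toy_destroy(struct toy_res *res)
{
	assert(res->tracked);	/* precondition: object is currently tracked */

	toy_track_del(res);
	if (toy_driver_destroy(res) == 0)
		return TOY_DESTROYED;

	/* Destroy failed: the object is still alive, so try to track it again. */
	if (toy_track_add(res) != 0)
		return TOY_LOST_TRACKING;	/* alive but invisible */
	return TOY_RESTORED;
}
```

In this model the third outcome is the case the objection points at: the object still exists after a failed destroy, but nothing tracks it any more unless the add step is made unable to fail.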