From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id A5929C433EF for ; Thu, 19 May 2022 19:33:56 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S244555AbiESTdz (ORCPT ); Thu, 19 May 2022 15:33:55 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:37108 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S234946AbiESTdw (ORCPT ); Thu, 19 May 2022 15:33:52 -0400 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124]) by lindbergh.monkeyblade.net (Postfix) with ESMTP id 885705930F for ; Thu, 19 May 2022 12:33:51 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1652988830; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: in-reply-to:in-reply-to:references:references; bh=ZlRs+bfxWYbBbjnLR2rK0SLWkDzLq4+HCX4OhBU9Tww=; b=hYFP7a8nqhhsDjtO8o2QXtSOFGwzcrUnFkRSiE6tBMDMviu2w1RXcfp67DkcqQ/UnRlfUR 9GYA8Ga4DdRyljHTmT3NeSEKHn6UkYEJApFPhFl4oiQGosbbuGb+dz4t0dgfOnaOQrB04u ytyI5wbNGZ8QAI/dxgjObObGYkhttYM= Received: from mimecast-mx02.redhat.com (mx3-rdu2.redhat.com [66.187.233.73]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-655-zrJHUk78NhaUxtngxbrHzw-1; Thu, 19 May 2022 15:33:45 -0400 X-MC-Unique: zrJHUk78NhaUxtngxbrHzw-1 Received: from smtp.corp.redhat.com (int-mx01.intmail.prod.int.rdu2.redhat.com [10.11.54.1]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id EDFDB29AA2F2; Thu, 19 May 2022 19:33:44 +0000 (UTC) Received: from horse.redhat.com (unknown [10.22.18.71]) by smtp.corp.redhat.com (Postfix) with ESMTP id D46C0400E895; Thu, 19 May 2022 19:33:44 +0000 (UTC) Received: by horse.redhat.com (Postfix, from userid 10451) id 85CC52208FA; Thu, 19 May 2022 15:33:44 -0400 (EDT) Date: Thu, 19 May 2022 15:33:44 -0400 From: Vivek Goyal To: Miklos Szeredi Cc: Dharmendra Singh , linux-fsdevel@vger.kernel.org, fuse-devel , linux-kernel@vger.kernel.org, Bernd Schubert Subject: Re: [PATCH v5 0/3] FUSE: Implement atomic lookup + open/create Message-ID: References: <20220517100744.26849-1-dharamhans87@gmail.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: X-Scanned-By: MIMEDefang 2.84 on 10.11.54.1 Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Thu, May 19, 2022 at 11:39:01AM +0200, Miklos Szeredi wrote: > On Tue, 17 May 2022 at 12:08, Dharmendra Singh wrote: > > > > In FUSE, as of now, uncached lookups are expensive over the wire. > > E.g additional latencies and stressing (meta data) servers from > > thousands of clients. These lookup calls possibly can be avoided > > in some cases. Incoming three patches address this issue. > > > > > > Fist patch handles the case where we are creating a file with O_CREAT. > > Before we go for file creation, we do a lookup on the file which is most > > likely non-existent. After this lookup is done, we again go into libfuse > > to create file. Such lookups where file is most likely non-existent, can > > be avoided. > > I'd really like to see a bit wider picture... > > We have several cases, first of all let's look at plain O_CREAT > without O_EXCL (assume that there were no changes since the last > lookup for simplicity): Hi Miklos, Thanks for providing this breakup. There are too many cases here and this data helps a lot with that. I feel this should really be captured in commit logs to show the current paths and how these have been optimized with ATOMIC_OPEN/CREATE_EXT. > > [not cached, negative] > ->atomic_open() > LOOKUP > CREATE > > [not cached, positive] > ->atomic_open() > LOOKUP > ->open() > OPEN > > [cached, negative, validity timeout not expired] > ->d_revalidate() > return 1 > ->atomic_open() > CREATE > > [cached, negative, validity timeout expired] > ->d_revalidate() > return 0 > ->atomic_open() > LOOKUP > CREATE > > [cached, positive, validity timeout not expired] > ->d_revalidate() > return 1 > ->open() > OPEN > > [cached, positive, validity timeout expired] > ->d_revalidate() > LOOKUP > return 1 > ->open() > OPEN > > (Caveat emptor: I'm just looking at the code and haven't actually > tested what happens.) > > Apparently in all of these cases we are doing at least one request, so > it would make sense to make them uniform: > > [not cached] > ->atomic_open() > CREATE_EXT > > [cached] > ->d_revalidate() > return 0 So fuse_dentry_revalidate() will return 0 even if timeout has not expired (if server supports so called atomic_open()). And that will lead to calling d_invalidate() on existing positive dentry always. IOW, if I am calling open() on a dentry, dentry will always be dropped and a new dentry will always be created from ->atomic_open() path, is that right. I am not sure what does it mean from VFS perspective to always call d_invalidate() on a cached positive dentry when open() is called. /** * d_invalidate - detach submounts, prune dcache, and drop * @dentry: dentry to invalidate (aka detach, prune and drop) */ Thanks Vivek > ->atomic_open() > CREATE_EXT > > Similarly we can look at the current O_CREAT | O_EXCL cases: > > [not cached, negative] > ->atomic_open() > LOOKUP > CREATE > > [not cached, positive] > ->atomic_open() > LOOKUP > return -EEXIST > > [cached, negative] > ->d_revalidate() > return 0 (see LOOKUP_EXCL check) > ->atomic_open() > LOOKUP > CREATE > > [cached, positive] > ->d_revalidate() > LOOKUP > return 1 > return -EEXIST > > Again we are doing at least one request, so we can unconditionally > replace them with CREATE_EXT like the non-O_EXCL case. > > > > > > Second patch handles the case where we open first time a file/dir > > but do a lookup first on it. After lookup is performed we make another > > call into libfuse to open the file. Now these two separate calls into > > libfuse can be combined and performed as a single call into libfuse. > > And here's my analysis: > > [not cached, negative] > ->lookup() > LOOKUP > return -ENOENT > > [not cached, positive] > ->lookup() > LOOKUP > ->open() > OPEN > > [cached, negative, validity timeout not expired] > ->d_revalidate() > return 1 > return -ENOENT > > [cached, negative, validity timeout expired] > ->d_revalidate() > return 0 > ->atomic_open() > LOOKUP > return -ENOENT > > [cached, positive, validity timeout not expired] > ->d_revalidate() > return 1 > ->open() > OPEN > > [cached, positive, validity timeout expired] > ->d_revalidate() > LOOKUP > return 1 > ->open() > OPEN > > There's one case were no request is sent: a valid cached negative > dentry. Possibly we can also make this uniform, e.g.: > > [not cached] > ->atomic_open() > OPEN_ATOMIC > > [cached, negative, validity timeout not expired] > ->d_revalidate() > return 1 > return -ENOENT > > [cached, negative, validity timeout expired] > ->d_revalidate() > return 0 > ->atomic_open() > OPEN_ATOMIC > > [cached, positive] > ->d_revalidate() > return 0 > ->atomic_open() > OPEN_ATOMIC > > It may even make the code simpler to clearly separate the cases where > the atomic variants are supported and when not. I'd also consider > merging CREATE_EXT into OPEN_ATOMIC, since a filesystem implementing > one will highly likely want to implement the other as well. > > Thanks, > Miklos >