From: Bernd Schubert <bschubert@ddn.com>
To: Vivek Goyal <vgoyal@redhat.com>,
	Dharmendra Hans <dharamhans87@gmail.com>
Cc: Miklos Szeredi <miklos@szeredi.hu>,
	linux-fsdevel@vger.kernel.org,
	fuse-devel <fuse-devel@lists.sourceforge.net>,
	linux-kernel@vger.kernel.org, Dharmendra Singh <dsingh@ddn.com>
Subject: Re: [PATCH v4 1/3] FUSE: Implement atomic lookup + create
Date: Fri, 6 May 2022 18:41:17 +0200	[thread overview]
Message-ID: <78c2beed-b221-71b4-019f-b82522d98f1e@ddn.com> (raw)
In-Reply-To: <YnUsw4O3F4wgtxTr@redhat.com>



On 5/6/22 16:12, Vivek Goyal wrote:

[...]

> On Fri, May 06, 2022 at 11:04:05AM +0530, Dharmendra Hans wrote:

> 
> Ok, looks like your fuse file server is talking to another file
> server on the network and that's why you are mentioning two network trips.
> 
> Let us differentiate between two things first.
> 
> A. FUSE protocol semantics
> B. Implementation of FUSE protocol by libfuse.
> 
> I think I am stressing on A and you are stressing on B. I just want
> to see what's the difference between FUSE_CREATE and FUSE_ATOMIC_CREATE
> from fuse protocol point of view. Again, look at it from the kernel's point
> of view and don't worry about how libfuse is going to implement it.
> Implementations can vary.

Agreed, I don't think we need to bring in network for the kernel to 
libfuse API.

> 
> From kernel's perspective FUSE_CREATE is supposed to create + open a
> file. It is possible file already exists. Look at include/fuse_lowlevel.h
> description for create().
> 
>          /**
>           * Create and open a file
>           *
>           * If the file does not exist, first create it with the specified
>           * mode, and then open it.
>           */
> 
> I notice that fuse is offering a high level API as well as a low level
> API. I primarily know about the low level API. To me these are just two
> different implementations, but that doesn't change how the kernel sends
> fuse messages and what it expects from the server in return.
> 
> Now with FUSE_ATOMIC_CREATE, from kernel's perspective, only difference
> is that in reply message file server will also indicate if file was
> actually created or not. Is that right?
> 
> And I am focusing on this FUSE API aspect. I am least concerned at
> this point in time with how libfuse decides to actually implement FUSE_CREATE
> or FUSE_ATOMIC_CREATE etc. You might make a single call in libfuse
> server (instead of two) and that's a performance optimization in libfuse.
> The kernel does not care how many calls you make in the file server to
> implement FUSE_CREATE or FUSE_ATOMIC_CREATE. All it cares about is that
> the file is created and opened.
> 
> So while you might do things in more atomic manner in file server and
> cut down on network traffic, kernel fuse API does not care. All it cares
> about is create + open a file.
> 
> Anyway, from kernel's perspective, I think you should be able to
> just use FUSE_CREATE and still do "lookup + create + open".
> FUSE_ATOMIC_CREATE just allows one additional optimization so
> that you know whether to invalidate parent dir's attrs or not.
> 
> In fact kernel is not putting any atomicity requirements as well on
> file server. And that's why I think this new command should probably
> be called FUSE_CREATE_EXT because it just sends back additional
> info.
> 
> All the atomicity stuff you have been describing is that you are
> trying to do some optimizations in libfuse implementation to implement
> FUSE_ATOMIC_CREATE so that you send fewer commands over the
> network. That's a good idea but fuse kernel API does not require you
> to do these atomically, AFAICS.
> 
> Given I know little bit of fuse low level API, If I were to implement
> this in virtiofs/passthrough_ll.c, I probably will do following.
> 
> A. Check if caller provided O_EXCL flag.
> B. openat(O_CREAT | O_EXCL)
> C. If success, we created the file. Set file_created = 1.
> 
> D. If error and error != -EEXIST, send error back to client.
> E. If error and error == -EEXIST, if caller did provide O_EXCL flag,
>     return error.
> F. openat() returned -EEXIST and caller did not provide O_EXCL flag,
>     that means file already exists.  Set file_created = 0.
> G. Do lookup() etc to create internal lo_inode and stat() of file.
> H. Send response back to client using fuse_reply_create().
>     
> This is one sample implementation for fuse lowlevel API. There could
> be other ways to implement. But all that is libfuse + filesystem
> specific and kernel does not care how many operations you use to
> complete and what the atomicity is etc. Of course, the fewer
> operations you do, the better it is.
> 
> Anyway, I think I have said enough on this topic. IMHO, FUSE_CREATE
> description (fuse_lowlevel.h) already mentions that "If the file does not
> exist, first create it with the specified mode and then open it". That
> means intent of protocol is that file could already be there as well.
> So I think we probably should implement this optimization (in kernel)
> using FUSE_CREATE command and then add FUSE_CREATE_EXT to add optimization
> about knowing whether file was actually created or not.
> 
> W.r.t libfuse optimizations, I am not sure why you can't do optimizations
> with FUSE_CREATE and why you necessarily need FUSE_CREATE_EXT. If
> you are worried that some existing filesystems will break, I think
> you can create an internal helper say fuse_create_atomic() and then
> use that if filesystem offers it. IOW, libfuse will have two
> ways to implement FUSE_CREATE. And if filesystem offers a new way which
> cuts down on network traffic, libfuse uses more efficient method. We
> should not have to change kernel FUSE API just because libfuse can
> do create + open operation more efficiently.

Ah right, I like this. As I wrote before, the first patch version used
FUSE_CREATE and I was worried about breaking something. Yes, it should be
possible to split this into lookup + create on the libfuse side. That
being said, libfuse will need to know which version it is dealing with:
an old kernel might send the non-optimized version, and libfuse should
not do another lookup then. Now there is 'fi.flags = arg->flags', but
these are already taken by open/fcntl flags and I would not feel
comfortable overloading them. At best, struct fuse_create_in currently
has a padding field; we could convert it to something like
'ext_fuse_open_flags' and then use it for fuse-internal things. The
difficulty here is that I don't know if all kernel implementations zero
the struct (BSD, MacOS), so I guess we would need to negotiate at
startup/init time and would need another main feature flag? And with
that I'm not sure anymore whether the result would actually be simpler
than what we have right now in the first patch.


Thanks,
Bernd

