linux-fsdevel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Alessio Balsini <balsini@android.com>
To: Amir Goldstein <amir73il@gmail.com>
Cc: Miklos Szeredi <miklos@szeredi.hu>,
	Alessio Balsini <balsini@android.com>,
	Akilesh Kailash <akailash@google.com>,
	Antonio SJ Musumeci <trapexit@spawn.link>,
	David Anderson <dvander@google.com>,
	Giuseppe Scrivano <gscrivan@redhat.com>,
	Jann Horn <jannh@google.com>, Jens Axboe <axboe@kernel.dk>,
	Martijn Coenen <maco@android.com>,
	Palmer Dabbelt <palmer@dabbelt.com>,
	Paul Lawrence <paullawrence@google.com>,
	Peng Tao <bergwolf@gmail.com>,
	Stefano Duo <duostefano93@gmail.com>,
	Zimuzo Ezeozue <zezeozue@google.com>, wuyan <wu-yan@tcl.com>,
	fuse-devel <fuse-devel@lists.sourceforge.net>,
	kernel-team <kernel-team@android.com>,
	linux-fsdevel <linux-fsdevel@vger.kernel.org>,
	linux-kernel <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH RESEND V12 4/8] fuse: Passthrough initialization and release
Date: Mon, 17 May 2021 12:36:49 +0100	[thread overview]
Message-ID: <YKJVUUUapNSijV38@google.com> (raw)
In-Reply-To: <CAOQ4uxjOGx8gZ2biTEb4a54gw5c_aDn+FFkUvRpY+cmgEEh=sA@mail.gmail.com>

On Wed, May 05, 2021 at 03:21:26PM +0300, Amir Goldstein wrote:
> On Wed, Feb 17, 2021 at 3:52 PM Miklos Szeredi <miklos@szeredi.hu> wrote:
> >
> > On Mon, Jan 25, 2021 at 4:31 PM Alessio Balsini <balsini@android.com> wrote:
> > >
> > > Implement the FUSE passthrough ioctl that associates the lower
> > > (passthrough) file system file with the fuse_file.
> > >
> > > The file descriptor passed to the ioctl by the FUSE daemon is used to
> > > access the relative file pointer, that will be copied to the fuse_file
> > > data structure to consolidate the link between the FUSE and lower file
> > > system.
> > >
> > > To enable the passthrough mode, user space triggers the
> > > FUSE_DEV_IOC_PASSTHROUGH_OPEN ioctl and, if the call succeeds, receives
> > > back an identifier that will be used at open/create response time in the
> > > fuse_open_out field to associate the FUSE file to the lower file system
> > > file.
> > > The value returned by the ioctl to user space can be:
> > > - > 0: success, the identifier can be used as part of an open/create
> > > reply.
> > > - <= 0: an error occurred.
> > > The value 0 represents an error to preserve backward compatibility: the
> > > fuse_open_out field that is used to pass the passthrough_fh back to the
> > > kernel uses the same bits that were previously as struct padding, and is
> > > commonly zero-initialized (e.g., in the libfuse implementation).
> > > Removing 0 from the correct values fixes the ambiguity between the case
> > > in which 0 corresponds to a real passthrough_fh, a missing
> > > implementation of FUSE passthrough or a request for a normal FUSE file,
> > > simplifying the user space implementation.
> > >
> > > For the passthrough mode to be successfully activated, the lower file
> > > system file must implement both read_iter and write_iter file
> > > operations. This extra check avoids special pseudo files to be targeted
> > > for this feature.
> > > Passthrough comes with another limitation: no further file system
> > > stacking is allowed for those FUSE file systems using passthrough.
> > >
> > > Signed-off-by: Alessio Balsini <balsini@android.com>
> > > ---
> > >  fs/fuse/inode.c       |  5 +++
> > >  fs/fuse/passthrough.c | 87 ++++++++++++++++++++++++++++++++++++++++++-
> > >  2 files changed, 90 insertions(+), 2 deletions(-)
> > >
> > > diff --git a/fs/fuse/inode.c b/fs/fuse/inode.c
> > > index a1104d5abb70..7ebc398fbacb 100644
> > > --- a/fs/fuse/inode.c
> > > +++ b/fs/fuse/inode.c
> > > @@ -1133,6 +1133,11 @@ EXPORT_SYMBOL_GPL(fuse_send_init);
> > >
> > >  static int free_fuse_passthrough(int id, void *p, void *data)
> > >  {
> > > +       struct fuse_passthrough *passthrough = (struct fuse_passthrough *)p;
> > > +
> > > +       fuse_passthrough_release(passthrough);
> > > +       kfree(p);
> > > +
> > >         return 0;
> > >  }
> > >
> > > diff --git a/fs/fuse/passthrough.c b/fs/fuse/passthrough.c
> > > index 594060c654f8..cf993e83803e 100644
> > > --- a/fs/fuse/passthrough.c
> > > +++ b/fs/fuse/passthrough.c
> > > @@ -3,19 +3,102 @@
> > >  #include "fuse_i.h"
> > >
> > >  #include <linux/fuse.h>
> > > +#include <linux/idr.h>
> > >
> > >  int fuse_passthrough_open(struct fuse_dev *fud,
> > >                           struct fuse_passthrough_out *pto)
> > >  {
> > > -       return -EINVAL;
> > > +       int res;
> > > +       struct file *passthrough_filp;
> > > +       struct fuse_conn *fc = fud->fc;
> > > +       struct inode *passthrough_inode;
> > > +       struct super_block *passthrough_sb;
> > > +       struct fuse_passthrough *passthrough;
> > > +
> > > +       if (!fc->passthrough)
> > > +               return -EPERM;
> > > +
> > > +       /* This field is reserved for future implementation */
> > > +       if (pto->len != 0)
> > > +               return -EINVAL;
> > > +
> > > +       passthrough_filp = fget(pto->fd);
> > > +       if (!passthrough_filp) {
> > > +               pr_err("FUSE: invalid file descriptor for passthrough.\n");
> > > +               return -EBADF;
> > > +       }
> > > +
> > > +       if (!passthrough_filp->f_op->read_iter ||
> > > +           !passthrough_filp->f_op->write_iter) {
> > > +               pr_err("FUSE: passthrough file misses file operations.\n");
> > > +               res = -EBADF;
> > > +               goto err_free_file;
> > > +       }
> > > +
> > > +       passthrough_inode = file_inode(passthrough_filp);
> > > +       passthrough_sb = passthrough_inode->i_sb;
> > > +       if (passthrough_sb->s_stack_depth >= FILESYSTEM_MAX_STACK_DEPTH) {
> > > +               pr_err("FUSE: fs stacking depth exceeded for passthrough\n");
> >
> > No need to print an error to the logs, this can be a perfectly normal
> > occurrence.  However I'd try to find a more unique error value than
> > EINVAL so that the fuse server can interpret this as "not your fault,
> > but can't support passthrough on this file".  E.g. EOPNOTSUPP.
> >
> >
> 
> Sorry for the fashionably late response...
> Same comment for !{read,write}_iter case above.
> EBAFD is really not appropriate there.
> May I suggest ELOOP for s_stack_depth and EOPNOTSUPP
> for no rw iter ops.
> 
> Are you planning to post another version of the patches soon?
> 
> Thanks,
> Amir.

Hi Amir,

Thanks for your feedback. I like the ELOOP for the stack_depth, and the
EOPNOTSUPP is already in my local branch.

I've been busy with the integration of this last patchset in Android,
that unfortunately will be shipped as out-of-tree. I'm keeping all my
fingers crossed for my workarounds to prevent future passthrough UAPI
breakages to work. :)

I have an ugly patch which uses IDR as Miklos asked, but unfortunately
I'm facing some performance issues due to the locking mechanisms to keep
guarantee the RCU consistency. I can post the new patch set as an RFC
soon for the community to take a look.
At a glance what happens is:
- the IDR, one for each fuse_conn, contains pointers to "struct
  fuse_passthrough" containing:
  - fuse_file *: which is using passthrough,
  - file *: native file system file target,
  - cred of the FUSE server,
- ioctl(PASSTHROUGH_OPEN): updates IDR, requires spinlock:
  - kmalloc(fuse_passthrough), update file and cred,
  - ID = idr_alloc(),
  - return ID,
- fuse_open reply from FUSE server with passthrough ID: updates IDR,
  requires spinlock:
  - pt = idr_find(ID),
  - update fuse_file with the current fuse_file,
  - update fuse_file->passthrough_id = ID,
  - idr_replace(),
- read/write/mmap: lock-free IDR read:
  - idr_find(fuse_file::passthrough_id),
  - forward request to lower file system as for the current FUSE
    passthrough patches.
- ioctl(PASSTHROUGH_CLOSE): updates IDR, requires spinlock:
  - idr_remove();
  - call_rcu(): tell possible open fuse_file user that the ID is no more
    valid and kfree() the allocated struct;
- close(fuse_file): updates IDR, requires spinlock:
  - ID = fuse_file::passthrough_id
  - idr_find(ID),
  - fuse_passthrough::fuse_file = NULL,
  - idr_replace().

This would be fine if passthrough is activated for a few, big files,
where the passthrough overhead is dominated by the direct access
benefits, but if passthrough is enabled for many small files which just
do one or two read/writes (as what I did for my benchmark of total build
time for the kernel, where I was passing-through every opened file), the
overhead becomes a real issue.

If you have any thoughts on how to make this simpler, I'm absolutely
open to fix this.

We are also in the preliminary design step of having a hierarchical
passthrough feature, for which all the dir/file operations performed in
any of the contents of the passthrough folder, are automatically
forwarded to the lower file system.
That may fix this overhead issue, but it's still early to forecast what
is going to happen with this idea. Right now I'm not even sure it's
feasible.

Looking forward to receiving some feedback, thanks in advance!
Alessio

  reply	other threads:[~2021-05-17 11:36 UTC|newest]

Thread overview: 66+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-01-25 15:30 [PATCH RESEND V12 0/8] fuse: Add support for passthrough read/write Alessio Balsini
2021-01-25 15:30 ` [PATCH RESEND V12 1/8] fs: Generic function to convert iocb to rw flags Alessio Balsini
2021-01-25 16:46   ` Alessio Balsini
2021-03-24  7:43     ` Rokudo Yan
2021-03-24 14:02       ` Alessio Balsini
2021-01-25 15:30 ` [PATCH RESEND V12 2/8] fuse: 32-bit user space ioctl compat for fuse device Alessio Balsini
     [not found]   ` <CAMAHBGzkfEd9-1u0iKXp65ReJQgUi_=4sMpmfkwEOaMp6Ux7pg@mail.gmail.com>
2021-01-27 13:40     ` Alessio Balsini
     [not found]       ` <CAMAHBGwpKW+30kNQ_Apt8A-FTmr94hBOzkT21cjEHHW+t7yUMQ@mail.gmail.com>
2021-01-28 14:15         ` Alessio Balsini
2021-02-05  9:54           ` Peng Tao
2021-03-16 18:57           ` Arnd Bergmann
2021-02-17 10:21   ` Miklos Szeredi
2021-03-01 12:26     ` Alessio Balsini
2021-03-16 18:53   ` Arnd Bergmann
2021-03-18 16:13     ` Alessio Balsini
2021-03-18 21:15       ` Arnd Bergmann
2021-03-19 15:21         ` Alessio Balsini
2021-01-25 15:30 ` [PATCH RESEND V12 3/8] fuse: Definitions and ioctl for passthrough Alessio Balsini
2021-02-17 13:41   ` Miklos Szeredi
2021-02-19  7:05     ` Peng Tao
2021-02-19  8:40       ` Miklos Szeredi
2021-03-01 17:05         ` Alessio Balsini
2022-09-08 15:36           ` Amir Goldstein
2022-09-09 19:07             ` Miklos Szeredi
2022-09-10  8:52               ` Amir Goldstein
2022-09-10 13:03                 ` Bernd Schubert
2022-09-12  9:29                 ` Miklos Szeredi
2022-09-12 12:29                   ` Amir Goldstein
2022-09-12 13:03                     ` Miklos Szeredi
2022-09-12 13:05                       ` Miklos Szeredi
2022-09-12 13:26                       ` Amir Goldstein
2022-09-12 14:22                         ` Miklos Szeredi
2022-09-12 15:39                           ` Amir Goldstein
2022-09-12 17:43                             ` Hao Luo
2022-09-12 18:28                               ` Overlayfs with writable lower layer Amir Goldstein
2022-09-13 18:26                                 ` Hao Luo
2022-09-13 18:54                                   ` Amir Goldstein
2022-09-13 20:33                                     ` Hao Luo
2022-09-14  3:46                                       ` Amir Goldstein
2022-09-14 18:00                                         ` Hao Luo
2022-09-14 19:23                                           ` Amir Goldstein
2022-09-14 19:33                                             ` Hao Luo
2022-09-15 10:54                                               ` Amir Goldstein
2023-05-12 19:37                     ` [PATCH RESEND V12 3/8] fuse: Definitions and ioctl for passthrough Amir Goldstein
2023-05-15  7:29                       ` Miklos Szeredi
2023-05-15 14:00                         ` Amir Goldstein
2023-05-15 20:16                           ` [fuse-devel] " Nikolaus Rath
2023-05-15 21:11                             ` Bernd Schubert
2023-05-15 21:45                               ` Paul Lawrence
2023-05-16  8:43                                 ` Miklos Szeredi
2023-05-16 10:16                                   ` Nikolaus Rath
2023-05-16  8:48                                 ` Amir Goldstein
2021-01-25 15:30 ` [PATCH RESEND V12 4/8] fuse: Passthrough initialization and release Alessio Balsini
2021-02-17 13:52   ` Miklos Szeredi
2021-05-05 12:21     ` Amir Goldstein
2021-05-17 11:36       ` Alessio Balsini [this message]
2021-05-17 13:21         ` Amir Goldstein
2021-01-25 15:30 ` [PATCH RESEND V12 5/8] fuse: Introduce synchronous read and write for passthrough Alessio Balsini
2021-02-17 14:00   ` Miklos Szeredi
2021-01-25 15:30 ` [PATCH RESEND V12 6/8] fuse: Handle asynchronous read and write in passthrough Alessio Balsini
2021-01-25 15:30 ` [PATCH RESEND V12 7/8] fuse: Use daemon creds in passthrough mode Alessio Balsini
2021-02-05  9:23   ` Peng Tao
2021-02-05 11:21     ` Alessio Balsini
2021-01-25 15:30 ` [PATCH RESEND V12 8/8] fuse: Introduce passthrough for mmap Alessio Balsini
2021-02-17 14:05   ` Miklos Szeredi
2021-04-01 11:24     ` Alessio Balsini
2021-11-18 18:31 ` [PATCH RESEND V12 0/8] fuse: Add support for passthrough read/write Amir Goldstein

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=YKJVUUUapNSijV38@google.com \
    --to=balsini@android.com \
    --cc=akailash@google.com \
    --cc=amir73il@gmail.com \
    --cc=axboe@kernel.dk \
    --cc=bergwolf@gmail.com \
    --cc=duostefano93@gmail.com \
    --cc=dvander@google.com \
    --cc=fuse-devel@lists.sourceforge.net \
    --cc=gscrivan@redhat.com \
    --cc=jannh@google.com \
    --cc=kernel-team@android.com \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=maco@android.com \
    --cc=miklos@szeredi.hu \
    --cc=palmer@dabbelt.com \
    --cc=paullawrence@google.com \
    --cc=trapexit@spawn.link \
    --cc=wu-yan@tcl.com \
    --cc=zezeozue@google.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).