linux-fsdevel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Amir Goldstein <amir73il@gmail.com>
To: Miklos Szeredi <miklos@szeredi.hu>
Cc: Alessio Balsini <balsini@android.com>,
	Peng Tao <bergwolf@gmail.com>,
	Akilesh Kailash <akailash@google.com>,
	Antonio SJ Musumeci <trapexit@spawn.link>,
	David Anderson <dvander@google.com>,
	Giuseppe Scrivano <gscrivan@redhat.com>,
	Jann Horn <jannh@google.com>, Jens Axboe <axboe@kernel.dk>,
	Martijn Coenen <maco@android.com>,
	Palmer Dabbelt <palmer@dabbelt.com>,
	Paul Lawrence <paullawrence@google.com>,
	Stefano Duo <duostefano93@gmail.com>,
	Zimuzo Ezeozue <zezeozue@google.com>, wuyan <wu-yan@tcl.com>,
	fuse-devel <fuse-devel@lists.sourceforge.net>,
	kernel-team <kernel-team@android.com>,
	"linux-fsdevel@vger.kernel.org" <linux-fsdevel@vger.kernel.org>
Subject: Re: [PATCH RESEND V12 3/8] fuse: Definitions and ioctl for passthrough
Date: Mon, 12 Sep 2022 15:29:46 +0300	[thread overview]
Message-ID: <CAOQ4uxgXhVOpF8NgAcJCeW67QMKBOytzMXwy-GjdmS=DGGZ0hA@mail.gmail.com> (raw)
In-Reply-To: <CAJfpegvbMKadnsBZmEvZpCxeWaMEGDRiDBqEZqaBSXcWyPZnpA@mail.gmail.com>

On Mon, Sep 12, 2022 at 12:29 PM Miklos Szeredi <miklos@szeredi.hu> wrote:
>
> On Sat, 10 Sept 2022 at 10:52, Amir Goldstein <amir73il@gmail.com> wrote:
>
> > I think we should accept the fact that just as any current FUSE
> > passthrough (in userspace) implementation is limited to max number of
> > open files as the server's process limitation, kernel passthrough implementation
> > will be limited by inheriting the mounter's process limitation.
> >
> > There is no reason that the server should need to keep more
> > passthrough fd's open than client open fds.
>
> Maybe you're right.
>
> > If we only support FOPEN_PASSTHROUGH_AUTOCLOSE as v12
> > patches implicitly do, then the memory overhead is not much different
> > than the extra overlayfs pseudo realfiles.
>
> How exactly would this work?
>
> ioctl(F_D_I_P_OPEN) - create passthrough fd with ref 1
> open/FOPEN_PASSTHOUGH -  inc refcount in passthrough fd
> release - put refcount in passthrough fd
> ioctl(F_D_I_P_CLOSE) - put ref in passthrough fd
>
> Due to being refcounted the F_D_I_P_CLOSE can come at any point past
> the finished open request.
>
> Or did you have something else in mind?
>

What I had in mind is that FOPEN_PASSTHROUGH_AUTOCLOSE
"transfers" the server's refcount to the kernel and server does
not need to call explicit F_D_I_P_CLOSE.

This is useful for servers that don't care about reusing mappings.

> > > One other question that's nagging me is how to "unhide" these pseudo-fds.
> > >
> > > Could we create a kernel thread for each fuse sb which has normal
> > > file-table for these?  This would would allow inspecting state through
> > > /proc/$KTHREDID/fd, etc..
> > >
> >
> > Yeah that sounds like a good idea.
> > As I mentioned elsewhere in the thread, io_uring also has a mechanism
> > to register open files with the kernel to perform IO on them later.
> > I assume those files are also visible via some /proc/$KTHREDID/fd,
> > but I'll need to check.
> >
> > BTW, I see that the Android team is presenting eBPF-FUSE on LPC
> > coming Tuesday [1].
>
> At first glance it looks like a filtered kernel-only passthrough +
> fuse fallback, where filtering is provided by eBPF scripts and only
> falls back to userspace access on more complex cases.  Maybe it's a
> good direction, we'll see.

Yeh, we'll see.

> Apparently the passthrough case is
> important enough for various use cases.
>

Indeed.
My use case is HSM and I think that using FUSE for HSM is becoming
more and more common these days.

One of the things that bothers me is that both this FUSE_PASSTHROUGH
patch set and any future eBPF-FUSE passthrough implementation is
bound to duplicate a lot of code and know how from overlayfs
(along with the bugs).

We could try to factor out some common bits to a kernel fs passthough
library.

Anotehr options to consider is not to add any passthrough logic
to FUSE at all.

Instead, implement a "switch" fs to choose between passthrough
to one of several underlying fs "branches", where one of the branches
could be local fs and another a FUSE fs (i.e. for the complex cases).

A similar design was described at:
https://github.com/github/libprojfs/blob/master/docs/design.md#phase-2--hybrid

This "switch" fs is not that much different from overlayfs, when
removing the "merge dir" logic and replacing the "is_upper" logic
with a generic eBPF "choose_branch" logic.

Food for thought.

Thanks,
Amir.

  reply	other threads:[~2022-09-12 12:30 UTC|newest]

Thread overview: 66+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-01-25 15:30 [PATCH RESEND V12 0/8] fuse: Add support for passthrough read/write Alessio Balsini
2021-01-25 15:30 ` [PATCH RESEND V12 1/8] fs: Generic function to convert iocb to rw flags Alessio Balsini
2021-01-25 16:46   ` Alessio Balsini
2021-03-24  7:43     ` Rokudo Yan
2021-03-24 14:02       ` Alessio Balsini
2021-01-25 15:30 ` [PATCH RESEND V12 2/8] fuse: 32-bit user space ioctl compat for fuse device Alessio Balsini
     [not found]   ` <CAMAHBGzkfEd9-1u0iKXp65ReJQgUi_=4sMpmfkwEOaMp6Ux7pg@mail.gmail.com>
2021-01-27 13:40     ` Alessio Balsini
     [not found]       ` <CAMAHBGwpKW+30kNQ_Apt8A-FTmr94hBOzkT21cjEHHW+t7yUMQ@mail.gmail.com>
2021-01-28 14:15         ` Alessio Balsini
2021-02-05  9:54           ` Peng Tao
2021-03-16 18:57           ` Arnd Bergmann
2021-02-17 10:21   ` Miklos Szeredi
2021-03-01 12:26     ` Alessio Balsini
2021-03-16 18:53   ` Arnd Bergmann
2021-03-18 16:13     ` Alessio Balsini
2021-03-18 21:15       ` Arnd Bergmann
2021-03-19 15:21         ` Alessio Balsini
2021-01-25 15:30 ` [PATCH RESEND V12 3/8] fuse: Definitions and ioctl for passthrough Alessio Balsini
2021-02-17 13:41   ` Miklos Szeredi
2021-02-19  7:05     ` Peng Tao
2021-02-19  8:40       ` Miklos Szeredi
2021-03-01 17:05         ` Alessio Balsini
2022-09-08 15:36           ` Amir Goldstein
2022-09-09 19:07             ` Miklos Szeredi
2022-09-10  8:52               ` Amir Goldstein
2022-09-10 13:03                 ` Bernd Schubert
2022-09-12  9:29                 ` Miklos Szeredi
2022-09-12 12:29                   ` Amir Goldstein [this message]
2022-09-12 13:03                     ` Miklos Szeredi
2022-09-12 13:05                       ` Miklos Szeredi
2022-09-12 13:26                       ` Amir Goldstein
2022-09-12 14:22                         ` Miklos Szeredi
2022-09-12 15:39                           ` Amir Goldstein
2022-09-12 17:43                             ` Hao Luo
2022-09-12 18:28                               ` Overlayfs with writable lower layer Amir Goldstein
2022-09-13 18:26                                 ` Hao Luo
2022-09-13 18:54                                   ` Amir Goldstein
2022-09-13 20:33                                     ` Hao Luo
2022-09-14  3:46                                       ` Amir Goldstein
2022-09-14 18:00                                         ` Hao Luo
2022-09-14 19:23                                           ` Amir Goldstein
2022-09-14 19:33                                             ` Hao Luo
2022-09-15 10:54                                               ` Amir Goldstein
2023-05-12 19:37                     ` [PATCH RESEND V12 3/8] fuse: Definitions and ioctl for passthrough Amir Goldstein
2023-05-15  7:29                       ` Miklos Szeredi
2023-05-15 14:00                         ` Amir Goldstein
2023-05-15 20:16                           ` [fuse-devel] " Nikolaus Rath
2023-05-15 21:11                             ` Bernd Schubert
2023-05-15 21:45                               ` Paul Lawrence
2023-05-16  8:43                                 ` Miklos Szeredi
2023-05-16 10:16                                   ` Nikolaus Rath
2023-05-16  8:48                                 ` Amir Goldstein
2021-01-25 15:30 ` [PATCH RESEND V12 4/8] fuse: Passthrough initialization and release Alessio Balsini
2021-02-17 13:52   ` Miklos Szeredi
2021-05-05 12:21     ` Amir Goldstein
2021-05-17 11:36       ` Alessio Balsini
2021-05-17 13:21         ` Amir Goldstein
2021-01-25 15:30 ` [PATCH RESEND V12 5/8] fuse: Introduce synchronous read and write for passthrough Alessio Balsini
2021-02-17 14:00   ` Miklos Szeredi
2021-01-25 15:30 ` [PATCH RESEND V12 6/8] fuse: Handle asynchronous read and write in passthrough Alessio Balsini
2021-01-25 15:30 ` [PATCH RESEND V12 7/8] fuse: Use daemon creds in passthrough mode Alessio Balsini
2021-02-05  9:23   ` Peng Tao
2021-02-05 11:21     ` Alessio Balsini
2021-01-25 15:30 ` [PATCH RESEND V12 8/8] fuse: Introduce passthrough for mmap Alessio Balsini
2021-02-17 14:05   ` Miklos Szeredi
2021-04-01 11:24     ` Alessio Balsini
2021-11-18 18:31 ` [PATCH RESEND V12 0/8] fuse: Add support for passthrough read/write Amir Goldstein

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CAOQ4uxgXhVOpF8NgAcJCeW67QMKBOytzMXwy-GjdmS=DGGZ0hA@mail.gmail.com' \
    --to=amir73il@gmail.com \
    --cc=akailash@google.com \
    --cc=axboe@kernel.dk \
    --cc=balsini@android.com \
    --cc=bergwolf@gmail.com \
    --cc=duostefano93@gmail.com \
    --cc=dvander@google.com \
    --cc=fuse-devel@lists.sourceforge.net \
    --cc=gscrivan@redhat.com \
    --cc=jannh@google.com \
    --cc=kernel-team@android.com \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=maco@android.com \
    --cc=miklos@szeredi.hu \
    --cc=palmer@dabbelt.com \
    --cc=paullawrence@google.com \
    --cc=trapexit@spawn.link \
    --cc=wu-yan@tcl.com \
    --cc=zezeozue@google.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).