linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Amir Goldstein <amir73il@gmail.com>
To: Gao Xiang <hsiangkao@linux.alibaba.com>
Cc: Daniel Rosenberg <drosen@google.com>,
	Miklos Szeredi <miklos@szeredi.hu>,
	bpf@vger.kernel.org, Alexei Starovoitov <ast@kernel.org>,
	linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org,
	linux-unionfs@vger.kernel.org,
	Daniel Borkmann <daniel@iogearbox.net>,
	John Fastabend <john.fastabend@gmail.com>,
	Andrii Nakryiko <andrii@kernel.org>,
	Martin KaFai Lau <martin.lau@linux.dev>,
	Song Liu <song@kernel.org>, Yonghong Song <yhs@fb.com>,
	KP Singh <kpsingh@kernel.org>,
	Stanislav Fomichev <sdf@google.com>, Hao Luo <haoluo@google.com>,
	Jiri Olsa <jolsa@kernel.org>, Shuah Khan <shuah@kernel.org>,
	Jonathan Corbet <corbet@lwn.net>,
	Joanne Koong <joannelkoong@gmail.com>,
	Mykola Lysenko <mykolal@fb.com>,
	kernel-team@android.com
Subject: Re: [RFC PATCH bpf-next v3 00/37] FUSE BPF: A Stacked Filesystem Extension for FUSE
Date: Wed, 17 May 2023 09:51:39 +0300	[thread overview]
Message-ID: <CAOQ4uxjCebxGxkguAh9s4_Vg7QHM=oBoV0LUPZpb+0pcm3z1bw@mail.gmail.com> (raw)
In-Reply-To: <93e0e991-147f-0021-d635-95e615057273@linux.alibaba.com>

On Wed, May 17, 2023 at 5:50 AM Gao Xiang <hsiangkao@linux.alibaba.com> wrote:
>
>
>
> On 2023/5/2 17:07, Daniel Rosenberg wrote:
> > On Mon, Apr 24, 2023 at 8:32 AM Miklos Szeredi <miklos@szeredi.hu> wrote:
> >>
> >>
> >> The security model needs to be thought about and documented.  Think
> >> about this: the fuse server now delegates operations it would itself
> >> perform to the passthrough code in fuse.  The permissions that would
> >> have been checked in the context of the fuse server are now checked in
> >> the context of the task performing the operation.  The server may be
> >> able to bypass seccomp restrictions.  Files that are open on the
> >> backing filesystem are now hidden (e.g. lsof won't find these), which
> >> allows the server to obfuscate accesses to backing files.  Etc.
> >>
> >> These are not particularly worrying if the server is privileged, but
> >> fuse comes with the history of supporting unprivileged servers, so we
> >> should look at supporting passthrough with unprivileged servers as
> >> well.
> >>
> >
> > This is on my todo list. My current plan is to grab the creds that the
> > daemon uses to respond to FUSE_INIT. That should keep behavior fairly
> > similar. I'm not sure if there are cases where the fuse server is
> > operating under multiple contexts.
> > I don't currently have a plan for exposing open files via lsof. Every
> > such file should relate to one that will show up though. I haven't dug
> > into how that's set up, but I'm open to suggestions.
> >
> >> My other generic comment is that you should add justification for
> >> doing this in the first place.  I guess it's mainly performance.  So
> >> how performance can be won in real life cases?   It would also be good
> >> to measure the contribution of individual ops to that win.   Is there
> >> another reason for this besides performance?
> >>
> >> Thanks,
> >> Miklos
> >
> > Our main concern with it is performance. We have some preliminary
> > numbers looking at the pure passthrough case. We've been testing using
> > a ramdrive on a somewhat slow machine, as that should highlight
> > differences more. We ran fio for sequential reads, and random
> > read/write. For sequential reads, we were seeing libfuse's
> > passthrough_hp take about a 50% hit, with fuse-bpf not being
> > detectably slower. For random read/write, we were seeing a roughly 90%
> > drop in performance from passthrough_hp, while fuse-bpf has about a 7%
> > drop in read and write speed. When we use a bpf that traces every
> > opcode, that performance hit increases to a roughly 1% drop in
> > sequential read performance, and a 20% drop in both read and write
> > performance for random read/write. We plan to make more complex bpf
> > examples, with fuse daemon equivalents to compare against.
> >
> > We have not looked closely at the impact of individual opcodes yet.
> >
> > There's also a potential ease of use for fuse-bpf. If you're
> > implementing a fuse daemon that is largely mirroring a backing
> > filesystem, you only need to write code for the differences in
> > behavior. For instance, say you want to remove image metadata like
> > location. You could give bpf information on what range of data is
> > metadata, and zero out that section without having to handle any other
> > operations.
>
> A bit out of topic (although I'm not quite look into FUSE BPF internals)
> After roughly listening to this topic in FS track last week, I'm not
> quite sure (at least in the long term) if it might be better if
> ebpf-related filter/redirect stuffs could be landed in vfs or in a
> somewhat stackable fs so that we could redirect/filter any sub-fstree
> in principle?    It's just an open question and I have no real tendency
> of this but do we really need a BPF-filter functionality for each
> individual fs?

I think that is a valid question, but the answer is that even if it makes sense,
doing something like this in vfs would be a much bigger project with larger
consequences on performance and security and whatnot, so even if
(and a very big if) this ever happens, using FUSE-BPF as a playground for
this sort of stuff would be a good idea.

This reminds me of union mounts - it made sense to have union mount
functionality in vfs, but after a long winding road, a stacked fs (overlayfs)
turned out to be a much more practical solution.

>
> It sounds much like
> https://learn.microsoft.com/en-us/windows-hardware/drivers/ifs/about-file-system-filter-drivers
>

Nice reference.
I must admit that I found it hard to understand what Windows filter drivers
can do compared to FUSE-BPF design.
It'd be nice to get some comparison from what is planned for FUSE-BPF.

Interesting to note that there is a "legacy" Windows filter driver API,
so Windows didn't get everything right for the first API - that is especially
interesting to look at as repeating other people's mistakes would be a shame.

Thanks,
Amir.

  reply	other threads:[~2023-05-17  6:52 UTC|newest]

Thread overview: 55+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-04-18  1:40 [RFC PATCH bpf-next v3 00/37] FUSE BPF: A Stacked Filesystem Extension for FUSE Daniel Rosenberg
2023-04-18  1:40 ` [RFC PATCH v3 01/37] bpf: verifier: Accept dynptr mem as mem in herlpers Daniel Rosenberg
2023-04-18  1:40 ` [RFC PATCH v3 02/37] bpf: Allow NULL buffers in bpf_dynptr_slice(_rw) Daniel Rosenberg
2023-04-18  1:40 ` [RFC PATCH v3 03/37] selftests/bpf: Test allowing NULL buffer in dynptr slice Daniel Rosenberg
2023-04-18  1:40 ` [RFC PATCH v3 04/37] fs: Generic function to convert iocb to rw flags Daniel Rosenberg
2023-04-18  1:40 ` [RFC PATCH v3 05/37] fuse-bpf: Update fuse side uapi Daniel Rosenberg
2023-04-18  1:40 ` [RFC PATCH v3 06/37] fuse-bpf: Add data structures for fuse-bpf Daniel Rosenberg
2023-04-18  1:40 ` [RFC PATCH v3 07/37] fuse-bpf: Prepare for fuse-bpf patch Daniel Rosenberg
2023-04-18  1:40 ` [RFC PATCH v3 08/37] fuse: Add fuse-bpf, a stacked fs extension for FUSE Daniel Rosenberg
2023-05-02  3:38   ` Alexei Starovoitov
2023-05-03  3:45     ` Amir Goldstein
2023-04-18  1:40 ` [RFC PATCH v3 09/37] fuse-bpf: Add ioctl interface for /dev/fuse Daniel Rosenberg
2023-04-18  1:40 ` [RFC PATCH v3 10/37] fuse-bpf: Don't support export_operations Daniel Rosenberg
2023-04-18  1:40 ` [RFC PATCH v3 11/37] fuse-bpf: Add support for access Daniel Rosenberg
2023-04-18  1:40 ` [RFC PATCH v3 12/37] fuse-bpf: Partially add mapping support Daniel Rosenberg
2023-04-18  1:40 ` [RFC PATCH v3 13/37] fuse-bpf: Add lseek support Daniel Rosenberg
2023-04-18  1:40 ` [RFC PATCH v3 14/37] fuse-bpf: Add support for fallocate Daniel Rosenberg
2023-04-18  1:40 ` [RFC PATCH v3 15/37] fuse-bpf: Support file/dir open/close Daniel Rosenberg
2023-04-18  1:40 ` [RFC PATCH v3 16/37] fuse-bpf: Support mknod/unlink/mkdir/rmdir Daniel Rosenberg
2023-04-18  1:40 ` [RFC PATCH v3 17/37] fuse-bpf: Add support for read/write iter Daniel Rosenberg
2023-05-16 20:21   ` Amir Goldstein
2023-04-18  1:40 ` [RFC PATCH v3 18/37] fuse-bpf: support readdir Daniel Rosenberg
2023-04-18  1:40 ` [RFC PATCH v3 19/37] fuse-bpf: Add support for sync operations Daniel Rosenberg
2023-04-18  1:40 ` [RFC PATCH v3 20/37] fuse-bpf: Add Rename support Daniel Rosenberg
2023-04-18  1:40 ` [RFC PATCH v3 21/37] fuse-bpf: Add attr support Daniel Rosenberg
2023-04-18  1:40 ` [RFC PATCH v3 22/37] fuse-bpf: Add support for FUSE_COPY_FILE_RANGE Daniel Rosenberg
2023-04-18  1:40 ` [RFC PATCH v3 23/37] fuse-bpf: Add xattr support Daniel Rosenberg
2023-04-18  1:40 ` [RFC PATCH v3 24/37] fuse-bpf: Add symlink/link support Daniel Rosenberg
2023-04-18  1:40 ` [RFC PATCH v3 25/37] fuse-bpf: allow mounting with no userspace daemon Daniel Rosenberg
2023-04-18  1:40 ` [RFC PATCH v3 26/37] bpf: Increase struct_op limits Daniel Rosenberg
2023-04-18  1:40 ` [RFC PATCH v3 27/37] fuse-bpf: Add fuse-bpf constants Daniel Rosenberg
2023-04-18  1:40 ` [RFC PATCH v3 28/37] WIP: bpf: Add fuse_ops struct_op programs Daniel Rosenberg
2023-04-27  4:18   ` Andrii Nakryiko
2023-05-03  1:53     ` Daniel Rosenberg
2023-05-03 18:22       ` Andrii Nakryiko
2023-04-18  1:40 ` [RFC PATCH v3 29/37] fuse-bpf: Export Functions Daniel Rosenberg
2023-04-18  1:40 ` [RFC PATCH v3 30/37] fuse: Provide registration functions for fuse-bpf Daniel Rosenberg
2023-04-18  1:40 ` [RFC PATCH v3 31/37] fuse-bpf: Set fuse_ops at mount or lookup time Daniel Rosenberg
2023-04-18  1:40 ` [RFC PATCH v3 32/37] fuse-bpf: Call bpf for pre/post filters Daniel Rosenberg
2023-04-18  1:40 ` [RFC PATCH v3 33/37] fuse-bpf: Add userspace " Daniel Rosenberg
2023-04-18  1:40 ` [RFC PATCH v3 34/37] WIP: fuse-bpf: add error_out Daniel Rosenberg
2023-04-18  1:40 ` [RFC PATCH v3 35/37] tools: Add FUSE, update bpf includes Daniel Rosenberg
2023-04-27  4:24   ` Andrii Nakryiko
2023-04-28  0:48     ` Daniel Rosenberg
2023-04-18  1:40 ` [RFC PATCH v3 36/37] fuse-bpf: Add selftests Daniel Rosenberg
2023-04-18  1:40 ` [RFC PATCH v3 37/37] fuse: Provide easy way to test fuse struct_op call Daniel Rosenberg
2023-04-18  5:33 ` [RFC PATCH bpf-next v3 00/37] FUSE BPF: A Stacked Filesystem Extension for FUSE Amir Goldstein
2023-04-21  1:41   ` Daniel Rosenberg
2023-04-23 14:50     ` Amir Goldstein
2023-04-24 15:32 ` Miklos Szeredi
2023-05-02  0:07   ` Daniel Rosenberg
2023-05-17  2:50     ` Gao Xiang
2023-05-17  6:51       ` Amir Goldstein [this message]
2023-05-17  7:05         ` Gao Xiang
2023-05-17  7:19           ` Gao Xiang

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CAOQ4uxjCebxGxkguAh9s4_Vg7QHM=oBoV0LUPZpb+0pcm3z1bw@mail.gmail.com' \
    --to=amir73il@gmail.com \
    --cc=andrii@kernel.org \
    --cc=ast@kernel.org \
    --cc=bpf@vger.kernel.org \
    --cc=corbet@lwn.net \
    --cc=daniel@iogearbox.net \
    --cc=drosen@google.com \
    --cc=haoluo@google.com \
    --cc=hsiangkao@linux.alibaba.com \
    --cc=joannelkoong@gmail.com \
    --cc=john.fastabend@gmail.com \
    --cc=jolsa@kernel.org \
    --cc=kernel-team@android.com \
    --cc=kpsingh@kernel.org \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-unionfs@vger.kernel.org \
    --cc=martin.lau@linux.dev \
    --cc=miklos@szeredi.hu \
    --cc=mykolal@fb.com \
    --cc=sdf@google.com \
    --cc=shuah@kernel.org \
    --cc=song@kernel.org \
    --cc=yhs@fb.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).