From: Miklos Szeredi <miklos@szeredi.hu>
To: Eugene Zemtsov <ezemtsov@google.com>
Cc: linux-fsdevel@vger.kernel.org, Al Viro <viro@zeniv.linux.org.uk>,
	"Theodore Ts'o" <tytso@mit.edu>,
	Amir Goldstein <amir73il@gmail.com>,
	Richard Weinberger <richard.weinberger@gmail.com>
Subject: Re: Initial patches for Incremental FS
Date: Fri, 3 May 2019 06:22:23 -0400
Message-ID: <CAJfpeguyajzHwhae=4PWLF4CUBorwFWeybO-xX6UBD2Ekg81fg@mail.gmail.com>
In-Reply-To: <CAK8JDrFZW1jwOmhq+YVDPJi9jWWrCRkwpqQ085EouVSyzw-1cg@mail.gmail.com>

On Fri, May 3, 2019 at 12:23 AM Eugene Zemtsov <ezemtsov@google.com> wrote:
>
> On Thu, May 2, 2019 at 6:26 AM Al Viro <viro@zeniv.linux.org.uk> wrote:
> >
> > Why not CODA, though, with local fs as cache?
>
> On Thu, May 2, 2019 at 4:20 AM Amir Goldstein <amir73il@gmail.com> wrote:
> >
> > This sounds very useful.
> >
> > Why does it have to be a new special-purpose Linux virtual filesystem?
> > Why not FUSE, which is meant for this purpose?
> > Those are things that you should explain when you are proposing a new
> > filesystem, but I will answer for you: because a FUSE page fault will
> > incur high latency even after blocks are locally available in your
> > backend store. Right?
> >
> > How about fscache support for FUSE then?
> > You can even write your own fscache backend if the existing ones don't
> > fit your needs for some reason.
> >
> > Piling logic into the kernel is not the answer.
> > Adding the missing interfaces to the kernel is the answer.
> >
>
> Thanks for the interest and feedback. What I dreaded most was silence.
>
> I probably should have given a bit more detail in the introductory email.
> Important features we’re aiming for:
>
> 1. An attempt to read a missing data block gives a userspace data loader a
> chance to fetch it. Once a block is loaded (in advance or after a page fault)
> it is saved into a local backing storage and following reads of the same block
> are done directly by the kernel. [Implemented]
>
> 2. Block level compression. It saves space on a device, while still allowing
> very granular loading and mapping. Less granular compression would trigger
> loading of more data than absolutely necessary, and that’s the thing we
> want to avoid. [Implemented]
>
> 3. Block level integrity verification. The signature scheme is similar to
> dm-verity or fs-verity. In other words, each file has a Merkle tree with
> crypto-digests of 4KB blocks. The root digest is signed with RSASSA or ECDSA.
> Each time a data block is read, its digest is calculated and checked against
> the Merkle tree; if the check fails, the read operation fails as well.
> Ideally I’d like to use fs-verity API for that. [Not implemented yet.]
>
> 4. New files can be pushed into incremental-fs “externally” when an app needs
> a new resource or a binary. This is needed for situations when a new resource
> or a new version of code is available, e.g. a user just changed the system
> language to Spanish, or a developer rolled out an app update.
> Things change over time and this means that we can’t just incrementally
> load a precooked ext4 image and mount it via a loopback device. [Implemented]
>
> 5. No need to support writes or file resizing. It eliminates a lot of
> complexity.
>
> Not all of these features are implemented yet, but they will all be
> needed to achieve our goals:
> - Apps can be delivered incrementally without having to wait for extra data.
>   At the same time, given enough time, the app can be downloaded fully
>   without having to keep a connection open after that.
> - An app’s integrity should be verifiable without having to read all its blocks.
> - Local storage and battery need to be conserved.
> - App binaries and resources can change over time.
>   Such changes are triggered by external events.
>
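
For readers following item 3 of the quoted list, the per-block check can be
sketched roughly as below. This is only an illustration, not the incfs or
fs-verity code: SHA-256, the function names, and the "duplicate the last
digest on odd levels" padding are all assumptions made for the sketch, and
the RSASSA/ECDSA signature check on the root is left out (trusted_root is
assumed to have been verified already).

```python
import hashlib
from typing import List

BLOCK_SIZE = 4096  # 4KB blocks, as in the quoted description


def _h(data: bytes) -> bytes:
    # Assumption: SHA-256 as the digest; the thread does not name one.
    return hashlib.sha256(data).digest()


def build_merkle_tree(data: bytes) -> List[List[bytes]]:
    """Return tree levels: levels[0] holds leaf digests, levels[-1] == [root]."""
    blocks = [data[i:i + BLOCK_SIZE] for i in range(0, len(data), BLOCK_SIZE)] or [b""]
    level = [_h(b) for b in blocks]
    levels = [level]
    while len(level) > 1:
        if len(level) % 2:
            # Simplification: pair a trailing odd node with itself.
            level = level + [level[-1]]
        level = [_h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
        levels.append(level)
    return levels


def proof_for_block(levels: List[List[bytes]], index: int) -> List[bytes]:
    """Collect the sibling digests needed to recompute the root for one block."""
    proof = []
    for level in levels[:-1]:
        sibling = index ^ 1
        if sibling >= len(level):
            sibling = index  # odd level: the node was paired with itself
        proof.append(level[sibling])
        index //= 2
    return proof


def verify_block(block: bytes, index: int, proof: List[bytes], trusted_root: bytes) -> bool:
    """Recompute the root from a single block; a mismatch should fail the read."""
    digest = _h(block)
    for sibling in proof:
        if index % 2 == 0:
            digest = _h(digest + sibling)
        else:
            digest = _h(sibling + digest)
        index //= 2
    return digest == trusted_root
```

The point of the structure is that verifying one block needs only that block
plus one digest per tree level, so integrity can be checked without reading
the rest of the file.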

Good summary.  I understand the requirements better now.

I still have issues with this design, because it looks very Android
specific.  For example, I know that lazy download is heavily used in
distributed computing (see cernvm-fs), so it's not a requirement
specific to Android.  By bundling these features together into a kernel
module you are basically limiting the user base and hence possibly
missing out on some of the advantages of having a more varied user
base.

I wonder how many of the performance issues with the fuse prototype
were caused by 4k reads and disabled readahead?  I know you require
that for the data loading part, but it would be trivial to turn that
behavior off once everything is in place.  Does the prototype do
that?  Have you tried doing that?  Is the prototype in good enough
shape to perhaps move it to a public repository for review?

I'm also wondering about some of the features you describe above.
Why a new block fs?  A normal fs (ext4) provides most of those things:
you can add files to it, etc...  The one thing it doesn't provide is
compression, and that's because it's hard for the non-incremental
case.  So do we really need a new disk format for this?  Or can the
missing compression feature (perhaps with limits) be implemented in
ext4/f2fs?  In that case we can even take that work off of fuse and
just leave the loading to the fuse part.  Cernvm-fs does that with a
fuse fs on the lower layer that does lazy downloading, and puts
already downloaded files in an upper layer of overlayfs for faster
access, but it's possible that there's a better way of doing that not
involving even overlayfs.
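
To make the read path under discussion concrete (fetch a missing block once
through a userspace loader, keep it locally, serve later reads without the
loader), here is a rough userspace sketch. It is not how incfs, fuse or
cernvm-fs implement it: the LazyBlockFile class, the loader callback and the
in-memory dict standing in for local backing storage are hypothetical, and
zlib is just a stand-in for whatever per-block compression would be used.

```python
import zlib
from typing import Callable, Dict

BLOCK_SIZE = 4096


class LazyBlockFile:
    """Hypothetical read-only file whose blocks are fetched on first access."""

    def __init__(self, loader: Callable[[int], bytes]):
        self._loader = loader                # stand-in for the userspace data loader
        self._store: Dict[int, bytes] = {}   # stand-in for local backing storage
        self.loader_calls = 0

    def read_block(self, index: int) -> bytes:
        packed = self._store.get(index)
        if packed is None:
            # Miss: ask the loader once, then keep the block locally.
            self.loader_calls += 1
            block = self._loader(index)
            # Each block is compressed on its own, so one block can be
            # loaded and decompressed without touching its neighbours.
            packed = zlib.compress(block)
            self._store[index] = packed
        return zlib.decompress(packed)


def fake_loader(index: int) -> bytes:
    # Hypothetical loader: a real one would download the block.
    return bytes([index % 256]) * BLOCK_SIZE


f = LazyBlockFile(fake_loader)
first = f.read_block(3)   # goes through the loader
second = f.read_block(3)  # served from the local store
```

Only the first read of a block reaches the loader; whether that local store
is a new disk format, ext4/f2fs with compression, or an overlayfs upper
layer is exactly the open question above.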

Thanks,
Miklos
