linux-unionfs.vger.kernel.org archive mirror
From: Sargun Dhillon <sargun@sargun.me>
To: linux-unionfs@vger.kernel.org
Subject: Lazy Loading Layers (Userfaultfd for filesystems?)
Date: Mon, 25 Jan 2021 19:48:49 +0000
Message-ID: <20210125194848.GA12389@ircssh-2.c.rugged-nimbus-611.internal>

One of the projects I'm playing with for containers is lazy-loading of layers.
We've found that less than 10% of the files on a layer actually get used, so
most of the download is wasted. In some cases it also means downloading
hundreds of MB, or even several GB, of files before starting a container
workload.

It would be nice if there was a way to start a container workload, and have
it so that if it tries to access an unpopulated (not yet downloaded) part
of the filesystem, the access blocks until that part has been populated.
This is trivial to do if the "lowest" layer is FUSE, where one can just
stall in userspace on loads. Unfortunately, AFAIK, there's not a good way
to swap out the FUSE filesystem with the "real" filesystem once it's done
fully populating, so you have to pay the full FUSE cost on each read / write.
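
As a rough illustration of the "stall in userspace on loads" idea, here is a
minimal in-process sketch in Python. It is not a FUSE filesystem and uses no
FUSE library: the class name LazyFileStore and its populate() / read()
methods are hypothetical, and a real daemon would instead delay its reply to
the kernel's read request until the data has been downloaded.

```python
import threading


class LazyFileStore:
    """Toy model of a lazily populated layer (hypothetical, not real FUSE)."""

    def __init__(self, paths):
        # One event per known file; it is set once the contents have arrived.
        self._ready = {p: threading.Event() for p in paths}
        self._data = {}

    def populate(self, path, contents):
        # Called by an assumed background downloader when a file is fetched.
        self._data[path] = contents
        self._ready[path].set()

    def read(self, path, timeout=None):
        # Block the caller until the file is populated, mirroring how a
        # userspace filesystem could hold a read until the data exists.
        event = self._ready.get(path)
        if event is None:
            raise FileNotFoundError(path)
        if not event.wait(timeout):
            raise TimeoutError(f"{path} not populated within {timeout}s")
        return self._data[path]


if __name__ == "__main__":
    store = LazyFileStore(["/bin/sh"])
    # Simulate the download finishing shortly after the workload starts.
    threading.Timer(0.1, store.populate, args=("/bin/sh", b"contents")).start()
    print(store.read("/bin/sh"))
```

Every read in this model goes through the blocking path, including reads of
files that are already populated, which corresponds to the per-access FUSE
cost described above.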

I've tossed around:
1. Mutable lowerdirs and having something like this:

layer0 --> Writeable space
layer1 --> Real XFS filesystem
layer2 --> FUSE FS

and if there is a "miss" on layer 1, it will then look it up on
layer 2 while layer 1 is being populated. Then the FUSE FS can block.
This is neat, but it requires the FUSE FS to always be up, and incurs
a userspace bounce on every miss.

It also means things like metadata-only copies don't work.
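
The lookup order in option 1 can be modeled with a small Python sketch.
Plain dicts stand in for the filesystems, and the names Layer, lookup() and
promote() are hypothetical; this only illustrates the proposed miss-then-fall-
through behaviour and does not reflect anything overlayfs exposes.

```python
class Layer:
    """A named layer whose files are modeled as a path -> bytes dict."""

    def __init__(self, name, files=None):
        self.name = name
        self.files = dict(files or {})


def lookup(path, layers):
    """Return (layer_name, contents) from the first layer holding path."""
    for layer in layers:
        if path in layer.files:
            return layer.name, layer.files[path]
    raise FileNotFoundError(path)


def promote(path, src, dst):
    # Models background population: copy a file from the FUSE layer
    # into the real filesystem layer so later lookups hit it directly.
    dst.files[path] = src.files[path]


if __name__ == "__main__":
    layer0 = Layer("layer0")                          # writeable space
    layer1 = Layer("layer1")                          # real filesystem, empty
    layer2 = Layer("layer2", {"/bin/sh": b"binary"})  # FUSE, has everything
    stack = [layer0, layer1, layer2]

    print(lookup("/bin/sh", stack)[0])  # miss on layer1, served by layer2
    promote("/bin/sh", layer2, layer1)
    print(lookup("/bin/sh", stack)[0])  # now served by layer1
```

Until promote() has run for a given path, every lookup of that path falls
through to the FUSE layer, which is the userspace bounce on each miss
mentioned above.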

Does anyone have a suggestion of a mechanism to handle this? I've looked into 
swapping out layers on the fly, and what it would take to add a mechanism like 
userfaultfd to overlayfs, but I was wondering if anything like this was already 
built, or if someone has thought it through more than me.



Thread overview: 3+ messages
2021-01-25 19:48 Sargun Dhillon [this message]
2021-01-26  5:18 ` Lazy Loading Layers (Userfaultfd for filesystems?) Amir Goldstein
2021-01-26 13:12   ` Alessio Balsini
