Lazy Loading Layers (Userfaultfd for filesystems?)

* Lazy Loading Layers (Userfaultfd for filesystems?)
@ 2021-01-25 19:48 Sargun Dhillon
  2021-01-26  5:18 ` Amir Goldstein
  2023-05-29 15:15 ` Detaching lower layers (Was: Lazy Loading Layers) Amir Goldstein
  0 siblings, 2 replies; 5+ messages in thread
From: Sargun Dhillon @ 2021-01-25 19:48 UTC (permalink / raw)
  To: linux-unionfs

One of the projects I'm playing with for containers is lazy-loading of layers. 
We've found that less than 10% of the files on a layer actually get used, which 
is an unfortunate waste. It also means in some cases downloading ~100s of MB, or 
~1s of GB of files before starting a container workload. This is unfortunate.

It would be nice if there was a way to start a container workload, and have
it so that if it tries to access and unpopulated (not yet downloaded) part
of the filesystem block while trying to be accessed. This is trivial to do
if the "lowest" layer is FUSE, where one can just stall in userspace on
loads. Unfortunately, AFAIK, there's not a good way to swap out the FUSE
filesystem with the "real" filesystem once it's done fully populating,
and you have to pay for the full FUSE cost on each read / write.

I've tossed around:
1. Mutable lowerdirs and having something like this:

layer0 --> Writeable space
layer1 --> Real XFS filesystem
layer2 --> FUSE FS

and if there is a "miss" on layer 1, it will then look it up on
layer 2 while layer 1 is being populated. Then the FUSE FS can block.
This is neat, but it requires the FUSE FS to always be up, and incurs
a userspace bounce on every miss.

It also means things like metadata only copies don't work.

Does anyone have a suggestion of a mechanism to handle this? I've looked into 
swapping out layers on the fly, and what it would take to add a mechanism like 
userfaultfd to overlayfs, but I was wondering if anything like this was already 
built, or if someone has thought it through more than me.

^ permalink raw reply	[flat|nested] 5+ messages in thread