From: Amir Goldstein <amir73il@gmail.com>
To: Miklos Szeredi <miklos@szeredi.hu>
Cc: Vivek Goyal <vgoyal@redhat.com>,
	Al Viro <viro@zeniv.linux.org.uk>,
	"linux-unionfs@vger.kernel.org" <linux-unionfs@vger.kernel.org>,
	linux-fsdevel <linux-fsdevel@vger.kernel.org>
Subject: Re: [PATCH v2 05/11] ovl: lookup redirect by file handle
Date: Thu, 27 Apr 2017 09:27:13 +0300
Message-ID: <CAOQ4uxiGzp2G3qSC7be3nVnvjYJZuHAmY7Q1Oc+jM+pZWtuHig@mail.gmail.com>
In-Reply-To: <CAOQ4uxhi4GReBbBzd8NdCJ+emzMEa1COYAtcaKziMiaZAV96uw@mail.gmail.com>

On Wed, Apr 26, 2017 at 5:51 PM, Amir Goldstein <amir73il@gmail.com> wrote:
> On Wed, Apr 26, 2017 at 3:15 PM, Miklos Szeredi <miklos@szeredi.hu> wrote:
>> On Wed, Apr 26, 2017 at 12:17 PM, Amir Goldstein <amir73il@gmail.com> wrote:
>>> On Wed, Apr 26, 2017 at 12:55 PM, Miklos Szeredi <miklos@szeredi.hu> wrote:
>>>> On Wed, Apr 26, 2017 at 11:40 AM, Amir Goldstein <amir73il@gmail.com> wrote:
>>>>
>>>>> Just to see that I understand you correctly.
>>>>>
>>>>> I am now working on storing the following:
>>>>>
>>>>> /*
>>>>>  * The tuple origin.{fh,layer,uuid} is a universal unique identifier
>>>>>  * for a copy up origin, where:
>>>>>  * origin.fh    - exported file handle of the lower file
>>>>>  * origin.root - exported file handle of the lower layer root
>>>>>  * origin.uuid  - uuid of the lower filesystem
>>>>
>>>> I wouldn't even store origin.root.
>>>>
>>>>>  *
>>>>>  * origin.{fh,root} are stored in format of a variable length binary blob
>>>>>  * with struct ovl_fh header (total blob size up to 20 bytes).
>>>>>  * uuid is stored in raw format (16 bytes) as published by sb->s_uuid.
>>>>>  */
>>>>>
>>>>> I intend to implement lookup as follows:
>>>>> - compare(origin.uuid, same_lower_sb->s_uuid)
>>>>> # layer root dentries cannot be DCACHE_DISCONNECTED, so
>>>>> # exportfs_decode_fh ignores mnt arg and returns the cached dentry
>>>>> - root = exportfs_decode_fh(lowerstack[0].mnt, origin.root)
>>>>> - find layer where lowerstack[layer].dentry == root
>>>>> - this = exportfs_decode_fh(lowerstack[layer].mnt, origin.fh)
>>>>>
>>>>> is_subdir() is NOT needed for decoding the layer root
>>>>> is_subdir() is optional for decoding the lower file, because
>>>>> it is not needed to identify the layer
>>>>
>>>> Hmm, we can just force exportfs_decode_fh() to return a connected
>>>> dentry (return false from *acceptable() if the dentry is disconnected)
>>>> before going on to iterate the layers to see which one contains it.
>>>>
>>>
>>> Hmm, this might work, but to quote from exportfs_decode_fh():
>>> "It's not a directory.  Life is a little more complicated."
>>>
>>> IIUC, 'connected' means 'connected to sb root', and not
>>> 'connected to mnt root', so in the optimal case where
>>> all lower dentries are cached,  exportfs_decode_fh() will return
>>> a connected dentry for every fh we give it regardless of the
>>> mnt argument, so we will have to use is_subdir() to find the
>>> right layer, which brings us back to O(numlower*depth)
>>
>> It just means that we might have to make up an artificial mount which
>> has its root at the sb root to be able to decode the handle into a
>> connected one.
>>
>
> I'm not sure I understand what this artificial mount buys us.

Let me try to explain the problem with a worst-case, but not
improbable, example:

Suppose I have an overlay with a deep file at /a/b/c/.../z.
Suppose the layers are at /old/{lower,upper}, and I copy them
over to /new/{lower,upper} and mount the overlay at the new path.

Suppose that dcache is fully populated under /new and fully
evicted under /old.

When trying to decode the file handle for z, exportfs_decode_fh()
will call the file system to actually read all directories a..z from disk
in order to reconnect the dentry of old z all the way up to /old
and it will do that *before* calling the acceptable() callback.

Alternatively, if we first try to decode the file handle for /old/lower,
decoding will be very fast (most likely already in cache), we will see
that the result is not one of the current layer roots, and we will not
have to go on to decode z and read all directories a..z from disk.

This is why and how I implemented lookup by origin.{root+fh}
in the v3 patch set.
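
To put the cost argument in concrete terms, here is a toy userspace
model. It is not kernel code and not what the v3 patches contain.
struct node, the cached flag, reconnect_cost() and the two lookup_*()
helpers are all made up for illustration, and counting one directory
read per uncached ancestor is an assumption of the model, not a
measurement of exportfs_decode_fh().

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for a dentry: a parent link and a "cached" flag. */
struct node {
	struct node *parent;	/* NULL for the sb root */
	int cached;		/* 1 = in dcache, 0 = must be read from disk */
};

/*
 * Model of reconnecting a decoded dentry up to the sb root:
 * every uncached ancestor costs one simulated directory read.
 */
static int reconnect_cost(struct node *n)
{
	int reads = 0;

	assert(n != NULL);
	for (; n; n = n->parent) {
		if (!n->cached) {
			reads++;
			n->cached = 1;
		}
	}
	return reads;
}

/* Pointer-compare the decoded root with each layer root: O(numlower). */
static int find_layer_by_root(struct node **layer_roots, int numlower,
			      const struct node *decoded_root)
{
	int i;

	for (i = 0; i < numlower; i++)
		if (layer_roots[i] == decoded_root)
			return i;
	return -1;
}

/* Root first: the file is decoded only if the root matches a layer. */
static int lookup_root_first(struct node **layer_roots, int numlower,
			     struct node *origin_root,
			     struct node *origin_file, int *reads)
{
	int layer;

	*reads = reconnect_cost(origin_root);
	layer = find_layer_by_root(layer_roots, numlower, origin_root);
	if (layer < 0)
		return -1;	/* stale origin, never touch the file */
	*reads += reconnect_cost(origin_file);
	return layer;
}

/* File first: the full reconnect is paid before the handle is rejected. */
static int lookup_file_first(struct node **layer_roots, int numlower,
			     struct node *origin_root,
			     struct node *origin_file, int *reads)
{
	*reads = reconnect_cost(origin_file);
	return find_layer_by_root(layer_roots, numlower, origin_root);
}
```

In this model the wasted work for a stale origin is bounded by the depth
of the layer root in the root-first variant, and by the depth of the file
in the file-first variant.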

>
>>>
>>> With the extra cost of storing the deducible information origin.root,
>>> we will have less complex and more efficient lookup code.
>>>
>>> Let me try and implement it and see if I am right.
>>> We can always discard origin.root from v4 if it turns
>>> out to be unhelpful.
>>
>> I don't have good feelings about storing the root fh just because we
>> don't special case the layer root anywhere yet, and I wouldn't want to
>> do that unless there's a good reason.
>>

Wait, what do you mean by "we don't special case the layer root"?
Do you mean that we could mount an overlay at a subdir path?
i.e. in the example above, we could mount an overlay with
upperdir=/new/upper/a/b/c,lowerdir=/new/lower/a/b/c?

If this is what you mean, then it is not true that we don't special
case the layer root. We already do, because path redirects are
relative to the layer root. If anything, we should be storing
origin.root along with overlay.redirect in order to verify that we
are not redirecting into the wrong relative path.

>
> There are a few reasons for origin.root, not sure if they are good:
> 1. lookup is O(numlower+depth) instead of O(numlower*depth)
> 2. origin.uuid validates that we are still on the same sb
>     origin.root validates that we are still using the same lower dirs
>     and that files from old lower were not moved around to find themselves
>     inside a different lower dir
> 3. hardlinks between layers (!!!) will still get to the right layer
>
> I personally think that reason #1 is the important one, but I think we
> disagree on the technical details of exportfs_decode_fh() and we
> need to sort this out.
>
> Here is my untested implementation of find layer by uuid/rootfh
> with the relevant comments. Maybe it helps you point out what
> I am missing or what you are missing:
>
> /* Find lower layer index by layer root file handle and uuid */
> static int ovl_find_layer_by_fh(struct dentry *dentry,
>                                 struct ovl_lookup_data *d)
> {
>         struct ovl_entry *roe = dentry->d_sb->s_root->d_fsdata;
>         struct super_block *lower_sb = ovl_same_lower_sb(dentry->d_sb);
>         struct dentry *this;
>         int i;
>
>         /*
>          * For now, we only support lookup by fh for all lower layers on the
>          * same sb.  Not all filesystems set sb->s_uuid.  For those who don't
>          * this code will compare zeros, which at least ensures us that the
>          * file handles are not crossing from filesystem with sb->s_uuid to
>          * a filesystem without sb->s_uuid and vice versa.
>          */
>         if (!lower_sb || memcmp(lower_sb->s_uuid, &d->uuid, sizeof(d->uuid)))
>                 return -1;
>
>         /*
>          * Layer root dentries are pinned, there are no aliases for dirs, and
>          * all lower layers are on the same sb.  If rootfh is correct,
>          * exportfs_decode_fh() will find it in dcache and return the only
>          * instance, regardless of the mnt argument and we can compare the
>          * returned pointer with the pointers in lowerstack.
>          */
>         this = ovl_decode_fh(roe->lowerstack[0].mnt, d->rootfh, ovl_is_dir);
>         if (IS_ERR(this))
>                 return -1;
>
>         for (i = 0; i < roe->numlower; i++) {
>                 if (this == roe->lowerstack[i].dentry)
>                         break;
>         }
>
>         dput(this);
>         return i < roe->numlower ? i : -1;
> }
>
> Amir.
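
For reference, the uuid gate at the top of the quoted function can be
modelled in userspace as below. This is only a sketch: struct fake_sb,
UUID_LEN and origin_uuid_matches() are hypothetical names, and a NULL
sb stands in for the case where the lower layers are not all on the
same sb. It shows the behaviour the comment describes: two all-zero
uuids compare equal, while a zero uuid never matches a real one.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define UUID_LEN 16

/* Hypothetical stand-in for the one super_block field used here. */
struct fake_sb {
	unsigned char s_uuid[UUID_LEN];
};

/*
 * Returns 1 if the stored origin uuid matches the lower sb uuid,
 * 0 otherwise.  Filesystems that do not set a uuid compare as zeros.
 */
static int origin_uuid_matches(const struct fake_sb *lower_sb,
			       const unsigned char *stored_uuid)
{
	assert(stored_uuid != NULL);
	if (!lower_sb)
		return 0;
	return memcmp(lower_sb->s_uuid, stored_uuid, UUID_LEN) == 0;
}
```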
