All of lore.kernel.org
 help / color / mirror / Atom feed
From: Miklos Szeredi <miklos@szeredi.hu>
To: Amir Goldstein <amir73il@gmail.com>
Cc: Vivek Goyal <vgoyal@redhat.com>,
	Al Viro <viro@zeniv.linux.org.uk>,
	"linux-unionfs@vger.kernel.org" <linux-unionfs@vger.kernel.org>,
	linux-fsdevel <linux-fsdevel@vger.kernel.org>
Subject: Re: [PATCH v2 05/11] ovl: lookup redirect by file handle
Date: Thu, 27 Apr 2017 16:46:55 +0200	[thread overview]
Message-ID: <CAJfpegv4kbUU1+t6+ZApQmnzbOEo-qiHOTy3Dj0Zh84zw9JeUw@mail.gmail.com> (raw)
In-Reply-To: <CAOQ4uxhQFis+uDL9KbiJ87hNU=QRsjHn6O4LShnt6JJFaOtNuA@mail.gmail.com>

On Thu, Apr 27, 2017 at 3:53 PM, Amir Goldstein <amir73il@gmail.com> wrote:
> On Thu, Apr 27, 2017 at 4:11 PM, Miklos Szeredi <miklos@szeredi.hu> wrote:

>>  - hardlinks: need to set overlay.redirect  when hardlink is created
>> from a copied up file; similar to what we do on rename
>>
>
> Partly agree.
> 1. This is not atomic, because hardlink is not created in workdir.

Doesn't matter; we can set overlay.redirect before we do the hardlink.
If the hardlink fails, we are left with the redirect, but that's not a
problem.

> 2. Reverse mapping will take care of this anyway. Remember?
>     there is an extra hardlink in workdir/inodes with has overlay.redirect
>     set on first alias copy up

Yeah, but I don't really see why we'd need to set overlay.redirect on
copy-up.  When reverse mapping (i.e. trying to reconstruct the overlay
dentry from the lower fh)  we'll just do the same thing as for normal
lookup: use the path from the upper root to the dentry and look up
each component in the lower layers (taking into account any
overlay.redirect encountered).

>>  - for non-redirect case looking up by overlay.origin is almost surely
>> a pessimization
>>
>
> Not sure about that.
> For directories by fh and by name are probably on par -
> At most one lookup of ".." compared to one lookup of d.name.
>
> For non-dir there is a better way that is better than both (see below).

Not that simple (see below).

> So I managed to confuse myself about the technical facts of decoding,
> because I was used to dealing only with decoding of dir handles
> (for snapshots) and just now added decoding of non-dir.
>
> When decoding a directory handle, it is always being connected up to
> root. It sounds harsh, but in fact I think it will always be faster or on par
> with regular lookup by path, because:
> When looking up backwards until the first connected ancestor, filesystem
> always has to read the ".." entry.
> When looking up forward by path, then filesystem need to read entries
> from connected ancestor by name and that is most likely indexed only
> worse then the ".." entry of the backwards lookup.

Problem is there's more going on than just lookup of "..".  In fact it
*must* entail the lookup of "name" as well, because that's the way the
dentry gets connected.  There's an even bigger snag: where do we get
the name?  There's a ->get_name() export op, but most fs don't define
it, and the default action is to iterate the parent dir and find the
one matching our inum.  There goes the performance...

That's why I'm saying it's almost certainly will be slower.  Exception
might be the cached case, but even there lookup by inum might be
slower than the super optimized cached path lookup of a few filenames.
Since we are looking up the overlay dentry, which isn't cached at this
point, so why would the lower ones be?

> When decoding directories you also want to get a connected dentry and
> verifying is_subdir() makes sense.
>
> HOWEVER, and this is big thing that I missed, when decoding a non-dir
> we DON'T need to get a connected dentry.
> It's perfectly fine to get a disconnected alias and getting a disconnected
> alias is always O(1) for the filesystem.

It's O(1) but so is a single component lookup (case without
overlay.redirect).  In the cold cache cache both will be slow, since
most of the time will be spent on getting the inode from disk.  In the
hot cache case, odds are the name lookup will win, since it's the more
optimized codepath...

> The only thing we really need from this alias is to know its inode number
> (and to know that it is still valid).
>
> So if we encode non-connectable fh for non-dir (like knfsd does by default)
> then:
> 1. decoding them will always be faster then any other lookup method
> 2. we cannot verify is_subdir() - so what?
>
> What's the worse thing that can happen if the decoded entry is not under
> the layer anymore? We only use its inode number, and the only thing we
> need to know is that it is unique within the lower layers inode namespace
> and we don't need is_subdir() for that.

Okay.

>
> But I just realized something very very bad about non-samefs case.
> We must use made up st_dev for lower layers, we can certainly no
> longer use the real lower st_dev.
> If we do, then we will have 2 files in 2 different overlay mounts,
> who have the same lower inode but 2 different upper inodes with
> different content and those 2 overlay files will have the same
> st_dev/st_ino.

Yeah, I told you about this issue a couple of mails back.

>
> I just found that in my debian based  kvm-xfstests machine, diff
> reports 2 broken hardlinks with different content as equal, because
> they have the same st_dev/st_ino.
>
> So in conclusion:
> 1. Encode non-connectable file handles to non-dir

Fine by me.

> 2. Always try to lookup non-dir by fh first - it's O(1)

Well... for simplicity sake, okay.  Probably not a big loss.

> 3. Non-samefs needs fake st_dev before reporting constant inodes

Yes.

> 4. Broken hardlinks should NOT report same inode

Yes.

Thanks,
Miklos

  reply	other threads:[~2017-04-27 14:46 UTC|newest]

Thread overview: 69+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-04-24  9:14 [PATCH v2 00/11] overlayfs constant inode numbers Amir Goldstein
2017-04-24  9:14 ` [PATCH v2 01/11] ovl: store path type in dentry Amir Goldstein
2017-04-24 12:59   ` Vivek Goyal
2017-04-24 13:10     ` Amir Goldstein
2017-04-24 13:36       ` Vivek Goyal
2017-04-24 13:41         ` Amir Goldstein
2017-04-24  9:14 ` [PATCH v2 02/11] ovl: cram opaque boolean into type flags Amir Goldstein
2017-04-24  9:14 ` [PATCH v2 03/11] ovl: check if all layers are on the same fs Amir Goldstein
2017-04-24  9:14 ` [PATCH v2 04/11] ovl: store file handle of lower inode on copy up Amir Goldstein
2017-04-24 13:32   ` kbuild test robot
2017-04-24 13:57     ` Amir Goldstein
2017-04-25 14:53   ` Miklos Szeredi
2017-04-26  5:47     ` Amir Goldstein
2017-04-26  9:21       ` Miklos Szeredi
2017-04-26  9:27         ` Amir Goldstein
2017-04-26  9:35           ` Miklos Szeredi
2017-04-26  9:39   ` Miklos Szeredi
2017-04-26  9:53     ` Amir Goldstein
2017-04-26  9:57       ` Miklos Szeredi
2017-04-24  9:14 ` [PATCH v2 05/11] ovl: lookup redirect by file handle Amir Goldstein
2017-04-25  8:10   ` Amir Goldstein
2017-04-25 15:13   ` Miklos Szeredi
2017-04-25 17:41     ` Amir Goldstein
2017-04-25 19:11       ` Amir Goldstein
2017-04-26  9:06         ` Miklos Szeredi
2017-04-26  9:40           ` Amir Goldstein
2017-04-26  9:55             ` Miklos Szeredi
2017-04-26 10:17               ` Amir Goldstein
2017-04-26 12:15                 ` Miklos Szeredi
2017-04-26 14:51                   ` Amir Goldstein
2017-04-27  6:27                     ` Amir Goldstein
2017-04-27  7:48                       ` Miklos Szeredi
2017-04-27  9:22                         ` Amir Goldstein
2017-04-27  9:26                         ` Miklos Szeredi
     [not found]                           ` <CAOQ4uxiweaqzR3eT-StgtDFAHBuYhGRvAJE6v=XpH33MevpmoA@mail.gmail.com>
     [not found]                             ` <CAJfpegtTJmcLVrLOeQbhu4Q6sM0Mi_FRgr+vStF0k95QsWm5uQ@mail.gmail.com>
2017-04-27 13:53                               ` Amir Goldstein
2017-04-27 14:46                                 ` Miklos Szeredi [this message]
2017-04-27 16:08                                   ` Amir Goldstein
2017-04-28  7:25                                     ` Amir Goldstein
2017-04-28  7:55                                       ` Miklos Szeredi
2017-04-28  8:15                                         ` Amir Goldstein
2017-04-28  9:37                                           ` Miklos Szeredi
2017-04-28  9:57                                             ` Amir Goldstein
2017-04-28 10:05                                               ` Miklos Szeredi
2017-04-28 10:45                                                 ` Amir Goldstein
2017-04-27  7:40                     ` Miklos Szeredi
2017-04-24  9:14 ` [PATCH v2 06/11] ovl: lookup non-dir inode copy up origin Amir Goldstein
2017-04-24  9:14 ` [PATCH v2 07/11] ovl: set the COPYUP type flag for non-dirs Amir Goldstein
2017-04-26 14:40   ` Miklos Szeredi
2017-04-26 14:53     ` Miklos Szeredi
2017-04-26 15:02       ` Amir Goldstein
2017-04-26 18:51         ` Amir Goldstein
2017-04-27  9:32         ` Miklos Szeredi
2017-04-26 14:57     ` Amir Goldstein
2017-04-24  9:14 ` [PATCH v2 08/11] ovl: redirect non-dir by path on rename Amir Goldstein
2017-04-24  9:14 ` [PATCH v2 09/11] ovl: constant st_ino/st_dev across copy up Amir Goldstein
2017-04-24  9:14 ` [PATCH v2 10/11] ovl: persistent and constant inode number for directories Amir Goldstein
2017-04-24  9:14 ` [PATCH v2 11/11] ovl: fix du --one-file-system on overlay mount Amir Goldstein
2017-04-24 18:40 ` [PATCH v2 12/12] ovl: persistent inode numbers for hardlinks Amir Goldstein
2017-04-24 18:51 ` [PATCH v2 00/11] overlayfs constant inode numbers Amir Goldstein
2017-04-25 11:52 ` Vivek Goyal
2017-04-25 12:05   ` Amir Goldstein
2017-04-25 12:16 ` Vivek Goyal
2017-04-25 12:41   ` Amir Goldstein
2017-04-25 12:52     ` Vivek Goyal
2017-04-25 13:23       ` Amir Goldstein
2017-04-25 13:29         ` Vivek Goyal
2017-04-25 13:49           ` Amir Goldstein
2017-04-25 13:53             ` Vivek Goyal
2017-04-25 14:20               ` Amir Goldstein

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=CAJfpegv4kbUU1+t6+ZApQmnzbOEo-qiHOTy3Dj0Zh84zw9JeUw@mail.gmail.com \
    --to=miklos@szeredi.hu \
    --cc=amir73il@gmail.com \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-unionfs@vger.kernel.org \
    --cc=vgoyal@redhat.com \
    --cc=viro@zeniv.linux.org.uk \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.