qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
From: Max Reitz <mreitz@redhat.com>
To: Miklos Szeredi <miklos@szeredi.hu>, Vivek Goyal <vgoyal@redhat.com>
Cc: virtio-fs-list <virtio-fs@redhat.com>,
	qemu-devel@nongnu.org, Stefan Hajnoczi <stefanha@redhat.com>,
	"Dr . David Alan Gilbert" <dgilbert@redhat.com>
Subject: Re: [PATCH 1/3] virtiofsd: Find original inode ID of mount points
Date: Fri, 4 Jun 2021 18:22:28 +0200	[thread overview]
Message-ID: <497d1ed1-fbd4-0aad-1349-826be8b49834@redhat.com> (raw)
In-Reply-To: <CAJfpegtKjx7y8+bMQUivjxHuF0iiLYxKCtd0H3AisVymHHjtTA@mail.gmail.com>

On 02.06.21 20:59, Miklos Szeredi wrote:
> On Wed, 2 Jun 2021 at 20:20, Vivek Goyal <vgoyal@redhat.com> wrote:
>> On Wed, May 12, 2021 at 02:55:42PM +0200, Max Reitz wrote:
>>> Mount point directories represent two inodes: On one hand, they are a
>>> normal directory on their parent filesystem.  On the other, they are the
>>> root node of the filesystem mounted there.  Thus, they have two inode
>>> IDs.
>>>
>>> Right now, we only report the latter inode ID (i.e. the inode ID of the
>>> mounted filesystem's root node).  This is fine once the guest has
>>> auto-mounted a submount there (so this inode ID goes with a device ID
>>> that is distinct from the parent filesystem), but before the auto-mount,
>>> they have the device ID of the parent and the inode ID for the submount.
>>> This is problematic because this is likely exactly the same
>>> st_dev/st_ino combination as the parent filesystem's root node.  This
>>> leads to problems for example with `find`, which will thus complain
>>> about a filesystem loop if it has visited the parent filesystem's root
>>> node before, and then refuse to descend into the submount.
>>>
>>> There is a way to find the mount directory's original inode ID, and that
>>> is to readdir(3) the parent directory, look for the mount directory, and
>>> read the dirent.d_ino field.  Using this, we can let lookup and
>>> readdirplus return that original inode ID, which the guest will thus
>>> show until the submount is auto-mounted.  (Then, it will invoke getattr
>>> and that stat(2) call will return the inode ID for the submount.)
>> [ CC miklos ]
>>
>> Hi Max,
>>
>> Though we discussed this in chat room, I am still responding to this
>> email with the concern I have, so that there is a record of it.

That sounds scary :)

>> So with this patch for FUSE_LOOKUP  we always return submount's parentinode
>> id and with GETATTR request we return actual inode id of submount. That
>> kind of bothers me a bit as we are assuming that there is always going
>> to be a GETATTR request after FUSE_LOOKUP. FUSE_LOOKUP itself has attrs
>> returned with it and it might happen that after FUSE_LOOKUP, FUSE_GETATTR
>> might not be called at all because FUSE_LOOKUP itself got the latest
>> updated attrs with certain timeout.
>>
>> For example, if I call stat on a normal file (not submount), I see that
>> after FUSE_LOOKUP, no FUSE_GETATTR request is going and
>> fuse_update_get_attr() is using attrs from locally cached inode attrs.
>>
>> But same thing does not seem to be happening in case of submount. Once
>> submount is created in guest, I see that we never seem to be calling
>> ->revalidate() on newly created dentry of submount root. I am not sure
>> why. And that means we don't do FUSE_LOOKUP and that means FUSE_GETATTR
>> always gets called.

Even if it worked reliably, I have to admit it’s kind of relying on at 
most implementation-defined behavior.  Having two inodes would 
definitely be the right solution.

> Yes, this sounds normal.
>
> The lookup sequence for "/mnt/virtiofs/dir1/submount/file" will be:
>
> LOOKUP(1, "dir")
>      create inode I1 in sb1
>      create dentry "dir" with inode I1 in sb1
> LOOKUP(I1, "submount")
>      create inode I2 in sb1, set S_AUTOMOUNT
>      create dentry "submount" with inode I2 in sb1
>      automount triggered:
>      create sb2
>      create inode I2 in sb2
>      create dentry "/" with inode I2 in sb2
> GETATTR(I2)
>       fill inode I2
> LOOKUP(I2, "file")
>       create inode I3
>       create dentry "file" with inode I3 in sb2
>
> Notice how there's two inodes numbered I2, but in separate sb's.
> However the server has only the notion of a single I2 inode which is
> the root of the submount, since the mountpoint is not visible (except
> for d_ino in readdir()).
>
> Now AFAICS what this patch does is set the inode number in the
> attributes returned by LOOKUP(I1, "submount") to the covered inode, so
> that AT_NO_AUTOMOUNT stat will return the correct value.   While this
> seems to work, it's not a fundamental fix to the problem, since the
> attributes on sb1/I2 will time out and the next stat(...,
> AT_NO_AUTOMOUNT) will trigger a GETATTR request, which will return the
> inode number of the submount root.

Ah, yeah.

> Solving this properly would mean that the server would have to
> generate separate nodeid's for the mountpoint and the submount root
> and the protocol would have to be extended so that this information
> could be transferred to the client.

Yes, we’d somehow need to do a separate lookup for the root inode of the 
submount...

> Not sure whether this is worth it or not.

I’m wondering the same.  This series was mostly the result of an 
incidental finding and seeing that getting it to work and do what I want 
seemed to be pretty easy.  Turns out doing something right can never 
really be easy...

But we have already decided that we don’t deem the inode ID problem too 
important (not least considering NFS has the same issue), so fixing it 
is not really top priority.  Maybe I’ll get back to it, but I think for 
now I consider it on the backlog.

Max



  reply	other threads:[~2021-06-04 17:03 UTC|newest]

Thread overview: 14+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-05-12 12:55 [PATCH 0/3] virtiofsd: Find original inode ID of mount points Max Reitz
2021-05-12 12:55 ` [PATCH 1/3] " Max Reitz
2021-05-12 15:59   ` Connor Kuehl
2021-05-17 14:57   ` Vivek Goyal
2021-05-17 17:26     ` Max Reitz
2021-05-20 11:28       ` Dr. David Alan Gilbert
2021-05-26 18:13   ` Vivek Goyal
2021-05-26 18:50     ` [Virtio-fs] " Vivek Goyal
2021-05-27 15:00       ` Max Reitz
2021-06-02 18:19   ` Vivek Goyal
2021-06-02 18:59     ` Miklos Szeredi
2021-06-04 16:22       ` Max Reitz [this message]
2021-05-12 12:55 ` [PATCH 2/3] virtiofs_submounts.py: Do not generate ssh key Max Reitz
2021-05-12 12:55 ` [PATCH 3/3] virtiofs_submounts.py: Check `find` Max Reitz

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=497d1ed1-fbd4-0aad-1349-826be8b49834@redhat.com \
    --to=mreitz@redhat.com \
    --cc=dgilbert@redhat.com \
    --cc=miklos@szeredi.hu \
    --cc=qemu-devel@nongnu.org \
    --cc=stefanha@redhat.com \
    --cc=vgoyal@redhat.com \
    --cc=virtio-fs@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).