From mboxrd@z Thu Jan 1 00:00:00 1970 From: Max Reitz References: <20210609155551.44437-1-mreitz@redhat.com> <20210609155551.44437-8-mreitz@redhat.com> <20210611200459.GB767764@redhat.com> <9cea5642-e5ea-961f-d816-0998e52aad9f@redhat.com> <20210617212143.GD1142820@redhat.com> <1e5dafd2-34e0-1a25-2cb5-6822eaf2502c@redhat.com> <20210618182901.GB1252241@redhat.com> Message-ID: Date: Tue, 13 Jul 2021 17:07:31 +0200 MIME-Version: 1.0 In-Reply-To: Content-Type: text/plain; charset="utf-8"; format="flowed" Content-Transfer-Encoding: 8bit Content-Language: en-US Subject: Re: [Virtio-fs] [PATCH v2 7/9] virtiofsd: Add inodes_by_handle hash table List-Id: Development discussions about virtio-fs List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , To: Vivek Goyal Cc: virtio-fs@redhat.com, qemu-devel@nongnu.org So I’m coming back to this after three weeks (well, PTO), and this again turns into a bit of a pain, actually. I don’t think it’s anything serious, but I had thought we had found something that would make us both happy because it wouldn’t be too ugly, and now it’s turning ugly again...  So I’m sending this mail as a heads up before I send v3 in the next days, to explain my thought process. On 21.06.21 11:02, Max Reitz wrote: > On 18.06.21 20:29, Vivek Goyal wrote: > [...] >> I am still reading your code and trying to understand it. But one >> question came to mind. What happens if we can generate file handle >> during lookup. But can't generate when same file is looked up again. >> >> - A file foo.txt is looked. We can create file handle and we add it >>    to lo->inodes_by_handle as well as lo->inodes_by_ds. >> >> - Say somebody deleted file and created again and inode number got >>    reused. >> >> - Now during ->revalidation path, lookup happens again. This time say >>    we can't generate file handle. If am reading lo_do_find() code >>    correctly, it will find the old inode using ids and return same >>    inode as result of lookup. And we did not recognize that inode >>    number has been reused. > > Oh, that’s a good point.  If an lo_inode has no O_PATH fd but is only > addressed by handle, we must always look it up by handle. Also, just wanted to throw in this remark: Now that I read the code again, lo_do_find() already has a condition to prevent this.  It’s this: if (p && fhandle != NULL && p->fhandle != NULL) {     p = NULL; } There’s just one thing wrong with it, and that is the `fhandle != NULL` part.  It has no place there.  But this piece of code does exactly what we’d need it do if it were just: if (p && p->fhandle != NULL) {     p = NULL; } [...] > However, you made a good point in that we must require > name_to_handle_at() to work if it worked before for some inode, not > because it would be simpler, but because it would be wrong otherwise. > > As for the other way around...  Well, now I don’t have a strong > opinion on it.  Handling temporary name_to_handle_at() failure after > it worked the first time should not add extra complexity, but it > wouldn’t be symmetric.  Like, allowing temporary failure sometimes but > not at other times. (I think I mistyped here, it should be “Handling name_to_handle_at() randomly working after it failed the first time”.) > The next question is, how do we detect temporary failure, because if > we look up some new inode, name_to_handle_at() fails, we ignore it, > and then it starts to work and we fail all further lookups, that’s not > good.  We should have the first lookup fail.  I suppose ENOTSUPP means > “OK to ignore”, and for everything else we should let lookup fail?  > (And that pretty much answers my "what if name_to_handle_at() works > the first time, but then fails" question.  If we let anything but > ENOTSUPP let the lookup fail, then we should do so every time.) I don’t think this will work as cleanly as I’d hoped. The problem I’m facing is that get_file_handle() doesn’t only call name_to_handle_at(), but also contains a lot of code managing mount_fds.  There are a lot of places that can fail, too, and I think we should have them fall back to using an O_PATH FD: Say mount_fds doesn’t contain an FD for the new handle’s mount ID yet, so we want to add one.  However, it turns out that the file is not a regular file or directory, so we cannot open it as a regular FD and add it to mount_fds; or that it is a regular file, but without permission to open it O_RDONLY.  So we cannot return a file handle, because it will not be usable until a mount FD is added. I think in such a case we should fall back to an O_PATH FD, because this is not some unexpected error, but just an unfortunate (but reproducible and valid) circumstance where using `-o inode_file_handles` fails to do something that works without it. Now, however, this means that the next time we try to generate a handle for this file (to look it up), it will absolutely work if some other FD was added to mount_fds for this mount ID in the meantime. We could get around this by not trying to open the file for which we are to generate a handle to add its FD to mount_fds, but instead doing what the open_by_handle_at() man page suggests: > The mount_id argument returns an identifier for the filesystem mount > that corresponds to pathname. This corresponds to the first field in > one of the records in /proc/self/mountinfo. Opening the pathname in > the fifth field of that record yields a file descriptor for the mount > point; that file descriptor can be used in a subsequent call to > open_by_handle_at(). However, I’d rather avoid parsing mountinfo.  And as far as I understand, the only problem here is that we’ll have to cope with the fact that sometimes on lookups, we can generate a file handle, but the lo_inode we want to find has no file handle attached to it (because get_file_handle() failed the first time), and so we won’t find it by that handle but have to look it up by its inode ID. (Which is safe, because that lo_inode must have an O_PATH FD attached to it, so the inode ID cannot be reused.)  And that’s something that this series already does, so I tend to favor that over parsing mountinfo. Max