All of lore.kernel.org
 help / color / mirror / Atom feed
* [Virtio-fs] [virtiofsd][virtiofsd-rs] unlink an openfile over NFS
@ 2021-12-01 12:06 German Maglione
  2021-12-01 22:10 ` Vivek Goyal
  0 siblings, 1 reply; 9+ messages in thread
From: German Maglione @ 2021-12-01 12:06 UTC (permalink / raw)
  To: virtio-fs-list; +Cc: Vivek Goyal

[-- Attachment #1: Type: text/plain, Size: 1389 bytes --]

Hi,

I was working on [1] (related to [2]), and I saw that both versions
(C and rust) of virtiofsd make the NFs client to "silly rename" an open
file that were unlinked, because we keep each file open as O_PATH in the
lo_do_lookup/do_lookup function. David pointed me to this problem, and I
confirmed that this is the case.

This fires the silly rename in the nfs client. I'm talking with
Bruce Fields (nfs team) to see how to move the silly rename functionality
to the nfs server and outside the directory tree [4], to avoid having
non-really-empty
dirs full of .nfsxxx files. But it is not an easy task.
Also, this will not solve some potential issues with the resource usage:
disk space and open file limits (I hit this limit a couple of times during
my
tests). And, in some cases could be worst, more than one guest sharing the
same
exported dir, one guest just 'ls' or 'find' while the other 'rm' some files.
(The 'find' command will open all files, btw)

Vivek, I saw in [5] that you mentioned that this could be fixed introducing
synchronous, could you elaborate a bit or point me to some code?

Thanks,
[1] https://bugzilla.redhat.com/show_bug.cgi?id=2018072
[2] https://bugzilla.redhat.com/show_bug.cgi?id=1908490
[4] https://wiki.linux-nfs.org/wiki/index.php/Server-side_silly_rename
[5]
https://sourceforge.net/p/fuse/mailman/fuse-devel/?viewmonth=202101&viewday=4


-- 
German

[-- Attachment #2: Type: text/html, Size: 1992 bytes --]

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [Virtio-fs] [virtiofsd][virtiofsd-rs] unlink an openfile over NFS
  2021-12-01 12:06 [Virtio-fs] [virtiofsd][virtiofsd-rs] unlink an openfile over NFS German Maglione
@ 2021-12-01 22:10 ` Vivek Goyal
  2021-12-02 15:03   ` German Maglione
  0 siblings, 1 reply; 9+ messages in thread
From: Vivek Goyal @ 2021-12-01 22:10 UTC (permalink / raw)
  To: German Maglione; +Cc: virtio-fs-list, Hanna Reitz, Miklos Szeredi

On Wed, Dec 01, 2021 at 01:06:23PM +0100, German Maglione wrote:
> Hi,
> 
> I was working on [1] (related to [2]), and I saw that both versions
> (C and rust) of virtiofsd make the NFs client to "silly rename" an open
> file that were unlinked, because we keep each file open as O_PATH in the
> lo_do_lookup/do_lookup function. David pointed me to this problem, and I
> confirmed that this is the case.
> 
> This fires the silly rename in the nfs client. I'm talking with
> Bruce Fields (nfs team) to see how to move the silly rename functionality
> to the nfs server and outside the directory tree [4], to avoid having
> non-really-empty
> dirs full of .nfsxxx files. But it is not an easy task.
> Also, this will not solve some potential issues with the resource usage:
> disk space and open file limits (I hit this limit a couple of times during
> my
> tests). And, in some cases could be worst, more than one guest sharing the
> same
> exported dir, one guest just 'ls' or 'find' while the other 'rm' some files.
> (The 'find' command will open all files, btw)
> 
> Vivek, I saw in [5] that you mentioned that this could be fixed introducing
> synchronous, could you elaborate a bit or point me to some code?

Hi German,

Right now forget messages are asynchronous. They are sent to server and
client does not wait for any reply. That means when unlink() returns,
it is possible that a FORGET message is in progress and has not finished
yet.

Idea behind synchronous FORGET messages is that it will generate a reply
and client will wait for it. Given inode on server should be gone,
I am not sure how much sense does it make. But anyway conceputaully
that's the idea. If we want for FORGET message to finish, that will
mean that O_PATH fd opened by virtiofsd is closed and we will not
have NFS silly rename issue (atleast due to virtiofsd). If virtiofs
client has file open, then issue will still happen.

I think using file handles in virtiofsd_rs (--inode-file-handles) is
a reasonable solution for this problem. Trying to add synchronous
FORGET might be overkill.

Thanks
Vivek
> 
> Thanks,
> [1] https://bugzilla.redhat.com/show_bug.cgi?id=2018072
> [2] https://bugzilla.redhat.com/show_bug.cgi?id=1908490
> [4] https://wiki.linux-nfs.org/wiki/index.php/Server-side_silly_rename
> [5]
> https://sourceforge.net/p/fuse/mailman/fuse-devel/?viewmonth=202101&viewday=4
> 
> 
> -- 
> German


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [Virtio-fs] [virtiofsd][virtiofsd-rs] unlink an openfile over NFS
  2021-12-01 22:10 ` Vivek Goyal
@ 2021-12-02 15:03   ` German Maglione
  2021-12-02 15:51     ` Vivek Goyal
  0 siblings, 1 reply; 9+ messages in thread
From: German Maglione @ 2021-12-02 15:03 UTC (permalink / raw)
  To: Vivek Goyal; +Cc: virtio-fs-list, Hanna Reitz, Miklos Szeredi

[-- Attachment #1: Type: text/plain, Size: 2791 bytes --]

On Wed, Dec 1, 2021 at 11:10 PM Vivek Goyal <vgoyal@redhat.com> wrote:

> On Wed, Dec 01, 2021 at 01:06:23PM +0100, German Maglione wrote:
> > Hi,
> >
> > I was working on [1] (related to [2]), and I saw that both versions
> > (C and rust) of virtiofsd make the NFs client to "silly rename" an open
> > file that were unlinked, because we keep each file open as O_PATH in the
> > lo_do_lookup/do_lookup function. David pointed me to this problem, and I
> > confirmed that this is the case.
> >
> > This fires the silly rename in the nfs client. I'm talking with
> > Bruce Fields (nfs team) to see how to move the silly rename functionality
> > to the nfs server and outside the directory tree [4], to avoid having
> > non-really-empty
> > dirs full of .nfsxxx files. But it is not an easy task.
> > Also, this will not solve some potential issues with the resource usage:
> > disk space and open file limits (I hit this limit a couple of times
> during
> > my
> > tests). And, in some cases could be worst, more than one guest sharing
> the
> > same
> > exported dir, one guest just 'ls' or 'find' while the other 'rm' some
> files.
> > (The 'find' command will open all files, btw)
> >
> > Vivek, I saw in [5] that you mentioned that this could be fixed
> introducing
> > synchronous, could you elaborate a bit or point me to some code?
>
> Hi German,
>
> Right now forget messages are asynchronous. They are sent to server and
> client does not wait for any reply. That means when unlink() returns,
> it is possible that a FORGET message is in progress and has not finished
> yet.
>
> Idea behind synchronous FORGET messages is that it will generate a reply
> and client will wait for it. Given inode on server should be gone,
> I am not sure how much sense does it make. But anyway conceputaully
> that's the idea. If we want for FORGET message to finish, that will
> mean that O_PATH fd opened by virtiofsd is closed and we will not
> have NFS silly rename issue (atleast due to virtiofsd). If virtiofs
> client has file open, then issue will still happen.
>
> I think using file handles in virtiofsd_rs (--inode-file-handles) is
> a reasonable solution for this problem. Trying to add synchronous
> FORGET might be overkill.
>
>
Hi Vivek,

Yes, as you said the solution is using --inode-file-hanldes, turns out
the problem was the --inode-file-handles failed silently when
choosing a sandbox other than namespace (now fixed by Hanna).

So now the thing is, what we do if it fails? Hanna posted an Issue about
that:
"[RFE] Reporting failure to generate file handles".

There is any problem to use file handles as default? I mean without
--inode-file-handles so let them fail and force the user to use something
like
--no-file-handles/--force-no-file-handles with a warning.


-- 
German

[-- Attachment #2: Type: text/html, Size: 3725 bytes --]

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [Virtio-fs] [virtiofsd][virtiofsd-rs] unlink an openfile over NFS
  2021-12-02 15:03   ` German Maglione
@ 2021-12-02 15:51     ` Vivek Goyal
  2021-12-02 17:52       ` Hanna Reitz
  0 siblings, 1 reply; 9+ messages in thread
From: Vivek Goyal @ 2021-12-02 15:51 UTC (permalink / raw)
  To: German Maglione; +Cc: virtio-fs-list, Hanna Reitz, Miklos Szeredi

On Thu, Dec 02, 2021 at 04:03:20PM +0100, German Maglione wrote:
> On Wed, Dec 1, 2021 at 11:10 PM Vivek Goyal <vgoyal@redhat.com> wrote:
> 
> > On Wed, Dec 01, 2021 at 01:06:23PM +0100, German Maglione wrote:
> > > Hi,
> > >
> > > I was working on [1] (related to [2]), and I saw that both versions
> > > (C and rust) of virtiofsd make the NFs client to "silly rename" an open
> > > file that were unlinked, because we keep each file open as O_PATH in the
> > > lo_do_lookup/do_lookup function. David pointed me to this problem, and I
> > > confirmed that this is the case.
> > >
> > > This fires the silly rename in the nfs client. I'm talking with
> > > Bruce Fields (nfs team) to see how to move the silly rename functionality
> > > to the nfs server and outside the directory tree [4], to avoid having
> > > non-really-empty
> > > dirs full of .nfsxxx files. But it is not an easy task.
> > > Also, this will not solve some potential issues with the resource usage:
> > > disk space and open file limits (I hit this limit a couple of times
> > during
> > > my
> > > tests). And, in some cases could be worst, more than one guest sharing
> > the
> > > same
> > > exported dir, one guest just 'ls' or 'find' while the other 'rm' some
> > files.
> > > (The 'find' command will open all files, btw)
> > >
> > > Vivek, I saw in [5] that you mentioned that this could be fixed
> > introducing
> > > synchronous, could you elaborate a bit or point me to some code?
> >
> > Hi German,
> >
> > Right now forget messages are asynchronous. They are sent to server and
> > client does not wait for any reply. That means when unlink() returns,
> > it is possible that a FORGET message is in progress and has not finished
> > yet.
> >
> > Idea behind synchronous FORGET messages is that it will generate a reply
> > and client will wait for it. Given inode on server should be gone,
> > I am not sure how much sense does it make. But anyway conceputaully
> > that's the idea. If we want for FORGET message to finish, that will
> > mean that O_PATH fd opened by virtiofsd is closed and we will not
> > have NFS silly rename issue (atleast due to virtiofsd). If virtiofs
> > client has file open, then issue will still happen.
> >
> > I think using file handles in virtiofsd_rs (--inode-file-handles) is
> > a reasonable solution for this problem. Trying to add synchronous
> > FORGET might be overkill.
> >
> >
> Hi Vivek,
> 
> Yes, as you said the solution is using --inode-file-hanldes, turns out
> the problem was the --inode-file-handles failed silently when
> choosing a sandbox other than namespace (now fixed by Hanna).
> 
> So now the thing is, what we do if it fails? Hanna posted an Issue about
> that:
> "[RFE] Reporting failure to generate file handles".

My take from the beginning has been that if file handle generation fails,
then report it back to user (instead of falling back to O_PATH fd
silently). That way user atleast knows that file handles can't be used.

If file handles can't be generated due to lack of resources in system,
then error should be returned to caller as well.

> 
> There is any problem to use file handles as default?

It gives CAP_DAC_READ_SEARCH in init_user_ns. So enabling it by default
might not be desirable. Especially given the fact that we want to move
towards user namespaces and run virtiofsd with least priviliges. So
I will think user needs to enable it if they need it.

> I mean without
> --inode-file-handles so let them fail and force the user to use something
> like
> --no-file-handles/--force-no-file-handles with a warning.

If we were to enable it by default, we probably should test if file
handles are supported on shared dir. If yes, then enable it by default
otherwise continue to use O_PATH fd. But this will be mode switch for
the whole shared filesystem.

I think given we have notion of submounts and some of the submounted
filesystems might not support file handles, so key question will be
what do we do here. Do we return error in this case or fallback to
O_PATH fd for that submount. If we stick to our design philosophy,
then I would say return error. But some people might object because
they might want a mode where there is mix of filesystems in shared
dir and they want to use file handles where supported. So I am sitting
on the fence on this one.

Thanks
Vivek


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [Virtio-fs] [virtiofsd][virtiofsd-rs] unlink an openfile over NFS
  2021-12-02 15:51     ` Vivek Goyal
@ 2021-12-02 17:52       ` Hanna Reitz
  2021-12-02 20:14         ` Vivek Goyal
  0 siblings, 1 reply; 9+ messages in thread
From: Hanna Reitz @ 2021-12-02 17:52 UTC (permalink / raw)
  To: Vivek Goyal, German Maglione; +Cc: virtio-fs-list, Miklos Szeredi

On 02.12.21 16:51, Vivek Goyal wrote:
> On Thu, Dec 02, 2021 at 04:03:20PM +0100, German Maglione wrote:
>> On Wed, Dec 1, 2021 at 11:10 PM Vivek Goyal <vgoyal@redhat.com> wrote:
>>
>>> On Wed, Dec 01, 2021 at 01:06:23PM +0100, German Maglione wrote:
>>>> Hi,
>>>>
>>>> I was working on [1] (related to [2]), and I saw that both versions
>>>> (C and rust) of virtiofsd make the NFs client to "silly rename" an open
>>>> file that were unlinked, because we keep each file open as O_PATH in the
>>>> lo_do_lookup/do_lookup function. David pointed me to this problem, and I
>>>> confirmed that this is the case.
>>>>
>>>> This fires the silly rename in the nfs client. I'm talking with
>>>> Bruce Fields (nfs team) to see how to move the silly rename functionality
>>>> to the nfs server and outside the directory tree [4], to avoid having
>>>> non-really-empty
>>>> dirs full of .nfsxxx files. But it is not an easy task.
>>>> Also, this will not solve some potential issues with the resource usage:
>>>> disk space and open file limits (I hit this limit a couple of times
>>> during
>>>> my
>>>> tests). And, in some cases could be worst, more than one guest sharing
>>> the
>>>> same
>>>> exported dir, one guest just 'ls' or 'find' while the other 'rm' some
>>> files.
>>>> (The 'find' command will open all files, btw)
>>>>
>>>> Vivek, I saw in [5] that you mentioned that this could be fixed
>>> introducing
>>>> synchronous, could you elaborate a bit or point me to some code?
>>> Hi German,
>>>
>>> Right now forget messages are asynchronous. They are sent to server and
>>> client does not wait for any reply. That means when unlink() returns,
>>> it is possible that a FORGET message is in progress and has not finished
>>> yet.
>>>
>>> Idea behind synchronous FORGET messages is that it will generate a reply
>>> and client will wait for it. Given inode on server should be gone,
>>> I am not sure how much sense does it make. But anyway conceputaully
>>> that's the idea. If we want for FORGET message to finish, that will
>>> mean that O_PATH fd opened by virtiofsd is closed and we will not
>>> have NFS silly rename issue (atleast due to virtiofsd). If virtiofs
>>> client has file open, then issue will still happen.
>>>
>>> I think using file handles in virtiofsd_rs (--inode-file-handles) is
>>> a reasonable solution for this problem. Trying to add synchronous
>>> FORGET might be overkill.
>>>
>>>
>> Hi Vivek,
>>
>> Yes, as you said the solution is using --inode-file-hanldes, turns out
>> the problem was the --inode-file-handles failed silently when
>> choosing a sandbox other than namespace (now fixed by Hanna).
>>
>> So now the thing is, what we do if it fails? Hanna posted an Issue about
>> that:
>> "[RFE] Reporting failure to generate file handles".
> My take from the beginning has been that if file handle generation fails,
> then report it back to user (instead of falling back to O_PATH fd
> silently). That way user atleast knows that file handles can't be used.

I remember that we had a discussion about whether to introduce a 
mandatory mode where this would be the behavior.  I thought we agreed 
that a best-effort mode always made sense, for example for the situation 
you describe below, where you have mixed filesystems, with some perhaps 
supporting file handles, and others not.

As for a mandatory mode, I couldn’t imagine how exactly it would be 
useful, though.  I think my argument was: “What would a user do after 
launching virtiofsd with file handles forced on, and then noticing they 
don’t work?”  And I still don’t really know, even though I’m proposing 
such a mode in said virtiofsd-rs gitlab issue for the NFS case.  I 
suppose the answer is, check every configuration and find out why it 
doesn’t work; but you don’t need a mandatory file handle mode for this, 
logging errors whenever we need to fall back to O_PATH FDs would be 
completely sufficient.

(I’m mostly proposing it for NFS, because non-working file handles are 
something that seems likely to cause other problems later. Emitting a 
warning would technically be completely fine in order to inform the user 
of this, but I feel like in this case it’d be better to nag them even more.)

> If file handles can't be generated due to lack of resources in system,
> then error should be returned to caller as well.
>
>> There is any problem to use file handles as default?
> It gives CAP_DAC_READ_SEARCH in init_user_ns. So enabling it by default
> might not be desirable. Especially given the fact that we want to move
> towards user namespaces and run virtiofsd with least priviliges. So
> I will think user needs to enable it if they need it.
>
>> I mean without
>> --inode-file-handles so let them fail and force the user to use something
>> like
>> --no-file-handles/--force-no-file-handles with a warning.
> If we were to enable it by default, we probably should test if file
> handles are supported on shared dir. If yes, then enable it by default
> otherwise continue to use O_PATH fd. But this will be mode switch for
> the whole shared filesystem.
>
> I think given we have notion of submounts and some of the submounted
> filesystems might not support file handles, so key question will be
> what do we do here. Do we return error in this case or fallback to
> O_PATH fd for that submount. If we stick to our design philosophy,
> then I would say return error. But some people might object because
> they might want a mode where there is mix of filesystems in shared
> dir and they want to use file handles where supported. So I am sitting
> on the fence on this one.

I think at this point I prefer making --inode-file-handles take an 
optional argument:
- no: Default, don’t use file handles.
- best-effort: Try to generate file handles, fall back to FDs on error.
- mandatory: Always use file handles, return errors to the guest.

And then let the user choose.

(We could definitely make “best-effort” the default for when 
CAP_DAC_READ_SEARCH is available, but I’m not sure we should.  I believe 
in a case like NFS, we should let the user know we recommend 
“mandatory”.  In other cases it just depends.  If the user doesn’t care 
about how many FDs are open, “no” is perfectly OK.  Using file handles 
probably has a small performance penalty, so I don’t know how I feel 
about (defaulting to) using them if they aren’t necessary.)

Hanna


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [Virtio-fs] [virtiofsd][virtiofsd-rs] unlink an openfile over NFS
  2021-12-02 17:52       ` Hanna Reitz
@ 2021-12-02 20:14         ` Vivek Goyal
  2021-12-03  9:39           ` Hanna Reitz
  0 siblings, 1 reply; 9+ messages in thread
From: Vivek Goyal @ 2021-12-02 20:14 UTC (permalink / raw)
  To: Hanna Reitz; +Cc: virtio-fs-list, Miklos Szeredi

On Thu, Dec 02, 2021 at 06:52:17PM +0100, Hanna Reitz wrote:
> On 02.12.21 16:51, Vivek Goyal wrote:
> > On Thu, Dec 02, 2021 at 04:03:20PM +0100, German Maglione wrote:
> > > On Wed, Dec 1, 2021 at 11:10 PM Vivek Goyal <vgoyal@redhat.com> wrote:
> > > 
> > > > On Wed, Dec 01, 2021 at 01:06:23PM +0100, German Maglione wrote:
> > > > > Hi,
> > > > > 
> > > > > I was working on [1] (related to [2]), and I saw that both versions
> > > > > (C and rust) of virtiofsd make the NFs client to "silly rename" an open
> > > > > file that were unlinked, because we keep each file open as O_PATH in the
> > > > > lo_do_lookup/do_lookup function. David pointed me to this problem, and I
> > > > > confirmed that this is the case.
> > > > > 
> > > > > This fires the silly rename in the nfs client. I'm talking with
> > > > > Bruce Fields (nfs team) to see how to move the silly rename functionality
> > > > > to the nfs server and outside the directory tree [4], to avoid having
> > > > > non-really-empty
> > > > > dirs full of .nfsxxx files. But it is not an easy task.
> > > > > Also, this will not solve some potential issues with the resource usage:
> > > > > disk space and open file limits (I hit this limit a couple of times
> > > > during
> > > > > my
> > > > > tests). And, in some cases could be worst, more than one guest sharing
> > > > the
> > > > > same
> > > > > exported dir, one guest just 'ls' or 'find' while the other 'rm' some
> > > > files.
> > > > > (The 'find' command will open all files, btw)
> > > > > 
> > > > > Vivek, I saw in [5] that you mentioned that this could be fixed
> > > > introducing
> > > > > synchronous, could you elaborate a bit or point me to some code?
> > > > Hi German,
> > > > 
> > > > Right now forget messages are asynchronous. They are sent to server and
> > > > client does not wait for any reply. That means when unlink() returns,
> > > > it is possible that a FORGET message is in progress and has not finished
> > > > yet.
> > > > 
> > > > Idea behind synchronous FORGET messages is that it will generate a reply
> > > > and client will wait for it. Given inode on server should be gone,
> > > > I am not sure how much sense does it make. But anyway conceputaully
> > > > that's the idea. If we want for FORGET message to finish, that will
> > > > mean that O_PATH fd opened by virtiofsd is closed and we will not
> > > > have NFS silly rename issue (atleast due to virtiofsd). If virtiofs
> > > > client has file open, then issue will still happen.
> > > > 
> > > > I think using file handles in virtiofsd_rs (--inode-file-handles) is
> > > > a reasonable solution for this problem. Trying to add synchronous
> > > > FORGET might be overkill.
> > > > 
> > > > 
> > > Hi Vivek,
> > > 
> > > Yes, as you said the solution is using --inode-file-hanldes, turns out
> > > the problem was the --inode-file-handles failed silently when
> > > choosing a sandbox other than namespace (now fixed by Hanna).
> > > 
> > > So now the thing is, what we do if it fails? Hanna posted an Issue about
> > > that:
> > > "[RFE] Reporting failure to generate file handles".
> > My take from the beginning has been that if file handle generation fails,
> > then report it back to user (instead of falling back to O_PATH fd
> > silently). That way user atleast knows that file handles can't be used.
> 
> I remember that we had a discussion about whether to introduce a mandatory
> mode where this would be the behavior.  I thought we agreed that a
> best-effort mode always made sense, for example for the situation you
> describe below, where you have mixed filesystems, with some perhaps
> supporting file handles, and others not.

I think my primary objection was falling back to O_PATH fd for temporary
failures. So file systems supports file handle but if we could not
generate one due to lack of resources, I would rather return error. In
viritofsd C version, we identified a case where it could lead to two
inodes in hash table using different keys. I am not sure if virtiofsd_rs
version suffers from same issue or not.

W.r.t what to do if filesystem does not support file handles, should
we fall back to O_PATH fd (and just emit a warning), I am fine with
falling back to O_PATH fd and just log it so that users know.

I have both the kind of users. Some users prefer fallback and other
users prefer to get an error if a certain feature can't be enabled.
I guess this will slowly evolve depending what users are asking for.
But logging a warning if file handles can't be enabled, sounds ok.

> 
> As for a mandatory mode, I couldn’t imagine how exactly it would be useful,
> though.  I think my argument was: “What would a user do after launching
> virtiofsd with file handles forced on, and then noticing they don’t work?” 

I guess they will relaunch without --inodes-file-handle.

> And I still don’t really know, even though I’m proposing such a mode in said
> virtiofsd-rs gitlab issue for the NFS case. 

Can you please provide a link to that issue. So why are you proposing this
mode for NFS?

> I suppose the answer is, check
> every configuration and find out why it doesn’t work; but you don’t need a
> mandatory file handle mode for this, logging errors whenever we need to fall
> back to O_PATH FDs would be completely sufficient.

Logging warnings should probably be good enough for lot of use cases.

> 
> (I’m mostly proposing it for NFS, because non-working file handles are
> something that seems likely to cause other problems later. Emitting a
> warning would technically be completely fine in order to inform the user of
> this, but I feel like in this case it’d be better to nag them even more.)

So why NFS is special? Due to silly renames issue? So argument will be
if there is a hard failure, then user can try to fix underlying filesystem
configuration to support file handles? This will be true for other
filesystems too. 

In fact, that will be the argument for hard failure (instead of fallback).
Some users will say, I can fix underlying filesystem configuration if
I know virtiofsd can't use file handles. Falling back to O_PATH fd will
give them false impression that configuration is alright and file
handles are being used.

So I can imagine both kind of users. One will prefer hard failure and
other will be happy with fallback.

> 
> > If file handles can't be generated due to lack of resources in system,
> > then error should be returned to caller as well.
> > 
> > > There is any problem to use file handles as default?
> > It gives CAP_DAC_READ_SEARCH in init_user_ns. So enabling it by default
> > might not be desirable. Especially given the fact that we want to move
> > towards user namespaces and run virtiofsd with least priviliges. So
> > I will think user needs to enable it if they need it.
> > 
> > > I mean without
> > > --inode-file-handles so let them fail and force the user to use something
> > > like
> > > --no-file-handles/--force-no-file-handles with a warning.
> > If we were to enable it by default, we probably should test if file
> > handles are supported on shared dir. If yes, then enable it by default
> > otherwise continue to use O_PATH fd. But this will be mode switch for
> > the whole shared filesystem.
> > 
> > I think given we have notion of submounts and some of the submounted
> > filesystems might not support file handles, so key question will be
> > what do we do here. Do we return error in this case or fallback to
> > O_PATH fd for that submount. If we stick to our design philosophy,
> > then I would say return error. But some people might object because
> > they might want a mode where there is mix of filesystems in shared
> > dir and they want to use file handles where supported. So I am sitting
> > on the fence on this one.
> 
> I think at this point I prefer making --inode-file-handles take an optional
> argument:
> - no: Default, don’t use file handles.
> - best-effort: Try to generate file handles, fall back to FDs on error.

Will be nice if we fallback only if filesystem does not support file
handles. For temporary failures, we should return errors to callers. There
is no good reason to fallback to O_PATH fd in that case.

> - mandatory: Always use file handles, return errors to the guest.

This sounds reasonable. Another naming scheme could be "no, fallback,
always".

Thanks
Vivek

> 
> And then let the user choose.
> 
> (We could definitely make “best-effort” the default for when
> CAP_DAC_READ_SEARCH is available, but I’m not sure we should.  I believe in
> a case like NFS, we should let the user know we recommend “mandatory”.  In
> other cases it just depends.  If the user doesn’t care about how many FDs
> are open, “no” is perfectly OK.  Using file handles probably has a small
> performance penalty, so I don’t know how I feel about (defaulting to) using
> them if they aren’t necessary.)
> 
> Hanna
> 


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [Virtio-fs] [virtiofsd][virtiofsd-rs] unlink an openfile over NFS
  2021-12-02 20:14         ` Vivek Goyal
@ 2021-12-03  9:39           ` Hanna Reitz
  2021-12-03 14:09             ` Vivek Goyal
  0 siblings, 1 reply; 9+ messages in thread
From: Hanna Reitz @ 2021-12-03  9:39 UTC (permalink / raw)
  To: Vivek Goyal; +Cc: virtio-fs-list, Miklos Szeredi

On 02.12.21 21:14, Vivek Goyal wrote:
> On Thu, Dec 02, 2021 at 06:52:17PM +0100, Hanna Reitz wrote:
>> On 02.12.21 16:51, Vivek Goyal wrote:
>>> On Thu, Dec 02, 2021 at 04:03:20PM +0100, German Maglione wrote:
>>>> On Wed, Dec 1, 2021 at 11:10 PM Vivek Goyal <vgoyal@redhat.com> wrote:
>>>>
>>>>> On Wed, Dec 01, 2021 at 01:06:23PM +0100, German Maglione wrote:
>>>>>> Hi,
>>>>>>
>>>>>> I was working on [1] (related to [2]), and I saw that both versions
>>>>>> (C and rust) of virtiofsd make the NFs client to "silly rename" an open
>>>>>> file that were unlinked, because we keep each file open as O_PATH in the
>>>>>> lo_do_lookup/do_lookup function. David pointed me to this problem, and I
>>>>>> confirmed that this is the case.
>>>>>>
>>>>>> This fires the silly rename in the nfs client. I'm talking with
>>>>>> Bruce Fields (nfs team) to see how to move the silly rename functionality
>>>>>> to the nfs server and outside the directory tree [4], to avoid having
>>>>>> non-really-empty
>>>>>> dirs full of .nfsxxx files. But it is not an easy task.
>>>>>> Also, this will not solve some potential issues with the resource usage:
>>>>>> disk space and open file limits (I hit this limit a couple of times
>>>>> during
>>>>>> my
>>>>>> tests). And, in some cases could be worst, more than one guest sharing
>>>>> the
>>>>>> same
>>>>>> exported dir, one guest just 'ls' or 'find' while the other 'rm' some
>>>>> files.
>>>>>> (The 'find' command will open all files, btw)
>>>>>>
>>>>>> Vivek, I saw in [5] that you mentioned that this could be fixed
>>>>> introducing
>>>>>> synchronous, could you elaborate a bit or point me to some code?
>>>>> Hi German,
>>>>>
>>>>> Right now forget messages are asynchronous. They are sent to server and
>>>>> client does not wait for any reply. That means when unlink() returns,
>>>>> it is possible that a FORGET message is in progress and has not finished
>>>>> yet.
>>>>>
>>>>> Idea behind synchronous FORGET messages is that it will generate a reply
>>>>> and client will wait for it. Given inode on server should be gone,
>>>>> I am not sure how much sense does it make. But anyway conceputaully
>>>>> that's the idea. If we want for FORGET message to finish, that will
>>>>> mean that O_PATH fd opened by virtiofsd is closed and we will not
>>>>> have NFS silly rename issue (atleast due to virtiofsd). If virtiofs
>>>>> client has file open, then issue will still happen.
>>>>>
>>>>> I think using file handles in virtiofsd_rs (--inode-file-handles) is
>>>>> a reasonable solution for this problem. Trying to add synchronous
>>>>> FORGET might be overkill.
>>>>>
>>>>>
>>>> Hi Vivek,
>>>>
>>>> Yes, as you said the solution is using --inode-file-hanldes, turns out
>>>> the problem was the --inode-file-handles failed silently when
>>>> choosing a sandbox other than namespace (now fixed by Hanna).
>>>>
>>>> So now the thing is, what we do if it fails? Hanna posted an Issue about
>>>> that:
>>>> "[RFE] Reporting failure to generate file handles".
>>> My take from the beginning has been that if file handle generation fails,
>>> then report it back to user (instead of falling back to O_PATH fd
>>> silently). That way user atleast knows that file handles can't be used.
>> I remember that we had a discussion about whether to introduce a mandatory
>> mode where this would be the behavior.  I thought we agreed that a
>> best-effort mode always made sense, for example for the situation you
>> describe below, where you have mixed filesystems, with some perhaps
>> supporting file handles, and others not.
> I think my primary objection was falling back to O_PATH fd for temporary
> failures. So file systems supports file handle but if we could not
> generate one due to lack of resources, I would rather return error. In
> viritofsd C version, we identified a case where it could lead to two
> inodes in hash table using different keys. I am not sure if virtiofsd_rs
> version suffers from same issue or not.

No (I hope not), I’ve tried to incorporate your feedback on that, and so 
the Rust version does distinguish between temporary and persistent 
name_to_handle_at() failures.  (The former returns `Err(_)`, the latter 
`Ok(None)`.  `Err(_)`s are returned to the guest.)

(All errors for making a file handle openable, i.e. opening a mount FD, 
are still not returned to the guest and only lead to a fallback.  I 
remember at one point I tried looking into this code to find out where 
it would be reasonable to return an error, and where we should fall 
back, and I just found it very difficult to decide. It also (luckily? 
:)) didn’t have anything to do with the bug you describe, because that 
was about whether we have a handle for the lookup, and that’s now 
independent of the whole mount FD mess.)

> W.r.t what to do if filesystem does not support file handles, should
> we fall back to O_PATH fd (and just emit a warning), I am fine with
> falling back to O_PATH fd and just log it so that users know.
>
> I have both the kind of users. Some users prefer fallback and other
> users prefer to get an error if a certain feature can't be enabled.
> I guess this will slowly evolve depending what users are asking for.
> But logging a warning if file handles can't be enabled, sounds ok.
>
>> As for a mandatory mode, I couldn’t imagine how exactly it would be useful,
>> though.  I think my argument was: “What would a user do after launching
>> virtiofsd with file handles forced on, and then noticing they don’t work?”
> I guess they will relaunch without --inodes-file-handle.

Yes, that’s what I thought, too, and so it mostly sounded like an 
inconvenience to me, and not really helpful. :/

>> And I still don’t really know, even though I’m proposing such a mode in said
>> virtiofsd-rs gitlab issue for the NFS case.
> Can you please provide a link to that issue.

Ah, sure: https://gitlab.com/virtio-fs/virtiofsd-rs/-/issues/17

> So why are you proposing this
> mode for NFS?

Because it looks like with NFS, not using file handles may cause real 
problems, and so I feel like failing early (failure to generate file 
handles) is nicer than later (recursively deleting directories).

>> I suppose the answer is, check
>> every configuration and find out why it doesn’t work; but you don’t need a
>> mandatory file handle mode for this, logging errors whenever we need to fall
>> back to O_PATH FDs would be completely sufficient.
> Logging warnings should probably be good enough for lot of use cases.
>
>> (I’m mostly proposing it for NFS, because non-working file handles are
>> something that seems likely to cause other problems later. Emitting a
>> warning would technically be completely fine in order to inform the user of
>> this, but I feel like in this case it’d be better to nag them even more.)
> So why NFS is special? Due to silly renames issue?

Yes.

> So argument will be
> if there is a hard failure, then user can try to fix underlying filesystem
> configuration to support file handles?

The argument is that the silly renames issue is kind of likely to 
produce real tangible problems when file handles don’t work.  I feel 
like that warrants more than just logging the error.

> This will be true for other
> filesystems too.

Yes, but for other filesystems, using or not using file handles seems 
like a tuning thing, and not really anything that could produce tangible 
errors.  So to me, logging errors seems more appropriate then.

> In fact, that will be the argument for hard failure (instead of fallback).
> Some users will say, I can fix underlying filesystem configuration if
> I know virtiofsd can't use file handles. Falling back to O_PATH fd will
> give them false impression that configuration is alright and file
> handles are being used.
>
> So I can imagine both kind of users. One will prefer hard failure and
> other will be happy with fallback.

Agreed!  (Which is why I’m proposing a switch to let the user decide.)

>>> If file handles can't be generated due to lack of resources in system,
>>> then error should be returned to caller as well.
>>>
>>>> There is any problem to use file handles as default?
>>> It gives CAP_DAC_READ_SEARCH in init_user_ns. So enabling it by default
>>> might not be desirable. Especially given the fact that we want to move
>>> towards user namespaces and run virtiofsd with least priviliges. So
>>> I will think user needs to enable it if they need it.
>>>
>>>> I mean without
>>>> --inode-file-handles so let them fail and force the user to use something
>>>> like
>>>> --no-file-handles/--force-no-file-handles with a warning.
>>> If we were to enable it by default, we probably should test if file
>>> handles are supported on shared dir. If yes, then enable it by default
>>> otherwise continue to use O_PATH fd. But this will be mode switch for
>>> the whole shared filesystem.
>>>
>>> I think given we have notion of submounts and some of the submounted
>>> filesystems might not support file handles, so key question will be
>>> what do we do here. Do we return error in this case or fallback to
>>> O_PATH fd for that submount. If we stick to our design philosophy,
>>> then I would say return error. But some people might object because
>>> they might want a mode where there is mix of filesystems in shared
>>> dir and they want to use file handles where supported. So I am sitting
>>> on the fence on this one.
>> I think at this point I prefer making --inode-file-handles take an optional
>> argument:
>> - no: Default, don’t use file handles.
>> - best-effort: Try to generate file handles, fall back to FDs on error.
> Will be nice if we fallback only if filesystem does not support file
> handles. For temporary failures, we should return errors to callers. There
> is no good reason to fallback to O_PATH fd in that case.
>
>> - mandatory: Always use file handles, return errors to the guest.
> This sounds reasonable. Another naming scheme could be "no, fallback,
> always".

Naming things is hard :)

Hanna


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [Virtio-fs] [virtiofsd][virtiofsd-rs] unlink an openfile over NFS
  2021-12-03  9:39           ` Hanna Reitz
@ 2021-12-03 14:09             ` Vivek Goyal
  2021-12-07 15:41               ` German Maglione
  0 siblings, 1 reply; 9+ messages in thread
From: Vivek Goyal @ 2021-12-03 14:09 UTC (permalink / raw)
  To: Hanna Reitz; +Cc: virtio-fs-list, Miklos Szeredi

On Fri, Dec 03, 2021 at 10:39:25AM +0100, Hanna Reitz wrote:
> On 02.12.21 21:14, Vivek Goyal wrote:
> > On Thu, Dec 02, 2021 at 06:52:17PM +0100, Hanna Reitz wrote:
> > > On 02.12.21 16:51, Vivek Goyal wrote:
> > > > On Thu, Dec 02, 2021 at 04:03:20PM +0100, German Maglione wrote:
> > > > > On Wed, Dec 1, 2021 at 11:10 PM Vivek Goyal <vgoyal@redhat.com> wrote:
> > > > > 
> > > > > > On Wed, Dec 01, 2021 at 01:06:23PM +0100, German Maglione wrote:
> > > > > > > Hi,
> > > > > > > 
> > > > > > > I was working on [1] (related to [2]), and I saw that both versions
> > > > > > > (C and rust) of virtiofsd make the NFs client to "silly rename" an open
> > > > > > > file that were unlinked, because we keep each file open as O_PATH in the
> > > > > > > lo_do_lookup/do_lookup function. David pointed me to this problem, and I
> > > > > > > confirmed that this is the case.
> > > > > > > 
> > > > > > > This fires the silly rename in the nfs client. I'm talking with
> > > > > > > Bruce Fields (nfs team) to see how to move the silly rename functionality
> > > > > > > to the nfs server and outside the directory tree [4], to avoid having
> > > > > > > non-really-empty
> > > > > > > dirs full of .nfsxxx files. But it is not an easy task.
> > > > > > > Also, this will not solve some potential issues with the resource usage:
> > > > > > > disk space and open file limits (I hit this limit a couple of times
> > > > > > during
> > > > > > > my
> > > > > > > tests). And, in some cases could be worst, more than one guest sharing
> > > > > > the
> > > > > > > same
> > > > > > > exported dir, one guest just 'ls' or 'find' while the other 'rm' some
> > > > > > files.
> > > > > > > (The 'find' command will open all files, btw)
> > > > > > > 
> > > > > > > Vivek, I saw in [5] that you mentioned that this could be fixed
> > > > > > introducing
> > > > > > > synchronous, could you elaborate a bit or point me to some code?
> > > > > > Hi German,
> > > > > > 
> > > > > > Right now forget messages are asynchronous. They are sent to server and
> > > > > > client does not wait for any reply. That means when unlink() returns,
> > > > > > it is possible that a FORGET message is in progress and has not finished
> > > > > > yet.
> > > > > > 
> > > > > > Idea behind synchronous FORGET messages is that it will generate a reply
> > > > > > and client will wait for it. Given inode on server should be gone,
> > > > > > I am not sure how much sense does it make. But anyway conceputaully
> > > > > > that's the idea. If we want for FORGET message to finish, that will
> > > > > > mean that O_PATH fd opened by virtiofsd is closed and we will not
> > > > > > have NFS silly rename issue (atleast due to virtiofsd). If virtiofs
> > > > > > client has file open, then issue will still happen.
> > > > > > 
> > > > > > I think using file handles in virtiofsd_rs (--inode-file-handles) is
> > > > > > a reasonable solution for this problem. Trying to add synchronous
> > > > > > FORGET might be overkill.
> > > > > > 
> > > > > > 
> > > > > Hi Vivek,
> > > > > 
> > > > > Yes, as you said the solution is using --inode-file-hanldes, turns out
> > > > > the problem was the --inode-file-handles failed silently when
> > > > > choosing a sandbox other than namespace (now fixed by Hanna).
> > > > > 
> > > > > So now the thing is, what we do if it fails? Hanna posted an Issue about
> > > > > that:
> > > > > "[RFE] Reporting failure to generate file handles".
> > > > My take from the beginning has been that if file handle generation fails,
> > > > then report it back to user (instead of falling back to O_PATH fd
> > > > silently). That way user atleast knows that file handles can't be used.
> > > I remember that we had a discussion about whether to introduce a mandatory
> > > mode where this would be the behavior.  I thought we agreed that a
> > > best-effort mode always made sense, for example for the situation you
> > > describe below, where you have mixed filesystems, with some perhaps
> > > supporting file handles, and others not.
> > I think my primary objection was falling back to O_PATH fd for temporary
> > failures. So file systems supports file handle but if we could not
> > generate one due to lack of resources, I would rather return error. In
> > viritofsd C version, we identified a case where it could lead to two
> > inodes in hash table using different keys. I am not sure if virtiofsd_rs
> > version suffers from same issue or not.
> 
> No (I hope not), I’ve tried to incorporate your feedback on that, and so the
> Rust version does distinguish between temporary and persistent
> name_to_handle_at() failures.  (The former returns `Err(_)`, the latter
> `Ok(None)`.  `Err(_)`s are returned to the guest.)

Sounds good. Temporary failures should be returned to client.

> 
> (All errors for making a file handle openable, i.e. opening a mount FD, are
> still not returned to the guest and only lead to a fallback. 

I would return error the moment you encounter one. That seems to be theme
of all of the virtiofsd code. While trying to perform any operation, if
it encounters an error, it immediately returns it to caller. So opening a
mount FD should be no different and there should not be any need for
fallback.

> I remember at
> one point I tried looking into this code to find out where it would be
> reasonable to return an error, and where we should fall back, and I just
> found it very difficult to decide. It also (luckily? :)) didn’t have
> anything to do with the bug you describe, because that was about whether we
> have a handle for the lookup, and that’s now independent of the whole mount
> FD mess.)
> 
> > W.r.t what to do if filesystem does not support file handles, should
> > we fall back to O_PATH fd (and just emit a warning), I am fine with
> > falling back to O_PATH fd and just log it so that users know.
> > 
> > I have both the kind of users. Some users prefer fallback and other
> > users prefer to get an error if a certain feature can't be enabled.
> > I guess this will slowly evolve depending what users are asking for.
> > But logging a warning if file handles can't be enabled, sounds ok.
> > 
> > > As for a mandatory mode, I couldn’t imagine how exactly it would be useful,
> > > though.  I think my argument was: “What would a user do after launching
> > > virtiofsd with file handles forced on, and then noticing they don’t work?”
> > I guess they will relaunch without --inodes-file-handle.
> 
> Yes, that’s what I thought, too, and so it mostly sounded like an
> inconvenience to me, and not really helpful. :/

Alternatively, If there is anything configurable in underlying filesystem
(to enable file handle support), they could take an action to fix it and
restart virtiofsd with --inodes-file-handle.

> 
> > > And I still don’t really know, even though I’m proposing such a mode in said
> > > virtiofsd-rs gitlab issue for the NFS case.
> > Can you please provide a link to that issue.
> 
> Ah, sure: https://gitlab.com/virtio-fs/virtiofsd-rs/-/issues/17
> 
> > So why are you proposing this
> > mode for NFS?
> 
> Because it looks like with NFS, not using file handles may cause real
> problems, and so I feel like failing early (failure to generate file
> handles) is nicer than later (recursively deleting directories).

Ok. One could argue that even regular file systems could cause problem
by running into open file descriptor limit. So not sure NFS should be
an exception.

> 
> > > I suppose the answer is, check
> > > every configuration and find out why it doesn’t work; but you don’t need a
> > > mandatory file handle mode for this, logging errors whenever we need to fall
> > > back to O_PATH FDs would be completely sufficient.
> > Logging warnings should probably be good enough for lot of use cases.
> > 
> > > (I’m mostly proposing it for NFS, because non-working file handles are
> > > something that seems likely to cause other problems later. Emitting a
> > > warning would technically be completely fine in order to inform the user of
> > > this, but I feel like in this case it’d be better to nag them even more.)
> > So why NFS is special? Due to silly renames issue?
> 
> Yes.
> 
> > So argument will be
> > if there is a hard failure, then user can try to fix underlying filesystem
> > configuration to support file handles?
> 
> The argument is that the silly renames issue is kind of likely to produce
> real tangible problems when file handles don’t work.  I feel like that
> warrants more than just logging the error.
> 
> > This will be true for other
> > filesystems too.
> 
> Yes, but for other filesystems, using or not using file handles seems like a
> tuning thing, and not really anything that could produce tangible errors. 
> So to me, logging errors seems more appropriate then.

Hmm..., open file descriptor limit is around 1M, IIRC. So if guest has
cached enough inodes, one can run into it and that's the core problem
file handle support solves. 

IMHO, NFS is not an exception. If we could not enable file handle support,
both NFS and regular file systems can run into issues later.

I would say I like the idea of adding argument to --inodes-file-handle and
let user choose if they want hard failures or a fallback. By default we
could choose "fallback" if user just says --inodes-file-handle.

> 
> > In fact, that will be the argument for hard failure (instead of fallback).
> > Some users will say, I can fix underlying filesystem configuration if
> > I know virtiofsd can't use file handles. Falling back to O_PATH fd will
> > give them false impression that configuration is alright and file
> > handles are being used.
> > 
> > So I can imagine both kind of users. One will prefer hard failure and
> > other will be happy with fallback.
> 
> Agreed!  (Which is why I’m proposing a switch to let the user decide.)

Sounds good.

> 
> > > > If file handles can't be generated due to lack of resources in system,
> > > > then error should be returned to caller as well.
> > > > 
> > > > > There is any problem to use file handles as default?
> > > > It gives CAP_DAC_READ_SEARCH in init_user_ns. So enabling it by default
> > > > might not be desirable. Especially given the fact that we want to move
> > > > towards user namespaces and run virtiofsd with least priviliges. So
> > > > I will think user needs to enable it if they need it.
> > > > 
> > > > > I mean without
> > > > > --inode-file-handles so let them fail and force the user to use something
> > > > > like
> > > > > --no-file-handles/--force-no-file-handles with a warning.
> > > > If we were to enable it by default, we probably should test if file
> > > > handles are supported on shared dir. If yes, then enable it by default
> > > > otherwise continue to use O_PATH fd. But this will be mode switch for
> > > > the whole shared filesystem.
> > > > 
> > > > I think given we have notion of submounts and some of the submounted
> > > > filesystems might not support file handles, so key question will be
> > > > what do we do here. Do we return error in this case or fallback to
> > > > O_PATH fd for that submount. If we stick to our design philosophy,
> > > > then I would say return error. But some people might object because
> > > > they might want a mode where there is mix of filesystems in shared
> > > > dir and they want to use file handles where supported. So I am sitting
> > > > on the fence on this one.
> > > I think at this point I prefer making --inode-file-handles take an optional
> > > argument:
> > > - no: Default, don’t use file handles.
> > > - best-effort: Try to generate file handles, fall back to FDs on error.
> > Will be nice if we fallback only if filesystem does not support file
> > handles. For temporary failures, we should return errors to callers. There
> > is no good reason to fallback to O_PATH fd in that case.
> > 
> > > - mandatory: Always use file handles, return errors to the guest.
> > This sounds reasonable. Another naming scheme could be "no, fallback,
> > always".
> 
> Naming things is hard :)

Agreed. I am fine with the naming scheme you have proposed. Not sure what
will you do with "mandatory" when rootfs supports file handles but one of
the submounts does not. You will not come to know about it at startup
time and hard failures are better detected at startup time.

Vivek


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [Virtio-fs] [virtiofsd][virtiofsd-rs] unlink an openfile over NFS
  2021-12-03 14:09             ` Vivek Goyal
@ 2021-12-07 15:41               ` German Maglione
  0 siblings, 0 replies; 9+ messages in thread
From: German Maglione @ 2021-12-07 15:41 UTC (permalink / raw)
  To: Vivek Goyal; +Cc: virtio-fs-list, Hanna Reitz, Miklos Szeredi

[-- Attachment #1: Type: text/plain, Size: 13459 bytes --]

I just want to add the link to the issue, since part of the discussion is
being held there:
https://gitlab.com/virtio-fs/virtiofsd-rs/-/issues/17


On Fri, Dec 3, 2021 at 3:09 PM Vivek Goyal <vgoyal@redhat.com> wrote:

> On Fri, Dec 03, 2021 at 10:39:25AM +0100, Hanna Reitz wrote:
> > On 02.12.21 21:14, Vivek Goyal wrote:
> > > On Thu, Dec 02, 2021 at 06:52:17PM +0100, Hanna Reitz wrote:
> > > > On 02.12.21 16:51, Vivek Goyal wrote:
> > > > > On Thu, Dec 02, 2021 at 04:03:20PM +0100, German Maglione wrote:
> > > > > > On Wed, Dec 1, 2021 at 11:10 PM Vivek Goyal <vgoyal@redhat.com>
> wrote:
> > > > > >
> > > > > > > On Wed, Dec 01, 2021 at 01:06:23PM +0100, German Maglione
> wrote:
> > > > > > > > Hi,
> > > > > > > >
> > > > > > > > I was working on [1] (related to [2]), and I saw that both
> versions
> > > > > > > > (C and rust) of virtiofsd make the NFs client to "silly
> rename" an open
> > > > > > > > file that were unlinked, because we keep each file open as
> O_PATH in the
> > > > > > > > lo_do_lookup/do_lookup function. David pointed me to this
> problem, and I
> > > > > > > > confirmed that this is the case.
> > > > > > > >
> > > > > > > > This fires the silly rename in the nfs client. I'm talking
> with
> > > > > > > > Bruce Fields (nfs team) to see how to move the silly rename
> functionality
> > > > > > > > to the nfs server and outside the directory tree [4], to
> avoid having
> > > > > > > > non-really-empty
> > > > > > > > dirs full of .nfsxxx files. But it is not an easy task.
> > > > > > > > Also, this will not solve some potential issues with the
> resource usage:
> > > > > > > > disk space and open file limits (I hit this limit a couple
> of times
> > > > > > > during
> > > > > > > > my
> > > > > > > > tests). And, in some cases could be worst, more than one
> guest sharing
> > > > > > > the
> > > > > > > > same
> > > > > > > > exported dir, one guest just 'ls' or 'find' while the other
> 'rm' some
> > > > > > > files.
> > > > > > > > (The 'find' command will open all files, btw)
> > > > > > > >
> > > > > > > > Vivek, I saw in [5] that you mentioned that this could be
> fixed
> > > > > > > introducing
> > > > > > > > synchronous, could you elaborate a bit or point me to some
> code?
> > > > > > > Hi German,
> > > > > > >
> > > > > > > Right now forget messages are asynchronous. They are sent to
> server and
> > > > > > > client does not wait for any reply. That means when unlink()
> returns,
> > > > > > > it is possible that a FORGET message is in progress and has
> not finished
> > > > > > > yet.
> > > > > > >
> > > > > > > Idea behind synchronous FORGET messages is that it will
> generate a reply
> > > > > > > and client will wait for it. Given inode on server should be
> gone,
> > > > > > > I am not sure how much sense does it make. But anyway
> conceputaully
> > > > > > > that's the idea. If we want for FORGET message to finish, that
> will
> > > > > > > mean that O_PATH fd opened by virtiofsd is closed and we will
> not
> > > > > > > have NFS silly rename issue (atleast due to virtiofsd). If
> virtiofs
> > > > > > > client has file open, then issue will still happen.
> > > > > > >
> > > > > > > I think using file handles in virtiofsd_rs
> (--inode-file-handles) is
> > > > > > > a reasonable solution for this problem. Trying to add
> synchronous
> > > > > > > FORGET might be overkill.
> > > > > > >
> > > > > > >
> > > > > > Hi Vivek,
> > > > > >
> > > > > > Yes, as you said the solution is using --inode-file-hanldes,
> turns out
> > > > > > the problem was the --inode-file-handles failed silently when
> > > > > > choosing a sandbox other than namespace (now fixed by Hanna).
> > > > > >
> > > > > > So now the thing is, what we do if it fails? Hanna posted an
> Issue about
> > > > > > that:
> > > > > > "[RFE] Reporting failure to generate file handles".
> > > > > My take from the beginning has been that if file handle generation
> fails,
> > > > > then report it back to user (instead of falling back to O_PATH fd
> > > > > silently). That way user atleast knows that file handles can't be
> used.
> > > > I remember that we had a discussion about whether to introduce a
> mandatory
> > > > mode where this would be the behavior.  I thought we agreed that a
> > > > best-effort mode always made sense, for example for the situation you
> > > > describe below, where you have mixed filesystems, with some perhaps
> > > > supporting file handles, and others not.
> > > I think my primary objection was falling back to O_PATH fd for
> temporary
> > > failures. So file systems supports file handle but if we could not
> > > generate one due to lack of resources, I would rather return error. In
> > > viritofsd C version, we identified a case where it could lead to two
> > > inodes in hash table using different keys. I am not sure if
> virtiofsd_rs
> > > version suffers from same issue or not.
> >
> > No (I hope not), I’ve tried to incorporate your feedback on that, and so
> the
> > Rust version does distinguish between temporary and persistent
> > name_to_handle_at() failures.  (The former returns `Err(_)`, the latter
> > `Ok(None)`.  `Err(_)`s are returned to the guest.)
>
> Sounds good. Temporary failures should be returned to client.
>
> >
> > (All errors for making a file handle openable, i.e. opening a mount FD,
> are
> > still not returned to the guest and only lead to a fallback.
>
> I would return error the moment you encounter one. That seems to be theme
> of all of the virtiofsd code. While trying to perform any operation, if
> it encounters an error, it immediately returns it to caller. So opening a
> mount FD should be no different and there should not be any need for
> fallback.
>
> > I remember at
> > one point I tried looking into this code to find out where it would be
> > reasonable to return an error, and where we should fall back, and I just
> > found it very difficult to decide. It also (luckily? :)) didn’t have
> > anything to do with the bug you describe, because that was about whether
> we
> > have a handle for the lookup, and that’s now independent of the whole
> mount
> > FD mess.)
> >
> > > W.r.t what to do if filesystem does not support file handles, should
> > > we fall back to O_PATH fd (and just emit a warning), I am fine with
> > > falling back to O_PATH fd and just log it so that users know.
> > >
> > > I have both the kind of users. Some users prefer fallback and other
> > > users prefer to get an error if a certain feature can't be enabled.
> > > I guess this will slowly evolve depending what users are asking for.
> > > But logging a warning if file handles can't be enabled, sounds ok.
> > >
> > > > As for a mandatory mode, I couldn’t imagine how exactly it would be
> useful,
> > > > though.  I think my argument was: “What would a user do after
> launching
> > > > virtiofsd with file handles forced on, and then noticing they don’t
> work?”
> > > I guess they will relaunch without --inodes-file-handle.
> >
> > Yes, that’s what I thought, too, and so it mostly sounded like an
> > inconvenience to me, and not really helpful. :/
>
> Alternatively, If there is anything configurable in underlying filesystem
> (to enable file handle support), they could take an action to fix it and
> restart virtiofsd with --inodes-file-handle.
>
> >
> > > > And I still don’t really know, even though I’m proposing such a mode
> in said
> > > > virtiofsd-rs gitlab issue for the NFS case.
> > > Can you please provide a link to that issue.
> >
> > Ah, sure: https://gitlab.com/virtio-fs/virtiofsd-rs/-/issues/17
> >
> > > So why are you proposing this
> > > mode for NFS?
> >
> > Because it looks like with NFS, not using file handles may cause real
> > problems, and so I feel like failing early (failure to generate file
> > handles) is nicer than later (recursively deleting directories).
>
> Ok. One could argue that even regular file systems could cause problem
> by running into open file descriptor limit. So not sure NFS should be
> an exception.
>
> >
> > > > I suppose the answer is, check
> > > > every configuration and find out why it doesn’t work; but you don’t
> need a
> > > > mandatory file handle mode for this, logging errors whenever we need
> to fall
> > > > back to O_PATH FDs would be completely sufficient.
> > > Logging warnings should probably be good enough for lot of use cases.
> > >
> > > > (I’m mostly proposing it for NFS, because non-working file handles
> are
> > > > something that seems likely to cause other problems later. Emitting a
> > > > warning would technically be completely fine in order to inform the
> user of
> > > > this, but I feel like in this case it’d be better to nag them even
> more.)
> > > So why NFS is special? Due to silly renames issue?
> >
> > Yes.
> >
> > > So argument will be
> > > if there is a hard failure, then user can try to fix underlying
> filesystem
> > > configuration to support file handles?
> >
> > The argument is that the silly renames issue is kind of likely to produce
> > real tangible problems when file handles don’t work.  I feel like that
> > warrants more than just logging the error.
> >
> > > This will be true for other
> > > filesystems too.
> >
> > Yes, but for other filesystems, using or not using file handles seems
> like a
> > tuning thing, and not really anything that could produce tangible
> errors.
> > So to me, logging errors seems more appropriate then.
>
> Hmm..., open file descriptor limit is around 1M, IIRC. So if guest has
> cached enough inodes, one can run into it and that's the core problem
> file handle support solves.
>
> IMHO, NFS is not an exception. If we could not enable file handle support,
> both NFS and regular file systems can run into issues later.
>
> I would say I like the idea of adding argument to --inodes-file-handle and
> let user choose if they want hard failures or a fallback. By default we
> could choose "fallback" if user just says --inodes-file-handle.
>
> >
> > > In fact, that will be the argument for hard failure (instead of
> fallback).
> > > Some users will say, I can fix underlying filesystem configuration if
> > > I know virtiofsd can't use file handles. Falling back to O_PATH fd will
> > > give them false impression that configuration is alright and file
> > > handles are being used.
> > >
> > > So I can imagine both kind of users. One will prefer hard failure and
> > > other will be happy with fallback.
> >
> > Agreed!  (Which is why I’m proposing a switch to let the user decide.)
>
> Sounds good.
>
> >
> > > > > If file handles can't be generated due to lack of resources in
> system,
> > > > > then error should be returned to caller as well.
> > > > >
> > > > > > There is any problem to use file handles as default?
> > > > > It gives CAP_DAC_READ_SEARCH in init_user_ns. So enabling it by
> default
> > > > > might not be desirable. Especially given the fact that we want to
> move
> > > > > towards user namespaces and run virtiofsd with least priviliges. So
> > > > > I will think user needs to enable it if they need it.
> > > > >
> > > > > > I mean without
> > > > > > --inode-file-handles so let them fail and force the user to use
> something
> > > > > > like
> > > > > > --no-file-handles/--force-no-file-handles with a warning.
> > > > > If we were to enable it by default, we probably should test if file
> > > > > handles are supported on shared dir. If yes, then enable it by
> default
> > > > > otherwise continue to use O_PATH fd. But this will be mode switch
> for
> > > > > the whole shared filesystem.
> > > > >
> > > > > I think given we have notion of submounts and some of the
> submounted
> > > > > filesystems might not support file handles, so key question will be
> > > > > what do we do here. Do we return error in this case or fallback to
> > > > > O_PATH fd for that submount. If we stick to our design philosophy,
> > > > > then I would say return error. But some people might object because
> > > > > they might want a mode where there is mix of filesystems in shared
> > > > > dir and they want to use file handles where supported. So I am
> sitting
> > > > > on the fence on this one.
> > > > I think at this point I prefer making --inode-file-handles take an
> optional
> > > > argument:
> > > > - no: Default, don’t use file handles.
> > > > - best-effort: Try to generate file handles, fall back to FDs on
> error.
> > > Will be nice if we fallback only if filesystem does not support file
> > > handles. For temporary failures, we should return errors to callers.
> There
> > > is no good reason to fallback to O_PATH fd in that case.
> > >
> > > > - mandatory: Always use file handles, return errors to the guest.
> > > This sounds reasonable. Another naming scheme could be "no, fallback,
> > > always".
> >
> > Naming things is hard :)
>
> Agreed. I am fine with the naming scheme you have proposed. Not sure what
> will you do with "mandatory" when rootfs supports file handles but one of
> the submounts does not. You will not come to know about it at startup
> time and hard failures are better detected at startup time.
>
> Vivek
>
>

-- 
German

[-- Attachment #2: Type: text/html, Size: 16649 bytes --]

^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2021-12-07 15:41 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-12-01 12:06 [Virtio-fs] [virtiofsd][virtiofsd-rs] unlink an openfile over NFS German Maglione
2021-12-01 22:10 ` Vivek Goyal
2021-12-02 15:03   ` German Maglione
2021-12-02 15:51     ` Vivek Goyal
2021-12-02 17:52       ` Hanna Reitz
2021-12-02 20:14         ` Vivek Goyal
2021-12-03  9:39           ` Hanna Reitz
2021-12-03 14:09             ` Vivek Goyal
2021-12-07 15:41               ` German Maglione

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.