From: Miklos Szeredi <miklos@szeredi.hu>
To: Amir Goldstein <amir73il@gmail.com>
Cc: Vivek Goyal <vgoyal@redhat.com>,
	overlayfs <linux-unionfs@vger.kernel.org>,
	linux-fsdevel <linux-fsdevel@vger.kernel.org>,
	Max Reitz <mreitz@redhat.com>
Subject: Re: virtiofs uuid and file handles
Date: Mon, 31 May 2021 16:11:45 +0200
Message-ID: <CAJfpeguOLLV94Bzs7_JNOdZZ+6p-tcP7b1PXrQY4qWPxXKosnA@mail.gmail.com>
In-Reply-To: <CAOQ4uxjNcWCfKLvdq2=TM5fE5RaBf+XvnsP6v_Q6u3b1_mxazw@mail.gmail.com>

On Sat, 29 May 2021 at 18:05, Amir Goldstein <amir73il@gmail.com> wrote:
>
> On Wed, Sep 23, 2020 at 2:12 PM Miklos Szeredi <miklos@szeredi.hu> wrote:
> >
> > On Wed, Sep 23, 2020 at 11:57 AM Amir Goldstein <amir73il@gmail.com> wrote:
> > >
> > > On Wed, Sep 23, 2020 at 10:44 AM Miklos Szeredi <miklos@szeredi.hu> wrote:
> > > >
> > > > On Wed, Sep 23, 2020 at 4:49 AM Amir Goldstein <amir73il@gmail.com> wrote:
> > > >
> > > > > > I think that the proper way to implement reliable persistent file
> > > > > handles in fuse/virtiofs would be to add ENCODE/DECODE to
> > > > > FUSE protocol and allow the server to handle this.
> > > >
> > > > Max Reitz (Cc-d) is currently looking into this.
> > > >
> > > > > One proposal was to add a LOOKUP_HANDLE operation that is similar to
> > > > LOOKUP except it takes a {variable length handle, name} as input and
> > > > returns a variable length handle *and* a u64 node_id that can be used
> > > > normally for all other operations.
> > > >
>
> Miklos, Max,
>
> Any updates on LOOKUP_HANDLE work?
>
> > > > The advantage of such a scheme for virtio-fs (and possibly other fuse
> > > > based fs) would be that userspace need not keep a refcounted object
> > > > around until the kernel sends a FORGET, but can prune its node ID
> > > > based cache at any time.   If that happens and a request from the
> > > > client (kernel) comes in with a stale node ID, the server will return
> > > > -ESTALE and the client can ask for a new node ID with a special
> > > > lookup_handle(fh, NULL).
> > > >
> > > > Disadvantages being:
> > > >
> > > >  - cost of generating a file handle on all lookups
> > >
> > > I never ran into a local fs implementation where this was expensive.
> > >
> > > >  - cost of storing file handle in kernel icache
> > > >
> > > > I don't think either of those are problematic in the virtiofs case.
> > > > The cost of having to keep fds open while the client has them in its
> > > > cache is much higher.
> > > >
> > >
> > > Sounds good.
> > > I suppose flock() does need to keep the open fd on server.
> >
> > Open files are a separate issue and do need an active object in the server.
> >
> > > The issue this solves is synchronizing "released" and "evicted"
> > > states of objects between server and client.  I.e. when a file is
> > > closed (and no more open files exist referencing the same object) the
> > > dentry refcount goes to zero but it remains in the cache.  In this
> > > state the server could really evict its own cached object, but can't
> > > because the client can gain an active reference at any time via
> > > cached path lookup.
> >
> > One other solution would be for the server to send a notification
> > (NOTIFY_EVICT) that would try to clean out the object from the server
> > cache and respond with a FORGET if successful.   But I sort of like
> > the file handle one better, since it solves multiple problems.
> >
>
> Even with LOOKUP_HANDLE, I am struggling to understand how we
> intend to invalidate all fuse dentries referring to ino X in case the server
> replies with reused ino X with a different generation than the one stored
> in fuse inode cache.
>
> This is an issue that I encountered when running the passthrough_hp test
> on my filesystem. In tst_readdir_big() for example, underlying files are being
> unlinked and new files created reusing the old inode numbers.
>
> This creates a situation where server gets a lookup request
> for file B that uses the reused inode number X, while old file A is
> still in fuse dentry cache using the older generation of real inode
> number X which is still in fuse inode cache.
>
> Now the server knows that the real inode has been reused, because
> the server caches the old generation value, but it cannot reply to
> the lookup request before the old fuse inode has been invalidated.
> IIUC, fuse_lowlevel_notify_inval_inode() is not enough(?).
> We would also need to change fuse_dentry_revalidate() to
> detect the case of reused/invalidated inode.
>
> The straightforward way I can think of is to store inode generation
> in fuse_dentry. It won't even grow the size of the struct.
>
> Am I overcomplicating this?

In this scheme the generation number is already embedded in the file
handle.  If LOOKUP_HANDLE returns a nodeid that can be found in the
icache, but which doesn't match the new file handle, then the old
inode will be marked bad and a new one allocated.
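
A minimal sketch of that matching rule in plain userspace C, not the
actual fuse kernel code.  Everything here is hypothetical and for
illustration only: the handle layout (inode number plus generation),
the fixed-size table standing in for the icache, and all names
(file_handle_sketch, inode_sketch, icache_get, make_handle).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define HANDLE_MAX   32 /* hypothetical upper bound on handle size */
#define ICACHE_SLOTS 8  /* tiny fixed table standing in for the icache */

/* Opaque handle as the client would store it. */
struct file_handle_sketch {
	size_t len;
	unsigned char bytes[HANDLE_MAX];
};

struct inode_sketch {
	bool in_use;
	bool bad;       /* set once the nodeid is seen with another handle */
	uint64_t nodeid;
	struct file_handle_sketch fh;
};

static struct inode_sketch icache[ICACHE_SLOTS];

/* Assumed server-side encoding: inode number followed by generation. */
static struct file_handle_sketch make_handle(uint64_t ino, uint64_t generation)
{
	struct file_handle_sketch fh;

	assert(sizeof(ino) + sizeof(generation) <= HANDLE_MAX);
	memset(&fh, 0, sizeof(fh));
	memcpy(fh.bytes, &ino, sizeof(ino));
	memcpy(fh.bytes + sizeof(ino), &generation, sizeof(generation));
	fh.len = sizeof(ino) + sizeof(generation);
	return fh;
}

static bool handle_equal(const struct file_handle_sketch *a,
			 const struct file_handle_sketch *b)
{
	return a->len == b->len && memcmp(a->bytes, b->bytes, a->len) == 0;
}

/*
 * Handle a (nodeid, handle) pair as it might come back from a lookup.
 * A cached inode with the same nodeid and the same handle is reused.
 * A cached inode with the same nodeid but a different handle is marked
 * bad and a fresh entry is allocated.  Returns NULL if the table is full.
 */
static struct inode_sketch *icache_get(uint64_t nodeid,
				       const struct file_handle_sketch *fh)
{
	struct inode_sketch *free_slot = NULL;

	for (size_t i = 0; i < ICACHE_SLOTS; i++) {
		struct inode_sketch *inode = &icache[i];

		if (!inode->in_use) {
			if (!free_slot)
				free_slot = inode;
			continue;
		}
		if (inode->bad || inode->nodeid != nodeid)
			continue;
		if (handle_equal(&inode->fh, fh))
			return inode;
		inode->bad = true; /* nodeid reused for a different object */
	}

	if (!free_slot)
		return NULL;
	free_slot->in_use = true;
	free_slot->bad = false;
	free_slot->nodeid = nodeid;
	free_slot->fh = *fh;
	return free_slot;
}
```

With this sketch, looking up nodeid 7 with make_handle(42, 1) and then
with make_handle(42, 2) leaves the first entry marked bad and returns a
distinct second entry, so anything still pointing at the old entry can
notice it is stale.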

Does that answer your worries?  Or am I missing something?

Thanks,
Miklos
