All of lore.kernel.org
 help / color / mirror / Atom feed
From: Amir Goldstein <amir73il@gmail.com>
To: Vivek Goyal <vgoyal@redhat.com>
Cc: Ioannis Angelakopoulos <iangelak@redhat.com>,
	Stef Bon <stefbon@gmail.com>, Jan Kara <jack@suse.cz>,
	linux-fsdevel <linux-fsdevel@vger.kernel.org>,
	virtio-fs-list <virtio-fs@redhat.com>,
	Al Viro <viro@zeniv.linux.org.uk>,
	Miklos Szeredi <miklos@szeredi.hu>,
	Steve French <sfrench@samba.org>,
	Nathan Youngman <git@nathany.com>,
	Ioannis Angelakopoulos <jaggel@bu.edu>
Subject: Re: [RFC PATCH 0/7] Inotify support in FUSE and virtiofs
Date: Mon, 20 Dec 2021 20:22:11 +0200	[thread overview]
Message-ID: <CAOQ4uxiHB8khGU7ti7XLXriSqMU+Wn3uMk=Vq4C2ZvzAWu_w9g@mail.gmail.com> (raw)
In-Reply-To: <YcCySSZC3ZmN8+q1@redhat.com>

On Mon, Dec 20, 2021 at 6:42 PM Vivek Goyal <vgoyal@redhat.com> wrote:
>
> On Sat, Dec 18, 2021 at 10:28:35AM +0200, Amir Goldstein wrote:
> > > > >
> > > > > > 2. For FS_RENAME, will we be able to pass 4 buffers in iov?
> > > > > >     src_fuse_notify_fsnotify_out, src_name,
> > > > > >     dst_fuse_notify_fsnotify_out, dst_name
> > > > >
> > > > > So it is sort of two fsnotify events travelling in same event. We will
> > > > > have to have some sort of information in the src_fuse_notify_fsnotify_out
> > > > > which signals that another fsnotify_out is following. May be that's
> > > > > where fsnotify_flags->field can be used. Set a bit to signal another
> > > > > fsnotify_out is part of same event and this will also mean first one
> > > > > is src and second one is dst.
> > > > >
> > > > > Challenge I see is "src_name" and "dst_name", especially in the context
> > > > > of virtio queues.
> > > > >
> > > > > So we have a notification queue and for each notification, driver
> > > > > allocates a fixed amount of memory for each element and queues these
> > > > > elements in virtqueue. Server side pops these elements, places
> > > > > notification info in this vq element and sends back.
> > > > >
> > > > > So basically size of notification buffer needs to be known in advance,
> > > > > because these are allocated by driver (and not device/server). And
> > > > > that's the reason virtio spec has added a new filed "notify_buf_size"
> > > > > in device configuration space. Using this field device lets the driver
> > > > > know how much memory to allocate for each notification element.
> > > > >
> > > > > IOW, we can put variable sized elements in notifiation but max size
> > > > > of that variable length needs to be fixed in advance and needs to
> > > > > be told to driver at device initialization time.
> > > > >
> > > > > So what can be the max length of "src_name" and "dst_name"? Is it fair
> > > > > to use NAME_MAX to determine max length of name. So say "255" bytes
> > > > > (not including null) for each name. That means notify_buf_size will
> > > > > be.
> > > > >
> > > > > notify_buf_size = 2 * 255 + 2 * sizeof(struct fuse_notify_fsnotify_out);
> > > > >
> > > >
> > > > Can you push two subsequent elements to the events queue atomically?
> > > > If you can, then it is not a problem to queue the FS_MOVED_FROM
> > > > event (with one name) followed by FS_MOVED_TO event with
> > > > a self generated cookie in response to FAN_RENAME event on virtiofsd
> > > > server and rejoin them in virtiofs client.
> > >
> > > Hmm..., so basically break down FAN_RENAME event into two events joined
> > > by cookie and send them separately. I guess sounds reasonable because
> > > it helps reduce the max size of event.
> > >
> > > What do you mean by "atomically"? Do you mean strict ordering and these
> > > two events are right after each other. But if they are joined by cookie,
> > > we don't have to enforce it. Driver should be able to maintain an internal
> > > list and queue the event and wait for pair event to arrive. This also
> >
> > This is what I would consider repeating mistakes of past APIs (i.e. inotify).
> > Why should the driver work hard to join events that were already joint before
> > the queue? Is there really a technical challenge in queueing the two events
> > together?
>
> We can try queuing these together but it might not be that easy. If two
> elements are not available at the time of the queuing, then you will
> to let go the lock, put that element back in the queue and retry later.
>
> To me, it is much simpler and more flexible to not guarantee strict
> ordering and let the events be joined by cookie. BTW, we are using
> cookie anyway. So strict ordering should not be required.
>
> All I am saying is that implementation can still choose to send two
> events together one after the other, but this probably should not be
> a requirement on the part of the protocol.
>
> So what's the concern with joining the event with cookie API? I am
> not aware what went wrong in the past.
>
> One thing simpler with atomic event is that if second event does not
> come right away, we can probably discard first event saying some
> error happened. But if we join them by cookie and second event does
> not come, its not clear how do to error handling and how long guest
> driver should wait for second event to arrive.
>
> With the notion of enforcing atomicity, I am concerned about some
> deadlock/starvation possibility where you can't get two elements
> together at all. For example, some guest driver decides to instantiate
> queue only with 1 element.
>
> So is it doable, it most likely is. Still, I am not sure we should try to
> enforce atomicity. Apart from error handling, I am unable to seek
> what other issues exist with non-atomic joined events. Maybe you can
> help me understand better why requiring atomicity is a better option.
>

As UAPI the cookie interface is non reliable and users don't know how long
they would need to wait for the other part and how many rename states
to maintain and generally, most applications got it wrong and assumed to
many things about ordering and such.

As an internal virtio protocol you can do whatever ends up being more
easy for you as long as you are able to call the vfs fsnotify() API with
both src and dst dirs.

My feeling is that it is easier to queue two elements together.
I think it is possible to prevent starvation, by some reservation.
For example, if you allocate N+1 slots, but treat the queue as
"full" when there are N elements in the queue -
only a 2-elements push can dip into the +1 reservation if the
queue is not "full".

You can try this way or the other and choose for yourself.

Thanks,
Amir.

WARNING: multiple messages have this Message-ID (diff)
From: Amir Goldstein <amir73il@gmail.com>
To: Vivek Goyal <vgoyal@redhat.com>
Cc: Steve French <sfrench@samba.org>, Jan Kara <jack@suse.cz>,
	Miklos Szeredi <miklos@szeredi.hu>,
	Nathan Youngman <git@nathany.com>,
	virtio-fs-list <virtio-fs@redhat.com>,
	Al Viro <viro@zeniv.linux.org.uk>,
	linux-fsdevel <linux-fsdevel@vger.kernel.org>,
	Stef Bon <stefbon@gmail.com>
Subject: Re: [Virtio-fs] [RFC PATCH 0/7] Inotify support in FUSE and virtiofs
Date: Mon, 20 Dec 2021 20:22:11 +0200	[thread overview]
Message-ID: <CAOQ4uxiHB8khGU7ti7XLXriSqMU+Wn3uMk=Vq4C2ZvzAWu_w9g@mail.gmail.com> (raw)
In-Reply-To: <YcCySSZC3ZmN8+q1@redhat.com>

On Mon, Dec 20, 2021 at 6:42 PM Vivek Goyal <vgoyal@redhat.com> wrote:
>
> On Sat, Dec 18, 2021 at 10:28:35AM +0200, Amir Goldstein wrote:
> > > > >
> > > > > > 2. For FS_RENAME, will we be able to pass 4 buffers in iov?
> > > > > >     src_fuse_notify_fsnotify_out, src_name,
> > > > > >     dst_fuse_notify_fsnotify_out, dst_name
> > > > >
> > > > > So it is sort of two fsnotify events travelling in same event. We will
> > > > > have to have some sort of information in the src_fuse_notify_fsnotify_out
> > > > > which signals that another fsnotify_out is following. May be that's
> > > > > where fsnotify_flags->field can be used. Set a bit to signal another
> > > > > fsnotify_out is part of same event and this will also mean first one
> > > > > is src and second one is dst.
> > > > >
> > > > > Challenge I see is "src_name" and "dst_name", especially in the context
> > > > > of virtio queues.
> > > > >
> > > > > So we have a notification queue and for each notification, driver
> > > > > allocates a fixed amount of memory for each element and queues these
> > > > > elements in virtqueue. Server side pops these elements, places
> > > > > notification info in this vq element and sends back.
> > > > >
> > > > > So basically size of notification buffer needs to be known in advance,
> > > > > because these are allocated by driver (and not device/server). And
> > > > > that's the reason virtio spec has added a new filed "notify_buf_size"
> > > > > in device configuration space. Using this field device lets the driver
> > > > > know how much memory to allocate for each notification element.
> > > > >
> > > > > IOW, we can put variable sized elements in notifiation but max size
> > > > > of that variable length needs to be fixed in advance and needs to
> > > > > be told to driver at device initialization time.
> > > > >
> > > > > So what can be the max length of "src_name" and "dst_name"? Is it fair
> > > > > to use NAME_MAX to determine max length of name. So say "255" bytes
> > > > > (not including null) for each name. That means notify_buf_size will
> > > > > be.
> > > > >
> > > > > notify_buf_size = 2 * 255 + 2 * sizeof(struct fuse_notify_fsnotify_out);
> > > > >
> > > >
> > > > Can you push two subsequent elements to the events queue atomically?
> > > > If you can, then it is not a problem to queue the FS_MOVED_FROM
> > > > event (with one name) followed by FS_MOVED_TO event with
> > > > a self generated cookie in response to FAN_RENAME event on virtiofsd
> > > > server and rejoin them in virtiofs client.
> > >
> > > Hmm..., so basically break down FAN_RENAME event into two events joined
> > > by cookie and send them separately. I guess sounds reasonable because
> > > it helps reduce the max size of event.
> > >
> > > What do you mean by "atomically"? Do you mean strict ordering and these
> > > two events are right after each other. But if they are joined by cookie,
> > > we don't have to enforce it. Driver should be able to maintain an internal
> > > list and queue the event and wait for pair event to arrive. This also
> >
> > This is what I would consider repeating mistakes of past APIs (i.e. inotify).
> > Why should the driver work hard to join events that were already joint before
> > the queue? Is there really a technical challenge in queueing the two events
> > together?
>
> We can try queuing these together but it might not be that easy. If two
> elements are not available at the time of the queuing, then you will
> to let go the lock, put that element back in the queue and retry later.
>
> To me, it is much simpler and more flexible to not guarantee strict
> ordering and let the events be joined by cookie. BTW, we are using
> cookie anyway. So strict ordering should not be required.
>
> All I am saying is that implementation can still choose to send two
> events together one after the other, but this probably should not be
> a requirement on the part of the protocol.
>
> So what's the concern with joining the event with cookie API? I am
> not aware what went wrong in the past.
>
> One thing simpler with atomic event is that if second event does not
> come right away, we can probably discard first event saying some
> error happened. But if we join them by cookie and second event does
> not come, its not clear how do to error handling and how long guest
> driver should wait for second event to arrive.
>
> With the notion of enforcing atomicity, I am concerned about some
> deadlock/starvation possibility where you can't get two elements
> together at all. For example, some guest driver decides to instantiate
> queue only with 1 element.
>
> So is it doable, it most likely is. Still, I am not sure we should try to
> enforce atomicity. Apart from error handling, I am unable to seek
> what other issues exist with non-atomic joined events. Maybe you can
> help me understand better why requiring atomicity is a better option.
>

As UAPI the cookie interface is non reliable and users don't know how long
they would need to wait for the other part and how many rename states
to maintain and generally, most applications got it wrong and assumed to
many things about ordering and such.

As an internal virtio protocol you can do whatever ends up being more
easy for you as long as you are able to call the vfs fsnotify() API with
both src and dst dirs.

My feeling is that it is easier to queue two elements together.
I think it is possible to prevent starvation, by some reservation.
For example, if you allocate N+1 slots, but treat the queue as
"full" when there are N elements in the queue -
only a 2-elements push can dip into the +1 reservation if the
queue is not "full".

You can try this way or the other and choose for yourself.

Thanks,
Amir.


  reply	other threads:[~2021-12-20 18:22 UTC|newest]

Thread overview: 137+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-10-25 20:46 [RFC PATCH 0/7] Inotify support in FUSE and virtiofs Ioannis Angelakopoulos
2021-10-25 20:46 ` [Virtio-fs] " Ioannis Angelakopoulos
2021-10-25 20:46 ` [RFC PATCH 1/7] FUSE: Add the fsnotify opcode and in/out structs to FUSE Ioannis Angelakopoulos
2021-10-25 20:46   ` [Virtio-fs] " Ioannis Angelakopoulos
2021-10-26 14:56   ` Amir Goldstein
2021-10-26 14:56     ` [Virtio-fs] " Amir Goldstein
2021-10-26 18:28     ` Amir Goldstein
2021-10-26 18:28       ` [Virtio-fs] " Amir Goldstein
2021-10-26 20:47     ` Ioannis Angelakopoulos
2021-10-27  5:46       ` Amir Goldstein
2021-10-27  5:46         ` [Virtio-fs] " Amir Goldstein
2021-10-27 21:46     ` Vivek Goyal
2021-10-27 21:46       ` [Virtio-fs] " Vivek Goyal
2021-10-28  4:13       ` Amir Goldstein
2021-10-28  4:13         ` [Virtio-fs] " Amir Goldstein
2021-10-28 14:20         ` Vivek Goyal
2021-10-28 14:20           ` [Virtio-fs] " Vivek Goyal
2021-10-25 20:46 ` [RFC PATCH 2/7] FUSE: Add the remote inotify support capability " Ioannis Angelakopoulos
2021-10-25 20:46   ` [Virtio-fs] " Ioannis Angelakopoulos
2021-10-25 20:46 ` [RFC PATCH 3/7] FUSE,Inotify,Fsnotify,VFS: Add the fuse_fsnotify_update_mark inode operation Ioannis Angelakopoulos
2021-10-25 20:46   ` [Virtio-fs] [RFC PATCH 3/7] FUSE, Inotify, Fsnotify, VFS: " Ioannis Angelakopoulos
2021-10-26 15:06   ` [RFC PATCH 3/7] FUSE,Inotify,Fsnotify,VFS: " Amir Goldstein
2021-10-26 15:06     ` [Virtio-fs] [RFC PATCH 3/7] FUSE, Inotify, Fsnotify, VFS: " Amir Goldstein
2021-11-01 17:49     ` [RFC PATCH 3/7] FUSE,Inotify,Fsnotify,VFS: " Vivek Goyal
2021-11-01 17:49       ` [Virtio-fs] [RFC PATCH 3/7] FUSE, Inotify, Fsnotify, VFS: " Vivek Goyal
2021-11-02  7:34       ` [RFC PATCH 3/7] FUSE,Inotify,Fsnotify,VFS: " Amir Goldstein
2021-11-02  7:34         ` [Virtio-fs] [RFC PATCH 3/7] FUSE, Inotify, Fsnotify, VFS: " Amir Goldstein
2021-10-25 20:46 ` [RFC PATCH 4/7] FUSE: Add the fuse_fsnotify_send_request to FUSE Ioannis Angelakopoulos
2021-10-25 20:46   ` [Virtio-fs] " Ioannis Angelakopoulos
2021-10-25 20:46 ` [RFC PATCH 5/7] Fsnotify: Add a wrapper around the fsnotify function Ioannis Angelakopoulos
2021-10-25 20:46   ` [Virtio-fs] " Ioannis Angelakopoulos
2021-10-26 14:37   ` Amir Goldstein
2021-10-26 14:37     ` [Virtio-fs] " Amir Goldstein
2021-10-26 15:38     ` Vivek Goyal
2021-10-26 15:38       ` [Virtio-fs] " Vivek Goyal
2021-10-25 20:46 ` [RFC PATCH 6/7] FUSE,Fsnotify: Add the fuse_fsnotify_event inode operation Ioannis Angelakopoulos
2021-10-25 20:46   ` [Virtio-fs] [RFC PATCH 6/7] FUSE, Fsnotify: " Ioannis Angelakopoulos
2021-10-25 20:46 ` [RFC PATCH 7/7] virtiofs: Add support for handling the remote fsnotify notifications Ioannis Angelakopoulos
2021-10-25 20:46   ` [Virtio-fs] " Ioannis Angelakopoulos
2021-10-26 15:23 ` [RFC PATCH 0/7] Inotify support in FUSE and virtiofs Amir Goldstein
2021-10-26 15:23   ` [Virtio-fs] " Amir Goldstein
2021-10-26 15:52   ` Vivek Goyal
2021-10-26 15:52     ` [Virtio-fs] " Vivek Goyal
2021-10-26 18:19     ` Amir Goldstein
2021-10-26 18:19       ` [Virtio-fs] " Amir Goldstein
2021-10-26 16:18   ` Vivek Goyal
2021-10-26 16:18     ` [Virtio-fs] " Vivek Goyal
2021-10-26 17:59     ` Amir Goldstein
2021-10-26 17:59       ` [Virtio-fs] " Amir Goldstein
2021-10-26 18:27       ` Vivek Goyal
2021-10-26 18:27         ` [Virtio-fs] " Vivek Goyal
2021-10-26 19:04         ` Amir Goldstein
2021-10-26 19:04           ` [Virtio-fs] " Amir Goldstein
2021-10-26 19:14         ` Ioannis Angelakopoulos
2021-10-27  5:59           ` Amir Goldstein
2021-10-27  5:59             ` [Virtio-fs] " Amir Goldstein
2021-10-27 13:23             ` Jan Kara
2021-10-27 13:23               ` [Virtio-fs] " Jan Kara
2021-10-27 20:29               ` Vivek Goyal
2021-10-27 20:29                 ` [Virtio-fs] " Vivek Goyal
2021-10-27 20:37                 ` Vivek Goyal
2021-10-27 20:37                   ` [Virtio-fs] " Vivek Goyal
2021-11-02 11:09                 ` Jan Kara
2021-11-02 11:09                   ` [Virtio-fs] " Jan Kara
2021-11-02 12:54                   ` Amir Goldstein
2021-11-02 12:54                     ` [Virtio-fs] " Amir Goldstein
2021-11-02 20:34                     ` Vivek Goyal
2021-11-02 20:34                       ` [Virtio-fs] " Vivek Goyal
2021-11-03  7:31                       ` Amir Goldstein
2021-11-03  7:31                         ` [Virtio-fs] " Amir Goldstein
2021-11-03 22:29                         ` Vivek Goyal
2021-11-03 22:29                           ` [Virtio-fs] " Vivek Goyal
2021-11-04  5:19                           ` Amir Goldstein
2021-11-04  5:19                             ` [Virtio-fs] " Amir Goldstein
2021-11-03 10:09                     ` Jan Kara
2021-11-03 10:09                       ` [Virtio-fs] " Jan Kara
2021-11-03 11:17                       ` Amir Goldstein
2021-11-03 11:17                         ` [Virtio-fs] " Amir Goldstein
2021-11-03 22:36                         ` Vivek Goyal
2021-11-03 22:36                           ` [Virtio-fs] " Vivek Goyal
2021-11-04  5:29                           ` Amir Goldstein
2021-11-04  5:29                             ` [Virtio-fs] " Amir Goldstein
2021-11-04 10:03                           ` Jan Kara
2021-11-04 10:03                             ` [Virtio-fs] " Jan Kara
2021-11-05 14:30                             ` Vivek Goyal
2021-11-05 14:30                               ` [Virtio-fs] " Vivek Goyal
2021-11-10  6:28                               ` Amir Goldstein
2021-11-10  6:28                                 ` [Virtio-fs] " Amir Goldstein
2021-11-11 17:30                                 ` Jan Kara
2021-11-11 17:30                                   ` [Virtio-fs] " Jan Kara
2021-11-11 20:52                                   ` Amir Goldstein
2021-11-11 20:52                                     ` [Virtio-fs] " Amir Goldstein
2021-11-16  5:09                                     ` Stef Bon
2021-11-16  5:09                                       ` [Virtio-fs] " Stef Bon
2021-11-16 18:00                                       ` Ioannis Angelakopoulos
2021-11-16 22:12                                       ` Ioannis Angelakopoulos
2021-11-17  6:40                                         ` Amir Goldstein
2021-11-17  6:40                                           ` [Virtio-fs] " Amir Goldstein
2021-11-30 15:27                                           ` Vivek Goyal
2021-11-30 15:27                                             ` [Virtio-fs] " Vivek Goyal
2021-12-14 23:21                                             ` Ioannis Angelakopoulos
2021-12-15  7:10                                               ` Amir Goldstein
2021-12-15  7:10                                                 ` [Virtio-fs] " Amir Goldstein
2021-12-15 16:44                                                 ` Vivek Goyal
2021-12-15 16:44                                                   ` [Virtio-fs] " Vivek Goyal
2021-12-15 17:29                                                   ` Amir Goldstein
2021-12-15 17:29                                                     ` [Virtio-fs] " Amir Goldstein
2021-12-15 19:20                                                     ` Vivek Goyal
2021-12-15 19:20                                                       ` [Virtio-fs] " Vivek Goyal
2021-12-15 19:54                                                       ` Amir Goldstein
2021-12-15 19:54                                                         ` [Virtio-fs] " Amir Goldstein
2021-12-16 11:03                                                         ` Amir Goldstein
2021-12-16 11:03                                                           ` [Virtio-fs] " Amir Goldstein
2021-12-16 16:24                                                           ` Vivek Goyal
2021-12-16 16:24                                                             ` [Virtio-fs] " Vivek Goyal
2021-12-16 18:22                                                             ` Amir Goldstein
2021-12-16 18:22                                                               ` [Virtio-fs] " Amir Goldstein
2021-12-16 22:24                                                               ` Vivek Goyal
2021-12-16 22:24                                                                 ` [Virtio-fs] " Vivek Goyal
2021-12-17  4:21                                                                 ` Amir Goldstein
2021-12-17  4:21                                                                   ` [Virtio-fs] " Amir Goldstein
2021-12-17 14:15                                                                   ` Vivek Goyal
2021-12-17 14:15                                                                     ` [Virtio-fs] " Vivek Goyal
2021-12-18  8:28                                                                     ` Amir Goldstein
2021-12-18  8:28                                                                       ` [Virtio-fs] " Amir Goldstein
2021-12-20 16:41                                                                       ` Vivek Goyal
2021-12-20 16:41                                                                         ` [Virtio-fs] " Vivek Goyal
2021-12-20 18:22                                                                         ` Amir Goldstein [this message]
2021-12-20 18:22                                                                           ` Amir Goldstein
2022-01-06 23:37                                             ` Steve French
2022-01-06 23:37                                               ` [Virtio-fs] " Steve French
2021-11-30 15:36                                       ` Vivek Goyal
2021-11-30 15:36                                         ` [Virtio-fs] " Vivek Goyal
2021-10-27 20:24             ` Vivek Goyal
2021-10-27 20:24               ` [Virtio-fs] " Vivek Goyal
2021-10-28  5:11               ` Amir Goldstein
2021-10-28  5:11                 ` [Virtio-fs] " Amir Goldstein

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CAOQ4uxiHB8khGU7ti7XLXriSqMU+Wn3uMk=Vq4C2ZvzAWu_w9g@mail.gmail.com' \
    --to=amir73il@gmail.com \
    --cc=git@nathany.com \
    --cc=iangelak@redhat.com \
    --cc=jack@suse.cz \
    --cc=jaggel@bu.edu \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=miklos@szeredi.hu \
    --cc=sfrench@samba.org \
    --cc=stefbon@gmail.com \
    --cc=vgoyal@redhat.com \
    --cc=viro@zeniv.linux.org.uk \
    --cc=virtio-fs@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.