* [PATCH 0/8] virtiofs: Notification queue and blocking posix locks
@ 2021-09-30 14:38 ` Vivek Goyal
  0 siblings, 0 replies; 42+ messages in thread
From: Vivek Goyal @ 2021-09-30 14:38 UTC (permalink / raw)
  To: linux-fsdevel, virtio-fs, miklos, stefanha
  Cc: vgoyal, iangelak, jaggel, dgilbert

Hi,

As of now we do not support blocking remote posix locks with virtiofs.
The fuse client does not care either way, but the server returns -EOPNOTSUPP.

There are a couple of reasons why this is not supported yet.

- If virtiofsd is single threaded, or does not have a thread pool dedicated
  to requests which can block for a long time, virtiofsd will stop
  processing new requests and virtiofs will come to a halt. In particular,
  a subsequent unlock request will not make progress and a deadlock will
  result. This can be taken care of by creating a custom thread pool in
  virtiofsd just to handle lock requests.

- If the client sends a blocking lock request and blocks, it will consume
  descriptors in the vring. If enough processes block, it is possible that
  the vring does not have the capacity to send more requests until some
  response comes back and descriptors are freed. This can also lead to a
  deadlock where an unlock request can't be sent to virtiofsd. It also
  stalls virtiofs as a whole, since new filesystem requests can't be sent
  either.

To avoid this issue, the idea was suggested that when a blocking
lock request is sent by the client, the server does not block on it.
Instead it immediately sends a reply saying the client process should
wait for a notification, which will let it know once the lock is
available. This makes sure descriptors in the virtqueue are not kept
busy while we are waiting for the lock, and future unlock and other
file system requests can continue to make progress.
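
A rough sketch of the intended guest-side flow (illustrative only; the
helper names below are made up and the actual reply and notification
formats are defined by the later patches in this series):

	/* Process issues a blocking lock request (e.g. FUSE_SETLKW) */
	send_lock_request(req);

	/* Server replies immediately: "not granted yet, wait for a
	 * notification". The virtqueue descriptors are freed here.
	 */
	if (reply_means_wait(req))
		wait_for_lock_notification(req);	/* hypothetical helper */

	/* Later, the server sends a notification on the notification
	 * queue once the lock is granted, and the waiting process is
	 * woken up.
	 */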

This first requires a notion of a notification queue and virtiofsd
being able to send notifications to the client. This patch series
implements that as well.

As of now only one notification type has been implemented, but the
infrastructure is in place and other use cases should be able to add
more notification types easily as needed.

We don't yet have the capability to interrupt a process which is
waiting for the posix lock, because virtiofs does not support
interrupting requests yet. That's a TODO item for later.

Please have a look.

Thanks
Vivek

Vivek Goyal (8):
  virtiofs: Disable interrupt requests properly
  virtiofs: Fix a comment about fuse_dev allocation
  virtiofs: Add an index to keep track of first request queue
  virtiofs: Decouple queue index and queue type
  virtiofs: Add a virtqueue for notifications
  virtiofs: Add a helper to end request and decrement inflight number
  virtiofs: Add new notification type FUSE_NOTIFY_LOCK
  virtiofs: Handle reordering of reply and notification event

 fs/fuse/virtio_fs.c            | 438 ++++++++++++++++++++++++++++++---
 include/uapi/linux/fuse.h      |  11 +-
 include/uapi/linux/virtio_fs.h |   5 +
 3 files changed, 412 insertions(+), 42 deletions(-)

-- 
2.31.1


^ permalink raw reply	[flat|nested] 42+ messages in thread

* [PATCH 1/8] virtiofs: Disable interrupt requests properly
  2021-09-30 14:38 ` [Virtio-fs] " Vivek Goyal
@ 2021-09-30 14:38   ` Vivek Goyal
  -1 siblings, 0 replies; 42+ messages in thread
From: Vivek Goyal @ 2021-09-30 14:38 UTC (permalink / raw)
  To: linux-fsdevel, virtio-fs, miklos, stefanha
  Cc: vgoyal, iangelak, jaggel, dgilbert

virtiofs does not support dealing with fuse INTERRUPT requests at all.
But we still set and clear the FR_SENT bit, which is needed only if
INTERRUPT requests are being handled.

Also, with the current code it is possible that while the virtiofs server
is handling a request, a signal arrives in the guest; this wakes up the
process and queues the existing request on fiq->interrupts, where it is
never removed.

request_wait_answer()
{
	if (!fc->no_interrupt) {
		if (test_bit(FR_SENT, &req->flags))
			queue_interrupt(req);
	}
}

Given that virtiofs does not support interrupt requests at this point in
time, disable them (set fc->no_interrupt = 1). This makes sure requests
can't be queued on fiq->interrupts.

Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
---
 fs/fuse/virtio_fs.c | 9 +--------
 1 file changed, 1 insertion(+), 8 deletions(-)

diff --git a/fs/fuse/virtio_fs.c b/fs/fuse/virtio_fs.c
index 0ad89c6629d7..b9256b8f277f 100644
--- a/fs/fuse/virtio_fs.c
+++ b/fs/fuse/virtio_fs.c
@@ -545,7 +545,6 @@ static void copy_args_from_argbuf(struct fuse_args *args, struct fuse_req *req)
 static void virtio_fs_request_complete(struct fuse_req *req,
 				       struct virtio_fs_vq *fsvq)
 {
-	struct fuse_pqueue *fpq = &fsvq->fud->pq;
 	struct fuse_args *args;
 	struct fuse_args_pages *ap;
 	unsigned int len, i, thislen;
@@ -574,10 +573,6 @@ static void virtio_fs_request_complete(struct fuse_req *req,
 		}
 	}
 
-	spin_lock(&fpq->lock);
-	clear_bit(FR_SENT, &req->flags);
-	spin_unlock(&fpq->lock);
-
 	fuse_request_end(req);
 	spin_lock(&fsvq->lock);
 	dec_in_flight_req(fsvq);
@@ -1196,9 +1191,6 @@ static int virtio_fs_enqueue_req(struct virtio_fs_vq *fsvq,
 	spin_lock(&fpq->lock);
 	list_add_tail(&req->list, fpq->processing);
 	spin_unlock(&fpq->lock);
-	set_bit(FR_SENT, &req->flags);
-	/* matches barrier in request_wait_answer() */
-	smp_mb__after_atomic();
 
 	if (!in_flight)
 		inc_in_flight_req(fsvq);
@@ -1448,6 +1440,7 @@ static int virtio_fs_get_tree(struct fs_context *fsc)
 	fc->delete_stale = true;
 	fc->auto_submounts = true;
 	fc->sync_fs = true;
+	fc->no_interrupt = true;
 
 	/* Tell FUSE to split requests that exceed the virtqueue's size */
 	fc->max_pages_limit = min_t(unsigned int, fc->max_pages_limit,
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 42+ messages in thread

* [PATCH 2/8] virtiofs: Fix a comment about fuse_dev allocation
  2021-09-30 14:38 ` [Virtio-fs] " Vivek Goyal
@ 2021-09-30 14:38   ` Vivek Goyal
  -1 siblings, 0 replies; 42+ messages in thread
From: Vivek Goyal @ 2021-09-30 14:38 UTC (permalink / raw)
  To: linux-fsdevel, virtio-fs, miklos, stefanha
  Cc: vgoyal, iangelak, jaggel, dgilbert

virtiofs now allocates a fuse_dev for all the queues and does not rely on
fuse common code to allocate one for the request queue. Fix the comment.

Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
---
 fs/fuse/virtio_fs.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/fuse/virtio_fs.c b/fs/fuse/virtio_fs.c
index b9256b8f277f..f7c58a4b996d 100644
--- a/fs/fuse/virtio_fs.c
+++ b/fs/fuse/virtio_fs.c
@@ -1307,7 +1307,7 @@ static int virtio_fs_fill_super(struct super_block *sb, struct fs_context *fsc)
 	}
 
 	err = -ENOMEM;
-	/* Allocate fuse_dev for hiprio and notification queues */
+	/* Allocate fuse_dev for all the queues */
 	for (i = 0; i < fs->nvqs; i++) {
 		struct virtio_fs_vq *fsvq = &fs->vqs[i];
 
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 42+ messages in thread

* [PATCH 3/8] virtiofs: Add an index to keep track of first request queue
  2021-09-30 14:38 ` [Virtio-fs] " Vivek Goyal
@ 2021-09-30 14:38   ` Vivek Goyal
  -1 siblings, 0 replies; 42+ messages in thread
From: Vivek Goyal @ 2021-09-30 14:38 UTC (permalink / raw)
  To: linux-fsdevel, virtio-fs, miklos, stefanha
  Cc: vgoyal, iangelak, jaggel, dgilbert

We have many virtqueues, and the first queue, which carries normal fuse
requests (except forget requests), has its index given by the enum value
VQ_REQUEST. This works fine as long as the number of queues is not dynamic.

I am about to introduce one more virtqueue, the notification queue, which
will be present only if the device on the host supports it. That means the
index of the request queue will change depending on whether the
notification queue is present or not.

So add a variable to keep track of that index. This will help when the
notification queue is added in the next patch.
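
For reference, the resulting virtqueue layout looks roughly like this
(informal sketch; the notification queue slot only exists once a later
patch in this series adds it and the device offers it):

	/*
	 * vqs[0]                      hiprio (forget) queue
	 * vqs[1]                      notification queue (optional, later patch)
	 * vqs[fs->first_reqq_idx..]   request queues (requests.0, requests.1, ...)
	 */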

Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Signed-off-by: Ioannis Angelakopoulos <iangelak@redhat.com>
---
 fs/fuse/virtio_fs.c | 17 ++++++++++++-----
 1 file changed, 12 insertions(+), 5 deletions(-)

diff --git a/fs/fuse/virtio_fs.c b/fs/fuse/virtio_fs.c
index f7c58a4b996d..cb3c7bf8cce4 100644
--- a/fs/fuse/virtio_fs.c
+++ b/fs/fuse/virtio_fs.c
@@ -61,6 +61,7 @@ struct virtio_fs {
 	unsigned int nvqs;               /* number of virtqueues */
 	unsigned int num_request_queues; /* number of request queues */
 	struct dax_device *dax_dev;
+	unsigned int first_reqq_idx;     /* First request queue idx */
 
 	/* DAX memory window where file contents are mapped */
 	void *window_kaddr;
@@ -676,7 +677,9 @@ static int virtio_fs_setup_vqs(struct virtio_device *vdev,
 	if (fs->num_request_queues == 0)
 		return -EINVAL;
 
-	fs->nvqs = VQ_REQUEST + fs->num_request_queues;
+	/* One hiprio queue and rest are request queues */
+	fs->nvqs = 1 + fs->num_request_queues;
+	fs->first_reqq_idx = 1;
 	fs->vqs = kcalloc(fs->nvqs, sizeof(fs->vqs[VQ_HIPRIO]), GFP_KERNEL);
 	if (!fs->vqs)
 		return -ENOMEM;
@@ -696,10 +699,11 @@ static int virtio_fs_setup_vqs(struct virtio_device *vdev,
 	names[VQ_HIPRIO] = fs->vqs[VQ_HIPRIO].name;
 
 	/* Initialize the requests virtqueues */
-	for (i = VQ_REQUEST; i < fs->nvqs; i++) {
+	for (i = fs->first_reqq_idx; i < fs->nvqs; i++) {
 		char vq_name[VQ_NAME_LEN];
 
-		snprintf(vq_name, VQ_NAME_LEN, "requests.%u", i - VQ_REQUEST);
+		snprintf(vq_name, VQ_NAME_LEN, "requests.%u",
+			 i - fs->first_reqq_idx);
 		virtio_fs_init_vq(&fs->vqs[i], vq_name, VQ_REQUEST);
 		callbacks[i] = virtio_fs_vq_done;
 		names[i] = fs->vqs[i].name;
@@ -1217,7 +1221,7 @@ static int virtio_fs_enqueue_req(struct virtio_fs_vq *fsvq,
 static void virtio_fs_wake_pending_and_unlock(struct fuse_iqueue *fiq)
 __releases(fiq->lock)
 {
-	unsigned int queue_id = VQ_REQUEST; /* TODO multiqueue */
+	unsigned int queue_id;
 	struct virtio_fs *fs;
 	struct fuse_req *req;
 	struct virtio_fs_vq *fsvq;
@@ -1231,6 +1235,7 @@ __releases(fiq->lock)
 	spin_unlock(&fiq->lock);
 
 	fs = fiq->priv;
+	queue_id = fs->first_reqq_idx;
 
 	pr_debug("%s: opcode %u unique %#llx nodeid %#llx in.len %u out.len %u\n",
 		  __func__, req->in.h.opcode, req->in.h.unique,
@@ -1411,6 +1416,7 @@ static int virtio_fs_get_tree(struct fs_context *fsc)
 	struct fuse_mount *fm;
 	unsigned int virtqueue_size;
 	int err = -EIO;
+	struct virtio_fs_vq *first_req_fsvq;
 
 	/* This gets a reference on virtio_fs object. This ptr gets installed
 	 * in fc->iq->priv. Once fuse_conn is going away, it calls ->put()
@@ -1422,7 +1428,8 @@ static int virtio_fs_get_tree(struct fs_context *fsc)
 		return -EINVAL;
 	}
 
-	virtqueue_size = virtqueue_get_vring_size(fs->vqs[VQ_REQUEST].vq);
+	first_req_fsvq = &fs->vqs[fs->first_reqq_idx];
+	virtqueue_size = virtqueue_get_vring_size(first_req_fsvq->vq);
 	if (WARN_ON(virtqueue_size <= FUSE_HEADER_OVERHEAD))
 		goto out_err;
 
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 42+ messages in thread

* [PATCH 4/8] virtiofs: Decouple queue index and queue type
  2021-09-30 14:38 ` [Virtio-fs] " Vivek Goyal
@ 2021-09-30 14:38   ` Vivek Goyal
  -1 siblings, 0 replies; 42+ messages in thread
From: Vivek Goyal @ 2021-09-30 14:38 UTC (permalink / raw)
  To: linux-fsdevel, virtio-fs, miklos, stefanha
  Cc: vgoyal, iangelak, jaggel, dgilbert

Right now we use a single enum {VQ_HIPRIO, VQ_REQUEST}, and it is being
used to communicate both the virtqueue index and the virtqueue type.

For example, virtio_fs_init_vq(.., .., .., vq_type) expects the queue type
in the vq_type parameter. In the rest of the code we also use this enum as
a queue index.

This is a little confusing. At the same time, the queue index situation is
about to become a little more complicated and dynamic with the introduction
of the notification queue. The request queue index is no longer going to be
determined at compile time; it will be dynamic, based on whether the
notification queue is offered by the device or not.

So do not use this enum for both purposes. Instead, use it only to denote
the virtqueue type. For the queue index, use macros where the index is
fixed and a variable where it is not.

The previous patch already introduced the variable ->first_reqq_idx for
the request queue index. This patch defines VQ_HIPRIO_IDX to keep track of
the hiprio virtqueue index.

This patch also renames the enum elements to make it explicit that they
represent the virtqueue type (and not the index).

Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
---
 fs/fuse/virtio_fs.c | 29 ++++++++++++++++-------------
 1 file changed, 16 insertions(+), 13 deletions(-)

diff --git a/fs/fuse/virtio_fs.c b/fs/fuse/virtio_fs.c
index cb3c7bf8cce4..eef9591de640 100644
--- a/fs/fuse/virtio_fs.c
+++ b/fs/fuse/virtio_fs.c
@@ -24,6 +24,8 @@
  */
 #define FUSE_HEADER_OVERHEAD    4
 
+#define VQ_HIPRIO_IDX	0
+
 /* List of virtio-fs device instances and a lock for the list. Also provides
  * mutual exclusion in device removal and mounting path
  */
@@ -31,8 +33,8 @@ static DEFINE_MUTEX(virtio_fs_mutex);
 static LIST_HEAD(virtio_fs_instances);
 
 enum {
-	VQ_HIPRIO,
-	VQ_REQUEST
+	VQ_TYPE_HIPRIO,
+	VQ_TYPE_REQUEST
 };
 
 #define VQ_NAME_LEN	24
@@ -651,7 +653,7 @@ static void virtio_fs_init_vq(struct virtio_fs_vq *fsvq, char *name,
 	INIT_LIST_HEAD(&fsvq->end_reqs);
 	init_completion(&fsvq->in_flight_zero);
 
-	if (vq_type == VQ_REQUEST) {
+	if (vq_type == VQ_TYPE_REQUEST) {
 		INIT_WORK(&fsvq->done_work, virtio_fs_requests_done_work);
 		INIT_DELAYED_WORK(&fsvq->dispatch_work,
 				  virtio_fs_request_dispatch_work);
@@ -680,23 +682,24 @@ static int virtio_fs_setup_vqs(struct virtio_device *vdev,
 	/* One hiprio queue and rest are request queues */
 	fs->nvqs = 1 + fs->num_request_queues;
 	fs->first_reqq_idx = 1;
-	fs->vqs = kcalloc(fs->nvqs, sizeof(fs->vqs[VQ_HIPRIO]), GFP_KERNEL);
+	fs->vqs = kcalloc(fs->nvqs, sizeof(fs->vqs[VQ_HIPRIO_IDX]), GFP_KERNEL);
 	if (!fs->vqs)
 		return -ENOMEM;
 
-	vqs = kmalloc_array(fs->nvqs, sizeof(vqs[VQ_HIPRIO]), GFP_KERNEL);
-	callbacks = kmalloc_array(fs->nvqs, sizeof(callbacks[VQ_HIPRIO]),
+	vqs = kmalloc_array(fs->nvqs, sizeof(vqs[VQ_HIPRIO_IDX]), GFP_KERNEL);
+	callbacks = kmalloc_array(fs->nvqs, sizeof(callbacks[VQ_HIPRIO_IDX]),
 					GFP_KERNEL);
-	names = kmalloc_array(fs->nvqs, sizeof(names[VQ_HIPRIO]), GFP_KERNEL);
+	names = kmalloc_array(fs->nvqs, sizeof(names[VQ_HIPRIO_IDX]),
+			      GFP_KERNEL);
 	if (!vqs || !callbacks || !names) {
 		ret = -ENOMEM;
 		goto out;
 	}
 
 	/* Initialize the hiprio/forget request virtqueue */
-	callbacks[VQ_HIPRIO] = virtio_fs_vq_done;
-	virtio_fs_init_vq(&fs->vqs[VQ_HIPRIO], "hiprio", VQ_HIPRIO);
-	names[VQ_HIPRIO] = fs->vqs[VQ_HIPRIO].name;
+	callbacks[VQ_HIPRIO_IDX] = virtio_fs_vq_done;
+	virtio_fs_init_vq(&fs->vqs[VQ_HIPRIO_IDX], "hiprio", VQ_TYPE_HIPRIO);
+	names[VQ_HIPRIO_IDX] = fs->vqs[VQ_HIPRIO_IDX].name;
 
 	/* Initialize the requests virtqueues */
 	for (i = fs->first_reqq_idx; i < fs->nvqs; i++) {
@@ -704,7 +707,7 @@ static int virtio_fs_setup_vqs(struct virtio_device *vdev,
 
 		snprintf(vq_name, VQ_NAME_LEN, "requests.%u",
 			 i - fs->first_reqq_idx);
-		virtio_fs_init_vq(&fs->vqs[i], vq_name, VQ_REQUEST);
+		virtio_fs_init_vq(&fs->vqs[i], vq_name, VQ_TYPE_REQUEST);
 		callbacks[i] = virtio_fs_vq_done;
 		names[i] = fs->vqs[i].name;
 	}
@@ -985,7 +988,7 @@ __releases(fiq->lock)
 	unique = fuse_get_unique(fiq);
 
 	fs = fiq->priv;
-	fsvq = &fs->vqs[VQ_HIPRIO];
+	fsvq = &fs->vqs[VQ_HIPRIO_IDX];
 	spin_unlock(&fiq->lock);
 
 	/* Allocate a buffer for the request */
@@ -1359,7 +1362,7 @@ static void virtio_fs_conn_destroy(struct fuse_mount *fm)
 {
 	struct fuse_conn *fc = fm->fc;
 	struct virtio_fs *vfs = fc->iq.priv;
-	struct virtio_fs_vq *fsvq = &vfs->vqs[VQ_HIPRIO];
+	struct virtio_fs_vq *fsvq = &vfs->vqs[VQ_HIPRIO_IDX];
 
 	/* Stop dax worker. Soon evict_inodes() will be called which
 	 * will free all memory ranges belonging to all inodes.
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 42+ messages in thread

* [PATCH 5/8] virtiofs: Add a virtqueue for notifications
  2021-09-30 14:38 ` [Virtio-fs] " Vivek Goyal
@ 2021-09-30 14:38   ` Vivek Goyal
  -1 siblings, 0 replies; 42+ messages in thread
From: Vivek Goyal @ 2021-09-30 14:38 UTC (permalink / raw)
  To: linux-fsdevel, virtio-fs, miklos, stefanha
  Cc: vgoyal, iangelak, jaggel, dgilbert

Add a new virtqueue for notifications. This will allow the device to send
notifications to the guest. This queue is created only if the device
supports it; this is negotiated using the feature bit
VIRTIO_FS_F_NOTIFICATION.

Given the architecture of a virtqueue, the guest needs to queue up
pre-allocated elements in the notification queue, and the device can pop
these elements, fill in the notification info and send them back. The size
of the notification buffer is negotiable and is specified by the device
through the config space. This will allow us to add and support more
notification types without having to change the spec.
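
Conceptually, the guest keeps the notification queue populated with empty
buffers that the device fills in. Simplified from the code below (locking
and error handling omitted):

	sg_init_one(sg, &notifyn->notify, fs->notify_buf_size);
	virtqueue_add_inbuf(fsvq->vq, sg, 1, notifyn, GFP_ATOMIC);
	if (virtqueue_kick_prepare(fsvq->vq))
		virtqueue_notify(fsvq->vq);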

Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Signed-off-by: Ioannis Angelakopoulos <iangelak@redhat.com>
---
 fs/fuse/virtio_fs.c            | 203 +++++++++++++++++++++++++++++++--
 include/uapi/linux/virtio_fs.h |   5 +
 2 files changed, 196 insertions(+), 12 deletions(-)

diff --git a/fs/fuse/virtio_fs.c b/fs/fuse/virtio_fs.c
index eef9591de640..b70a22a79901 100644
--- a/fs/fuse/virtio_fs.c
+++ b/fs/fuse/virtio_fs.c
@@ -25,6 +25,7 @@
 #define FUSE_HEADER_OVERHEAD    4
 
 #define VQ_HIPRIO_IDX	0
+#define VQ_NOTIFY_IDX	1
 
 /* List of virtio-fs device instances and a lock for the list. Also provides
  * mutual exclusion in device removal and mounting path
@@ -34,10 +35,12 @@ static LIST_HEAD(virtio_fs_instances);
 
 enum {
 	VQ_TYPE_HIPRIO,
-	VQ_TYPE_REQUEST
+	VQ_TYPE_REQUEST,
+	VQ_TYPE_NOTIFY
 };
 
 #define VQ_NAME_LEN	24
+#define VQ_NOTIFY_ELEMS 16	/* Number of notification elements */
 
 /* Per-virtqueue state */
 struct virtio_fs_vq {
@@ -46,6 +49,8 @@ struct virtio_fs_vq {
 	struct work_struct done_work;
 	struct list_head queued_reqs;
 	struct list_head end_reqs;	/* End these requests */
+	struct virtio_fs_notify_node *notify_nodes;
+	struct list_head notify_reqs;	/* List for queuing notify requests */
 	struct delayed_work dispatch_work;
 	struct fuse_dev *fud;
 	bool connected;
@@ -64,6 +69,8 @@ struct virtio_fs {
 	unsigned int num_request_queues; /* number of request queues */
 	struct dax_device *dax_dev;
 	unsigned int first_reqq_idx;     /* First request queue idx */
+	bool notify_enabled;
+	unsigned int notify_buf_size;    /* Size of notification buffer */
 
 	/* DAX memory window where file contents are mapped */
 	void *window_kaddr;
@@ -91,6 +98,19 @@ struct virtio_fs_req_work {
 static int virtio_fs_enqueue_req(struct virtio_fs_vq *fsvq,
 				 struct fuse_req *req, bool in_flight);
 
+/* Size of virtio_fs_notify specified by fs->notify_buf_size. */
+struct virtio_fs_notify {
+	struct fuse_out_header out_hdr;
+	char outarg[];
+};
+
+struct virtio_fs_notify_node {
+	struct list_head list;
+	struct virtio_fs_notify notify;
+};
+
+static int virtio_fs_enqueue_all_notify(struct virtio_fs_vq *fsvq);
+
 enum {
 	OPT_DAX,
 };
@@ -136,6 +156,11 @@ static inline struct virtio_fs_vq *vq_to_fsvq(struct virtqueue *vq)
 	return &fs->vqs[vq->index];
 }
 
+static inline struct virtio_fs *fsvq_to_fs(struct virtio_fs_vq *fsvq)
+{
+	return (struct virtio_fs *)fsvq->vq->vdev->priv;
+}
+
 /* Should be called with fsvq->lock held. */
 static inline void inc_in_flight_req(struct virtio_fs_vq *fsvq)
 {
@@ -151,10 +176,17 @@ static inline void dec_in_flight_req(struct virtio_fs_vq *fsvq)
 		complete(&fsvq->in_flight_zero);
 }
 
+static void virtio_fs_free_notify_nodes(struct virtio_fs *fs)
+{
+	if (fs->notify_enabled && fs->vqs)
+		kfree(fs->vqs[VQ_NOTIFY_IDX].notify_nodes);
+}
+
 static void release_virtio_fs_obj(struct kref *ref)
 {
 	struct virtio_fs *vfs = container_of(ref, struct virtio_fs, refcount);
 
+	virtio_fs_free_notify_nodes(vfs);
 	kfree(vfs->vqs);
 	kfree(vfs);
 }
@@ -201,6 +233,13 @@ static void virtio_fs_drain_all_queues_locked(struct virtio_fs *fs)
 	int i;
 
 	for (i = 0; i < fs->nvqs; i++) {
+		/*
 +		 * Can't wait to drain the notification queue as it always
 +		 * has pending requests queued so that the server can use them
 +		 * to send notifications
+		 */
+		if (fs->notify_enabled && (i == VQ_NOTIFY_IDX))
+			continue;
 		fsvq = &fs->vqs[i];
 		virtio_fs_drain_queue(fsvq);
 	}
@@ -229,6 +268,8 @@ static void virtio_fs_start_all_queues(struct virtio_fs *fs)
 		spin_lock(&fsvq->lock);
 		fsvq->connected = true;
 		spin_unlock(&fsvq->lock);
+		if (fs->notify_enabled && (i == VQ_NOTIFY_IDX))
+			virtio_fs_enqueue_all_notify(fsvq);
 	}
 }
 
@@ -477,6 +518,98 @@ static void virtio_fs_hiprio_dispatch_work(struct work_struct *work)
 	}
 }
 
+static int virtio_fs_init_notify_vq(struct virtio_fs *fs,
+				    struct virtio_fs_vq *fsvq)
+{
+	struct virtio_fs_notify_node *notifyn;
+	unsigned int notify_node_sz = sizeof(struct list_head) +
+				  fs->notify_buf_size;
+	int i;
+
+	fsvq->notify_nodes = kcalloc(VQ_NOTIFY_ELEMS, notify_node_sz,
+				     GFP_KERNEL);
+	if (!fsvq->notify_nodes)
+		return -ENOMEM;
+
+	for (i = 0; i < VQ_NOTIFY_ELEMS; i++) {
+		notifyn = (void *)fsvq->notify_nodes + (i * notify_node_sz);
+		list_add_tail(&notifyn->list, &fsvq->notify_reqs);
+	}
+
+	return 0;
+}
+
+static int virtio_fs_enqueue_all_notify(struct virtio_fs_vq *fsvq)
+{
+	struct scatterlist sg[1];
+	int ret;
+	bool kick;
+	struct virtio_fs *fs = fsvq_to_fs(fsvq);
+	struct virtio_fs_notify_node *notifyn, *next;
+	unsigned int notify_sz;
+
+	notify_sz = fs->notify_buf_size;
+	spin_lock(&fsvq->lock);
+	list_for_each_entry_safe(notifyn, next, &fsvq->notify_reqs, list) {
+		list_del_init(&notifyn->list);
+		sg_init_one(sg, &notifyn->notify, notify_sz);
+		ret = virtqueue_add_inbuf(fsvq->vq, sg, 1, notifyn, GFP_ATOMIC);
+		if (ret) {
+			list_add_tail(&notifyn->list, &fsvq->notify_reqs);
+			spin_unlock(&fsvq->lock);
+			return ret;
+		}
+		inc_in_flight_req(fsvq);
+	}
+
+	kick = virtqueue_kick_prepare(fsvq->vq);
+	spin_unlock(&fsvq->lock);
+	if (kick)
+		virtqueue_notify(fsvq->vq);
+	return 0;
+}
+
+static void virtio_fs_notify_done_work(struct work_struct *work)
+{
+	struct virtio_fs_vq *fsvq = container_of(work, struct virtio_fs_vq,
+						 done_work);
+	struct virtqueue *vq = fsvq->vq;
+	LIST_HEAD(reqs);
+	struct virtio_fs_notify_node *notifyn, *next;
+
+	spin_lock(&fsvq->lock);
+	do {
+		unsigned int len;
+
+		virtqueue_disable_cb(vq);
+
+		while ((notifyn = virtqueue_get_buf(vq, &len)) != NULL)
+			list_add_tail(&notifyn->list, &reqs);
+
+	} while (!virtqueue_enable_cb(vq) && likely(!virtqueue_is_broken(vq)));
+	spin_unlock(&fsvq->lock);
+
+	/* Process notify */
+	list_for_each_entry_safe(notifyn, next, &reqs, list) {
+		spin_lock(&fsvq->lock);
+		dec_in_flight_req(fsvq);
+		list_del_init(&notifyn->list);
+		list_add_tail(&notifyn->list, &fsvq->notify_reqs);
+		spin_unlock(&fsvq->lock);
+	}
+
+	/*
 +	 * If the queue is connected, queue notifications again. If not,
 +	 * these will be queued again when the virtqueue is restarted.
+	 */
+	if (fsvq->connected)
+		virtio_fs_enqueue_all_notify(fsvq);
+}
+
+static void virtio_fs_notify_dispatch_work(struct work_struct *work)
+{
+}
+
 /* Allocate and copy args into req->argbuf */
 static int copy_args_to_argbuf(struct fuse_req *req)
 {
@@ -644,24 +777,34 @@ static void virtio_fs_vq_done(struct virtqueue *vq)
 	schedule_work(&fsvq->done_work);
 }
 
-static void virtio_fs_init_vq(struct virtio_fs_vq *fsvq, char *name,
-			      int vq_type)
+static int virtio_fs_init_vq(struct virtio_fs *fs, struct virtio_fs_vq *fsvq,
+			     char *name, int vq_type)
 {
+	int ret = 0;
+
 	strncpy(fsvq->name, name, VQ_NAME_LEN);
 	spin_lock_init(&fsvq->lock);
 	INIT_LIST_HEAD(&fsvq->queued_reqs);
 	INIT_LIST_HEAD(&fsvq->end_reqs);
+	INIT_LIST_HEAD(&fsvq->notify_reqs);
 	init_completion(&fsvq->in_flight_zero);
 
 	if (vq_type == VQ_TYPE_REQUEST) {
 		INIT_WORK(&fsvq->done_work, virtio_fs_requests_done_work);
 		INIT_DELAYED_WORK(&fsvq->dispatch_work,
 				  virtio_fs_request_dispatch_work);
+	} else if (vq_type == VQ_TYPE_NOTIFY) {
+		INIT_WORK(&fsvq->done_work, virtio_fs_notify_done_work);
+		INIT_DELAYED_WORK(&fsvq->dispatch_work,
+				  virtio_fs_notify_dispatch_work);
+		ret = virtio_fs_init_notify_vq(fs, fsvq);
 	} else {
 		INIT_WORK(&fsvq->done_work, virtio_fs_hiprio_done_work);
 		INIT_DELAYED_WORK(&fsvq->dispatch_work,
 				  virtio_fs_hiprio_dispatch_work);
 	}
+
+	return ret;
 }
 
 /* Initialize virtqueues */
@@ -679,9 +822,28 @@ static int virtio_fs_setup_vqs(struct virtio_device *vdev,
 	if (fs->num_request_queues == 0)
 		return -EINVAL;
 
-	/* One hiprio queue and rest are request queues */
-	fs->nvqs = 1 + fs->num_request_queues;
-	fs->first_reqq_idx = 1;
+	if (virtio_has_feature(vdev, VIRTIO_FS_F_NOTIFICATION)) {
+		fs->notify_enabled = true;
+		virtio_cread(vdev, struct virtio_fs_config, notify_buf_size,
+			     &fs->notify_buf_size);
+		if (fs->notify_buf_size <= sizeof(struct fuse_out_header)) {
+			pr_err("virtio-fs: Invalid value %d of notification buffer size\n",
+			       fs->notify_buf_size);
+			return -EINVAL;
+		}
+		pr_info("virtio-fs: device supports notification. Notification_buf_size=%u\n",
+			fs->notify_buf_size);
+	}
+
+	if (fs->notify_enabled) {
+		/* One additional queue for hiprio and one for notifications */
+		fs->nvqs = 2 + fs->num_request_queues;
+		fs->first_reqq_idx = VQ_NOTIFY_IDX + 1;
+	} else {
+		fs->nvqs = 1 + fs->num_request_queues;
+		fs->first_reqq_idx = 1;
+	}
+
 	fs->vqs = kcalloc(fs->nvqs, sizeof(fs->vqs[VQ_HIPRIO_IDX]), GFP_KERNEL);
 	if (!fs->vqs)
 		return -ENOMEM;
@@ -698,16 +860,32 @@ static int virtio_fs_setup_vqs(struct virtio_device *vdev,
 
 	/* Initialize the hiprio/forget request virtqueue */
 	callbacks[VQ_HIPRIO_IDX] = virtio_fs_vq_done;
-	virtio_fs_init_vq(&fs->vqs[VQ_HIPRIO_IDX], "hiprio", VQ_TYPE_HIPRIO);
+	ret = virtio_fs_init_vq(fs, &fs->vqs[VQ_HIPRIO_IDX], "hiprio",
+				VQ_TYPE_HIPRIO);
+	if (ret < 0)
+		goto out;
 	names[VQ_HIPRIO_IDX] = fs->vqs[VQ_HIPRIO_IDX].name;
 
+	/* Initialize notification queue */
+	if (fs->notify_enabled) {
+		callbacks[VQ_NOTIFY_IDX] = virtio_fs_vq_done;
+		ret = virtio_fs_init_vq(fs, &fs->vqs[VQ_NOTIFY_IDX],
+					"notification", VQ_TYPE_NOTIFY);
+		if (ret < 0)
+			goto out;
+		names[VQ_NOTIFY_IDX] = fs->vqs[VQ_NOTIFY_IDX].name;
+	}
+
 	/* Initialize the requests virtqueues */
 	for (i = fs->first_reqq_idx; i < fs->nvqs; i++) {
 		char vq_name[VQ_NAME_LEN];
 
 		snprintf(vq_name, VQ_NAME_LEN, "requests.%u",
 			 i - fs->first_reqq_idx);
-		virtio_fs_init_vq(&fs->vqs[i], vq_name, VQ_TYPE_REQUEST);
+		ret = virtio_fs_init_vq(fs, &fs->vqs[i], vq_name,
+					VQ_TYPE_REQUEST);
+		if (ret < 0)
+			goto out;
 		callbacks[i] = virtio_fs_vq_done;
 		names[i] = fs->vqs[i].name;
 	}
@@ -718,14 +896,14 @@ static int virtio_fs_setup_vqs(struct virtio_device *vdev,
 
 	for (i = 0; i < fs->nvqs; i++)
 		fs->vqs[i].vq = vqs[i];
-
-	virtio_fs_start_all_queues(fs);
 out:
 	kfree(names);
 	kfree(callbacks);
 	kfree(vqs);
-	if (ret)
+	if (ret) {
+		virtio_fs_free_notify_nodes(fs);
 		kfree(fs->vqs);
+	}
 	return ret;
 }
 
@@ -889,6 +1067,7 @@ static int virtio_fs_probe(struct virtio_device *vdev)
 	 * requests need to be sent before we return.
 	 */
 	virtio_device_ready(vdev);
+	virtio_fs_start_all_queues(fs);
 
 	ret = virtio_fs_add_instance(fs);
 	if (ret < 0)
@@ -958,7 +1137,7 @@ static const struct virtio_device_id id_table[] = {
 	{},
 };
 
-static const unsigned int feature_table[] = {};
+static const unsigned int feature_table[] = {VIRTIO_FS_F_NOTIFICATION};
 
 static struct virtio_driver virtio_fs_driver = {
 	.driver.name		= KBUILD_MODNAME,
diff --git a/include/uapi/linux/virtio_fs.h b/include/uapi/linux/virtio_fs.h
index bea38291421b..3a9bbccf4115 100644
--- a/include/uapi/linux/virtio_fs.h
+++ b/include/uapi/linux/virtio_fs.h
@@ -8,12 +8,17 @@
 #include <linux/virtio_config.h>
 #include <linux/virtio_types.h>
 
+/* Feature bits */
+#define VIRTIO_FS_F_NOTIFICATION 0	/* Notification queue supported */
+
 struct virtio_fs_config {
 	/* Filesystem name (UTF-8, not NUL-terminated, padded with NULs) */
 	__u8 tag[36];
 
 	/* Number of request queues */
 	__le32 num_request_queues;
+	/* Size of notification buffer */
+	__u32 notify_buf_size;
 } __attribute__((packed));
 
 /* For the id field in virtio_pci_shm_cap */
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 42+ messages in thread

* [PATCH 6/8] virtiofs: Add a helper to end request and decrement inflight number
  2021-09-30 14:38 ` [Virtio-fs] " Vivek Goyal
@ 2021-09-30 14:38   ` Vivek Goyal
  -1 siblings, 0 replies; 42+ messages in thread
From: Vivek Goyal @ 2021-09-30 14:38 UTC (permalink / raw)
  To: linux-fsdevel, virtio-fs, miklos, stefanha
  Cc: vgoyal, iangelak, jaggel, dgilbert

Add a helper function to end a fuse request and decrement the number of
in-flight requests. This pattern is already used in two places and I am
planning to use it in two more places in later patches. Adding a helper
reduces the number of lines of code and improves readability.

Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
---
 fs/fuse/virtio_fs.c | 19 +++++++++++--------
 1 file changed, 11 insertions(+), 8 deletions(-)

diff --git a/fs/fuse/virtio_fs.c b/fs/fuse/virtio_fs.c
index b70a22a79901..8d33879d62fb 100644
--- a/fs/fuse/virtio_fs.c
+++ b/fs/fuse/virtio_fs.c
@@ -380,6 +380,15 @@ static void virtio_fs_hiprio_done_work(struct work_struct *work)
 	spin_unlock(&fsvq->lock);
 }
 
+static void end_req_dec_in_flight(struct fuse_req *req,
+				  struct virtio_fs_vq *fsvq)
+{
+	fuse_request_end(req);
+	spin_lock(&fsvq->lock);
+	dec_in_flight_req(fsvq);
+	spin_unlock(&fsvq->lock);
+}
+
 static void virtio_fs_request_dispatch_work(struct work_struct *work)
 {
 	struct fuse_req *req;
@@ -425,12 +434,9 @@ static void virtio_fs_request_dispatch_work(struct work_struct *work)
 				return;
 			}
 			req->out.h.error = ret;
-			spin_lock(&fsvq->lock);
-			dec_in_flight_req(fsvq);
-			spin_unlock(&fsvq->lock);
 			pr_err("virtio-fs: virtio_fs_enqueue_req() failed %d\n",
 			       ret);
-			fuse_request_end(req);
+			end_req_dec_in_flight(req, fsvq);
 		}
 	}
 }
@@ -709,10 +715,7 @@ static void virtio_fs_request_complete(struct fuse_req *req,
 		}
 	}
 
-	fuse_request_end(req);
-	spin_lock(&fsvq->lock);
-	dec_in_flight_req(fsvq);
-	spin_unlock(&fsvq->lock);
+	end_req_dec_in_flight(req, fsvq);
 }
 
 static void virtio_fs_complete_req_work(struct work_struct *work)
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 42+ messages in thread

* [PATCH 7/8] virtiofs: Add new notification type FUSE_NOTIFY_LOCK
  2021-09-30 14:38 ` [Virtio-fs] " Vivek Goyal
@ 2021-09-30 14:38   ` Vivek Goyal
  -1 siblings, 0 replies; 42+ messages in thread
From: Vivek Goyal @ 2021-09-30 14:38 UTC (permalink / raw)
  To: linux-fsdevel, virtio-fs, miklos, stefanha
  Cc: vgoyal, iangelak, jaggel, dgilbert

Add a new notification type FUSE_NOTIFY_LOCK. This notification can be
sent by the file server to signify that a previous locking request has
completed and the actual caller should be woken up.

As of now we don't support the blocking variant of posix locks and the
daemon returns -EOPNOTSUPP. The reason is that it can lead to deadlocks.
Virtqueue size is limited and it is possible we fill the virtqueue with
fcntl(F_SETLKW) requests and wait for replies. A later unlock request
then can't make progress because the virtqueue is full. That means
F_SETLKW can't make progress and we are deadlocked.

This problem is not limited to posix locks only. I think blocking remote
flock and open file description locks should face the same issue. Right
now fuse does not support open file description locks, so it's not
a problem. But fuse/virtiofs does support remote flock and they can use
the same mechanism too.

Use the notification queue to solve this problem. After submitting a lock
request, the device will send a reply asking the requester to wait. Once
the lock is available, the requester will get a notification saying the
lock is available. That way we don't keep the request virtqueue busy
while we are waiting for the lock, and further unlock requests can make
progress.

When we get a reply in response to a lock request, we need a way to know
whether we need to wait for a notification or not. I have overloaded the
fuse_out_header->error field. If the value of ->error is 1, that's a
signal to the caller to wait for the lock notification. Overloading
->error in this way is not the best way to do it, but I am running out
of ideas.
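
Just for illustration, the guest-side path this enables is nothing more
than a plain blocking fcntl() on a file inside the virtiofs mount,
something like the sketch below (the mount point and file name are made
up):

#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

int main(void)
{
	/* Hypothetical file on a virtiofs mount */
	int fd = open("/mnt/virtiofs/shared.lock", O_RDWR | O_CREAT, 0644);
	struct flock fl = {
		.l_type   = F_WRLCK,	/* exclusive lock */
		.l_whence = SEEK_SET,
		.l_start  = 0,
		.l_len    = 0,		/* whole file */
	};

	if (fd < 0)
		return 1;

	/* F_SETLKW blocks until the remote lock is granted */
	if (fcntl(fd, F_SETLKW, &fl) == -1)
		perror("fcntl(F_SETLKW)");

	/* ... critical section ... */

	fl.l_type = F_UNLCK;
	fcntl(fd, F_SETLK, &fl);
	close(fd);
	return 0;
}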

Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Signed-off-by: Ioannis Angelakopoulos <iangelak@redhat.com>
---
 fs/fuse/virtio_fs.c       | 73 +++++++++++++++++++++++++++++++++++++++
 include/uapi/linux/fuse.h | 11 +++++-
 2 files changed, 83 insertions(+), 1 deletion(-)

diff --git a/fs/fuse/virtio_fs.c b/fs/fuse/virtio_fs.c
index 8d33879d62fb..1634ea2d0555 100644
--- a/fs/fuse/virtio_fs.c
+++ b/fs/fuse/virtio_fs.c
@@ -48,6 +48,7 @@ struct virtio_fs_vq {
 	struct virtqueue *vq;     /* protected by ->lock */
 	struct work_struct done_work;
 	struct list_head queued_reqs;
+	struct list_head wait_reqs;     /* Requests waiting for notification  */
 	struct list_head end_reqs;	/* End these requests */
 	struct virtio_fs_notify_node *notify_nodes;
 	struct list_head notify_reqs;	/* List for queuing notify requests */
@@ -575,13 +576,72 @@ static int virtio_fs_enqueue_all_notify(struct virtio_fs_vq *fsvq)
 	return 0;
 }
 
+static int notify_complete_waiting_req(struct virtio_fs *vfs,
+				       struct fuse_notify_lock_out *out_args)
+{
+	/* TODO: Handle multiqueue */
+	struct virtio_fs_vq *fsvq = &vfs->vqs[vfs->first_reqq_idx];
+	struct fuse_req *req, *next;
+	bool found = false;
+
+	/* Find waiting request with the unique number and end it */
+	spin_lock(&fsvq->lock);
+	list_for_each_entry_safe(req, next, &fsvq->wait_reqs, list) {
+		if (req->in.h.unique == out_args->unique) {
+			list_del_init(&req->list);
+			clear_bit(FR_SENT, &req->flags);
+			/* Transfer error code from notify */
+			req->out.h.error = out_args->error;
+			found = true;
+			break;
+		}
+	}
+	spin_unlock(&fsvq->lock);
+
+	/*
+	 * TODO: It is possible that some re-ordering happens and notify
+	 * comes before the request is complete. Deal with it.
+	 */
+	if (found) {
+		end_req_dec_in_flight(req, fsvq);
+	} else
+		pr_debug("virtio-fs: Did not find waiting request with unique=0x%llx\n",
+			 out_args->unique);
+
+	return 0;
+}
+
+static int virtio_fs_handle_notify(struct virtio_fs *vfs,
+				   struct virtio_fs_notify *notify)
+{
+	int ret = 0;
+	struct fuse_out_header *oh = &notify->out_hdr;
+	struct fuse_notify_lock_out *lo;
+
+	/*
+	 * For notifications, oh->unique is 0 and oh->error contains the
+	 * code of the notification that has arrived.
+	 */
+	switch (oh->error) {
+	case FUSE_NOTIFY_LOCK:
+		lo = (struct fuse_notify_lock_out *) &notify->outarg;
+		notify_complete_waiting_req(vfs, lo);
+		break;
+	default:
+		pr_err("virtio-fs: Unexpected notification %d\n", oh->error);
+	}
+	return ret;
+}
+
 static void virtio_fs_notify_done_work(struct work_struct *work)
 {
 	struct virtio_fs_vq *fsvq = container_of(work, struct virtio_fs_vq,
 						 done_work);
 	struct virtqueue *vq = fsvq->vq;
+	struct virtio_fs *vfs = vq->vdev->priv;
 	LIST_HEAD(reqs);
 	struct virtio_fs_notify_node *notifyn, *next;
+	struct fuse_out_header *oh;
 
 	spin_lock(&fsvq->lock);
 	do {
@@ -597,6 +657,10 @@ static void virtio_fs_notify_done_work(struct work_struct *work)
 
 	/* Process notify */
 	list_for_each_entry_safe(notifyn, next, &reqs, list) {
+		oh = &notifyn->notify.out_hdr;
+		WARN_ON(oh->unique);
+		/* Handle notification */
+		virtio_fs_handle_notify(vfs, &notifyn->notify);
 		spin_lock(&fsvq->lock);
 		dec_in_flight_req(fsvq);
 		list_del_init(&notifyn->list);
@@ -696,6 +760,14 @@ static void virtio_fs_request_complete(struct fuse_req *req,
 	 * TODO verify that server properly follows FUSE protocol
 	 * (oh.uniq, oh.len)
 	 */
+	if (req->out.h.error == 1) {
+		/* Wait for notification to complete request */
+		spin_lock(&fsvq->lock);
+		list_add_tail(&req->list, &fsvq->wait_reqs);
+		spin_unlock(&fsvq->lock);
+		return;
+	}
+
 	args = req->args;
 	copy_args_from_argbuf(args, req);
 
@@ -788,6 +860,7 @@ static int virtio_fs_init_vq(struct virtio_fs *fs, struct virtio_fs_vq *fsvq,
 	strncpy(fsvq->name, name, VQ_NAME_LEN);
 	spin_lock_init(&fsvq->lock);
 	INIT_LIST_HEAD(&fsvq->queued_reqs);
+	INIT_LIST_HEAD(&fsvq->wait_reqs);
 	INIT_LIST_HEAD(&fsvq->end_reqs);
 	INIT_LIST_HEAD(&fsvq->notify_reqs);
 	init_completion(&fsvq->in_flight_zero);
diff --git a/include/uapi/linux/fuse.h b/include/uapi/linux/fuse.h
index 36ed092227fa..46838551ea84 100644
--- a/include/uapi/linux/fuse.h
+++ b/include/uapi/linux/fuse.h
@@ -184,6 +184,8 @@
  *
  *  7.34
  *  - add FUSE_SYNCFS
+ *  7.35
+ *  - add FUSE_NOTIFY_LOCK
  */
 
 #ifndef _LINUX_FUSE_H
@@ -219,7 +221,7 @@
 #define FUSE_KERNEL_VERSION 7
 
 /** Minor version number of this interface */
-#define FUSE_KERNEL_MINOR_VERSION 34
+#define FUSE_KERNEL_MINOR_VERSION 35
 
 /** The node ID of the root inode */
 #define FUSE_ROOT_ID 1
@@ -529,6 +531,7 @@ enum fuse_notify_code {
 	FUSE_NOTIFY_STORE = 4,
 	FUSE_NOTIFY_RETRIEVE = 5,
 	FUSE_NOTIFY_DELETE = 6,
+	FUSE_NOTIFY_LOCK = 7,
 	FUSE_NOTIFY_CODE_MAX,
 };
 
@@ -920,6 +923,12 @@ struct fuse_notify_retrieve_in {
 	uint64_t	dummy4;
 };
 
+struct fuse_notify_lock_out {
+	uint64_t	unique;
+	int32_t		error;
+	int32_t		padding;
+};
+
 /* Device ioctls: */
 #define FUSE_DEV_IOC_MAGIC		229
 #define FUSE_DEV_IOC_CLONE		_IOR(FUSE_DEV_IOC_MAGIC, 0, uint32_t)
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 42+ messages in thread

* [PATCH 8/8] virtiofs: Handle reordering of reply and notification event
  2021-09-30 14:38 ` [Virtio-fs] " Vivek Goyal
@ 2021-09-30 14:38   ` Vivek Goyal
  -1 siblings, 0 replies; 42+ messages in thread
From: Vivek Goyal @ 2021-09-30 14:38 UTC (permalink / raw)
  To: linux-fsdevel, virtio-fs, miklos, stefanha
  Cc: vgoyal, iangelak, jaggel, dgilbert

So far we have been relying on the reply to a locking request always
arriving first and the notification coming in afterwards. But the two
travel on different virtqueues and there is no guarantee that the client
will always see/process them in this order.

Though it is a very unlikely event, it is possible that the notification
comes first and the reply to the lock request comes later. Hence take
care of this possibility. If a notification arrives and does not find a
corresponding request waiting, queue up the notification in a list. When
the request reply comes, it will check whether the corresponding
notification has already arrived. If yes, it will finish the request;
otherwise it will wait for the notification to arrive.
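
To make the rendezvous easier to follow before reading the diff, here is
a tiny userspace toy model of it. This is only a sketch with made-up
helpers, not the kernel code below; the real implementation keys both
lookups on req->in.h.unique and does them under the virtqueue locks:

#include <inttypes.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

#define SLOTS 16

/* One slot per pending rendezvous, keyed by the request's unique id */
static uint64_t wait_reqs[SLOTS];	/* replies waiting for a notification */
static uint64_t wait_notify[SLOTS];	/* notifications waiting for a reply  */

static bool take(uint64_t *tbl, uint64_t unique)
{
	for (int i = 0; i < SLOTS; i++) {
		if (tbl[i] == unique) {
			tbl[i] = 0;
			return true;
		}
	}
	return false;
}

static void put(uint64_t *tbl, uint64_t unique)
{
	for (int i = 0; i < SLOTS; i++) {
		if (!tbl[i]) {
			tbl[i] = unique;
			return;
		}
	}
}

/* Reply with ->error == 1 arrived for request 'unique' */
static void on_reply(uint64_t unique)
{
	if (take(wait_notify, unique))
		printf("reply %" PRIu64 ": notification already here\n", unique);
	else
		put(wait_reqs, unique);		/* wait for the notification */
}

/* FUSE_NOTIFY_LOCK arrived for request 'unique' */
static void on_notify(uint64_t unique)
{
	if (take(wait_reqs, unique))
		printf("notify %" PRIu64 ": request was waiting\n", unique);
	else
		put(wait_notify, unique);	/* wait for the reply */
}

int main(void)
{
	on_notify(7);	/* re-ordered case: notification first */
	on_reply(7);	/* the reply then finds the queued notification */
	return 0;
}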

Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
---
 fs/fuse/virtio_fs.c | 134 ++++++++++++++++++++++++++++++++++++++------
 1 file changed, 116 insertions(+), 18 deletions(-)

diff --git a/fs/fuse/virtio_fs.c b/fs/fuse/virtio_fs.c
index 1634ea2d0555..028ca9198bc4 100644
--- a/fs/fuse/virtio_fs.c
+++ b/fs/fuse/virtio_fs.c
@@ -42,6 +42,17 @@ enum {
 #define VQ_NAME_LEN	24
 #define VQ_NOTIFY_ELEMS 16	/* Number of notification elements */
 
+/*
+ * virtio_fs_vq->lock spinlock nesting subclasses:
+ *
+ * 0: normal
+ * 1: nested
+ */
+enum {
+	VQ_LOCK_NORMAL,
+	VQ_LOCK_NESTED
+};
+
 /* Per-virtqueue state */
 struct virtio_fs_vq {
 	spinlock_t lock;
@@ -52,6 +63,8 @@ struct virtio_fs_vq {
 	struct list_head end_reqs;	/* End these requests */
 	struct virtio_fs_notify_node *notify_nodes;
 	struct list_head notify_reqs;	/* List for queuing notify requests */
+	/* Notifications to be processed when request reply arrives. */
+	struct list_head wait_notify;
 	struct delayed_work dispatch_work;
 	struct fuse_dev *fud;
 	bool connected;
@@ -576,13 +589,32 @@ static int virtio_fs_enqueue_all_notify(struct virtio_fs_vq *fsvq)
 	return 0;
 }
 
+/*
+ * Find a waiting process/request and complete it. If request and
+ * notification got re-ordered, it is possible that there is no
+ * waiting request yet. In that case queue up notification and
+ * return 1 so that the caller does not try to reuse the notify node yet.
+ *
+ * Return:
+ * negative value :  in case of error
+ * 0		  : If request processed successfully
+ * 1		  : Could not find a waiting process. Notification got queued.
+ *		    Do not reuse this notify node yet.
+ */
 static int notify_complete_waiting_req(struct virtio_fs *vfs,
 				       struct fuse_notify_lock_out *out_args)
 {
 	/* TODO: Handle multiqueue */
 	struct virtio_fs_vq *fsvq = &vfs->vqs[vfs->first_reqq_idx];
+	struct virtio_fs_vq *notify_fsvq = &vfs->vqs[VQ_NOTIFY_IDX];
 	struct fuse_req *req, *next;
+	struct virtio_fs_notify *notify;
+	struct virtio_fs_notify_node *notifyn;
 	bool found = false;
+	int ret = 0;
+
+	notify = container_of((void *)out_args, struct virtio_fs_notify, outarg);
+	notifyn = container_of(notify, struct virtio_fs_notify_node, notify);
 
 	/* Find waiting request with the unique number and end it */
 	spin_lock(&fsvq->lock);
@@ -596,25 +628,39 @@ static int notify_complete_waiting_req(struct virtio_fs *vfs,
 			break;
 		}
 	}
-	spin_unlock(&fsvq->lock);
-
 	/*
-	 * TODO: It is possible that some re-ordering happens and notify
-	 * comes before the request is complete. Deal with it.
+	 * It probably is a rare case of re-ordering where notification
+	 * arrived/processed before reply. Queue up the notification.
 	 */
+	spin_lock_nested(&notify_fsvq->lock, VQ_LOCK_NESTED);
+	if (!found) {
+		list_add_tail(&notifyn->list, &notify_fsvq->wait_notify);
+		ret = 1;
+	}
+	spin_unlock(&notify_fsvq->lock);
+	spin_unlock(&fsvq->lock);
+
 	if (found) {
 		end_req_dec_in_flight(req, fsvq);
-	} else
-		pr_debug("virtio-fs: Did not find waiting request with unique=0x%llx\n",
-			 out_args->unique);
+	}
 
-	return 0;
+	return ret;
+}
+
+static void notify_node_reuse(struct virtio_fs_vq *notify_fsvq,
+			      struct virtio_fs_notify_node *notifyn)
+{
+	spin_lock(&notify_fsvq->lock);
+	list_add_tail(&notifyn->list, &notify_fsvq->notify_reqs);
+	spin_unlock(&notify_fsvq->lock);
 }
 
 static int virtio_fs_handle_notify(struct virtio_fs *vfs,
-				   struct virtio_fs_notify *notify)
+				   struct virtio_fs_notify_node *notifyn)
 {
-	int ret = 0;
+	int ret = 0, no_reuse = 0;
+	struct virtio_fs_notify *notify = &notifyn->notify;
+	struct virtio_fs_vq *notify_fsvq = &vfs->vqs[VQ_NOTIFY_IDX];
 	struct fuse_out_header *oh = &notify->out_hdr;
 	struct fuse_notify_lock_out *lo;
 
@@ -625,11 +671,15 @@ static int virtio_fs_handle_notify(struct virtio_fs *vfs,
 	switch (oh->error) {
 	case FUSE_NOTIFY_LOCK:
 		lo = (struct fuse_notify_lock_out *) &notify->outarg;
-		notify_complete_waiting_req(vfs, lo);
+		no_reuse = notify_complete_waiting_req(vfs, lo);
 		break;
 	default:
 		pr_err("virtio-fs: Unexpected notification %d\n", oh->error);
 	}
+
+	if (!no_reuse)
+		notify_node_reuse(notify_fsvq, notifyn);
+
 	return ret;
 }
 
@@ -659,12 +709,11 @@ static void virtio_fs_notify_done_work(struct work_struct *work)
 	list_for_each_entry_safe(notifyn, next, &reqs, list) {
 		oh = &notifyn->notify.out_hdr;
 		WARN_ON(oh->unique);
+		list_del_init(&notifyn->list);
 		/* Handle notification */
-		virtio_fs_handle_notify(vfs, &notifyn->notify);
+		virtio_fs_handle_notify(vfs, notifyn);
 		spin_lock(&fsvq->lock);
 		dec_in_flight_req(fsvq);
-		list_del_init(&notifyn->list);
-		list_add_tail(&notifyn->list, &fsvq->notify_reqs);
 		spin_unlock(&fsvq->lock);
 	}
 
@@ -747,6 +796,57 @@ static void copy_args_from_argbuf(struct fuse_args *args, struct fuse_req *req)
 	req->argbuf = NULL;
 }
 
+static void virtio_fs_wait_or_complete_request(struct virtio_fs_vq *fsvq,
+					       struct fuse_req *req)
+{
+	struct virtqueue *vq = fsvq->vq;
+	struct virtio_fs *vfs = vq->vdev->priv;
+	struct virtio_fs_vq *notify_fsvq = &vfs->vqs[VQ_NOTIFY_IDX];
+	bool found = false;
+	struct virtio_fs_notify_node *notifyn, *next;
+
+	/*
+	 * Note, notification queue lock nests inside request queue
+	 * lock and not otherwise.
+	 */
+	spin_lock(&fsvq->lock);
+	spin_lock_nested(&notify_fsvq->lock, VQ_LOCK_NESTED);
+	/*
+	 * Check if there is already a notification event in case the
+	 * reply and notification event got re-ordered. Very unlikely.
+	 */
+	list_for_each_entry_safe(notifyn, next, &notify_fsvq->wait_notify,
+				 list) {
+		struct fuse_notify_lock_out *lo;
+
+		lo = (struct fuse_notify_lock_out *) &notifyn->notify.outarg;
+
+		if (req->in.h.unique == lo->unique) {
+			list_del_init(&notifyn->list);
+			clear_bit(FR_SENT, &req->flags);
+			/* Transfer error code from notify */
+			req->out.h.error = lo->error;
+			found = true;
+			break;
+		}
+	}
+
+	if (!found) {
+		/* Wait for notification to arrive. */
+		list_add_tail(&req->list, &fsvq->wait_reqs);
+	}
+
+	spin_unlock(&notify_fsvq->lock);
+	spin_unlock(&fsvq->lock);
+
+	if (found) {
+		end_req_dec_in_flight(req, fsvq);
+		notify_node_reuse(notify_fsvq, notifyn);
+		if (fsvq->connected)
+			virtio_fs_enqueue_all_notify(fsvq);
+	}
+}
+
 /* Work function for request completion */
 static void virtio_fs_request_complete(struct fuse_req *req,
 				       struct virtio_fs_vq *fsvq)
@@ -762,10 +862,7 @@ static void virtio_fs_request_complete(struct fuse_req *req,
 	 */
 	if (req->out.h.error == 1) {
 		/* Wait for notification to complete request */
-		spin_lock(&fsvq->lock);
-		list_add_tail(&req->list, &fsvq->wait_reqs);
-		spin_unlock(&fsvq->lock);
-		return;
+		return virtio_fs_wait_or_complete_request(fsvq, req);
 	}
 
 	args = req->args;
@@ -863,6 +960,7 @@ static int virtio_fs_init_vq(struct virtio_fs *fs, struct virtio_fs_vq *fsvq,
 	INIT_LIST_HEAD(&fsvq->wait_reqs);
 	INIT_LIST_HEAD(&fsvq->end_reqs);
 	INIT_LIST_HEAD(&fsvq->notify_reqs);
+	INIT_LIST_HEAD(&fsvq->wait_notify);
 	init_completion(&fsvq->in_flight_zero);
 
 	if (vq_type == VQ_TYPE_REQUEST) {
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 42+ messages in thread

* Re: [PATCH 0/8] virtiofs: Notification queue and blocking posix locks
  2021-09-30 14:38 ` [Virtio-fs] " Vivek Goyal
@ 2021-09-30 15:43   ` Vivek Goyal
  -1 siblings, 0 replies; 42+ messages in thread
From: Vivek Goyal @ 2021-09-30 15:43 UTC (permalink / raw)
  To: linux-fsdevel, virtio-fs, miklos, stefanha; +Cc: iangelak, jaggel, dgilbert

On Thu, Sep 30, 2021 at 10:38:42AM -0400, Vivek Goyal wrote:
> Hi,
> 
> As of now we do not support blocking remote posix locks with virtiofs.
> Well fuse client does not care but server returns -EOPNOTSUPP.

Posted the corresponding qemu/virtiofsd changes here.

https://listman.redhat.com/archives/virtio-fs/2021-September/msg00153.html

Thanks
Vivek

> 
> There are couple of reasons to not support it yet.
> 
> - If virtiofsd is single threaded or does not have a thread pool just
>   to handle requests which can block for a long time, virtiofsd will
>   stop processing new requests and virtiofs will come to a halt.
>   To the extent that further unlock request will not make progress
>   and deadlock will result. This can be taken care of by creating
>   a custom thread pool in virtiofsd just to hanlde lock requests.
> 
> - If client sends a blocking lock request and blocks, then it will
>   consume descriptors in vring. If enough processes block, it is
>   possible that vring does not have capacity to send more requests
>   till some response comes back and descriptors are free. This can
>   also lead to deadlock where an unlock request can't be sent to
>   virtiofsd now. Also this will stop virtiofs operation as well as
>   new filesystem requests can't be sent.
> 
> To avoid this issue, idea was suggested thatn when a blocking
> lock request is sent by client, do not block it. Immediately
> send a reply saying client process should wait for a notification
> which will let it know once lock is available. This will make
> sure descriptors in virtqueue are not kept busy while we are
> waiting for lock and future unlock and other file system requests
> can continue to make progress.
> 
> This first requires a notion of notification queue and virtiosfd
> being able to send notifications to client. This patch series
> implements that as well.
> 
> As of now only one notification type has been implemented but now
> infrastructure is in place and other use cases should be easily
> add more type of notifications as need be.
> 
> We don't yet have the capability to interrupt the process which
> is waiting for the posix lock. And reason for that is that virtiofs
> does not support capability to interrupt yet. That's a TODO item
> for later.
> 
> Please have a look.
> 
> Thanks
> Vivek
> 
> Vivek Goyal (8):
>   virtiofs: Disable interrupt requests properly
>   virtiofs: Fix a comment about fuse_dev allocation
>   virtiofs: Add an index to keep track of first request queue
>   virtiofs: Decouple queue index and queue type
>   virtiofs: Add a virtqueue for notifications
>   virtiofs: Add a helper to end request and decrement inflight number
>   virtiofs: Add new notification type FUSE_NOTIFY_LOCK
>   virtiofs: Handle reordering of reply and notification event
> 
>  fs/fuse/virtio_fs.c            | 438 ++++++++++++++++++++++++++++++---
>  include/uapi/linux/fuse.h      |  11 +-
>  include/uapi/linux/virtio_fs.h |   5 +
>  3 files changed, 412 insertions(+), 42 deletions(-)
> 
> -- 
> 2.31.1
> 


^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH 5/8] virtiofs: Add a virtqueue for notifications
  2021-09-30 14:38   ` [Virtio-fs] " Vivek Goyal
@ 2021-10-06 12:46     ` Miklos Szeredi
  -1 siblings, 0 replies; 42+ messages in thread
From: Miklos Szeredi @ 2021-10-06 12:46 UTC (permalink / raw)
  To: Vivek Goyal
  Cc: linux-fsdevel, virtio-fs-list, Stefan Hajnoczi,
	Ioannis Angelakopoulos, jaggel, Dr. David Alan Gilbert

On Thu, 30 Sept 2021 at 16:39, Vivek Goyal <vgoyal@redhat.com> wrote:

> @@ -34,10 +35,12 @@ static LIST_HEAD(virtio_fs_instances);
>
>  enum {
>         VQ_TYPE_HIPRIO,
> -       VQ_TYPE_REQUEST
> +       VQ_TYPE_REQUEST,
> +       VQ_TYPE_NOTIFY
>  };
>
>  #define VQ_NAME_LEN    24
> +#define VQ_NOTIFY_ELEMS 16     /* Number of notification elements */

Where does this number come from?

Thanks,
Miklos

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH 5/8] virtiofs: Add a virtqueue for notifications
  2021-10-06 12:46     ` [Virtio-fs] " Miklos Szeredi
@ 2021-10-06 12:54       ` Vivek Goyal
  -1 siblings, 0 replies; 42+ messages in thread
From: Vivek Goyal @ 2021-10-06 12:54 UTC (permalink / raw)
  To: Miklos Szeredi
  Cc: linux-fsdevel, virtio-fs-list, Stefan Hajnoczi,
	Ioannis Angelakopoulos, jaggel, Dr. David Alan Gilbert

On Wed, Oct 06, 2021 at 02:46:46PM +0200, Miklos Szeredi wrote:
> On Thu, 30 Sept 2021 at 16:39, Vivek Goyal <vgoyal@redhat.com> wrote:
> 
> > @@ -34,10 +35,12 @@ static LIST_HEAD(virtio_fs_instances);
> >
> >  enum {
> >         VQ_TYPE_HIPRIO,
> > -       VQ_TYPE_REQUEST
> > +       VQ_TYPE_REQUEST,
> > +       VQ_TYPE_NOTIFY
> >  };
> >
> >  #define VQ_NAME_LEN    24
> > +#define VQ_NOTIFY_ELEMS 16     /* Number of notification elements */
> 
> Where does this number come from?

I just chose an arbitrary number. Not sure what's a good number and
how to decide that. The good thing is that it's not part of the protocol,
so the guest should be able to change it if need be.

Stefan, do you have any thoughts on this, depending on what other
virtio drivers have done w.r.t. this?

Thanks
Vivek


^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH 7/8] virtiofs: Add new notification type FUSE_NOTIFY_LOCK
  2021-09-30 14:38   ` [Virtio-fs] " Vivek Goyal
@ 2021-10-06 12:55     ` Miklos Szeredi
  -1 siblings, 0 replies; 42+ messages in thread
From: Miklos Szeredi @ 2021-10-06 12:55 UTC (permalink / raw)
  To: Vivek Goyal
  Cc: linux-fsdevel, virtio-fs-list, Stefan Hajnoczi,
	Ioannis Angelakopoulos, jaggel, Dr. David Alan Gilbert

On Thu, 30 Sept 2021 at 16:39, Vivek Goyal <vgoyal@redhat.com> wrote:
>
> Add a new notification type FUSE_NOTIFY_LOCK. This notification can be
> sent by the file server to signify that a previous locking request has
> completed and the actual caller should be woken up.
>
> As of now we don't support the blocking variant of posix locks and the
> daemon returns -EOPNOTSUPP. The reason is that it can lead to deadlocks.
> Virtqueue size is limited and it is possible we fill the virtqueue with
> fcntl(F_SETLKW) requests and wait for replies. A later unlock request
> then can't make progress because the virtqueue is full. That means
> F_SETLKW can't make progress and we are deadlocked.
>
> This problem is not limited to posix locks only. I think blocking remote
> flock and open file description locks should face the same issue. Right
> now fuse does not support open file description locks, so it's not
> a problem. But fuse/virtiofs does support remote flock and they can use
> the same mechanism too.
>
> Use the notification queue to solve this problem. After submitting a lock
> request, the device will send a reply asking the requester to wait. Once
> the lock is available, the requester will get a notification saying the
> lock is available. That way we don't keep the request virtqueue busy
> while we are waiting for the lock, and further unlock requests can make
> progress.
>
> When we get a reply in response to a lock request, we need a way to know
> whether we need to wait for a notification or not. I have overloaded the
> fuse_out_header->error field. If the value of ->error is 1, that's a
> signal to the caller to wait for the lock notification. Overloading
> ->error in this way is not the best way to do it, but I am running out
> of ideas.

Since all values besides -511..0 are reserved this seems okay.
However this needs to be a named value and added to uapi/linux/fuse.h.

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH 7/8] virtiofs: Add new notification type FUSE_NOTIFY_LOCK
  2021-09-30 14:38   ` [Virtio-fs] " Vivek Goyal
@ 2021-10-06 13:02     ` Miklos Szeredi
  -1 siblings, 0 replies; 42+ messages in thread
From: Miklos Szeredi @ 2021-10-06 13:02 UTC (permalink / raw)
  To: Vivek Goyal
  Cc: linux-fsdevel, virtio-fs-list, Stefan Hajnoczi,
	Ioannis Angelakopoulos, jaggel, Dr. David Alan Gilbert

On Thu, 30 Sept 2021 at 16:39, Vivek Goyal <vgoyal@redhat.com> wrote:
>
> Add a new notification type FUSE_NOTIFY_LOCK. This notification can be
> sent by the file server to signify that a previous locking request has
> completed and the actual caller should be woken up.

Shouldn't this also be generic instead of lock specific?

I.e. generic header  + original outarg.

Thanks,
Miklos

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH 7/8] virtiofs: Add new notification type FUSE_NOTIFY_LOCK
  2021-10-06 12:55     ` [Virtio-fs] " Miklos Szeredi
@ 2021-10-06 15:01       ` Vivek Goyal
  -1 siblings, 0 replies; 42+ messages in thread
From: Vivek Goyal @ 2021-10-06 15:01 UTC (permalink / raw)
  To: Miklos Szeredi
  Cc: linux-fsdevel, virtio-fs-list, Stefan Hajnoczi,
	Ioannis Angelakopoulos, jaggel, Dr. David Alan Gilbert

On Wed, Oct 06, 2021 at 02:55:16PM +0200, Miklos Szeredi wrote:
> On Thu, 30 Sept 2021 at 16:39, Vivek Goyal <vgoyal@redhat.com> wrote:
> >
> > Add a new notification type FUSE_NOTIFY_LOCK. This notification can be
> > sent by the file server to signify that a previous locking request has
> > completed and the actual caller should be woken up.
> >
> > As of now we don't support the blocking variant of posix locks and the
> > daemon returns -EOPNOTSUPP. The reason is that it can lead to deadlocks.
> > Virtqueue size is limited and it is possible we fill the virtqueue with
> > fcntl(F_SETLKW) requests and wait for replies. A later unlock request
> > then can't make progress because the virtqueue is full. That means
> > F_SETLKW can't make progress and we are deadlocked.
> >
> > This problem is not limited to posix locks only. I think blocking remote
> > flock and open file description locks should face the same issue. Right
> > now fuse does not support open file description locks, so it's not
> > a problem. But fuse/virtiofs does support remote flock and they can use
> > the same mechanism too.
> >
> > Use the notification queue to solve this problem. After submitting a lock
> > request, the device will send a reply asking the requester to wait. Once
> > the lock is available, the requester will get a notification saying the
> > lock is available. That way we don't keep the request virtqueue busy
> > while we are waiting for the lock, and further unlock requests can make
> > progress.
> >
> > When we get a reply in response to a lock request, we need a way to know
> > whether we need to wait for a notification or not. I have overloaded the
> > fuse_out_header->error field. If the value of ->error is 1, that's a
> > signal to the caller to wait for the lock notification. Overloading
> > ->error in this way is not the best way to do it, but I am running out
> > of ideas.
> 
> Since all values besides -511..0 are reserved this seems okay.
> However this needs to be a named value and added to uapi/linux/fuse.h.

Ok, I will define this value and put it in uapi/linux/fuse.h.

Thanks
Vivek


^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH 7/8] virtiofs: Add new notification type FUSE_NOTIFY_LOCK
  2021-10-06 13:02     ` [Virtio-fs] " Miklos Szeredi
@ 2021-10-06 16:12       ` Vivek Goyal
  -1 siblings, 0 replies; 42+ messages in thread
From: Vivek Goyal @ 2021-10-06 16:12 UTC (permalink / raw)
  To: Miklos Szeredi
  Cc: linux-fsdevel, virtio-fs-list, Stefan Hajnoczi,
	Ioannis Angelakopoulos, jaggel, Dr. David Alan Gilbert

On Wed, Oct 06, 2021 at 03:02:36PM +0200, Miklos Szeredi wrote:
> On Thu, 30 Sept 2021 at 16:39, Vivek Goyal <vgoyal@redhat.com> wrote:
> >
> > Add a new notification type FUSE_NOTIFY_LOCK. This notification can be
> > sent by the file server to signify that a previous locking request has
> > completed and the actual caller should be woken up.
> 
> Shouldn't this also be generic instead of lock specific?
> 
> I.e. generic header  + original outarg.

Hi Miklos,

I am not sure I understand the idea. Can you please elaborate a bit more.

IIUC, "fuse_out_header + original outarg"  is format for responding
to regular fuse requests. If we use that it will become like responding
to same request twice. First time we responded with ->error=1 so that
caller can wait and second time we respond with actual outarg (if
there is one depending on the type of request).

IOW, this will become more like implementing blocking of requests in
the client in a generic manner.

But outarg depends on the type of request (in the case of locking there is
none), and outarg memory is allocated by the driver and filled by the server.
In the case of notifications, the driver allocates the memory but it
does not know in advance which notification will arrive or how much memory
to allocate. So it relies on the device telling it how much memory
to allocate in general, so that any of the pre-defined notification
types can fit in (fs->notify_buf_size).

I modeled this along the same lines as other fuse notifications, where
the server sends notifications in the following format:

fuse_out_header + <structure based on notification type>

out_header->unique is 0 for notifications, to differentiate notifications
from request replies.

out_header->error contains the code of the actual notification being sent,
e.g. FUSE_NOTIFY_INVAL_INODE, FUSE_NOTIFY_LOCK or FUSE_NOTIFY_DELETE.
Right now virtiofs supports only one notification type, but in the future
we can introduce more types (to support inotify and similar features).
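
To make the format concrete, here is a minimal sketch of how the driver side
could dispatch such a buffer (illustrative only; the function name is made up,
and it assumes the FUSE_NOTIFY_LOCK value from patch 7 is available in
<linux/fuse.h>):

#include <linux/fuse.h>

static void demo_dispatch_notify(void *buf, unsigned int len)
{
	struct fuse_out_header *oh = buf;

	/* notifications are identified by unique == 0 */
	if (len < sizeof(*oh) || oh->unique != 0)
		return;

	/* for notifications, ->error carries the notification code, not an errno */
	switch (oh->error) {
	case FUSE_NOTIFY_LOCK:
		/* payload after the header identifies the completed lock request */
		break;
	default:
		/* unknown notification type: ignore */
		break;
	}
}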

In short, I modeled this on the existing notion of fuse notifications
(and not on fuse replies). And given that notifications are asynchronous,
we don't know what the original outarg was; in fact a notification might
not be generated in response to a request at all. That's
why I defined a notification type (FUSE_NOTIFY_LOCK)
and let the driver decide how to handle that notification.

I might have completely misunderstood your suggestion. Please help
me understand.

Thanks
Vivek


^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH 7/8] virtiofs: Add new notification type FUSE_NOTIFY_LOCK
  2021-10-06 16:12       ` [Virtio-fs] " Vivek Goyal
@ 2021-10-07 13:45         ` Miklos Szeredi
  -1 siblings, 0 replies; 42+ messages in thread
From: Miklos Szeredi @ 2021-10-07 13:45 UTC (permalink / raw)
  To: Vivek Goyal
  Cc: linux-fsdevel, virtio-fs-list, Stefan Hajnoczi,
	Ioannis Angelakopoulos, jaggel, Dr. David Alan Gilbert

On Wed, 6 Oct 2021 at 18:13, Vivek Goyal <vgoyal@redhat.com> wrote:
>
> On Wed, Oct 06, 2021 at 03:02:36PM +0200, Miklos Szeredi wrote:
> > On Thu, 30 Sept 2021 at 16:39, Vivek Goyal <vgoyal@redhat.com> wrote:
> > >
> > > Add a new notification type FUSE_NOTIFY_LOCK. This notification can be
> > > sent by the file server to signify that a previous locking request has
> > > completed and the actual caller should be woken up.
> >
> > Shouldn't this also be generic instead of lock specific?
> >
> > I.e. generic header  + original outarg.
>
> Hi Miklos,
>
> I am not sure I understand the idea. Can you please elaborate a bit more.
>
> IIUC, "fuse_out_header + original outarg"  is format for responding
> to regular fuse requests. If we use that it will become like responding
> to same request twice. First time we responded with ->error=1 so that
> caller can wait and second time we respond with actual outarg (if
> there is one depending on the type of request).
>
> IOW, this will become more like implementing blocking of request in
> client in a more generic manner.
>
> But outarg, depends on type of request (In case of locking there is
> none). And outarg memory is allocated by driver and filled by server.
> In case of notifications, driver is allocating the memory but it
> does not know what will come in notification and how much memory
> to allocate. So it relies on device telling it how much memory
> to allocate in general so that bunch of pre-defined notification
> types can fit in (fs->notify_buf_size).
>
> I modeled this on the same lines as other fuse notifications where
> server sends notifications with following format.
>
> fuse_out_header + <structure based on notification type>
>
> out_header->unique is 0 for notifications to differentiate notifications
> from request reply.
>
> out_header->error contains the code of actual notification being sent.
> ex. FUSE_NOTIFY_INVAL_INODE or FUSE_NOTIFY_LOCK or FUSE_NOTIFY_DELETE.
> Right now virtiofs supports only one notification type. But in future
> we can introduce more types (to support inotify stuff etc).
>
> In short, I modeled this on existing notion of fuse notifications
> (and not fuse reply). And given notifications are asynchronous,
> we don't know what were original outarg. In fact they might
> be generated not necessarily in response to a request. And that's
> why this notion of defining a type of notification (FUSE_NOTIFY_LOCK)
> and then let driver decide how to handle this notification.
>
> I might have completely misunderstood your suggestion. Please help
> me understand.

Okay, so we are expecting this mechanism to be used only for blocking
locks.  That makes sense, but then the locking ops should set a
flag indicating that this is a locking op, i.e. in fuse_setlk():

    args.blocking_lock = true;

And this should be verified when the reply with the positive error comes back.
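
A rough sketch of the idea (illustrative only; the demo_* names are local to
this example, and only the blocking_lock flag itself is what is being
proposed):

#include <stdbool.h>

/* stand-in for the relevant part of struct fuse_args */
struct demo_fuse_args {
	bool blocking_lock;	/* set only by blocking locking ops */
};

/* what fuse_setlk() would do for a blocking (F_SETLKW) request */
static void demo_setlk_prepare(struct demo_fuse_args *args, bool sleep)
{
	if (sleep)
		args->blocking_lock = true;
}

/* on reply: the positive "wait for notification" value is only valid
 * for a blocking lock op
 */
static bool demo_reply_should_wait(const struct demo_fuse_args *args, int error)
{
	return error == 1 && args->blocking_lock;
}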

Thanks,
Miklos

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH 7/8] virtiofs: Add new notification type FUSE_NOTIFY_LOCK
  2021-10-07 13:45         ` [Virtio-fs] " Miklos Szeredi
@ 2021-10-07 14:21           ` Vivek Goyal
  -1 siblings, 0 replies; 42+ messages in thread
From: Vivek Goyal @ 2021-10-07 14:21 UTC (permalink / raw)
  To: Miklos Szeredi
  Cc: linux-fsdevel, virtio-fs-list, Stefan Hajnoczi,
	Ioannis Angelakopoulos, jaggel, Dr. David Alan Gilbert

On Thu, Oct 07, 2021 at 03:45:40PM +0200, Miklos Szeredi wrote:
> On Wed, 6 Oct 2021 at 18:13, Vivek Goyal <vgoyal@redhat.com> wrote:
> >
> > On Wed, Oct 06, 2021 at 03:02:36PM +0200, Miklos Szeredi wrote:
> > > On Thu, 30 Sept 2021 at 16:39, Vivek Goyal <vgoyal@redhat.com> wrote:
> > > >
> > > > Add a new notification type FUSE_NOTIFY_LOCK. This notification can be
> > > > sent by the file server to signify that a previous locking request has
> > > > completed and the actual caller should be woken up.
> > >
> > > Shouldn't this also be generic instead of lock specific?
> > >
> > > I.e. generic header  + original outarg.
> >
> > Hi Miklos,
> >
> > I am not sure I understand the idea. Can you please elaborate a bit more.
> >
> > IIUC, "fuse_out_header + original outarg"  is format for responding
> > to regular fuse requests. If we use that it will become like responding
> > to same request twice. First time we responded with ->error=1 so that
> > caller can wait and second time we respond with actual outarg (if
> > there is one depending on the type of request).
> >
> > IOW, this will become more like implementing blocking of request in
> > client in a more generic manner.
> >
> > But outarg, depends on type of request (In case of locking there is
> > none). And outarg memory is allocated by driver and filled by server.
> > In case of notifications, driver is allocating the memory but it
> > does not know what will come in notification and how much memory
> > to allocate. So it relies on device telling it how much memory
> > to allocate in general so that bunch of pre-defined notification
> > types can fit in (fs->notify_buf_size).
> >
> > I modeled this on the same lines as other fuse notifications where
> > server sends notifications with following format.
> >
> > fuse_out_header + <structure based on notification type>
> >
> > out_header->unique is 0 for notifications to differentiate notifications
> > from request reply.
> >
> > out_header->error contains the code of actual notification being sent.
> > ex. FUSE_NOTIFY_INVAL_INODE or FUSE_NOTIFY_LOCK or FUSE_NOTIFY_DELETE.
> > Right now virtiofs supports only one notification type. But in future
> > we can introduce more types (to support inotify stuff etc).
> >
> > In short, I modeled this on existing notion of fuse notifications
> > (and not fuse reply). And given notifications are asynchronous,
> > we don't know what were original outarg. In fact they might
> > be generated not necessarily in response to a request. And that's
> > why this notion of defining a type of notification (FUSE_NOTIFY_LOCK)
> > and then let driver decide how to handle this notification.
> >
> > I might have completely misunderstood your suggestion. Please help
> > me understand.
> 
> Okay, so we are expecting this mechanism to be only used for blocking
> locks.

Yes, as of now it is being used only for blocking locks. So there
are two parts to it.

A. For a blocking operation, the server can reply with error=1, and that's
   a signal to the client to wait for a notification to arrive later. The
   fuse client will not complete the request and instead will queue it
   on one of its internal lists.

B. Later the server will send a fuse notification event (FUSE_NOTIFY_LOCK)
   when it has acquired the lock. This notification carries the unique
   number of the request for which it was generated. The fuse client
   will search the list for the request with the expected unique number
   and complete it (see the sketch below).

I think part A is generic in the sense that it could be used for other
kinds of blocking requests as well down the line, where the server is
doing the blocking operation on behalf of the client and will send a
notification later. Part B is very specific to blocking locks though.
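
Illustrative sketch of the part B lookup (the demo_* types and the plain
linked list stand in for the driver's real bookkeeping; this is not code
from the patches):

#include <stdbool.h>
#include <stdint.h>

/* a request parked after the server replied with error=1 */
struct demo_pending_req {
	struct demo_pending_req *next;
	uint64_t unique;	/* id of the original blocking lock request */
	int error;		/* final result, filled in from the notification */
	bool done;		/* set when the waiting caller may be woken up */
};

/* handle a FUSE_NOTIFY_LOCK notification carrying (unique, error) */
static void demo_handle_notify_lock(struct demo_pending_req *pending,
				    uint64_t unique, int error)
{
	struct demo_pending_req *req;

	for (req = pending; req; req = req->next) {
		if (req->unique == unique) {
			req->error = error;
			req->done = true;	/* complete the parked request */
			break;
		}
	}
}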

> That makes sense, but then locking ops should be setting a
> flag indicating that this is locking op.  I.e. in fuse_setlk():
> 
>     args.blocking_lock = true;
> 
> And this should be verified when the reply with the positive error comes back.

So does this args.blocking_lock go to the server as well? Or is it something
internal to the fuse client, so that the client can decide whether ->error=1
is a valid response or not? IOW, the client is verifying
whether the server should have generated ->error=1 for this specific
request.

In the case of virtiofs the server is a trusted entity, so this kind of
verification might be nice to have but is not necessary.

Thanks
Vivek


^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH 7/8] virtiofs: Add new notification type FUSE_NOTIFY_LOCK
  2021-10-07 14:21           ` [Virtio-fs] " Vivek Goyal
@ 2021-10-07 18:11             ` Miklos Szeredi
  -1 siblings, 0 replies; 42+ messages in thread
From: Miklos Szeredi @ 2021-10-07 18:11 UTC (permalink / raw)
  To: Vivek Goyal
  Cc: linux-fsdevel, virtio-fs-list, Stefan Hajnoczi,
	Ioannis Angelakopoulos, jaggel, Dr. David Alan Gilbert

On Thu, 7 Oct 2021 at 16:22, Vivek Goyal <vgoyal@redhat.com> wrote:
>
> On Thu, Oct 07, 2021 at 03:45:40PM +0200, Miklos Szeredi wrote:
> > On Wed, 6 Oct 2021 at 18:13, Vivek Goyal <vgoyal@redhat.com> wrote:
> > >
> > > On Wed, Oct 06, 2021 at 03:02:36PM +0200, Miklos Szeredi wrote:
> > > > On Thu, 30 Sept 2021 at 16:39, Vivek Goyal <vgoyal@redhat.com> wrote:
> > > > >
> > > > > Add a new notification type FUSE_NOTIFY_LOCK. This notification can be
> > > > > sent by the file server to signify that a previous locking request has
> > > > > completed and the actual caller should be woken up.
> > > >
> > > > Shouldn't this also be generic instead of lock specific?
> > > >
> > > > I.e. generic header  + original outarg.
> > >
> > > Hi Miklos,
> > >
> > > I am not sure I understand the idea. Can you please elaborate a bit more.
> > >
> > > IIUC, "fuse_out_header + original outarg"  is format for responding
> > > to regular fuse requests. If we use that it will become like responding
> > > to same request twice. First time we responded with ->error=1 so that
> > > caller can wait and second time we respond with actual outarg (if
> > > there is one depending on the type of request).
> > >
> > > IOW, this will become more like implementing blocking of request in
> > > client in a more generic manner.
> > >
> > > But outarg, depends on type of request (In case of locking there is
> > > none). And outarg memory is allocated by driver and filled by server.
> > > In case of notifications, driver is allocating the memory but it
> > > does not know what will come in notification and how much memory
> > > to allocate. So it relies on device telling it how much memory
> > > to allocate in general so that bunch of pre-defined notification
> > > types can fit in (fs->notify_buf_size).
> > >
> > > I modeled this on the same lines as other fuse notifications where
> > > server sends notifications with following format.
> > >
> > > fuse_out_header + <structure based on notification type>
> > >
> > > out_header->unique is 0 for notifications to differentiate notifications
> > > from request reply.
> > >
> > > out_header->error contains the code of actual notification being sent.
> > > ex. FUSE_NOTIFY_INVAL_INODE or FUSE_NOTIFY_LOCK or FUSE_NOTIFY_DELETE.
> > > Right now virtiofs supports only one notification type. But in future
> > > we can introduce more types (to support inotify stuff etc).
> > >
> > > In short, I modeled this on existing notion of fuse notifications
> > > (and not fuse reply). And given notifications are asynchronous,
> > > we don't know what were original outarg. In fact they might
> > > be generated not necessarily in response to a request. And that's
> > > why this notion of defining a type of notification (FUSE_NOTIFY_LOCK)
> > > and then let driver decide how to handle this notification.
> > >
> > > I might have completely misunderstood your suggestion. Please help
> > > me understand.
> >
> > Okay, so we are expecting this mechanism to be only used for blocking
> > locks.
>
> Yes, as of now it is only being used only for blocking locks. So there
> are two parts to it.
>
> A. For a blocking operation, server can reply with error=1, and that's
>    a signal to client to wait for a notification to arrive later. And
>    fuse client will not complete the request and instead will queue it
>    in one of the internal lists.
>
> B. Later server will send a fuse notification event (FUSE_NOTIFY_LOCK)
>    when it has acquired the lock. This notification will have unique
>    number of request for which this notification has been generated.
>    Fuse client will search for the request with expected unique number
>    in the list and complete the request.
>
> I think part A is generic in the sense it could be used for other
> kind of blocking requests as well down the line, where server is
> doing the blocking operation on behalf of client and will send
> notification later. Part B is very specific to blocking locks though.

I don't really get why B is specific to blocking locks. But anyway...
we are only implementing it for blocking locks for now.

>
> > That makes sense, but then locking ops should be setting a
> > flag indicating that this is locking op.  I.e. in fuse_setlk():
> >
> >     args.blocking_lock = true;
> >
> > And this should be verified when the reply with the positive error comes back.
>
> So this args.blocking_lock, goes to server as well? Or this is something
> internal to fuse client so that client can decide whether ->error=1 is
> a valid response or not. IOW, client is trying to do verification
> whether server should have generated ->error=1 or not for this specific
> request.

Right, it's for the client.

Thanks,
Miklos

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH 7/8] virtiofs: Add new notification type FUSE_NOTIFY_LOCK
  2021-10-07 18:11             ` [Virtio-fs] " Miklos Szeredi
@ 2021-10-07 18:32               ` Vivek Goyal
  -1 siblings, 0 replies; 42+ messages in thread
From: Vivek Goyal @ 2021-10-07 18:32 UTC (permalink / raw)
  To: Miklos Szeredi
  Cc: linux-fsdevel, virtio-fs-list, Stefan Hajnoczi,
	Ioannis Angelakopoulos, jaggel, Dr. David Alan Gilbert

On Thu, Oct 07, 2021 at 08:11:13PM +0200, Miklos Szeredi wrote:
> On Thu, 7 Oct 2021 at 16:22, Vivek Goyal <vgoyal@redhat.com> wrote:
> >
> > On Thu, Oct 07, 2021 at 03:45:40PM +0200, Miklos Szeredi wrote:
> > > On Wed, 6 Oct 2021 at 18:13, Vivek Goyal <vgoyal@redhat.com> wrote:
> > > >
> > > > On Wed, Oct 06, 2021 at 03:02:36PM +0200, Miklos Szeredi wrote:
> > > > > On Thu, 30 Sept 2021 at 16:39, Vivek Goyal <vgoyal@redhat.com> wrote:
> > > > > >
> > > > > > Add a new notification type FUSE_NOTIFY_LOCK. This notification can be
> > > > > > sent by the file server to signify that a previous locking request has
> > > > > > completed and the actual caller should be woken up.
> > > > >
> > > > > Shouldn't this also be generic instead of lock specific?
> > > > >
> > > > > I.e. generic header  + original outarg.
> > > >
> > > > Hi Miklos,
> > > >
> > > > I am not sure I understand the idea. Can you please elaborate a bit more.
> > > >
> > > > IIUC, "fuse_out_header + original outarg"  is format for responding
> > > > to regular fuse requests. If we use that it will become like responding
> > > > to same request twice. First time we responded with ->error=1 so that
> > > > caller can wait and second time we respond with actual outarg (if
> > > > there is one depending on the type of request).
> > > >
> > > > IOW, this will become more like implementing blocking of request in
> > > > client in a more generic manner.
> > > >
> > > > But outarg, depends on type of request (In case of locking there is
> > > > none). And outarg memory is allocated by driver and filled by server.
> > > > In case of notifications, driver is allocating the memory but it
> > > > does not know what will come in notification and how much memory
> > > > to allocate. So it relies on device telling it how much memory
> > > > to allocate in general so that bunch of pre-defined notification
> > > > types can fit in (fs->notify_buf_size).
> > > >
> > > > I modeled this on the same lines as other fuse notifications where
> > > > server sends notifications with following format.
> > > >
> > > > fuse_out_header + <structure based on notification type>
> > > >
> > > > out_header->unique is 0 for notifications to differentiate notifications
> > > > from request reply.
> > > >
> > > > out_header->error contains the code of actual notification being sent.
> > > > ex. FUSE_NOTIFY_INVAL_INODE or FUSE_NOTIFY_LOCK or FUSE_NOTIFY_DELETE.
> > > > Right now virtiofs supports only one notification type. But in future
> > > > we can introduce more types (to support inotify stuff etc).
> > > >
> > > > In short, I modeled this on existing notion of fuse notifications
> > > > (and not fuse reply). And given notifications are asynchronous,
> > > > we don't know what were original outarg. In fact they might
> > > > be generated not necessarily in response to a request. And that's
> > > > why this notion of defining a type of notification (FUSE_NOTIFY_LOCK)
> > > > and then let driver decide how to handle this notification.
> > > >
> > > > I might have completely misunderstood your suggestion. Please help
> > > > me understand.
> > >
> > > Okay, so we are expecting this mechanism to be only used for blocking
> > > locks.
> >
> > Yes, as of now it is only being used only for blocking locks. So there
> > are two parts to it.
> >
> > A. For a blocking operation, server can reply with error=1, and that's
> >    a signal to client to wait for a notification to arrive later. And
> >    fuse client will not complete the request and instead will queue it
> >    in one of the internal lists.
> >
> > B. Later server will send a fuse notification event (FUSE_NOTIFY_LOCK)
> >    when it has acquired the lock. This notification will have unique
> >    number of request for which this notification has been generated.
> >    Fuse client will search for the request with expected unique number
> >    in the list and complete the request.
> >
> > I think part A is generic in the sense it could be used for other
> > kind of blocking requests as well down the line, where server is
> > doing the blocking operation on behalf of client and will send
> > notification later. Part B is very specific to blocking locks though.
> 
> I don't really get why B is specific to blocking locks. But anyway...
> we are only implementing it for blocking locks for now.

Hmm, I am wondering whether I have to make the notification specific to
locks at all. All it is doing is returning an "error" value which either
signifies that the operation completed successfully or carries the error
code if an error occurred.

This could probably be more generic. Maybe I can call the
notification FUSE_NOTIFY_OP_COMPLETE. The notification is
just signalling that a previously issued request has completed.
The request is identified by notify->unique and the result of the blocking
operation is in notify->error. That way it is more generic and can be
used for other kinds of operations too (and not just locking).
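
For example, the notification payload could be as small as this (a sketch
only; the struct name and layout are my assumption, not a final uapi
definition):

#include <stdint.h>

/* possible payload following fuse_out_header for FUSE_NOTIFY_OP_COMPLETE */
struct fuse_notify_op_complete_out {
	uint64_t	unique;		/* id of the request that completed */
	int32_t		error;		/* 0 on success, negative errno otherwise */
	uint32_t	padding;
};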

> 
> >
> > > That makes sense, but then locking ops should be setting a
> > > flag indicating that this is locking op.  I.e. in fuse_setlk():
> > >
> > >     args.blocking_lock = true;
> > >
> > > And this should be verified when the reply with the positive error comes back.
> >
> > So this args.blocking_lock, goes to server as well? Or this is something
> > internal to fuse client so that client can decide whether ->error=1 is
> > a valid response or not. IOW, client is trying to do verification
> > whether server should have generated ->error=1 or not for this specific
> > request.
> 
> Right, it's for the client.

Got it. Will implement it.

Thanks
Vivek


^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH 7/8] virtiofs: Add new notification type FUSE_NOTIFY_LOCK
  2021-10-07 18:32               ` [Virtio-fs] " Vivek Goyal
@ 2021-10-07 18:46                 ` Miklos Szeredi
  -1 siblings, 0 replies; 42+ messages in thread
From: Miklos Szeredi @ 2021-10-07 18:46 UTC (permalink / raw)
  To: Vivek Goyal
  Cc: linux-fsdevel, virtio-fs-list, Stefan Hajnoczi,
	Ioannis Angelakopoulos, jaggel, Dr. David Alan Gilbert

On Thu, 7 Oct 2021 at 20:32, Vivek Goyal <vgoyal@redhat.com> wrote:
>
> On Thu, Oct 07, 2021 at 08:11:13PM +0200, Miklos Szeredi wrote:
> > On Thu, 7 Oct 2021 at 16:22, Vivek Goyal <vgoyal@redhat.com> wrote:
> > >
> > > On Thu, Oct 07, 2021 at 03:45:40PM +0200, Miklos Szeredi wrote:
> > > > On Wed, 6 Oct 2021 at 18:13, Vivek Goyal <vgoyal@redhat.com> wrote:
> > > > >
> > > > > On Wed, Oct 06, 2021 at 03:02:36PM +0200, Miklos Szeredi wrote:
> > > > > > On Thu, 30 Sept 2021 at 16:39, Vivek Goyal <vgoyal@redhat.com> wrote:
> > > > > > >
> > > > > > > Add a new notification type FUSE_NOTIFY_LOCK. This notification can be
> > > > > > > sent by the file server to signify that a previous locking request has
> > > > > > > completed and the actual caller should be woken up.
> > > > > >
> > > > > > Shouldn't this also be generic instead of lock specific?
> > > > > >
> > > > > > I.e. generic header  + original outarg.
> > > > >
> > > > > Hi Miklos,
> > > > >
> > > > > I am not sure I understand the idea. Can you please elaborate a bit more.
> > > > >
> > > > > IIUC, "fuse_out_header + original outarg"  is format for responding
> > > > > to regular fuse requests. If we use that it will become like responding
> > > > > to same request twice. First time we responded with ->error=1 so that
> > > > > caller can wait and second time we respond with actual outarg (if
> > > > > there is one depending on the type of request).
> > > > >
> > > > > IOW, this will become more like implementing blocking of request in
> > > > > client in a more generic manner.
> > > > >
> > > > > But outarg, depends on type of request (In case of locking there is
> > > > > none). And outarg memory is allocated by driver and filled by server.
> > > > > In case of notifications, driver is allocating the memory but it
> > > > > does not know what will come in notification and how much memory
> > > > > to allocate. So it relies on device telling it how much memory
> > > > > to allocate in general so that bunch of pre-defined notification
> > > > > types can fit in (fs->notify_buf_size).
> > > > >
> > > > > I modeled this on the same lines as other fuse notifications where
> > > > > server sends notifications with following format.
> > > > >
> > > > > fuse_out_header + <structure based on notification type>
> > > > >
> > > > > out_header->unique is 0 for notifications to differentiate notifications
> > > > > from request reply.
> > > > >
> > > > > out_header->error contains the code of actual notification being sent.
> > > > > ex. FUSE_NOTIFY_INVAL_INODE or FUSE_NOTIFY_LOCK or FUSE_NOTIFY_DELETE.
> > > > > Right now virtiofs supports only one notification type. But in future
> > > > > we can introduce more types (to support inotify stuff etc).
> > > > >
> > > > > In short, I modeled this on existing notion of fuse notifications
> > > > > (and not fuse reply). And given notifications are asynchronous,
> > > > > we don't know what were original outarg. In fact they might
> > > > > be generated not necessarily in response to a request. And that's
> > > > > why this notion of defining a type of notification (FUSE_NOTIFY_LOCK)
> > > > > and then let driver decide how to handle this notification.
> > > > >
> > > > > I might have completely misunderstood your suggestion. Please help
> > > > > me understand.
> > > >
> > > > Okay, so we are expecting this mechanism to be only used for blocking
> > > > locks.
> > >
> > > Yes, as of now it is only being used only for blocking locks. So there
> > > are two parts to it.
> > >
> > > A. For a blocking operation, server can reply with error=1, and that's
> > >    a signal to client to wait for a notification to arrive later. And
> > >    fuse client will not complete the request and instead will queue it
> > >    in one of the internal lists.
> > >
> > > B. Later server will send a fuse notification event (FUSE_NOTIFY_LOCK)
> > >    when it has acquired the lock. This notification will have unique
> > >    number of request for which this notification has been generated.
> > >    Fuse client will search for the request with expected unique number
> > >    in the list and complete the request.
> > >
> > > I think part A is generic in the sense it could be used for other
> > > kind of blocking requests as well down the line, where server is
> > > doing the blocking operation on behalf of client and will send
> > > notification later. Part B is very specific to blocking locks though.
> >
> > I don't really get why B is specific to blocking locks. But anyway...
> > we are only implementing it for blocking locks for now.
>
> Hmm.., I am wondering do I have to make notification specific to
> lock. All it is doing is returning an "error" code which signifies
> either operation completed successfully or represents error code
> if error occurred.
>
> This probably could be more generic. May be I can call this
> notification FUSE_NOTIFY_OP_COMPLETE. This notification is
> just signalling that a previously issued request has completed.
> Request is identified by notify->unique and result of blocking operation
> is in notify->error. So that way it is more generic and can be
> used for other kind of operations too (and not just locking).

That's exactly what I was thinking.  The implementation can remain
the same for now...

Thanks,
Miklos

^ permalink raw reply	[flat|nested] 42+ messages in thread

end of thread, other threads:[~2021-10-07 18:46 UTC | newest]

Thread overview: 42+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-09-30 14:38 [PATCH 0/8] virtiofs: Notification queue and blocking posix locks Vivek Goyal
2021-09-30 14:38 ` [Virtio-fs] " Vivek Goyal
2021-09-30 14:38 ` [PATCH 1/8] virtiofs: Disable interrupt requests properly Vivek Goyal
2021-09-30 14:38   ` [Virtio-fs] " Vivek Goyal
2021-09-30 14:38 ` [PATCH 2/8] virtiofs: Fix a comment about fuse_dev allocation Vivek Goyal
2021-09-30 14:38   ` [Virtio-fs] " Vivek Goyal
2021-09-30 14:38 ` [PATCH 3/8] virtiofs: Add an index to keep track of first request queue Vivek Goyal
2021-09-30 14:38   ` [Virtio-fs] " Vivek Goyal
2021-09-30 14:38 ` [PATCH 4/8] virtiofs: Decouple queue index and queue type Vivek Goyal
2021-09-30 14:38   ` [Virtio-fs] " Vivek Goyal
2021-09-30 14:38 ` [PATCH 5/8] virtiofs: Add a virtqueue for notifications Vivek Goyal
2021-09-30 14:38   ` [Virtio-fs] " Vivek Goyal
2021-10-06 12:46   ` Miklos Szeredi
2021-10-06 12:46     ` [Virtio-fs] " Miklos Szeredi
2021-10-06 12:54     ` Vivek Goyal
2021-10-06 12:54       ` [Virtio-fs] " Vivek Goyal
2021-09-30 14:38 ` [PATCH 6/8] virtiofs: Add a helper to end request and decrement inflight number Vivek Goyal
2021-09-30 14:38   ` [Virtio-fs] " Vivek Goyal
2021-09-30 14:38 ` [PATCH 7/8] virtiofs: Add new notification type FUSE_NOTIFY_LOCK Vivek Goyal
2021-09-30 14:38   ` [Virtio-fs] " Vivek Goyal
2021-10-06 12:55   ` Miklos Szeredi
2021-10-06 12:55     ` [Virtio-fs] " Miklos Szeredi
2021-10-06 15:01     ` Vivek Goyal
2021-10-06 15:01       ` [Virtio-fs] " Vivek Goyal
2021-10-06 13:02   ` Miklos Szeredi
2021-10-06 13:02     ` [Virtio-fs] " Miklos Szeredi
2021-10-06 16:12     ` Vivek Goyal
2021-10-06 16:12       ` [Virtio-fs] " Vivek Goyal
2021-10-07 13:45       ` Miklos Szeredi
2021-10-07 13:45         ` [Virtio-fs] " Miklos Szeredi
2021-10-07 14:21         ` Vivek Goyal
2021-10-07 14:21           ` [Virtio-fs] " Vivek Goyal
2021-10-07 18:11           ` Miklos Szeredi
2021-10-07 18:11             ` [Virtio-fs] " Miklos Szeredi
2021-10-07 18:32             ` Vivek Goyal
2021-10-07 18:32               ` [Virtio-fs] " Vivek Goyal
2021-10-07 18:46               ` Miklos Szeredi
2021-10-07 18:46                 ` [Virtio-fs] " Miklos Szeredi
2021-09-30 14:38 ` [PATCH 8/8] virtiofs: Handle reordering of reply and notification event Vivek Goyal
2021-09-30 14:38   ` [Virtio-fs] " Vivek Goyal
2021-09-30 15:43 ` [PATCH 0/8] virtiofs: Notification queue and blocking posix locks Vivek Goyal
2021-09-30 15:43   ` [Virtio-fs] " Vivek Goyal
