All of lore.kernel.org
 help / color / mirror / Atom feed
From: Miklos Szeredi <miklos@szeredi.hu>
To: Connor Kuehl <ckuehl@redhat.com>
Cc: virtio-fs@redhat.com, linux-fsdevel@vger.kernel.org,
	linux-kernel@vger.kernel.org,
	virtualization@lists.linux-foundation.org, stefanha@redhat.com,
	vgoyal@redhat.com, jasowang@redhat.com, mst@redhat.com
Subject: Re: [PATCH 2/3] virtiofs: split requests that exceed virtqueue size
Date: Thu, 18 Mar 2021 16:17:51 +0100	[thread overview]
Message-ID: <YFNvH8w4l7WyEMyr@miu.piliscsaba.redhat.com> (raw)
In-Reply-To: <20210318135223.1342795-3-ckuehl@redhat.com>

On Thu, Mar 18, 2021 at 08:52:22AM -0500, Connor Kuehl wrote:
> If an incoming FUSE request can't fit on the virtqueue, the request is
> placed onto a workqueue so a worker can try to resubmit it later where
> there will (hopefully) be space for it next time.
> 
> This is fine for requests that aren't larger than a virtqueue's maximum
> capacity. However, if a request's size exceeds the maximum capacity of
> the virtqueue (even if the virtqueue is empty), it will be doomed to a
> life of being placed on the workqueue, removed, discovered it won't fit,
> and placed on the workqueue yet again.
> 
> Furthermore, from section 2.6.5.3.1 (Driver Requirements: Indirect
> Descriptors) of the virtio spec:
> 
>   "A driver MUST NOT create a descriptor chain longer than the Queue
>   Size of the device."
> 
> To fix this, limit the number of pages FUSE will use for an overall
> request. This way, each request can realistically fit on the virtqueue
> when it is decomposed into a scattergather list and avoid violating
> section 2.6.5.3.1 of the virtio spec.

I removed the conditional compilation and renamed the limit.  Also made
virtio_fs_get_tree() bail out if it hit the WARN_ON().  Updated patch below.

The virtio_ring patch in this series should probably go through the respective
subsystem tree.


Thanks,
Miklos

---
From: Connor Kuehl <ckuehl@redhat.com>
Subject: virtiofs: split requests that exceed virtqueue size
Date: Thu, 18 Mar 2021 08:52:22 -0500

If an incoming FUSE request can't fit on the virtqueue, the request is
placed onto a workqueue so a worker can try to resubmit it later where
there will (hopefully) be space for it next time.

This is fine for requests that aren't larger than a virtqueue's maximum
capacity.  However, if a request's size exceeds the maximum capacity of the
virtqueue (even if the virtqueue is empty), it will be doomed to a life of
being placed on the workqueue, removed, discovered it won't fit, and placed
on the workqueue yet again.

Furthermore, from section 2.6.5.3.1 (Driver Requirements: Indirect
Descriptors) of the virtio spec:

  "A driver MUST NOT create a descriptor chain longer than the Queue
  Size of the device."

To fix this, limit the number of pages FUSE will use for an overall
request.  This way, each request can realistically fit on the virtqueue
when it is decomposed into a scattergather list and avoid violating section
2.6.5.3.1 of the virtio spec.

Signed-off-by: Connor Kuehl <ckuehl@redhat.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
---
 fs/fuse/fuse_i.h    |    3 +++
 fs/fuse/inode.c     |    3 ++-
 fs/fuse/virtio_fs.c |   19 +++++++++++++++++--
 3 files changed, 22 insertions(+), 3 deletions(-)

--- a/fs/fuse/fuse_i.h
+++ b/fs/fuse/fuse_i.h
@@ -555,6 +555,9 @@ struct fuse_conn {
 	/** Maxmum number of pages that can be used in a single request */
 	unsigned int max_pages;
 
+	/** Constrain ->max_pages to this value during feature negotiation */
+	unsigned int max_pages_limit;
+
 	/** Input queue */
 	struct fuse_iqueue iq;
 
--- a/fs/fuse/inode.c
+++ b/fs/fuse/inode.c
@@ -712,6 +712,7 @@ void fuse_conn_init(struct fuse_conn *fc
 	fc->pid_ns = get_pid_ns(task_active_pid_ns(current));
 	fc->user_ns = get_user_ns(user_ns);
 	fc->max_pages = FUSE_DEFAULT_MAX_PAGES_PER_REQ;
+	fc->max_pages_limit = FUSE_MAX_MAX_PAGES;
 
 	INIT_LIST_HEAD(&fc->mounts);
 	list_add(&fm->fc_entry, &fc->mounts);
@@ -1040,7 +1041,7 @@ static void process_init_reply(struct fu
 				fc->abort_err = 1;
 			if (arg->flags & FUSE_MAX_PAGES) {
 				fc->max_pages =
-					min_t(unsigned int, FUSE_MAX_MAX_PAGES,
+					min_t(unsigned int, fc->max_pages_limit,
 					max_t(unsigned int, arg->max_pages, 1));
 			}
 			if (IS_ENABLED(CONFIG_FUSE_DAX) &&
--- a/fs/fuse/virtio_fs.c
+++ b/fs/fuse/virtio_fs.c
@@ -18,6 +18,12 @@
 #include <linux/uio.h>
 #include "fuse_i.h"
 
+/* Used to help calculate the FUSE connection's max_pages limit for a request's
+ * size. Parts of the struct fuse_req are sliced into scattergather lists in
+ * addition to the pages used, so this can help account for that overhead.
+ */
+#define FUSE_HEADER_OVERHEAD    4
+
 /* List of virtio-fs device instances and a lock for the list. Also provides
  * mutual exclusion in device removal and mounting path
  */
@@ -1413,9 +1419,10 @@ static int virtio_fs_get_tree(struct fs_
 {
 	struct virtio_fs *fs;
 	struct super_block *sb;
-	struct fuse_conn *fc;
+	struct fuse_conn *fc = NULL;
 	struct fuse_mount *fm;
-	int err;
+	unsigned int virtqueue_size;
+	int err = -EIO;
 
 	/* This gets a reference on virtio_fs object. This ptr gets installed
 	 * in fc->iq->priv. Once fuse_conn is going away, it calls ->put()
@@ -1427,6 +1434,10 @@ static int virtio_fs_get_tree(struct fs_
 		return -EINVAL;
 	}
 
+	virtqueue_size = virtqueue_get_vring_size(fs->vqs[VQ_REQUEST].vq);
+	if (WARN_ON(virtqueue_size <= FUSE_HEADER_OVERHEAD))
+		goto out_err;
+
 	err = -ENOMEM;
 	fc = kzalloc(sizeof(struct fuse_conn), GFP_KERNEL);
 	if (!fc)
@@ -1442,6 +1453,10 @@ static int virtio_fs_get_tree(struct fs_
 	fc->delete_stale = true;
 	fc->auto_submounts = true;
 
+	/* Tell FUSE to split requests that exceed the virtqueue's size */
+	fc->max_pages_limit = min_t(unsigned int, fc->max_pages_limit,
+				    virtqueue_size - FUSE_HEADER_OVERHEAD);
+
 	fsc->s_fs_info = fm;
 	sb = sget_fc(fsc, virtio_fs_test_super, set_anon_super_fc);
 	if (fsc->s_fs_info) {

WARNING: multiple messages have this Message-ID (diff)
From: Miklos Szeredi <miklos@szeredi.hu>
To: Connor Kuehl <ckuehl@redhat.com>
Cc: mst@redhat.com, jasowang@redhat.com,
	linux-kernel@vger.kernel.org,
	virtualization@lists.linux-foundation.org, virtio-fs@redhat.com,
	linux-fsdevel@vger.kernel.org, vgoyal@redhat.com
Subject: Re: [Virtio-fs] [PATCH 2/3] virtiofs: split requests that exceed virtqueue size
Date: Thu, 18 Mar 2021 16:17:51 +0100	[thread overview]
Message-ID: <YFNvH8w4l7WyEMyr@miu.piliscsaba.redhat.com> (raw)
In-Reply-To: <20210318135223.1342795-3-ckuehl@redhat.com>

On Thu, Mar 18, 2021 at 08:52:22AM -0500, Connor Kuehl wrote:
> If an incoming FUSE request can't fit on the virtqueue, the request is
> placed onto a workqueue so a worker can try to resubmit it later where
> there will (hopefully) be space for it next time.
> 
> This is fine for requests that aren't larger than a virtqueue's maximum
> capacity. However, if a request's size exceeds the maximum capacity of
> the virtqueue (even if the virtqueue is empty), it will be doomed to a
> life of being placed on the workqueue, removed, discovered it won't fit,
> and placed on the workqueue yet again.
> 
> Furthermore, from section 2.6.5.3.1 (Driver Requirements: Indirect
> Descriptors) of the virtio spec:
> 
>   "A driver MUST NOT create a descriptor chain longer than the Queue
>   Size of the device."
> 
> To fix this, limit the number of pages FUSE will use for an overall
> request. This way, each request can realistically fit on the virtqueue
> when it is decomposed into a scattergather list and avoid violating
> section 2.6.5.3.1 of the virtio spec.

I removed the conditional compilation and renamed the limit.  Also made
virtio_fs_get_tree() bail out if it hit the WARN_ON().  Updated patch below.

The virtio_ring patch in this series should probably go through the respective
subsystem tree.


Thanks,
Miklos

---
From: Connor Kuehl <ckuehl@redhat.com>
Subject: virtiofs: split requests that exceed virtqueue size
Date: Thu, 18 Mar 2021 08:52:22 -0500

If an incoming FUSE request can't fit on the virtqueue, the request is
placed onto a workqueue so a worker can try to resubmit it later where
there will (hopefully) be space for it next time.

This is fine for requests that aren't larger than a virtqueue's maximum
capacity.  However, if a request's size exceeds the maximum capacity of the
virtqueue (even if the virtqueue is empty), it will be doomed to a life of
being placed on the workqueue, removed, discovered it won't fit, and placed
on the workqueue yet again.

Furthermore, from section 2.6.5.3.1 (Driver Requirements: Indirect
Descriptors) of the virtio spec:

  "A driver MUST NOT create a descriptor chain longer than the Queue
  Size of the device."

To fix this, limit the number of pages FUSE will use for an overall
request.  This way, each request can realistically fit on the virtqueue
when it is decomposed into a scattergather list and avoid violating section
2.6.5.3.1 of the virtio spec.

Signed-off-by: Connor Kuehl <ckuehl@redhat.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
---
 fs/fuse/fuse_i.h    |    3 +++
 fs/fuse/inode.c     |    3 ++-
 fs/fuse/virtio_fs.c |   19 +++++++++++++++++--
 3 files changed, 22 insertions(+), 3 deletions(-)

--- a/fs/fuse/fuse_i.h
+++ b/fs/fuse/fuse_i.h
@@ -555,6 +555,9 @@ struct fuse_conn {
 	/** Maxmum number of pages that can be used in a single request */
 	unsigned int max_pages;
 
+	/** Constrain ->max_pages to this value during feature negotiation */
+	unsigned int max_pages_limit;
+
 	/** Input queue */
 	struct fuse_iqueue iq;
 
--- a/fs/fuse/inode.c
+++ b/fs/fuse/inode.c
@@ -712,6 +712,7 @@ void fuse_conn_init(struct fuse_conn *fc
 	fc->pid_ns = get_pid_ns(task_active_pid_ns(current));
 	fc->user_ns = get_user_ns(user_ns);
 	fc->max_pages = FUSE_DEFAULT_MAX_PAGES_PER_REQ;
+	fc->max_pages_limit = FUSE_MAX_MAX_PAGES;
 
 	INIT_LIST_HEAD(&fc->mounts);
 	list_add(&fm->fc_entry, &fc->mounts);
@@ -1040,7 +1041,7 @@ static void process_init_reply(struct fu
 				fc->abort_err = 1;
 			if (arg->flags & FUSE_MAX_PAGES) {
 				fc->max_pages =
-					min_t(unsigned int, FUSE_MAX_MAX_PAGES,
+					min_t(unsigned int, fc->max_pages_limit,
 					max_t(unsigned int, arg->max_pages, 1));
 			}
 			if (IS_ENABLED(CONFIG_FUSE_DAX) &&
--- a/fs/fuse/virtio_fs.c
+++ b/fs/fuse/virtio_fs.c
@@ -18,6 +18,12 @@
 #include <linux/uio.h>
 #include "fuse_i.h"
 
+/* Used to help calculate the FUSE connection's max_pages limit for a request's
+ * size. Parts of the struct fuse_req are sliced into scattergather lists in
+ * addition to the pages used, so this can help account for that overhead.
+ */
+#define FUSE_HEADER_OVERHEAD    4
+
 /* List of virtio-fs device instances and a lock for the list. Also provides
  * mutual exclusion in device removal and mounting path
  */
@@ -1413,9 +1419,10 @@ static int virtio_fs_get_tree(struct fs_
 {
 	struct virtio_fs *fs;
 	struct super_block *sb;
-	struct fuse_conn *fc;
+	struct fuse_conn *fc = NULL;
 	struct fuse_mount *fm;
-	int err;
+	unsigned int virtqueue_size;
+	int err = -EIO;
 
 	/* This gets a reference on virtio_fs object. This ptr gets installed
 	 * in fc->iq->priv. Once fuse_conn is going away, it calls ->put()
@@ -1427,6 +1434,10 @@ static int virtio_fs_get_tree(struct fs_
 		return -EINVAL;
 	}
 
+	virtqueue_size = virtqueue_get_vring_size(fs->vqs[VQ_REQUEST].vq);
+	if (WARN_ON(virtqueue_size <= FUSE_HEADER_OVERHEAD))
+		goto out_err;
+
 	err = -ENOMEM;
 	fc = kzalloc(sizeof(struct fuse_conn), GFP_KERNEL);
 	if (!fc)
@@ -1442,6 +1453,10 @@ static int virtio_fs_get_tree(struct fs_
 	fc->delete_stale = true;
 	fc->auto_submounts = true;
 
+	/* Tell FUSE to split requests that exceed the virtqueue's size */
+	fc->max_pages_limit = min_t(unsigned int, fc->max_pages_limit,
+				    virtqueue_size - FUSE_HEADER_OVERHEAD);
+
 	fsc->s_fs_info = fm;
 	sb = sget_fc(fsc, virtio_fs_test_super, set_anon_super_fc);
 	if (fsc->s_fs_info) {


  reply	other threads:[~2021-03-18 15:19 UTC|newest]

Thread overview: 48+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-03-18 13:52 [PATCH 0/3] virtiofs: split requests that exceed virtqueue size Connor Kuehl
2021-03-18 13:52 ` [Virtio-fs] " Connor Kuehl
2021-03-18 13:52 ` [PATCH 1/3] virtio_ring: always warn when descriptor chain exceeds queue size Connor Kuehl
2021-03-18 13:52   ` [Virtio-fs] " Connor Kuehl
2021-03-22  3:22   ` Jason Wang
2021-03-22  3:22     ` [Virtio-fs] " Jason Wang
2021-03-22  3:22     ` Jason Wang
2021-03-22  3:41     ` Jason Wang
2021-03-22  8:17     ` Michael S. Tsirkin
2021-03-22  8:17       ` [Virtio-fs] " Michael S. Tsirkin
2021-03-22  8:17       ` Michael S. Tsirkin
2021-03-23  2:38       ` Jason Wang
2021-03-23  2:38         ` [Virtio-fs] " Jason Wang
2021-03-23  2:38         ` Jason Wang
2021-03-18 13:52 ` [PATCH 2/3] virtiofs: split requests that exceed virtqueue size Connor Kuehl
2021-03-18 13:52   ` [Virtio-fs] " Connor Kuehl
2021-03-18 15:17   ` Miklos Szeredi [this message]
2021-03-18 15:17     ` Miklos Szeredi
2021-03-18 15:52     ` Connor Kuehl
2021-03-18 15:52       ` [Virtio-fs] " Connor Kuehl
2021-03-20 20:04       ` Michael S. Tsirkin
2021-03-20 20:04         ` [Virtio-fs] " Michael S. Tsirkin
2021-03-20 20:04         ` Michael S. Tsirkin
2021-03-22 19:01     ` Vivek Goyal
2021-03-22 19:01       ` [Virtio-fs] " Vivek Goyal
2021-03-22 19:01       ` Vivek Goyal
2021-03-24 15:09     ` Connor Kuehl
2021-03-24 15:09       ` [Virtio-fs] " Connor Kuehl
2021-03-24 15:09       ` Connor Kuehl
2021-03-24 15:30       ` Miklos Szeredi
2021-03-24 15:30         ` [Virtio-fs] " Miklos Szeredi
2021-03-24 15:31         ` Connor Kuehl
2021-03-24 15:31           ` [Virtio-fs] " Connor Kuehl
2021-03-24 15:31           ` Connor Kuehl
2021-03-19 13:49   ` Vivek Goyal
2021-03-19 13:49     ` [Virtio-fs] " Vivek Goyal
2021-03-19 13:49     ` Vivek Goyal
2021-03-19 14:16     ` Connor Kuehl
2021-03-19 14:16       ` [Virtio-fs] " Connor Kuehl
2021-03-19 14:16       ` Connor Kuehl
2021-03-22 15:47   ` Stefan Hajnoczi
2021-03-22 15:47     ` [Virtio-fs] " Stefan Hajnoczi
2021-03-22 15:47     ` Stefan Hajnoczi
2021-03-18 13:52 ` [PATCH 3/3] fuse: fix typo for fuse_conn.max_pages comment Connor Kuehl
2021-03-18 13:52   ` [Virtio-fs] " Connor Kuehl
2021-03-22  3:42   ` Jason Wang
2021-03-22  3:42     ` [Virtio-fs] " Jason Wang
2021-03-22  3:42     ` Jason Wang

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=YFNvH8w4l7WyEMyr@miu.piliscsaba.redhat.com \
    --to=miklos@szeredi.hu \
    --cc=ckuehl@redhat.com \
    --cc=jasowang@redhat.com \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mst@redhat.com \
    --cc=stefanha@redhat.com \
    --cc=vgoyal@redhat.com \
    --cc=virtio-fs@redhat.com \
    --cc=virtualization@lists.linux-foundation.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.